
GFX12

GFX12 introduces a new BVH encoding for the image_bvh_dual_intersect_ray and image_bvh8_intersect_ray instructions.

BVH8 box node

| bitsize/range | name | description |
| --- | --- | --- |
| 32 | internal_child_offset | Offset of child BVH8 box nodes in units of 8 bytes. |
| 32 | primitive_child_offset | Offset of child primitive nodes in units of 8 bytes. |
| 32 | unused | Used by amdvlk for storing the parent node ID. |
| 32 | origin_x | x-offset applied to all child AABBs. |
| 32 | origin_y | y-offset applied to all child AABBs. |
| 32 | origin_z | z-offset applied to all child AABBs. |
| 8 | exponent_x | |
| 8 | exponent_y | |
| 8 | exponent_z | |
| 4 | unused | |
| 4 | child_count_minus_one | |
| 32 | obb_matrix_index | Selects a matrix for transforming the ray before performing intersection tests. 0x7F to disable OBB. |
| 96x8 | children[8] | |

children[8] element layout:

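| bitsize/range | name | description |
| --- | --- | --- |
| 12 | min_x | Fixed point child AABB coordinate. |
| 12 | min_y | |
| 4 | cull_flags | |
| 4 | unused | |
| 12 | min_z | |
| 12 | max_x | |
| 8 | cull_mask | |
| 12 | max_y | |
| 4 | node_type | |
| 4 | node_size | Increment for the child offset in units of 128 bytes. |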

The coordinates of child AABBs are encoded as follows:

  • min: floor((x - origin_x) / extent)
  • max: ceil((x - origin_x) / extent) - 1
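The rounding above is conservative: the minimum is rounded down and the maximum is rounded up, so the quantized box always encloses the original one. A minimal sketch for a single axis follows. The function names, the reading of `extent` as the width of one fixed-point step, the 12-bit range check, and the decode step (adding 1 back to the maximum) are assumptions made for illustration, not the driver's implementation.

```python
import math


def quantize_axis(lo, hi, origin, extent, bits=12):
    """Quantize one axis of a child AABB as described above.

    Assumption: `extent` is the width of one fixed-point step on this axis.
    """
    q_min = math.floor((lo - origin) / extent)
    q_max = math.ceil((hi - origin) / extent) - 1
    limit = (1 << bits) - 1
    # Range check is an assumption of this sketch: codes must fit the field.
    if not (0 <= q_min <= limit and 0 <= q_max <= limit):
        raise ValueError("quantized bounds do not fit in the field")
    return q_min, q_max


def dequantize_axis(q_min, q_max, origin, extent):
    """Recover bounds that enclose the original box (hypothetical inverse)."""
    return origin + q_min * extent, origin + (q_max + 1) * extent


q = quantize_axis(1.3, 2.6, origin=1.0, extent=0.5)
box = dequantize_axis(*q, origin=1.0, extent=0.5)
```

Here the child spanning [1.3, 2.6] quantizes to the codes (0, 3), which decode to the slightly larger interval [1.0, 3.0].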

image_bvh8_intersect_ray will return the node IDs of the child nodes.

Primitive node

High-level layout:

| bitsize/range | name | description |
| --- | --- | --- |
| 52 | header | Misc information about this node. |
| | vertex_prefixes[3] | |
| | data | Compressed vertex positions followed by primitive/geometry index data. |
| 29 x triangle_pair_count | pair_desc[triangle_pair_count] | Misc information about a triangle pair. |

header layout:

| bitsize/range | name | description |
| --- | --- | --- |
| 5 | x_vertex_bits_minus_one | |
| 5 | y_vertex_bits_minus_one | |
| 5 | z_vertex_bits_minus_one | |
| 5 | trailing_zero_bits | |
| 4 | geometry_index_base_bits_div_2 | |
| 4 | geometry_index_bits_div_2 | |
| 3 | triangle_pair_count_minus_one | |
| 1 | vertex_type | |
| 5 | primitive_index_base_bits | |
| 5 | primitive_index_bits | |
| 10 | indices_midpoint | Bit offset where the geometry and primitive indices start (geometry indices in negative direction, primitive indices in positive direction). |

The data field is split into three sections:

  1. Vertex data: a list of floats that share the same prefix and the same number of trailing zero bits. The decompressed value (for example, the x component of a vertex) is (prefix << (32 - prefix_bits_x)) | (read(x_vertex_bits) << trailing_zero_bits), where prefix_bits_x = 32 - x_vertex_bits - trailing_zero_bits.
  2. Geometry indices.
  3. Primitive indices.
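The vertex decompression formula can be sketched as follows. This is an illustration under assumptions rather than the driver's code: the function name is hypothetical, `vertex_bits` is taken to be the header's `*_vertex_bits_minus_one` field plus one (inferred from the field name), and `payload` stands for the value returned by `read(x_vertex_bits)`.

```python
import struct


def decompress_component(prefix, payload, vertex_bits, trailing_zero_bits):
    """Rebuild one 32-bit float from its shared prefix and per-vertex payload.

    Bit layout of the result, from most to least significant:
    [ prefix | payload | trailing zeros ].
    """
    prefix_bits = 32 - vertex_bits - trailing_zero_bits
    if prefix_bits < 0:
        raise ValueError("vertex_bits + trailing_zero_bits exceeds 32")
    bits = (prefix << (32 - prefix_bits)) | (payload << trailing_zero_bits)
    bits &= 0xFFFFFFFF
    # Reinterpret the 32-bit pattern as an IEEE 754 single-precision float.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# 1.0 is 0x3F800000: prefix 0x3F (8 bits), payload 0x80 (8 bits), 16 zero bits.
one = decompress_component(0x3F, 0x80, vertex_bits=8, trailing_zero_bits=16)
```

When `prefix_bits` is 0, the whole value comes from the payload and the prefix contributes nothing.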

Geometry and primitive indices are encoded the same way; the only difference is that geometry indices are read/written in the negative direction starting from indices_midpoint. Each indices section starts with a *_index_base_bits-bit value *_index_base, which is the index of the first triangle. Subsequent triangles use indices calculated from a *_index_bits-bit value:

  • *_index = read(*_index_bits) if *_index_bits >= *_index_base_bits
  • *_index = read(*_index_bits) | (*_index_base & ~BITFIELD_MASK(*_index_bits)) otherwise.
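The two cases above amount to replacing the low bits of the base index when the per-triangle value is narrower than the base. A small sketch, with hypothetical function names and `raw` standing for the value returned by `read(*_index_bits)`:

```python
def bitfield_mask(bits):
    """Mask with the low `bits` bits set, mirroring BITFIELD_MASK."""
    return (1 << bits) - 1


def decode_index(raw, index_bits, index_base, index_base_bits):
    """Decode the index of a triangle after the first one."""
    if index_bits >= index_base_bits:
        # The stored value is wide enough to hold the full index.
        return raw
    # Otherwise the high bits are inherited from the base index.
    return raw | (index_base & ~bitfield_mask(index_bits))


# Base 0x1234 stored in 16 bits, later indices stored in 4 bits each.
idx = decode_index(0x7, index_bits=4, index_base=0x1234, index_base_bits=16)
```

In this example only the low nibble changes, so the decoded index is 0x1237.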

pair_desc(s) layout:

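| bitsize/range | name | description |
| --- | --- | --- |
| 1 | prim_range_stop | |
| 1 | tri1_double_sided | |
| 1 | tri1_opaque | |
| 4 | tri1_v0_index | Indices into data, 0xF for procedural nodes. |
| 4 | tri1_v1_index | 0xF for procedural nodes. |
| 4 | tri1_v2_index | |

tri0 has identical fields:

| bitsize/range | name | description |
| --- | --- | --- |
| 1 | tri0_double_sided | |
| 1 | tri0_opaque | |
| 4 | tri0_v0_index | |
| 4 | tri0_v1_index | |
| 4 | tri0_v2_index | |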
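The pair_desc fields add up to the 29 bits listed in the high-level layout. The sketch below unpacks such a value into named fields. It rests on an assumption the text does not state: that the first listed field occupies the least significant bits. The helper names are hypothetical.

```python
# Field widths as listed above; the bit order (first field in the least
# significant bits) is an assumption of this sketch.
PAIR_DESC_FIELDS = [
    ("prim_range_stop", 1),
    ("tri1_double_sided", 1),
    ("tri1_opaque", 1),
    ("tri1_v0_index", 4),
    ("tri1_v1_index", 4),
    ("tri1_v2_index", 4),
    ("tri0_double_sided", 1),
    ("tri0_opaque", 1),
    ("tri0_v0_index", 4),
    ("tri0_v1_index", 4),
    ("tri0_v2_index", 4),
]


def pack_fields(values, fields):
    """Pack a dict of field values into one integer; missing fields are 0."""
    word, shift = 0, 0
    for name, width in fields:
        word |= (values.get(name, 0) & ((1 << width) - 1)) << shift
        shift += width
    return word


def unpack_fields(word, fields):
    """Split an integer into named fields, walking up from bit 0."""
    out, shift = {}, 0
    for name, width in fields:
        out[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return out


desc = pack_fields({"prim_range_stop": 1, "tri1_v0_index": 0xF}, PAIR_DESC_FIELDS)
fields = unpack_fields(desc, PAIR_DESC_FIELDS)
```

If the hardware uses the opposite bit order, only the order of `PAIR_DESC_FIELDS` needs to change.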

image_bvh8_intersect_ray will return the following data for triangle nodes:

| VGPR index | value |
| --- | --- |
| 0 | t0 |
| 1 | (procedural0 << 31) \| u0 |
| 2 | (opaque0 << 31) \| v0 |
| 3 | (primitive_index0 << 1) \| backface0 |
| 4 | t1 |
| 5 | (procedural1 << 31) \| u1 |
| 6 | (opaque1 << 31) \| v1 |
| 7 | (primitive_index1 << 1) \| backface1 |
| 8 | (geometry_index0 << 2) \| navigation_bits |
| 9 | (geometry_index1 << 2) \| navigation_bits |

image_bvh8_intersect_ray will return the following data for procedural nodes:

| VGPR index | value |
| --- | --- |
| 3 | primitive_index0 << 1 |
| 8 | (geometry_index0 << 2) \| navigation_bits |
| 9 | (geometry_index1 << 2) \| navigation_bits |

navigation_bits is 0 if there are more triangle pairs to process, 1 if this was the last triangle pair and 3 if prim_range_stop is set.
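The triangle return layout can be unpacked as sketched below. The shifts and masks come from the layout given for triangle nodes. Treating t, u and v as IEEE 754 single-precision bit patterns (with bit 31 of u and v reused as a flag) is an assumption of this sketch, and the function names are hypothetical.

```python
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a float (assumed encoding of t, u, v)."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def decode_triangle_hit(vgprs, tri):
    """Decode the result for triangle `tri` (0 or 1) of a pair from 10 VGPRs."""
    base = 4 * tri
    prim = vgprs[base + 3]
    geom = vgprs[8 + tri]
    return {
        "t": bits_to_float(vgprs[base]),
        "procedural": vgprs[base + 1] >> 31,
        "u": bits_to_float(vgprs[base + 1] & 0x7FFFFFFF),
        "opaque": vgprs[base + 2] >> 31,
        "v": bits_to_float(vgprs[base + 2] & 0x7FFFFFFF),
        "primitive_index": prim >> 1,
        "backface": prim & 1,
        "geometry_index": geom >> 2,
        # 0: more pairs follow, 1: last pair, 3: prim_range_stop is set.
        "navigation_bits": geom & 3,
    }


# t0 = 1.5, u0 = 0.25, v0 = 0.5 with opaque0 set, primitive 42 hit on its
# back face, geometry 7, last triangle pair.
vgprs = [0x3FC00000, 0x3E800000, 0xBF000000, (42 << 1) | 1,
         0, 0, 0, 0, (7 << 2) | 1, 0]
hit = decode_triangle_hit(vgprs, 0)
```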

Instance node

| bitsize/range | name | description |
| --- | --- | --- |
| 32x3x4 | world_to_object | |
| 62 | bvh_addr | Units of 4 bytes. |
| 1 | aabbs | Does the BLAS (only) contain AABBs? Used for pointer flag based culling. |
| 1 | unused | |
| 32 | unused | |
| 24 | user_data | Returned by the intersect instruction for instance nodes. |
| 8 | cull_mask | |

The instance node can have up to 4 quantized child nodes:

| bitsize/range | name | description |
| --- | --- | --- |
| 32 | origin_x | x-offset applied to all child AABBs. |
| 32 | origin_y | y-offset applied to all child AABBs. |
| 32 | origin_z | z-offset applied to all child AABBs. |
| 8 | exponent_x | |
| 8 | exponent_y | |
| 8 | exponent_z | |
| 4 | unused | |
| 4 | child_count_minus_one | |
| 96x4 | children[4] | |

image_bvh8_intersect_ray will return:

| VGPR index | value |
| --- | --- |
| 2 | BLAS addr lo |
| 3 | BLAS addr hi |
| 6 | user_data |
| 7 | (child_ids[0] & 0xFF) \| ((child_ids[1] & 0xFF) << 8) \| ((child_ids[2] & 0xFF) << 16) \| ((child_ids[3] & 0xFF) << 24) |
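The packed child IDs in VGPR 7 and the split BLAS address in VGPRs 2 and 3 can be recombined as sketched below. The function names are hypothetical, and joining the two address halves into one 64-bit value is an assumption based on the "lo"/"hi" naming.

```python
def pack_child_ids(child_ids):
    """Pack the low 8 bits of up to four child IDs into one 32-bit word."""
    word = 0
    for i, child_id in enumerate(child_ids[:4]):
        word |= (child_id & 0xFF) << (8 * i)
    return word


def unpack_child_ids(word):
    """Recover the four 8-bit child ID fragments from VGPR 7."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def blas_address(addr_lo, addr_hi):
    """Join VGPR 2 (low half) and VGPR 3 (high half) into one address."""
    return ((addr_hi & 0xFFFFFFFF) << 32) | (addr_lo & 0xFFFFFFFF)


packed = pack_child_ids([0x11, 0x22, 0x33, 0x144])
ids = unpack_child_ids(packed)
addr = blas_address(0x89ABCDEF, 0x01234567)
```

Only the low 8 bits of each child ID survive the packing, so the ID 0x144 in this example comes back as 0x44.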