| // Copyright 2015-2021 The Khronos Group, Inc. |
| // |
| // SPDX-License-Identifier: CC-BY-4.0 |
| |
| [[shaders]] |
| = Shaders |
| |
| A shader specifies programmable operations that execute for each vertex, |
| control point, tessellated vertex, primitive, fragment, or workgroup in the |
| corresponding stage(s) of the graphics and compute pipelines. |
| |
| Graphics pipelines include vertex shader execution as a result of |
| <<drawing,primitive assembly>>, followed, if enabled, by tessellation |
| control and evaluation shaders operating on <<drawing-patch-lists,patches>>, |
| geometry shaders, if enabled, operating on primitives, and fragment shaders, |
| if present, operating on fragments generated by <<primsrast,Rasterization>>. |
| In this specification, vertex, tessellation control, tessellation evaluation |
| and geometry shaders are collectively referred to as |
| <<pipeline-graphics-subsets-pre-rasterization,pre-rasterization shader |
| stage>>s and occur in the logical pipeline before rasterization. |
| The fragment shader occurs logically after rasterization. |
| |
| Only the compute shader stage is included in a compute pipeline. |
| Compute shaders operate on compute invocations in a workgroup. |
| |
| Shaders can: read from input variables, and read from and write to output |
| variables. |
| Input and output variables can: be used to transfer data between shader |
| stages, or to allow the shader to interact with values that exist in the |
| execution environment. |
| Similarly, the execution environment provides constants that describe |
| capabilities. |
| |
| Shader variables are associated with execution environment-provided inputs |
| and outputs using _built-in_ decorations in the shader. |
| The available decorations for each stage are documented in the following |
| subsections. |
| |
| |
| [[shader-modules]] |
| == Shader Modules |
| |
| [open,refpage='VkShaderModule',desc='Opaque handle to a shader module object',type='handles'] |
| -- |
| _Shader modules_ contain _shader code_ and one or more entry points. |
| Shaders are selected from a shader module by specifying an entry point as |
| part of <<pipelines,pipeline>> creation. |
| The stages of a pipeline can: use shaders that come from different modules. |
| The shader code defining a shader module must: be in the SPIR-V format, as |
| described by the <<spirvenv,Vulkan Environment for SPIR-V>> appendix. |
| |
| Shader modules are represented by sname:VkShaderModule handles: |
| |
| include::{generated}/api/handles/VkShaderModule.txt[] |
| -- |
| |
| [open,refpage='vkCreateShaderModule',desc='Creates a new shader module object',type='protos'] |
| -- |
| To create a shader module, call: |
| |
| include::{generated}/api/protos/vkCreateShaderModule.txt[] |
| |
| * pname:device is the logical device that creates the shader module. |
| * pname:pCreateInfo is a pointer to a slink:VkShaderModuleCreateInfo |
| structure. |
| * pname:pAllocator controls host memory allocation as described in the |
| <<memory-allocation, Memory Allocation>> chapter. |
| * pname:pShaderModule is a pointer to a slink:VkShaderModule handle in |
| which the resulting shader module object is returned. |
| |
| Once a shader module has been created, any entry points it contains can: be |
| used in pipeline shader stages as described in <<pipelines-compute,Compute |
| Pipelines>> and <<pipelines-graphics,Graphics Pipelines>>. |
| |
| include::{generated}/validity/protos/vkCreateShaderModule.txt[] |
| -- |
| |
| [open,refpage='VkShaderModuleCreateInfo',desc='Structure specifying parameters of a newly created shader module',type='structs'] |
| -- |
| The sname:VkShaderModuleCreateInfo structure is defined as: |
| |
| include::{generated}/api/structs/VkShaderModuleCreateInfo.txt[] |
| |
| * pname:sType is the type of this structure. |
| * pname:pNext is `NULL` or a pointer to a structure extending this |
| structure. |
| * pname:flags is reserved for future use. |
| * pname:codeSize is the size, in bytes, of the code pointed to by |
| pname:pCode. |
| * pname:pCode is a pointer to code that is used to create the shader |
| module. |
| The type and format of the code is determined from the content of the |
| memory addressed by pname:pCode. |
| |
| .Valid Usage |
| **** |
| * [[VUID-VkShaderModuleCreateInfo-codeSize-01085]] |
| pname:codeSize must: be greater than 0 |
| ifndef::VK_NV_glsl_shader[] |
| * [[VUID-VkShaderModuleCreateInfo-codeSize-01086]] |
| pname:codeSize must: be a multiple of 4 |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01087]] |
| pname:pCode must: point to valid SPIR-V code, formatted and packed as |
| described by the <<spirv-spec,Khronos SPIR-V Specification>> |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01088]] |
| pname:pCode must: adhere to the validation rules described by the |
| <<spirvenv-module-validation, Validation Rules within a Module>> section |
| of the <<spirvenv-capabilities,SPIR-V Environment>> appendix |
| endif::VK_NV_glsl_shader[] |
| ifdef::VK_NV_glsl_shader[] |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01376]] |
| If pname:pCode is a pointer to SPIR-V code, pname:codeSize must: be a |
| multiple of 4 |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01377]] |
| pname:pCode must: point to either valid SPIR-V code, formatted and |
| packed as described by the <<spirv-spec,Khronos SPIR-V Specification>> |
| or valid GLSL code which must: be written to the `GL_KHR_vulkan_glsl` |
| extension specification |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01378]] |
| If pname:pCode is a pointer to SPIR-V code, that code must: adhere to |
| the validation rules described by the <<spirvenv-module-validation, |
| Validation Rules within a Module>> section of the |
| <<spirvenv-capabilities,SPIR-V Environment>> appendix |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01379]] |
| If pname:pCode is a pointer to GLSL code, it must: be valid GLSL code |
| written to the `GL_KHR_vulkan_glsl` GLSL extension specification |
| endif::VK_NV_glsl_shader[] |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01089]] |
| pname:pCode must: declare the code:Shader capability for SPIR-V code |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01090]] |
| pname:pCode must: not declare any capability that is not supported by |
| the API, as described by the <<spirvenv-module-validation, |
| Capabilities>> section of the <<spirvenv-capabilities,SPIR-V |
| Environment>> appendix |
| * [[VUID-VkShaderModuleCreateInfo-pCode-01091]] |
| If pname:pCode declares any of the capabilities listed in the |
| <<spirvenv-capabilities-table,SPIR-V Environment>> appendix, one of the |
| corresponding requirements must: be satisfied |
| * [[VUID-VkShaderModuleCreateInfo-pCode-04146]] |
| pname:pCode must: not declare any SPIR-V extension that is not supported |
| by the API, as described by the <<spirvenv-extensions, Extension>> |
| section of the <<spirvenv-capabilities,SPIR-V Environment>> appendix |
| * [[VUID-VkShaderModuleCreateInfo-pCode-04147]] |
| If pname:pCode declares any of the SPIR-V extensions listed in the |
| <<spirvenv-extensions-table,SPIR-V Environment>> appendix, one of the |
| corresponding requirements must: be satisfied |
| **** |
| |
| include::{generated}/validity/structs/VkShaderModuleCreateInfo.txt[] |
| -- |
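
As an illustration of the size rules above, an application might run a cheap host-side pre-check on a SPIR-V blob before filling in pname:codeSize and pname:pCode.
The following is a minimal sketch, assuming the code is SPIR-V (not GLSL) stored in host byte order; the helper name `spirv_blob_looks_plausible` is hypothetical and is not part of the API.
It only checks the size rules and the SPIR-V magic number, and is not a substitute for the validation rules referenced above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SPIR-V magic number: the first 32-bit word of every SPIR-V module. */
#define SPIRV_MAGIC 0x07230203u

/* Hypothetical helper, not a Vulkan entry point.
 * Returns true if the blob passes the cheap size and header checks.
 * It does not validate the module contents. */
bool spirv_blob_looks_plausible(const uint32_t *code, size_t code_size)
{
    if (code == NULL || code_size == 0) {
        return false;               /* codeSize must be greater than 0 */
    }
    if (code_size % 4 != 0) {
        return false;               /* codeSize must be a multiple of 4 */
    }
    return code[0] == SPIRV_MAGIC;  /* header word, assuming host byte order */
}
```

A caller would pass the same pointer and byte count it intends to store in pname:pCode and pname:codeSize, and skip module creation if the check fails.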
| |
| [open,refpage='VkShaderModuleCreateFlags',desc='Reserved for future use',type='flags'] |
| -- |
| include::{generated}/api/flags/VkShaderModuleCreateFlags.txt[] |
| |
| tname:VkShaderModuleCreateFlags is a bitmask type for setting a mask, but is |
| currently reserved for future use. |
| -- |
| |
| ifdef::VK_EXT_validation_cache[] |
| include::VK_EXT_validation_cache/shader-module-validation-cache.txt[] |
| endif::VK_EXT_validation_cache[] |
| |
| |
| [open,refpage='vkDestroyShaderModule',desc='Destroy a shader module',type='protos'] |
| -- |
| To destroy a shader module, call: |
| |
| include::{generated}/api/protos/vkDestroyShaderModule.txt[] |
| |
| * pname:device is the logical device that destroys the shader module. |
| * pname:shaderModule is the handle of the shader module to destroy. |
| * pname:pAllocator controls host memory allocation as described in the |
| <<memory-allocation, Memory Allocation>> chapter. |
| |
| A shader module can: be destroyed while pipelines created using its shaders |
| are still in use. |
| |
| .Valid Usage |
| **** |
| * [[VUID-vkDestroyShaderModule-shaderModule-01092]] |
| If sname:VkAllocationCallbacks were provided when pname:shaderModule was |
| created, a compatible set of callbacks must: be provided here |
| * [[VUID-vkDestroyShaderModule-shaderModule-01093]] |
| If no sname:VkAllocationCallbacks were provided when pname:shaderModule |
| was created, pname:pAllocator must: be `NULL` |
| **** |
| |
| include::{generated}/validity/protos/vkDestroyShaderModule.txt[] |
| -- |
| |
| |
| [[shaders-execution]] |
| == Shader Execution |
| |
| At each stage of the pipeline, multiple invocations of a shader may: execute |
| simultaneously. |
| Further, invocations of a single shader produced as the result of different |
| commands may: execute simultaneously. |
| The relative execution order of invocations of the same shader type is |
| undefined:. |
| Shader invocations may: complete in a different order than that in which the |
| primitives they originated from were drawn or dispatched by the application. |
| However, fragment shader outputs are written to attachments in |
| <<primsrast-order,rasterization order>>. |
| |
| The relative execution order of invocations of different shader types is |
| largely undefined:. |
| However, when invoking a shader whose inputs are generated from a previous |
| pipeline stage, the shader invocations from the previous stage are |
| guaranteed to have executed far enough to generate input values for all |
| required inputs. |
| |
| |
| [[shaders-execution-memory-ordering]] |
| == Shader Memory Access Ordering |
| |
| The order in which image or buffer memory is read or written by shaders is |
| largely undefined:. |
| For some shader types (vertex, tessellation evaluation, and in some cases, |
| fragment), even the number of shader invocations that may: perform loads and |
| stores is undefined:. |
| |
| In particular, the following rules apply: |
| |
| * <<shaders-vertex-execution,Vertex>> and |
| <<shaders-tessellation-evaluation-execution,tessellation evaluation>> |
| shaders will be invoked at least once for each unique vertex, as defined |
| in those sections. |
| * <<fragops-shader,Fragment>> shaders will be invoked zero or more times, |
| as defined in that section. |
| * The relative execution order of invocations of the same shader type is |
| undefined:. |
| A store issued by a shader when working on primitive B might complete |
| prior to a store for primitive A, even if primitive A is specified prior |
| to primitive B. This applies even to fragment shaders; while fragment |
| shader outputs are always written to the framebuffer in |
| <<primsrast-order, rasterization order>>, stores executed by fragment |
| shader invocations are not. |
| * The relative execution order of invocations of different shader types is |
| largely undefined:. |
| |
| [NOTE] |
| .Note |
| ==== |
| The above limitations on shader invocation order make some forms of |
| synchronization between shader invocations within a single set of primitives |
| unimplementable. |
| For example, having one invocation poll memory written by another invocation |
| assumes that the other invocation has been launched and will complete its |
| writes in finite time. |
| ==== |
| |
| ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| The <<memory-model,Memory Model>> appendix defines the terminology and rules |
| for how to correctly communicate between shader invocations, such as when a |
| write is <<memory-model-visible-to,Visible-To>> a read, and what constitutes |
| a <<memory-model-access-data-race,Data Race>>. |
| |
| Applications must: not cause a data race. |
| |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| Stores issued to different memory locations within a single shader |
| invocation may: not be visible to other invocations, or may: not become |
| visible in the order they were performed. |
| |
| The code:OpMemoryBarrier instruction can: be used to provide stronger |
| ordering of reads and writes performed by a single invocation. |
| code:OpMemoryBarrier guarantees that any memory transactions issued by the |
| shader invocation prior to the instruction complete prior to the memory |
| transactions issued after the instruction. |
| Memory barriers are needed for algorithms that require multiple invocations |
| to access the same memory and require the operations to be performed in a |
| partially-defined relative order. |
| For example, if one shader invocation does a series of writes, followed by |
| an code:OpMemoryBarrier instruction, followed by another write, then the |
| results of the series of writes before the barrier become visible to other |
| shader invocations at a time earlier or equal to when the results of the |
| final write become visible to those invocations. |
| In practice, this means that another invocation that sees the results of the |
| final write would also see the previous writes. |
| Without the memory barrier, the final write may: be visible before the |
| previous writes. |
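
The write, barrier, write pattern described above can be sketched on the host with C11 atomics.
This is an analogy only: it assumes that a C11 release fence plays the role of code:OpMemoryBarrier and that an ordinary C array stands in for buffer memory, and the names `payload`, `ready`, `writer` and `reader` are hypothetical.
It is not shader code and does not define SPIR-V semantics.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload[2];    /* the "series of writes" */
static atomic_int ready;  /* the "final write"; zero-initialized */

/* Performs the earlier writes, then the fence, then the final write. */
void writer(void)
{
    payload[0] = 11;
    payload[1] = 22;
    /* Stands in for the barrier: the payload writes are ordered
     * before the flag write that follows. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Checks the flag once. If the final write is seen, the earlier
 * writes are seen as well. Returns false if nothing was published. */
bool reader(int out[2])
{
    if (atomic_load_explicit(&ready, memory_order_relaxed) == 0) {
        return false;
    }
    atomic_thread_fence(memory_order_acquire);
    out[0] = payload[0];
    out[1] = payload[1];
    return true;
}
```

The reader checks the flag once and gives up if it is not set, because waiting in a loop for another invocation assumes that the other invocation has been launched and will finish its writes in finite time.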
| |
| Writes that are the result of shader stores through a variable decorated |
| with code:Coherent automatically have available writes to the same buffer, |
| buffer view, or image view made visible to them, and are themselves |
| automatically made available to access by the same buffer, buffer view, or |
| image view. |
| Reads that are the result of shader loads through a variable decorated with |
| code:Coherent automatically have available writes to the same buffer, buffer |
| view, or image view made visible to them. |
| The order that coherent writes to different locations become available is |
| undefined:, unless enforced by a memory barrier instruction or other memory |
| dependency. |
| |
| [NOTE] |
| .Note |
| ==== |
| Explicit memory dependencies must: still be used to guarantee availability |
| and visibility for access via other buffers, buffer views, or image views. |
| ==== |
| |
| The built-in atomic memory transaction instructions can: be used to read and |
| write a given memory address atomically. |
| While built-in atomic functions issued by multiple shader invocations are |
| executed in undefined: order relative to each other, these functions perform |
| both a read and a write of a memory address and guarantee that no other |
| memory transaction will write to the underlying memory between the read and |
| write. |
| Atomic operations ensure automatic availability and visibility for writes |
| and reads in the same way as those to code:Coherent variables. |
| |
| [NOTE] |
| .Note |
| ==== |
| Memory accesses performed on different resource descriptors with the same |
| memory backing may: not be well-defined even with the code:Coherent |
| decoration or via atomics, due to things such as image layouts or ownership |
| of the resource, as described in the <<synchronization, Synchronization and |
| Cache Control>> chapter. |
| ==== |
| |
| [NOTE] |
| .Note |
| ==== |
| Atomics allow shaders to use shared global addresses for mutual exclusion or |
| as counters, among other uses. |
| ==== |
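
The counter use case can be illustrated with a host-side sketch in C11, where `atomic_fetch_add_explicit` stands in for a shader atomic add.
The names `next_slot`, `reserve_slot`, `SLOT_COUNT` and `NO_SLOT` are hypothetical, and the fixed capacity is an assumption made for the example.
Because the read and the write happen as one indivisible step, no two callers receive the same slot, even though the order in which callers are served is not defined.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SLOT_COUNT 4u          /* assumed capacity of the output array */
#define NO_SLOT    UINT32_MAX  /* returned when the array is full */

static atomic_uint next_slot;  /* shared counter; zero-initialized */

/* Reserves one output slot. The fetch-add reads the old value and
 * writes old + 1 atomically, so every caller gets a distinct value. */
uint32_t reserve_slot(void)
{
    unsigned slot = atomic_fetch_add_explicit(&next_slot, 1u,
                                              memory_order_relaxed);
    return slot < SLOT_COUNT ? (uint32_t)slot : NO_SLOT;
}
```

A shader would typically use the returned value as an index into a buffer it appends to, and skip the store when no slot is available.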
| |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| The SPIR-V *SubgroupMemory*, *CrossWorkgroupMemory*, and |
| *AtomicCounterMemory* memory semantics are ignored. |
| Sequentially consistent atomics and barriers are not supported and |
| *SequentiallyConsistent* is treated as *AcquireRelease*. |
| *SequentiallyConsistent* should: not be used. |
| |
| |
| [[shaders-inputs]] |
| == Shader Inputs and Outputs |
| |
| Data is passed into and out of shaders using variables with input or output |
| storage class, respectively. |
| User-defined inputs and outputs are connected between stages by matching |
| their code:Location decorations. |
| Additionally, data can: be provided by or communicated to special functions |
| provided by the execution environment using code:BuiltIn decorations. |
| |
| In many cases, the same code:BuiltIn decoration can: be used in multiple |
| shader stages with similar meaning. |
| The specific behavior of variables decorated as code:BuiltIn is documented |
| in the following sections. |
| |
| |
| ifdef::VK_NV_mesh_shader[] |
| [[shaders-task]] |
| == Task Shaders |
| |
| Task shaders operate in conjunction with the mesh shaders to produce a |
| collection of primitives that will be processed by subsequent stages of the |
| graphics pipeline. |
| Their primary purpose is to create a variable number of subsequent mesh shader |
| invocations. |
| |
| Task shaders are invoked via the execution of the |
| <<drawing-mesh-shading,programmable mesh shading>> pipeline. |
| |
| The task shader has no fixed-function inputs other than variables |
| identifying the specific workgroup and invocation. |
| The only fixed output of the task shader is a task count, identifying the |
| number of mesh shader workgroups to create. |
| The task shader can write additional outputs to task memory, which can be |
| read by all of the mesh shader workgroups it created. |
| |
| |
| === Task Shader Execution |
| |
| Task workloads are formed from groups of work items called workgroups and |
| processed by the task shader in the current graphics pipeline. |
| A workgroup is a collection of shader invocations that execute the same |
| shader, potentially in parallel. |
| Task shaders execute in _global workgroups_ which are divided into a number |
| of _local workgroups_ with a size that can: be set by assigning a value to |
| the code:LocalSize |
| ifdef::VK_KHR_maintenance4[or code:LocalSizeId] |
| execution mode or via an object decorated by the code:WorkgroupSize |
| decoration. |
| An invocation within a local workgroup can: share data with other members of |
| the local workgroup through shared variables and issue memory and control |
| flow barriers to synchronize with other members of the local workgroup. |
| |
| |
| [[shaders-mesh]] |
| == Mesh Shaders |
| |
| Mesh shaders operate in workgroups to produce a collection of primitives |
| that will be processed by subsequent stages of the graphics pipeline. |
| Each workgroup emits zero or more output primitives and the group of |
| vertices and their associated data required for each output primitive. |
| |
| Mesh shaders are invoked via the execution of the |
| <<drawing-mesh-shading,programmable mesh shading>> pipeline. |
| |
| The only inputs available to the mesh shader are variables identifying the |
| specific workgroup and invocation and, if applicable, any outputs written to |
| task memory by the task shader that spawned the mesh shader's workgroup. |
| The mesh shader can operate without a task shader as well. |
| |
| The invocations of the mesh shader workgroup write an output mesh, |
| comprising a set of primitives with per-primitive attributes, a set of |
| vertices with per-vertex attributes, and an array of indices identifying the |
| mesh vertices that belong to each primitive. |
| The primitives of this mesh are then processed by subsequent graphics |
| pipeline stages, where the outputs of the mesh shader form an interface with |
| the fragment shader. |
| |
| |
| === Mesh Shader Execution |
| |
| Mesh workloads are formed from groups of work items called workgroups and |
| processed by the mesh shader in the current graphics pipeline. |
| A workgroup is a collection of shader invocations that execute the same |
| shader, potentially in parallel. |
| Mesh shaders execute in _global workgroups_ which are divided into a number |
| of _local workgroups_ with a size that can: be set by assigning a value to |
| the code:LocalSize |
| ifdef::VK_KHR_maintenance4[or code:LocalSizeId] |
| execution mode or via an object decorated by the code:WorkgroupSize |
| decoration. |
| An invocation within a local workgroup can: share data with other members of |
| the local workgroup through shared variables and issue memory and control |
| flow barriers to synchronize with other members of the local workgroup. |
| |
| The _global workgroups_ may be generated explicitly via the API, or |
| implicitly through the task shader's work creation mechanism. |
| endif::VK_NV_mesh_shader[] |
| |
| |
| [[shaders-vertex]] |
| == Vertex Shaders |
| |
| Each vertex shader invocation operates on one vertex and its associated |
| <<fxvertex-attrib,vertex attribute>> data, and outputs one vertex and |
| associated data. |
| ifndef::VK_NV_mesh_shader[] |
| Graphics pipelines must: include a vertex shader, and the vertex shader |
| stage is always the first shader stage in the graphics pipeline. |
| endif::VK_NV_mesh_shader[] |
| ifdef::VK_NV_mesh_shader[] |
| Graphics pipelines using primitive shading must: include a vertex shader, |
| and the vertex shader stage is always the first shader stage in the graphics |
| pipeline. |
| endif::VK_NV_mesh_shader[] |
| |
| |
| [[shaders-vertex-execution]] |
| === Vertex Shader Execution |
| |
| A vertex shader must: be executed at least once for each vertex specified by |
| a drawing command. |
| ifdef::VK_VERSION_1_1,VK_KHR_multiview[] |
| If the subpass includes multiple views in its view mask, the shader may: be |
| invoked separately for each view. |
| endif::VK_VERSION_1_1,VK_KHR_multiview[] |
| During execution, the shader is presented with the index of the vertex and |
| instance for which it has been invoked. |
| Input variables declared in the vertex shader are filled by the |
| implementation with the values of vertex attributes associated with the |
| invocation being executed. |
| |
| If the same vertex is specified multiple times in a drawing command (e.g. by |
| including the same index value multiple times in an index buffer) the |
| implementation may: reuse the results of vertex shading if it can statically |
| determine that the vertex shader invocations will produce identical results. |
| |
| [NOTE] |
| .Note |
| ==== |
| It is implementation-dependent when and if results of vertex shading are |
| reused, and thus how many times the vertex shader will be executed. |
| This is true also if the vertex shader contains stores or atomic operations |
| (see <<features-vertexPipelineStoresAndAtomics, |
| pname:vertexPipelineStoresAndAtomics>>). |
| ==== |
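
To make the effect of reuse concrete, the sketch below counts the distinct values in an index list.
It assumes a single instance and a single view, and the function name `count_unique_indices` is hypothetical.
For the index list `{0, 1, 2, 2, 1, 3}` there are six specified vertices but only four unique ones, so an implementation that reuses every repeated result would shade four vertices, while one that reuses nothing would shade at least six.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts the distinct values in an index list.
 * Quadratic time, which is acceptable for a small illustration. */
size_t count_unique_indices(const uint32_t *indices, size_t count)
{
    size_t unique = 0;
    for (size_t i = 0; i < count; ++i) {
        size_t j = 0;
        /* Look for an earlier occurrence of the same index value. */
        while (j < i && indices[j] != indices[i]) {
            ++j;
        }
        if (j == i) {
            ++unique;  /* first occurrence of this index value */
        }
    }
    return unique;
}
```

An application that performs stores or atomics in a vertex shader cannot rely on either count, since the actual number of executions is implementation-dependent.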
| |
| |
| [[shaders-tessellation-control]] |
| == Tessellation Control Shaders |
| |
| The tessellation control shader is used to read an input patch provided by |
| the application and to produce an output patch. |
| Each tessellation control shader invocation operates on an input patch |
| (after all control points in the patch are processed by a vertex shader) and |
| its associated data, and outputs a single control point of the output patch |
| and its associated data, and can: also output additional per-patch data. |
| The input patch is sized according to the pname:patchControlPoints member of |
| slink:VkPipelineTessellationStateCreateInfo, as part of input assembly. |
| |
| ifdef::VK_EXT_extended_dynamic_state2[] |
| The input patch can also be dynamically sized with the pname:patchControlPoints |
| parameter of flink:vkCmdSetPatchControlPointsEXT. |
| |
| [open,refpage='vkCmdSetPatchControlPointsEXT',desc='Specify the number of control points per patch dynamically for a command buffer',type='protos'] |
| -- |
| To <<pipelines-dynamic-state, dynamically set>> the number of control points |
| per patch, call: |
| |
| include::{generated}/api/protos/vkCmdSetPatchControlPointsEXT.txt[] |
| |
| * pname:commandBuffer is the command buffer into which the command will be |
| recorded. |
| * pname:patchControlPoints specifies the number of control points per |
| patch. |
| |
| This command sets the number of control points per patch for subsequent |
| drawing commands when the graphics pipeline is created with |
| ename:VK_DYNAMIC_STATE_PATCH_CONTROL_POINTS_EXT set in |
| slink:VkPipelineDynamicStateCreateInfo::pname:pDynamicStates. |
| Otherwise, this state is specified by the |
| slink:VkPipelineTessellationStateCreateInfo::pname:patchControlPoints value |
| used to create the currently active pipeline. |
| |
| .Valid Usage |
| **** |
| * [[VUID-vkCmdSetPatchControlPointsEXT-None-04873]] |
| The <<features-extendedDynamicState2PatchControlPoints, |
| extendedDynamicState2PatchControlPoints>> feature must: be enabled |
| * [[VUID-vkCmdSetPatchControlPointsEXT-patchControlPoints-04874]] |
| pname:patchControlPoints must: be greater than zero and less than or |
| equal to sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize |
| **** |
| |
| include::{generated}/validity/protos/vkCmdSetPatchControlPointsEXT.txt[] |
| -- |
| endif::VK_EXT_extended_dynamic_state2[] |
| |
| The size of the output patch is controlled by the code:OpExecutionMode |
| code:OutputVertices specified in the tessellation control or tessellation |
| evaluation shaders, which must: be specified in at least one of the shaders. |
| The size of the input and output patches must: each be greater than zero and |
| less than or equal to |
| sname:VkPhysicalDeviceLimits::pname:maxTessellationPatchSize. |
| |
| |
| [[shaders-tessellation-control-execution]] |
| === Tessellation Control Shader Execution |
| |
| A tessellation control shader is invoked at least once for each _output_ |
| vertex in a patch. |
| ifdef::VK_VERSION_1_1,VK_KHR_multiview[] |
| If the subpass includes multiple views in its view mask, the shader may: be |
| invoked separately for each view. |
| endif::VK_VERSION_1_1,VK_KHR_multiview[] |
| |
| Inputs to the tessellation control shader are generated by the vertex |
| shader. |
| Each invocation of the tessellation control shader can: read the attributes |
| of any incoming vertices and their associated data. |
| The invocations corresponding to a given patch execute logically in |
| parallel, with undefined: relative execution order. |
| However, the code:OpControlBarrier instruction can: be used to provide |
| limited control of the execution order by synchronizing invocations within a |
| patch, effectively dividing tessellation control shader execution into a set |
| of phases. |
| Tessellation control shaders will read undefined: values if one invocation |
| reads a per-vertex or per-patch output written by another invocation at any |
| point during the same phase, or if two invocations attempt to write |
| different values to the same per-patch output in a single phase. |
| |
| |
| [[shaders-tessellation-evaluation]] |
| == Tessellation Evaluation Shaders |
| |
| The tessellation evaluation shader operates on an input patch of control |
| points and their associated data, and a single input barycentric coordinate |
| indicating the invocation's relative position within the subdivided patch, |
| and outputs a single vertex and its associated data. |
| |
| |
| [[shaders-tessellation-evaluation-execution]] |
| === Tessellation Evaluation Shader Execution |
| |
| A tessellation evaluation shader is invoked at least once for each unique |
| vertex generated by the tessellator. |
| ifdef::VK_VERSION_1_1,VK_KHR_multiview[] |
| If the subpass includes multiple views in its view mask, the shader may: be |
| invoked separately for each view. |
| endif::VK_VERSION_1_1,VK_KHR_multiview[] |
| |
| |
| [[shaders-geometry]] |
| == Geometry Shaders |
| |
| The geometry shader operates on a group of vertices and their associated |
| data assembled from a single input primitive, and emits zero or more output |
| primitives and the group of vertices and their associated data required for |
| each output primitive. |
| |
| |
| [[shaders-geometry-execution]] |
| === Geometry Shader Execution |
| |
| A geometry shader is invoked at least once for each primitive produced by |
| the tessellation stages, or at least once for each primitive generated by |
| <<drawing,primitive assembly>> when tessellation is not in use. |
| A shader can request that the geometry shader runs multiple |
| <<geometry-invocations, instances>>. |
| A geometry shader is invoked at least once for each instance. |
| ifdef::VK_VERSION_1_1,VK_KHR_multiview[] |
| If the subpass includes multiple views in its view mask, the shader may: be |
| invoked separately for each view. |
| endif::VK_VERSION_1_1,VK_KHR_multiview[] |
| |
| |
| [[shaders-fragment]] |
| == Fragment Shaders |
| |
| Fragment shaders are invoked as a <<fragops-shader, fragment operation>> in |
| a graphics pipeline. |
| Each fragment shader invocation operates on a single fragment and its |
| associated data. |
| With few exceptions, fragment shaders do not have access to any data |
| associated with other fragments and are considered to execute in isolation |
| of fragment shader invocations associated with other fragments. |
| |
| |
| [[shaders-compute]] |
| == Compute Shaders |
| |
| Compute shaders are invoked via flink:vkCmdDispatch and |
| flink:vkCmdDispatchIndirect commands. |
| In general, they have access to similar resources as shader stages executing |
| as part of a graphics pipeline. |
| |
| Compute workloads are formed from groups of work items called workgroups and |
| processed by the compute shader in the current compute pipeline. |
| A workgroup is a collection of shader invocations that execute the same |
| shader, potentially in parallel. |
| Compute shaders execute in _global workgroups_ which are divided into a |
| number of _local workgroups_ with a size that can: be set by assigning a |
| value to the code:LocalSize |
| ifdef::VK_KHR_maintenance4[or code:LocalSizeId] |
| execution mode or via an object decorated by the code:WorkgroupSize |
| decoration. |
| An invocation within a local workgroup can: share data with other members of |
| the local workgroup through shared variables and issue memory and control |
| flow barriers to synchronize with other members of the local workgroup. |
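
The relationship between the local workgroup size and the number of workgroups to dispatch can be sketched in one dimension as follows.
The function names are hypothetical, and the example assumes a non-zero local size and a product that fits in 32 bits.
The second function mirrors how a global invocation index is formed from the workgroup index, the local size, and the local invocation index.

```c
#include <stdint.h>

/* Number of local workgroups needed to cover item_count work items.
 * Rounds up so that a partially filled last workgroup is still
 * dispatched. Assumes local_size > 0. */
uint32_t workgroup_count(uint32_t item_count, uint32_t local_size)
{
    return item_count / local_size + (item_count % local_size != 0);
}

/* One-dimensional global index of an invocation:
 * workgroup index * local size + index within the workgroup. */
uint32_t global_invocation_index(uint32_t workgroup_id,
                                 uint32_t local_size,
                                 uint32_t local_invocation_id)
{
    return workgroup_id * local_size + local_invocation_id;
}
```

For 1000 work items and a local size of 64, 16 workgroups are needed; the last workgroup then contains invocations whose global index is 1000 or greater, which the shader would normally skip.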
| |
| |
| ifdef::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[] |
| [[shaders-raytracing-shaders]] |
| [[shaders-ray-generation]] |
| == Ray Generation Shaders |
| |
| A ray generation shader is similar to a compute shader. |
| Its main purpose is to execute ray tracing queries using code:OpTraceRayKHR |
| instructions and process the results. |
| |
| |
| [[shaders-ray-generation-execution]] |
| === Ray Generation Shader Execution |
| |
| One ray generation shader is executed per ray tracing dispatch. |
| Its location in the shader binding table (see <<shader-binding-table,Shader |
| Binding Table>> for details) is passed directly into fname:vkCmdTraceRaysKHR |
| using the pname:pRaygenShaderBindingTable parameter or into |
| fname:vkCmdTraceRaysNV using the pname:raygenShaderBindingTableBuffer and |
| pname:raygenShaderBindingOffset parameters. |
| |
| |
| [[shaders-intersection]] |
| == Intersection Shaders |
| |
| Intersection shaders enable the implementation of arbitrary, application |
| defined geometric primitives. |
| An intersection shader for a primitive is executed whenever its axis-aligned |
| bounding box is hit by a ray. |
| |
| Like other ray tracing shader domains, an intersection shader operates on a |
| single ray at a time. |
| It also operates on a single primitive at a time. |
| It is therefore the purpose of an intersection shader to compute the |
| ray-primitive intersections and report them. |
| To report an intersection, the shader calls the code:OpReportIntersectionKHR |
| instruction. |
| |
| An intersection shader communicates with any-hit and closest hit shaders by |
| generating attribute values that they can: read. |
| Intersection shaders cannot: read or modify the ray payload. |
| |
| |
| [[shaders-intersection-execution]] |
| === Intersection Shader Execution |
| |
| The order in which intersections are found along a ray, and therefore the |
| order in which intersection shaders are executed, is unspecified. |
| |
| The intersection shader of the closest AABB which intersects the ray is |
| guaranteed to be executed at some point during traversal, unless the ray is |
| forcibly terminated. |
| |
| |
| [[shaders-any-hit]] |
| == Any-Hit Shaders |
| |
| The any-hit shader is executed after the intersection shader reports an |
| intersection that lies within the current [eq]#[t~min~,t~max~]# of the ray. |
| The main use of any-hit shaders is to programmatically decide whether or not |
| an intersection will be accepted. |
| The intersection will be accepted unless the shader calls the |
| code:OpIgnoreIntersectionKHR instruction. |
| Any-hit shaders have read-only access to the attributes generated by the |
| corresponding intersection shader, and can: read or modify the ray payload. |
| |
| |
| [[shaders-any-hit-execution]] |
| === Any-Hit Shader Execution |
| |
| The order in which intersections are found along a ray, and therefore the |
| order in which any-hit shaders are executed, is unspecified. |
| |
| The any-hit shader of the closest hit is guaranteed to be executed at some |
| point during traversal, unless the ray is forcibly terminated. |
| |
| |
| [[shaders-closest-hit]] |
| == Closest Hit Shaders |
| |
| Closest hit shaders have read-only access to the attributes generated by the |
| corresponding intersection shader, and can: read or modify the ray payload. |
| They also have access to a number of system-generated values. |
| Closest hit shaders can: call code:OpTraceRayKHR to recursively trace rays. |
| |
| |
| [[shaders-closest-hit-execution]] |
| === Closest Hit Shader Execution |
| |
| Exactly one closest hit shader is executed when traversal is finished and an |
| intersection has been found and accepted. |
| |
| |
| [[shaders-miss]] |
| == Miss Shaders |
| |
| Miss shaders can: access the ray payload and can: trace new rays through the |
| code:OpTraceRayKHR instruction, but cannot: access attributes since they are |
| not associated with an intersection. |
| |
| |
| [[shaders-miss-execution]] |
| === Miss Shader Execution |
| |
| A miss shader is executed instead of a closest hit shader if no intersection |
| was found during traversal. |
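| |
| The relationship between the any-hit, closest hit, and miss stages described |
| above can be sketched in Python as a rough, non-normative model. |
| The function name `trace` and the `(t, accepts)` candidate tuples are |
| hypothetical, and the sketch assumes that an accepted intersection shrinks |
| [eq]#t~max~# to its distance; ray flags and forced termination are ignored. |
| |
```python
import random


def trace(candidates, t_min, t_max, seed=0):
    """Model one ray and return ("closest_hit", t) or ("miss", None).

    candidates is a list of (t, any_hit_accepts) pairs, one per reported
    intersection. This is a hypothetical representation, not a Vulkan type.
    """
    order = list(candidates)
    # The order in which intersections are found is unspecified,
    # so the model visits them in a shuffled order.
    random.Random(seed).shuffle(order)
    closest = None
    for t, any_hit_accepts in order:
        # The any-hit stage only runs for intersections inside the
        # current [t_min, t_max] interval.
        if not (t_min <= t <= t_max):
            continue
        if any_hit_accepts:
            # Assumption: accepting a hit shrinks the interval.
            closest = t
            t_max = t
    if closest is None:
        return ("miss", None)      # miss stage runs instead of closest hit
    return ("closest_hit", closest)  # exactly one closest hit execution
```
| |
| In this simplified model the reported result does not depend on the |
| visiting order, even though the set of any-hit executions does. |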
| |
| |
| [[shaders-callable]] |
| == Callable Shaders |
| |
| Callable shaders can: access a callable payload that works similarly to ray |
| payloads to do subroutine work. |
| |
| |
| [[shaders-callable-execution]] |
| === Callable Shader Execution |
| |
| A callable shader is executed by calling code:OpExecuteCallableKHR from an |
| allowed shader stage. |
| |
| endif::VK_NV_ray_tracing,VK_KHR_ray_tracing_pipeline[] |
| |
| |
| [[shaders-interpolation-decorations]] |
| == Interpolation Decorations |
| |
| Interpolation decorations control the behavior of attribute interpolation in |
| the fragment shader stage. |
| Interpolation decorations can: be applied to code:Input storage class |
| variables in the fragment shader stage's interface, and control the |
| interpolation behavior of those variables. |
| |
| Inputs that could be interpolated can: be decorated by at most one of the |
| following decorations: |
| |
| * code:Flat: no interpolation |
| * code:NoPerspective: linear interpolation (for |
| <<line_linear_interpolation,lines>> and |
| <<triangle_linear_interpolation,polygons>>) |
| ifdef::VK_NV_fragment_shader_barycentric[] |
| * code:PerVertexNV: values fetched from shader-specified primitive vertex |
| endif::VK_NV_fragment_shader_barycentric[] |
| |
| Fragment input variables decorated with neither code:Flat nor |
| code:NoPerspective use perspective-correct interpolation (for |
| <<line_perspective_interpolation,lines>> and |
| <<triangle_perspective_interpolation,polygons>>). |
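| |
| As a non-normative illustration, the three behaviors can be contrasted for a |
| line segment with a small Python sketch. |
| The function names are hypothetical; `t` is the linear blend factor between |
| the endpoints, `w_a` and `w_b` are the endpoints' clip-space w coordinates, |
| and the formulas are assumed to match those in the linked line interpolation |
| sections. |
| |
```python
def interp_flat(f_provoking):
    """Flat: no interpolation, the provoking vertex value is used as-is."""
    return f_provoking


def interp_noperspective(t, f_a, f_b):
    """NoPerspective: plain linear blend of the endpoint values."""
    return (1.0 - t) * f_a + t * f_b


def interp_perspective(t, f_a, f_b, w_a, w_b):
    """Undecorated: perspective-correct blend, weighting each endpoint by 1/w."""
    numerator = (1.0 - t) * f_a / w_a + t * f_b / w_b
    denominator = (1.0 - t) / w_a + t / w_b
    return numerator / denominator
```
| |
| When both endpoints have the same w coordinate the two interpolating |
| functions agree; they diverge as the w coordinates differ. |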
| |
| The presence of and type of interpolation is controlled by the above |
| interpolation decorations as well as the auxiliary decorations code:Centroid |
| and code:Sample. |
| |
| A variable decorated with code:Flat will not be interpolated. |
| Instead, it will have the same value for every fragment within a triangle. |
| This value will come from a single <<vertexpostproc-flatshading,provoking |
| vertex>>. |
| A variable decorated with code:Flat can: also be decorated with |
| code:Centroid or code:Sample, which will mean the same thing as decorating |
| it only as code:Flat. |
| |
| For fragment shader input variables decorated with neither code:Centroid nor |
| code:Sample, the assigned variable may: be interpolated anywhere within the |
| fragment and a single value may: be assigned to each sample within the |
| fragment. |
| |
| If a fragment shader input is decorated with code:Centroid, a single value |
| may: be assigned to that variable for all samples in the fragment, but that |
| value must: be interpolated to a location that lies in both the fragment and |
| in the primitive being rendered, including any of the fragment's samples |
| covered by the primitive. |
| Because the location at which the variable is interpolated may: be different |
| in neighboring fragments, and derivatives may: be computed by computing |
| differences between neighboring fragments, derivatives of centroid-sampled |
| inputs may: be less accurate than those for non-centroid interpolated |
| variables. |
| ifdef::VK_EXT_post_depth_coverage[] |
| The code:PostDepthCoverage execution mode does not affect the determination |
| of the centroid location. |
| endif::VK_EXT_post_depth_coverage[] |
| |
| If a fragment shader input is decorated with code:Sample, a separate value |
| must: be assigned to that variable for each covered sample in the fragment, |
| and that value must: be sampled at the location of the individual sample. |
| When pname:rasterizationSamples is ename:VK_SAMPLE_COUNT_1_BIT, the fragment |
| center must: be used for code:Centroid, code:Sample, and undecorated |
| attribute interpolation. |
| |
| Fragment shader inputs that are signed or unsigned integers, integer |
| vectors, or any double-precision floating-point type must: be decorated with |
| code:Flat. |
| |
| ifdef::VK_AMD_shader_explicit_vertex_parameter[] |
| When the `apiext:VK_AMD_shader_explicit_vertex_parameter` device extension |
| is enabled, inputs can: also be decorated with the code:CustomInterpAMD |
| interpolation decoration, including fragment shader inputs that are signed |
| or unsigned integers, integer vectors, or any double-precision |
| floating-point type. |
| Inputs decorated with code:CustomInterpAMD can: only be accessed by the |
| extended instruction code:InterpolateAtVertexAMD, which allows accessing the |
| value of the input for individual vertices of the primitive. |
| endif::VK_AMD_shader_explicit_vertex_parameter[] |
| |
| ifdef::VK_NV_fragment_shader_barycentric[] |
| [[shaders-interpolation-decorations-pervertexnv]] |
| When the pname:fragmentShaderBarycentric feature is enabled, inputs can: also |
| be decorated with the code:PerVertexNV interpolation decoration, including |
| fragment shader inputs that are signed or unsigned integers, integer |
| vectors, or any double-precision floating-point type. |
| Inputs decorated with code:PerVertexNV can: only be accessed using an extra |
| array dimension, where the extra index identifies one of the vertices of the |
| primitive that produced the fragment. |
| endif::VK_NV_fragment_shader_barycentric[] |
| |
| |
| [[shaders-staticuse]] |
| == Static Use |
| |
| A SPIR-V module declares a global object in memory using the code:OpVariable |
| instruction, which results in a pointer code:x to that object. |
| A specific entry point in a SPIR-V module is said to _statically use_ that |
| object if that entry point's call tree contains a function containing a |
| memory instruction or image instruction with code:x as an code:id operand. |
| See the "`Memory Instructions`" and "`Image Instructions`" subsections of |
| section 3 "`Binary Form`" of the SPIR-V specification for the complete list |
| of SPIR-V memory and image instructions. |
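| |
| The definition amounts to a reachability check over the entry point's call |
| tree, which the following Python sketch illustrates. |
| The function `statically_uses` and the dictionary-based module description |
| are hypothetical stand-ins for a parsed SPIR-V module. |
| |
```python
def statically_uses(entry, calls, uses, var):
    """Return True if entry point `entry` statically uses variable `var`.

    calls maps a function name to the functions it calls.
    uses maps a function name to the set of variable ids that appear as
    operands of memory or image instructions in that function.
    """
    seen = set()
    stack = [entry]
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue  # already visited, also guards against recursion
        seen.add(fn)
        if var in uses.get(fn, set()):
            return True
        stack.extend(calls.get(fn, []))
    return False
```
| |
| The check is purely structural: whether the instruction is ever reached at |
| run time plays no part in it. |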
| |
| Static use is not used to control the behavior of variables with code:Input |
| and code:Output storage. |
| The effects of those variables are applied based only on whether they are |
| present in a shader entry point's interface. |
| |
| |
| [[shaders-scope]] |
| == Scope |
| |
| A _scope_ describes a set of shader invocations, where each such set is a |
| _scope instance_. |
| Each invocation belongs to one or more scope instances, but belongs to no |
| more than one scope instance for each scope. |
| |
| The operations available between invocations in a given scope instance vary, |
| with smaller scopes generally able to perform more operations, and with |
| greater efficiency. |
| |
| |
| [[shaders-scope-cross-device]] |
| === Cross Device |
| |
| All invocations executed in a Vulkan instance fall into a single _cross |
| device scope instance_. |
| |
| Whilst the code:CrossDevice scope is defined in SPIR-V, it is disallowed in |
| Vulkan. |
| API <<synchronization, synchronization>> commands can: be used to |
| communicate between devices. |
| |
| |
| [[shaders-scope-device]] |
| === Device |
| |
| All invocations executed on a single device form a _device scope instance_. |
| |
| ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| If the <<features-vulkanMemoryModel,pname:vulkanMemoryModel>> and |
| <<features-vulkanMemoryModelDeviceScope, |
| pname:vulkanMemoryModelDeviceScope>> features are enabled, this scope is |
| represented in SPIR-V by the code:Device code:Scope, which can: be used as a |
| code:Memory code:Scope for barrier and atomic operations. |
| |
| ifdef::VK_KHR_shader_clock[] |
| If both the <<features-shaderDeviceClock, pname:shaderDeviceClock>> and |
| <<features-vulkanMemoryModelDeviceScope, |
| pname:vulkanMemoryModelDeviceScope>> features are enabled, using the |
| code:Device code:Scope with the code:OpReadClockKHR instruction will read |
| from a clock that is consistent across invocations in the same device scope |
| instance. |
| endif::VK_KHR_shader_clock[] |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| There is no method to synchronize the execution of these invocations within |
| SPIR-V, and this can: only be done with API synchronization primitives. |
| |
| ifdef::VK_VERSION_1_1,VK_KHR_device_group[] |
| Invocations executing on different devices in a device group operate in |
| separate device scope instances. |
| endif::VK_VERSION_1_1,VK_KHR_device_group[] |
| |
| ifndef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| The scope only extends to the queue family, not the whole device. |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| |
| |
| [[shaders-scope-queue-family]] |
| === Queue Family |
| |
| Invocations executed by queues in a given queue family form a _queue family |
| scope instance_. |
| |
| This scope is identified in SPIR-V as the |
| ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| code:QueueFamily code:Scope if the |
| <<features-vulkanMemoryModel,pname:vulkanMemoryModel>> feature is enabled, |
| or if not, the |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| code:Device code:Scope, which can: be used as a code:Memory code:Scope for |
| barrier and atomic operations. |
| |
| ifdef::VK_KHR_shader_clock[] |
| If the <<features-shaderDeviceClock, pname:shaderDeviceClock>> feature is |
| enabled, |
| ifdef::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| but the <<features-vulkanMemoryModelDeviceScope, |
| pname:vulkanMemoryModelDeviceScope>> feature is not enabled, |
| endif::VK_VERSION_1_2,VK_KHR_vulkan_memory_model[] |
| using the code:Device code:Scope with the code:OpReadClockKHR instruction |
| will read from a clock that is consistent across invocations in the same |
| queue family scope instance. |
| endif::VK_KHR_shader_clock[] |
| |
| There is no method to synchronize the execution of these invocations within |
| SPIR-V, and this can: only be done with API synchronization primitives. |
| |
| Each invocation in a queue family scope instance must: be in the same |
| <<shaders-scope-device, device scope instance>>. |
| |
| |
| [[shaders-scope-command]] |
| === Command |
| |
| Any shader invocations executed as the result of a single command such as |
| flink:vkCmdDispatch or flink:vkCmdDraw form a _command scope instance_. |
| For indirect drawing commands with pname:drawCount greater than one, |
| invocations from separate draws are in separate command scope instances. |
| ifdef::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] |
| For ray tracing shaders, an invocation group is an implementation-dependent |
| subset of the set of shader invocations of a given shader stage which are |
| produced by a single trace rays command. |
| endif::VK_KHR_ray_tracing_pipeline,VK_NV_ray_tracing[] |
| |
| There is no specific code:Scope for communication across invocations in a |
| command scope instance. |
| As this has a clear boundary at the API level, coordination here can: be |
| performed in the API, rather than in SPIR-V. |
| |
| Each invocation in a command scope instance must: be in the same |
| <<shaders-scope-queue-family, queue-family scope instance>>. |
| |
| For shaders without defined <<shaders-scope-workgroup, workgroups>>, this |
| set of invocations forms an _invocation group_ as defined in the |
| <<spirv-spec,SPIR-V specification>>. |
| |
| |
| [[shaders-scope-primitive]] |
| === Primitive |
| |
| Any fragment shader invocations executed as the result of rasterization of a |
| single primitive form a _primitive scope instance_. |
| |
| There is no specific code:Scope for communication across invocations in a |
| primitive scope instance. |
| |
| Any generated <<shaders-helper-invocations, helper invocations>> are |
| included in this scope instance. |
| |
| Each invocation in a primitive scope instance must: be in the same |
| <<shaders-scope-command, command scope instance>>. |
| |
| Any input variables decorated with code:Flat are uniform within a primitive |
| scope instance. |
| |
| |
| // intentionally no VK_NV_ray_tracing here since this scope does not exist there |
| ifdef::VK_KHR_ray_tracing_pipeline[] |
| [[shaders-scope-shadercall]] |
| === Shader Call |
| |
| Any <<shader-call-related,shader-call-related>> invocations that are |
| executed in one or more ray tracing execution models form a _shader call |
| scope instance_. |
| |
| The code:ShaderCallKHR code:Scope can: be used as a code:Memory code:Scope for |
| barrier and atomic operations. |
| |
| Each invocation in a shader call scope instance must: be in the same |
| <<shaders-scope-queue-family, queue family scope instance>>. |
| endif::VK_KHR_ray_tracing_pipeline[] |
| |
| |
| [[shaders-scope-workgroup]] |
| === Workgroup |
| |
| A _local workgroup_ is a set of invocations that can synchronize and share |
| data with each other using memory in the code:Workgroup storage class. |
| |
| The code:Workgroup code:Scope can be used as both an code:Execution |
| code:Scope and code:Memory code:Scope for barrier and atomic operations. |
| |
| Each invocation in a local workgroup must: be in the same |
| <<shaders-scope-command, command scope instance>>. |
| |
| Only |
| ifdef::VK_NV_mesh_shader[] |
| task, mesh, and |
| endif::VK_NV_mesh_shader[] |
| compute shaders have defined workgroups - other shader types cannot: use |
| workgroup functionality. |
| For shaders that have defined workgroups, this set of invocations forms an |
| _invocation group_ as defined in the <<spirv-spec,SPIR-V specification>>. |
| |
| |
| ifdef::VK_VERSION_1_1[] |
| [[shaders-scope-subgroup]] |
| === Subgroup |
| |
| A _subgroup_ (see the subsection "`Control Flow`" of section 2 of the SPIR-V |
| 1.3 Revision 1 specification) is a set of invocations that can synchronize |
| and share data with each other efficiently. |
| |
| The code:Subgroup code:Scope can be used as both an code:Execution |
| code:Scope and code:Memory code:Scope for barrier and atomic operations. |
| Other <<VkSubgroupFeatureFlagBits, subgroup features>> allow the use of |
| <<shaders-group-operations, group operations>> with subgroup scope. |
| |
| ifdef::VK_KHR_shader_clock[] |
| If the <<features-shaderSubgroupClock, pname:shaderSubgroupClock>> feature |
| is enabled, using the code:Subgroup code:Scope with the code:OpReadClockKHR |
| instruction will read from a clock that is consistent across invocations in |
| the same subgroup. |
| endif::VK_KHR_shader_clock[] |
| |
| For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each |
| invocation in a subgroup must: be in the same <<shaders-scope-workgroup, |
| local workgroup>>. |
| |
| In other shader stages, each invocation in a subgroup must: be in the same |
| <<shaders-scope-device, device scope instance>>. |
| |
| Only <<limits-subgroup-supportedStages, shader stages that support subgroup |
| operations>> have defined subgroups. |
| endif::VK_VERSION_1_1[] |
| |
| |
| [[shaders-scope-quad]] |
| === Quad |
| |
| A _quad scope instance_ is formed of four shader invocations. |
| |
| In a fragment shader, each quad scope instance is formed of invocations in |
| neighboring framebuffer locations [eq]#(x~i~, y~i~)#, where: |
| |
| * [eq]#i# is the index of the invocation within the scope instance. |
| * [eq]#w# and [eq]#h# are the number of pixels the fragment covers in the |
| [eq]#x# and [eq]#y# axes. |
| * [eq]#w# and [eq]#h# are identical for all participating invocations. |
| * [eq]#(x~0~) = (x~1~ - w) = (x~2~) = (x~3~ - w)# |
| * [eq]#(y~0~) = (y~1~) = (y~2~ - h) = (y~3~ - h)# |
| * Each invocation has the same layer and sample indices. |
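| |
| Solving the constraints above for the four locations gives the layout in the |
| following non-normative Python sketch. |
| The function name `quad_locations` is hypothetical; `x0` and `y0` are the |
| location of invocation 0. |
| |
```python
def quad_locations(x0, y0, w=1, h=1):
    """Return the framebuffer locations (x_i, y_i) for i = 0..3.

    Bit 0 of the index selects the column (offset w in x) and
    bit 1 selects the row (offset h in y).
    """
    return [(x0 + (i & 1) * w, y0 + (i >> 1) * h) for i in range(4)]
```
| |
| For single-pixel fragments, `quad_locations(4, 6)` yields a 2x2 block of |
| adjacent pixels; for 2x2-pixel fragments the locations are two pixels apart. |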
| |
| ifdef::VK_NV_compute_shader_derivatives[] |
| In a compute shader, if the code:DerivativeGroupQuadsNV execution mode is |
| specified, each quad scope instance is formed of invocations with adjacent |
| local invocation IDs [eq]#(x~i~, y~i~)#, where: |
| |
| * [eq]#i# is the index of the invocation within the quad scope instance. |
| * [eq]#(x~0~) = (x~1~ - 1) = (x~2~) = (x~3~ - 1)# |
| * [eq]#(y~0~) = (y~1~) = (y~2~ - 1) = (y~3~ - 1)# |
| * [eq]#x~0~# and [eq]#y~0~# are integer multiples of 2. |
| * Each invocation has the same [eq]#z# coordinate. |
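| |
| Under these constraints, an invocation's quad and its index within the quad |
| follow from the low bits of its local invocation ID, as in this |
| non-normative Python sketch. |
| The function names are hypothetical. |
| |
```python
def quad_index_2d(x, y):
    """Index i of local invocation (x, y, z) within its quad."""
    return (x & 1) + 2 * (y & 1)


def quad_members_2d(x, y, z):
    """Local invocation IDs of the quad containing (x, y, z), by index."""
    base_x = x - (x & 1)  # x0 is an integer multiple of 2
    base_y = y - (y & 1)  # y0 is an integer multiple of 2
    return [(base_x + (i & 1), base_y + (i >> 1), z) for i in range(4)]
```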
| |
| In a compute shader, if the code:DerivativeGroupLinearNV execution mode is |
| specified, each quad scope instance is formed of invocations with adjacent |
| local invocation indices [eq]#(l~i~)#, where: |
| |
| * [eq]#i# is the index of the invocation within the quad scope instance. |
| * [eq]#(l~0~) = (l~1~ - 1) = (l~2~ - 2) = (l~3~ - 3)# |
| * [eq]#l~0~# is an integer multiple of 4. |
| |
| endif::VK_NV_compute_shader_derivatives[] |
| |
| ifdef::VK_VERSION_1_1[] |
| In all shaders, each quad scope instance is formed of invocations in |
| adjacent subgroup invocation indices [eq]#(s~i~)#, where: |
| |
| * [eq]#i# is the index of the invocation within the quad scope instance. |
| * [eq]#(s~0~) = (s~1~ - 1) = (s~2~ - 2) = (s~3~ - 3)# |
| * [eq]#s~0~# is an integer multiple of 4. |
| |
| Each invocation in a quad scope instance must: be in the same |
| <<shaders-scope-subgroup, subgroup>>. |
| endif::VK_VERSION_1_1[] |
| |
| ifndef::VK_VERSION_1_1[] |
| The specific set of invocations that make up a quad scope instance in other |
| shader stages is undefined:. |
| endif::VK_VERSION_1_1[] |
| |
| In a fragment shader, each invocation in a quad scope instance must: be in |
| the same <<shaders-scope-primitive, primitive scope instance>>. |
| |
| ifndef::VK_VERSION_1_1[] |
| For <<shaders-scope-workgroup, shaders that have defined workgroups>>, each |
| invocation in a quad scope instance must: be in the same |
| <<shaders-scope-workgroup, local workgroup>>. |
| |
| In other shader stages, each invocation in a quad scope instance must: be in |
| the same <<shaders-scope-device, device scope instance>>. |
| endif::VK_VERSION_1_1[] |
| |
| Fragment |
| ifdef::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[] |
| and compute |
| endif::VK_NV_compute_shader_derivatives,VK_VERSION_1_1[] |
| shaders have defined quad scope instances. |
| ifdef::VK_VERSION_1_1[] |
| If the <<limits-subgroup-quadOperationsInAllStages, |
| pname:quadOperationsInAllStages>> limit is supported, any |
| <<limits-subgroup-supportedStages, shader stages that support subgroup |
| operations>> also have defined quad scope instances. |
| endif::VK_VERSION_1_1[] |
| |
| |
| ifdef::VK_EXT_fragment_shader_interlock[] |
| [[shaders-scope-fragment-interlock]] |
| === Fragment Interlock |
| |
| A _fragment interlock scope instance_ is formed of fragment shader |
| invocations based on their framebuffer locations [eq]#(x,y,layer,sample)#, |
| executed by commands inside a single <<renderpass,subpass>>. |
| |
| The specific set of invocations included varies based on the execution mode |
| as follows: |
| |
| * If the code:SampleInterlockOrderedEXT or |
| code:SampleInterlockUnorderedEXT execution modes are used, only |
| invocations with identical framebuffer locations |
| [eq]#(x,y,layer,sample)# are included. |
| * If the code:PixelInterlockOrderedEXT or code:PixelInterlockUnorderedEXT |
| execution modes are used, fragments with different sample ids are also |
| included. |
| ifdef::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[] |
| * If the code:ShadingRateInterlockOrderedEXT or |
| code:ShadingRateInterlockUnorderedEXT execution modes are used, |
| fragments from neighbouring framebuffer locations are also included, as |
| <<primsrast-shading-rate-image,determined by the shading rate>>. |
| endif::VK_NV_shading_rate_image,VK_KHR_fragment_shading_rate[] |
| |
| Only fragment shaders with one of the above execution modes have defined |
| fragment interlock scope instances. |
| |
| There is no specific code:Scope value for communication across invocations |
| in a fragment interlock scope instance. |
| However, this is implicitly used as a memory scope by |
| code:OpBeginInvocationInterlockEXT and code:OpEndInvocationInterlockEXT. |
| |
| Each invocation in a fragment interlock scope instance must: be in the same |
| <<shaders-scope-queue-family, queue family scope instance>>. |
| endif::VK_EXT_fragment_shader_interlock[] |
| |
| |
| [[shaders-scope-invocation]] |
| === Invocation |
| |
| The smallest _scope_ is a single invocation; this is represented by the |
| code:Invocation code:Scope in SPIR-V. |
| |
| Fragment shader invocations must: be in a <<shaders-scope-primitive, |
| primitive scope instance>>. |
| |
| ifdef::VK_EXT_fragment_shader_interlock[] |
| Invocations in <<shaders-scope-fragment-interlock, fragment shaders that |
| have a defined fragment interlock scope>> must: be in a |
| <<shaders-scope-fragment-interlock, fragment interlock scope instance>>. |
| endif::VK_EXT_fragment_shader_interlock[] |
| |
| Invocations in <<shaders-scope-workgroup, shaders that have defined |
| workgroups>> must: be in a <<shaders-scope-workgroup, local workgroup>>. |
| |
| ifdef::VK_VERSION_1_1[] |
| Invocations in <<shaders-scope-subgroup, shaders that have a defined |
| subgroup scope>> must: be in a <<shaders-scope-subgroup, subgroup>>. |
| endif::VK_VERSION_1_1[] |
| |
| Invocations in <<shaders-scope-quad, shaders that have a defined quad |
| scope>> must: be in a <<shaders-scope-quad, quad scope instance>>. |
| |
| All invocations in all stages must: be in a <<shaders-scope-command,command |
| scope instance>>. |
| |
| |
| ifdef::VK_VERSION_1_1[] |
| [[shaders-group-operations]] |
| == Group Operations |
| |
| _Group operations_ are executed by multiple invocations within a |
| <<shaders-scope, scope instance>>, with each invocation involved in |
| calculating the result. |
| This provides a mechanism for efficient communication between invocations in |
| a particular scope instance. |
| |
| Group operations all take a code:Scope defining the desired |
| <<shaders-scope,scope instance>> to operate within. |
| Only the code:Subgroup scope can: be used for these operations; the |
| <<limits-subgroupSupportedOperations, pname:subgroupSupportedOperations>> |
| limit defines which types of operation can: be used. |
| |
| |
| [[shaders-group-operations-basic]] |
| === Basic Group Operations |
| |
| Basic group operations include the use of code:OpGroupNonUniformElect, |
| code:OpControlBarrier, code:OpMemoryBarrier, and atomic operations. |
| |
| code:OpGroupNonUniformElect can: be used to choose a single invocation to |
| perform a task for the whole group. |
| Only the invocation with the lowest id in the group will return code:true. |
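| |
| A minimal Python model of this behavior, with the hypothetical function |
| `elect` taking the ids of the invocations that participate: |
| |
```python
def elect(participating_ids):
    """Map each participating invocation id to its elect result."""
    lowest = min(participating_ids)
    return {i: i == lowest for i in participating_ids}
```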
| |
| The <<memory-model,Memory Model>> appendix defines the operation of barriers |
| and atomics. |
| |
| |
| [[shaders-group-operations-vote]] |
| === Vote Group Operations |
| |
| The vote group operations allow invocations within a group to compare values |
| across a group. |
| The types of votes enabled are: |
| |
| * Do all active group invocations agree that an expression is true? |
| * Do any active group invocations evaluate an expression to true? |
| * Do all active group invocations have the same value of an expression? |
| |
| [NOTE] |
| .Note |
| ==== |
| These operations are useful in combination with control flow in that they |
| allow developers to check whether conditions match across the group and |
| choose potentially faster code-paths in these cases. |
| ==== |
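| |
| The three vote types can be modelled in Python as follows. |
| This is a non-normative sketch: the function names are hypothetical, and |
| `values` is assumed to be a non-empty list holding one value per active |
| invocation. |
| |
```python
def vote_all(values):
    """True if the expression is true in every active invocation."""
    return all(values)


def vote_any(values):
    """True if the expression is true in at least one active invocation."""
    return any(values)


def vote_all_equal(values):
    """True if every active invocation has the same value."""
    return all(v == values[0] for v in values)
```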
| |
| |
| [[shaders-group-operations-arithmetic]] |
| === Arithmetic Group Operations |
| |
| The arithmetic group operations allow invocations to perform scans and |
| reductions across a group. |
| The operators supported are add, mul, min, max, and, or, xor. |
| |
| For reductions, every invocation in a group will obtain the cumulative |
| result of these operators applied to all values in the group. |
| For exclusive scans, each invocation in a group will obtain the cumulative |
| result of these operators applied to all values in invocations with a lower |
| index in the group. |
| Inclusive scans are identical to exclusive scans, except the cumulative |
| result includes the operator applied to the value in the current invocation. |
| |
| The order in which these operators are applied is implementation-dependent. |
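| |
| The difference between a reduction and the two kinds of scan can be seen in |
| the following non-normative Python sketch. |
| The function names are hypothetical, and the sketch assumes that the first |
| invocation of an exclusive scan receives the operator's identity value. |
| It applies the operator strictly left to right, which is only one of the |
| orders an implementation may choose. |
| |
```python
from functools import reduce


def group_reduce(op, values):
    """Every invocation obtains op applied over all values in the group."""
    total = reduce(op, values)
    return [total] * len(values)


def inclusive_scan(op, values):
    """Invocation i obtains op over the values of invocations 0..i."""
    out = []
    for i, v in enumerate(values):
        out.append(v if i == 0 else op(out[-1], v))
    return out


def exclusive_scan(op, identity, values):
    """Invocation i obtains op over the values of invocations 0..i-1."""
    out = []
    acc = identity  # assumed result for the lowest-indexed invocation
    for v in values:
        out.append(acc)
        acc = op(acc, v)
    return out
```
| |
| Because the order is implementation-dependent, floating-point results are |
| not guaranteed to match a left-to-right evaluation such as this one. |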
| |
| |
| [[shaders-group-operations-ballot]] |
| === Ballot Group Operations |
| |
| The ballot group operations allow invocations to perform more complex votes |
| across the group. |
| The ballot functionality allows all invocations within a group to provide a |
| boolean value and receive as a result the boolean values provided by every |
| invocation in the group. |
| The broadcast functionality allows values to be broadcast from an invocation |
| to all other invocations within the group. |
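| |
| Both behaviors are illustrated by the following non-normative Python sketch. |
| The function names are hypothetical, and the ballot result is modelled as a |
| plain integer bitmask in which bit [eq]#i# holds the value provided by |
| invocation [eq]#i#, rather than as the vector type used by SPIR-V. |
| |
```python
def ballot(predicates):
    """Every invocation obtains a bitmask of all provided booleans."""
    mask = 0
    for index, predicate in enumerate(predicates):
        if predicate:
            mask |= 1 << index
    return [mask] * len(predicates)


def broadcast(values, source_index):
    """Every invocation obtains the value held by invocation source_index."""
    return [values[source_index]] * len(values)
```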
| |
| |
| [[shaders-group-operations-shuffle]] |
| === Shuffle Group Operations |
| |
| The shuffle group operations allow invocations to read values from other |
| invocations within a group. |
| |
| |
| [[shaders-group-operations-shuffle-relative]] |
| === Shuffle Relative Group Operations |
| |
| The shuffle relative group operations allow invocations to read values from |
| other invocations within the group relative to the current invocation in the |
| group. |
| The relative operations supported allow data to be shifted up and down |
| through the invocations within a group. |
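| |
| Shifting up and down can be modelled with the following non-normative Python |
| sketch. |
| The function names are hypothetical, and `None` marks results whose source |
| index falls outside the group, which this sketch assumes to be undefined. |
| |
```python
def shuffle_up(values, delta):
    """Invocation i reads the value of invocation i - delta."""
    n = len(values)
    return [values[i - delta] if i - delta >= 0 else None for i in range(n)]


def shuffle_down(values, delta):
    """Invocation i reads the value of invocation i + delta."""
    n = len(values)
    return [values[i + delta] if i + delta < n else None for i in range(n)]
```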
| |
| |
| [[shaders-group-operations-clustered]] |
| === Clustered Group Operations |
| |
| The clustered group operations allow invocations to perform an operation |
| among partitions of a group, such that the operation is only performed |
| among the invocations within a partition. |
| The partitions for clustered group operations are consecutive power-of-two |
| size groups of invocations and the cluster size must: be known at pipeline |
| creation time. |
| The operations supported are add, mul, min, max, and, or, xor. |
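| |
| A clustered reduction can be modelled with the following non-normative |
| Python sketch, in which the hypothetical function `clustered_reduce` splits |
| the group into consecutive clusters and reduces each one independently. |
| |
```python
from functools import reduce


def clustered_reduce(op, values, cluster_size):
    """Reduce each consecutive cluster and give the result to its members."""
    if cluster_size <= 0 or cluster_size & (cluster_size - 1):
        raise ValueError("cluster_size must be a power of two")
    out = []
    for start in range(0, len(values), cluster_size):
        cluster = values[start:start + cluster_size]
        out.extend([reduce(op, cluster)] * len(cluster))
    return out
```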
| |
| |
| [[shaders-quad-operations]] |
| == Quad Group Operations |
| |
| Quad group operations (code:OpGroupNonUniformQuad*) are a specialized type |
| of <<shaders-group-operations, group operations>> that only operate on |
| <<shaders-scope-quad, quad scope instances>>. |
| Whilst these instructions do include a code:Scope parameter, this scope is |
| always overridden; only the <<shaders-scope-quad, quad scope instance>> is |
| included in its execution scope. |
| |
| Fragment shaders that statically execute quad group operations must: launch |
| sufficient invocations to ensure their correct operation; additional |
| <<shaders-helper-invocations, helper invocations>> are launched for |
| framebuffer locations not covered by rasterized fragments if necessary. |
| |
| The index used to select participating invocations is [eq]#i#, as described |
| for a <<shaders-scope-quad, quad scope instance>>, defined as the _quad |
| index_ in the <<spirv-spec,SPIR-V specification>>. |
| |
| For code:OpGroupNonUniformQuadBroadcast this value is equal to code:Index. |
| For code:OpGroupNonUniformQuadSwap, it is equal to the implicit code:Index |
| used by each participating invocation. |
| endif::VK_VERSION_1_1[] |
| |
| |
| [[shaders-derivative-operations]] |
| == Derivative Operations |
| |
| Derivative operations calculate the partial derivative for an expression |
| [eq]#P# as a function of an invocation's [eq]#x# and [eq]#y# coordinates. |
| |
| Derivative operations operate on a set of invocations known as a _derivative |
| group_ as defined in the <<spirv-spec,SPIR-V specification>>. |
| A derivative group is equivalent to |
| ifdef::VK_NV_compute_shader_derivatives[] |
| the <<shaders-scope-quad, quad scope instance>> for a compute shader |
| invocation, or |
| endif::VK_NV_compute_shader_derivatives[] |
| the <<shaders-scope-primitive, primitive scope instance>> for a fragment |
| shader invocation. |
| |
| Derivatives are calculated assuming that [eq]#P# is piecewise linear and |
| continuous within the derivative group. |
| All dynamic instances of explicit derivative instructions (code:OpDPdx*, |
| code:OpDPdy*, and code:OpFwidth*) must: be executed in control flow that is |
| uniform within a derivative group. |
| For other derivative operations, results are undefined: if a dynamic |
| instance is executed in control flow that is not uniform within the |
| derivative group. |
| |
| Fragment shaders that statically execute derivative operations must: launch |
| sufficient invocations to ensure their correct operation; additional |
| <<shaders-helper-invocations, helper invocations>> are launched for |
| framebuffer locations not covered by rasterized fragments if necessary. |
| |
| ifdef::VK_NV_compute_shader_derivatives[] |
| [NOTE] |
| .Note |
| ==== |
| In a compute shader, it is the application's responsibility to ensure that |
| sufficient invocations are launched. |
| ==== |
| endif::VK_NV_compute_shader_derivatives[] |
| |
| Derivative operations calculate their results as the difference between the |
| results of [eq]#P# across invocations in the quad. |
| For fine derivative operations (code:OpDPdxFine and code:OpDPdyFine), the |
| values of [eq]#DPdx(P~i~)# are calculated as |
| |
| {empty}:: [eq]#DPdx(P~0~) = DPdx(P~1~) = P~1~ - P~0~# |
| {empty}:: [eq]#DPdx(P~2~) = DPdx(P~3~) = P~3~ - P~2~# |
| |
| and the values of [eq]#DPdy(P~i~)# are calculated as |
| |
| {empty}:: [eq]#DPdy(P~0~) = DPdy(P~2~) = P~2~ - P~0~# |
| {empty}:: [eq]#DPdy(P~1~) = DPdy(P~3~) = P~3~ - P~1~# |
| |
| where [eq]#i# is the index of each invocation as described in |
| <<shaders-scope-quad>>. |
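| |
| The fine derivative equations above translate directly into the following |
| non-normative Python sketch, where `p` is a list holding the value of |
| [eq]#P# for quad indices 0 to 3 and the function names are hypothetical. |
| |
```python
def dpdx_fine(p):
    """Fine x derivative for each of the four invocations in a quad."""
    top = p[1] - p[0]      # shared by invocations 0 and 1
    bottom = p[3] - p[2]   # shared by invocations 2 and 3
    return [top, top, bottom, bottom]


def dpdy_fine(p):
    """Fine y derivative for each of the four invocations in a quad."""
    left = p[2] - p[0]     # shared by invocations 0 and 2
    right = p[3] - p[1]    # shared by invocations 1 and 3
    return [left, right, left, right]
```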
| |
| Coarse derivative operations (code:OpDPdxCoarse and code:OpDPdyCoarse) |
| calculate their results in roughly the same manner, but may: only calculate |
| two values instead of four (one for each of [eq]#DPdx# and [eq]#DPdy#), |
| reusing the same result no matter the originating invocation. |
| If an implementation does this, it should: use the fine derivative |
| calculations described for [eq]#P~0~#. |
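| |
| For an implementation that does reuse a single result per direction, the |
| recommended behavior can be modelled with this non-normative Python sketch. |
| The function names are hypothetical, and `p` holds the value of [eq]#P# for |
| quad indices 0 to 3. |
| |
```python
def dpdx_coarse(p):
    """One x derivative, taken as for P0, reused by all four invocations."""
    return [p[1] - p[0]] * 4


def dpdy_coarse(p):
    """One y derivative, taken as for P0, reused by all four invocations."""
    return [p[2] - p[0]] * 4
```
| |
| An implementation is equally free to return the four fine results for the |
| coarse instructions, so applications cannot rely on either outcome. |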
| |
| [NOTE] |
| .Note |
| ==== |
| Derivative values are calculated between fragments rather than pixels. |
| If the fragment shader invocations involved in the calculation cover |
| multiple pixels, these operations cover a wider area, resulting in larger |
| derivative values. |
| This in turn will result in a coarser level of detail being selected for |
| image sampling operations using derivatives. |
| |
| Applications may want to account for this when using multi-pixel fragments; |
| if pixel derivatives are desired, applications should use explicit |
| derivative operations and divide the results by the size of the fragment in |
| each dimension as follows: |
| |
| {empty}:: [eq]#DPdx(P~n~)' = DPdx(P~n~) / w# |
| {empty}:: [eq]#DPdy(P~n~)' = DPdy(P~n~) / h# |
| |
| where [eq]#w# and [eq]#h# are the width and height, in pixels, of the |
| fragments in the quad, and [eq]#DPdx(P~n~)'# and [eq]#DPdy(P~n~)'# are the |
| pixel derivatives. |
| ==== |
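
The scaling described in the note above can be sketched in C as follows.
The helper name `pixel_derivative` and its parameters are hypothetical; in a shader the same division would be applied to the result of an explicit derivative instruction.

```c
#include <assert.h>

/* Converts a derivative measured between fragments into a per-pixel
 * derivative by dividing by the fragment extent (in pixels) along the
 * same axis: w for DPdx, h for DPdy. Hypothetical helper. */
float pixel_derivative(float fragment_derivative, unsigned fragment_extent)
{
    assert(fragment_extent > 0u);
    return fragment_derivative / (float)fragment_extent;
}
```

For example, with fragments that are 2 pixels wide and 4 pixels high, a fragment derivative of 8 corresponds to a pixel derivative of 4 in x and 2 in y.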
| |
| The results for code:OpDPdx and code:OpDPdy may: be calculated as either |
| fine or coarse derivatives, with implementations favouring the most |
| efficient approach. |
| Implementations must: choose coarse or fine consistently between the two. |
| |
| Executing code:OpFwidthFine, code:OpFwidthCoarse, or code:OpFwidth is |
| equivalent to executing the corresponding code:OpDPdx* and code:OpDPdy* |
| instructions, taking the absolute value of the results, and summing them. |
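
As a sketch of this equivalence, the following C helper combines two derivatives that have already been calculated.
The name `fwidth_from_derivatives` is an assumption for this example; both inputs are expected to come from the same variant (fine, coarse, or unqualified) as the code:OpFwidth* instruction being modelled.

```c
#include <assert.h>

/* Absolute value written out by hand so the sketch needs no math library. */
static float abs_f(float x)
{
    return x < 0.0f ? -x : x;
}

/* Sum of the absolute values of the x and y derivatives of P.
 * Hypothetical helper; NaN inputs are outside the scope of this sketch. */
float fwidth_from_derivatives(float dpdx, float dpdy)
{
    assert(dpdx == dpdx && dpdy == dpdy); /* reject NaN inputs */
    return abs_f(dpdx) + abs_f(dpdy);
}
```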
| |
| Executing an code:OpImage*Sample*ImplicitLod instruction is equivalent to |
| executing code:OpDPdx(code:Coordinate) and code:OpDPdy(code:Coordinate), and |
| passing the results as the code:Grad operands code:dx and code:dy. |
| |
| [NOTE] |
| .Note |
| ==== |
| It is expected that using the code:ImplicitLod variants of sampling |
| functions will be substantially more efficient than using the |
| code:ExplicitLod variants with explicitly generated derivatives. |
| ==== |
| |
| |
| [[shaders-helper-invocations]] |
| == Helper Invocations |
| |
| When performing <<shaders-derivative-operations, derivative>> |
| ifdef::VK_VERSION_1_1[] |
| or <<shaders-quad-operations, quad group>> |
| endif::VK_VERSION_1_1[] |
| operations in a fragment shader, additional invocations may: be spawned in |
| order to ensure correct results. |
| These additional invocations are known as _helper invocations_ and can: be |
| identified by a non-zero value in the code:HelperInvocation built-in. |
| Stores and atomics performed by helper invocations must: not have any effect |
| on memory, and values returned by atomic instructions in helper invocations |
| are undefined:. |
| |
| Helper invocations may: become inactive at any time for any reason, with one |
| exception. |
| If a helper invocation would be active if it were not a helper invocation, |
| it must: be active for <<shaders-derivative-operations, derivative>> |
| ifdef::VK_VERSION_1_1[] |
| and <<shaders-quad-operations, quad group>> |
| endif::VK_VERSION_1_1[] |
| operations. |
| |
| ifdef::VK_EXT_shader_demote_to_helper_invocation[] |
| Helper invocations may: become permanently inactive if all invocations in a |
| quad scope instance become helper invocations. |
| endif::VK_EXT_shader_demote_to_helper_invocation[] |
| |
| |
| ifdef::VK_NV_cooperative_matrix[] |
| == Cooperative Matrices |
| |
| A _cooperative matrix_ type is a SPIR-V type where the storage for and |
| computations performed on the matrix are spread across the invocations in a |
| scope instance. |
| These types give the implementation freedom in how to optimize matrix |
| multiplies. |
| |
| SPIR-V defines the types and instructions, but does not specify rules about |
| what sizes/combinations are valid, and it is expected that different |
| implementations may: support different sizes. |
| |
| [open,refpage='vkGetPhysicalDeviceCooperativeMatrixPropertiesNV',desc='Returns properties describing what cooperative matrix types are supported',type='protos'] |
| -- |
| To enumerate the supported cooperative matrix types and operations, call: |
| |
| include::{generated}/api/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[] |
| |
| * pname:physicalDevice is the physical device. |
| * pname:pPropertyCount is a pointer to an integer related to the number of |
| cooperative matrix properties available or queried. |
| * pname:pProperties is either `NULL` or a pointer to an array of |
| slink:VkCooperativeMatrixPropertiesNV structures. |
| |
| If pname:pProperties is `NULL`, then the number of cooperative matrix |
| properties available is returned in pname:pPropertyCount. |
| Otherwise, pname:pPropertyCount must: point to a variable set by the user to |
| the number of elements in the pname:pProperties array, and on return the |
| variable is overwritten with the number of structures actually written to |
| pname:pProperties. |
| If pname:pPropertyCount is less than the number of cooperative matrix |
| properties available, at most pname:pPropertyCount structures will be |
| written, and ename:VK_INCOMPLETE will be returned instead of |
| ename:VK_SUCCESS, to indicate that not all the available cooperative matrix |
| properties were returned. |
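
The count and array rules above lead to the usual two-call pattern: query the count, allocate, then fill.
The sketch below demonstrates that pattern against a mock so that it stays self-contained; `MockResult`, `MockProperties`, `mock_enumerate`, and `enumerate_all` are hypothetical stand-ins, not Vulkan definitions.
A real application would call flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesNV instead, and would also initialize pname:sType and pname:pNext in each array element before the second call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the Vulkan result code and structure. */
typedef enum { MOCK_SUCCESS = 0, MOCK_INCOMPLETE = 5 } MockResult;
typedef struct { uint32_t MSize, NSize, KSize; } MockProperties;

#define MOCK_AVAILABLE 3u /* number of entries the mock "supports" */

/* Mock enumeration command that follows the count/array rules in the text. */
MockResult mock_enumerate(uint32_t *pPropertyCount, MockProperties *pProperties)
{
    assert(pPropertyCount != NULL);

    if (pProperties == NULL) {
        /* First form: report how many entries are available. */
        *pPropertyCount = MOCK_AVAILABLE;
        return MOCK_SUCCESS;
    }

    /* Second form: write at most *pPropertyCount entries. */
    uint32_t capacity = *pPropertyCount;
    uint32_t written = capacity < MOCK_AVAILABLE ? capacity : MOCK_AVAILABLE;
    for (uint32_t i = 0; i < written; ++i) {
        pProperties[i].MSize = 8u << i;
        pProperties[i].NSize = 8u << i;
        pProperties[i].KSize = 8u << i;
    }
    *pPropertyCount = written;
    return written < MOCK_AVAILABLE ? MOCK_INCOMPLETE : MOCK_SUCCESS;
}

/* Two-call idiom. Returns a malloc'ed array that the caller frees,
 * or NULL (with *outCount set to 0) on failure. */
MockProperties *enumerate_all(uint32_t *outCount)
{
    assert(outCount != NULL);
    *outCount = 0;

    uint32_t count = 0;
    if (mock_enumerate(&count, NULL) != MOCK_SUCCESS || count == 0) {
        return NULL;
    }

    MockProperties *props = malloc(count * sizeof *props);
    if (props == NULL) {
        return NULL;
    }

    if (mock_enumerate(&count, props) != MOCK_SUCCESS) {
        free(props);
        return NULL;
    }

    *outCount = count;
    return props;
}
```

Passing an array that is too small is not an error in this pattern: the mock, like the command it imitates, writes what fits and reports the truncation through its return value.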
| |
| include::{generated}/validity/protos/vkGetPhysicalDeviceCooperativeMatrixPropertiesNV.txt[] |
| -- |
| |
| [open,refpage='VkCooperativeMatrixPropertiesNV',desc='Structure specifying cooperative matrix properties',type='structs'] |
| -- |
| Each sname:VkCooperativeMatrixPropertiesNV structure describes a single |
| supported combination of types for a matrix multiply/add operation |
| (code:OpCooperativeMatrixMulAddNV). |
| The multiply can: be described in terms of the following variables and types |
| (in SPIR-V pseudocode): |
| |
| [source,c] |
| ~~~~ |
| %A is of type OpTypeCooperativeMatrixNV %AType %scope %MSize %KSize |
| %B is of type OpTypeCooperativeMatrixNV %BType %scope %KSize %NSize |
| %C is of type OpTypeCooperativeMatrixNV %CType %scope %MSize %NSize |
| %D is of type OpTypeCooperativeMatrixNV %DType %scope %MSize %NSize |
| |
| %D = %A * %B + %C // using OpCooperativeMatrixMulAddNV |
| ~~~~ |
| |
| A matrix multiply with these dimensions is known as an _MxNxK_ matrix |
| multiply. |
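
To make the dimensions concrete, the following scalar C reference computes [eq]#D = A * B + C# for an _MxNxK_ multiply.
It is only a sketch of the arithmetic: the function name `matrix_mul_add`, the row-major layout, and the use of `float` for every component type are assumptions of this example, and it says nothing about how an implementation distributes the matrices across invocations.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference for D = A * B + C (hypothetical helper).
 * A is M x K, B is K x N, C and D are M x N, all stored row-major. */
void matrix_mul_add(size_t M, size_t N, size_t K,
                    const float *A, const float *B,
                    const float *C, float *D)
{
    assert(A != NULL && B != NULL && C != NULL && D != NULL);

    for (size_t m = 0; m < M; ++m) {
        for (size_t n = 0; n < N; ++n) {
            float acc = C[m * N + n];
            for (size_t k = 0; k < K; ++k) {
                acc += A[m * K + k] * B[k * N + n];
            }
            D[m * N + n] = acc;
        }
    }
}
```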
| |
| The sname:VkCooperativeMatrixPropertiesNV structure is defined as: |
| |
| include::{generated}/api/structs/VkCooperativeMatrixPropertiesNV.txt[] |
| |
| * pname:sType is the type of this structure. |
| * pname:pNext is `NULL` or a pointer to a structure extending this |
| structure. |
| * pname:MSize is the number of rows in matrices A, C, and D. |
| * pname:KSize is the number of columns in matrix A and rows in matrix B. |
| * pname:NSize is the number of columns in matrices B, C, and D. |
| * pname:AType is the component type of matrix A, of type |
| elink:VkComponentTypeNV. |
| * pname:BType is the component type of matrix B, of type |
| elink:VkComponentTypeNV. |
| * pname:CType is the component type of matrix C, of type |
| elink:VkComponentTypeNV. |
| * pname:DType is the component type of matrix D, of type |
| elink:VkComponentTypeNV. |
| * pname:scope is the scope of all the matrix types, of type |
| elink:VkScopeNV. |
| |
| If some types are preferred over other types (e.g. for performance), they |
| should: appear earlier in the list enumerated by |
| flink:vkGetPhysicalDeviceCooperativeMatrixPropertiesNV. |
| |
| At least one entry in the list must: have power of two values for all of |
| pname:MSize, pname:KSize, and pname:NSize. |
| |
| include::{generated}/validity/structs/VkCooperativeMatrixPropertiesNV.txt[] |
| -- |
| |
| [open,refpage='VkScopeNV',desc='Specify SPIR-V scope',type='enums'] |
| -- |
| Possible values for elink:VkScopeNV include: |
| |
| include::{generated}/api/enums/VkScopeNV.txt[] |
| |
| * ename:VK_SCOPE_DEVICE_NV corresponds to SPIR-V code:Device scope. |
| * ename:VK_SCOPE_WORKGROUP_NV corresponds to SPIR-V code:Workgroup scope. |
| * ename:VK_SCOPE_SUBGROUP_NV corresponds to SPIR-V code:Subgroup scope. |
| * ename:VK_SCOPE_QUEUE_FAMILY_NV corresponds to SPIR-V code:QueueFamily |
| scope. |
| |
| All enum values match the corresponding SPIR-V value. |
| -- |
| |
| [open,refpage='VkComponentTypeNV',desc='Specify SPIR-V cooperative matrix component type',type='enums'] |
| -- |
| Possible values for elink:VkComponentTypeNV include: |
| |
| include::{generated}/api/enums/VkComponentTypeNV.txt[] |
| |
| * ename:VK_COMPONENT_TYPE_FLOAT16_NV corresponds to SPIR-V |
| code:OpTypeFloat 16. |
| * ename:VK_COMPONENT_TYPE_FLOAT32_NV corresponds to SPIR-V |
| code:OpTypeFloat 32. |
| * ename:VK_COMPONENT_TYPE_FLOAT64_NV corresponds to SPIR-V |
| code:OpTypeFloat 64. |
| * ename:VK_COMPONENT_TYPE_SINT8_NV corresponds to SPIR-V code:OpTypeInt 8 1. |
| * ename:VK_COMPONENT_TYPE_SINT16_NV corresponds to SPIR-V code:OpTypeInt |
| 16 1. |
| * ename:VK_COMPONENT_TYPE_SINT32_NV corresponds to SPIR-V code:OpTypeInt |
| 32 1. |
| * ename:VK_COMPONENT_TYPE_SINT64_NV corresponds to SPIR-V code:OpTypeInt |
| 64 1. |
| * ename:VK_COMPONENT_TYPE_UINT8_NV corresponds to SPIR-V code:OpTypeInt 8 0. |
| * ename:VK_COMPONENT_TYPE_UINT16_NV corresponds to SPIR-V code:OpTypeInt |
| 16 0. |
| * ename:VK_COMPONENT_TYPE_UINT32_NV corresponds to SPIR-V code:OpTypeInt |
| 32 0. |
| * ename:VK_COMPONENT_TYPE_UINT64_NV corresponds to SPIR-V code:OpTypeInt |
| 64 0. |
| -- |
| endif::VK_NV_cooperative_matrix[] |
| |
| |
| ifdef::VK_EXT_validation_cache[] |
| [[shaders-validation-cache]] |
| == Validation Cache |
| |
| [open,refpage='VkValidationCacheEXT',desc='Opaque handle to a validation cache object',type='handles'] |
| -- |
| Validation cache objects allow the result of internal validation to be |
| reused, both within a single application run and between multiple runs. |
| Reuse within a single run is achieved by passing the same validation cache |
| object when creating supported Vulkan objects. |
| Reuse across runs of an application is achieved by retrieving validation |
| cache contents in one run of an application, saving the contents, and using |
| them to preinitialize a validation cache on a subsequent run. |
| The contents of the validation cache objects are managed by the validation |
| layers. |
| Applications can: manage the host memory consumed by a validation cache |
| object and control the amount of data retrieved from a validation cache |
| object. |
| |
| Validation cache objects are represented by sname:VkValidationCacheEXT |
| handles: |
| |
| include::{generated}/api/handles/VkValidationCacheEXT.txt[] |
| -- |
| |
| [open,refpage='vkCreateValidationCacheEXT',desc='Creates a new validation cache',type='protos'] |
| -- |
| To create validation cache objects, call: |
| |
| include::{generated}/api/protos/vkCreateValidationCacheEXT.txt[] |
| |
| * pname:device is the logical device that creates the validation cache |
| object. |
| * pname:pCreateInfo is a pointer to a slink:VkValidationCacheCreateInfoEXT |
| structure containing the initial parameters for the validation cache |
| object. |
| * pname:pAllocator controls host memory allocation as described in the |
| <<memory-allocation, Memory Allocation>> chapter. |
| * pname:pValidationCache is a pointer to a slink:VkValidationCacheEXT |
| handle in which the resulting validation cache object is returned. |
| |
| [NOTE] |
| .Note |
| ==== |
| Applications can: track and manage the total host memory size of a |
| validation cache object using the pname:pAllocator. |
| Applications can: limit the amount of data retrieved from a validation cache |
| object in fname:vkGetValidationCacheDataEXT. |
| Implementations should: not internally limit the total number of entries |
| added to a validation cache object or the total host memory consumed. |
| ==== |
| |
| Once created, a validation cache can: be passed to the |
| fname:vkCreateShaderModule command by adding this object to the |
| slink:VkShaderModuleCreateInfo structure's pname:pNext chain. |
| If a slink:VkShaderModuleValidationCacheCreateInfoEXT object is included in |
| the slink:VkShaderModuleCreateInfo::pname:pNext chain, and its |
| pname:validationCache field is not dlink:VK_NULL_HANDLE, the implementation |
| will query it for possible reuse opportunities and update it with new |
| content. |
| The use of the validation cache object in these commands is internally |
| synchronized, and the same validation cache object can: be used in multiple |
| threads simultaneously. |
| |
| [NOTE] |
| .Note |
| ==== |
| Implementations should: make every effort to limit any critical sections to |
| the actual accesses to the cache, which is expected to be significantly |
| shorter than the duration of the fname:vkCreateShaderModule command. |
| ==== |
| |
| include::{generated}/validity/protos/vkCreateValidationCacheEXT.txt[] |
| -- |
| |
| [open,refpage='VkValidationCacheCreateInfoEXT',desc='Structure specifying parameters of a newly created validation cache',type='structs'] |
| -- |
| The sname:VkValidationCacheCreateInfoEXT structure is defined as: |
| |
| include::{generated}/api/structs/VkValidationCacheCreateInfoEXT.txt[] |
| |
| * pname:sType is the type of this structure. |
| * pname:pNext is `NULL` or a pointer to a structure extending this |
| structure. |
| * pname:flags is reserved for future use. |
| * pname:initialDataSize is the number of bytes in pname:pInitialData. |
| If pname:initialDataSize is zero, the validation cache will initially be |
| empty. |
| * pname:pInitialData is a pointer to previously retrieved validation cache |
| data. |
| If the validation cache data is incompatible (as defined below) with the |
| device, the validation cache will be initially empty. |
| If pname:initialDataSize is zero, pname:pInitialData is ignored. |
| |
| .Valid Usage |
| **** |
| * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01534]] |
| If pname:initialDataSize is not `0`, it must: be equal to the size of |
| pname:pInitialData, as returned by fname:vkGetValidationCacheDataEXT |
| when pname:pInitialData was originally retrieved |
| * [[VUID-VkValidationCacheCreateInfoEXT-initialDataSize-01535]] |
| If pname:initialDataSize is not `0`, pname:pInitialData must: have been |
| retrieved from a previous call to fname:vkGetValidationCacheDataEXT |
| **** |
| |
| include::{generated}/validity/structs/VkValidationCacheCreateInfoEXT.txt[] |
| -- |
| |
| [open,refpage='VkValidationCacheCreateFlagsEXT',desc='Reserved for future use',type='flags'] |
| -- |
| include::{generated}/api/flags/VkValidationCacheCreateFlagsEXT.txt[] |
| |
| tname:VkValidationCacheCreateFlagsEXT is a bitmask type for setting a mask, |
| but is currently reserved for future use. |
| -- |
| |
| [open,refpage='vkMergeValidationCachesEXT',desc='Combine the data stores of validation caches',type='protos'] |
| -- |
| Validation cache objects can: be merged using the command: |
| |
| include::{generated}/api/protos/vkMergeValidationCachesEXT.txt[] |
| |
| * pname:device is the logical device that owns the validation cache |
| objects. |
| * pname:dstCache is the handle of the validation cache to merge results |
| into. |
| * pname:srcCacheCount is the length of the pname:pSrcCaches array. |
| * pname:pSrcCaches is a pointer to an array of validation cache handles, |
| which will be merged into pname:dstCache. |
| The previous contents of pname:dstCache are included after the merge. |
| |
| [NOTE] |
| .Note |
| ==== |
| The details of the merge operation are implementation-dependent, but |
| implementations should: merge the contents of the specified validation |
| caches and prune duplicate entries. |
| ==== |
| |
| .Valid Usage |
| **** |
| * [[VUID-vkMergeValidationCachesEXT-dstCache-01536]] |
| pname:dstCache must: not appear in the list of source caches |
| **** |
| |
| include::{generated}/validity/protos/vkMergeValidationCachesEXT.txt[] |
| -- |
| |
| [open,refpage='vkGetValidationCacheDataEXT',desc='Get the data store from a validation cache',type='protos'] |
| -- |
| Data can: be retrieved from a validation cache object using the command: |
| |
| include::{generated}/api/protos/vkGetValidationCacheDataEXT.txt[] |
| |
| * pname:device is the logical device that owns the validation cache. |
| * pname:validationCache is the validation cache to retrieve data from. |
| * pname:pDataSize is a pointer to a value related to the amount of data in |
| the validation cache, as described below. |
| * pname:pData is either `NULL` or a pointer to a buffer. |
| |
| If pname:pData is `NULL`, then the maximum size of the data that can: be |
| retrieved from the validation cache, in bytes, is returned in |
| pname:pDataSize. |
| Otherwise, pname:pDataSize must: point to a variable set by the user to the |
| size of the buffer, in bytes, pointed to by pname:pData, and on return the |
| variable is overwritten with the amount of data actually written to |
| pname:pData. |
| If pname:pDataSize is less than the maximum size that can: be retrieved from |
| the validation cache, at most pname:pDataSize bytes will be written to |
| pname:pData, and fname:vkGetValidationCacheDataEXT will return |
| ename:VK_INCOMPLETE instead of ename:VK_SUCCESS, to indicate that not all of |
| the validation cache was returned. |
| |
| Any data written to pname:pData is valid and can: be provided as the |
| pname:pInitialData member of the slink:VkValidationCacheCreateInfoEXT |
| structure passed to fname:vkCreateValidationCacheEXT. |
| |
| Two calls to fname:vkGetValidationCacheDataEXT with the same parameters |
| must: retrieve the same data unless a command that modifies the contents of |
| the cache is called between them. |
| |
| [[validation-cache-header]] |
| Applications can: store the data retrieved from the validation cache, and |
| use these data, possibly in a future run of the application, to populate new |
| validation cache objects. |
| The results of validation, however, may: depend on the vendor ID, device ID, |
| driver version, and other details of the device. |
| To enable applications to detect when previously retrieved data is |
| incompatible with the device, the initial bytes written to pname:pData must: |
| be a header consisting of the following members: |
| |
| .Layout for validation cache header version ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT |
| [width="85%",cols="8%,21%,71%",options="header"] |
| |==== |
| | Offset | Size | Meaning |
| | 0 | 4 | length in bytes of the entire validation cache header |
| written as a stream of bytes, with the least |
| significant byte first |
| | 4 | 4 | a elink:VkValidationCacheHeaderVersionEXT value |
| written as a stream of bytes, with the least |
| significant byte first |
| | 8 | ename:VK_UUID_SIZE | a layer commit ID expressed as a UUID, which uniquely |
| identifies the version of the validation layers used |
| to generate these validation results |
| |==== |
| |
| The first four bytes encode the length of the entire validation cache |
| header, in bytes. |
| This value includes all fields in the header including the validation cache |
| version field and the size of the length field. |
| |
| The next four bytes encode the validation cache version, as described for |
| elink:VkValidationCacheHeaderVersionEXT. |
| A consumer of the validation cache should: use the cache version to |
| interpret the remainder of the cache header. |
| |
| If pname:pDataSize is less than what is necessary to store this header, |
| nothing will be written to pname:pData and zero will be written to |
| pname:pDataSize. |
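
An application that stores validation cache data might inspect the header before reusing it.
The following C sketch decodes the three header fields from a byte buffer according to the layout above.
The names `CacheHeader`, `read_le32`, and `parse_cache_header` are hypothetical, and `CACHE_UUID_SIZE` is assumed to equal ename:VK_UUID_SIZE (16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_UUID_SIZE 16u /* assumed to equal VK_UUID_SIZE */

/* Decoded form of the validation cache header (hypothetical type). */
typedef struct {
    uint32_t headerLength;  /* length in bytes of the entire header */
    uint32_t headerVersion; /* validation cache header version */
    uint8_t  layerCommitId[CACHE_UUID_SIZE];
} CacheHeader;

/* Reads a 32-bit value stored least significant byte first. */
static uint32_t read_le32(const uint8_t *b)
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Returns 1 and fills *out if data starts with a plausible header,
 * otherwise returns 0. */
int parse_cache_header(const uint8_t *data, size_t size, CacheHeader *out)
{
    assert(data != NULL && out != NULL);

    const size_t minSize = 8u + CACHE_UUID_SIZE;
    if (size < minSize) {
        return 0; /* too short to hold the header */
    }

    out->headerLength  = read_le32(data);
    out->headerVersion = read_le32(data + 4);
    if (out->headerLength < minSize || out->headerLength > size) {
        return 0; /* length field is inconsistent with the buffer */
    }

    memcpy(out->layerCommitId, data + 8, CACHE_UUID_SIZE);
    return 1;
}
```

A caller would then compare `headerVersion` and `layerCommitId` against the values it expects and discard the stored data on a mismatch.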
| |
| include::{generated}/validity/protos/vkGetValidationCacheDataEXT.txt[] |
| -- |
| |
| [open,refpage='VkValidationCacheHeaderVersionEXT',desc='Encode validation cache version',type='enums',xrefs='vkCreateValidationCacheEXT vkGetValidationCacheDataEXT'] |
| -- |
| Possible values of the second group of four bytes in the header returned by |
| flink:vkGetValidationCacheDataEXT, encoding the validation cache version, |
| are: |
| |
| include::{generated}/api/enums/VkValidationCacheHeaderVersionEXT.txt[] |
| |
| * ename:VK_VALIDATION_CACHE_HEADER_VERSION_ONE_EXT specifies version one |
| of the validation cache. |
| -- |
| |
| [open,refpage='vkDestroyValidationCacheEXT',desc='Destroy a validation cache object',type='protos'] |
| -- |
| To destroy a validation cache, call: |
| |
| include::{generated}/api/protos/vkDestroyValidationCacheEXT.txt[] |
| |
| * pname:device is the logical device that destroys the validation cache |
| object. |
| * pname:validationCache is the handle of the validation cache to destroy. |
| * pname:pAllocator controls host memory allocation as described in the |
| <<memory-allocation, Memory Allocation>> chapter. |
| |
| .Valid Usage |
| **** |
| * [[VUID-vkDestroyValidationCacheEXT-validationCache-01537]] |
| If sname:VkAllocationCallbacks were provided when pname:validationCache |
| was created, a compatible set of callbacks must: be provided here |
| * [[VUID-vkDestroyValidationCacheEXT-validationCache-01538]] |
| If no sname:VkAllocationCallbacks were provided when |
| pname:validationCache was created, pname:pAllocator must: be `NULL` |
| **** |
| |
| include::{generated}/validity/protos/vkDestroyValidationCacheEXT.txt[] |
| -- |
| endif::VK_EXT_validation_cache[] |