| Name |
| |
| INTEL_shader_atomic_float_minmax |
| |
| Name Strings |
| |
| GL_INTEL_shader_atomic_float_minmax |
| |
| Contact |
| |
| Ian Romanick (ian . d . romanick 'at' intel . com) |
| |
| Contributors |
| |
| |
| Status |
| |
| In progress |
| |
| Version |
| |
| Last Modified Date: 06/22/2018 |
| Revision: 4 |
| |
| Number |
| |
| TBD |
| |
| Dependencies |
| |
| OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or |
| ARB_compute_shader is required. |
| |
| This extension is written against version 4.60 of the OpenGL Shading |
| Language Specification. |
| |
| Overview |
| |
| This extension provides GLSL built-in functions allowing shaders to |
| perform atomic read-modify-write operations to floating-point buffer |
| variables and shared variables. Minimum, maximum, exchange, and |
| compare-and-swap are enabled. |
| |
| |
| New Procedures and Functions |
| |
| None. |
| |
| New Tokens |
| |
| None. |
| |
| IP Status |
| |
| None. |
| |
| Modifications to the OpenGL Shading Language Specification, Version 4.60 |
| |
| Including the following line in a shader can be used to control the |
| language features described in this extension: |
| |
| #extension GL_INTEL_shader_atomic_float_minmax : <behavior> |
| |
| where <behavior> is as specified in section 3.3. |
| |
| A new preprocessor #define is added to the OpenGL Shading Language: |
| |
| #define GL_INTEL_shader_atomic_float_minmax 1 |
| |
| Additions to Chapter 8 of the OpenGL Shading Language Specification |
| (Built-in Functions) |
| |
| Modify Section 8.11, "Atomic Memory Functions" |
| |
| (add a new row after the existing "atomicMin" table row, p. 179) |
| |
| float atomicMin(inout float mem, float data) |
| |
| |
| Computes a new value by taking the minimum of the value of data and |
| the contents of mem. If one of these is an IEEE signaling NaN (i.e., |
| a NaN with the most-significant bit of the mantissa cleared), it is |
| always considered smaller. If one of these is an IEEE quiet NaN |
| (i.e., a NaN with the most-significant bit of the mantissa set), it is |
| always considered larger. If both are IEEE quiet NaNs or both are |
| IEEE signaling NaNs, the result of the comparison is undefined. |
| |
| (add a new row after the existing "atomicMax" table row, p. 179) |
| |
| float atomicMax(inout float mem, float data) |
| |
| Computes a new value by taking the maximum of the value of data and |
| the contents of mem. If one of these is an IEEE signaling NaN (i.e., |
| a NaN with the most-significant bit of the mantissa cleared), it is |
| always considered larger. If one of these is an IEEE quiet NaN (i.e., |
| a NaN with the most-significant bit of the mantissa set), it is always |
| considered smaller. If both are IEEE quiet NaNs or both are IEEE |
| signaling NaNs, the result of the comparison is undefined. |
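| |
| The NaN ordering described for atomicMin and atomicMax above can be |
| modeled on the CPU. The following C code is a minimal, single-threaded |
| sketch and not an implementation of the extension. All function names |
| (model_minmax_bits, model_atomic_min, and the helpers) are hypothetical. |
| It works on raw 32-bit patterns so that a signaling NaN is not quieted |
| by passing through a float value. Where the result is undefined, the |
| sketch arbitrarily keeps the contents of mem. Signed zeros are not |
| treated specially here. |
| |
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "model assumes 32-bit float");

/* Bit-cast helpers; memcpy avoids strict-aliasing problems. */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* NaN: exponent field all ones and a non-zero mantissa. */
static int is_nan_bits(uint32_t u)
{
    return (u & 0x7F800000u) == 0x7F800000u && (u & 0x007FFFFFu) != 0;
}

/* Quiet NaN: most-significant mantissa bit set. */
static int is_qnan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) != 0;
}

/* Signaling NaN: most-significant mantissa bit cleared. */
static int is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

/* Returns the bit pattern the model would store; want_max selects
 * maximum (non-zero) or minimum (zero). */
static uint32_t model_minmax_bits(uint32_t mem, uint32_t data, int want_max)
{
    /* A lone signaling NaN is smaller for min and larger for max,
     * so it is the value selected in both cases. */
    if (is_snan_bits(mem) && !is_snan_bits(data))
        return mem;
    if (is_snan_bits(data) && !is_snan_bits(mem))
        return data;

    /* A lone quiet NaN is larger for min and smaller for max,
     * so the other operand is selected in both cases. */
    if (is_qnan_bits(mem) && !is_qnan_bits(data))
        return data;
    if (is_qnan_bits(data) && !is_qnan_bits(mem))
        return mem;

    /* Two NaNs of the same kind: undefined by the text above.
     * ASSUMPTION: this model keeps mem. */
    if (is_nan_bits(mem))
        return mem;

    /* Ordinary comparison of two non-NaN values. */
    float a = bits_to_float(mem);
    float b = bits_to_float(data);
    if (want_max)
        return (b > a) ? data : mem;
    return (b < a) ? data : mem;
}

/* GLSL atomic memory functions return the original contents of mem. */
static uint32_t model_atomic_min(uint32_t *mem, uint32_t data)
{
    uint32_t old = *mem;
    *mem = model_minmax_bits(old, data, 0);
    return old;
}

static uint32_t model_atomic_max(uint32_t *mem, uint32_t data)
{
    uint32_t old = *mem;
    *mem = model_minmax_bits(old, data, 1);
    return old;
}
```
| |
| For example, with mem holding the quiet NaN pattern 0x7FC00000 and data |
| holding 1.0, both model_atomic_min and model_atomic_max leave 1.0 in |
| mem. With the signaling NaN pattern 0x7F800001 in either operand, both |
| leave the signaling NaN in mem. |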
| |
| (add to "atomicExchange" table cell, p. 180) |
| |
| float atomicExchange(inout float mem, float data) |
| |
| (add to "atomicCompSwap" table cell, p. 180) |
| |
| float atomicCompSwap(inout float mem, float compare, float data) |
| |
| Interactions with OpenGL 4.6 and ARB_gl_spirv |
| |
| If OpenGL 4.6 or ARB_gl_spirv is supported, then |
| SPV_INTEL_shader_atomic_float_minmax must also be supported. |
| |
| The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or |
| OpenGL ES implementation supports INTEL_shader_atomic_float_minmax. |
| |
| Issues |
| |
| 1) Why call this extension INTEL_shader_atomic_float_minmax? |
| |
| RESOLVED: Several other extensions already set the precedent of |
| VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions |
| that enable floating-point atomic operations. Using that as a base for |
| the name seems logical. |
| |
| There already exists NV_shader_atomic_float, but the two extensions have |
| nearly zero overlap in functionality. NV_shader_atomic_float adds |
| atomicAdd and image atomic operations that currently shipping Intel GPUs |
| do not support. Calling this extension INTEL_shader_atomic_float would |
| likely have been confusing. |
| |
| Adding something to describe the actual functions added by this extension |
| seemed reasonable. INTEL_shader_atomic_float_compare was considered, but |
| that name was deemed to be not properly descriptive. Calling this |
| extension INTEL_shader_atomic_float_min_max_exchange_compswap is right |
| out. |
| |
| 2) What atomic operations should we support for floating-point targets? |
| |
| RESOLVED. Exchange, min, max, and compare-swap make sense, and these are |
| all supported by the hardware. Future extensions may add other functions. |
| |
| For buffer variables and shared variables it is not possible to bit-cast |
| the memory location in GLSL, so existing integer operations, such as |
| atomicOr, cannot be used. However, the underlying hardware implementation |
| can do this by treating the memory as an integer. It would be possible to |
| implement atomicNegate using this technique with atomicXor. It is unclear |
| whether this provides any actual utility. |
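| |
| The integer-view technique described above can be sketched on the CPU |
| with C11 atomics. This is only an illustration of the idea. The names |
| model_atomic_negate, store_float, and load_float are hypothetical and |
| are not part of this or any other extension. The sketch assumes a |
| 32-bit IEEE float whose sign is the most significant bit. |
| |
```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "model assumes 32-bit float");

/* Store a float into a 32-bit atomic word by copying its bit pattern. */
static void store_float(_Atomic uint32_t *word, float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    atomic_store(word, u);
}

/* Load the 32-bit atomic word and reinterpret it as a float. */
static float load_float(_Atomic uint32_t *word)
{
    uint32_t u = atomic_load(word);
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical atomicNegate: treat the memory as an integer and flip
 * the sign bit with an atomic XOR. Returns the previous float value. */
static float model_atomic_negate(_Atomic uint32_t *word)
{
    uint32_t old_bits = atomic_fetch_xor(word, UINT32_C(0x80000000));
    float old;
    memcpy(&old, &old_bits, sizeof old);
    return old;
}
```
| |
| Applying model_atomic_negate twice restores the original bit pattern, |
| because XOR with the same mask is its own inverse. |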
| |
| 3) What should be said about the NaN behavior? |
| |
| RESOLVED. There are several aspects of NaN behavior that should be |
| documented in this extension. However, some of this behavior varies based |
| on NaN concepts that do not exist in the GLSL specification. |
| |
| * atomicCompSwap performs the comparison as the floating-point equality |
| operator (==). That is, if either 'mem' or 'compare' is NaN, the |
| comparison result is always false. |
| |
| * atomicMin and atomicMax implement the IEEE specification with respect to |
| NaN. IEEE considers two different kinds of NaN: signaling NaN and quiet |
| NaN. A quiet NaN has the most significant bit of the mantissa set, and |
| a signaling NaN does not. This concept does not exist in SPIR-V, |
| Vulkan, or OpenGL. Let qNaN denote a quiet NaN and sNaN denote a |
| signaling NaN. atomicMin and atomicMax specifically implement |
| |
| - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x |
| - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN |
| - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) = |
| fmax(qNaN, sNaN) = sNaN |
| - fmin(sNaN, sNaN) = sNaN. This specification does not define which of |
| the two arguments is stored. |
| - fmax(sNaN, sNaN) = sNaN. This specification does not define which of |
| the two arguments is stored. |
| - fmin(qNaN, qNaN) = qNaN. This specification does not define which of |
| the two arguments is stored. |
| - fmax(qNaN, qNaN) = qNaN. This specification does not define which of |
| the two arguments is stored. |
| |
| Further details are in the Skylake Programmer's Reference Manuals, |
| available at |
| https://01.org/linuxgraphics/documentation/hardware-specification-prms. |
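| |
| The atomicCompSwap comparison rule in the first bullet of this issue |
| can be illustrated with a small C model. This is a single-threaded |
| sketch under the assumption that C's == on float matches the |
| floating-point equality operator referred to above. The name |
| model_atomic_comp_swap is hypothetical. |
| |
```c
#include <assert.h>
#include <stddef.h>

/* Single-threaded model of atomicCompSwap on a float.
 * Writes data to *mem only if the current contents compare equal to
 * 'compare', and returns the original contents of *mem. */
static float model_atomic_comp_swap(float *mem, float compare, float data)
{
    assert(mem != NULL);

    float old = *mem;

    /* Floating-point ==: false whenever either operand is NaN, so a
     * NaN in *mem or in 'compare' means no store takes place. */
    if (old == compare)
        *mem = data;

    return old;
}
```
| |
| One consequence worth noting: a retry loop that feeds the returned value |
| back in as 'compare' would never observe a successful swap while mem |
| holds a NaN, since NaN never compares equal to itself. |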
| |
| 4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0) |
| arguments? |
| |
| RESOLVED. atomicMin should store -0.0, and atomicMax should store +0.0. |
| Due to a known issue, shipping Skylake GPUs store the incorrectly signed |
| zero in these cases. This behavior may change in later GPUs. |
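| |
| The intended signed-zero results can be written down as a small C |
| model. This sketch shows the behavior that the resolution above says |
| should occur, not what shipping Skylake GPUs do. The function names |
| are hypothetical, inputs are assumed to be non-NaN, and the handling of |
| two zeros with the same sign is an assumption of the sketch. |
| |
```c
#include <assert.h>
#include <math.h>

/* Minimum that prefers -0.0 over +0.0 when both operands are zero. */
static float model_min_signed_zero(float a, float b)
{
    assert(!isnan(a) && !isnan(b));

    /* +0.0 == -0.0 is true, so inspect the sign bits explicitly. */
    if (a == 0.0f && b == 0.0f)
        return (signbit(a) || signbit(b)) ? -0.0f : 0.0f;

    return (b < a) ? b : a;
}

/* Maximum that prefers +0.0 over -0.0 when both operands are zero. */
static float model_max_signed_zero(float a, float b)
{
    assert(!isnan(a) && !isnan(b));

    if (a == 0.0f && b == 0.0f)
        return (signbit(a) && signbit(b)) ? -0.0f : 0.0f;

    return (b > a) ? b : a;
}
```
| |
| A plain (b < a) ? b : a comparison would return whichever zero happens |
| to be in 'a', which is why the sign bits are checked separately. |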
| |
| Revision History |
| |
| Rev Date Author Changes |
| --- ---------- -------- --------------------------------------------- |
| 1 04/19/2018 idr Initial version |
| 2 05/05/2018 idr Describe interactions with the capabilities |
| added by SPV_INTEL_shader_atomic_float_minmax. |
| 3 05/29/2018 idr Remove mention of 64-bit float support. |
| 4 06/22/2018 idr Resolve issue #2. |
| Add issue #3 (regarding NaN behavior). |
| Add issue #4 (regarding atomicMin(-0, +0)). |