| Name |
| |
| MESA_shader_integer_functions |
| |
| Name Strings |
| |
| GL_MESA_shader_integer_functions |
| |
| Contact |
| |
| Ian Romanick <ian.d.romanick@intel.com> |
| |
| Contributors |
| |
| All the contributors of GL_ARB_gpu_shader5 |
| |
| Status |
| |
| Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later |
| |
| Version |
| |
| Version 3, March 31, 2017 |
| |
| Number |
| |
| OpenGL Extension #495 |
| |
| Dependencies |
| |
| This extension is written against the OpenGL 3.2 (Compatibility Profile) |
| Specification. |
| |
| This extension is written against Version 1.50 (Revision 09) of the OpenGL |
| Shading Language Specification. |
| |
| GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required. |
| |
| This extension interacts with ARB_gpu_shader5. |
| |
| This extension interacts with ARB_gpu_shader_fp64. |
| |
| This extension interacts with NV_gpu_shader5. |
| |
| Overview |
| |
| GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this |
| added functionality requires significant hardware support. There are many |
| aspects, however, that can be easily implemented on any GPU with "real" |
| integer support (as opposed to simulating integers using floating point |
| calculations). |
| |
| This extension provides a set of new features to the OpenGL Shading |
| Language to support capabilities of these GPUs, extending the |
| capabilities of version 1.30 of the OpenGL Shading Language and version |
| 3.00 of the OpenGL ES Shading Language. Shaders using the new |
| functionality provided by this extension should enable this |
| functionality via the construct |
| |
| #extension GL_MESA_shader_integer_functions : require (or enable) |
| |
| This extension provides a variety of new features for all shader types, |
| including: |
| |
| * support for implicitly converting signed integer types to unsigned |
| types, as well as more general implicit conversion and function |
| overloading infrastructure to support new data types introduced by |
| other extensions; |
| |
| * new built-in functions supporting: |
| |
| * splitting a floating-point number into a significand and exponent |
| (frexp), or building a floating-point number from a significand and |
| exponent (ldexp); |
| |
| * integer bitfield manipulation, including functions to find the |
| position of the most or least significant set bit, count the number |
| of one bits, and bitfield insertion, extraction, and reversal; |
| |
| * extended integer precision math, including add with carry, subtract |
| with borrow, and extended multiplication; |
| |
| The resulting extension is a strict subset of GL_ARB_gpu_shader5. |
| |
| IP Status |
| |
| No known IP claims. |
| |
| New Procedures and Functions |
| |
| None |
| |
| New Tokens |
| |
| None |
| |
| Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (OpenGL Operation) |
| |
| None. |
| |
| Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Rasterization) |
| |
| None. |
| |
| Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Per-Fragment Operations and the Frame Buffer) |
| |
| None. |
| |
| Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (Special Functions) |
| |
| None. |
| |
| Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification |
| (State and State Requests) |
| |
| None. |
| |
| Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) |
| Specification (Invariance) |
| |
| None. |
| |
| Additions to the AGL/GLX/WGL Specifications |
| |
| None. |
| |
| Modifications to The OpenGL Shading Language Specification, Version 1.50 |
| (Revision 09) |
| |
| Including the following line in a shader can be used to control the |
| language features described in this extension: |
| |
| #extension GL_MESA_shader_integer_functions : <behavior> |
| |
| where <behavior> is as specified in section 3.3. |
| |
| New preprocessor #defines are added to the OpenGL Shading Language: |
| |
| #define GL_MESA_shader_integer_functions 1 |
| |
| |
| Modify Section 4.1.10, Implicit Conversions, p. 27 |
| |
| (modify table of implicit conversions) |
| |
| Can be implicitly |
| Type of expression converted to |
| --------------------- ----------------- |
| int uint, float |
| ivec2 uvec2, vec2 |
| ivec3 uvec3, vec3 |
| ivec4 uvec4, vec4 |
| |
| uint float |
| uvec2 vec2 |
| uvec3 vec3 |
| uvec4 vec4 |
| |
| (modify second paragraph of the section) No implicit conversions are |
| provided to convert from unsigned to signed integer types or from |
| floating-point to integer types. There are no implicit array or structure |
| conversions. |
| |
| (insert before the final paragraph of the section) When performing |
| implicit conversion for binary operators, there may be multiple data types |
| to which the two operands can be converted. For example, when adding an |
| int value to a uint value, both values can be implicitly converted to uint |
| and float. In such cases, a floating-point type is chosen if either |
| operand has a floating-point type. Otherwise, an unsigned integer type is |
| chosen if either operand has an unsigned integer type. Otherwise, a |
| signed integer type is chosen. |
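| |
| The selection rule above can be illustrated with a small sketch. The C |
| fragment below is not part of this extension. The names base_type and |
| binary_result_type are hypothetical, and the model tracks only the |
| fundamental type, ignoring vector size. |

```c
/* Illustrative model only; these names are hypothetical, not spec API. */
typedef enum { BASE_INT, BASE_UINT, BASE_FLOAT } base_type;

/* Pick the fundamental type that both operands of a binary operator are
 * implicitly converted to: float wins, then uint, then int. */
base_type binary_result_type(base_type a, base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;
    return BASE_INT;
}
```

| Under this model, adding an int value to a uint value selects uint, and |
| adding either of them to a float value selects float. |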
| |
| |
| Modify Section 5.9, Expressions, p. 57 |
| |
| (modify bulleted list as follows, adding support for implicit conversion |
| between signed and unsigned types) |
| |
| Expressions in the shading language are built from the following: |
| |
| * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector |
| types, and all matrix types. |
| |
| ... |
| |
| * The operator modulus (%) operates on signed or unsigned integer scalars |
| or vectors. If the fundamental types of the operands do not match, the |
| conversions from Section 4.1.10 "Implicit Conversions" are applied to |
| produce matching types. ... |
| |
| |
| Modify Section 6.1, Function Definitions, p. 63 |
| |
| (modify description of overloading, beginning at the top of p. 64) |
| |
| Function names can be overloaded. The same function name can be used for |
| multiple functions, as long as the parameter types differ. If a function |
| name is declared twice with the same parameter types, then the return |
| types and all qualifiers must also match, and it is the same function |
| being declared. For example, |
| |
| vec4 f(in vec4 x, out vec4 y); // (A) |
| vec4 f(in vec4 x, out uvec4 y); // (B) okay, different argument type |
| vec4 f(in ivec4 x, out uvec4 y); // (C) okay, different argument type |
| |
| int f(in vec4 x, out ivec4 y); // error, only return type differs |
| vec4 f(in vec4 x, in vec4 y); // error, only qualifier differs |
| vec4 f(const in vec4 x, out vec4 y); // error, only qualifier differs |
| |
| When function calls are resolved, an exact type match for all the |
| arguments is sought. If an exact match is found, all other functions are |
| ignored, and the exact match is used. If no exact match is found, then |
| the implicit conversions in Section 4.1.10 (Implicit Conversions) will be |
| applied to find a match. Mismatched types on input parameters (in or |
| inout or default) must have a conversion from the calling argument type |
| to the formal parameter type. Mismatched types on output parameters (out |
| or inout) must have a conversion from the formal parameter type to the |
| calling argument type. |
| |
| If implicit conversions can be used to find more than one matching |
| function, a single best-matching function is sought. To determine a best |
| match, the conversions between calling argument and formal parameter |
| types are compared for each function argument and pair of matching |
| functions. After these comparisons are performed, each pair of matching |
| functions is compared. A function definition A is considered a better |
| match than function definition B if: |
| |
| * for at least one function argument, the conversion for that argument |
| in A is better than the corresponding conversion in B; and |
| |
| * there is no function argument for which the conversion in B is better |
| than the corresponding conversion in A. |
| |
| If a single function definition is considered a better match than every |
| other matching function definition, it will be used. Otherwise, a |
| semantic error occurs and the shader will fail to compile. |
| |
| To determine whether the conversion for a single argument in one match is |
| better than that for another match, the following rules are applied, in |
| order: |
| |
| 1. An exact match is better than a match involving any implicit |
| conversion. |
| |
| 2. A match involving an implicit conversion from float to double is |
| better than a match involving any other implicit conversion. |
| |
| 3. A match involving an implicit conversion from either int or uint to |
| float is better than a match involving an implicit conversion from |
| either int or uint to double. |
| |
| If none of the rules above apply to a particular pair of conversions, |
| neither conversion is considered better than the other. |
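| |
| The comparison rules above form a partial order rather than a ranking. |
| A minimal C sketch of the two predicates follows. It is not part of this |
| extension: the enum conv_kind and the functions conv_better and |
| match_better are hypothetical, and each candidate function is assumed to |
| have already been reduced to one conversion kind per argument. |

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the conversion needed for one argument. */
typedef enum {
    CONV_EXACT,           /* no conversion needed */
    CONV_FLOAT_TO_DOUBLE, /* float -> double */
    CONV_INT_TO_FLOAT,    /* int or uint -> float */
    CONV_INT_TO_DOUBLE,   /* int or uint -> double */
    CONV_OTHER            /* any other implicit conversion, e.g. int -> uint */
} conv_kind;

/* True if conversion a is better than conversion b (rules 1-3, in order). */
bool conv_better(conv_kind a, conv_kind b)
{
    if (a == b)
        return false;
    if (a == CONV_EXACT)            /* rule 1 */
        return true;
    if (b == CONV_EXACT)
        return false;
    if (a == CONV_FLOAT_TO_DOUBLE)  /* rule 2 */
        return true;
    if (b == CONV_FLOAT_TO_DOUBLE)
        return false;
    /* rule 3; every remaining pair is unordered */
    return a == CONV_INT_TO_FLOAT && b == CONV_INT_TO_DOUBLE;
}

/* True if candidate A (conversions a[0..n-1]) is a better match than
 * candidate B: A wins on at least one argument and loses on none. */
bool match_better(const conv_kind *a, const conv_kind *b, size_t n)
{
    bool some_better = false;
    for (size_t i = 0; i < n; i++) {
        if (conv_better(b[i], a[i]))
            return false;
        if (conv_better(a[i], b[i]))
            some_better = true;
    }
    return some_better;
}
```

| A call is ambiguous in this model when no candidate satisfies |
| match_better against every other candidate, for example when each of two |
| candidates is better on a different argument. |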
| |
| For the function prototypes (A), (B), and (C) above, the following |
| examples show how the rules apply to different sets of calling argument |
| types: |
| |
| f(vec4, vec4); // exact match of vec4 f(in vec4 x, out vec4 y) |
| f(vec4, uvec4); // exact match of vec4 f(in vec4 x, out uvec4 y) |
| f(vec4, ivec4); // matched to vec4 f(in vec4 x, out vec4 y) |
| // (C) not relevant, can't convert vec4 to |
| // ivec4. (A) better than (B) for 2nd |
| // argument (rule 2), same on first argument. |
| f(ivec4, vec4); // NOT matched. All three match by implicit |
| // conversion. (C) is better than (A) and (B) |
| // on the first argument. (A) is better than |
| // (B) and (C) on the second argument. |
| |
| |
| Modify Section 8.3, Common Functions, p. 84 |
| |
| (add support for single-precision frexp and ldexp functions) |
| |
| Syntax: |
| |
| genType frexp(genType x, out genIType exp); |
| genType ldexp(genType x, in genIType exp); |
| |
| The function frexp() splits each single-precision floating-point number in |
| <x> into a binary significand, a floating-point number in the range [0.5, |
| 1.0), and an integral exponent of two, such that: |
| |
| x = significand * 2 ^ exponent |
| |
| The significand is returned by the function; the exponent is returned in |
| the parameter <exp>. For a floating-point value of zero, the significand |
| and exponent are both zero. For a floating-point value that is an |
| infinity or is not a number, the results of frexp() are undefined. |
| |
| If the input <x> is a vector, this operation is performed in a |
| component-wise manner; the value returned by the function and the value |
| written to <exp> are vectors with the same number of components as <x>. |
| |
| The function ldexp() builds a single-precision floating-point number from |
| each significand component in <x> and the corresponding integral exponent |
| of two in <exp>, returning: |
| |
| significand * 2 ^ exponent |
| |
| If this product is too large to be represented as a single-precision |
| floating-point value, the result is considered undefined. |
| |
| If the input <x> is a vector, this operation is performed in a |
| component-wise manner; the value passed in <exp> and returned by the |
| function are vectors with the same number of components as <x>. |
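| |
| As a rough illustration of the scalar behavior described above, the C |
| sketch below emulates frexp() and ldexp() by manipulating the bits of an |
| IEEE 754 binary32 value. It is not part of this extension, and the names |
| frexp_sketch and ldexp_sketch are hypothetical. It assumes that float is |
| binary32, handles only zero and normalized inputs in frexp_sketch, and |
| accepts only exponents in [-126, 127] in ldexp_sketch. |

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float bits_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Split x into a significand in [0.5, 1.0) (sign preserved) and a power
 * of two. Sketch only: denormals, infinities and NaNs are rejected. */
float frexp_sketch(float x, int *exp)
{
    uint32_t u = float_bits(x);
    int biased = (int)((u >> 23) & 0xFFu);

    if (x == 0.0f) {
        *exp = 0;
        return x;
    }
    assert(biased != 0 && biased != 0xFF);
    *exp = biased - 126;
    /* Keep sign and mantissa, force the exponent field to that of 0.5. */
    return bits_float((u & 0x807FFFFFu) | 0x3F000000u);
}

/* Compute x * 2^exp by building the power of two directly. Sketch only:
 * exp must be in [-126, 127] so that 2^exp is a normalized float. */
float ldexp_sketch(float x, int exp)
{
    assert(exp >= -126 && exp <= 127);
    return x * bits_float((uint32_t)(exp + 127) << 23);
}
```

| For example, frexp_sketch(8.0f, &e) returns 0.5 and sets e to 4, and |
| ldexp_sketch(0.5f, 4) reconstructs 8.0. |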
| |
| |
| (add support for new integer built-in functions) |
| |
| Syntax: |
| |
| genIType bitfieldExtract(genIType value, int offset, int bits); |
| genUType bitfieldExtract(genUType value, int offset, int bits); |
| |
| genIType bitfieldInsert(genIType base, genIType insert, int offset, |
| int bits); |
| genUType bitfieldInsert(genUType base, genUType insert, int offset, |
| int bits); |
| |
| genIType bitfieldReverse(genIType value); |
| genUType bitfieldReverse(genUType value); |
| |
| genIType bitCount(genIType value); |
| genIType bitCount(genUType value); |
| |
| genIType findLSB(genIType value); |
| genIType findLSB(genUType value); |
| |
| genIType findMSB(genIType value); |
| genIType findMSB(genUType value); |
| |
| The function bitfieldExtract() extracts bits <offset> through |
| <offset>+<bits>-1 from each component in <value>, returning them in the |
| least significant bits of the corresponding component of the result. For |
| unsigned data types, the most significant bits of the result will be set |
| to zero. For signed data types, the most significant bits will be set to |
| the value of bit <offset>+<bits>-1. If <bits> is zero, the result will be |
| zero. The result will be undefined if <offset> or <bits> is negative, or |
| if the sum of <offset> and <bits> is greater than the number of bits used |
| to store the operand. Note that for vector versions of bitfieldExtract(), |
| a single pair of <offset> and <bits> values is shared for all components. |
| |
| The function bitfieldInsert() inserts the <bits> least significant bits of |
| each component of <insert> into the corresponding component of <base>. |
| The result will have bits numbered <offset> through <offset>+<bits>-1 |
| taken from bits 0 through <bits>-1 of <insert>, and all other bits taken |
| directly from the corresponding bits of <base>. If <bits> is zero, the |
| result will simply be <base>. The result will be undefined if <offset> or |
| <bits> is negative, or if the sum of <offset> and <bits> is greater than |
| the number of bits used to store the operand. Note that for vector |
| versions of bitfieldInsert(), a single pair of <offset> and <bits> values |
| is shared for all components. |
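| |
| The scalar behavior of bitfieldExtract() and bitfieldInsert() can be |
| sketched in C as follows. This is a CPU-side illustration, not part of |
| this extension; the function names are hypothetical and 32-bit operands |
| are assumed. The cases the text leaves undefined are rejected with |
| assert() here, which is a choice made for the sketch only. |

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set; valid for 0 <= bits <= 32. */
static uint32_t low_mask(int bits)
{
    return bits == 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

/* Unsigned extract: upper bits of the result are zero. */
uint32_t bitfield_extract_u(uint32_t value, int offset, int bits)
{
    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return 0u;
    return (value >> offset) & low_mask(bits);
}

/* Signed extract: upper bits replicate bit <offset>+<bits>-1. */
int32_t bitfield_extract_i(int32_t value, int offset, int bits)
{
    uint32_t field, sign;

    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return 0;
    field = ((uint32_t)value >> offset) & low_mask(bits);
    sign = 1u << (bits - 1);
    /* Sign-extend from bit (bits - 1); 64-bit math avoids overflow. */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}

/* Insert the low `bits` bits of insert into base at `offset`. */
uint32_t bitfield_insert_u(uint32_t base, uint32_t insert, int offset,
                           int bits)
{
    uint32_t mask;

    assert(offset >= 0 && bits >= 0 && offset + bits <= 32);
    if (bits == 0)
        return base;
    mask = low_mask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

| For example, extracting 4 bits at offset 8 from 0x00000F00 gives 15 for |
| the unsigned form and -1 for the signed form, because bit 11 is set. |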
| |
| The function bitfieldReverse() reverses the bits of <value>. The bit |
| numbered <n> of the result will be taken from bit (<bits>-1)-<n> of |
| <value>, where <bits> is the total number of bits used to represent |
| <value>. |
| |
| The function bitCount() returns the number of one bits in the binary |
| representation of <value>. |
| |
| The function findLSB() returns the bit number of the least significant one |
| bit in the binary representation of <value>. If <value> is zero, -1 will |
| be returned. |
| |
| The function findMSB() returns the bit number of the most significant bit |
| in the binary representation of <value>. For positive integers, the |
| result will be the bit number of the most significant one bit. For |
| negative integers, the result will be the bit number of the most |
| significant zero bit. For a <value> of zero or negative one, -1 will be |
| returned. |
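| |
| The following C sketch illustrates the scalar results described above |
| for bitfieldReverse(), bitCount(), findLSB(), and findMSB(). It is not |
| part of this extension; the function names are hypothetical, 32-bit |
| operands are assumed, and simple loops are used for clarity rather than |
| speed. |

```c
#include <stdint.h>

/* Bit n of the result comes from bit 31 - n of value. */
uint32_t bitfield_reverse_u(uint32_t value)
{
    uint32_t result = 0u;
    for (int i = 0; i < 32; i++) {
        result = (result << 1) | (value & 1u);
        value >>= 1;
    }
    return result;
}

/* Number of one bits in value. */
int bit_count_u(uint32_t value)
{
    int count = 0;
    while (value != 0u) {
        value &= value - 1u; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Bit number of the least significant one bit, or -1 for zero. */
int find_lsb_u(uint32_t value)
{
    int n = 0;
    if (value == 0u)
        return -1;
    while ((value & 1u) == 0u) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Bit number of the most significant one bit, or -1 for zero. */
int find_msb_u(uint32_t value)
{
    int n = 31;
    if (value == 0u)
        return -1;
    while ((value & 0x80000000u) == 0u) {
        value <<= 1;
        n--;
    }
    return n;
}

/* Signed form: for negative inputs, the most significant zero bit of
 * value is the most significant one bit of its complement. */
int find_msb_i(int32_t value)
{
    uint32_t u = (uint32_t)value;
    return find_msb_u(value < 0 ? ~u : u);
}
```

| In this sketch find_msb_i() returns -1 for both 0 and -1, matching the |
| text above, and returns 7 for -256, whose highest zero bit is bit 7. |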
| |
| |
| (support for unsigned integer add/subtract with carry-out) |
| |
| Syntax: |
| |
| genUType uaddCarry(genUType x, genUType y, out genUType carry); |
| genUType usubBorrow(genUType x, genUType y, out genUType borrow); |
| |
| The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and |
| <y>, returning the sum modulo 2^32. The value <carry> is set to zero if |
| the sum was less than 2^32, or one otherwise. |
| |
| The function usubBorrow() subtracts the 32-bit unsigned integer or vector |
| <y> from <x>, returning the difference if non-negative, or 2^32 plus the |
| difference otherwise. The value <borrow> is set to zero if x >= y, or |
| one otherwise. |
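| |
| A scalar C sketch of these two functions is shown below, together with |
| one possible use: adding two 64-bit values stored as pairs of 32-bit |
| words. It is not part of this extension. The function names and the |
| word layout (index 0 holds the low word) are assumptions made for the |
| example. |

```c
#include <stdint.h>

/* Sum modulo 2^32; *carry is 1 if the true sum was >= 2^32, else 0. */
uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = (uint32_t)(x + y); /* wraps modulo 2^32 */
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

/* Difference modulo 2^32; *borrow is 0 if x >= y, else 1. */
uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x < y) ? 1u : 0u;
    return (uint32_t)(x - y);
}

/* 64-bit add on {low, high} word pairs, propagating the carry. */
void add64(const uint32_t a[2], const uint32_t b[2], uint32_t out[2])
{
    uint32_t carry;
    out[0] = uadd_carry(a[0], b[0], &carry);
    out[1] = (uint32_t)(a[1] + b[1] + carry);
}
```

| The carry test relies on unsigned wraparound: the wrapped sum is smaller |
| than either operand exactly when the true sum did not fit in 32 bits. |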
| |
| |
| (support for signed and unsigned multiplies, with 32-bit inputs and a |
| 64-bit result spanning two 32-bit outputs) |
| |
| Syntax: |
| |
| void umulExtended(genUType x, genUType y, out genUType msb, |
| out genUType lsb); |
| void imulExtended(genIType x, genIType y, out genIType msb, |
| out genIType lsb); |
| |
| The functions umulExtended() and imulExtended() multiply 32-bit unsigned |
| or signed integers or vectors <x> and <y>, producing a 64-bit result. The |
| 32 least significant bits are returned in <lsb>; the 32 most significant |
| bits are returned in <msb>. |
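| |
| As a reference for the results described above, the scalar behavior can |
| be sketched in C with a 64-bit intermediate product. This is a CPU-side |
| illustration, not part of this extension, and the function names are |
| hypothetical. |

```c
#include <stdint.h>
#include <string.h>

/* Unsigned 32 x 32 -> 64 multiply, split into two 32-bit words. */
void umul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint64_t p = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(p >> 32);
    *lsb = (uint32_t)p;
}

/* Signed 32 x 32 -> 64 multiply. The two halves of the two's complement
 * product are copied bit-for-bit into the signed outputs. */
void imul_extended(int32_t x, int32_t y, int32_t *msb, int32_t *lsb)
{
    int64_t p = (int64_t)x * (int64_t)y;
    uint64_t bits = (uint64_t)p; /* well-defined modulo 2^64 */
    uint32_t hi = (uint32_t)(bits >> 32);
    uint32_t lo = (uint32_t)bits;
    memcpy(msb, &hi, sizeof hi);
    memcpy(lsb, &lo, sizeof lo);
}
```

| For example, multiplying 0xFFFFFFFF by itself gives msb 0xFFFFFFFE and |
| lsb 1, while the signed product of -2 and 3 gives msb -1 and lsb -6. |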
| |
| |
| GLX Protocol |
| |
| None. |
| |
| Dependencies on ARB_gpu_shader_fp64 |
| |
| This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set |
| of implicit conversions supported in the OpenGL Shading Language. If more |
| than one of these extensions is supported, an expression of one type may |
| be converted to another type if that conversion is allowed by any of these |
| specifications. |
| |
| If ARB_gpu_shader_fp64 or a similar extension introducing new data types |
| is not supported, the function overloading rule in the GLSL specification |
| preferring promotion of an input parameter to a smaller type over a |
| larger type is never applicable, as all data types are of the same size. |
| That rule and the example referring to "double" should be removed. |
| |
| |
| Dependencies on NV_gpu_shader5 |
| |
| This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set |
| of implicit conversions supported in the OpenGL Shading Language. If more |
| than one of these extensions is supported, an expression of one type may |
| be converted to another type if that conversion is allowed by any of these |
| specifications. |
| |
| If NV_gpu_shader5 is supported, integer data types are supported with four |
| different precisions (8-, 16-, 32-, and 64-bit) and floating-point data |
| types are supported with three different precisions (16-, 32-, and |
| 64-bit). The extension adds the following rule for output parameters, |
| which is similar to the one present in this extension for input |
| parameters: |
| |
| 5. If the formal parameters in both matches are output parameters, a |
| conversion from a type with a larger number of bits per component is |
| better than a conversion from a type with a smaller number of bits |
| per component. For example, a conversion from an "int16_t" formal |
| parameter type to "int" is better than one from an "int8_t" formal |
| parameter type to "int". |
| |
| Such a rule is not provided in this extension because there is no |
| combination of types in this extension and ARB_gpu_shader_fp64 where this |
| rule has any effect. |
| |
| |
| Errors |
| |
| None |
| |
| |
| New State |
| |
| None |
| |
| New Implementation Dependent State |
| |
| None |
| |
| Issues |
| |
| (1) What should this extension be called? |
| |
| UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating |
| some sort of a play on that name would be viable. However, nothing in |
| this extension should require SM5 hardware, so such a name would be a |
| little misleading and weird. |
| |
| Since the primary purpose is to add integer related functions from |
| GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions |
| for now. |
| |
| (2) Why is some of the formatting in this extension weird? |
| |
| RESOLVED: This extension is formatted to minimize the differences (as |
| reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5 |
| specification. |
| |
| (3) Should ldexp and frexp be included? |
| |
| RESOLVED: Yes. Few GPUs have native instructions to implement these |
| functions. These are generally implemented using existing GLSL built-in |
| functions and the other functions provided by this extension. |
| |
| (4) Should umulExtended and imulExtended be included? |
| |
| RESOLVED: Yes. These functions should be implementable on any GPU that |
| can support the rest of this extension, but the implementation may be |
| complex. The implementation on a GPU that only supports 32bit x 32bit = |
| 32bit multiplication would be quite expensive. However, many GPUs |
| (including OpenGL 4.0 GPUs that already support this function) have a |
| 32bit x 16bit = 48bit multiplier. The implementation there is only |
| trivially more expensive than regular 32bit multiplication. |
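| |
| To give a sense of the cost on a GPU with only 32bit x 32bit = 32bit |
| multiplication, the C sketch below builds the unsigned 64-bit product |
| from four 16-bit partial products. It shows one possible decomposition |
| and is not a description of any particular driver; the function name is |
| hypothetical. |

```c
#include <stdint.h>

/* umulExtended() using only 32-bit multiplies and adds. Each operand is
 * split into 16-bit halves so that no partial product overflows. */
void umul_extended_16(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint32_t xl = x & 0xFFFFu, xh = x >> 16;
    uint32_t yl = y & 0xFFFFu, yh = y >> 16;

    uint32_t ll = xl * yl; /* contributes to bits  0..31 */
    uint32_t lh = xl * yh; /* contributes to bits 16..47 */
    uint32_t hl = xh * yl; /* contributes to bits 16..47 */
    uint32_t hh = xh * yh; /* contributes to bits 32..63 */

    /* Sum of everything landing in bits 16..31, plus its carry-out.
     * At most 3 * 0xFFFF, so this cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

| This version needs four multiplies plus several shifts, masks, and adds |
| per component, which is consistent with the "quite expensive" estimate |
| above. |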
| |
| (5) Should the pack and unpack functions be included? |
| |
| RESOLVED: No. These functions are already available via |
| GL_ARB_shading_language_packing. |
| |
| (6) Should the "BitsTo" functions be included? |
| |
| RESOLVED: No. These functions are already available via |
| GL_ARB_shader_bit_encoding. |
| |
| Revision History |
| |
| Rev. Date Author Changes |
| ---- ----------- -------- ----------------------------------------- |
| 3 31-Mar-2017 Jon Leech Add ES support (OpenGL-Registry/issues/3) |
| 2 7-Jul-2016 idr Fix typo in #extension line |
| 1 20-Jun-2016 idr Initial version based on GL_ARB_gpu_shader5. |