fuchsia / third_party / mesa / 80c5429c69c53b9bc3e62a08618c2d427ba61e53 / . / docs / _extra / specs / MESA_shader_integer_functions.txt

Name | |

MESA_shader_integer_functions | |

Name Strings | |

GL_MESA_shader_integer_functions | |

Contact | |

Ian Romanick <ian.d.romanick@intel.com> | |

Contributors | |

All the contributors of GL_ARB_gpu_shader5 | |

Status | |

Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later | |

Version | |

Version 3, March 31, 2017 | |

Number | |

OpenGL Extension #495 | |

Dependencies | |

This extension is written against the OpenGL 3.2 (Compatibility Profile) | |

Specification. | |

This extension is written against Version 1.50 (Revision 09) of the OpenGL | |

Shading Language Specification. | |

GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required. | |

This extension interacts with ARB_gpu_shader5. | |

This extension interacts with ARB_gpu_shader_fp64. | |

This extension interacts with NV_gpu_shader5. | |

Overview | |

GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this | |

added functionality requires significant hardware support. There are many | |

aspects, however, that can be easily implmented on any GPU with "real" | |

integer support (as opposed to simulating integers using floating point | |

calculations). | |

This extension provides a set of new features to the OpenGL Shading | |

Language to support capabilities of these GPUs, extending the | |

capabilities of version 1.30 of the OpenGL Shading Language and version | |

3.00 of the OpenGL ES Shading Language. Shaders using the new | |

functionality provided by this extension should enable this | |

functionality via the construct | |

#extension GL_MESA_shader_integer_functions : require (or enable) | |

This extension provides a variety of new features for all shader types, | |

including: | |

* support for implicitly converting signed integer types to unsigned | |

types, as well as more general implicit conversion and function | |

overloading infrastructure to support new data types introduced by | |

other extensions; | |

* new built-in functions supporting: | |

* splitting a floating-point number into a significand and exponent | |

(frexp), or building a floating-point number from a significand and | |

exponent (ldexp); | |

* integer bitfield manipulation, including functions to find the | |

position of the most or least significant set bit, count the number | |

of one bits, and bitfield insertion, extraction, and reversal; | |

* extended integer precision math, including add with carry, subtract | |

with borrow, and extenended multiplication; | |

The resulting extension is a strict subset of GL_ARB_gpu_shader5. | |

IP Status | |

No known IP claims. | |

New Procedures and Functions | |

None | |

New Tokens | |

None | |

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(OpenGL Operation) | |

None. | |

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Rasterization) | |

None. | |

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Per-Fragment Operations and the Frame Buffer) | |

None. | |

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(Special Functions) | |

None. | |

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification | |

(State and State Requests) | |

None. | |

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile) | |

Specification (Invariance) | |

None. | |

Additions to the AGL/GLX/WGL Specifications | |

None. | |

Modifications to The OpenGL Shading Language Specification, Version 1.50 | |

(Revision 09) | |

Including the following line in a shader can be used to control the | |

language features described in this extension: | |

#extension GL_MESA_shader_integer_functions : <behavior> | |

where <behavior> is as specified in section 3.3. | |

New preprocessor #defines are added to the OpenGL Shading Language: | |

#define GL_MESA_shader_integer_functions 1 | |

Modify Section 4.1.10, Implicit Conversions, p. 27 | |

(modify table of implicit conversions) | |

Can be implicitly | |

Type of expression converted to | |

--------------------- ----------------- | |

int uint, float | |

ivec2 uvec2, vec2 | |

ivec3 uvec3, vec3 | |

ivec4 uvec4, vec4 | |

uint float | |

uvec2 vec2 | |

uvec3 vec3 | |

uvec4 vec4 | |

(modify second paragraph of the section) No implicit conversions are | |

provided to convert from unsigned to signed integer types or from | |

floating-point to integer types. There are no implicit array or structure | |

conversions. | |

(insert before the final paragraph of the section) When performing | |

implicit conversion for binary operators, there may be multiple data types | |

to which the two operands can be converted. For example, when adding an | |

int value to a uint value, both values can be implicitly converted to uint | |

and float. In such cases, a floating-point type is chosen if either | |

operand has a floating-point type. Otherwise, an unsigned integer type is | |

chosen if either operand has an unsigned integer type. Otherwise, a | |

signed integer type is chosen. | |

Modify Section 5.9, Expressions, p. 57 | |

(modify bulleted list as follows, adding support for implicit conversion | |

between signed and unsigned types) | |

Expressions in the shading language are built from the following: | |

* Constants of type bool, int, int64_t, uint, uint64_t, float, all vector | |

types, and all matrix types. | |

... | |

* The operator modulus (%) operates on signed or unsigned integer scalars | |

or vectors. If the fundamental types of the operands do not match, the | |

conversions from Section 4.1.10 "Implicit Conversions" are applied to | |

produce matching types. ... | |

Modify Section 6.1, Function Definitions, p. 63 | |

(modify description of overloading, beginning at the top of p. 64) | |

Function names can be overloaded. The same function name can be used for | |

multiple functions, as long as the parameter types differ. If a function | |

name is declared twice with the same parameter types, then the return | |

types and all qualifiers must also match, and it is the same function | |

being declared. For example, | |

vec4 f(in vec4 x, out vec4 y); // (A) | |

vec4 f(in vec4 x, out uvec4 y); // (B) okay, different argument type | |

vec4 f(in ivec4 x, out uvec4 y); // (C) okay, different argument type | |

int f(in vec4 x, out ivec4 y); // error, only return type differs | |

vec4 f(in vec4 x, in vec4 y); // error, only qualifier differs | |

vec4 f(const in vec4 x, out vec4 y); // error, only qualifier differs | |

When function calls are resolved, an exact type match for all the | |

arguments is sought. If an exact match is found, all other functions are | |

ignored, and the exact match is used. If no exact match is found, then | |

the implicit conversions in Section 4.1.10 (Implicit Conversions) will be | |

applied to find a match. Mismatched types on input parameters (in or | |

inout or default) must have a conversion from the calling argument type | |

to the formal parameter type. Mismatched types on output parameters (out | |

or inout) must have a conversion from the formal parameter type to the | |

calling argument type. | |

If implicit conversions can be used to find more than one matching | |

function, a single best-matching function is sought. To determine a best | |

match, the conversions between calling argument and formal parameter | |

types are compared for each function argument and pair of matching | |

functions. After these comparisons are performed, each pair of matching | |

functions are compared. A function definition A is considered a better | |

match than function definition B if: | |

* for at least one function argument, the conversion for that argument | |

in A is better than the corresponding conversion in B; and | |

* there is no function argument for which the conversion in B is better | |

than the corresponding conversion in A. | |

If a single function definition is considered a better match than every | |

other matching function definition, it will be used. Otherwise, a | |

semantic error occurs and the shader will fail to compile. | |

To determine whether the conversion for a single argument in one match is | |

better than that for another match, the following rules are applied, in | |

order: | |

1. An exact match is better than a match involving any implicit | |

conversion. | |

2. A match involving an implicit conversion from float to double is | |

better than a match involving any other implicit conversion. | |

3. A match involving an implicit conversion from either int or uint to | |

float is better than a match involving an implicit conversion from | |

either int or uint to double. | |

If none of the rules above apply to a particular pair of conversions, | |

neither conversion is considered better than the other. | |

For the function prototypes (A), (B), and (C) above, the following | |

examples show how the rules apply to different sets of calling argument | |

types: | |

f(vec4, vec4); // exact match of vec4 f(in vec4 x, out vec4 y) | |

f(vec4, uvec4); // exact match of vec4 f(in vec4 x, out ivec4 y) | |

f(vec4, ivec4); // matched to vec4 f(in vec4 x, out vec4 y) | |

// (C) not relevant, can't convert vec4 to | |

// ivec4. (A) better than (B) for 2nd | |

// argument (rule 2), same on first argument. | |

f(ivec4, vec4); // NOT matched. All three match by implicit | |

// conversion. (C) is better than (A) and (B) | |

// on the first argument. (A) is better than | |

// (B) and (C). | |

Modify Section 8.3, Common Functions, p. 84 | |

(add support for single-precision frexp and ldexp functions) | |

Syntax: | |

genType frexp(genType x, out genIType exp); | |

genType ldexp(genType x, in genIType exp); | |

The function frexp() splits each single-precision floating-point number in | |

<x> into a binary significand, a floating-point number in the range [0.5, | |

1.0), and an integral exponent of two, such that: | |

x = significand * 2 ^ exponent | |

The significand is returned by the function; the exponent is returned in | |

the parameter <exp>. For a floating-point value of zero, the significant | |

and exponent are both zero. For a floating-point value that is an | |

infinity or is not a number, the results of frexp() are undefined. | |

If the input <x> is a vector, this operation is performed in a | |

component-wise manner; the value returned by the function and the value | |

written to <exp> are vectors with the same number of components as <x>. | |

The function ldexp() builds a single-precision floating-point number from | |

each significand component in <x> and the corresponding integral exponent | |

of two in <exp>, returning: | |

significand * 2 ^ exponent | |

If this product is too large to be represented as a single-precision | |

floating-point value, the result is considered undefined. | |

If the input <x> is a vector, this operation is performed in a | |

component-wise manner; the value passed in <exp> and returned by the | |

function are vectors with the same number of components as <x>. | |

(add support for new integer built-in functions) | |

Syntax: | |

genIType bitfieldExtract(genIType value, int offset, int bits); | |

genUType bitfieldExtract(genUType value, int offset, int bits); | |

genIType bitfieldInsert(genIType base, genIType insert, int offset, | |

int bits); | |

genUType bitfieldInsert(genUType base, genUType insert, int offset, | |

int bits); | |

genIType bitfieldReverse(genIType value); | |

genUType bitfieldReverse(genUType value); | |

genIType bitCount(genIType value); | |

genIType bitCount(genUType value); | |

genIType findLSB(genIType value); | |

genIType findLSB(genUType value); | |

genIType findMSB(genIType value); | |

genIType findMSB(genUType value); | |

The function bitfieldExtract() extracts bits <offset> through | |

<offset>+<bits>-1 from each component in <value>, returning them in the | |

least significant bits of corresponding component of the result. For | |

unsigned data types, the most significant bits of the result will be set | |

to zero. For signed data types, the most significant bits will be set to | |

the value of bit <offset>+<base>-1. If <bits> is zero, the result will be | |

zero. The result will be undefined if <offset> or <bits> is negative, or | |

if the sum of <offset> and <bits> is greater than the number of bits used | |

to store the operand. Note that for vector versions of bitfieldExtract(), | |

a single pair of <offset> and <bits> values is shared for all components. | |

The function bitfieldInsert() inserts the <bits> least significant bits of | |

each component of <insert> into the corresponding component of <base>. | |

The result will have bits numbered <offset> through <offset>+<bits>-1 | |

taken from bits 0 through <bits>-1 of <insert>, and all other bits taken | |

directly from the corresponding bits of <base>. If <bits> is zero, the | |

result will simply be <base>. The result will be undefined if <offset> or | |

<bits> is negative, or if the sum of <offset> and <bits> is greater than | |

the number of bits used to store the operand. Note that for vector | |

versions of bitfieldInsert(), a single pair of <offset> and <bits> values | |

is shared for all components. | |

The function bitfieldReverse() reverses the bits of <value>. The bit | |

numbered <n> of the result will be taken from bit (<bits>-1)-<n> of | |

<value>, where <bits> is the total number of bits used to represent | |

<value>. | |

The function bitCount() returns the number of one bits in the binary | |

representation of <value>. | |

The function findLSB() returns the bit number of the least significant one | |

bit in the binary representation of <value>. If <value> is zero, -1 will | |

be returned. | |

The function findMSB() returns the bit number of the most significant bit | |

in the binary representation of <value>. For positive integers, the | |

result will be the bit number of the most significant one bit. For | |

negative integers, the result will be the bit number of the most | |

significant zero bit. For a <value> of zero or negative one, -1 will be | |

returned. | |

(support for unsigned integer add/subtract with carry-out) | |

Syntax: | |

genUType uaddCarry(genUType x, genUType y, out genUType carry); | |

genUType usubBorrow(genUType x, genUType y, out genUType borrow); | |

The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and | |

<y>, returning the sum modulo 2^32. The value <carry> is set to zero if | |

the sum was less than 2^32, or one otherwise. | |

The function usubBorrow() subtracts the 32-bit unsigned integer or vector | |

<y> from <x>, returning the difference if non-negative or 2^32 plus the | |

difference, otherwise. The value <borrow> is set to zero if x >= y, or | |

one otherwise. | |

(support for signed and unsigned multiplies, with 32-bit inputs and a | |

64-bit result spanning two 32-bit outputs) | |

Syntax: | |

void umulExtended(genUType x, genUType y, out genUType msb, | |

out genUType lsb); | |

void imulExtended(genIType x, genIType y, out genIType msb, | |

out genIType lsb); | |

The functions umulExtended() and imulExtended() multiply 32-bit unsigned | |

or signed integers or vectors <x> and <y>, producing a 64-bit result. The | |

32 least significant bits are returned in <lsb>; the 32 most significant | |

bits are returned in <msb>. | |

GLX Protocol | |

None. | |

Dependencies on ARB_gpu_shader_fp64 | |

This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set | |

of implicit conversions supported in the OpenGL Shading Language. If more | |

than one of these extensions is supported, an expression of one type may | |

be converted to another type if that conversion is allowed by any of these | |

specifications. | |

If ARB_gpu_shader_fp64 or a similar extension introducing new data types | |

is not supported, the function overloading rule in the GLSL specification | |

preferring promotion an input parameters to smaller type to a larger type | |

is never applicable, as all data types are of the same size. That rule | |

and the example referring to "double" should be removed. | |

Dependencies on NV_gpu_shader5 | |

This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set | |

of implicit conversions supported in the OpenGL Shading Language. If more | |

than one of these extensions is supported, an expression of one type may | |

be converted to another type if that conversion is allowed by any of these | |

specifications. | |

If NV_gpu_shader5 is supported, integer data types are supported with four | |

different precisions (8-, 16, 32-, and 64-bit) and floating-point data | |

types are supported with three different precisions (16-, 32-, and | |

64-bit). The extension adds the following rule for output parameters, | |

which is similar to the one present in this extension for input | |

parameters: | |

5. If the formal parameters in both matches are output parameters, a | |

conversion from a type with a larger number of bits per component is | |

better than a conversion from a type with a smaller number of bits | |

per component. For example, a conversion from an "int16_t" formal | |

parameter type to "int" is better than one from an "int8_t" formal | |

parameter type to "int". | |

Such a rule is not provided in this extension because there is no | |

combination of types in this extension and ARB_gpu_shader_fp64 where this | |

rule has any effect. | |

Errors | |

None | |

New State | |

None | |

New Implementation Dependent State | |

None | |

Issues | |

(1) What should this extension be called? | |

UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating | |

some sort of a play on that name would be viable. However, nothing in | |

this extension should require SM5 hardware, so such a name would be a | |

little misleading and weird. | |

Since the primary purpose is to add integer related functions from | |

GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions | |

for now. | |

(2) Why is some of the formatting in this extension weird? | |

RESOLVED: This extension is formatted to minimize the differences (as | |

reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5 | |

specification. | |

(3) Should ldexp and frexp be included? | |

RESOLVED: Yes. Few GPUs have native instructions to implement these | |

functions. These are generally implemented using existing GLSL built-in | |

functions and the other functions provided by this extension. | |

(4) Should umulExtended and imulExtended be included? | |

RESOLVED: Yes. These functions should be implementable on any GPU that | |

can support the rest of this extension, but the implementation may be | |

complex. The implementation on a GPU that only supports 32bit x 32bit = | |

32bit multiplication would be quite expensive. However, many GPUs | |

(including OpenGL 4.0 GPUs that already support this function) have a | |

32bit x 16bit = 48bit multiplier. The implementation there is only | |

trivially more expensive than regular 32bit multiplication. | |

(5) Should the pack and unpack functions be included? | |

RESOLVED: No. These functions are already available via | |

GL_ARB_shading_language_packing. | |

(6) Should the "BitsTo" functions be included? | |

RESOLVED: No. These functions are already available via | |

GL_ARB_shader_bit_encoding. | |

Revision History | |

Rev. Date Author Changes | |

---- ----------- -------- ----------------------------------------- | |

3 31-Mar-2017 Jon Leech Add ES support (OpenGL-Registry/issues/3) | |

2 7-Jul-2016 idr Fix typo in #extension line | |

1 20-Jun-2016 idr Initial version based on GL_ARB_gpu_shader5. |