In addition to many extensions to the classical fixed-function OpenGL rendering pipeline, Onyx4 and Silicon Graphics Prism systems support the ARB_vertex_program and ARB_fragment_program extensions for vertex and fragment programs.
Collectively, vertex and fragment programs are referred to as graphics pipeline programs or just pipeline programs.
These extensions allow applications to replace most of the normal fixed-function transformation, lighting, rasterization, and texturing operations with application-defined programs that execute on the graphics hardware. The extensions enable a nearly unlimited range of effects previously available only through offline rendering or by multipass fixed-function algorithms.
This chapter describes how to define and use vertex and fragment programs and includes an overview of the programming language in which these programs are specified. This chapter also briefly describes the following obsolete (legacy) vertex and fragment program extensions supported only for compatibility:
ATI_fragment_shader
EXT_vertex_shader
The structure of this chapter differs from that of the other chapters that describe extensions because of the level of detail given to programming the vertex and fragment programs. This chapter uses the following structure:
The ARB_vertex_program and ARB_fragment_program extensions allow applications to replace respectively the fixed-function vertex processing and fragment processing pipeline of OpenGL 1.3 with user-defined programs.
The fixed-function rendering pipeline of OpenGL 1.3 together with the wide range of OpenGL extensions supported by Onyx4 and Silicon Graphics Prism systems is very flexible, but the achievable rendering effects are constrained by the hardwired algorithms of the fixed-function pipeline. If an application needs to use a custom lighting model, to combine multiple textures in a way not expressible by register combiners, or to do anything else that is difficult to express within the fixed-function pipeline, it should consider whether the desired effect can be expressed as a vertex and/or fragment program.
While pipeline programs are not yet expressible in a fully general-purpose, Turing-complete language, the limited programmability provided is more than adequate for many advanced rendering algorithms. The capabilities of pipeline programs are rapidly growing as more general-purpose languages are supported by graphics hardware.
Before pipeline programs, the most common way to implement advanced rendering algorithms, while still taking advantage of graphics hardware, was to decompose the algorithm into a series of steps, each expressible as a single rendering pass of the fixed-function OpenGL pipeline. By accumulating intermediate results in pixel buffers, aux buffers, or textures, very complex effects could be built up by such multipass rendering.
This approach is widely used in older programs and in languages such as the SGI OpenGL Shader, a compiler that turns a high-level shading language program into an equivalent series of fixed-function rendering passes.
The disadvantages of multipass rendering are the following:
Performance
Multiple rendering passes usually require re-transforming geometry for each pass. The multiple passes consume additional CPU-to-graphics bandwidth for re-copying geometry and consume additional graphics memory and bandwidth for storing intermediate results; all of these requirements reduce performance.
Complexity
Converting a complex rendering algorithm into multiple fixed-function passes can be a tedious task that requires a deep understanding of the capabilities of the graphics pipeline. The meaning of the resulting passes is difficult to infer even with knowledge of the algorithm. Also, restructuring applications to perform multipass rendering is often necessary. While software like the SGI OpenGL Shader can assist in these steps, complex multipass rendering is still less straightforward than simply expressing the algorithm as a single vertex or fragment program.
Accuracy
The accuracy achievable with multipass rendering is constrained by the limited precision of the intermediate storage (such as pixel buffers, aux buffers, and textures) used to accumulate intermediate results between passes. Typically, the internal precision of the vertex and fragment processing pipelines is much higher than the external precision (8–12 bits/color component) in which intermediate data can be stored. Errors are generated when clamping intermediate data to external precision, and those errors can rapidly accumulate in the later rendering passes.
For all these reasons, pipeline programs are the preferred way of expressing rendering algorithms too complex to fit in a single fixed-function rendering pass. However, in many cases, the fixed-function pipeline is still more than adequate for application needs. You must also be cautious because current graphics hardware only supports pipeline programs of a limited length and complexity and because performance may degrade rapidly if certain types of programmable operations are combined or expressed in the wrong order.
The vertex and fragment program extensions are much more complicated than most fixed-function OpenGL extensions. This section describes the extensions in the following subsections:
Pipeline programs are represented by object names (of type GLuint) that are managed in exactly the same fashion as texture and display list names with the following routines for allocating unused program names, deleting programs, and testing if a name refers to a valid program:
    void glGenProgramsARB(GLsizei n, GLuint *programs);
    void glDeleteProgramsARB(GLsizei n, const GLuint *programs);
    GLboolean glIsProgramARB(GLuint program);
To bind a program name as the currently active vertex or fragment program, make the following call:
    void glBindProgramARB(GLenum target, GLuint program);
Set the argument target to GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB. Similar to texture objects, there is a default program name of 0 bound for each type of program in the event that the application does not bind a generated name.
To define the contents of a vertex or fragment program for the currently bound program name, make the following call:
    void glProgramStringARB(GLenum target, GLenum format, GLsizei length, const GLvoid *string);
The argument values are defined as follows:
target
Specifies the type of program being defined and must be GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.
format
Specifies the encoding of the program string and must be GL_PROGRAM_FORMAT_ASCII_ARB, indicating a 7-bit ASCII character string.
string
Contains the program string. If string is a valid program (as described in section “Structure of Pipeline Programs”), the program bound to target will be updated to execute the program when the corresponding target is enabled.
length
Specifies the length of the string. Because the length is specified in the call, string need not have a trailing NUL byte, unlike most C language strings.
To use the currently bound vertex or fragment program (substituting it for the corresponding fixed functionality, as described in the next section) or to return to using the fixed-function pipeline, call glEnable() or glDisable(), respectively, with parameters GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.
Vertex programs substitute for the following OpenGL fixed vertex processing functionality:
Modelview and projection matrix vertex transformations
Vertex weighting and blending (if the ARB_vertex_blend extension is supported)
Normal transformation, rescaling, and normalization
Color material
Per-vertex lighting
Texture coordinate generation and texture matrix transformations
Per-vertex point size computations (if the ARB_point_parameters extension is supported)
Per-vertex fog coordinate computations (if the EXT_fog_coord extension is supported)
User-defined clip planes
Normalization of GL_AUTO_NORMAL evaluated normals
All of the preceding functionality when computing the current raster position
The following fixed vertex processing functionality is always performed even when using vertex programs:
Clipping to the view frustum
Perspective divide (division by w)
The viewport transformation
The depth range transformation
Front and back color selection (for two-sided lighting and coloring)
Clamping the primary and secondary colors to [0,1]
Primitive assembly and subsequent operations
Evaluators (except for GL_AUTO_NORMAL)
Fragment programs substitute for the following OpenGL fixed fragment processing functionality:
Texture application (including multitexture, texture combiner, shadow mapping, and any other fixed-function texturing extensions)
Color sum (if the EXT_secondary_color extension is supported)
Fog application
Both vertex and fragment programs are expressed in a low-level, register-based language similar to a traditional CPU assembler language. However, the registers are four-element vectors, supporting vector data types such as homogeneous coordinates (X, Y, Z, W components) and colors (R, G, B, A components). The instruction sets for both types of programs are augmented to support common mathematical and graphical operations on these four-element vectors.
A pipeline program has the following structure:
    program-type
    statement1
    statement2
    . . .
    statementn
    END
For vertex programs, program-type must be !!ARBvp1.0. For fragment programs, it must be !!ARBfp1.0.
Statements may be one of the following:
Program options
Naming statements
Program instructions
The statements must be terminated by semicolons (;). Whitespace (spaces, tabs, newlines, and carriage returns) is ignored, although programs are typically written with one statement per line for clarity. Comments, which are ignored, are introduced with the # character and continue to the next newline or carriage return.
When executing programs, instructions are processed in the order they appear in the program string. There are no looping or branching constructs in either vertex or fragment programs.
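The structural rules above can be illustrated outside of OpenGL with a small C sketch. The helper names `program_kind` and `count_statements` are hypothetical and are not part of any OpenGL API. The sketch only inspects the header token and counts semicolon-terminated statements while skipping `#` comments; it does not validate the program the way an implementation would.

```c
#include <string.h>

/* Hypothetical helper, not an OpenGL call: classify a program string by
 * its header. Returns 'v' for !!ARBvp1.0, 'f' for !!ARBfp1.0, 0 otherwise. */
char program_kind(const char *src)
{
    if (strncmp(src, "!!ARBvp1.0", 10) == 0) return 'v';
    if (strncmp(src, "!!ARBfp1.0", 10) == 0) return 'f';
    return 0;
}

/* Hypothetical helper: count semicolon-terminated statements, ignoring
 * everything from '#' up to the next newline or carriage return. */
int count_statements(const char *src)
{
    int count = 0;
    int in_comment = 0;

    for (; *src != '\0'; ++src) {
        if (in_comment) {
            if (*src == '\n' || *src == '\r')
                in_comment = 0;
        } else if (*src == '#') {
            in_comment = 1;
        } else if (*src == ';') {
            ++count;
        }
    }
    return count;
}
```

Because comments are skipped, a semicolon that appears inside a comment does not count as a statement terminator in this sketch.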
Statements that control extended language features are called option statements. The following is an example:
    # Vertex program is position-invariant
    OPTION ARB_position_invariant;
The following are the currently defined program options:
Fog application options (fragment programs only)
Precision hint options (fragment programs only)
Position-Invariant option (vertex programs only)
Future OpenGL extensions may introduce additional program options; such options are only valid if the corresponding extension is supported by the implementation.
Fog Application Options (fragment programs only)
These options allow use of the OpenGL fixed-function fog model in a fragment program without explicitly performing the fog computation.
If a fragment program specifies one of the options ARB_fog_exp, ARB_fog_exp2, or ARB_fog_linear, the program will apply fog to the program's final clamped color output using a fog mode of GL_EXP, GL_EXP2, or GL_LINEAR, respectively.
Using fog in this fashion consumes extra program resources. The program will fail to load under the following conditions:
You specify a fog option and the number of temporaries the program contains exceeds the implementation-dependent limit minus one.
You specify a fog option and the number of attributes the program contains exceeds the implementation-dependent limit minus one.
You specify a fog option and the number of parameters the program contains exceeds the implementation-dependent limit minus two.
You specify the ARB_fog_exp option and the number of instructions or ALU instructions the program contains exceeds the implementation-dependent limit minus three.
You specify the ARB_fog_exp2 option and the number of instructions or ALU instructions the program contains exceeds the implementation-dependent limit minus four.
You specify the ARB_fog_linear option and the number of instructions or ALU instructions the program contains exceeds the implementation-dependent limit minus two.
You specify more than one of the fog options.
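The resource conditions listed above amount to reserving part of each limit for the fog computation. The following C sketch restates them as a single check. All type and function names are hypothetical, and for simplicity it tracks one instruction count rather than separate instruction and ALU instruction counts.

```c
/* Hypothetical types for illustration; none of these names come from OpenGL. */
enum fog_option { FOG_NONE, FOG_EXP, FOG_EXP2, FOG_LINEAR };

struct program_counts {
    int temporaries;
    int attributes;
    int parameters;
    int instructions;
};

/* Returns 1 if a program with the given resource counts stays within the
 * limits once the chosen fog option's overhead is reserved, and 0 if it
 * would fail to load for that reason. */
int fog_option_fits(enum fog_option opt,
                    struct program_counts used,
                    struct program_counts limit)
{
    int extra_instructions;

    switch (opt) {
    case FOG_EXP:    extra_instructions = 3; break;
    case FOG_EXP2:   extra_instructions = 4; break;
    case FOG_LINEAR: extra_instructions = 2; break;
    default:
        /* No fog option: only the plain limits apply. */
        return used.temporaries  <= limit.temporaries &&
               used.attributes   <= limit.attributes &&
               used.parameters   <= limit.parameters &&
               used.instructions <= limit.instructions;
    }

    /* Any fog option reserves one temporary, one attribute, and two
     * parameters, plus a mode-dependent number of instructions. */
    return used.temporaries  <= limit.temporaries - 1 &&
           used.attributes   <= limit.attributes - 1 &&
           used.parameters   <= limit.parameters - 2 &&
           used.instructions <= limit.instructions - extra_instructions;
}
```

In a real application, the limits would be queried from the implementation rather than hard-coded.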
Precision Hint Options (fragment programs only)
Fragment program computations are carried out at an implementation-dependent precision. However, some implementations may be able to perform fragment program computations at more than one precision and may be able to trade off computation precision for performance.
If a fragment program specifies the ARB_precision_hint_fastest program option, implementations should select precision to minimize program execution time with possibly reduced precision. If a fragment program specifies the ARB_precision_hint_nicest program option, implementations should maximize the precision with a longer execution time.
Only one precision control option may be specified by any given fragment program. A fragment program that specifies both the ARB_precision_hint_fastest and ARB_precision_hint_nicest program options will fail to load.
Position-Invariant Option (vertex programs only)
If a vertex program specifies the ARB_position_invariant option, the program is used to generate all transformed vertex attributes, except for position. Instead, clip coordinates are computed, and user clipping is performed as in the fixed-function OpenGL pipeline. Position-invariant vertex programs should be used when the transformed position of a vertex must be the same whether vertex program mode is enabled or fixed-function vertex processing is performed. This allows mixing both types of vertex processing in multipass rendering algorithms.
When the position-invariant option is specified in a vertex program, vertex programs are not allowed to produce a transformed position. Therefore, result.position may not be bound or written by such a program. Additionally, the vertex program will fail to load if the number of instructions it contains exceeds the implementation-dependent limit minus four.
Statements that associate identifiers with attributes, parameters, temporaries, or program output are called naming statements. The following are the six types of naming statements:
Attribute statements
Parameter statements
Temporary statements
Address statements
Alias statements
Output statements
Attribute Statements
Attribute statements bind an identifier to a vertex or fragment attribute supplied to the program. Attributes are associated with the particular vertex or fragment being processed, and their values typically vary for every invocation of a program. They are defined in OpenGL through commands such as glVertex3f() or glColor4ub(), or in the case of fragments, generated by vertex processing.
A few examples of attributes are vertex position, color, and texture coordinates; or fragment position, color, and fog coordinate. Section “Vertex and Fragment Attributes” provides a complete list of vertex and fragment attributes. The following are examples of attribute statements:
    # Bind vertex position (e.g. glVertex) to attribute `position'
    ATTRIB position = vertex.position;
    #
    # Bind fragment texture coordinate set one to attribute `texcoord'
    ATTRIB texcoord = fragment.texcoord[1];
Attributes are read-only within a program.
Parameter Statements
Parameter statements bind an identifier to a program parameter. Parameters have the following four types:
Program environment parameters
Constants that are shared by all programs of the same type (vertex or fragment).
Program local parameters
Constants that are restricted to a single vertex or fragment program.
OpenGL state values
Items such as transformation matrices, or lighting, material, and texture coordinate generation parameters.
Constants declared within a program
Either comma-delimited lists of one to four values enclosed in braces or single values not enclosed in braces. In a braced list with fewer than four values, the missing second, third, and fourth values default to 0.0, 0.0, and 1.0, respectively. For a single value without braces, all four components of the parameter are initialized to the specified value.
Parameters may also be declared as arrays, which are initialized from subranges of program parameters or state that are themselves arrays.
Section “Vertex and Fragment Program Parameters” provides a complete list of program environment parameters, program local parameters, and OpenGL state that may be used as parameters. The following are examples of parameter statements:
    # `var' is bound to program environment parameter 1
    PARAM var = program.env[1];
    # `vars' is bound to program environment parameters 0-3
    PARAM vars[4] = program.env[0..3];
    # `lvar' is bound to program local parameter 2
    PARAM lvar = program.local[2];
    # `ambient' is bound to the ambient color of light 1
    PARAM ambient = state.light[1].ambient;
    # `cplane' is bound to the coefficients of user clip plane 0
    PARAM cplane = state.clip[0].plane;
    # `coeffs' is bound to the four constant values -1.0, 1.0, e, pi
    PARAM coeffs = { -1.0, 1.0, 2.71828, 3.14159 };
    # `ones' is bound to the constant values 1.0, 1.0, 1.0, 1.0
    PARAM ones = 1.0;
Parameters are read-only within a program.
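The defaulting rules for constants declared within a program can be sketched in C as follows. The function names are hypothetical and serve only to make the rules concrete: a braced list is padded with 0.0, 0.0, and 1.0, while a bare scalar is replicated into all four components.

```c
/* Hypothetical helper: expand a brace-enclosed constant with n values
 * (1 to 4) into a full four-component vector. Missing second and third
 * components default to 0.0 and a missing fourth component to 1.0. */
void expand_braced_constant(const float *values, int n, float out[4])
{
    const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    int i;

    for (i = 0; i < 4; ++i)
        out[i] = (i < n) ? values[i] : defaults[i];
}

/* Hypothetical helper: a bare scalar constant (no braces) is replicated
 * into all four components. */
void expand_scalar_constant(float value, float out[4])
{
    int i;

    for (i = 0; i < 4; ++i)
        out[i] = value;
}
```

Under these rules, a declaration such as `PARAM p = { 0.5, 0.25 };` would correspond to the vector (0.5, 0.25, 0.0, 1.0).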
Temporary Statements
Temporary statements declare temporary variables; these are read/write registers used only within a single execution of a program. Initially, the contents of temporaries are undefined. Temporaries are declared as in the following examples:
    # Declare a single temporary
    TEMP temp1;
    # Declare multiple temporaries in a single statement
    TEMP temp2, temp3;
The maximum number of temporaries that can be declared in a single program is implementation-dependent and is described further in section “Program Resource Limits and Usage”.
Address Statements
Address statements declare address registers; these are read/write registers used only within a single execution of a vertex program and allow a form of indirect accessing into parameter arrays. Address statements are only supported in vertex programs.
Address registers are declared either singly or several at a time, like temporaries, but using the ADDRESS statement, as in the following example:
    # Declare two address registers `addr' and `index'
    ADDRESS addr, index;
Only the first component of an address register (.x) is used. For an address register addr, this component is referred to as addr.x. Section “Program Instructions” further describes register component selection. As shown in the following example, address registers are loaded with the ARL command:
    # Load address register with the 2nd component (.y) of temporary temp0
    ARL addr.x, temp0.y;
The value loaded is converted to an integer by rounding toward negative infinity (a floor operation).
Given a parameter array and an address register, a particular element of the array can be selected based on the address register by using the subscript notation [addr.x+offset], where offset is a value in the range –64..63. The following example illustrates the use of the subscript notation:
    # Params is bound to the first 8 elements of the program local
    # parameters.
    PARAM params[8] = program.local[0..7];
    # Move parameter at index addr.x+2 into register temp0
    MOV temp0, params[addr.x+2];
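To make the address arithmetic concrete, the following C sketch emulates loading an address register and selecting an array element with `[addr.x+offset]`. Both function names are hypothetical. Returning -1 for an out-of-range access is a choice made for this sketch only and does not describe how an OpenGL implementation behaves.

```c
/* Hypothetical emulation of ARL: round toward negative infinity (floor).
 * Assumes the value fits in an int. */
int arl_load(float value)
{
    int i = (int)value;      /* C conversion truncates toward zero */

    if ((float)i > value)    /* negative non-integers need one step down */
        --i;
    return i;
}

/* Hypothetical emulation of params[addr.x + offset]. Returns the selected
 * element index, or -1 when the offset is outside -64..63 or the computed
 * index falls outside an array of `count` elements. */
int relative_index(int addr_x, int offset, int count)
{
    int index;

    if (offset < -64 || offset > 63)
        return -1;
    index = addr_x + offset;
    if (index < 0 || index >= count)
        return -1;
    return index;
}
```

For example, loading 3.9 yields an address of 3, so an offset of 2 into an eight-element array selects element 5.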
Alias Statements
Alias statements declare identifiers that are defined to have the same meaning as another already declared identifier of any type. They do not count towards program resource limits. For example, a temporary can be aliased as follows:
    # Declare temporary `temp0'
    TEMP temp0;
    # Declare alias `alias' for temp0. `alias' and `temp0' may be used
    # interchangeably
    ALIAS alias = temp0;
Output Statements
Output statements declare identifiers that bind to program output. Output depends on the type of program.
For vertex programs, output includes values such as transformed vertex position, primary and secondary colors, and transformed texture coordinates, which are passed on to rasterization. After rasterization, interpolated results of this output are available as attributes of fragment programs or are used in fixed-function fragment processing in place of the attributes resulting from fixed-function vertex processing.
For fragment programs, output includes color(s) and depth values, which are passed on to raster processing in place of the colors and depths generated by fixed-function fragment processing. Section “Pipeline Program Input and Output” describes program output further.
The following are examples of output statements:
    # Bind vertex output position (e.g., transformed vertex coordinates)
    # to register `windowpos'
    OUTPUT windowpos = result.position;
    # Bind fragment output depth (e.g., Z value) to register `depth'
    OUTPUT depth = result.depth;
Output is write-only within a program.
Pipeline program instructions are either four-element vector or scalar operations performed on one, two, or three source operands and one destination operand. The operands may be either attribute, parameter, or temporary registers. The general format of instructions is one of the following:
    mnemonic dstreg,srcreg1
    mnemonic dstreg,srcreg1,srcreg2
    mnemonic dstreg,srcreg1,srcreg2,srcreg3
The fields are defined as follows:
mnemonic
The instruction name
dstreg
The destination register name
srcregi
Source register names (srcreg1, srcreg2, and srcreg3)
Section “Program Instruction Summary” provides a complete list of instructions supported by vertex and fragment programs.
Scalar Component Selection
When a scalar source operand is required, identify it by appending one of .x, .y, .z, or .w to the register name to select the first, second, third, or fourth components, respectively, of the source register. These selectors are intended to refer to the X, Y, Z, and W components of a register being used as an XYZW vector. The following example computes the cosine of the second component of the source register coord:
    COS result, coord.y;
In fragment programs, but not in vertex programs, the selectors .r, .g, .b, and .a may be used interchangeably with the corresponding .x, .y, .z, and .w selectors. These selectors are intended to refer to the red, green, blue, and alpha components of a register being used as an RGBA color. The following example computes the base 2 logarithm of the fourth component of the source register color:
    LG2 result, color.a;
Vector Component Negation and Swizzling
Any source register may be modified by prepending a minus sign (-) to the register name. Each component is negated, and the resulting vector is used as input to the instruction. For example, the following two statements are equivalent:
    # Compute result = src0 - src1
    SUB result, src0, src1;
    # Compute result = src0 + (-src1) = src0 - src1
    ADD result, src0, -src1;
In addition, components of a source register may be arbitrarily selected and reordered before being used as input to an instruction. This operation is called swizzling. To swizzle a source register, append a four-letter suffix of .???? to the register name, where each ? may be one of the component selectors x, y, z, or w. In fragment programs, but not in vertex programs, the selectors r, g, b, or a may also be used.
The selectors map components of the source register; the first, second, third, and fourth selectors determine the source of the first, second, third, and fourth components, respectively, of the actual register value passed to the instruction. For example, the following code reverses the components of a register:
    PARAM src = { 1.0, 2.0, 3.0, 4.0 };
    TEMP result;
    MOV result, src.wzyx;
    # result now contains { 4.0, 3.0, 2.0, 1.0 }
Swizzling may copy a component of the source register into multiple components of the instruction input by replicating selectors. For example, the following code replicates the first and third components of a register:
    PARAM src = { 1.0, 2.0, 3.0, 4.0 };
    TEMP result;
    MOV result, src.xxzz;
    # result now contains { 1.0, 1.0, 3.0, 3.0 }
To replicate a single component of a register into all four components of the instruction input, a shorthand notation using a single component selector may be used. The following code is equivalent to replicating the same component selector four times:
    PARAM src = { 1.0, 2.0, 3.0, 4.0 };
    TEMP result;
    # src.y is equivalent to src.yyyy
    MOV result, src.y;
    # result now contains { 2.0, 2.0, 2.0, 2.0 }
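The swizzle rules can also be expressed as a short C emulation. The function names are hypothetical, and the sketch simplifies one point: it accepts any mix of xyzw and rgba selectors in a fragment program, which an actual implementation may not.

```c
#include <string.h>

/* Map one selector character to a component index 0..3, or -1 if invalid.
 * The rgba selectors are accepted only when is_fragment is nonzero. */
int selector_index(char c, int is_fragment)
{
    switch (c) {
    case 'x': return 0;
    case 'y': return 1;
    case 'z': return 2;
    case 'w': return 3;
    default:  break;
    }
    if (is_fragment) {
        switch (c) {
        case 'r': return 0;
        case 'g': return 1;
        case 'b': return 2;
        case 'a': return 3;
        default:  break;
        }
    }
    return -1;
}

/* Hypothetical emulation of a source swizzle. `suffix` holds either four
 * selectors ("wzyx") or one selector ("y", shorthand for "yyyy").
 * Returns 1 on success and 0 if the suffix is malformed. */
int swizzle(const float src[4], const char *suffix, int is_fragment,
            float out[4])
{
    size_t len = strlen(suffix);
    float tmp[4];
    int i;

    if (len != 1 && len != 4)
        return 0;
    for (i = 0; i < 4; ++i) {
        int idx = selector_index(suffix[len == 1 ? 0 : i], is_fragment);
        if (idx < 0)
            return 0;
        tmp[i] = src[idx];
    }
    memcpy(out, tmp, sizeof tmp);   /* safe even when out aliases src */
    return 1;
}
```

The temporary copy mirrors the fact that the swizzled value is formed before the instruction writes its destination, so a register can be swizzled onto itself.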
Destination Write Masking
Program instructions write a four-component result vector to a single destination register. Writes to individual components of the destination register may be controlled by specifying a component write mask. To mask a destination register, append a period (.) followed by selectors for the components to be written (between one and four). The selectors must be unique and must appear in the order xyzw. In fragment programs, but not in vertex programs, the rgba selectors may also be used. For example, the following line writes only the first and third components of a vector and leaves the second and fourth components unchanged:
    MOV result.xz, src;
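The uniqueness and ordering rules for write masks can be checked mechanically. The following C sketch uses hypothetical function names and handles only the xyzw selectors; the rgba selectors allowed in fragment programs are left out for brevity.

```c
/* Hypothetical helper: convert a write-mask suffix such as "xz" into a
 * 4-bit mask (bit 0 = x ... bit 3 = w). Selectors must be unique and must
 * appear in xyzw order; returns 0 for an empty or invalid suffix. */
unsigned parse_write_mask(const char *suffix)
{
    const char order[4] = { 'x', 'y', 'z', 'w' };
    unsigned mask = 0;
    int next = 0;           /* lowest component still allowed */

    for (; *suffix != '\0'; ++suffix) {
        int i = next;

        while (i < 4 && order[i] != *suffix)
            ++i;
        if (i == 4)
            return 0;       /* unknown, repeated, or out-of-order selector */
        mask |= 1u << i;
        next = i + 1;
    }
    return mask;
}

/* Hypothetical emulation of a masked write: components whose mask bit is
 * clear keep their previous contents. */
void masked_write(float dst[4], const float value[4], unsigned mask)
{
    int i;

    for (i = 0; i < 4; ++i)
        if (mask & (1u << i))
            dst[i] = value[i];
}
```

With the mask "xz", only the first and third components of the destination change, which matches the MOV example above.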
Fragment Program Destination Clamping
In fragment programs, but not in vertex programs, instructions may be modified to clamp values to the range [0,1] before writing to the unmasked components of a destination register. Clamping is particularly useful when operating in the [0,1] color space limits of the output framebuffer, when using texture coordinates, when computing address register offsets, or for other purposes. Fragment program instructions support clamping by appending the suffix _SAT to the instruction mnemonic. The following example copies a color vector, clamping the RGB components to [0,1] and using a write mask to leave the A component of the destination unchanged:
    PARAM color = { -0.1, 0.7, 1.2, 1.0 };
    TEMP result;
    MOV_SAT result.rgb, color;
    # result now contains { 0.0, 0.7, 1.0, ??? }
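As a rough C illustration of the same operation, the sketch below clamps each written component to [0,1] and skips components excluded by the mask. The function names and the bitmask encoding (bit 0 for the first component through bit 3 for the fourth) are assumptions made for this example.

```c
/* Clamp one value to [0,1], as the _SAT suffix does per component. */
float saturate(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Hypothetical emulation of MOV_SAT with a write mask. Components whose
 * mask bit is clear are left untouched. */
void mov_sat(float dst[4], const float src[4], unsigned mask)
{
    int i;

    for (i = 0; i < 4; ++i)
        if (mask & (1u << i))
            dst[i] = saturate(src[i]);
}
```

Calling `mov_sat` with a mask of 0x7 corresponds to the `.rgb` write mask: the fourth component of the destination keeps whatever value it held before the call.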
Constants
Numeric constants may be used in place of source registers. For instructions requiring scalar input, replace the register name with a single floating-point number. For instructions requiring vector input, replace the register name with a constant vector defined in the same fashion as constants in parameter statements. The following are examples of scalar and vector constants in instructions:
    # Compute cosine of constant value 2.0
    COS result, 2.0;
    # Subtract 1.0 from each element of src
    SUB result, src, { 1.0, 1.0, 1.0, 1.0 };
The preceding description of program structure includes mechanisms for binding input to and output from programs. This section describes the complete set of input and output available to pipeline programs. It is important to remember that vertex and fragment programs have different input and output, because they replace different portions of the OpenGL fixed-function pipeline.
The input available to programs includes attributes specific to a vertex or fragment (such as position, color, or texture coordinates) and parameters, which are constant values associated with a single program or collectively with all programs.
The output generated by programs consists of results passed on to later stages of the graphics pipeline, such as transformed vertices and texture coordinates, lit colors, or fragment depth values.
This section lists all possible attributes for vertex and fragment programs and includes a description of the component usage and examples of commands creating the corresponding OpenGL state.
Vertex attributes are specified by the application using OpenGL immediate-mode commands such as glVertex3f() or vertex array commands such as glNormalPointer(). Attributes of a vertex are made available to a vertex program when it is executing for that vertex and can be accessed in instructions either by binding their names with the ATTRIB naming statement or directly by use of the attribute name.
In addition to the builtin OpenGL attributes such as position, normal, color, and texture coordinates, vertex programs may be passed additional per-vertex values by using generic vertex attributes. Generic attributes are four-component vectors specified using a new set of OpenGL commands. The maximum number of generic attributes supported is implementation-dependent, but must be at least 16. Generic attributes are specified in OpenGL using the commands described in section “Generic Vertex Attribute Specification”.
In a vertex program, many generic attributes are aliased onto builtin OpenGL attributes. When declaring attributes in the program, only one of the builtin attribute or the corresponding generic attribute aliased onto the builtin may be bound. Attempting to bind both a builtin and the corresponding generic attribute results in an error when loading the program. Not all generic attributes have builtin attribute aliases, and not all builtin attributes have generic aliases.
Table 13-1 lists the vertex program attributes. In the table, the notation n refers to additional implementation-specific resources beyond those explicitly numbered. The possible values of n depend on the maximum number of texture coordinate sets, generic vertex attributes, vertex weights, or vertex indices supported by the implementation. For example, if the implementation supports 24 generic vertex attributes, values of n for vertex.attrib[n] range from 0 to 23.
Table 13-1. Builtin and Generic Vertex Program Attributes
Generic Binding | Builtin Binding | Builtin Component Usage | Builtin Description | OpenGL Command |
---|---|---|---|---|
vertex.attrib[0] | vertex.position | (x,y,z,w) | Object-space vertex position | glVertex3f() |
vertex.attrib[1] | vertex.weight | (w,w,w,w) | Vertex weights 0–3 | glWeightfARB() |
vertex.attrib[1] | vertex.weight[n] | (w,w,w,w) | Additional vertex weights n to n+3 | glWeightfARB() |
vertex.attrib[2] | vertex.normal | (x,y,z,1) | Normal vector | glNormal3f() |
vertex.attrib[3] | vertex.color | (r,g,b,a) | Primary color | glColor4ub() |
vertex.attrib[3] | vertex.color.primary | (r,g,b,a) | Primary color | glColor4ub() |
vertex.attrib[4] | vertex.color.secondary | (r,g,b,a) | Secondary color | glSecondaryColor3ubEXT() |
vertex.attrib[5] | vertex.fogcoord | (f,0,0,1) | Fog coordinate | glFogCoordEXT() |
vertex.attrib[6] | | | Generic attribute 6 (not aliased) | |
vertex.attrib[7] | | | Generic attribute 7 (not aliased) | |
vertex.attrib[8] | vertex.texcoord | (s,t,r,q) | Texture coordinate set 0 | glTexCoord3f() |
vertex.attrib[8] | vertex.texcoord[0] | (s,t,r,q) | Texture coordinate set 0 | glTexCoord3f() |
vertex.attrib[9] | vertex.texcoord[1] | (s,t,r,q) | Texture coordinate set 1 | glMultiTexCoord(TEXTURE1,...) |
vertex.attrib[10] | vertex.texcoord[2] | (s,t,r,q) | Texture coordinate set 2 | glMultiTexCoord(TEXTURE2,...) |
vertex.attrib[11] | vertex.texcoord[3] | (s,t,r,q) | Texture coordinate set 3 | glMultiTexCoord(TEXTURE3,...) |
vertex.attrib[12] | vertex.texcoord[4] | (s,t,r,q) | Texture coordinate set 4 | glMultiTexCoord(TEXTURE4,...) |
vertex.attrib[13] | vertex.texcoord[5] | (s,t,r,q) | Texture coordinate set 5 | glMultiTexCoord(TEXTURE5,...) |
vertex.attrib[14] | vertex.texcoord[6] | (s,t,r,q) | Texture coordinate set 6 | glMultiTexCoord(TEXTURE6,...) |
vertex.attrib[15] | vertex.texcoord[7] | (s,t,r,q) | Texture coordinate set 7 | glMultiTexCoord(TEXTURE7,...) |
vertex.attrib[8+n] | vertex.texcoord[n] | (s,t,r,q) | Additional texture coordinate sets | glMultiTexCoord(TEXTURE0+n,...) |
 | vertex.matrixindex | (i,i,i,i) | Vertex matrix indices 0–3 | glMatrixIndexubvARB() |
 | vertex.matrixindex[n] | (i,i,i,i) | Additional vertex matrix indices n to n+3 | glMatrixIndexubvARB() |
vertex.attrib[n] | Depends on n | (x,y,z,w) | Additional generic attributes | |
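The aliasing rule for texture coordinates in Table 13-1 can be sketched in C as follows. The function names and the flag arrays are hypothetical bookkeeping invented for this example; an application would not normally perform this check itself, because the program simply fails to load.

```c
/* Hypothetical helper based on Table 13-1: generic attribute index that
 * aliases texture coordinate set n (vertex.texcoord[n]). */
int generic_index_for_texcoord(int n)
{
    return 8 + n;
}

/* Hypothetical check: a program may bind a builtin attribute or its aliased
 * generic attribute, but not both. bound_generic[i] is nonzero if
 * vertex.attrib[i] is bound; bound_texcoord[n] is nonzero if
 * vertex.texcoord[n] is bound. Returns 1 if any aliased pair is bound
 * together (the program would fail to load), else 0. */
int has_texcoord_alias_conflict(const int *bound_generic, int num_generic,
                                const int *bound_texcoord, int num_texcoord)
{
    int n;

    for (n = 0; n < num_texcoord; ++n) {
        int g = generic_index_for_texcoord(n);

        if (g < num_generic && bound_texcoord[n] && bound_generic[g])
            return 1;
    }
    return 0;
}
```

Binding vertex.texcoord[1] together with vertex.attrib[8] is fine under this rule, whereas binding it together with vertex.attrib[9] is a conflict.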
Fragment attributes are initially generated by either vertex program output or by the fixed-function OpenGL vertex pipeline if vertex programs are disabled.
Depending on the type of primitive being drawn and on the shading model (GL_FLAT or GL_SMOOTH) selected, the resulting values may be interpolated on a per-fragment basis during rasterization and fragment generation. Unlike vertex attributes, there are no generic fragment attributes.
Attributes of a fragment are made available to a fragment program when it is executing for that fragment and can be accessed in instructions either by binding their names with the ATTRIB naming statement or directly by use of the attribute name.
In Table 13-2, the notation n refers to additional implementation-specific texture coordinates beyond those explicitly numbered. The possible values of n range from zero up to the maximum number of texture coordinate sets supported by the implementation minus one.
Table 13-2. Fragment Program Attributes
Attribute Binding | Component Usage | Description |
---|---|---|
fragment.color | (r,g,b,a) | Primary color |
fragment.color.primary | (r,g,b,a) | Primary color |
fragment.color.secondary | (r,g,b,a) | Secondary color |
fragment.texcoord | (s,t,r,q) | Texture coordinate set 0 |
fragment.texcoord[n] | (s,t,r,q) | Texture coordinate set n |
fragment.fogcoord | (f,0,0,1) | Fog distance/coordinate |
fragment.position | (x,y,z,1/w) | Window position |
Parameters are additional values made available during the execution of programs. When rendering a single primitive, such as a triangle, the vertex and fragment attribute values will differ for every vertex making up the triangle and for every fragment generated by triangle rasterization. However, parameter values will be the same for every vertex and for every fragment.
As cited earlier, the following are the four types of parameters:
Program environment parameters
Shared by all programs of a particular type; that is, there is one set of environment parameters for vertex programs and a different set for fragment programs.
Program local parameters
Specific to a single bound program.
OpenGL state values
Items such as matrices, and material and light properties are available.
Constants
Special cases of program local parameters
Program environment and local parameters are four-component vectors specified using a new set of OpenGL commands described in section “Program Parameter Specification”. The maximum number of parameters supported is implementation-dependent but must be at least 96 each for the vertex program environment and program locals and 24 each for the fragment program environment and program locals. Constants may be specified otherwise.
Program parameters can be accessed in instructions either by binding their names with the PARAM naming statement or directly by use of the parameter name. Parameter names are identical for vertex and fragment programs, although the values differ for the two types of programs. Table 13-3 shows the parameter names.
Table 13-3. Program Environment and Local Parameters
Parameter Binding | Component Usage | Description |
---|---|---|
program.env[a] | (x,y,z,w) | Program environment parameter a |
program.env[a..b] | (x,y,z,w) | Program environment parameters a through b |
program.local[a] | (x,y,z,w) | Program local parameter a |
program.local[a..b] | (x,y,z,w) | Program local parameters a through b |
In Table 13-3, the notation [a] refers to a single parameter indexed by the constant value a, and the notation [a..b] refers to an array of parameters indexed by the constant values a through b, which may be bound to a corresponding array using the PARAM statement. When specifying arrays, b must be greater than or equal to a, and both a and b must be within the range of supported parameter indices for that type of parameter.
Most OpenGL state can be accessed in instructions either by binding state names with the PARAM naming statement or directly by use of the state name. OpenGL state falls into several different categories, which are discussed separately in the following subsections:
Material Property Bindings
Light Property Bindings
Texture Coordinate Generation Property Bindings
Texture Environment Property Bindings
Fog Property Bindings
Clip Plane Property Bindings
Point Property Bindings
Depth Property Bindings
Matrix Property Bindings
Most OpenGL state categories are available to both vertex and fragment programs, but a few categories are available only to vertex programs, or only to fragment programs. OpenGL state categories restricted to one type of program are identified in their respective subsection.
Material Property Bindings
Material property bindings provide access to the OpenGL state specified with glMaterialf(). Table 13-4 shows the possible bindings.
Table 13-4. Material Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.material.ambient | (r,g,b,a) | Front ambient material color |
state.material.diffuse | (r,g,b,a) | Front diffuse material color |
state.material.specular | (r,g,b,a) | Front specular material color |
state.material.emission | (r,g,b,a) | Front emissive material color |
state.material.shininess | (s,0,0,1) | Front material shininess |
state.material.front.ambient | (r,g,b,a) | Front ambient material color |
state.material.front.diffuse | (r,g,b,a) | Front diffuse material color |
state.material.front.specular | (r,g,b,a) | Front specular material color |
state.material.front.emission | (r,g,b,a) | Front emissive material color |
state.material.front.shininess | (s,0,0,1) | Front material shininess |
state.material.back.ambient | (r,g,b,a) | Back ambient material color |
state.material.back.diffuse | (r,g,b,a) | Back diffuse material color |
state.material.back.specular | (r,g,b,a) | Back specular material color |
state.material.back.emission | (r,g,b,a) | Back emissive material color |
state.material.back.shininess | (s,0,0,1) | Back material shininess |
For material shininess, the .x component is filled with the material's specular exponent, and the .y, .z, and .w components are filled with 0, 0, and 1, respectively. Bindings containing .back refer to the back material; all other bindings refer to the front material.
Material properties can be changed between glBegin() and glEnd(), either directly by calling glMaterialf() or indirectly through color material. However, such property changes are not guaranteed to update parameter bindings until the following glEnd() command. Parameter variables bound to material properties changed between glBegin() and glEnd() are undefined until the following glEnd() command.
Light Property Bindings
Light property bindings provide access to the OpenGL state specified with glLightf() and glLightModelf() and to some derived properties generated from light and light model state values. Table 13-5 shows the possible light property bindings.
Table 13-5. Light Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.light[n].ambient | (r,g,b,a) | Light n ambient color |
state.light[n].diffuse | (r,g,b,a) | Light n diffuse color |
state.light[n].specular | (r,g,b,a) | Light n specular color |
state.light[n].position | (x,y,z,w) | Light n position |
state.light[n].attenuation | (a,b,c,e) | Light n attenuation constants and spot light exponent |
state.light[n].spot.direction | (x,y,z,c) | Light n spot direction and cutoff angle cosine |
state.light[n].half | (x,y,z,1) | Light n infinite half-angle |
state.lightmodel.ambient | (r,g,b,a) | Light model ambient color |
state.lightmodel.scenecolor | (r,g,b,a) | Light model front scene color |
state.lightmodel.front.scenecolor | (r,g,b,a) | Light model front scene color |
state.lightmodel.back.scenecolor | (r,g,b,a) | Light model back scene color |
state.lightprod[n].ambient | (r,g,b,a) | Light n / front material ambient color product |
state.lightprod[n].diffuse | (r,g,b,a) | Light n / front material diffuse color product |
state.lightprod[n].specular | (r,g,b,a) | Light n / front material specular color product |
state.lightprod[n].front.ambient | (r,g,b,a) | Light n / front material ambient color product |
state.lightprod[n].front.diffuse | (r,g,b,a) | Light n / front material diffuse color product |
state.lightprod[n].front.specular | (r,g,b,a) | Light n / front material specular color product |
state.lightprod[n].back.ambient | (r,g,b,a) | Light n / back material ambient color product |
state.lightprod[n].back.diffuse | (r,g,b,a) | Light n / back material diffuse color product |
state.lightprod[n].back.specular | (r,g,b,a) | Light n / back material specular color product |
The [n] syntax indicates a specific light (GL_LIGHTn).
For the following bindings, the .x, .y, .z, and .w components are filled with the red, green, blue, and alpha components, respectively, of the corresponding light color:
state.light[n].ambient
state.light[n].diffuse
state.light[n].specular
For state.light[n].position, the .x, .y, .z, and .w components are filled with the X, Y, Z, and W components, respectively, of the corresponding light position.
For state.light[n].attenuation, the .x, .y, and .z components are filled with the corresponding light constant, linear, and quadratic attenuation parameters. The .w component is filled with the spot light exponent of the corresponding light.
For state.light[n].spot.direction, the .x, .y, and .z components of the program parameter variable are filled with the x, y, and z components, respectively, of the spot light direction of the corresponding light. The .w component is filled with the cosine of the spot light cutoff angle of the corresponding light.
For state.light[n].half, the .x, .y, and .z components of the program parameter variable are filled with the x, y, and z components, respectively, of the following normalized infinite half-angle vector:
h_inf = || P + (0, 0, 1) ||
The .w component is filled with 1. In the computation of h_inf, the notation || v || denotes the vector v scaled to unit length, and P consists of the X, Y, and Z coordinates of the normalized vector from the eye position to the eye-space light position. h_inf is defined to correspond to the normalized half-angle vector when using an infinite light (W coordinate of the position is zero) and an infinite viewer. For local lights or a local viewer, h_inf is well-defined but does not match the normalized half-angle vector, which will vary depending on the vertex position.
For state.lightmodel.ambient, the .x, .y, .z, and .w components of the program parameter variable are filled with the red, green, blue, and alpha components of the light model ambient color, respectively.
For state.lightmodel.scenecolor or state.lightmodel.front.scenecolor, the .x, .y, and .z components of the program parameter variable are filled with the red, green, and blue components respectively of the front scene color, defined by the following:
c_scene = a_cs * a_cm + e_cm
The operand a_cs is the light model ambient color, a_cm is the front ambient material color, and e_cm is the front emissive material color with computations performed separately for each color component. The .w component of the program parameter variable is filled with the alpha component of the front diffuse material color.
For state.lightmodel.back.scenecolor, a similar back scene color computed using back-facing material properties is used. The front and back scene colors match the values that would be assigned to vertices using conventional lighting if all lights were disabled.
For bindings beginning with state.lightprod[n], the .x, .y, and .z components of the program parameter variable are filled with the red, green, and blue components, respectively, of the corresponding light product. The three light product components are the products of the corresponding color components of the specified material property and the light color of the corresponding light (see Table 13-5). The .w component of the program parameter variable is filled with the alpha component of the specified material property.
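The derived scene color and light product values can be sketched as plain per-component arithmetic. The helper names `scene_color` and `light_product` below are hypothetical and only mirror the descriptions in this section; all colors are assumed to be RGBA arrays of four floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: c_scene = a_cs * a_cm + e_cm for R, G, and B,
 * with alpha taken from the diffuse material color d_cm. */
static void scene_color(const float a_cs[4], const float a_cm[4],
                        const float e_cm[4], const float d_cm[4],
                        float out[4])
{
    assert(a_cs != NULL && a_cm != NULL && e_cm != NULL);
    assert(d_cm != NULL && out != NULL);
    for (int i = 0; i < 3; ++i)
        out[i] = a_cs[i] * a_cm[i] + e_cm[i];
    out[3] = d_cm[3];
}

/* Hypothetical helper: light product for one material property.
 * R, G, and B are material * light; alpha comes from the material. */
static void light_product(const float material[4], const float light[4],
                          float out[4])
{
    assert(material != NULL && light != NULL && out != NULL);
    for (int i = 0; i < 3; ++i)
        out[i] = material[i] * light[i];
    out[3] = material[3];
}
```

Calling each helper once with the front material and once with the back material gives the values described for the .front and .back bindings.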
Light products depend on material properties, which can be changed between glBegin() and glEnd(). Such property changes are not guaranteed to take effect until the following glEnd() command. Program parameter variables bound to light products whose corresponding material property changes between glBegin() and glEnd() are undefined until the following glEnd() command.
Texture Coordinate Generation Property Bindings
Texture coordinate generation property bindings are only available within vertex programs. They provide access to the OpenGL state specified with glTexGenf(). Table 13-6 shows the possible texture coordinate generation property bindings.
Table 13-6. Texture Coordinate Generation Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.texgen[n].eye.s | (a,b,c,d) | glTexGen() eye linear plane coefficients, s coord, unit n |
state.texgen[n].eye.t | (a,b,c,d) | glTexGen() eye linear plane coefficients, t coord, unit n |
state.texgen[n].eye.r | (a,b,c,d) | glTexGen() eye linear plane coefficients, r coord, unit n |
state.texgen[n].eye.q | (a,b,c,d) | glTexGen() eye linear plane coefficients, q coord, unit n |
state.texgen[n].object.s | (a,b,c,d) | glTexGen() object linear plane coefficients, s coord, unit n |
state.texgen[n].object.t | (a,b,c,d) | glTexGen() object linear plane coefficients, t coord, unit n |
state.texgen[n].object.r | (a,b,c,d) | glTexGen() object linear plane coefficients, r coord, unit n |
state.texgen[n].object.q | (a,b,c,d) | glTexGen() object linear plane coefficients, q coord, unit n |
The [n] syntax indicates a specific texture unit. If omitted, values for texture unit zero will be bound.
For the state.texgen[n].object bindings, the .x, .y, .z, and .w components of the parameter variable are filled with the p1, p2, p3, and p4 values, respectively, specified to glTexGen() as the GL_OBJECT_LINEAR coefficients for the selected texture coordinate (.s, .t, .r, or .q).
For the state.texgen[n].eye bindings, the .x, .y, .z, and .w components of the parameter variable are filled with the p1', p2', p3', and p4' values, respectively, specified to glTexGen() as the GL_EYE_LINEAR coefficients for the selected texture coordinate (.s, .t, .r, or .q).
Texture Environment Property Bindings
Texture environment property bindings are only available within fragment programs. They provide access to the texture environment color specified with glTexEnvf(). Table 13-7 shows the possible texture environment property bindings.
Table 13-7. Texture Environment Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.texenv.color | (r,g,b,a) | Texture environment zero color |
state.texenv[n].color | (r,g,b,a) | Texture environment n color |
The [n] syntax indicates a specific texture unit. If omitted, values for texture unit zero will be bound.
For state.texenv[n].color, the .x, .y, .z, and .w components of the parameter variable are filled with the red, green, blue, and alpha components, respectively, of the corresponding texture environment color. Note that only legacy texture units within the range specified by GL_MAX_TEXTURE_UNITS have texture environment state. Texture image units and texture coordinate sets do not have associated texture environment state.
Fog Property Bindings
Fog property bindings provide access to the OpenGL state specified with glFogf(). Table 13-8 shows the possible fog property bindings.
Table 13-8. Fog Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.fog.color | (r,g,b,a) | RGB fog color |
state.fog.params | (d,s,e,r) | Fog density, linear start and end, and 1/(end – start) |
For state.fog.color, the .x, .y, .z, and .w components of the parameter variable are filled with the red, green, blue, and alpha, respectively, of the fog color.
For state.fog.params, the .x, .y, and .z components of the parameter variable are filled with the fog density, linear fog start, and linear fog end parameters, respectively. The .w component is filled with 1 / (end – start), where end and start are the linear fog end and start parameters, respectively.
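The precomputed reciprocal in the .w component lets a program evaluate linear fog with one subtraction and one multiplication. The sketch below models that on the CPU; the helper names are hypothetical, and the blend factor f = (end - c) / (end - start) is the standard OpenGL linear fog equation rather than anything defined in this section.

```c
#include <assert.h>

/* Hypothetical helper: pack fog state in the layout of state.fog.params. */
static void pack_fog_params(float density, float start, float end,
                            float out[4])
{
    assert(end != start);            /* the reciprocal must exist */
    out[0] = density;
    out[1] = start;
    out[2] = end;
    out[3] = 1.0f / (end - start);
}

/* Linear fog factor for fog coordinate c, clamped to [0, 1].
 * Uses params[3] so that no division is needed per fragment. */
static float linear_fog_factor(const float params[4], float c)
{
    float f = (params[2] - c) * params[3];
    if (f < 0.0f)
        f = 0.0f;
    if (f > 1.0f)
        f = 1.0f;
    return f;
}
```

A factor of 1 leaves the fragment color unchanged and a factor of 0 replaces it with the fog color.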
Clip Plane Property Bindings
Clip plane property bindings are only available within vertex programs. They provide access to the OpenGL state specified with glClipPlane(). Table 13-9 shows the possible clip plane property bindings.
Table 13-9. Clip Plane Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.clip[n].plane | (a,b,c,d) | Clip plane n coefficients |
The [n] syntax indicates a specific clip plane (GL_CLIP_PLANEn).
For state.clip[n].plane, the .x, .y, .z, and .w components of the parameter variable are filled with the eye-space transformed coefficients p1', p2', p3', and p4', respectively, of the corresponding clip plane.
Point Property Bindings
Point property bindings are only available within vertex programs. They provide access to the OpenGL state specified with the glPointParameterfvARB() command (if the ARB_point_parameters extension is supported). Table 13-10 shows the possible point property bindings.
Table 13-10. Point Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.point.size | (s,n,x,f) | Point size, minimum and maximum size clamps, and fade threshold |
state.point.attenuation | (a,b,c,1) | Point size attenuation constants |
For state.point.size, the .x, .y, .z, and .w components of the parameter variable are filled with the point size, minimum point size, maximum point size, and fade threshold, respectively.
For state.point.attenuation, the .x, .y, and .z components of the parameter variable are filled with the constant, linear, and quadratic point size distance attenuation parameters (a, b, and c), respectively. The .w component is filled with 1.
Depth Property Bindings
Depth property bindings are only available within fragment programs. They provide access to the OpenGL state specified with glDepthRange(). Table 13-11 shows the possible depth property bindings.
Table 13-11. Depth Property Bindings
Parameter Binding | Component Usage | Description |
---|---|---|
state.depth.range | (n,f,d,1) | Depth range near, far, and far – near (d) |
For state.depth.range, the .x and .y components of the parameter variable are filled with the mappings of near and far clipping planes to window coordinates, respectively. The .z component is filled with the difference of the mappings of near and far clipping planes, far – near. The .w component is filled with 1.
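The layout of this binding can be sketched as follows. The helper names are hypothetical, and the mapping from normalized device depth in [-1, 1] to window depth is the standard OpenGL viewport transformation, included here only to show why the precomputed difference in .z is convenient.

```c
#include <assert.h>

/* Hypothetical helper: pack glDepthRange(n, f) values in the layout
 * of state.depth.range: (near, far, far - near, 1). */
static void pack_depth_range(float n, float f, float out[4])
{
    out[0] = n;
    out[1] = f;
    out[2] = f - n;
    out[3] = 1.0f;
}

/* Map normalized device depth z_ndc in [-1, 1] to window depth. */
static float window_depth(const float range[4], float z_ndc)
{
    assert(z_ndc >= -1.0f && z_ndc <= 1.0f);
    return z_ndc * range[2] * 0.5f + (range[0] + range[1]) * 0.5f;
}
```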
Matrix Property Bindings
Matrix property bindings provide access to the OpenGL state specified with commands that load and multiply matrices, such as glMatrixMode() and glLoadMatrixf(). Table 13-12 shows the possible matrix property bindings.
Table 13-12. Matrix Property Bindings
Parameter Binding | Description |
---|---|
state.matrix.modelview[n] | Modelview matrix n |
state.matrix.projection | Projection matrix |
state.matrix.mvp | Modelview projection matrix |
state.matrix.texture[n] | Texture matrix n |
state.matrix.palette[n] | Modelview palette matrix n |
state.matrix.program[n] | Program matrix n |
The [n] syntax indicates a specific matrix number. For .modelview and .texture, a matrix number is optional, and matrix zero will be bound if the matrix number is omitted. The field .program refers to generic program matrices, which are defined as described in section “Generic Program Matrix Specification”. The field .palette refers to the matrix palette defined with the ARB_matrix_palette extension; since this extension is not currently supported on Onyx4 and Silicon Graphics Prism systems, these state values cannot be bound.
The base matrix bindings may be further modified by an inverse/transpose selector and a row selector. If the beginning of a parameter binding matches any of the matrix binding names listed in Table 13-12, the binding corresponds to a 4x4 matrix (instead of a four-element vector, as is true of other parameter bindings). If the parameter binding is followed by .inverse, .transpose, or .invtrans, the inverse, transpose, or transpose of the inverse, respectively, of the specified matrix is selected. Otherwise, the specified matrix itself is selected. If the specified matrix is poorly conditioned (singular or nearly so), its inverse matrix is undefined.
The binding name state.matrix.mvp refers to the product of modelview matrix zero and the projection matrix, as defined in the following:
MVP = P * M0
The operand P is the projection matrix and M0 is the modelview matrix zero.
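The order of the product matters: the modelview matrix is applied to a vertex first and the projection matrix second. A minimal CPU sketch of the same product follows, assuming the column-major storage that glLoadMatrixf() uses; the helper name `mat4_mul` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: out = a * b for column-major 4x4 matrices,
 * where element (row, col) is stored at index col * 4 + row.
 * For the binding above, call mat4_mul(P, M0, mvp). */
static void mat4_mul(const float a[16], const float b[16], float out[16])
{
    assert(out != a && out != b);    /* no in-place multiplication */
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += a[k * 4 + row] * b[col * 4 + k];
            out[col * 4 + row] = sum;
        }
    }
}
```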
If the selected matrix is followed by .row[a], the .x, .y, .z, and .w components of the parameter variable are filled with the four entries of row a of the selected matrix. In the following example, the variable m0 is set to the first row (row 0) of modelview matrix 1, and m1 is set to the last row (row 3) of the transpose of the projection matrix:
PARAM m0 = state.matrix.modelview[1].row[0];
PARAM m1 = state.matrix.projection.transpose.row[3];
For parameter array bindings, multiple rows of the selected matrix can be bound. If the selected matrix binding is followed by .row[a..b], the result is equivalent to specifying matrix rows a through b in order. A program will fail to load if a is greater than b. If no row selection is specified, rows 0 through 3 are bound in order. In the following code, the array m2 has two entries, containing rows 1 and 2 of program matrix zero, and m3 has four entries, containing all four rows of the transpose of program matrix zero:
PARAM m2[] = { state.matrix.program[0].row[1..2] };
PARAM m3[] = { state.matrix.program[0].transpose };
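Because OpenGL stores matrices in column-major order, a bound row is not a contiguous run of four floats in the array passed to glLoadMatrixf(). The sketch below shows which elements a row binding selects; the helper names are hypothetical, and returning -1 is only a stand-in for the load failure that occurs when a is greater than b.

```c
#include <assert.h>

/* Copy row `row` of a column-major 4x4 matrix into out[4]. */
static void matrix_row(const float m[16], int row, float out[4])
{
    assert(row >= 0 && row <= 3);
    for (int col = 0; col < 4; ++col)
        out[col] = m[col * 4 + row];
}

/* Hypothetical model of a .row[a..b] array binding.
 * Returns the number of rows copied, or -1 when a > b. */
static int bind_rows(const float m[16], int a, int b, float out[][4])
{
    if (a > b)
        return -1;
    assert(a >= 0 && b <= 3);
    for (int r = a; r <= b; ++r)
        matrix_row(m, r, out[r - a]);
    return b - a + 1;
}
```

Under this model, binding with no row selector corresponds to `bind_rows(m, 0, 3, out)`.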
Output used by later stages of the OpenGL pipeline can be accessed in instructions either by binding output names with the OUTPUT statement or directly by use of the output name. Output from vertex and fragment programs is described in this section.
Table 13-13 lists the possible types of vertex program output. Components labelled * are unused.
Table 13-13. Vertex Program Output
Output Binding | Component Usage | Description |
---|---|---|
result.position | (x,y,z,w) | Position in clip coordinates |
result.color | (r,g,b,a) | Front-facing primary color |
result.color.primary | (r,g,b,a) | Front-facing primary color |
result.color.secondary | (r,g,b,a) | Front-facing secondary color |
result.color.front | (r,g,b,a) | Front-facing primary color |
result.color.front.primary | (r,g,b,a) | Front-facing primary color |
result.color.front.secondary | (r,g,b,a) | Front-facing secondary color |
result.color.back | (r,g,b,a) | Back-facing primary color |
result.color.back.primary | (r,g,b,a) | Back-facing primary color |
result.color.back.secondary | (r,g,b,a) | Back-facing secondary color |
result.fogcoord | (f,*,*,*) | Fog coordinate |
result.pointsize | (s,*,*,*) | Point size |
result.texcoord | (s,t,r,q) | Texture coordinate, unit 0 |
result.texcoord[n] | (s,t,r,q) | Texture coordinate, unit n |
For result.position, updates to the .x, .y, .z, and .w components of the result variable modify the X, Y, Z, and W components, respectively, of the transformed vertex's clip coordinates. Final window coordinates are generated for the vertex based on its clip coordinates.
For bindings beginning with result.color, updates to the .x, .y, .z, and .w components of the result variable modify the red, green, blue, and alpha components, respectively, of the corresponding vertex color attribute. Color bindings that do not specify front or back are considered to refer to front-facing colors. Color bindings that do not specify primary or secondary are considered to refer to primary colors.
For result.fogcoord, updates to the .x component of the result variable set the transformed vertex's fog coordinate. Updates to the .y, .z, and .w components of the result variable have no effect.
For result.pointsize, updates to the .x component of the result variable set the transformed vertex's point size. Updates to the .y, .z, and .w components of the result variable have no effect.
For result.texcoord or result.texcoord[n], updates to the .x, .y, .z, and .w components of the result variable set the s, t, r, and q components, respectively, of the transformed vertex's texture coordinates for texture unit n. If [n] is omitted, texture unit zero is selected.
All output is initially undefined at the start of each vertex program invocation. Any results, or even individual components of results, that are not written during vertex program execution remain undefined.
Table 13-14 lists the possible types of fragment program output. Components labelled * are unused.
Table 13-14. Fragment Program Output
Output Binding | Component Usage | Description |
---|---|---|
result.color | (r,g,b,a) | Color |
result.color[n] | (r,g,b,a) | Color for draw buffer n |
result.depth | (*,*,d,*) | Depth coordinate |
For result.color or result.color[n], updates to the .x, .y, .z, and .w components of the result variable modify the red, green, blue, and alpha components, respectively, of the fragment's output color for draw buffer n. If [n] is omitted, the output color for draw buffer zero is modified. However, note that the [n] notation is only supported if program option ATI_draw_buffers is specified and if the ATI_draw_buffers extension is supported.
If result.color is not both bound by the fragment program and written by some instruction of the program, the output color of the fragment program is undefined.
Each color output is clamped to the range [0,1] and converted to fixed-point before being passed on to further fixed-function processing.
For result.depth, updates to the .z component of the result variable modify the fragment's output depth value. If result.depth is not both bound by the fragment program and written by some instruction of the program, the interpolated depth value produced by rasterization is used, as if fragment program mode were not enabled. Writes to any component of depth other than the .z component have no effect.
The depth output is clamped to the range [0,1] and converted to fixed-point, as if it were a window Z value before being passed on to further fixed-function processing.
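The clamp-and-convert step applied to color and depth output can be modeled as below. This is only a sketch: the helper name is hypothetical, and both the bit depth passed in and the round-to-nearest rule are assumptions, since the actual precision depends on the framebuffer configuration.

```c
#include <assert.h>

/* Hypothetical helper: clamp value to [0, 1], then convert it to an
 * unsigned fixed-point code with the given number of bits.
 * Rounding to nearest is an assumption of this sketch. */
static unsigned int to_fixed_point(double value, int bits)
{
    assert(bits >= 1 && bits <= 31);
    if (value < 0.0)
        value = 0.0;
    if (value > 1.0)
        value = 1.0;
    double max_code = (double)((1u << bits) - 1u);
    return (unsigned int)(value * max_code + 0.5);
}
```

For example, an output of 1.7 written to an 8-bit color channel is stored as 255, the same as an output of 1.0.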
The preceding section “Vertex and Fragment Program Parameters” describes program parameters in terms of how they are accessed within a pipeline program. To set the value of a program parameter, call one of the following commands:
void glProgramLocalParameter4fARB(GLenum target, GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
void glProgramLocalParameter4fvARB(GLenum target, GLuint index, const GLfloat *params);
void glProgramLocalParameter4dARB(GLenum target, GLuint index, GLdouble x, GLdouble y, GLdouble z, GLdouble w);
void glProgramLocalParameter4dvARB(GLenum target, GLuint index, const GLdouble *params);
void glProgramEnvParameter4fARB(GLenum target, GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
void glProgramEnvParameter4fvARB(GLenum target, GLuint index, const GLfloat *params);
void glProgramEnvParameter4dARB(GLenum target, GLuint index, GLdouble x, GLdouble y, GLdouble z, GLdouble w);
void glProgramEnvParameter4dvARB(GLenum target, GLuint index, const GLdouble *params);
The glProgramLocal*() commands update the value of the program local parameter numbered index belonging to the program currently bound to target, and the glProgramEnv*() commands update the value of the program environment parameter numbered index for target. The argument target may be either GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.
The scalar forms of the commands set the first, second, third, and fourth components of the specified parameter to the passed x, y, z and w values. The vector forms of the commands set the values of the specified parameter to the four values pointed to by params.
To query the value of a program local parameter, call one of the following commands:
void glGetProgramLocalParameterfvARB(GLenum target, GLuint index, GLfloat *params);
void glGetProgramLocalParameterdvARB(GLenum target, GLuint index, GLdouble *params);
To query the value of a program environment parameter, call one of the following commands:
void glGetProgramEnvParameterfvARB(GLenum target, GLuint index, GLfloat *params);
void glGetProgramEnvParameterdvARB(GLenum target, GLuint index, GLdouble *params);
For both local and environment parameters, the four components of the specified parameter are copied to the target array params.
The number of program local and environment parameters supported for each target type may be queried as described in section “Program Resource Limits and Usage”.
The section “Vertex and Fragment Attributes” describes vertex attributes in terms of how they are accessed within a vertex program. This section lists the commands for specifying vertex attributes and also describes attribute aliasing.
To set the value of a vertex attribute, call one of the following commands:
void glVertexAttrib1sARB(GLuint index, GLshort x);
void glVertexAttrib1fARB(GLuint index, GLfloat x);
void glVertexAttrib1dARB(GLuint index, GLdouble x);
void glVertexAttrib2sARB(GLuint index, GLshort x, GLshort y);
void glVertexAttrib2fARB(GLuint index, GLfloat x, GLfloat y);
void glVertexAttrib2dARB(GLuint index, GLdouble x, GLdouble y);
void glVertexAttrib3sARB(GLuint index, GLshort x, GLshort y, GLshort z);
void glVertexAttrib3fARB(GLuint index, GLfloat x, GLfloat y, GLfloat z);
void glVertexAttrib3dARB(GLuint index, GLdouble x, GLdouble y, GLdouble z);
void glVertexAttrib4sARB(GLuint index, GLshort x, GLshort y, GLshort z, GLshort w);
void glVertexAttrib4fARB(GLuint index, GLfloat x, GLfloat y, GLfloat z, GLfloat w);
void glVertexAttrib4dARB(GLuint index, GLdouble x, GLdouble y, GLdouble z, GLdouble w);
void glVertexAttrib4NubARB(GLuint index, GLubyte x, GLubyte y, GLubyte z, GLubyte w);
void glVertexAttrib1svARB(GLuint index, const GLshort *v);
void glVertexAttrib1fvARB(GLuint index, const GLfloat *v);
void glVertexAttrib1dvARB(GLuint index, const GLdouble *v);
void glVertexAttrib2svARB(GLuint index, const GLshort *v);
void glVertexAttrib2fvARB(GLuint index, const GLfloat *v);
void glVertexAttrib2dvARB(GLuint index, const GLdouble *v);
void glVertexAttrib3svARB(GLuint index, const GLshort *v);
void glVertexAttrib3fvARB(GLuint index, const GLfloat *v);
void glVertexAttrib3dvARB(GLuint index, const GLdouble *v);
void glVertexAttrib4bvARB(GLuint index, const GLbyte *v);
void glVertexAttrib4svARB(GLuint index, const GLshort *v);
void glVertexAttrib4ivARB(GLuint index, const GLint *v);
void glVertexAttrib4ubvARB(GLuint index, const GLubyte *v);
void glVertexAttrib4usvARB(GLuint index, const GLushort *v);
void glVertexAttrib4uivARB(GLuint index, const GLuint *v);
void glVertexAttrib4fvARB(GLuint index, const GLfloat *v);
void glVertexAttrib4dvARB(GLuint index, const GLdouble *v);
void glVertexAttrib4NbvARB(GLuint index, const GLbyte *v);
void glVertexAttrib4NsvARB(GLuint index, const GLshort *v);
void glVertexAttrib4NivARB(GLuint index, const GLint *v);
void glVertexAttrib4NubvARB(GLuint index, const GLubyte *v);
void glVertexAttrib4NusvARB(GLuint index, const GLushort *v);
void glVertexAttrib4NuivARB(GLuint index, const GLuint *v);
These commands update the value of the generic vertex attribute numbered index. The scalar forms set the first, second, third, and fourth components of the specified attribute to the passed x, y, z and w values, and the vector forms set the values of the specified attribute to the values pointed to by v.
If fewer than four values are passed (for the glVertexAttrib1*(), glVertexAttrib2*(), and glVertexAttrib3*() forms of the commands), unspecified values of y and z default to 0.0, and unspecified values of w default to 1.0.
The glVertexAttrib4N*() forms of the commands specify attributes with fixed-point coordinates. The specified fixed-point values are scaled to the range [0,1] (for unsigned forms of the commands) or to the range [–1,1] (for signed forms of the commands) in the same fashion as for the glNormal*() commands.
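The scaling can be sketched as follows, assuming the component conversion rules that the OpenGL 1.x and 2.x specifications give for glNormal*() and similar commands: c / (2^b - 1) for unsigned types and (2c + 1) / (2^b - 1) for signed types, where b is the bit width of the type. The helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: map an unsigned b-bit integer to [0, 1]. */
static double normalize_unsigned(unsigned long long c, int bits)
{
    assert(bits >= 1 && bits <= 32);
    double max_code = (double)((1ULL << bits) - 1ULL);
    return (double)c / max_code;
}

/* Hypothetical helper: map a signed b-bit integer to [-1, 1].
 * The most negative value maps to -1 and the most positive to +1;
 * zero does not map to exactly 0 under this rule. */
static double normalize_signed(long long c, int bits)
{
    assert(bits >= 1 && bits <= 32);
    double max_code = (double)((1ULL << bits) - 1ULL);
    return (2.0 * (double)c + 1.0) / max_code;
}
```

Under these rules, a GLubyte of 255 passed to glVertexAttrib4NubvARB() becomes 1.0, and a GLbyte of –128 passed to glVertexAttrib4NbvARB() becomes –1.0.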
The number of vertex attributes supported for each target type may be queried as described in section “Program Resource Limits and Usage”.
Setting generic vertex attribute 0 specifies a vertex; the four vertex coordinates are taken from the values of attribute 0. A glVertex*() command is completely equivalent to the corresponding glVertexAttrib*() command with an index of zero. Setting any other generic vertex attribute updates the current values of the attribute. There are no current values for vertex attribute 0.
Implementations may, but do not necessarily, use the same storage for the current values of generic and certain conventional vertex attributes. When any generic vertex attribute other than 0 is specified, the current values for the corresponding conventional attribute aliased with that generic attribute, as described in Table 13-1, become undefined. Similarly, when a conventional vertex attribute is specified, the current values for the corresponding generic vertex attribute become undefined. For example, setting the current normal leaves generic vertex attribute 2 undefined, and vice versa.
Generic vertex attributes may also be specified when drawing by using vertex arrays. An array of per-vertex attribute values is defined by making the following call:
void glVertexAttribPointerARB(GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const GLvoid *pointer); |
The arguments are defined as follows:
size | Specifies the number of elements per attribute and must be 1, 2, 3, or 4. | |
type | Specifies the type of data in the array and must be one of GL_BYTE, GL_UNSIGNED_BYTE, GL_SHORT, GL_UNSIGNED_SHORT, GL_INT, GL_UNSIGNED_INT, GL_FLOAT, or GL_DOUBLE. | |
stride, pointer | Specifies the offset in basic machine units from one attribute value to the next in the array starting at pointer. As with other vertex array specification calls, a stride of zero indicates that attribute values are tightly packed in the array. | |
normalized | Specifies if fixed-point values will be normalized. If normalized is GL_TRUE, fixed-point values will be normalized (in the same fashion as the glVertexAttrib4N*() commands just described). Otherwise, fixed-point values are used unchanged. |
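The interaction of size, stride, and pointer can be illustrated with a small C sketch of the address arithmetic. The helper attrib_address() and the Vertex layout are hypothetical; they only model where element i of an attribute array is read from and are not part of the OpenGL API.

```c
#include <stddef.h>

/* Hypothetical interleaved vertex layout, used only for illustration. */
typedef struct {
    float         position[3];
    float         weight;
    unsigned char color[4];
} Vertex;

/* Models where the values for element i of an attribute array start.
 * A stride of zero means the values are tightly packed, so the step
 * from one element to the next is size * (bytes per component). */
const unsigned char *attrib_address(const void *pointer, size_t stride,
                                    int size, size_t type_bytes, size_t i)
{
    size_t step = stride ? stride : (size_t)size * type_bytes;
    return (const unsigned char *)pointer + i * step;
}
```

For an interleaved array of Vertex records, an application would pass sizeof(Vertex) as the stride and the address of the relevant member of the first record as the pointer.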
To enable or disable generic vertex attributes when drawing vertex arrays, call one of the following commands:
void glEnableVertexAttribArrayARB(GLuint index); void glDisableVertexAttribArrayARB(GLuint index); |
The number of program local and environment parameters supported for each target type may be queried as described in section “Program Resource Limits and Usage”.
Programs may use additional matrices, referred to as generic program matrices, from the OpenGL state. These matrices are specified using the same commands—for example, glLoadMatrixf()—as for other matrices such as modelview and projection. To set the current OpenGL matrix mode to operate on generic matrix n, call glMatrixMode() with a mode argument of GL_MATRIX0_ARB + n.
The number of program matrices supported may be queried as described in section “Program Resource Limits and Usage”.
The tables in this section summarize the complete instruction set supported for pipeline programs. In the Input and Output columns, the tables use the following notation:
v | Indicates a floating-point vector input or output. | |||
s | Indicates a floating-point scalar input. | |||
ssss | Indicates a scalar output replicated across a four-component result vector. | |||
a | Indicates a single address register component. |
Table 13-15 summarizes instructions supported in both fragment and vertex programs.
Table 13-15. Program Instructions (Fragment and Vertex Programs)
Instruction | Input | Output | Description |
---|---|---|---|
ABS | v | v | Absolute value |
ADD | v,v | v | Add |
DP3 | v,v | ssss | Three-component dot product |
DP4 | v,v | ssss | Four-component dot product |
DPH | v,v | ssss | Homogeneous dot product |
DST | v,v | v | Distance vector |
EX2 | s | ssss | Exponentiate with base 2 |
FLR | v | v | Floor |
FRC | v | v | Fraction |
LG2 | s | ssss | Logarithm base 2 |
LIT | v | v | Compute light coefficients |
MAD | v,v,v | v | Multiply and add |
MAX | v,v | v | Maximum |
MIN | v,v | v | Minimum |
MOV | v | v | Move |
MUL | v,v | v | Multiply |
POW | s,s | ssss | Exponentiate |
RCP | s | ssss | Reciprocal |
RSQ | s | ssss | Reciprocal square root |
SGE | v,v | v | Set on greater than or equal |
SLT | v,v | v | Set on less than |
SUB | v,v | v | Subtract |
SWZ | v | v | Extended swizzle |
XPD | v,v | v | Cross product |
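Table 13-16 summarizes instructions supported only in fragment programs. In the Input column of this table, u denotes a texture image unit identifier and t denotes a texture target.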
Table 13-16. Program Instructions (Fragment Programs Only)
Instruction | Input | Output | Description |
---|---|---|---|
CMP | v,v,v | v | Compare |
COS | s | ssss | Cosine with reduction to [–pi,pi] |
KIL | v | none | Kill fragment |
LRP | v,v,v | v | Linear interpolation |
SCS | s | ss-- | Sine/cosine without reduction |
SIN | s | ssss | Sine with reduction to [–pi,pi] |
TEX | v,u,t | v | Texture sample |
TXB | v,u,t | v | Texture sample with bias |
TXP | v,u,t | v | Texture sample with projection |
Table 13-17 summarizes instructions supported only in vertex programs.
Table 13-17. Program Instructions (Vertex Programs Only)
Instruction | Input | Output | Description |
---|---|---|---|
ARL | s | a | Address register load |
EXP | s | v | Exponential base 2 (approximate) |
LOG | s | v | Logarithm base 2 (approximate) |
The following subsections describe each instruction in detail:
“Fragment and Vertex Program Instructions”
“Fragment Program Instructions”
“Vertex Program Instructions”
As shown in the preceding tables, most instructions are supported in both vertex and fragment programs.
Each subsection contains pseudo code describing the instruction. Instructions will have up to three operands, referred to as op0, op1, and op2.
Operands are loaded according to the component selection and modification rules. For a vector operand, these rules are referred to as the VectorLoad() operation. For a scalar operand, they are referred to as the ScalarLoad() operation.
The variables tmp, tmp0, tmp1, and tmp2 describe scalars or vectors used to hold intermediate results in the instruction.
Most instructions generate a result vector called result. The result vector is then written to the destination register specified in the instruction, possibly with destination write masking and, if the _SAT form of the instruction is used, with destination clamping, as described previously.
The instructions described here are supported in both fragment and vertex programs.
ABS—Absolute Value
The ABS instruction performs a component-wise absolute value operation on the single operand to yield a result vector.
Pseudo code:
tmp = VectorLoad(op0); result.x = fabs(tmp.x); result.y = fabs(tmp.y); result.z = fabs(tmp.z); result.w = fabs(tmp.w); |
ADD—Add
The ADD instruction performs a component-wise add of the two operands to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = tmp0.x + tmp1.x; result.y = tmp0.y + tmp1.y; result.z = tmp0.z + tmp1.z; result.w = tmp0.w + tmp1.w; |
The following rules apply to addition:
x + y == y + x, for all x and y
x + 0.0 == x, for all x
DP3—Three-Component Dot Product
The DP3 instruction computes a three-component dot product of the two operands (using the first three components) and replicates the dot product to all four components of the result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); dot = (tmp0.x * tmp1.x) + (tmp0.y * tmp1.y) + (tmp0.z * tmp1.z); result.x = dot; result.y = dot; result.z = dot; result.w = dot; |
DP4—Four-Component Dot Product
The DP4 instruction computes a four-component dot product of the two operands and replicates the dot product to all four components of the result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); dot = (tmp0.x * tmp1.x) + (tmp0.y * tmp1.y) + (tmp0.z * tmp1.z) + (tmp0.w * tmp1.w); result.x = dot; result.y = dot; result.z = dot; result.w = dot; |
DPH—Homogeneous Dot Product
The DPH instruction computes a three-component dot product of the two operands (using the x, y, and z components), adds the w component of the second operand, and replicates the sum to all four components of the result vector. This is equivalent to a four-component dot product where the w component of the first operand is forced to 1.0.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); dot = (tmp0.x * tmp1.x) + (tmp0.y * tmp1.y) + (tmp0.z * tmp1.z) + tmp1.w; result.x = dot; result.y = dot; result.z = dot; result.w = dot; |
DST—Distance Vector
The DST instruction computes a distance vector from two specially formatted operands. The first operand should be of the form [NA, d^2, d^2, NA] and the second operand should be of the form [NA, 1/d, NA, 1/d], where NA values are not relevant to the calculation and d is a vector length. If both vectors satisfy these conditions, the result vector will be of the form [1.0, d, d^2, 1/d].
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = 1.0; result.y = tmp0.y * tmp1.y; result.z = tmp0.z; result.w = tmp1.w; |
Given an arbitrary vector, d^2 can be obtained using the DP3 instruction (using the same vector for both operands) and 1/d can be obtained from d^2 using the RSQ instruction.
This distance vector is useful for light attenuation calculations: a DP3 operation using the distance vector and an attenuation constant vector as operands will yield the attenuation factor.
EX2—Exponentiate with Base 2
The EX2 instruction approximates 2 raised to the power of the scalar operand and replicates the approximation to all four components of the result vector.
Pseudo code:
tmp = ScalarLoad(op0); result.x = Approx2ToX(tmp); result.y = Approx2ToX(tmp); result.z = Approx2ToX(tmp); result.w = Approx2ToX(tmp); |
FLR—Floor
The FLR instruction performs a component-wise floor operation on the operand to generate a result vector. The floor of a value is defined as the largest integer less than or equal to the value. The floor of 2.3 is 2.0; the floor of –3.6 is –4.0.
Pseudo code:
tmp = VectorLoad(op0); result.x = floor(tmp.x); result.y = floor(tmp.y); result.z = floor(tmp.z); result.w = floor(tmp.w); |
FRC—Fraction
The FRC instruction extracts the fractional portion of each component of the operand to generate a result vector. The fractional portion of a component is defined as the result after subtracting off the floor of the component (see the FLR instruction) and is always in the range [0.0, 1.0), that is, greater than or equal to 0.0 and strictly less than 1.0.
For negative values, the fractional portion is not the number written to the right of the decimal point. The fractional portion of –1.7 is not 0.7; it is 0.3. The value 0.3 is produced by subtracting the floor of –1.7, which is –2.0, from –1.7.
Pseudo code:
tmp = VectorLoad(op0); result.x = fraction(tmp.x); result.y = fraction(tmp.y); result.z = fraction(tmp.z); result.w = fraction(tmp.w); |
LG2—Logarithm Base 2
The LG2 instruction approximates the base 2 logarithm of the scalar operand and replicates it to all four components of the result vector.
Pseudo code:
tmp = ScalarLoad(op0); result.x = ApproxLog2(tmp); result.y = ApproxLog2(tmp); result.z = ApproxLog2(tmp); result.w = ApproxLog2(tmp); |
If the scalar operand is zero or negative, the result is undefined.
LIT—Light Coefficients
The LIT instruction accelerates lighting by computing lighting coefficients for ambient, diffuse, and specular light contributions. The .x component of the single operand is assumed to hold a diffuse dot product (such as a vertex normal dotted with the unit direction vector from the point being lit to the light position). The .y component of the operand is assumed to hold a specular dot product (such as a vertex normal dotted with the half-angle vector from the point being lit). The .w component of the operand is assumed to hold the specular exponent of the material and is clamped to the range [–128,+128] exclusive.
The .x component of the result vector receives the value that should be multiplied by the ambient light/material product (always 1.0). The .y component of the result vector receives the value that should be multiplied by the diffuse light/material product (for example, state.lightprod[n].diffuse). The .z component of the result vector receives the value that should be multiplied by the specular light/material product (for example, state.lightprod[n].specular). The .w component of the result is the constant 1.0.
Negative diffuse and specular dot products are clamped to 0.0, as is done in the fixed-function per-vertex lighting operations. In addition, if the diffuse dot product is zero or negative, the specular coefficient is forced to zero.
Pseudo code:
tmp = VectorLoad(op0); if (tmp.x < 0) tmp.x = 0; if (tmp.y < 0) tmp.y = 0; if (tmp.w < -(128.0-epsilon)) tmp.w = -(128.0-epsilon); else if (tmp.w > 128-epsilon) tmp.w = 128-epsilon; result.x = 1.0; result.y = tmp.x; result.z = (tmp.x > 0) ? ApproxPower(tmp.y, tmp.w) : 0.0; result.w = 1.0; |
The power approximation function ApproxPower() may be defined in terms of the base 2 exponentiation and logarithm approximation operations. When executed in fragment programs, the definition should be as follows:
ApproxPower(a,b) = Approx2ToX(b * ApproxLog2(a)) |
The functions Approx2ToX() and ApproxLog2() are as defined by the EX2 and LG2 instructions. When executed in vertex programs, the definition should be as follows:
ApproxPower(a,b) = RoughApprox2ToX(b * RoughApproxLog2(a)) |
The functions RoughApprox2ToX() and RoughApproxLog2() are as defined by the EXP and LOG instructions. The approximation may not be any more accurate than the underlying exponential and logarithm approximations.
Since 0^0 is defined to be 1, ApproxPower(0.0, 0.0) will produce 1.0.
MAD—Multiply and Add
The MAD instruction performs a component-wise multiply of the first two operands, and then does a component-wise add of the product to the third operand to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); tmp2 = VectorLoad(op2); result.x = tmp0.x * tmp1.x + tmp2.x; result.y = tmp0.y * tmp1.y + tmp2.y; result.z = tmp0.z * tmp1.z + tmp2.z; result.w = tmp0.w * tmp1.w + tmp2.w; |
The multiplication and addition operations in this instruction are subject to the same rules as described for the MUL and ADD instructions.
MAX—Maximum
The MAX instruction computes component-wise maximums of the values in the two operands to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = (tmp0.x > tmp1.x) ? tmp0.x : tmp1.x; result.y = (tmp0.y > tmp1.y) ? tmp0.y : tmp1.y; result.z = (tmp0.z > tmp1.z) ? tmp0.z : tmp1.z; result.w = (tmp0.w > tmp1.w) ? tmp0.w : tmp1.w; |
MIN—Minimum
The MIN instruction computes component-wise minimums of the values in the two operands to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = (tmp0.x > tmp1.x) ? tmp1.x : tmp0.x; result.y = (tmp0.y > tmp1.y) ? tmp1.y : tmp0.y; result.z = (tmp0.z > tmp1.z) ? tmp1.z : tmp0.z; result.w = (tmp0.w > tmp1.w) ? tmp1.w : tmp0.w; |
MOV—Move
The MOV instruction copies the value of the operand to yield a result vector.
Pseudo code:
result = VectorLoad(op0); |
MUL—Multiply
The MUL instruction performs a component-wise multiply of the two operands to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = tmp0.x * tmp1.x; result.y = tmp0.y * tmp1.y; result.z = tmp0.z * tmp1.z; result.w = tmp0.w * tmp1.w; |
The following rules apply to multiplication:
x * y == y * x, for all x and y
+/-0.0 * x = +/-0.0 at least for all x that correspond to representable numbers (The IEEE non-number and infinity encodings may be exceptions.)
+1.0 * x = x, for all x
Multiplication by zero and one should be invariant, as it may be used to evaluate conditional expressions without branching.
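As an illustration of why these invariants matter, the following C sketch models a branch-free selection built from a comparison mask, in the way a program might combine SGE with MUL, SUB, and MAD. The function names are hypothetical. The selection is exact only because multiplying by 1.0 returns the value unchanged and multiplying by 0.0 returns zero.

```c
/* Models SGE for one component: 1.0 if a >= b, otherwise 0.0. */
float sge(float a, float b)
{
    return (a >= b) ? 1.0f : 0.0f;
}

/* Selects if_true when a >= b and if_false otherwise, without branching
 * on the data: mask * if_true + (1 - mask) * if_false. */
float select_ge(float a, float b, float if_true, float if_false)
{
    float mask = sge(a, b);
    return mask * if_true + (1.0f - mask) * if_false;
}
```

This form assumes finite inputs; as noted above, infinities and non-number encodings may not obey the multiplication rules.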
POW—Exponentiate
The POW instruction approximates the value of the first scalar operand raised to the power of the second scalar operand and replicates it to all four components of the result vector.
Pseudo code:
tmp0 = ScalarLoad(op0); tmp1 = ScalarLoad(op1); result.x = ApproxPower(tmp0, tmp1); result.y = ApproxPower(tmp0, tmp1); result.z = ApproxPower(tmp0, tmp1); result.w = ApproxPower(tmp0, tmp1); |
The power approximation function may be implemented using the base 2 exponentiation and logarithm approximation operations in the EX2 and LG2 instructions, as shown in the following:
ApproxPower(a,b) = Approx2ToX(b * ApproxLog2(a)) |
Note that a logarithm may be involved even for cases where the exponent is an integer. This means that it may not be possible to exponentiate correctly with a negative base. In contrast, it is possible in a normal mathematical formulation to raise negative numbers to integer powers (for example, (–3)^2 == 9, and (–0.5)^-2 == 4).
RCP—Reciprocal
The RCP instruction approximates the reciprocal of the scalar operand and replicates it to all four components of the result vector.
Pseudo code:
tmp = ScalarLoad(op0); result.x = ApproxReciprocal(tmp); result.y = ApproxReciprocal(tmp); result.z = ApproxReciprocal(tmp); result.w = ApproxReciprocal(tmp); |
The following rule applies to reciprocation:
ApproxReciprocal(+1.0) = +1.0 |
RSQ—Reciprocal Square Root
The RSQ instruction approximates the reciprocal of the square root of the absolute value of the scalar operand and replicates it to all four components of the result vector.
Pseudo code:
tmp = fabs(ScalarLoad(op0)); result.x = ApproxRSQRT(tmp); result.y = ApproxRSQRT(tmp); result.z = ApproxRSQRT(tmp); result.w = ApproxRSQRT(tmp); |
SGE—Set on Greater Than or Equal
The SGE instruction performs a component-wise comparison of the two operands. Each component of the result vector is 1.0 if the corresponding component of the first operand is greater than or equal to that of the second, and 0.0 otherwise.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = (tmp0.x >= tmp1.x) ? 1.0 : 0.0; result.y = (tmp0.y >= tmp1.y) ? 1.0 : 0.0; result.z = (tmp0.z >= tmp1.z) ? 1.0 : 0.0; result.w = (tmp0.w >= tmp1.w) ? 1.0 : 0.0; |
SLT—Set on Less Than
The SLT instruction performs a component-wise comparison of the two operands. Each component of the result vector is 1.0 if the corresponding component of the first operand is less than that of the second, and 0.0 otherwise.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = (tmp0.x < tmp1.x) ? 1.0 : 0.0; result.y = (tmp0.y < tmp1.y) ? 1.0 : 0.0; result.z = (tmp0.z < tmp1.z) ? 1.0 : 0.0; result.w = (tmp0.w < tmp1.w) ? 1.0 : 0.0; |
SUB—Subtract
The SUB instruction performs a component-wise subtraction of the second operand from the first to yield a result vector.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = tmp0.x - tmp1.x; result.y = tmp0.y - tmp1.y; result.z = tmp0.z - tmp1.z; result.w = tmp0.w - tmp1.w; |
SWZ—Extended Swizzle
The SWZ instruction loads the single vector operand and performs a swizzle operation more powerful than that provided for loading normal vector operands to yield a result vector.
The extended swizzle is expressed as the following:
SWZ result, op0, xswz, yswz, zswz, wswz |
The arguments xswz, yswz, zswz, and wswz are each one of the following extended swizzle selectors:
0, +0, -0, 1, +1, -1, x, +x, -x, y, +y, -y, z, +z, -z, w, +w, or -w |
For the numeric extended swizzle selectors, the result components corresponding to xswz, yswz, zswz, and wswz are loaded with the specified number. For the non-numeric extended swizzle selectors, the result components are loaded with the source component of op0 specified by the extended swizzle selector and are negated if the selector begins with a minus sign (-).
In fragment programs, but not in vertex programs, the following extended swizzle selectors may also be used:
r, +r, -r, g, +g, -g, b, +b, -b, a, +a, or -a |
Since the SWZ instruction allows for component selection and negation for each individual component, the grammar does not allow the use of the normal swizzle and negation operations allowed for vector operands in other instructions.
The following example of SWZ shows most of the possible types of selectors:
PARAM color = { -0.1, 0.7, 1.2, 1.0 }; TEMP result; SWZ result, color, -1, 0, x, -y; # result now contains { -1.0, 0.0, -0.1, -0.7 } |
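The selector rules can be checked with a small C model of SWZ. This is an illustrative sketch: the Vec4 type, the function names, and the use of strings for selectors are all assumptions made for the example. The r, g, b, and a selectors are accepted here unconditionally, although they are valid only in fragment programs.

```c
typedef struct { float x, y, z, w; } Vec4;

/* Evaluates one extended swizzle selector such as "0", "-1", "x", or "-y". */
float swz_select(Vec4 src, const char *sel)
{
    float sign  = 1.0f;
    float value = 0.0f;

    if (*sel == '+' || *sel == '-') {      /* optional sign prefix */
        if (*sel == '-') sign = -1.0f;
        ++sel;
    }
    switch (*sel) {
    case '0': value = 0.0f;  break;
    case '1': value = 1.0f;  break;
    case 'x': case 'r': value = src.x; break;
    case 'y': case 'g': value = src.y; break;
    case 'z': case 'b': value = src.z; break;
    case 'w': case 'a': value = src.w; break;
    default:  break;                       /* invalid selector: yields 0 here */
    }
    return sign * value;
}

/* Models: SWZ result, src, xs, ys, zs, ws */
Vec4 swz(Vec4 src, const char *xs, const char *ys,
         const char *zs, const char *ws)
{
    Vec4 r;
    r.x = swz_select(src, xs);
    r.y = swz_select(src, ys);
    r.z = swz_select(src, zs);
    r.w = swz_select(src, ws);
    return r;
}
```

Calling swz(color, "-1", "0", "x", "-y") with color set to { -0.1, 0.7, 1.2, 1.0 } models the SWZ example above; the z component takes the value of color.x, sign included.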
XPD—Cross Product
The XPD instruction computes the cross product using the first three components of its two vector operands to generate the X, Y, and Z components of the result vector. The W component of the result vector is undefined.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); result.x = tmp0.y * tmp1.z - tmp0.z * tmp1.y; result.y = tmp0.z * tmp1.x - tmp0.x * tmp1.z; result.z = tmp0.x * tmp1.y - tmp0.y * tmp1.x; |
The instructions supported only in fragment programs are of two types:
Math instructions
Texture instructions
Math Instructions
The math instructions include the following mathematical operations common in per-pixel shading algorithms:
CMP
COS
LRP
SCS
SIN
CMP—Compare
The CMP instruction performs a component-wise comparison of the first operand against zero. For each component, it copies the corresponding component of the second operand if the first operand's component is less than zero, and of the third operand otherwise.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); tmp2 = VectorLoad(op2); result.x = (tmp0.x < 0.0) ? tmp1.x : tmp2.x; result.y = (tmp0.y < 0.0) ? tmp1.y : tmp2.y; result.z = (tmp0.z < 0.0) ? tmp1.z : tmp2.z; result.w = (tmp0.w < 0.0) ? tmp1.w : tmp2.w; |
COS—Cosine
The COS instruction approximates the trigonometric cosine of the angle specified by the scalar operand and replicates it to all four components of the result vector. The angle is specified in radians and does not have to be in the range [–pi,pi].
Pseudo code:
tmp = ScalarLoad(op0); result.x = ApproxCosine(tmp); result.y = ApproxCosine(tmp); result.z = ApproxCosine(tmp); result.w = ApproxCosine(tmp); |
LRP—Linear Interpolation
The LRP instruction performs a component-wise linear interpolation between the second and third operands using the first operand as the blend factor.
Pseudo code:
tmp0 = VectorLoad(op0); tmp1 = VectorLoad(op1); tmp2 = VectorLoad(op2); result.x = tmp0.x * tmp1.x + (1 - tmp0.x) * tmp2.x; result.y = tmp0.y * tmp1.y + (1 - tmp0.y) * tmp2.y; result.z = tmp0.z * tmp1.z + (1 - tmp0.z) * tmp2.z; result.w = tmp0.w * tmp1.w + (1 - tmp0.w) * tmp2.w; |
SCS—Sine/Cosine
The SCS instruction approximates the trigonometric sine and cosine of the angle specified by the scalar operand and places the cosine in the x component and the sine in the y component of the result vector. The z and w components of the result vector are undefined. The angle is specified in radians and must be in the range [–pi,pi].
Pseudo code:
tmp = ScalarLoad(op0); result.x = ApproxCosine(tmp); result.y = ApproxSine(tmp); |
If the scalar operand is not in the range [–pi,pi], the result vector is undefined.
SIN—Sine
The SIN instruction approximates the trigonometric sine of the angle specified by the scalar operand and replicates it to all four components of the result vector. The angle is specified in radians and does not have to be in the range [–pi,pi].
Pseudo code:
tmp = ScalarLoad(op0); result.x = ApproxSine(tmp); result.y = ApproxSine(tmp); result.z = ApproxSine(tmp); result.w = ApproxSine(tmp); |
Texture Instructions
The following texture instructions include texture map lookup operations and a kill instruction:
TEX
TXP
TXB
KIL
The TEX, TXP, and TXB instructions specify the mapping of 4-tuple vectors to colors of an image. The sampling of the texture works in the same fashion as the fixed-function OpenGL pipeline, except that texture environments and texture functions are not applied to the result and the texture enable hierarchy is replaced by explicit references to the desired texture target—1D, 2D, 3D, CUBE (for cubemap targets) and RECT (for texture rectangle targets, if the EXT_texture_rectangle extension is supported). These texture instructions specify how the 4-tuple is mapped into the coordinates used for sampling. The following function is used to describe the texture sampling in the descriptions below:
vec4 TextureSample(float s, float t, float r, float lodBias, int texImageUnit, enum texTarget); |
Note that not all three texture coordinates s, t, and r are used by all texture targets. In particular, 1D texture targets only use the s component, and 2D and RECT (non-power-of-two) texture targets only use the s and t components. The following descriptions of the texture instructions supply all three components, as would be the case with 3D or CUBE targets.
If a fragment program samples from a texture target on a texture image unit where the bound texture object is not complete, the result will be the vector (R, G, B, A) = (0, 0, 0, 1).
A fragment program will fail to load if it attempts to sample from multiple texture targets on the same texture image unit. For example, the following program would fail to load:
!!ARBfp1.0 TEX result.color, fragment.texcoord[0], texture[0], 2D; TEX result.depth, fragment.texcoord[1], texture[0], 3D; END |
The KIL instruction does not sample from a texture but rather prevents further processing of the current fragment if any component of its 4-tuple vector is less than zero.
Texture Indirections
A dependent texture instruction is one that samples using a texture coordinate residing in a temporary rather than in an attribute or a parameter. A program may have a chain of dependent texture instructions, where the result of the first texture instruction is used as the coordinate for a second texture instruction, which is, in turn, used as the coordinate for a third texture instruction, etc. Each node in this chain is termed an indirection and can be thought of as a set of texture samples that execute in parallel and are followed by a sequence of ALU instructions.
Some implementations may have limitations on how long the dependency chain may be. Therefore, indirections are counted as a resource just like instructions or temporaries are counted. All programs have at least one indirection (one node in this chain) even if the program performs no texture operation. Each instruction encountered is included in this node until the program encounters a texture instruction with one of the following properties:
Its texture coordinate is a temporary that has been previously written in the current node.
Its result vector is a temporary that is also the operand or result vector of a previous instruction in the current node.
A new node is then started that includes the texture instruction and all subsequent instructions, and the process repeats for all instructions in the program. Note that for simplicity in counting, result writemasks and operand suffixes are not taken into consideration when counting indirections.
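The counting rule can be expressed as a short C routine. This is a sketch under simplifying assumptions: the Instr record is a hypothetical, reduced description of an instruction that tracks only temporaries, at most 32 temporaries are supported, and the caller decides which instructions to flag as texture instructions. As the text specifies, write masks and operand suffixes are ignored.

```c
#include <stddef.h>

#define NO_TEMP (-1)

/* Hypothetical, reduced description of one program instruction. */
typedef struct {
    int is_texture;       /* nonzero for a texture instruction          */
    int result_temp;      /* destination temporary index, or NO_TEMP    */
    int operand_temp[3];  /* temporary operand indices, or NO_TEMP;
                             for a texture instruction, operand_temp[0]
                             is the texture coordinate                  */
} Instr;

/* Returns the number of indirections (nodes) in the program. */
int count_indirections(const Instr *prog, size_t n)
{
    unsigned written = 0;  /* temporaries written in the current node    */
    unsigned touched = 0;  /* temporaries read or written in this node   */
    int nodes = 1;         /* every program has at least one indirection */

    for (size_t i = 0; i < n; ++i) {
        const Instr *in = &prog[i];
        if (in->is_texture) {
            int coord = in->operand_temp[0];
            int dependent = coord != NO_TEMP &&
                            (written & (1u << coord)) != 0;
            int reused = in->result_temp != NO_TEMP &&
                         (touched & (1u << in->result_temp)) != 0;
            if (dependent || reused) {     /* start a new node here */
                ++nodes;
                written = 0;
                touched = 0;
            }
        }
        /* Record this instruction in the (possibly new) current node. */
        for (int k = 0; k < 3; ++k)
            if (in->operand_temp[k] != NO_TEMP)
                touched |= 1u << in->operand_temp[k];
        if (in->result_temp != NO_TEMP) {
            written |= 1u << in->result_temp;
            touched |= 1u << in->result_temp;
        }
    }
    return nodes;
}
```

For example, a texture instruction that writes temporary 0 followed by a second texture instruction that uses temporary 0 as its coordinate yields two indirections, while two texture instructions that both sample from interpolated attributes into different temporaries stay within one.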
TEX—Map Coordinate to Color
The TEX instruction takes the first three components of its source vector and maps them to s, t, and r. These coordinates are used to sample from the specified texture target on the specified texture image unit in a manner consistent with its parameters. The resulting sample is mapped to RGBA and written to the result vector.
Pseudo code:
tmp = VectorLoad(op0); result = TextureSample(tmp.x, tmp.y, tmp.z, 0.0, op1, op2); |
TXP—Project Coordinate and Map to Color
The TXP instruction divides the first three components of its source vector by the fourth component and maps the results to s, t, and r. These coordinates are used to sample from the specified texture target on the specified texture image unit in a manner consistent with its parameters. The resulting sample is mapped to RGBA and written to the result vector. If the value of the fourth component of the source vector is less than or equal to zero, the result vector is undefined.
Pseudo code:
tmp = VectorLoad(op0); tmp.x = tmp.x / tmp.w; tmp.y = tmp.y / tmp.w; tmp.z = tmp.z / tmp.w; result = TextureSample(tmp.x, tmp.y, tmp.z, 0.0, op1, op2); |
TXB—Map Coordinate to Color While Biasing Its Level Of Detail
The TXB instruction takes the first three components of its source vector and maps them to s, t, and r. These coordinates are used to sample from the specified texture target on the specified texture image unit in a manner consistent with its parameters. Additionally, before determining the mipmap level(s) to sample, the fourth component of the source vector is added to the per-texture-object and per-texture-unit level-of-detail bias factors. The resulting sample is mapped to RGBA and written to the result vector.
Pseudo code:
tmp = VectorLoad(op0); result = TextureSample(tmp.x, tmp.y, tmp.z, tmp.w, op1, op2); |
KIL—Kill Fragment
Rather than mapping a coordinate set to a color, the KIL operation prevents a fragment from receiving any future processing. If any component of its source vector is negative, the processing of this fragment will be discontinued and no further output to this fragment will occur. Subsequent stages of the GL pipeline will be skipped for this fragment.
Pseudo code:
tmp = VectorLoad(op0); if ((tmp.x < 0) || (tmp.y < 0) || (tmp.z < 0) || (tmp.w < 0)) { exit; } |
The instructions described in this section are only supported by vertex programs. They include address register loads (ARL) as well as lower-precision (and higher-performance) exponential and logarithmic computations (EXP and LOG).
ARL—Address Register Load
The ARL instruction loads a single scalar operand and performs a floor operation to generate a signed integer scalar result.
Pseudo code:
result = floor(ScalarLoad(op0)); |
EXP—Exponentiate with Base 2 (approximate)
The EXP instruction computes a rough approximation of 2 raised to the power of the scalar operand. The approximation is returned in the .z component of the result vector. A vertex program can also use the .x and .y components of the result vector to generate a more accurate approximation by evaluating result.x * f(result.y), where f(x) is a user-defined function that approximates 2^x over the domain [0.0, 1.0). The .w component of the result vector is always 1.0.
Pseudo code:
tmp = ScalarLoad(op0); result.x = 2^floor(tmp); result.y = tmp - floor(tmp); result.z = RoughApprox2ToX(tmp); result.w = 1.0; |
The approximation function is accurate to at least 10 bits.
LOG—Logarithm Base 2 (approximate)
The LOG instruction computes a rough approximation of the base 2 logarithm of the absolute value of the scalar operand. The approximation is returned in the .z component of the result vector. A vertex program can also use the .x and .y components of the result vector to generate a more accurate approximation by evaluating result.x + f(result.y), where f(x) is a user-defined function that approximates log2(x) over the domain [1.0, 2.0). The .w component of the result vector is always 1.0.
Pseudo code:
tmp = fabs(ScalarLoad(op0)); result.x = floor(log2(tmp)); result.y = tmp / 2^(floor(log2(tmp))); result.z = RoughApproxLog2(tmp); result.w = 1.0; |
The floor(log2(tmp)) refers to the floor of the exact logarithm, which can be easily computed for standard floating point representations. The approximation function is accurate to at least 10 bits.
You can query resources allocated and consumed by programs by making the following call:
void glGetProgramivARB(GLenum target, GLenum pname, GLint *params); |
The argument target may be either GL_VERTEX_PROGRAM_ARB or GL_FRAGMENT_PROGRAM_ARB.
To determine the maximum possible resource limits for a program of the specified target type, use one of the values in Table 13-18 for pname. There are two types of limits:
Native limits | If native limits are not exceeded by a program, it is guaranteed that the program can execute in the graphics hardware. The parameter names for native limits are of the form GL_MAX_PROGRAM_NATIVE*. | |
Overall limits | If the overall limits are not exceeded, the program will execute, but possibly on a software fallback path with greatly reduced performance. The parameter names for overall limits are of the form GL_MAX_PROGRAM*. |
The concepts of texture instructions and texture indirections are described in section “Fragment Program Instructions”. In a fragment program, ALU instructions are all instructions other than the texture instructions TEX, TXP, TXB, and KIL.
Table 13-18. Program Resource Limits
Resource Limit Name (overall, native) | Min Value for Vertex Programs | Min Value for Fragment Programs | Description |
---|---|---|---|
GL_MAX_PROGRAM_INSTRUCTIONS_ARB, GL_MAX_PROGRAM_NATIVE_INSTRUCTIONS_ARB | 128 | 72 | Maximum number of instructions declared |
GL_MAX_PROGRAM_ATTRIBS_ARB, GL_MAX_PROGRAM_NATIVE_ATTRIBS_ARB | 16 | 10 | Maximum number of attributes declared |
GL_MAX_PROGRAM_PARAMETERS_ARB, GL_MAX_PROGRAM_NATIVE_PARAMETERS_ARB | 96 | 24 | Maximum number of parameters declared |
GL_MAX_PROGRAM_TEMPORARIES_ARB, GL_MAX_PROGRAM_NATIVE_TEMPORARIES_ARB | 12 | 16 | Maximum number of temporaries declared |
GL_MAX_PROGRAM_ADDRESS_REGISTERS_ARB, GL_MAX_PROGRAM_NATIVE_ADDRESS_REGISTERS_ARB | 1 | n/a | Maximum number of address registers declared (vertex programs only) |
GL_MAX_PROGRAM_ALU_INSTRUCTIONS_ARB, GL_MAX_PROGRAM_NATIVE_ALU_INSTRUCTIONS_ARB | n/a | 48 | Maximum number of ALU instructions declared (fragment programs only) |
GL_MAX_PROGRAM_TEX_INSTRUCTIONS_ARB, GL_MAX_PROGRAM_NATIVE_TEX_INSTRUCTIONS_ARB | n/a | 24 | Maximum number of texture instructions declared (fragment programs only) |
GL_MAX_PROGRAM_TEX_INDIRECTIONS_ARB, GL_MAX_PROGRAM_NATIVE_TEX_INDIRECTIONS_ARB | n/a | 4 | Maximum number of texture indirections declared (fragment programs only) |
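Because fragment programs have separate limits for ALU and texture instructions, it can help to know which bucket an opcode falls into before a program is submitted. The following C sketch applies the rule stated above (TEX, TXP, TXB, and KIL are texture instructions; everything else is an ALU instruction). The function name is_texture_instruction is illustrative only and is not part of OpenGL, and the sketch does not check that the opcode is otherwise valid.

```c
#include <string.h>

/* Returns 1 if `opcode` names one of the four texture instructions
 * (TEX, TXP, TXB, KIL) and 0 for anything else, which this sketch
 * treats as an ALU instruction. An optional "_SAT" suffix, as in
 * "TEX_SAT", is ignored. Hypothetical helper, not an OpenGL call. */
int is_texture_instruction(const char *opcode)
{
    static const char *const tex_ops[] = { "TEX", "TXP", "TXB", "KIL" };
    size_t i;

    if (strlen(opcode) < 3)
        return 0;
    /* Only the bare mnemonic or mnemonic + "_SAT" may match. */
    if (opcode[3] != '\0' && strcmp(opcode + 3, "_SAT") != 0)
        return 0;
    for (i = 0; i < sizeof tex_ops / sizeof tex_ops[0]; i++) {
        if (strncmp(opcode, tex_ops[i], 3) == 0)
            return 1;
    }
    return 0;
}
```

Counting the opcodes for which this returns 1 gives an estimate to compare against GL_MAX_PROGRAM_TEX_INSTRUCTIONS_ARB; the remaining opcodes count against GL_MAX_PROGRAM_ALU_INSTRUCTIONS_ARB.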
To determine the resources actually consumed by the currently bound program of the specified target type, use one of the values in Table 13-19 for pname. There are two types of usage:
Overall usage | Overall usage is that of the program as written. The parameter names for overall usage are of the form GL_PROGRAM*. |
Native usage | Native usage is for the program as compiled for the target hardware. In some cases, emulation of operations not directly supported by the hardware will consume additional resources. The parameter names for native usage are of the form GL_PROGRAM_NATIVE*. |
Table 13-19. Program Resource Usage
Resource Usage Name (overall, native) | Description |
---|---|
GL_PROGRAM_INSTRUCTIONS_ARB, GL_PROGRAM_NATIVE_INSTRUCTIONS_ARB | Number of instructions used |
GL_PROGRAM_ATTRIBS_ARB, GL_PROGRAM_NATIVE_ATTRIBS_ARB | Number of attributes used |
GL_PROGRAM_PARAMETERS_ARB, GL_PROGRAM_NATIVE_PARAMETERS_ARB | Number of parameters used |
GL_PROGRAM_TEMPORARIES_ARB, GL_PROGRAM_NATIVE_TEMPORARIES_ARB | Number of temporaries used |
GL_PROGRAM_ADDRESS_REGISTERS_ARB, GL_PROGRAM_NATIVE_ADDRESS_REGISTERS_ARB | Number of address registers used (vertex programs only) |
GL_PROGRAM_ALU_INSTRUCTIONS_ARB, GL_PROGRAM_NATIVE_ALU_INSTRUCTIONS_ARB | Number of ALU instructions used (fragment programs only) |
GL_PROGRAM_TEX_INSTRUCTIONS_ARB, GL_PROGRAM_NATIVE_TEX_INSTRUCTIONS_ARB | Number of texture instructions used (fragment programs only) |
GL_PROGRAM_TEX_INDIRECTIONS_ARB, GL_PROGRAM_NATIVE_TEX_INDIRECTIONS_ARB | Number of texture indirections used (fragment programs only) |
To help determine whether a program is running on the actual graphics hardware, call glGetProgramivARB() with pname set to GL_PROGRAM_UNDER_NATIVE_LIMITS_ARB. This returns 0 in params if the native resource consumption of the program currently bound to target exceeds the number of available resources for any resource type, and 1 otherwise.
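The relationship between usage values and limit values can be written out as a small decision routine. The following C sketch assumes the caller has already filled an array with values obtained from glGetProgramivARB() for each resource of interest; the struct resource_use, the enum program_fit, and the function classify_program are hypothetical names introduced for illustration and are not part of OpenGL.

```c
#include <stddef.h>

/* One resource (instructions, temporaries, and so on), filled in by the
 * caller from glGetProgramivARB() queries. Hypothetical type. */
struct resource_use {
    int used;        /* a GL_PROGRAM_* value            */
    int max;         /* the matching GL_MAX_PROGRAM_*   */
    int native_used; /* a GL_PROGRAM_NATIVE_* value     */
    int native_max;  /* GL_MAX_PROGRAM_NATIVE_* value   */
};

enum program_fit {
    FIT_NATIVE,   /* within native limits: runs in hardware           */
    FIT_FALLBACK, /* within overall limits only: may run in software  */
    FIT_EXCEEDED  /* beyond overall limits: the program cannot load   */
};

/* Classify a program from its per-resource usage and limits. */
enum program_fit classify_program(const struct resource_use *res, size_t count)
{
    enum program_fit fit = FIT_NATIVE;
    size_t i;

    for (i = 0; i < count; i++) {
        if (res[i].used > res[i].max)
            return FIT_EXCEEDED;
        if (res[i].native_used > res[i].native_max)
            fit = FIT_FALLBACK;
    }
    return fit;
}
```

For a program that has loaded successfully, a result of FIT_NATIVE corresponds to GL_PROGRAM_UNDER_NATIVE_LIMITS_ARB returning 1, so in practice the single query is simpler; the explicit comparison is mainly useful for reporting which resource is over budget.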
To determine the maximum number of program local parameters and program environment parameters that may be specified for target, use a pname of GL_MAX_PROGRAM_LOCAL_PARAMETERS_ARB or GL_MAX_PROGRAM_ENV_PARAMETERS_ARB, respectively.
To determine the maximum number of generic vertex attributes that may be specified for vertex programs, call glGetIntegerv() with a pname of GL_MAX_VERTEX_ATTRIBS_ARB.
To determine the maximum number of generic matrices that may be specified for programs, call glGetIntegerv() with a pname of GL_MAX_PROGRAM_MATRICES_ARB. At least 8 program matrices are guaranteed to be supported. To determine the maximum stack depth for generic program matrices, call glGetIntegerv() with a pname of GL_MAX_PROGRAM_MATRIX_STACK_DEPTH_ARB. The maximum generic matrix stack depth is guaranteed to be at least 1.
To determine properties of generic matrices, rather than extending glGet*() to accept the GL_MATRIX0+n terminology, additional parameter names are defined which return properties of the current matrix (as set with the glMatrixMode() function). The depth of the current matrix stack can be queried by calling glGetIntegerv() with a pname of GL_CURRENT_MATRIX_STACK_DEPTH_ARB, while the current matrix values can be queried by calling glGetFloatv() with a pname of GL_CURRENT_MATRIX_ARB or GL_TRANSPOSE_CURRENT_MATRIX_ARB. The functions return the 16 entries of the current matrix in column-major or row-major order, respectively.
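The two matrix queries differ only in storage order. As a minimal illustration, the following C routine converts a 16-entry array from one order to the other; the name transpose4x4 is an assumption made for this sketch and is not an OpenGL function.

```c
/* Convert between the column-major layout returned for
 * GL_CURRENT_MATRIX_ARB and the row-major layout returned for
 * GL_TRANSPOSE_CURRENT_MATRIX_ARB. The same routine works in both
 * directions. `in` and `out` must not be the same array. */
void transpose4x4(const float in[16], float out[16])
{
    int row, col;

    for (row = 0; row < 4; row++) {
        for (col = 0; col < 4; col++) {
            /* Element (row, col) lives at col*4+row in column-major
             * storage and at row*4+col in row-major storage. */
            out[row * 4 + col] = in[col * 4 + row];
        }
    }
}
```

For example, a translation by (2, 3, 4) stores its offsets at indices 12, 13, and 14 in column-major order and at indices 3, 7, and 11 in row-major order.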
In addition to program resource limits and usage, you can query the following information about the currently bound program:
Program string length, program string format, and program name
Source text
Parameters of the generic vertex attribute array pointers
Calling glGetProgramivARB() with a pname of GL_PROGRAM_LENGTH_ARB, GL_PROGRAM_FORMAT_ARB, or GL_PROGRAM_BINDING_ARB returns one integer reflecting the program string length (in GLubytes), program string format, or program name, respectively, for the program object currently bound to target.
Making the following call returns the source text for the program bound to target in the array string:
void glGetProgramStringARB(GLenum target, GLenum pname, GLvoid *string);
The argument pname must be GL_PROGRAM_STRING_ARB. The size of string must be at least the value of GL_PROGRAM_LENGTH_ARB queried with glGetProgramivARB(). The program string is always returned using the format given when the program string was specified.
You can query the parameters of the generic vertex attribute array pointers by calling one of the following commands:
void glGetVertexAttribdvARB(GLuint index, GLenum pname, GLdouble *params);
void glGetVertexAttribfvARB(GLuint index, GLenum pname, GLfloat *params);
void glGetVertexAttribivARB(GLuint index, GLenum pname, GLint *params);
The pname value must be one of the following:
GL_VERTEX_ATTRIB_ARRAY_ENABLED_ARB
GL_VERTEX_ATTRIB_ARRAY_NORMALIZED_ARB
GL_VERTEX_ATTRIB_ARRAY_SIZE_ARB
GL_VERTEX_ATTRIB_ARRAY_STRIDE_ARB
GL_VERTEX_ATTRIB_ARRAY_TYPE_ARB
Bound generic vertex array pointers can be queried by making the following call:
void glGetVertexAttribPointervARB(GLuint index, GLenum pname, GLvoid **pointer);
These examples are intended primarily to show complete vertex and fragment programs using a range of instructions and input. The OpenGL programming required to set up and execute these programs on sample geometry is not included.
The following vertex program implements a simple ambient, specular, and diffuse infinite lighting computation with a single light and an eye-space normal:
    !!ARBvp1.0
    ATTRIB iPos        = vertex.position;
    ATTRIB iNormal     = vertex.normal;
    PARAM  mvinv[4]    = { state.matrix.modelview.invtrans };
    PARAM  mvp[4]      = { state.matrix.mvp };
    PARAM  lightDir    = state.light[0].position;
    PARAM  halfDir     = state.light[0].half;
    PARAM  specExp     = state.material.shininess;
    PARAM  ambientCol  = state.lightprod[0].ambient;
    PARAM  diffuseCol  = state.lightprod[0].diffuse;
    PARAM  specularCol = state.lightprod[0].specular;
    TEMP   xfNormal, temp, dots;
    OUTPUT oPos        = result.position;
    OUTPUT oColor      = result.color;

    # Transform the vertex to clip coordinates.
    DP4   oPos.x, mvp[0], iPos;
    DP4   oPos.y, mvp[1], iPos;
    DP4   oPos.z, mvp[2], iPos;
    DP4   oPos.w, mvp[3], iPos;

    # Transform the normal to eye coordinates.
    DP3   xfNormal.x, mvinv[0], iNormal;
    DP3   xfNormal.y, mvinv[1], iNormal;
    DP3   xfNormal.z, mvinv[2], iNormal;

    # Compute diffuse and specular dot products and use LIT to compute
    # lighting coefficients.
    DP3   dots.x, xfNormal, lightDir;
    DP3   dots.y, xfNormal, halfDir;
    MOV   dots.w, specExp.x;
    LIT   dots, dots;

    # Accumulate color contributions.
    MAD   temp, dots.y, diffuseCol, ambientCol;
    MAD   oColor.xyz, dots.z, specularCol, temp;
    MOV   oColor.w, diffuseCol.w;
    END
The following fragment program shows how to perform a simple modulation between the interpolated fragment color from rasterization and a single texture:
    !!ARBfp1.0
    ATTRIB tex = fragment.texcoord;       # First set of texture coordinates
    ATTRIB col = fragment.color.primary;  # Diffuse interpolated color
    OUTPUT outColor = result.color;
    TEMP tmp;

    TXP tmp, tex, texture, 2D;            # Sample the texture
    MUL outColor, tmp, col;               # Perform the modulation
    END
The following fragment program simulates a chrome surface:
    !!ARBfp1.0
    ########################
    # Input Textures:
    #-----------------------
    # Texture 0 contains the default 2D texture used for general mapping
    # Texture 2 contains a 1D pointlight falloff map
    # Texture 3 contains a 2D map for calculating specular lighting
    # Texture 4 contains normalizer cube map
    #
    # Input Texture Coordinates:
    #-----------------------
    # TexCoord1 contains the calculated normal
    # TexCoord2 contains the light to vertex vector
    # TexCoord3 contains the half-vector in tangent space
    # TexCoord4 contains the light vector in tangent space
    # TexCoord5 contains the eye vector in tangent space
    ########################

    TEMP   NdotH, lV, L;
    ALIAS  diffuse  = L;
    PARAM  half     = { 0.5, 0.5, 0.5, 0.5 };
    ATTRIB norm_tc  = fragment.texcoord[1];
    ATTRIB lv_tc    = fragment.texcoord[2];
    ATTRIB half_tc  = fragment.texcoord[3];
    ATTRIB light_tc = fragment.texcoord[4];
    ATTRIB eye_tc   = fragment.texcoord[5];
    OUTPUT oCol     = result.color;

    TEX    L, light_tc, texture[4], CUBE;  # Sample cube map normalizer

    # Calculate diffuse lighting (N.L)
    SUB    L, L, half;                     # Bias L and then multiply by 2
    ADD    L, L, L;
    DP3    diffuse, norm_tc, L;            # N.L

    # Calculate specular lighting component { (N.H), |H|^2 }
    DP3    NdotH.x, norm_tc, half_tc;
    DP3    NdotH.y, half_tc, half_tc;
    DP3    lV.x, lv_tc, lv_tc;             # lV = (|light to vertex|)^2

    #############
    # Pass 2
    #############
    TEMP   base, specular;
    ALIAS  atten = lV;

    TEX    base, eye_tc, texture[0], 2D;     # Sample environment map using eye vector
    TEX    atten, lV, texture[2], 1D;        # Sample attenuation map
    TEX    specular, NdotH, texture[3], 2D;  # Sample specular NHHH map = (N.H)^256

    # specular = (N.H)^256 * (N.L)
    # this ensures a pixel is only lit if facing the light (since the specular
    # exponent makes negative N.H positive we must do this)
    MUL    specular, specular, diffuse;

    # specular = specular * environment map
    MUL    specular, base, specular;

    # diffuse = diffuse * environment map
    MUL    diffuse, base, diffuse;

    # outColor = (specular * environment map) + (diffuse * environment map)
    ADD    base, specular, diffuse;

    # Apply point light attenuation
    MUL    oCol, base, atten.r;
    END
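The bias-and-scale step near the top of the chrome program is easy to check on the CPU. The following C sketch mirrors the SUB/ADD pair and the DP3 that follows it; the function names expand_component and dot3 are illustrative only and do not belong to any API.

```c
/* Map a value fetched from a texture, range [0, 1], back to the signed
 * range [-1, 1]. This is the per-component equivalent of
 *     SUB L, L, half;   (subtract 0.5)
 *     ADD L, L, L;      (multiply by 2)                              */
float expand_component(float stored)
{
    float biased = stored - 0.5f;
    return biased + biased;
}

/* Three-component dot product, as computed by the DP3 instruction. */
float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

Applying expand_component to each component of the cube map sample and then calling dot3 with the normal reproduces the N.L term that the program stores in diffuse.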
If a program fails to load when glProgramStringARB() is called, either because it contains an error or because it would exceed the resource limits of the implementation, a GL_INVALID_OPERATION error is generated. Calling glGetIntegerv() with a pname of GL_PROGRAM_ERROR_POSITION_ARB returns the byte offset into the program string at which the error was detected, and calling glGetString() with a name of GL_PROGRAM_ERROR_STRING_ARB returns a string describing the error (for example, a compiler error message).
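A byte offset is awkward to report to a user, so applications often convert it to a line and column. The following C sketch shows one way to do that; struct source_pos and locate_error are hypothetical names introduced here, and the sketch assumes lines are separated by '\n' and that an offset of -1 means no error was recorded.

```c
#include <stddef.h>

/* 1-based line and column; both are 0 when there is no error. */
struct source_pos {
    int line;
    int column;
};

/* Translate the byte offset reported for GL_PROGRAM_ERROR_POSITION_ARB
 * into a line/column pair within the program string of the given length.
 * Hypothetical helper, not an OpenGL call. */
struct source_pos locate_error(const char *program, size_t length,
                               int error_pos)
{
    struct source_pos pos = { 1, 1 };
    size_t i, end;

    if (error_pos < 0) {            /* no error recorded */
        pos.line = 0;
        pos.column = 0;
        return pos;
    }
    /* Clamp so that an offset at or past the end stays in bounds. */
    end = (size_t)error_pos < length ? (size_t)error_pos : length;
    for (i = 0; i < end; i++) {
        if (program[i] == '\n') {
            pos.line++;
            pos.column = 1;
        } else {
            pos.column++;
        }
    }
    return pos;
}
```

In an application, error_pos would come from glGetIntegerv() with GL_PROGRAM_ERROR_POSITION_ARB, and the result could be printed alongside the message from GL_PROGRAM_ERROR_STRING_ARB.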
If the currently bound vertex or fragment program does not contain a valid program and the corresponding vertex or fragment program mode is enabled, a GL_INVALID_OPERATION error is generated whenever glBegin(), glRasterPos*(), or any drawing command that performs an implicit glBegin(), such as glDrawArrays(), is called.
A GL_INVALID_VALUE error is generated if a specified index exceeds the implementation limit (for the number of attributes, program local parameters, program environment parameters, and so on) under the following conditions:
When specifying vertex attribute indices in immediate-mode or vertex array calls
When specifying program parameter indices in specification or query calls and in similar calls
The ARB_vertex_program and ARB_fragment_program extensions introduce the following functions:
In addition to the ARB_vertex_program and ARB_fragment_program extensions, Onyx4 and Silicon Graphics Prism systems also support the following set of ATI vendor extensions for vertex and fragment programming:
These two extensions, developed prior to the ARB extensions, are included only for support of legacy applications being ported from other platforms. They supply no functionality not present in ARB_vertex_program and ARB_fragment_program and are not as widely implemented. Whenever writing new code using vertex or fragment programs, always use the ARB extensions.
Since these are legacy extensions, they are not documented in detail here. This section only describes how the legacy extensions map onto the corresponding ARB extensions.
EXT_vertex_shader | Allows an application to define vertex programs that are functionally comparable to ARB_vertex_program programs. |
ATI_fragment_shader | Allows an application to define fragment programs that are functionally comparable to ARB_fragment_program programs. |
Instead of specifying a program string, each legacy program instruction is a function call specifying instruction parameters.
The ATI_fragment_shader and EXT_vertex_shader extensions introduce the following set of functions: