When optimizing an OpenGL application, there are two problems you need to address:
When you are writing an OpenGL application, it is difficult to know whether a particular feature (like depth buffering or texture mapping) is fast enough to be useful.
If you want your application to run fast on a variety of machines while taking advantage of as many hardware features as possible, you need to write code that makes configuration decisions at run time.
For the OpenGL predecessor IRIS GL, you could call getgdesc() to determine whether a feature had hardware support. For example, you could determine whether a Z buffer existed. If it did, you might assume that Z buffering was fast and, therefore, your application would use it.
In OpenGL, things are more complicated. All the core features are provided even when there is no hardware support for them and they must be implemented completely in software. There is no OpenGL routine that reports whether a feature is implemented partly or completely in hardware.
Furthermore, features interact in unpredictable ways. The following are examples:
A machine might have hardware support for depth buffering but only for some comparison functions.
Depth buffering might be fast only as long as stencilling is not enabled.
Depth buffering might be fast when drawing to a window but slow when drawing to a pixmap.
A routine that identifies hardware support for particular features is actually a lot more complicated and less useful than you might think.
To decide whether a given OpenGL feature is fast, you have to measure it. Since the performance of a section of graphics code is dependent on many pieces of information from the run-time environment, no other method is as well-defined and reliable.
Keep in mind that while the results of the libisfast routines are interesting, they apply to limited special cases. Always consider using a more general tool like Open Inventor or OpenGL Performer.
Performance measurement can be tricky, as indicated by the following considerations:
You need to handle the cases when you are displaying over a network as well as locally.
You must flush the graphics pipeline properly and account for the resulting overhead.
Measuring all the features needed by your application may take a while. Save performance measurements and reuse them whenever possible; users will not want to wait for measurements each time the application starts.
Consider measuring things other than graphics: disk and network throughput, processing time for a particular set of data, performance on single-processor and multiprocessor systems, and so on.
This appendix describes two libraries that can help with all of the tasks just mentioned:
libpdb | Performance database routines for measuring execution rates and maintaining a simple database. |
libisfast | A set of routines demonstrating libpdb that answer common questions about the performance of OpenGL features (using reasonable but subjective criteria). |
These libraries cannot substitute for comprehensive benchmarking and performance analysis and do not replace more sophisticated tools (like OpenGL Performer and Open Inventor) that optimize application performance in a variety of ways. However, they can handle simple tasks easily.
Library libpdb provides the following routines: pdbOpen(), pdbClose(), pdbReadRate(), pdbMeasureRate(), and pdbWriteRate().
All libpdb routines return a value of type pdbStatusT, which is a bit mask of error conditions. If the value is zero (PDB_NO_ERROR), the call completed successfully. If the value is nonzero, it is a combination of one or more of the conditions listed in Table B-1.
Table B-1. Errors Returned by libpdb Routines
Error | Meaning |
---|---|
PDB_OUT_OF_MEMORY | Attempt to allocate memory failed. |
PDB_SYNTAX_ERROR | Database contains one or more records that could not be parsed. |
PDB_NOT_FOUND | Database does not contain the record requested by the application. |
PDB_CANT_WRITE | Database file could not be updated. |
PDB_NOT_OPEN | Routine pdbOpen() was not invoked before calling one of the other libpdb routines. |
PDB_ALREADY_OPEN | Routine pdbOpen() was called while the database is still open (that is, before pdbClose() is invoked). |
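Because the status is a bit mask, one return value can carry several conditions at once, so a caller should test individual bits rather than compare against a single error code. The following is a minimal sketch of that idea. The type sketchStatusT, the SKETCH_ constants, and their bit values are placeholders invented for illustration; the real values come from the libpdb header and may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder bit values for illustration only; the real constants are
 * defined by the libpdb header and may differ. */
typedef unsigned int sketchStatusT;
enum {
    SKETCH_NO_ERROR      = 0,
    SKETCH_OUT_OF_MEMORY = 1 << 0,
    SKETCH_SYNTAX_ERROR  = 1 << 1,
    SKETCH_NOT_FOUND     = 1 << 2,
    SKETCH_CANT_WRITE    = 1 << 3,
    SKETCH_NOT_OPEN      = 1 << 4,
    SKETCH_ALREADY_OPEN  = 1 << 5
};

/* Writes a comma-separated list of the conditions set in status into buf. */
static void sketchDescribeStatus(sketchStatusT status, char *buf, size_t size)
{
    static const struct { sketchStatusT bit; const char *name; } table[] = {
        { SKETCH_OUT_OF_MEMORY, "PDB_OUT_OF_MEMORY" },
        { SKETCH_SYNTAX_ERROR,  "PDB_SYNTAX_ERROR"  },
        { SKETCH_NOT_FOUND,     "PDB_NOT_FOUND"     },
        { SKETCH_CANT_WRITE,    "PDB_CANT_WRITE"    },
        { SKETCH_NOT_OPEN,      "PDB_NOT_OPEN"      },
        { SKETCH_ALREADY_OPEN,  "PDB_ALREADY_OPEN"  }
    };
    size_t i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    if (status == SKETCH_NO_ERROR) {
        snprintf(buf, size, "no error");
        return;
    }
    /* Test each bit separately, since several may be set together. */
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (status & table[i].bit) {
            size_t used = strlen(buf);
            snprintf(buf + used, size - used, "%s%s",
                     used ? ", " : "", table[i].name);
        }
    }
}
```

With the real library, the same pattern would apply to a pdbStatusT value and the PDB_ constants listed in Table B-1.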
Every program must call pdbOpen() before using the database and pdbClose() when the database is no longer needed. Routine pdbOpen() opens the database file (stored in $HOME/.pdb2 on IRIX and Linux systems) and reads all the performance measurements into main memory. Routine pdbClose() releases all memory used by the library, and writes the database back to its file if any changes have been made by invoking pdbWriteRate(). The routines have the following format:
pdbStatusT pdbOpen(void);
pdbStatusT pdbClose(void);
Routine pdbOpen() returns the following:
PDB_NO_ERROR on success
PDB_OUT_OF_MEMORY if there was insufficient main memory to store the entire database
PDB_SYNTAX_ERROR if the contents of the database could not be parsed or seemed implausible (for example, a nonpositive performance measurement)
PDB_ALREADY_OPEN if the database has been opened by a previous call to pdbOpen() and not closed by a call to pdbClose()
Routine pdbClose() returns the following:
PDB_NO_ERROR on success
PDB_CANT_WRITE if the database file is unwritable for any reason
PDB_NOT_OPEN if the database is not open
Normally applications should look for the performance data they need before going to the trouble of taking measurements. Routine pdbReadRate(), which is used for this purpose, has the following format:
pdbStatusT pdbReadRate(const char* machineName, const char* appName,
                       const char* benchmarkName, const char* versionString,
                       double* rate);
The variable items are defined as follows:
machineName | A zero-terminated string giving the name of the machine for which the measurement is sought. If NULL, the default machine name is used. (In X11 environments, the display name is an appropriate choice, and the default machine name is the content of the DISPLAY environment variable.) |
appName | Name of the application. This is used as an additional database key to reduce accidental collisions between benchmark names. |
benchmarkName | Name of the benchmark. |
versionString | A string identifying the desired version of the benchmark. For OpenGL performance measurements, the string returned by glGetString(GL_VERSION) is a good value for this argument. Other applications might use the version number of the benchmark rather than the version number of the system under test. |
rate | A pointer to a double-precision floating-point variable that receives the performance measurement (the “rate”) from the database. The rate indicates the number of benchmark operations per second that were measured on a previous run. |
If pdbReadRate() returns zero (PDB_NO_ERROR), it completed successfully and the rate is returned in the last argument. If the requested benchmark is not present in the database, it returns PDB_NOT_FOUND. If it is called when the database has not been opened by pdbOpen(), it returns PDB_NOT_OPEN.
The following example illustrates the use of pdbReadRate():
int main(void)
{
    double rate;

    /* glGetString() requires a current OpenGL context. */
    pdbOpen();
    if (pdbReadRate(NULL, "myApp", "triangles",
                    (const char*) glGetString(GL_VERSION), &rate) == PDB_NO_ERROR)
        printf("%g triangle calls per second\n", rate);
    pdbClose();
    return 0;
}
When the application is run for the first time or when the performance database file has been removed (perhaps to allow a fresh start after a hardware upgrade), pdbReadRate() is not able to find the desired benchmark. If this happens, the application should use pdbMeasureRate(), which has the following format, to make a measurement:
pdbStatusT pdbMeasureRate(pdbCallbackT initialize, pdbCallbackT operation,
                          pdbCallbackT finalize, int calibrate, double* rate);
The variable items are defined as follows:
initialize | A pointer to the initialization function. The initialization function is run before each set of operations. For OpenGL performance measurement, it is appropriate to use glFinish() for initialization to make sure that the graphics pipe is quiet. However, for other performance measurements, the initialization function can create test data, preload caches, and so on. The value may be NULL, in which case no initialization is performed. |
operation | A pointer to the operation function. This function performs the operations that are to be measured. Usually you will want to make sure that any global state needed by the operation is set up before calling the operation function so that you do not include the cost of the setup operations in the measurement. |
finalize | A pointer to a finalization function. This is run once, after all the calls to the operation function are complete. In the example that follows this list, glFinish() ensures that the graphics pipeline is idle. The value of finalize may be NULL, in which case no finalization is performed. |
calibrate | An integer flag that controls calibration of the finalization function. The finalization function must be calibrated so that the overhead of calling it may be subtracted from the time used by the operation function. If calibrate is nonzero, pdbMeasureRate() calibrates the finalization function. If calibrate is zero, pdbMeasureRate() uses the results of the previous calibration. Recalibrating for each measurement is the safest approach, but it roughly doubles the amount of time needed for a measurement. For OpenGL, it should be acceptable to calibrate once and recalibrate only when using a different X11 display. |
rate | A pointer to a double-precision floating-point variable that receives the execution rate. This rate is the number of times the operation function was called per second. Routine pdbMeasureRate() attempts to compute a number of repetitions that results in a run time of about one second. (Calibration requires an additional second.) It is reasonably careful about timekeeping on systems with low-resolution clocks. |
Routine pdbMeasureRate() always returns PDB_NO_ERROR.
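To make the repetition search and the calibration step more concrete, the following is a minimal sketch of one way such a measurement loop could be written. It is not the library's implementation. Every name beginning with sketch is hypothetical, the run time is a parameter rather than a fixed second, a counter increment stands in for drawing code, and the POSIX routine clock_gettime() is assumed to be available.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() */
#endif
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: a function taking and returning nothing. */
typedef void (*sketchCallbackT)(void);

/* Stand-in for drawing code so that the sketch runs without OpenGL. */
static volatile unsigned long sketchCounter = 0;
static void sketchSampleOperation(void) { sketchCounter++; }

/* Seconds read from a monotonic wall clock. */
static double sketchNow(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec * 1e-9;
}

/* Times one run: initialize (untimed), reps calls to operation, finalize.
 * Any callback may be NULL, in which case that step is skipped. */
static double sketchTimeRun(sketchCallbackT initialize,
                            sketchCallbackT operation,
                            sketchCallbackT finalize, long reps)
{
    double start;
    long i;

    if (initialize) initialize();
    start = sketchNow();
    if (operation)
        for (i = 0; i < reps; i++) operation();
    if (finalize) finalize();
    return sketchNow() - start;
}

/* Returns operation calls per second, aiming for a run of targetSeconds. */
static double sketchMeasureRate(sketchCallbackT initialize,
                                sketchCallbackT operation,
                                sketchCallbackT finalize,
                                double targetSeconds)
{
    long reps = 1;
    double overhead, elapsed;

    assert(operation != NULL && targetSeconds > 0.0);

    /* Calibration: time the finalize call alone, with no operations. */
    overhead = sketchTimeRun(initialize, NULL, finalize, 0);

    /* Double the repetition count until one run lasts long enough. */
    for (;;) {
        elapsed = sketchTimeRun(initialize, operation, finalize, reps);
        if (elapsed >= targetSeconds || reps > LONG_MAX / 2) break;
        reps *= 2;
    }

    elapsed -= overhead;
    if (elapsed <= 0.0) elapsed = 1e-9;  /* guard against clock noise */
    return (double) reps / elapsed;
}
```

A call such as sketchMeasureRate(NULL, sketchSampleOperation, NULL, 0.05) returns a positive rate after roughly a tenth of a second. A single calibration run is noisy; a more careful version would repeat it and take the minimum or the median.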
The following example illustrates the use of pdbMeasureRate():
void SetupOpenGLState(void)
{
    /* set all OpenGL state to desired values */
}

void DrawTriangles(void)
{
    glBegin(GL_TRIANGLE_STRIP);
    /* specify some vertices... */
    glEnd();
}

int main(void)
{
    double rate;

    pdbOpen();
    if (pdbReadRate(NULL, "myApp", "triangles",
                    (const char*) glGetString(GL_VERSION), &rate) != PDB_NO_ERROR) {
        /* No stored rate: set up state first so that its cost is not timed. */
        SetupOpenGLState();
        pdbMeasureRate(glFinish, DrawTriangles, glFinish, 1, &rate);
    }
    printf("%g triangle calls per second\n", rate);
    pdbClose();
    return 0;
}
Once a rate has been measured, it should be stored in the database by calling pdbWriteRate(), which has the following format:
pdbStatusT pdbWriteRate(const char* machineName, const char* applicationName,
                        const char* benchmarkName, const char* versionString,
                        double rate);
The first four arguments of pdbWriteRate() match the first four arguments of pdbReadRate(). The last argument is the performance measurement to be saved in the database.
Routine pdbWriteRate() returns the following:
PDB_NO_ERROR if the performance measurement was added to the in-memory copy of the database
PDB_OUT_OF_MEMORY if there was insufficient main memory
PDB_NOT_OPEN if the database is not open
When pdbWriteRate() is called, the in-memory copy of the performance database is marked “dirty.” Routine pdbClose() takes note of this and writes the database back to disk.
The following example illustrates the use of pdbWriteRate():
int main(void)
{
    double rate;

    pdbOpen();
    if (pdbReadRate(NULL, "myApp", "triangles",
                    (const char*) glGetString(GL_VERSION), &rate) != PDB_NO_ERROR) {
        SetupOpenGLState();
        pdbMeasureRate(glFinish, DrawTriangles, glFinish, 1, &rate);
        /* Store the new measurement so that later runs can skip the benchmark. */
        pdbWriteRate(NULL, "myApp", "triangles",
                     (const char*) glGetString(GL_VERSION), rate);
    }
    printf("%g triangle calls per second\n", rate);
    pdbClose();
    return 0;
}
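The read, measure, and write sequence amounts to a small cache keyed by four strings, with a dirty flag that records whether the in-memory copy needs to be written back. The following is a minimal in-memory sketch of that bookkeeping, under stated assumptions: it is not how libpdb stores its data, all names beginning with sketch are hypothetical, the table has a fixed capacity, and no file I/O is performed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_RECORDS 16
#define SKETCH_KEY_LEN 64

typedef struct {
    char machine[SKETCH_KEY_LEN];
    char app[SKETCH_KEY_LEN];
    char benchmark[SKETCH_KEY_LEN];
    char version[SKETCH_KEY_LEN];
    double rate;
} sketchRecordT;

typedef struct {
    sketchRecordT records[SKETCH_MAX_RECORDS];
    int count;
    int dirty;  /* nonzero once a write has changed the in-memory copy */
} sketchDbT;

/* Copies a key, truncating it to fit the fixed-size field. */
static void sketchCopyKey(char *dst, const char *src)
{
    snprintf(dst, SKETCH_KEY_LEN, "%s", src);
}

/* Compares a stored (possibly truncated) key with a requested key. */
static int sketchSameKey(const char *stored, const char *wanted)
{
    return strncmp(stored, wanted, SKETCH_KEY_LEN - 1) == 0;
}

/* Returns the record matching all four keys, or NULL if there is none. */
static sketchRecordT *sketchFind(sketchDbT *db, const char *machine,
                                 const char *app, const char *benchmark,
                                 const char *version)
{
    int i;

    assert(db != NULL);
    for (i = 0; i < db->count; i++) {
        sketchRecordT *r = &db->records[i];
        if (sketchSameKey(r->machine, machine) && sketchSameKey(r->app, app) &&
            sketchSameKey(r->benchmark, benchmark) &&
            sketchSameKey(r->version, version))
            return r;
    }
    return NULL;
}

/* Returns 1 and stores the rate if the record exists, 0 otherwise. */
static int sketchReadRate(sketchDbT *db, const char *machine, const char *app,
                          const char *benchmark, const char *version,
                          double *rate)
{
    sketchRecordT *r = sketchFind(db, machine, app, benchmark, version);

    if (r == NULL) return 0;
    *rate = r->rate;
    return 1;
}

/* Adds or updates a record and marks the database dirty.
 * Returns 1 on success, 0 if the fixed-size table is full. */
static int sketchWriteRate(sketchDbT *db, const char *machine, const char *app,
                           const char *benchmark, const char *version,
                           double rate)
{
    sketchRecordT *r = sketchFind(db, machine, app, benchmark, version);

    if (r == NULL) {
        if (db->count == SKETCH_MAX_RECORDS) return 0;
        r = &db->records[db->count++];
        sketchCopyKey(r->machine, machine);
        sketchCopyKey(r->app, app);
        sketchCopyKey(r->benchmark, benchmark);
        sketchCopyKey(r->version, version);
    }
    r->rate = rate;
    db->dirty = 1;
    return 1;
}
```

A close routine built on this sketch would write the table to disk only when dirty is nonzero, which mirrors the behavior described for pdbClose(). Including the version string in the key means that a driver or library upgrade naturally misses the old record and triggers a fresh measurement.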
The libisfast library is a set of demonstration routines that show how libpdb can be used to measure and maintain OpenGL performance data. Library libisfast is based on purely subjective performance criteria. If they are appropriate for your application, feel free to use them. If not, copy the source code and modify it accordingly.
In all cases that follow, the term “triangles” refers to a triangle strip with 37 vertices. The triangles are drawn with perspective projection, lighting, and smooth (Gouraud) shading. Unless otherwise stated, display-list-mode drawing is used. This makes libisfast yield more useful results when the target machine is being accessed over a network.
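One simple way to turn two measured rates into a yes-or-no answer is a ratio test: a feature counts as fast if drawing with it enabled reaches at least some fraction of the baseline rate. The following sketch illustrates the idea only. The function name and the choice of fraction are assumptions made for this example, not a description of the criteria libisfast actually applies.

```c
#include <assert.h>

/* Returns nonzero if featureRate is at least minFraction of baselineRate.
 * Both rates are in operations per second, as produced by a benchmark. */
static int sketchFeatureIsFast(double baselineRate, double featureRate,
                               double minFraction)
{
    assert(minFraction > 0.0);
    if (baselineRate <= 0.0 || featureRate <= 0.0)
        return 0;  /* no usable measurement: treat the feature as not fast */
    return featureRate >= minFraction * baselineRate;
}
```

For example, with a fraction of 0.5, a baseline of 1000 triangle calls per second and a feature rate of 600 counts as fast, while a feature rate of 400 does not. The right fraction depends on the application, which is why such criteria are subjective.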
The application must initialize libisfast before performing any performance measurements and clean up after the measurements are finished. On X11 systems, initialize libisfast by making the following call:
int IsFastXOpenDisplay(const char* displayName);
Perform cleanup by making the following call:
void IsFastXCloseDisplay(void);
The following are other libisfast routines to use:
Although the routines in libisfast are useful for a number of applications, you should study them and modify them for your own use. Doing so allows you to explore the particular performance characteristics of your systems: their sensitivity to triangle size, triangle strip length, culling, stencil function, texture-map type, texture-coordinate generation method, and so on.