- Following up on previous news and the PS3 Linux kernel module, this weekend PlayStation 3 developer gzorin (aka Alexander Betts) has made available RSXGL - the RSX Graphics Library for PS3.
To quote: RSXGL - The RSX Graphics Library
This library implements parts of the OpenGL 3.1 core profile specification for the PlayStation 3's RSX GPU. It's suitable for use in programs that have exclusive access to the RSX, such as GameOS software (and is likely unsuitable for implementing a multitasking desktop, as the library doesn't arbitrate access to the RSX).
Please see the STATUS file for up-to-date information about the current capabilities of this library.
Briefly, the following OpenGL 3.1 features haven't been implemented yet (the RSX is capable of supporting most of them):
- Full GLSL support. Currently, GLSL and Cg shaders can be compiled offline and used with an OpenGL ES 2 style API.
- Instanced draw commands.
- Framebuffer objects and multiple render targets.
- glQuery*() objects.
- Point sprites.
- A variety of capabilities related to texture maps (rectangular and cube textures, the behavior of glPixelStore, copying from the framebuffer, mipmap generation, and texture formats, including compressed formats, that require conversion and/or swizzling).
- Client-side vertex array data. OpenGL 3.1's core profile specifically omits this, but it is specified by OpenGL ES 2 (as well as the OpenGL 3 compatibility profile), and is likely still widely used, particularly by smaller demonstration programs.
- Application object namespaces (another feature omitted by OpenGL 3.1, but used by earlier specs).
- Most glGet*() functions haven't been implemented. Some specified behavior of glEnable()/glDisable() likely needs to be implemented as well.
Further details follow:
Full GLSL support is admittedly a large area of missing capability. Currently you can compile GLSL (and Cg) programs offline using NVIDIA's cgc compiler, and use them in an OpenGL program via an API that resembles the OpenGL ES 2 API - glShaderBinary() is used to load shader objects, and exactly two such shader objects (a vertex program and a fragment program) can be linked together by glLinkProgram().
This project certainly hopes to extend its implementation of GLSL further.
Intel contributed a standalone GLSL compiler to Mesa which can emit two intermediate representations that could perhaps be translated to the NVIDIA microcode used by the RSX (this compiler may do so already; I haven't looked in a few months). Perhaps this compiler could be made to run from within a PS3 program.
Another, longer-term avenue would be to explore writing an NVIDIA microcode backend for the LLVM compiler (my own brief investigation of this suggests that this might be hard, as LLVM possibly requires that its targets model some sort of heap, which the NVIDIA cards don't do).
There are opportunities to creatively implement other, recent OpenGL features. Uniform buffer objects could be supported, since both the vertex and fragment stages of the RSX can fetch from floating point textures. Transform feedback might be implemented by having the GLSL compiler factor out the program paths that compute vertices and generate a separate "fragment" program that renders to a buffer. Newer pipeline stages, like tessellation and geometry programs, could possibly be executed on the PS3's SPUs.
The RSX can directly read from a handful of texture formats, but the OpenGL spec calls for many more besides. RSXGL currently implements only those formats whereby image data can simply be copied to the RSX's memory; other formats will require pixel-by-pixel type conversion and swizzling. The Mesa library appears to be able to perform many of these conversions.
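To give a rough idea of what pixel-by-pixel conversion involves, here is a minimal sketch that expands tightly packed 8-bit RGB texels into 32-bit words with an opaque alpha channel. The function name and the 0xAARRGGBB target layout are assumptions chosen for illustration; they are not taken from RSXGL or Mesa, and real swizzling for the RSX would additionally reorder texels in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of RSXGL): convert `count` tightly packed
 * RGB8 texels into 32-bit words laid out as 0xAARRGGBB, forcing alpha to
 * fully opaque. The target layout is an assumption for this example. */
static void rgb8_to_argb8(const uint8_t *src, uint32_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint32_t r = src[3 * i + 0];
        uint32_t g = src[3 * i + 1];
        uint32_t b = src[3 * i + 2];
        dst[i] = 0xFF000000u | (r << 16) | (g << 8) | b;
    }
}
```

Formats that already match a layout the GPU can read skip this loop entirely and are copied as-is, which is the only case handled today.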
- STREAMING, SYNCHRONIZATION, CACHING
Right now, RSXGL takes a conservative approach to handling data that streams to the GPU - if an application wants to write to an area of memory that the GPU is using, then the application will wait for the GPU to finish.
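The conservative rule above can be sketched with a pair of timestamps: one recorded per buffer when the GPU is last asked to use it, and one reported by the GPU as it completes work. The struct and function names below are hypothetical and only illustrate the idea; they do not describe RSXGL's actual bookkeeping. The comparison is written to tolerate 32-bit counter wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-buffer state: the timestamp attached to the last GPU
 * command that referenced this buffer. */
typedef struct {
    uint32_t last_gpu_use;
} buffer_state;

/* True when timestamp `a` is at or after `b`. Correct across 32-bit
 * wraparound as long as the two values are less than 2^31 apart. */
static bool timestamp_reached(uint32_t a, uint32_t b)
{
    return (uint32_t)(a - b) < 0x80000000u;
}

/* Conservative policy: the application may write to the buffer only once
 * the GPU has completed every command that referenced it. A caller would
 * keep waiting while this returns false. */
static bool buffer_is_writable(const buffer_state *buf, uint32_t gpu_completed)
{
    assert(buf != NULL);
    return timestamp_reached(gpu_completed, buf->last_gpu_use);
}
```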
This library will implement other strategies. OpenGL specifies a few ways for an application to request synchronization behavior, but affords implementations a wide latitude for interpreting such requests. Data stores might be partially or fully double-buffered, temporarily or for the entire lifetime of the memory area. Additionally, the PS3 itself makes two equally-sized memory areas available to the RSX, with different transfer speeds in each direction.
RSXGL takes a similarly greedy approach to flushing the GPU's vertex and texture caches - it does this every time a new glDraw*() command is sent by the application. The library will instead use the information it has available to determine whether these caches really ought to be flushed before drawing.
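One simple way to decide whether a cache flush is needed is to track dirty flags that are set whenever vertex or texture memory is written and cleared when the caches are invalidated. The following is a minimal sketch of that idea; all names are hypothetical and nothing here is taken from RSXGL's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags: set by whatever code writes to vertex or texture
 * memory, consumed just before a draw command is emitted. */
typedef struct {
    bool vertex_data_dirty;
    bool texture_data_dirty;
} cache_state;

enum {
    FLUSH_NONE          = 0,
    FLUSH_VERTEX_CACHE  = 1 << 0,
    FLUSH_TEXTURE_CACHE = 1 << 1
};

/* Returns a bitmask of the caches that should be invalidated before the
 * next draw, and clears the dirty flags so that later draws skip the
 * flush until new data is written. */
static unsigned caches_to_flush(cache_state *state)
{
    unsigned mask = FLUSH_NONE;
    assert(state != NULL);
    if (state->vertex_data_dirty)  mask |= FLUSH_VERTEX_CACHE;
    if (state->texture_data_dirty) mask |= FLUSH_TEXTURE_CACHE;
    state->vertex_data_dirty = false;
    state->texture_data_dirty = false;
    return mask;
}
```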
RSXGL performs the equivalent of a glFlush() when the framebuffer is flipped, and when the application needs to wait for the GPU to finish with a buffer before modifying its contents (in addition to the occasions when flushing is explicitly called for by a GL function).
I've noticed that when a large-ish number of glDraw*() commands (on the order of several thousand) are submitted per frame without any additional flushing, performance degrades considerably. This is particularly the case if program uniform variables (such as an object's transformation matrix) are modified prior to each draw call. Therefore it's currently a good idea for the application to call glFlush() frequently.
Obviously this needs to be handled more transparently by the library itself, but I'm undecided as to a flushing policy - should it happen every draw call, or per a certain number of draw calls, or should it happen based upon the number of vertices being drawn, or the current size of the command buffer? (You can be assured that this will be an application-tunable setting.)
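The candidate policies listed above could all be expressed as thresholds on counters accumulated since the last flush. The sketch below shows one possible shape for such a tunable policy; the struct names, fields, and the convention that a limit of 0 disables a criterion are assumptions for illustration, not a description of what RSXGL does or will do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tunable limits; a value of 0 disables that criterion. */
typedef struct {
    unsigned max_draws;    /* flush after this many draw calls */
    unsigned max_vertices; /* flush after this many vertices */
    unsigned max_commands; /* flush after this many buffered commands */
} flush_policy;

/* Work accumulated since the last flush. */
typedef struct {
    unsigned draws;
    unsigned vertices;
    unsigned commands;
} flush_counters;

static bool limit_hit(unsigned value, unsigned limit)
{
    return limit != 0 && value >= limit;
}

/* Record one draw call. Returns true when the caller should flush now,
 * in which case the counters are reset. */
static bool note_draw(const flush_policy *policy, flush_counters *pending,
                      unsigned vertex_count, unsigned command_count)
{
    assert(policy != NULL && pending != NULL);
    pending->draws += 1;
    pending->vertices += vertex_count;
    pending->commands += command_count;

    if (limit_hit(pending->draws, policy->max_draws) ||
        limit_hit(pending->vertices, policy->max_vertices) ||
        limit_hit(pending->commands, policy->max_commands)) {
        pending->draws = pending->vertices = pending->commands = 0;
        return true;
    }
    return false;
}
```

Setting max_draws to 1 reproduces "flush every draw call", while setting only max_commands approximates a policy based on command buffer size.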
I'm unclear as to how expensive an operation glFlush() is - it involves a small write by the PPU to the RSX's control register, allegedly a fast operation (reading from RSX memory by the PPU is not fast), but beyond that I don't know.
I also haven't tried to observe what happens when the command buffer's capacity is exceeded before the framebuffer is flipped (the libEGL implementation creates a command buffer with the capacity for 524288 commands, which is set in rsxgl_config.h and can be overridden at runtime by calling rsxeglInit() before initializing EGL).
The OpenGL ES 2 profile, as well as OpenGL profiles prior to version 3.1, allow an application to specify vertex data from the system's main memory, without specifically asking the library to create a buffer of some predetermined size. This usually requires the library to implement some strategies to migrate client-side memory to the GPU, and there are likely many ways to try to perform this efficiently.
OpenGL 3.1 requires applications to create buffer objects for any vertex data (though not for index buffers; these can still be migrated from client memory). This makes life easier for the library implementor, and is one reason why RSXGL implements GL 3.1 and not GL ES 2 (it also helps keep the data structures that the library allocates small).
Nonetheless, many existing programs depend upon the older spec, so it'd be good to support this in RSXGL (even if it's not super-efficient at first). Since supporting this older capability can potentially bloat RSXGL's data structures, this will likely be an option specified when the library is configured.
Since client memory can be mapped into the RSX's address space, there will also be a capability, implemented in the style of an OpenGL extension, for the application to promise that the client pointers submitted via glVertexAttribPointer are so mapped, eliminating a memcpy.
OpenGL 3.1 requires that applications call glGen*() functions to create the names for objects that get created. ES 2 and previous GL versions made this optional, allowing the application to come up with any unsigned integers it wanted to name objects with. This change was another reason to implement OpenGL 3.1, because it allows for a faster pool-style allocation of object data structures instead of potentially requiring a costly data structure like an associative map.
The older behavior will be supported as a configure option, too, for compatibility with existing programs.
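To illustrate why library-generated names allow pool-style allocation, the sketch below hands out names that double as array indices, keeping freed names on an intrusive free list. Lookup by name is then a bounds check and an array access rather than a map search. The pool size, type names, and function names are hypothetical; this is not RSXGL's actual allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_CAPACITY 8u /* small, arbitrary size for the example */

/* Hypothetical name pool. Name 0 is reserved, matching OpenGL's use of 0
 * as "no object", so usable names are 1 .. POOL_CAPACITY - 1. */
typedef struct {
    uint32_t next_free[POOL_CAPACITY]; /* free-list links, indexed by name */
    uint8_t  in_use[POOL_CAPACITY];
    uint32_t free_head;                /* first free name, 0 if exhausted */
} name_pool;

static void pool_init(name_pool *pool)
{
    assert(pool != NULL);
    pool->next_free[0] = 0;
    pool->in_use[0] = 0;
    for (uint32_t i = 1; i < POOL_CAPACITY; ++i) {
        pool->next_free[i] = (i + 1 < POOL_CAPACITY) ? i + 1 : 0;
        pool->in_use[i] = 0;
    }
    pool->free_head = 1;
}

/* Analogous to a glGen*() call for a single name: O(1), no searching.
 * Returns 0 when the pool is exhausted. */
static uint32_t pool_gen(name_pool *pool)
{
    uint32_t name = pool->free_head;
    if (name != 0) {
        pool->free_head = pool->next_free[name];
        pool->in_use[name] = 1;
    }
    return name;
}

/* Analogous to a glDelete*() call: silently ignores invalid names and
 * pushes valid ones back onto the free list for reuse. */
static void pool_delete(name_pool *pool, uint32_t name)
{
    if (name == 0 || name >= POOL_CAPACITY || !pool->in_use[name])
        return;
    pool->in_use[name] = 0;
    pool->next_free[name] = pool->free_head;
    pool->free_head = name;
}
```

If applications may instead pick arbitrary unsigned integers as names, the index-equals-name shortcut no longer holds, which is where an associative map would come in.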
I'm interested in porting the OpenGL Samples Pack. They are small demonstration programs that are helpfully organized by the OpenGL profiles that they support.
RSXGL uses the GNU autotools for its build system and is distributed with a configure script. It requires the following projects:
NVIDIA's cgc shader compiler, from the Cg toolkit, is also required to use vertex and fragment GPU programs.
The RSXGL library depends upon a toolchain that can generate binaries for the PS3's PPU, and also upon parts of the PSL1GHT SDK. The sample programs also require a few ported libraries, such as libpng, which are provided by the ps3toolchain project. ps3toolchain recommends setting two environment variables to locate these dependencies:
RSXGL's configure script will use these environment variables if they're set; if they aren't set, by default the script uses the above settings. The PORTLIBS environment variable may also be set to further control where the ported libraries can be found.
If these variables are set reasonably, then the following commands should build RSXGL and its samples:
(Sample programs are packaged into NPDRM packages, but those packages remain in their build locations; they don't get moved anywhere relative to RSXGL's install path).
By default, configure will create non-debug builds with low optimization settings (-O1). Other options are:
# Debug build, including assertions:
# Optimized build (-O3):
The locations of the library's dependencies can be specified on the configure command-line, too:
The configure script expects to find versions of the gcc, g++, ar, ranlib, and ld programs that target the PS3's PPU. It tries to find these in $PS3DEV/ppu/bin, but the following environment variables can be set to provide full paths to these programs:
PPU_CC Location of C compiler
PPU_CXX Location of C++ compiler
PPU_AR Location of the ar archive utility
PPU_RANLIB Location of the ranlib utility
PPU_LD Location of the ld linker
The following environment variables can also be set to further influence how headers and libraries are located:
Currently two sample programs are built:
- src/samples/rsxgltest - A very simple test program whose contents and behavior will vary. This program is mainly used to try out various features of the library as they are developed.
- src/samples/rsxglgears - A port of an old chestnut, the "glgears" program that uses OpenGL to render some spinning gears. This port is based upon a version included in the Mesa library, which was itself a port to OpenGL ES 2 after being handed down throughout the ages.
The sample can print debugging information over TCP, in the manner of PSL1GHT's network/debugtest sample. src/samples/rsxgltest/main.c has symbols called TEST_IP and TESTPORT which specify the IP address and port number to try to connect to, and you can use "nc -l portnum" to receive messages.
- Instanced rendering
- Framebuffer objects
- x glMapBuffer* should block if an operation on the buffer is still pending.
- Better-performing glMapBuffer.
- Less conservative approach to buffer mapping & GPU cache invalidation.
- (Finish) object "orphaning" (Programs orphanable, Disposal of orphaned objects)
- x Handle timestamp overflow
- Integer vertex specification.
- format conversion (take this from mesa).
- test mipmapping
- cube maps
- rectangular textures (x implement glTexStorage*())
- implement glCopyTexImage*() and glCopyTexSubImage*()
- proxy textures
- implement the effects of glPixelStore
- Support for GLES2-style client vertex data.
- Support for GLES2-style application object namespaces.
- Implement glGet*() functions.
- More GLSL support.
- Make it possible to work with non-EGL configured display.
- Move object storage from globals to context.
- Support multiple contexts with shared objects.
- __restrict keyword where appropriate.
- x Add libstdc++.a to libGL.a, so that client code doesn't need to use g++ to link.
- Flushing and orphan cleanup policy
- Vertex program textures