Mathieu De Coster - SDL_GPU: the GPU API I always wanted

SDL_GPU: the GPU API I always wanted

17 March 2026

For a long time, writing a graphics program meant picking your poison. SDL_GPU, shipped with SDL3, hits the balance between usability and power just right.

OpenGL is battle-tested, but its age shows. The API is a sprawling collection of decades-old state-machine logic. What was "best practice" in OpenGL 2.1 is often a performance trap in 3.3, and OpenGL 4.6 is an entirely different beast. While it is theoretically cross-platform, the reality is messy: mobile requires OpenGL ES, and Apple famously deprecated support years ago. Vulkan has effectively succeeded it – even if it was never intended as a replacement.

Modern APIs map better to current GPU architectures, but they come with complexity. Vulkan gives you total control but asks you to write four hundred lines before you can clear the screen. Direct3D 12 is the same story, but only on Windows. I've heard Metal is great, but I don't own a MacBook anymore.

I have been playing around with creating a small game with SDL3, SDL_GPU, and C++ to learn modern graphics programming. (I last dabbled with OpenGL nearly a decade ago.) I've always had a soft spot for SDL's clean C API and good documentation, and SDL3 feels like a perfected version of the library.

The thing that sold me first was how shaders are handled. You write HLSL once, and at build time SDL_shadercross compiles it to SPIR-V for Vulkan, DXIL for Direct3D 12, and MSL for Metal. At runtime, you ask SDL which formats the device supports and load the right binary. I write the shader logic once, in a language I know, and the game runs on every modern desktop platform without a single #ifdef.
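The runtime half of that can be sketched roughly as follows. This is a minimal, SDL-free illustration, not my actual loader: the three flags are stand-ins for SDL's SDL_GPU_SHADERFORMAT_SPIRV, SDL_GPU_SHADERFORMAT_DXIL and SDL_GPU_SHADERFORMAT_MSL constants, the mask would come from SDL_GetGPUShaderFormats(device) in real code, and the file extensions and the pick_shader_path helper are naming assumptions made for this example.

```cpp
#include <cstdint>
#include <string>

// Stand-in bit flags for this sketch. Real code would use the
// SDL_GPU_SHADERFORMAT_* constants, and the mask would be the value
// returned by SDL_GetGPUShaderFormats(device).
using ShaderFormatMask = std::uint32_t;
constexpr ShaderFormatMask kFormatSpirv = 1u << 0;
constexpr ShaderFormatMask kFormatDxil  = 1u << 1;
constexpr ShaderFormatMask kFormatMsl   = 1u << 2;

// Hypothetical helper: given the formats the device supports and the base
// name of a shader, return the path of the precompiled binary to load.
// Returns an empty string when no variant matches.
inline std::string pick_shader_path(ShaderFormatMask supported,
                                    const std::string& base_name) {
    struct Candidate {
        ShaderFormatMask format;
        const char* extension;  // assumed build-output naming convention
    };
    const Candidate candidates[] = {
        {kFormatSpirv, ".spv"},
        {kFormatDxil,  ".dxil"},
        {kFormatMsl,   ".msl"},
    };
    // First match wins, so the array order doubles as a preference order.
    for (const Candidate& c : candidates) {
        if ((supported & c.format) != 0) {
            return base_name + c.extension;
        }
    }
    return std::string();
}
```

The point of the design is that the choice is plain data at runtime: the same executable logic runs everywhere, and only the mask reported by the device decides which file is read.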

The second thing I appreciate is that SDL_GPU is explicit without being overwhelming. It exposes command buffers, copy passes, and render passes, but doesn't make you manage synchronisation primitives and memory barriers. My renderer currently has a clear two-phase structure per frame:

SDL_GPUCommandBuffer* cmd = SDL_AcquireGPUCommandBuffer(device);

SDL_GPUCopyPass* copy = SDL_BeginGPUCopyPass(cmd);
// Upload vertex data to GPU buffers
SDL_EndGPUCopyPass(copy);

SDL_GPURenderPass* pass = SDL_BeginGPURenderPass(cmd, &color_target, 1, nullptr);
// Issue draw calls
SDL_EndGPURenderPass(pass);

SDL_SubmitGPUCommandBuffer(cmd);

This maps directly to how the GPU actually works. Data moves in the copy pass. Drawing happens in the render pass. I can clearly see where allocations happen and when commands are submitted. While I haven't needed multi-threaded command buffer creation yet, the API is designed to handle it if and when the time comes.

The CPU-to-GPU transfer model is a good example of the explicitness of the API. You allocate a transfer buffer on the CPU side, write your vertex data into it, and send it to the GPU-side buffer in the copy pass. More code than OpenGL's glBufferData, yes. But you understand exactly what is happening, you control when it happens, and it scales correctly to more complex scenarios.
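To make the "scales to more complex scenarios" part concrete, here is a small sketch of the CPU-side bookkeeping when several meshes share one staging area. It deliberately avoids SDL so it can be compiled on its own: StagingArea, UploadRegion and Vertex are hypothetical types invented for this example, and the std::vector stands in for the memory you would get from SDL_MapGPUTransferBuffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Vertex {
    float x, y, z;
};

// Hypothetical record of one pending upload: where the bytes sit in the
// staging memory and how many bytes there are.
struct UploadRegion {
    std::size_t staging_offset;
    std::size_t size;
};

// Hypothetical CPU-side staging area. With SDL_GPU the storage would be
// the mapped memory of a transfer buffer rather than a std::vector.
class StagingArea {
public:
    // Append a batch of vertices to the staging memory and remember
    // the region so it can be copied to a GPU buffer later.
    UploadRegion stage(const std::vector<Vertex>& vertices) {
        const std::size_t size = vertices.size() * sizeof(Vertex);
        const std::size_t offset = bytes_.size();
        bytes_.resize(offset + size);
        if (size > 0) {
            std::memcpy(bytes_.data() + offset, vertices.data(), size);
        }
        const UploadRegion region{offset, size};
        regions_.push_back(region);
        return region;
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }
    const std::vector<UploadRegion>& regions() const { return regions_; }

private:
    std::vector<std::uint8_t> bytes_;
    std::vector<UploadRegion> regions_;
};
```

In a real frame, each recorded region would presumably turn into one SDL_UploadToGPUBuffer call inside the copy pass, with staging_offset as the offset into the transfer buffer. Nothing is copied to the GPU until you say so, which is exactly the control glBufferData hides.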

The best thing is that the API surface is relatively small and easy to grasp. For Vulkan, you need to read a couple of books to even understand the concepts. With SDL_GPU, all you need is some HTML documentation and header files.

It has been years since I really dived into graphics, and for the first time in a long time, I'm actually having fun programming the hardware again.