OpenShot Library | libopenshot 0.5.0
Hardware acceleration in libopenshot allows FFmpeg to use platform-specific GPU APIs for video decode and encode when available. In practice, this means some of the work that would otherwise be done by the CPU can be offloaded to the GPU or to dedicated media blocks on the GPU.
This document focuses on what hardware acceleration in libopenshot does today, how it fits into the current processing pipeline, and what users and developers should expect from it.
The following table summarizes the historically supported hardware-acceleration backends in libopenshot. Actual behavior still depends on FFmpeg build options, driver availability, operating system support, and the runtime environment.
| Backend | Linux Decode | Linux Encode | macOS Decode | macOS Encode | Windows Decode | Windows Encode | Notes |
|---|---|---|---|---|---|---|---|
| VA-API | ✔️ | ✔️ | - | - | - | - | Linux only |
| VDPAU | ✔️ 1 | ✅ 2 | - | - | - | - | Linux only |
| CUDA (NVDEC/NVENC) | ❌ 3 | ✔️ | - | - | - | ✔️ | Backend availability depends on the FFmpeg build |
| VideoToolbox | - | - | ✔️ | ❌ 4 | - | - | macOS only |
| DXVA2 | - | - | - | - | ❌ 3 | - | Windows only |
| D3D11VA | - | - | - | - | ❌ 3 | - | Windows only |
| QSV | ❌ 3 | ❌ | ❌ | ❌ | ❌ | ❌ | Backend availability depends on the FFmpeg build |
This table should be read as a support map, not a guarantee that every backend is fully validated on every current OS/driver combination.
Hardware acceleration is useful for two main reasons:
However, hardware acceleration is not automatically faster for every file or on every system. The real result depends on codec support, driver quality, stream format, pixel format, resolution, frame rate, and how much CPU-side work still needs to happen after decode.
Today, hardware acceleration in libopenshot is primarily used for:
It is not currently used to keep the entire edit/render pipeline on the GPU. Decoded frames usually still need to be copied back into system memory for colorspace conversion, scaling, caching, effect processing, compositing, and timeline rendering.
That detail is important because it explains why hardware decode does not always produce a speedup.
The current decode flow looks roughly like this:
Settings::HARDWARE_DECODER.
If hardware decode fails during startup decode or frame transfer, libopenshot falls back to software decode for that reader instead of returning corrupt, green, or black frames.
Hardware decode is best-effort, not all-or-nothing.
If a hardware decoder is requested and one of the following happens early in the decode path:
libopenshot reopens that reader in software decode mode and continues decoding.
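The fallback described above can be pictured with a small, self-contained sketch. It is not libopenshot's implementation: `DecodeMode`, `OpenResult`, `OpenWithFallback`, and the `try_open` callback are hypothetical names, with `try_open` standing in for "open the reader and decode the first frames in this mode".

```cpp
#include <functional>

// Hypothetical decode modes; illustrative only, not libopenshot types.
enum class DecodeMode { Hardware, Software };

// Outcome of opening a reader with best-effort hardware decode.
struct OpenResult {
    bool opened;      // false only if software decode also failed
    DecodeMode mode;  // mode the reader ended up in (meaningful when opened)
    bool fell_back;   // hardware was requested, but software is in use
};

// Try hardware first when requested; on failure, retry in software.
// `try_open` returns true when the reader opens and early frames decode.
inline OpenResult OpenWithFallback(
    bool hardware_requested,
    const std::function<bool(DecodeMode)>& try_open)
{
    if (hardware_requested && try_open(DecodeMode::Hardware)) {
        return OpenResult{true, DecodeMode::Hardware, false};
    }
    if (try_open(DecodeMode::Software)) {
        return OpenResult{true, DecodeMode::Software, hardware_requested};
    }
    return OpenResult{false, DecodeMode::Software, hardware_requested};
}
```

In this model the caller sees either a working reader or a hard failure; the `fell_back` flag is what lets diagnostics tell the two successful outcomes apart.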
This behavior is intentionally conservative. The priority is correctness and stability:
For diagnostics and UI checks, this means there is a difference between:
FFmpegReader::HardwareDecodeSuccessful() exists to expose that distinction.
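As an illustration, an application could fold the two facts into a three-way status. The sketch below is self-contained and does not call libopenshot. `HwDecodeStatus`, `ClassifyHwDecode`, and `DescribeHwDecode` are hypothetical helpers, and the two booleans are assumed to come from the application's own settings check and from FFmpegReader::HardwareDecodeSuccessful() respectively.

```cpp
#include <string>

// Hypothetical status values for a diagnostics panel or log line.
enum class HwDecodeStatus { NotRequested, Active, FellBackToSoftware };

// `requested`: the application asked for a hardware decoder.
// `succeeded`: the reader reports that hardware decode is actually in use.
inline HwDecodeStatus ClassifyHwDecode(bool requested, bool succeeded) {
    if (!requested) {
        return HwDecodeStatus::NotRequested;
    }
    return succeeded ? HwDecodeStatus::Active
                     : HwDecodeStatus::FellBackToSoftware;
}

// Human-readable label for each status.
inline std::string DescribeHwDecode(HwDecodeStatus status) {
    switch (status) {
        case HwDecodeStatus::NotRequested:
            return "software decode (hardware not requested)";
        case HwDecodeStatus::Active:
            return "hardware decode active";
        case HwDecodeStatus::FellBackToSoftware:
            return "software decode (hardware requested, fell back)";
    }
    return "unknown";
}
```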
Hardware decode is not guaranteed to be faster than software decode.
In libopenshot's current pipeline, decoded frames are brought back to system memory immediately after decode. That introduces costs that can erase or outweigh the raw decode benefit:
Because of that, hardware decode performance is workload-dependent.
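To get a feel for the copy-back cost, the arithmetic below estimates how many bytes per second cross from GPU memory to system memory. It assumes tightly packed 8-bit 4:2:0 frames (1.5 bytes per pixel, as in NV12) with even dimensions; real surfaces may carry row padding or use other pixel formats, and the function names are illustrative.

```cpp
#include <cstdint>

// Bytes in one tightly packed 8-bit 4:2:0 frame: a full-resolution luma
// plane plus chroma at one quarter resolution for each of two components,
// i.e. 1.5 bytes per pixel. Assumes even width and height.
constexpr std::uint64_t Yuv420FrameBytes(std::uint64_t width,
                                         std::uint64_t height) {
    return width * height * 3 / 2;
}

// Bytes per second that must be transferred when every decoded frame
// is copied back to system memory.
constexpr std::uint64_t CopyBytesPerSecond(std::uint64_t width,
                                           std::uint64_t height,
                                           std::uint64_t fps) {
    return Yuv420FrameBytes(width, height) * fps;
}
```

For 3840×2160 at 60 fps this gives 12,441,600 bytes per frame, or roughly 746 MB/s of transfer before any colorspace conversion or scaling happens.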
General guidance:
Hardware acceleration should be treated as a capability that may help, not as a guarantee of better performance.
A file can fail on hardware decode for several reasons:
For example, consumer hardware decode paths often handle H.264 4:2:0 very well, but may not support H.264 4:2:2 decode reliably. In those cases, software decode may work perfectly while hardware decode fails.
Older historical note:
Because backend support has changed over time, always validate against the actual FFmpeg build and driver stack in use.
The following settings are used by libopenshot to enable, disable, and control hardware acceleration features.
VA-API is one of the primary Linux hardware-decode paths used by libopenshot. On supported Intel and AMD systems it can work well, but not every file format, codec profile, or pixel format is supported by every driver.
VDPAU has been supported historically, but its behavior varies with the driver and FFmpeg stack. Treat it as backend-dependent rather than universally reliable.
NVIDIA hardware encode support has historically been more reliable than decode support in libopenshot, depending on FFmpeg build and driver stack. Validate the actual runtime environment before assuming support.
VideoToolbox support exists, but stability and feature coverage should be tested carefully on the target FFmpeg/macOS version.
Windows decode backends are highly dependent on FFmpeg build options and device support. They should be treated as runtime-validated features, not assumptions.
If the computer has multiple graphics cards installed, libopenshot can choose which device should be used for decode and encode. This is currently practical mainly on Linux, where FFmpeg expects device names such as /dev/dri/renderD128.
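On Linux, DRM render nodes are numbered starting at 128, so the first GPU is typically /dev/dri/renderD128 and the second /dev/dri/renderD129. A minimal sketch of turning a zero-based GPU index into such a path follows. `RenderNodePath` is a hypothetical helper, not a libopenshot function, and the nodes actually present should be confirmed on the target system.

```cpp
#include <string>

// Map a zero-based GPU index to a Linux DRM render node path.
// Render node numbering starts at 128. Returns an empty string for a
// negative index; does not check that the node exists.
inline std::string RenderNodePath(int gpu_index) {
    if (gpu_index < 0) {
        return std::string();
    }
    return "/dev/dri/renderD" + std::to_string(128 + gpu_index);
}
```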
Contributions are welcome for improving cross-platform device enumeration and selection.
When validating hardware decode, check both:
A frame that looks correct is not enough to prove that hardware acceleration worked, because software fallback may have rescued the decode.
Recommended validation:
The biggest architectural limitation today is that decoded frames are generally copied back to CPU memory for the rest of the pipeline.
Longer-term improvements could include:
Avoiding repeated GPU-to-CPU and CPU-to-GPU copies would make hardware acceleration much more effective for end-to-end editing and export workflows.
Hardware acceleration support changes with FFmpeg, drivers, operating systems, and GPU generations. If you find incorrect information or validate a backend on a newer stack, please update this document.
A big thanks to Peter M (https://github.com/eisneinechse) for all his work on integrating hardware acceleration into libopenshot. The community thanks you for this major contribution.