GPU Requirements

VRAM requirements, supported GPU architectures, driver compatibility, and platform-specific GPU notes.

Recaster uses GPU acceleration for face swapping, enhancement, training, and video upscaling. While CPU processing is supported as a fallback, a dedicated NVIDIA GPU is strongly recommended for reasonable performance.

VRAM by Operation

The table below shows the approximate GPU memory (VRAM) required for each operation. These are estimates and may vary depending on input resolution and model settings.

| Operation | Minimum VRAM | Recommended VRAM | Notes |
|---|---|---|---|
| Face Swapping | 2 GB | 4 GB | InSwapper runs well on 2 GB; Ghost benefits from 4 GB |
| Face Enhancement | 2 GB | 4 GB | GPEN-2048 may need 6 GB at full resolution |
| Training (Quick96) | 4 GB | 6 GB | Smaller architecture, faster iterations |
| Training (SAEHD) | 6 GB | 8-12 GB | Higher dimensions require more VRAM |
| Training (AMP) | 6 GB | 8-12 GB | Similar to SAEHD requirements |
| Training (XSeg) | 4 GB | 6 GB | Mask training is less VRAM-intensive |
| Video Upscaling (2x) | 2 GB | 4 GB | Tile-based processing limits VRAM usage |
| Video Upscaling (4x/8x) | 2 GB | 4-6 GB | Multi-pass 2x uses same VRAM as single 2x |
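
As a rough illustration of how the table can be applied, the sketch below classifies an amount of available VRAM against the minimum and recommended figures. The dictionary keys and the function name are hypothetical and are not part of Recaster; where the table gives a range, the lower bound is used as the recommended value.

```python
# Minimum and recommended VRAM in GB, copied from the table above.
# Keys are hypothetical labels chosen for this example only.
# Where the table gives a range (e.g. 8-12 GB), the lower bound is used.
VRAM_GB = {
    "face_swapping": (2, 4),
    "face_enhancement": (2, 4),
    "training_quick96": (4, 6),
    "training_saehd": (6, 8),
    "training_amp": (6, 8),
    "training_xseg": (4, 6),
    "upscaling_2x": (2, 4),
    "upscaling_4x_8x": (2, 4),
}


def classify_vram(operation: str, available_gb: float) -> str:
    """Compare available VRAM with the table's figures for one operation."""
    minimum, recommended = VRAM_GB[operation]
    if available_gb < minimum:
        return "below minimum"
    if available_gb < recommended:
        return "meets minimum"
    return "meets recommended"
```

For example, `classify_vram("training_saehd", 6)` reports that a 6 GB card meets the minimum for SAEHD training but not the recommended figure. Because the table values are estimates, treat the result as a guide only.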

Studio Tier Cloud GPUs

If your local GPU does not meet the minimum requirements, Studio tier users can offload processing to cloud GPUs via Vast.ai. Cloud instances typically offer 24-48 GB VRAM (RTX 3090, A100, etc.).

Driver Requirements

Recaster requires compatible NVIDIA drivers for GPU acceleration. Keep your drivers updated to the latest stable version for best compatibility.

NVIDIA CUDA Drivers

  • NVIDIA driver version 535+ recommended
  • CUDA 11.8 or 12.x supported
  • Download from nvidia.com/drivers
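
One way to check the installed driver against the 535+ recommendation is to read the version reported by `nvidia-smi` and compare its major component. This is a minimal sketch, not Recaster's own check; the function names are made up for the example, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess

# Recommended minimum driver branch from the list above.
MIN_DRIVER_MAJOR = 535


def driver_major(version: str) -> int:
    """Return the major component of a version string such as '535.104.05'."""
    return int(version.strip().split(".")[0])


def driver_is_recommended(version: str) -> bool:
    """True when the driver is on the 535 branch or newer."""
    return driver_major(version) >= MIN_DRIVER_MAJOR


def query_driver_version() -> str:
    """Ask nvidia-smi for the driver version of the first GPU.

    Raises an exception if nvidia-smi is missing or exits with an error.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return out.splitlines()[0].strip()
```

On a machine with an NVIDIA GPU, `driver_is_recommended(query_driver_version())` gives a quick yes or no answer.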

cuDNN (Remote Instances)

Remote training and processing instances require cuDNN 9 for ONNX Runtime GPU acceleration. This is automatically installed during provisioning, but can be manually installed if needed:

pip install nvidia-cudnn-cu12==9.1.0.70

ONNX Runtime Compatibility

Recaster uses ONNX Runtime for model inference. Specific version pinning is required for reliable operation.

Version Pinning Required

ONNX Runtime GPU must be version 1.19.2. Versions 1.23 and above have known compatibility issues with driver 535.x and certain CUDA configurations. Do not upgrade ONNX Runtime until a newer version has been verified as compatible.

| Package | Required Version | Notes |
|---|---|---|
| onnxruntime-gpu | 1.19.2 | NOT 1.23+ (compatibility issues) |
| NumPy | <2.0.0 | e.g. 1.26.4 (NumPy 2.0 breaks ONNX Runtime) |
| OpenCV | 4.5 - 4.10 | Use opencv-python, not opencv-python-headless |

If you encounter version conflicts on a remote instance, use the following commands to force correct versions:

pip install --force-reinstall onnxruntime-gpu==1.19.2
pip install --force-reinstall "numpy>=1.23.0,<2.0.0"
pip uninstall -y opencv-python-headless
pip install --no-deps "opencv-python>=4.5.0,<4.11.0"
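
Before forcing a reinstall, it can help to see which packages are actually out of range. The sketch below compares installed versions with the pins used in the commands above. The helper names are hypothetical, and the version parser is deliberately simple: it only reads the first three numeric components.

```python
from importlib import metadata


def parse(version: str) -> tuple:
    """Turn '1.26.4' or '4.10.0.84' into a tuple of up to three integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Version ranges taken from the pip commands above.
PINS = {
    "onnxruntime-gpu": lambda v: v == (1, 19, 2),
    "numpy": lambda v: (1, 23, 0) <= v < (2, 0, 0),
    "opencv-python": lambda v: (4, 5, 0) <= v < (4, 11, 0),
}


def check_pins(installed: dict) -> dict:
    """Map each pinned package to True/False, or None when it is not installed."""
    result = {}
    for name, in_range in PINS.items():
        version = installed.get(name)
        result[name] = None if version is None else in_range(parse(version))
    return result


def installed_versions() -> dict:
    """Collect the installed versions of the pinned packages, skipping missing ones."""
    found = {}
    for name in PINS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found
```

Running `check_pins(installed_versions())` on the instance returns one entry per package. Any `False` or `None` value points at a package that the commands above would correct.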

RTX 5000 Series (Blackwell)

RTX 5000 Series Limitations

NVIDIA RTX 5090, 5080, 5070, and 5060 GPUs use the Blackwell architecture (SM 12.0). TensorFlow dropped native Windows GPU support after version 2.10, and current TensorFlow wheels only support up to compute capability 9.0 (RTX 40 series GPUs are compute capability 8.9). As a result, local DeepFaceLab training does not work on RTX 5000 series GPUs.

Recaster shows a warning dialog on startup when an RTX 5000 series GPU is detected. The following workarounds are available:

  • Remote Training (Recommended) — Use Studio tier cloud GPUs via Vast.ai. Works immediately with no local GPU constraints.
  • WSL2 — Use Windows Subsystem for Linux with a community fork that supports Blackwell GPUs.
  • Quick Recast and Upscaling — ONNX Runtime-based operations (face swapping, enhancement, upscaling) may work with RTX 5000 GPUs since ONNX Runtime updates CUDA support more frequently than TensorFlow.
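
The limitation described in this section comes down to a compute capability comparison, which can be sketched as follows. This is not how Recaster detects the GPU; the function names are hypothetical, and the 9.0 ceiling is simply the figure quoted above. The capability string might be obtained from `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, assuming a driver recent enough to expose that field.

```python
# Highest compute capability that, according to this section,
# current TensorFlow wheels support.
MAX_TF_COMPUTE_CAP = (9, 0)


def parse_compute_cap(text: str) -> tuple:
    """Turn a string such as '12.0' into the tuple (12, 0)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def local_training_supported(compute_cap: str) -> bool:
    """True when the GPU's compute capability is within the TensorFlow ceiling."""
    return parse_compute_cap(compute_cap) <= MAX_TF_COMPUTE_CAP
```

With these assumptions, an RTX 40 series card reporting `8.9` passes the check, while a Blackwell card reporting `12.0` does not and would need one of the workarounds listed above.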

Dismissing the Warning

To suppress the RTX 5000 compatibility warning, check "Don't show again" in the dialog. This saves gpu_compatibility_warning_dismissed: true in your settings file.

macOS GPU Support

macOS does not support NVIDIA CUDA. GPU acceleration on Mac uses Apple's CoreML framework where available, with automatic fallback to CPU for unsupported operations.

| Operation | macOS Support | Notes |
|---|---|---|
| Real-ESRGAN | CoreML Accelerated | 10-20 FPS, recommended for Mac users |
| SwinIR | CPU Fallback | 2-5 FPS on CPU. CoreML does not support SwinIR dynamic shapes. |
| Face Swapping | Supported | ONNX Runtime CPU provider, adequate for most use cases |
| DFL Training | Limited | CPU only, very slow. Use remote training for practical results. |

Best Practice for Mac Users

Use Real-ESRGAN for local upscaling and leverage Studio tier remote processing for training and SwinIR upscaling. Apple Silicon Macs handle face swapping and enhancement at acceptable speeds for most workflows.
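
The fallback behavior described in this section resembles the usual ONNX Runtime pattern of passing an ordered list of execution providers. The sketch below shows that general pattern only; the preference order and the `pick_providers` helper are assumptions for illustration, and Recaster's actual per-operation selection may differ (for example, the table above lists face swapping on macOS as using the CPU provider).

```python
# Preference order assumed for this example: CUDA first, then CoreML, then CPU.
PREFERRED = [
    "CUDAExecutionProvider",
    "CoreMLExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: list) -> list:
    """Keep the preferred providers that are available, always ending with CPU."""
    chosen = [name for name in PREFERRED if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# Typical use with ONNX Runtime (not executed here; "model.onnx" is a placeholder):
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

ONNX Runtime tries the providers in the order given, so listing the CPU provider last keeps it as the fallback when an accelerated provider cannot handle a model.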

Supported GPU Families

The following NVIDIA GPU families are tested and supported for local processing:

  • RTX 40 Series — Full support. Best local performance.
  • RTX 30 Series — Full support. Excellent performance.
  • RTX 20 Series — Full support. Good performance.
  • GTX 16 Series — Supported. Limited VRAM may restrict training.
  • GTX 10 Series — Basic support. Face swapping works, training may be slow.
  • RTX 50 Series — Partial support. See RTX 5000 section above.

AMD and Intel GPUs

AMD and Intel GPUs are not currently supported for GPU acceleration. CPU processing is used automatically when no compatible NVIDIA GPU is detected. Studio tier cloud GPUs provide an alternative for users without NVIDIA hardware.