Upscaling Models
Recaster includes two families of AI upscaling models: Real-ESRGAN (GAN-based) and SwinIR (Transformer-based). Each offers different quality and speed trade-offs.
Model Comparison
The table below lists every upscaling model available in Recaster, along with its tier, scale variant, quality rating, and a brief description. Recaster downloads each model automatically from Hugging Face the first time you use it.
| Name | Tier | Variants | Quality | Description |
|---|---|---|---|---|
| Real-ESRGAN 2x | Free | 2x | 9.2 / 10 | Fast GAN-based upscaling. Good balance of speed and quality for quick previews and general use. |
| Real-ESRGAN 4x+ | Free | 4x | 9.2 / 10 | Enhanced GAN upscaling with the Real-ESRGAN-x4plus architecture. Strong detail recovery at 4x scale. |
| SwinIR 2x | Free | 2x | 9.7 / 10 | Transformer-based architecture delivering the highest quality output. Best choice for professional work. |
| SwinIR 4x | Free | 4x | 9.5 / 10 | Direct 4x upscaling with SwinIR transformer. Excellent quality with a single pass. |
| Real-ESRGAN 8x | Studio | 8x | 9.0 / 10 | Maximum resolution upscaling for Studio tier. Produces extremely large output from low-resolution sources. |
Real-ESRGAN
Real-ESRGAN is a Generative Adversarial Network (GAN) designed for practical image restoration and super-resolution. It excels at handling real-world degradation such as compression artifacts, blur, and noise. The 2x and 4x ONNX models used in Recaster are sourced from the qualcomm/Real-ESRGAN-x4plus repository on Hugging Face; the 8x model comes from facefusion/models.
Key Characteristics
- GAN architecture produces sharp, visually appealing results
- 67 MB model file for both 2x and 4x variants
- Supports tile-based processing for large frames
- CoreML acceleration on macOS for fast local processing
- 8x variant (70 MB) available exclusively on the Studio tier
SwinIR
SwinIR uses a Swin Transformer architecture that captures long-range dependencies in images, producing the highest-quality upscaling results available in Recaster. The quality advantage is most noticeable on complex textures, fine hair detail, and skin tones. Models are sourced from the lixinze/swinir repository on Hugging Face.
Key Characteristics
- Transformer architecture with the highest quality rating in Recaster (9.7/10 at 2x)
- 67 MB model file for both 2x and 4x variants
- Best choice for professional output where quality matters most
- Supports tile-based processing for large frames
- Requires a Hugging Face access token for download
Hugging Face Authentication
SwinIR models are hosted in a gated repository that requires Hugging Face authentication. You must add a Hugging Face access token to your Recaster settings before downloading SwinIR models.
- Visit huggingface.co/settings/tokens and create a new access token (or copy an existing one).
- Open your Recaster settings file. On macOS this is located at ~/Library/Application Support/Recaster/settings.json.
- Add an hf_token field with your token value:
{
  "hf_token": "hf_YOUR_TOKEN_HERE",
  ...
}
- Restart Recaster after adding the token. SwinIR models will download automatically when you first select them.
Real-ESRGAN does not require authentication
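If you prefer to script the settings change described above instead of editing the file by hand, the following is a minimal sketch. It assumes settings.json is plain JSON and that Recaster tolerates the file being rewritten with different whitespace; neither is confirmed here. The helper name set_hf_token is hypothetical and is not part of Recaster.

```python
import json
from pathlib import Path


def set_hf_token(settings_path: Path, token: str) -> dict:
    """Add or update the hf_token field, keeping every other setting intact.

    Hypothetical helper: assumes the settings file is plain JSON.
    """
    settings = {}
    if settings_path.exists():
        # Load the existing settings so other keys are preserved.
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings["hf_token"] = token
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Example call for the macOS path given above (uncomment to run):
# set_hf_token(
#     Path.home() / "Library" / "Application Support" / "Recaster" / "settings.json",
#     "hf_YOUR_TOKEN_HERE",
# )
```

Close Recaster before running a script like this, then start it again so the new token is picked up.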
Model Cache Location
Downloaded models are cached locally so they only need to be downloaded once. The cache location depends on your operating system:
| Platform | Cache Path |
|---|---|
| macOS | ~/Library/Application Support/Recaster/models/ |
| Windows | %APPDATA%\Recaster\models\ |
| Linux | ~/.config/Recaster/models/ |
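To locate the cache from a script (for example, to check how much disk space the models use), the paths in the table can be resolved as in the sketch below. The function name model_cache_dir is hypothetical, and the Linux branch uses ~/.config literally as the table states; whether Recaster honors XDG_CONFIG_HOME is not documented here.

```python
import os
import sys
from pathlib import Path
from typing import Optional


def model_cache_dir(platform: Optional[str] = None,
                    home: Optional[str] = None,
                    appdata: Optional[str] = None) -> Path:
    """Return the model cache directory listed in the table above.

    Hypothetical helper; the arguments exist only so the paths can be
    resolved for a platform other than the one running the script.
    """
    platform = platform or sys.platform
    home_dir = Path(home) if home else Path.home()
    if platform == "darwin":
        return home_dir / "Library" / "Application Support" / "Recaster" / "models"
    if platform.startswith("win"):
        # %APPDATA% is read from the environment unless passed explicitly.
        base = appdata or os.environ["APPDATA"]
        return Path(base) / "Recaster" / "models"
    # Everything else is treated as Linux.
    return home_dir / ".config" / "Recaster" / "models"
```

Calling model_cache_dir() with no arguments resolves the path for the current machine.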
macOS Compatibility
On macOS, ONNX Runtime uses Apple's CoreML framework for hardware acceleration. Real-ESRGAN works well with CoreML and processes at approximately 10 to 20 FPS on Apple Silicon. However, SwinIR has a known compatibility issue with CoreML that causes it to fall back to CPU-only processing on macOS.
SwinIR on macOS
Choosing a Model
Use the guidelines below to pick the right model for your project:
Choose Real-ESRGAN when...
- Speed is more important than maximum quality
- You are processing on macOS locally
- You want quick previews before final output
- Your source has heavy compression artifacts
- You need 8x upscaling (Studio tier)
Choose SwinIR when...
- Quality is the top priority
- You have a dedicated NVIDIA GPU or use remote upscaling
- You are working on professional or final-delivery output
- Fine detail such as hair and skin texture matters
- You are upscaling high-quality source material
Recommended for most users
GPU Requirements
Both model families use ONNX Runtime for inference. On systems with an NVIDIA GPU, the CUDA Execution Provider is used automatically. On macOS, CoreML provides hardware acceleration for compatible models. If no GPU is available, processing falls back to CPU.
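The fallback order described above can be illustrated with ONNX Runtime's execution provider names. This is a sketch of the general pattern, not Recaster's actual selection logic; the choose_providers helper and the model path in the comments are assumptions made for illustration.

```python
from typing import List, Sequence

# Execution provider names as registered by ONNX Runtime, most preferred first.
PREFERENCE = (
    "CUDAExecutionProvider",    # NVIDIA GPUs
    "CoreMLExecutionProvider",  # macOS hardware acceleration
    "CPUExecutionProvider",     # always-available fallback
)


def choose_providers(available: Sequence[str]) -> List[str]:
    """Order the available providers by preference, always ending with CPU."""
    chosen = [name for name in PREFERENCE if name in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# With ONNX Runtime installed, the list would typically be used like this
# ("model.onnx" is a placeholder path):
#
#   import onnxruntime as ort
#   providers = choose_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

ONNX Runtime tries the providers in the order given, so listing the CPU provider last keeps it as the fallback.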
GPU memory usage is approximately 2 GB for 4x upscaling with the default tile size of 256 pixels. You can adjust the tile size and GPU memory limit in the pipeline settings to fit your hardware.
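To get a feel for how tile size affects the amount of work per frame, the sketch below splits a frame into non-overlapping tiles and counts them. It is a simplification: tiled upscalers often overlap or pad tiles to hide seams, and whether Recaster does so is not stated here. The function names are hypothetical.

```python
import math
from typing import Iterator, Tuple


def tile_boxes(width: int, height: int, tile: int = 256) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (left, top, right, bottom) boxes covering the frame without overlap."""
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped to the frame boundary.
            yield left, top, min(left + tile, width), min(top + tile, height)


def tile_count(width: int, height: int, tile: int = 256) -> int:
    """Number of tiles needed to cover a frame of the given size."""
    return math.ceil(width / tile) * math.ceil(height / tile)


print(tile_count(1920, 1080))       # 8 columns x 5 rows = 40 tiles
print(tile_count(1920, 1080, 512))  # 4 columns x 3 rows = 12 tiles
```

A smaller tile size means more tiles per frame but less memory per tile, which is the trade-off the tile size setting controls.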
Model Sources
All models are hosted on Hugging Face and verified for compatibility with Recaster:
- Real-ESRGAN 2x/4x: qualcomm/Real-ESRGAN-x4plus
- SwinIR 2x/4x: lixinze/swinir (requires authentication)
- Real-ESRGAN 8x: facefusion/models