Remote Training
Train on cloud GPUs via Vast.ai with live preview streaming, multi-session support, and budget controls.
Studio Feature
Overview
Remote training connects Recaster to cloud GPU instances on Vast.ai, a marketplace for renting GPU compute by the hour. This gives you access to high-end GPUs like the RTX 3090, RTX 4090, and A100 at a fraction of the purchase cost.
Recaster handles the full remote training lifecycle: provisioning instances, uploading your dataset, starting training, streaming live previews back to your desktop, and syncing results. You interact with the same Training widget interface -- the only difference is where the computation happens.
Vast.ai Setup
Before using remote training, you need to configure your Vast.ai account and SSH credentials in Recaster.
Create a Vast.ai account
Generate an SSH key
Add your SSH key to Vast.ai
Enter your API key
SSH Key Location
SSH keys are stored in ~/Library/Application Support/Recaster/ssh/. The private key must have 600 permissions (read/write for owner only). Recaster sets this automatically when generating keys.
Launching Instances
Once your Vast.ai account is configured, you can browse available GPU offers and launch instances directly from Recaster's Remote panel.
Instance Browser
The instance browser shows available GPU machines sorted by price per hour. Each listing includes:
- GPU model and VRAM (e.g., RTX 3090 24GB)
- Price per hour
- Available disk space
- Network speed
- Reliability rating
Recommended GPUs for Training
| GPU | VRAM | Typical Price | Best For |
|---|---|---|---|
| RTX 3090 | 24 GB | $0.20-0.40/hr | General training, great value |
| RTX 4090 | 24 GB | $0.40-0.80/hr | Fast training, high-res models |
| A100 40GB | 40 GB | $0.60-1.20/hr | Large models, high resolution |
| A100 80GB | 80 GB | $1.00-2.00/hr | Maximum resolution and batch size |
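To compare options before launching, the hourly ranges in the table can be turned into a rough per-run estimate. The sketch below is illustrative only: `PRICE_RANGES` copies the typical prices from the table, and `estimate_cost` is a hypothetical helper, not part of Recaster. Actual marketplace prices vary by offer.

```python
# Typical hourly price ranges in USD, copied from the table above.
PRICE_RANGES = {
    "RTX 3090": (0.20, 0.40),
    "RTX 4090": (0.40, 0.80),
    "A100 40GB": (0.60, 1.20),
    "A100 80GB": (1.00, 2.00),
}


def estimate_cost(gpu: str, hours: float) -> tuple[float, float]:
    """Return the (low, high) cost estimate in USD for a run lasting `hours`."""
    low, high = PRICE_RANGES[gpu]
    return round(low * hours, 2), round(high * hours, 2)


low, high = estimate_cost("RTX 3090", 12)
print(f"12 h on an RTX 3090: ${low:.2f}-${high:.2f}")
```

A faster GPU with a higher hourly rate can still be cheaper overall if it finishes the same number of iterations in fewer hours, so it helps to compare per-run totals rather than hourly prices.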
Cost-Effective Training
File Synchronization
Recaster uses rsync over SSH to synchronize your project files between your local machine and the remote instance. This includes uploading face datasets, model files, and configuration, as well as downloading training results.
Sync Workflow
- Initial upload -- When you start remote training, Recaster uploads your source and destination face datasets to the instance. A progress bar shows the upload status.
- Incremental sync -- After the initial upload, only changed files are synced. This makes subsequent syncs much faster.
- Result download -- When training is complete, sync the trained model files back to your local machine for merging.
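An upload step of this kind can be sketched as an rsync-over-SSH invocation. This is not Recaster's actual command: the host, port, key path, directory names, and the `root` user are placeholders, and `build_rsync_command` is a hypothetical helper. The flags themselves (`-a`, `-z`, `--partial`, `-e`) are standard rsync options.

```python
import shlex


def build_rsync_command(local_dir, remote_dir, host, port, key_path, user="root"):
    """Build an rsync argv list that uploads local_dir to remote_dir over SSH.

    All argument values are placeholders; `user="root"` is an assumption.
    """
    ssh_cmd = f"ssh -p {port} -i {shlex.quote(key_path)}"
    return [
        "rsync",
        "-a",           # archive mode: recurse, preserve times and permissions
        "-z",           # compress data in transit
        "--partial",    # keep partially transferred files so a retry can resume
        "-e", ssh_cmd,  # run the transfer through SSH with this port and key
        local_dir.rstrip("/") + "/",  # trailing slash: copy the directory's contents
        f"{user}@{host}:{remote_dir}",
    ]


cmd = build_rsync_command(
    "workspace/data_src", "/workspace/data_src",
    "203.0.113.10", 40022, "/path/to/id_ed25519",
)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

By default rsync skips files whose size and modification time already match on both sides, which is what makes a second run over the same directory incremental. Swapping the source and destination arguments gives the download direction.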
Sync Panel
The Sync tab in the Project Panel provides a visual interface for file synchronization. It shows the sync status of each directory (source faces, destination faces, model files) with color-coded indicators:
- Green -- Fully synced, local and remote files match.
- Yellow -- Local files are ahead of remote (upload needed).
- Orange -- Remote files are ahead of local (download needed).
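One way to think about these indicators is as a comparison of two file manifests. The sketch below is a simplified model, not Recaster's implementation: it assumes each side is described by a `{relative_path: modification_time}` dictionary, and the `"diverged"` result for the case where both sides have newer files is an assumption, since the indicators above do not cover it.

```python
def sync_status(local: dict[str, float], remote: dict[str, float]) -> str:
    """Classify a directory from {relative_path: mtime} manifests of each side."""
    # A side is "ahead" if it has a file that is newer than, or missing from, the other.
    local_ahead = any(
        mtime > remote.get(path, float("-inf")) for path, mtime in local.items()
    )
    remote_ahead = any(
        mtime > local.get(path, float("-inf")) for path, mtime in remote.items()
    )
    if local_ahead and remote_ahead:
        return "diverged"  # assumption: not one of the documented indicators
    if local_ahead:
        return "yellow"    # upload needed
    if remote_ahead:
        return "orange"    # download needed
    return "green"         # local and remote match


print(sync_status({"model.dat": 200.0}, {"model.dat": 100.0}))
```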
Upload Speed
Live Preview Streaming
Real-time preview streaming from remote training sessions is one of the most powerful Studio features. Instead of waiting for training to complete and downloading the model, you can watch the face swap quality improve while training runs.
How Streaming Works
Recaster deploys a lightweight streaming server to the remote instance alongside the training process. Each preview frame then travels from the instance to your desktop as follows:
- The server captures the DFL training preview window using X11 screenshots.
- It crops the header and footer to isolate the face grid.
- It encodes the frame as JPEG and sends it via Server-Sent Events (SSE).
- An SSH tunnel forwards the stream to your local machine.
- Recaster receives the frames and displays them in the preview canvas.
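The SSE step can be illustrated with a small sketch. SSE is a text protocol, so binary JPEG data has to be text-encoded before it is sent. Base64 is assumed here, and the `preview` event name and both function names are hypothetical; Recaster's actual wire format is not documented on this page.

```python
import base64


def encode_sse_frame(jpeg_bytes: bytes, frame_id: int) -> bytes:
    """Wrap one JPEG frame as a Server-Sent Events message."""
    payload = base64.b64encode(jpeg_bytes).decode("ascii")
    # An SSE message is a set of "field: value" lines terminated by a blank line.
    return f"id: {frame_id}\nevent: preview\ndata: {payload}\n\n".encode("utf-8")


def decode_sse_frame(message: bytes) -> tuple[int, bytes]:
    """Recover (frame_id, jpeg_bytes) from a message built by encode_sse_frame."""
    fields = {}
    for line in message.decode("utf-8").splitlines():
        if line:  # skip the blank terminator line
            name, _, value = line.partition(": ")
            fields[name] = value
    return int(fields["id"]), base64.b64decode(fields["data"])


# Round trip with placeholder bytes standing in for a real JPEG frame.
message = encode_sse_frame(b"\xff\xd8placeholder\xff\xd9", frame_id=7)
frame_id, frame = decode_sse_frame(message)
print(frame_id, len(frame))
```

Because SSE runs over plain HTTP, a stream like this passes through an SSH port forward without any special handling on the client side.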
Streaming adds minimal overhead to the training process. Preview frames update every few seconds, and the stream automatically adapts to network conditions.
Preview Features
- All 9 preview views available with Space/Shift+Space navigation.
- 4-column by 2-row grid layout (same as local preview).
- Loss values and iteration count streamed alongside preview frames.
- Adaptive quality adjusts JPEG compression based on network latency.
Enable Live Preview
Multi-Session Training
Studio users can run multiple training sessions concurrently on separate GPU instances. This is useful when you need to train models for different face pairs simultaneously, or when you want to compare different configurations in parallel.
Session Management
The Multi-Session panel provides an overview of all active training sessions:
- Session cards -- Each active session is displayed as a card showing the project name, model type, current iteration, loss values, and GPU instance details.
- Quick actions -- Pause, resume, or stop any session from the card controls.
- Switch preview -- Click a session card to view its live preview in the main canvas.
Port Allocation
Each concurrent session uses a unique port for SSH tunneling and preview streaming. Recaster automatically allocates ports in the range 8765-8769, supporting up to 5 simultaneous sessions.
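The allocation rule can be sketched as a small first-free-port allocator. Only the port range and the five-session limit come from the text above; the `PortAllocator` class, its method names, and the first-free strategy are assumptions made for illustration.

```python
PORT_RANGE = range(8765, 8770)  # 8765-8769 inclusive: five ports


class PortAllocator:
    """Hand out one port per session from a fixed range (illustrative only)."""

    def __init__(self) -> None:
        self._in_use: dict[str, int] = {}  # session id -> allocated port

    def acquire(self, session_id: str) -> int:
        """Return the session's port, allocating the lowest free one if needed."""
        if session_id in self._in_use:
            return self._in_use[session_id]
        taken = set(self._in_use.values())
        for port in PORT_RANGE:
            if port not in taken:
                self._in_use[session_id] = port
                return port
        raise RuntimeError("all 5 session ports are in use")

    def release(self, session_id: str) -> None:
        """Free the session's port so a later session can reuse it."""
        self._in_use.pop(session_id, None)


allocator = PortAllocator()
print(allocator.acquire("session-a"), allocator.acquire("session-b"))
```

A sixth concurrent `acquire` raises an error in this sketch, which mirrors the five-session ceiling: a new session can only start once an existing one releases its port.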
Cost Tracking
Remote training costs money, so Recaster includes built-in cost tracking and budget management to help you stay within your spending limits.
Budget Configuration
Set a spending budget in the Budget Configuration dialog:
- Daily budget -- Maximum spend per 24-hour period. Instances are paused when the limit is reached.
- Monthly budget -- Maximum total spend per month. A warning appears when approaching the limit.
- Per-session limit -- Cap the cost of any single training session.
Spending Alerts
Recaster provides proactive cost alerts:
- 80% warning -- A yellow banner appears when you have used 80% of your configured budget.
- 100% action -- When the budget limit is reached, running instances are automatically paused. You can increase the budget to continue or stop instances to save the remaining balance.
- Session cost display -- Each session card shows its accumulated cost and per-hour rate.
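The two thresholds amount to a simple comparison of spend against budget. The sketch below is a minimal illustration, assuming spend and budget are tracked in the same currency; `budget_action` and its return values are hypothetical names, not Recaster's API.

```python
def budget_action(spent: float, budget: float) -> str:
    """Map current spend to an action using the 80% and 100% thresholds."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    if spent >= budget:
        return "pause"  # 100% reached: running instances are paused
    if spent >= 0.8 * budget:
        return "warn"   # 80% reached: show the warning banner
    return "ok"


for spent in (5.00, 8.50, 10.00):
    print(f"${spent:.2f} of $10.00 -> {budget_action(spent, 10.00)}")
```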
Instance Billing
Project Panel Integration
Remote training integrates with the Project Panel through the Local/Remote toggle switch. When enabled, the Project Panel shows three tabs:
Local Tab
Your local project files. Always visible regardless of mode.
Remote Tab
Browse the remote instance filesystem. Navigate directories, view files, and verify that datasets uploaded correctly.
Sync Tab
Upload and download files between local and remote. Sync status indicators show which directories are up to date.
Instance Association
Each Recaster project remembers which remote instance it is associated with. When you toggle to Remote mode:
- If one instance is running, it is automatically associated.
- If multiple instances are running, a selection dialog appears.
- If no instances are running, you are prompted to launch one.
- The association is saved in the project state and restored on next launch.
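The first three rules reduce to a decision on the number of running instances. The sketch below models only that decision. The function name and the action strings are hypothetical, and saving or restoring the association in project state is left out.

```python
from typing import Optional


def resolve_instance(running_ids: list[str]) -> tuple[str, Optional[str]]:
    """Decide what happens when the project is toggled to Remote mode.

    Returns (action, instance_id); instance_id is set only for auto-association.
    """
    if len(running_ids) == 1:
        return "associate", running_ids[0]   # exactly one: pick it automatically
    if len(running_ids) > 1:
        return "show_selection_dialog", None  # several: the user chooses
    return "prompt_launch", None              # none: offer to launch one


print(resolve_instance(["instance-1"]))
```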
Model Versioning
Studio users have access to model version snapshots, which let you save the current state of a training model and restore it later. This is particularly useful for remote training, where you may want to:
- Save a checkpoint before changing training parameters.
- Compare models from different training stages to find the sweet spot.
- Roll back to a previous state if training quality degrades.
- Share model snapshots between local and remote environments.
Snapshot Before Experimenting
Troubleshooting
SSH connection fails
Verify your SSH key has correct permissions (chmod 600). Confirm the key is added to your Vast.ai account. Test the connection manually in a terminal.
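The permission check can also be scripted. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the key path is whatever file you pass in. It treats a key as acceptable when group and others have no access at all, which mode 600 satisfies.

```python
import os
import stat


def key_permissions_ok(key_path: str) -> bool:
    """Return True if group and others have no permissions on the key file."""
    mode = stat.S_IMODE(os.stat(key_path).st_mode)
    return mode & 0o077 == 0


def fix_key_permissions(key_path: str) -> None:
    """Set owner read/write only: the equivalent of `chmod 600 key_path`."""
    os.chmod(key_path, 0o600)
```

Call `key_permissions_ok` with the path to your private key, and run `fix_key_permissions` on it if the check returns `False`.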
Live preview not showing
Check that the streaming server is running on the remote instance. The server is deployed automatically on first use. If the preview remains blank, the SSH tunnel may have disconnected -- try toggling Live off and on again.
rsync progress not showing on macOS
The built-in macOS rsync is version 2.6.9, which lacks modern progress flags such as --info=progress2. Install the latest version via Homebrew: brew install rsync.
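To confirm which rsync is on your PATH, you can parse the first line of `rsync --version`. The sketch below assumes the progress flag in question is `--info=progress2`, which rsync gained in version 3.1.0. Whether Recaster relies on that exact flag is an assumption, and both function names are hypothetical.

```python
import re


def parse_rsync_version(version_output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from the output of `rsync --version`."""
    match = re.search(r"version\s+(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("could not find an rsync version number")
    return tuple(int(part) for part in match.groups())


def supports_progress2(version_output: str) -> bool:
    """True if this rsync is new enough for --info=progress2 (3.1.0 or later)."""
    return parse_rsync_version(version_output) >= (3, 1, 0)


# The macOS built-in version named above.
print(supports_progress2("rsync  version 2.6.9  protocol version 29"))
# To check the live binary:
#   subprocess.run(["rsync", "--version"], capture_output=True, text=True).stdout
```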
Instance terminated unexpectedly
Vast.ai instances can be interrupted by the provider. Your training history and model auto-saves are preserved. Launch a new instance, sync your project files, and resume training.