Video Processing

Process full videos with real-time preview and audio preservation.

Quick Recast processes videos frame by frame, applying face replacement to every frame in the video. This guide covers video-specific features including input formats, trimming, audio handling, progress tracking, and output configuration.

Video Input

To process a video, drag a video file onto the Target drop zone in the Quick Recast interface. You can also click the drop zone to open a file browser.

Supported Formats

Quick Recast supports the following video formats:

Format        | Extension | Notes
--------------|-----------|------------------------------------------------------
MPEG-4        | .mp4      | Most common format, recommended for best compatibility
AVI           | .avi      | Widely supported, larger file sizes
QuickTime     | .mov      | Common on macOS, Apple ecosystem
Matroska      | .mkv      | Open format, supports multiple tracks
Windows Media | .wmv      | Windows platform format
Flash Video   | .flv      | Legacy web video format

MP4 Recommended

For the best results and compatibility, use MP4 (H.264) as both your input and output format. It provides a good balance of quality, file size, and broad platform support.

Video Trimming

You do not need to process an entire video. Quick Recast includes a built-in video trimming tool that lets you select a specific range of frames to process.

Trim Range Slider

When a video is loaded, a dual-handle range slider appears below the target preview. The slider represents the full duration of the video:

  • Left handle: Sets the start frame. Drag it right to skip the beginning of the video.
  • Right handle: Sets the end frame. Drag it left to skip the end of the video.
  • Timecodes: The start and end timecodes are displayed above the slider, showing the exact position in hours:minutes:seconds:frames format.

Only the frames within the selected range will be processed. This is useful for focusing on specific scenes, testing different models on a short clip before committing to the full video, or skipping intro/outro segments.
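
The timecode display and the selected range can be illustrated with a little arithmetic. The following is a minimal sketch, not Quick Recast's actual code: the function names are hypothetical, the frame rate is assumed to be an integer, frame indices are assumed to be zero-based, and the trim range is assumed to include both handles.

```python
def frame_to_timecode(frame: int, fps: int) -> str:
    """Convert a zero-based frame index to hours:minutes:seconds:frames.

    Assumes an integer frame rate; this is an illustration, not the
    application's own conversion routine.
    """
    if frame < 0 or fps <= 0:
        raise ValueError("frame must be >= 0 and fps must be > 0")
    total_seconds, frames = divmod(frame, fps)
    total_minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"


def trimmed_frame_count(start_frame: int, end_frame: int) -> int:
    """Count the frames in a trim range, assuming both ends are included."""
    if end_frame < start_frame:
        raise ValueError("end_frame must not be before start_frame")
    return end_frame - start_frame + 1
```

For example, at 30 fps frame 450 corresponds to the timecode 00:00:15:00, and a range from frame 300 to frame 749 covers 450 frames (15 seconds).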

Test with a Short Clip First

Before processing an entire video, use the trim slider to select a 10-15 second segment containing the target face. Process this clip first to evaluate the swap quality and model choice. Once satisfied, extend the range to the full video.

Audio Preservation

Quick Recast automatically preserves the original audio from your target video. The pipeline works as follows:

  1. Audio detection: Before processing, Recaster uses ffprobe to check whether the video contains an audio stream.
  2. Video-only processing: The video frames are processed through the face swap pipeline without touching the audio. Each frame is replaced individually.
  3. Audio muxing: After all frames are processed, the original audio track is combined ("muxed") with the processed video using ffmpeg. The result contains both the swapped video and the original audio.
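
The detection and muxing steps above can be sketched with standard ffprobe and ffmpeg invocations. This is an illustration under assumptions, not Recaster's actual commands: the function names and file paths are hypothetical, and the exact flags Recaster passes are not documented here. The sketch copies both streams without re-encoding and takes the first audio stream from the original file.

```python
import subprocess


def audio_probe_command(video_path: str) -> list[str]:
    """Build an ffprobe command that prints one 'audio' line per audio stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a",               # only look at audio streams
        "-show_entries", "stream=codec_type",
        "-of", "csv=p=0",                     # bare values, one per line
        video_path,
    ]


def has_audio_output(ffprobe_stdout: str) -> bool:
    """Interpret the output of audio_probe_command."""
    return "audio" in ffprobe_stdout.split()


def mux_command(processed_video: str, original_video: str, output_path: str) -> list[str]:
    """Build an ffmpeg command that pairs processed video with original audio."""
    return [
        "ffmpeg", "-y",
        "-i", processed_video,   # input 0: frames after the face swap
        "-i", original_video,    # input 1: file that still carries the audio
        "-map", "0:v:0",         # video from the processed file
        "-map", "1:a:0",         # first audio stream from the original
        "-c", "copy",            # copy streams without re-encoding
        "-shortest",             # stop at the end of the shorter stream
        output_path,
    ]


def mux_audio(processed_video: str, original_video: str, output_path: str) -> bool:
    """Run the probe and, if audio exists, the mux. Requires FFmpeg on PATH."""
    probe = subprocess.run(
        audio_probe_command(original_video),
        capture_output=True, text=True, check=True,
    )
    if not has_audio_output(probe.stdout):
        return False  # nothing to mux; the output stays video-only
    subprocess.run(mux_command(processed_video, original_video, output_path), check=True)
    return True
```

Note that this sketch does not account for trimming. If only part of the video was processed, the audio would also need to be cut to the same range before muxing.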

FFmpeg Required for Audio

Audio preservation requires FFmpeg to be installed on your system. If FFmpeg is not available, Recaster will still process the video but the output will be silent. FFmpeg is free and available for all platforms at ffmpeg.org.

Processing Progress

Video processing provides detailed real-time feedback through several indicators:

Progress Bar

A progress bar shows the current completion percentage. Below the bar, the current frame number and total frame count are displayed (for example, "Frame 450/1800").
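
As a small illustration of how the percentage relates to the frame counter, the following sketch formats a progress label. The function name and the exact label layout are assumptions made for this example.

```python
def progress_label(current_frame: int, total_frames: int) -> str:
    """Format a progress label such as 'Frame 450/1800 (25%)'."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if not 0 <= current_frame <= total_frames:
        raise ValueError("current_frame must be between 0 and total_frames")
    percent = 100 * current_frame / total_frames
    return f"Frame {current_frame}/{total_frames} ({percent:.0f}%)"
```

With the example from the text, frame 450 of 1800 corresponds to 25% completion.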

Real-Time Preview

The before/after comparison view updates periodically as frames are processed. You can see intermediate results without waiting for the entire video to finish. This allows you to spot quality issues early and stop processing if needed.

Stopping Processing

Click the Stop button at any time to cancel video processing. Frames that have already been processed are kept, so you get a partial video containing every frame completed up to the point where you stopped.

Batch Processing Optimization

When processing videos that contain multiple faces per frame, Recaster uses batch processing to speed things up. Instead of processing each face one at a time, multiple faces are grouped and processed in a single GPU call. This optimization provides a 30-50% speed improvement for multi-face scenes.

Batch processing is automatic and works with InSwapper models. Ghost-style models use sequential processing by design, as each swap requires a unique source image reference.
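
The difference between batched and sequential processing can be sketched generically. This is not Recaster's implementation: `swap_batch` is a hypothetical callable standing in for one model call that handles several faces at once, and `max_batch_size` is an assumed limit.

```python
from typing import Callable, Sequence, TypeVar

Face = TypeVar("Face")
Result = TypeVar("Result")


def swap_faces_batched(
    faces: Sequence[Face],
    swap_batch: Callable[[Sequence[Face]], list[Result]],
    max_batch_size: int = 8,
) -> list[Result]:
    """Process faces in groups so each group costs one model call.

    swap_batch is a placeholder for a single batched inference call;
    results are returned in the same order as the input faces.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    results: list[Result] = []
    for start in range(0, len(faces), max_batch_size):
        batch = faces[start:start + max_batch_size]
        results.extend(swap_batch(batch))  # one call per group, not per face
    return results
```

With five faces and a batch size of 2, the sketch makes three calls instead of five. Sequential processing corresponds to a batch size of 1.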

Output Settings

The processed video is saved with the following defaults:

  • Location: Same directory as the target file
  • Filename: Original name with a _recast suffix (e.g., scene_01_recast.mp4)
  • Format: Same as input format
  • Frame rate: Preserved from original
  • Audio: Original audio track muxed back in
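
The default location and filename rules can be expressed as a short path computation. This is a sketch of the naming convention described above; the function name is hypothetical, and the application may handle edge cases (such as existing files) differently.

```python
from pathlib import Path


def default_output_path(target: str, suffix: str = "_recast") -> Path:
    """Return the target's path with the suffix inserted before the extension.

    The directory and the extension (and therefore the format) are kept
    unchanged, matching the defaults listed above.
    """
    target_path = Path(target)
    return target_path.with_name(f"{target_path.stem}{suffix}{target_path.suffix}")
```

For a target of `videos/scene_01.mp4`, this yields `videos/scene_01_recast.mp4`.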

You can customize the output location by clicking the folder icon next to the output path field before starting processing.

Performance Tips

Video processing speed depends on several factors. Here are tips to optimize performance:

Tip                            | Impact
-------------------------------|---------------------------------------------------------------
Use InSwapper 128-FP16         | 20-30% faster than standard InSwapper with minimal quality loss
Set detection to Fast (320px)  | 2-3x faster face detection per frame
Disable enhancement for drafts | Skip the enhancement pass to save 30-50% processing time
Trim video to target scene     | Only process the frames you actually need
Use remote processing          | Cloud GPUs (RTX 3090/4090) process at 5-10 FPS, faster than lower-end local GPUs

Remote Video Processing (Studio tier)

Studio tier users can process videos on cloud GPUs. The workflow consists of three phases:

  1. Upload: Source and target files are uploaded to the cloud instance via rsync. A progress indicator shows the upload percentage.
  2. Process: The video is processed on the cloud GPU. Preview snapshots are streamed back at configurable intervals (default: every 5 seconds) so you can monitor results in real time.
  3. Download: The final processed video is downloaded back to your local machine. A progress indicator shows the download percentage.

Internet Connection Required

Remote video processing requires a stable internet connection throughout the entire workflow. Large videos may take significant time to upload and download. Consider using remote processing for long or high-resolution videos where the cloud GPU speed advantage outweighs the transfer time.

Preview Update Interval

When processing remotely, you can configure how often preview snapshots are sent back from the cloud instance. The interval is set in seconds (range: 1-60 seconds, default: 5 seconds). Lower values give more frequent updates but increase network usage.

The interval is automatically converted to a frame count based on the video's FPS. For example, at 30fps with a 5-second interval, a preview snapshot is saved every 150 frames.
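
The conversion from seconds to frames can be written out as follows. This is a minimal sketch: the function name is hypothetical, and rounding to the nearest whole frame is an assumption for frame rates such as 29.97 fps, where the product is not an integer.

```python
def preview_interval_frames(interval_seconds: float, fps: float) -> int:
    """Convert a preview interval in seconds to a frame count.

    Enforces the documented 1-60 second range. Rounding to the nearest
    frame is an assumption, not documented behavior.
    """
    if not 1 <= interval_seconds <= 60:
        raise ValueError("interval_seconds must be between 1 and 60")
    if fps <= 0:
        raise ValueError("fps must be positive")
    return max(1, round(interval_seconds * fps))
```

With the default 5-second interval at 30 fps, this gives 150 frames, matching the example above.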

Typical Processing Times

Processing speed varies based on GPU, video resolution, number of faces, and model choice. Here are approximate times for common scenarios on cloud GPUs (RTX 3090/4090):

Video         | Upload | Processing | Download | Total
--------------|--------|------------|----------|---------
30s @ 30fps   | ~15s   | ~90s       | ~45s     | ~2.5 min
2 min @ 30fps | ~60s   | ~6 min     | ~3 min   | ~10 min

Local processing speed depends heavily on your GPU. Modern NVIDIA GPUs (RTX 3060+) typically achieve 5-10 FPS for face swapping. Older or lower-end GPUs may be significantly slower. If processing speed is a concern, consider using the Studio tier for cloud GPU access.
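
To relate a throughput figure to a processing time, divide the total frame count by the processing speed. The sketch below is a rough estimate only; the function name is hypothetical, and it ignores upload, download, and audio muxing time.

```python
def estimate_processing_seconds(
    duration_seconds: float, video_fps: float, processing_fps: float
) -> float:
    """Estimate processing time as total frames divided by throughput."""
    if processing_fps <= 0:
        raise ValueError("processing_fps must be positive")
    total_frames = duration_seconds * video_fps
    return total_frames / processing_fps
```

A 30-second clip at 30 fps has 900 frames, so at 10 FPS it takes about 90 seconds, consistent with the table above. At 5 FPS the same clip would take about 180 seconds.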