Remotion Audio & Subtitles — Synchronizing Sound and Visuals
Synchronize audio and video sources to frames with <Audio>, <OffthreadVideo>, and useAudioData()
Synchronizing sound and visuals is a core element of video.
<Audio>: Inserts MP3/WAV files into the video. Use the startFrom/endAt props to play a specific audio segment, or pass a function to the volume prop for per-frame volume control. Placing <Audio> inside a <Sequence> plays the audio only during that scene.
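A minimal sketch of these props together (the file name, frame offsets, and fade length are illustrative, not from the source):

```tsx
import React from 'react';
import {Audio, Sequence, interpolate, staticFile} from 'remotion';

// Plays a slice of narration.mp3 (an assumed asset) during one scene,
// fading the volume in over the sequence's first 30 frames.
export const NarrationScene: React.FC = () => {
  return (
    <Sequence from={60} durationInFrames={120}>
      <Audio
        src={staticFile('narration.mp3')}
        startFrom={30} // skip the first 30 frames of the audio file
        endAt={150}    // stop reading the file at frame 150
        volume={(f) =>
          interpolate(f, [0, 30], [0, 1], {extrapolateRight: 'clamp'})
        }
      />
    </Sequence>
  );
};
```

Because the component sits inside the <Sequence>, the frame `f` passed to the volume callback is relative to the sequence start, so the fade begins exactly when the scene does.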
useAudioData() + getWaveformPortion(): Extracts waveform data from an audio file so the current frame's amplitude can drive the visuals. This enables audio visualizers whose bars dance to the music.
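A sketch of a bar visualizer built from these two functions (the asset name, bar count, and sizing are arbitrary assumptions):

```tsx
import React from 'react';
import {staticFile, useCurrentFrame, useVideoConfig} from 'remotion';
import {getWaveformPortion, useAudioData} from '@remotion/media-utils';

// Renders bars whose heights follow the waveform around the current frame.
export const Visualizer: React.FC = () => {
  const frame = useCurrentFrame();
  const {fps} = useVideoConfig();
  const audioData = useAudioData(staticFile('music.mp3'));

  if (!audioData) {
    return null; // waveform not loaded yet
  }

  // Sample the waveform for the duration of exactly one frame.
  const bars = getWaveformPortion({
    audioData,
    startTimeInSeconds: frame / fps,
    durationInSeconds: 1 / fps,
    numberOfSamples: 32,
  });

  return (
    <div style={{display: 'flex', alignItems: 'flex-end', height: 200}}>
      {bars.map((bar) => (
        <div
          key={bar.index}
          style={{
            width: 8,
            marginRight: 2,
            background: 'white',
            height: bar.amplitude * 200, // amplitude drives bar height
          }}
        />
      ))}
    </div>
  );
};
```

Returning null until useAudioData() resolves matters: the hook loads asynchronously, and Remotion re-renders the frame once the data is available.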
@remotion/captions: Synchronizes timestamped subtitle data generated by TTS engines or Whisper with frame precision.
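The core of frame-precise subtitles is mapping the current frame to a timestamp range. A pure-function sketch (the Caption shape with startMs/endMs mirrors @remotion/captions; the sample data and helper name are made up):

```typescript
// Picks the caption visible at a given frame by converting frames to ms.
type Caption = {text: string; startMs: number; endMs: number};

const captionAtFrame = (
  captions: Caption[],
  frame: number,
  fps: number
): Caption | null => {
  const ms = (frame / fps) * 1000;
  return captions.find((c) => ms >= c.startMs && ms < c.endMs) ?? null;
};

const captions: Caption[] = [
  {text: 'Hello', startMs: 0, endMs: 1000},
  {text: 'world', startMs: 1000, endMs: 2000},
];

// At 30 fps, frame 45 falls at 1500 ms, inside the second caption.
console.log(captionAtFrame(captions, 45, 30)?.text); // → world
```

Inside a composition, you would call this with useCurrentFrame() and the fps from useVideoConfig() and render the returned text.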
How It Works
Insert an audio track into the video with <Audio src={audioUrl} /> (can be placed inside a <Sequence>)
Pass (f) => interpolate(f, [0, 30], [0, 1]) to the volume prop for a fade-in effect
Load waveform data with useAudioData(), then extract the current frame's amplitude with getWaveformPortion()
Feed the amplitude data into SVG/CSS styles to render an audio visualizer
Parse subtitle JSON with @remotion/captions and display the subtitle text that matches the current frame
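The per-frame volume step above extends naturally to a crossfade: one track ramps down while the next ramps up over an overlap window. A pure-function sketch of the gain curves (the linear ramp and 30-frame overlap are illustrative assumptions):

```typescript
// Linear crossfade gains over an overlap window: the outgoing track ramps
// 1 → 0 while the incoming track ramps 0 → 1. `frame` is relative to the
// start of the overlap; values outside the window are clamped.
const crossfade = (frame: number, overlapFrames: number) => {
  const t = Math.min(Math.max(frame / overlapFrames, 0), 1);
  return {outgoing: 1 - t, incoming: t};
};

// Midway through a 30-frame overlap, both tracks sit at half volume.
console.log(crossfade(15, 30)); // → { outgoing: 0.5, incoming: 0.5 }
```

In Remotion you would pass `(f) => crossfade(f, 30).outgoing` and `(f) => crossfade(f, 30).incoming` as the volume props of the two overlapping <Audio> elements.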
Pros
- ✓ Frame precision: audio/subtitles/visuals sync to exactly the same frame
- ✓ OffthreadVideo composites existing footage without blocking the main thread
- ✓ Per-frame volume control via a volume function enables natural crossfades
Cons
- ✗ Audio waveform analysis is CPU-intensive, so long audio files degrade render performance
- ✗ Timestamped subtitle data for sync must be prepared separately (e.g., via Whisper transcription)