
Remotion Audio & Subtitles — Synchronizing Sound and Visuals

Synchronize audio/video sources to frames with <Audio>, <OffthreadVideo>, and useAudioData()

Synchronizing sound and visuals is a core element of video.

<Audio>: Inserts MP3/WAV files into the video. Use the startFrom/endAt props to play a specific audio segment, or pass a function to the volume prop for per-frame volume control. Placing <Audio> inside a <Sequence> plays the audio only during that scene.
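A minimal sketch of the per-frame volume logic in plain TypeScript (no Remotion imports needed; `fadeInVolume` is a hypothetical helper name — Remotion's volume prop simply receives the current frame and expects a value between 0 and 1):

```typescript
// Linearly ramp volume from 0 to 1 over the first `fadeFrames` frames,
// clamping outside that range — the same shape as
// interpolate(f, [0, fadeFrames], [0, 1]) with clamped extrapolation.
const fadeInVolume = (frame: number, fadeFrames = 30): number => {
  if (frame <= 0) return 0;
  if (frame >= fadeFrames) return 1;
  return frame / fadeFrames;
};

// In a Remotion composition this would be used as:
// <Audio src={audioUrl} volume={(f) => fadeInVolume(f, 30)} />
```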

useAudioData() + getWaveformPortion(): Extracts waveform data from audio files to reflect current frame amplitude in visuals. Enables audio visualizers where bars dance to music.
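A plain-TypeScript sketch of the visualizer step: in Remotion the amplitudes would come from useAudioData() + getWaveformPortion(), but here they are just an array of values so the mapping itself is visible (`toBarHeights` and `maxHeightPx` are hypothetical names):

```typescript
// Map normalized amplitudes (0..1, one per bar) to pixel heights
// that can be applied as inline styles on the visualizer bars.
const toBarHeights = (amplitudes: number[], maxHeightPx: number): number[] =>
  amplitudes.map((a) => Math.round(a * maxHeightPx));

// Each height would then drive a bar, e.g.
// <div style={{height: `${h}px`}} /> for every h in the result.
```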

<OffthreadVideo>: Composites existing MP4 files into a Remotion video. Unlike the regular <Video> component, it extracts frames outside the browser's main thread during rendering, which makes seeking more reliable.

@remotion/captions: Synchronizes subtitle data (with timestamps) generated from TTS engines or Whisper at frame precision.
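A sketch of the frame-to-caption matching in plain TypeScript. Caption timings are stored in milliseconds (startMs/endMs per caption), so converting the current frame to milliseconds is enough to pick the active line; `activeCaption` is a hypothetical helper name and the Caption type here is a simplified stand-in:

```typescript
// Simplified caption shape: text plus millisecond start/end timestamps.
type Caption = { text: string; startMs: number; endMs: number };

// Convert the current frame to milliseconds and return the caption
// covering that moment, or null if no caption is active.
const activeCaption = (
  captions: Caption[],
  frame: number,
  fps: number
): Caption | null => {
  const ms = (frame / fps) * 1000;
  return captions.find((c) => ms >= c.startMs && ms < c.endMs) ?? null;
};
```

Inside a component, `useCurrentFrame()` and the composition's fps would supply the two numeric arguments.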

How It Works

1. Insert an audio track into the video with <Audio src={audioUrl} /> (can be placed inside a <Sequence>)
2. Pass (f) => interpolate(f, [0, 30], [0, 1]) to the volume prop for a fade-in effect
3. Load the waveform with useAudioData() → extract the current frame's amplitude with getWaveformPortion()
4. Reflect the amplitude data in SVG/CSS styles to render an audio visualizer
5. Parse subtitle JSON with @remotion/captions and display the subtitle text matching the current frame
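The fade logic in step 2 extends naturally to a crossfade: a plain-TypeScript sketch (no Remotion imports; `lerpClamped` and `crossfade` are hypothetical names) of two overlapping volume curves, one fading out while the other fades in:

```typescript
// Clamped linear interpolation — the same shape as Remotion's
// interpolate(f, [a, b], [v0, v1]) with clamped extrapolation.
const lerpClamped = (
  f: number, a: number, b: number, v0: number, v1: number
): number => {
  if (f <= a) return v0;
  if (f >= b) return v1;
  return v0 + ((f - a) / (b - a)) * (v1 - v0);
};

// Over frames [start, start + len], the outgoing track goes 1 → 0
// while the incoming track goes 0 → 1, producing a crossfade.
const crossfade = (frame: number, start: number, len: number) => ({
  outgoing: lerpClamped(frame, start, start + len, 1, 0),
  incoming: lerpClamped(frame, start, start + len, 0, 1),
});
```

Each curve would be passed as the volume function of its respective <Audio> element.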

Pros

  • Frame precision: audio/subtitles/visuals sync to exactly the same frame
  • OffthreadVideo composites existing footage without blocking main thread
  • Per-frame volume control via volume function → natural crossfade

Cons

  • Audio waveform analysis is CPU-intensive → performance degradation with long audio files
  • Timestamp data for subtitle sync must be prepared separately

Use Cases

  • Podcast video: waveform visualizer synced to audio + auto subtitles
  • Music video: motion graphics that react to music beats
  • TTS-based content: AI voice + auto-generated subtitles + visual sync