Tooscut — GPU-Accelerated Video Editor Running in the Browser

How WebGPU + Rust/WASM Achieves Native-Level Real-Time Compositing in the Browser

Tooscut is an NLE (Non-Linear Editor) that runs in a single browser tab. No need to install DaVinci Resolve or Premiere Pro — just open a URL and start editing.

But "browser video editor" usually conjures images of toy-level tools. What makes Tooscut different: the entire rendering engine runs on Rust/WASM + WebGPU. JavaScript handles UI only. Actual frame compositing happens on the GPU.

Three-Layer Architecture

React UI (TanStack Start) → TypeScript render engine → Rust/WASM compositor. The codebase is ~80% TypeScript, ~20% Rust. That 20% of Rust handles all the performance-critical work.

What GPU Acceleration Actually Does

Brightness, contrast, saturation, blur, hue rotation: all of these effects run inside a single WGSL fragment shader. Moving a slider changes only a GPU uniform value, and the shader re-executes on the next frame. The CPU never touches pixel data.

Take blur as an example: a 13×13 Gaussian kernel is hardcoded in the shader. Step size scales with sigma, so changing blur intensity doesn't require additional texture passes.
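To make the blur idea concrete, here is a minimal CPU-side sketch of how a fixed 13×13 kernel could be reused for any blur strength by scaling only the sample spacing. The names (blurTaps, KERNEL_SIGMA_TAPS) and the step formula (step = sigma / 2) are assumptions for illustration. The source only says that the kernel is hardcoded and that the step scales with sigma.

```typescript
interface Tap { dx: number; dy: number; weight: number }

const HALF_TAPS = 6;          // 13 taps per axis: -6..6
const KERNEL_SIGMA_TAPS = 2;  // assumption: the fixed kernel is a Gaussian with sigma = 2 taps

// Normalized 1D Gaussian weights. These never change, so a shader could hardcode them.
function weights1D(): number[] {
  const raw: number[] = [];
  for (let i = -HALF_TAPS; i <= HALF_TAPS; i++) {
    raw.push(Math.exp(-(i * i) / (2 * KERNEL_SIGMA_TAPS * KERNEL_SIGMA_TAPS)));
  }
  const sum = raw.reduce((a, b) => a + b, 0);
  return raw.map((w) => w / sum);
}

// Returns the 169 sample offsets (in pixels) and weights for a given blur sigma.
function blurTaps(sigma: number): Tap[] {
  const step = sigma / KERNEL_SIGMA_TAPS; // assumption: spacing grows linearly with sigma
  const w = weights1D();
  const taps: Tap[] = [];
  for (let y = -HALF_TAPS; y <= HALF_TAPS; y++) {
    for (let x = -HALF_TAPS; x <= HALF_TAPS; x++) {
      taps.push({
        dx: x * step,
        dy: y * step,
        weight: w[x + HALF_TAPS] * w[y + HALF_TAPS], // separable 2D weight
      });
    }
  }
  return taps;
}
```

Under this assumption, a larger sigma spreads the same 169 samples further apart instead of adding samples or passes, which is why the cost per pixel stays constant.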

Real-Time Preview Pipeline

This is the clever part.

  1. Main thread runs a requestAnimationFrame loop
  2. Filters only clips visible at the current time (sorted array + binary search)
  3. Extracts video frames from HTMLVideoElement via createImageBitmap()
  4. Transfers (not copies) ImageBitmap to a Web Worker
  5. WASM compositor inside the Worker composites via WebGPU directly onto an OffscreenCanvas
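Step 2 can be sketched as follows. This is an illustration under assumptions, not Tooscut's actual code: the Clip shape and function names are hypothetical, clips are assumed to be sorted by start time, and end times are treated as exclusive.

```typescript
interface Clip { id: string; start: number; end: number } // seconds, [start, end)

// Binary search: number of clips whose start time is <= t.
function countStartedAt(clips: Clip[], t: number): number {
  let lo = 0;
  let hi = clips.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (clips[mid].start <= t) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Clips that have started but not yet ended at time t.
function visibleClips(clips: Clip[], t: number): Clip[] {
  const started = countStartedAt(clips, t);
  const visible: Clip[] = [];
  for (let i = 0; i < started; i++) {
    if (clips[i].end > t) visible.push(clips[i]);
  }
  return visible;
}
```

The binary search discards every clip that starts after the playhead in O(log n). This sketch still scans the started prefix linearly, and a real editor might index end times as well.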

Step 4 is key. ImageBitmap transfer is zero-copy — ownership moves from main thread to Worker with no memory duplication. Texture upload also goes directly to GPU via copy_external_image_to_texture.
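The ownership-transfer semantics can be demonstrated without a browser. The sketch below uses an ArrayBuffer as a stand-in for an ImageBitmap (an assumption made so the example runs anywhere structuredClone is available) and a hypothetical helper name, transferFrame. Both object types are transferable, and a transferred object becomes unusable on the sending side.

```typescript
// Moves `buffer` to a new owner and returns the new handle.
// After the call, the original handle is detached: its byteLength becomes 0.
function transferFrame(buffer: ArrayBuffer): ArrayBuffer {
  return structuredClone(buffer, { transfer: [buffer] });
}

// Stand-in for one decoded RGBA pixel.
const sourceFrame = new ArrayBuffer(4);
new Uint8Array(sourceFrame).set([255, 0, 0, 255]);

const workerSideFrame = transferFrame(sourceFrame);

// In the browser pipeline, the equivalent call would look like this
// (the message shape is hypothetical):
//   worker.postMessage({ bitmap }, [bitmap]);
```

After the transfer, the sender can no longer read the frame, which is exactly what rules out an accidental second copy living on the main thread.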

Memory Management

WASM linear memory doesn't count against the V8 heap. GPU buffers live in VRAM, and ImageBitmaps use native allocations outside the JS heap. Video files are decoded on demand rather than loaded entirely into memory. Together, this lets 4K video work without hitting browser tab memory limits.

Export

FrameRendererPool runs multiple Web Workers in parallel. Each Worker has its own independent WASM compositor, and the frames to render are split across the workers. The MediaBunny library handles MP4 muxing.
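One plausible way to split an export across workers is to hand each worker a contiguous frame range. The source does not say how FrameRendererPool actually divides the work, so the function below (splitFrames, a hypothetical name) is only a sketch of the idea.

```typescript
interface FrameRange { start: number; end: number } // [start, end)

// Divides totalFrames into at most workerCount contiguous ranges whose
// lengths differ by no more than one frame.
function splitFrames(totalFrames: number, workerCount: number): FrameRange[] {
  const n = Math.max(1, Math.min(workerCount, totalFrames));
  const base = Math.floor(totalFrames / n);
  const extra = totalFrames % n; // the first `extra` ranges get one more frame
  const ranges: FrameRange[] = [];
  let start = 0;
  for (let i = 0; i < n; i++) {
    const length = base + (i < extra ? 1 : 0);
    ranges.push({ start, end: start + length });
    start += length;
  }
  return ranges;
}
```

Contiguous ranges keep each worker's video decoding sequential. An interleaved split (worker i takes frames i, i + n, ...) would balance load differently but force more seeking.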

Why Not Just Use WebGPU from JS Directly?

You can. WebGPU is a JavaScript API, so calling it from JS is the default path. The shader code (WGSL) running on the GPU is identical whether called from JS or Rust — blur, brightness, contrast performance is the same.

The difference is in the CPU-side work outside the GPU. Every frame requires keyframe interpolation (bezier curves), 4×4 matrix math, and uniform buffer packing (128 bytes, 16-byte aligned), all within the 16.6 ms budget of a 60 fps frame.

Task | JS | Rust/WASM
Keyframe interpolation | Depends on V8 JIT optimization | Compile-time optimization, SIMD possible
Uniform buffer packing | Manual ArrayBuffer offset calculation | Pod derive for struct → bytes auto-conversion
Audio mixing (~128 samples) | GC jitter risk | No GC, critical for real-time audio
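To show what "manual ArrayBuffer offset calculation" means on the JS side, here is a sketch that packs a 128-byte uniform block by hand. The field layout (a 4×4 transform followed by two vec4 slots of effect parameters and 32 bytes of padding) and all names are assumptions. The source only states the total size and the 16-byte alignment.

```typescript
const UNIFORM_BYTES = 128; // 32 floats

interface EffectParams {
  brightness: number;
  contrast: number;
  saturation: number;
  hueRadians: number;
  blurSigma: number;
  opacity: number;
  transitionProgress: number;
}

// Hypothetical layout, in float indices:
//   0..15  mat4 transform
//   16..19 vec4 (brightness, contrast, saturation, hueRadians)
//   20..23 vec4 (blurSigma, opacity, transitionProgress, unused)
//   24..31 zero padding up to 128 bytes
function packUniforms(transform: number[], p: EffectParams): ArrayBuffer {
  if (transform.length !== 16) throw new Error("transform must have 16 elements");
  const buffer = new ArrayBuffer(UNIFORM_BYTES);
  const f32 = new Float32Array(buffer);
  f32.set(transform, 0);
  f32.set([p.brightness, p.contrast, p.saturation, p.hueRadians], 16);
  f32.set([p.blurSigma, p.opacity, p.transitionProgress, 0], 20);
  return buffer;
}

// In a WebGPU app the result would be uploaded with:
//   device.queue.writeBuffer(uniformBuffer, 0, packUniforms(transform, params));
```

Every index here has to be kept in sync with the WGSL struct by hand. On the Rust side, deriving the byte conversion from the struct definition removes that bookkeeping.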

Bottom line: 3-4 tracks with 1-2 effects? JS is fine. 10+ tracks with dozens of keyframes and audio effect chains? JS garbage collection pauses can drop frames at unpredictable moments. Audio callbacks arrive every ~128 samples (~2.9 ms at 44.1 kHz), so a GC pause causes audible glitches. Tooscut chose Rust not for raw speed but for deterministic, GC-free timing.
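For a sense of the per-frame CPU work, here is a sketch of bezier keyframe interpolation. It assumes CSS-style easing handles (x1, y1, x2, y2, with x1 and x2 in [0, 1]) and solves for the curve parameter by bisection. The source does not describe Tooscut's exact parametrization, and its implementation lives in Rust, so all names and types below are hypothetical.

```typescript
interface BezierKeyframe {
  time: number;  // seconds
  value: number; // animated property value at this time
  // Easing handles for the segment starting at this keyframe.
  ease: [number, number, number, number]; // x1, y1, x2, y2
}

// One coordinate of a cubic bezier with endpoints fixed at 0 and 1.
function bezierCoord(p1: number, p2: number, t: number): number {
  const u = 1 - t;
  return 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t;
}

// Maps linear progress x in [0, 1] to eased progress y.
function cubicBezierEase(x1: number, y1: number, x2: number, y2: number, x: number): number {
  if (x <= 0) return 0;
  if (x >= 1) return 1;
  let lo = 0;
  let hi = 1;
  // Bisection on t so that bezierCoord(x1, x2, t) is approximately x.
  for (let i = 0; i < 40; i++) {
    const mid = (lo + hi) / 2;
    if (bezierCoord(x1, x2, mid) < x) lo = mid;
    else hi = mid;
  }
  return bezierCoord(y1, y2, (lo + hi) / 2);
}

// Value between two keyframes at a time within [a.time, b.time].
function interpolate(a: BezierKeyframe, b: BezierKeyframe, time: number): number {
  const progress = (time - a.time) / (b.time - a.time);
  const [x1, y1, x2, y2] = a.ease;
  return a.value + (b.value - a.value) * cubicBezierEase(x1, y1, x2, y2, progress);
}
```

This runs once per animated property per clip per frame, so ten tracks with several animated properties each means dozens of these solves inside the frame budget.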

Rendering Pipeline Details

Frame Compositing Flow

Main Thread | requestAnimationFrame loop → filter visible clips at current time (binary search) → buildRenderFrame() evaluates keyframes + merges Transform/Effects
Video Decode | HTMLVideoElement → createImageBitmap() → transfer ImageBitmap to Worker (zero-copy)
Web Worker | Comlink proxy → WASM Compositor.renderFrame() → WebGPU pipeline execution → renders directly to OffscreenCanvas
GPU | Texture upload (copy_external_image_to_texture) → vertex shader (fullscreen quad) → fragment shader (effect chain) → alpha blending compositing

GPU Effect Shader Implementation

All effects run sequentially inside a single WGSL fragment shader. One uniform buffer (128 bytes) carries all parameters.

Effect | Shader Implementation | Performance
Blur | 13x13 Gaussian kernel, sigma-based step scaling | Single pass, no extra texture needed
Brightness | color.rgb * brightness | Single multiply
Contrast | (rgb - 0.5) * contrast + 0.5 | Single vector op
Saturation | Luminance-based mix(grayscale, color, saturation) | dot + mix ops
Hue Rotate | RGB → HSL → H += radian → HSL → RGB | Two color space conversions
Transition | UV-based smoothstep masking (Wipe L/R/U/D) | No per-pixel branching

Application order: Blur → Brightness → Contrast → Saturation → Hue Rotate → Transition → Opacity → clamp
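The color part of that chain is simple enough to mirror on the CPU. The sketch below reproduces the brightness, contrast, and saturation formulas in the same order and ends with the clamp. It skips blur, hue rotation, transition, and opacity. The luminance weights are an assumption (Rec. 709), since the source does not say which weights the shader uses, and the function and type names are hypothetical.

```typescript
type RGB = [number, number, number]; // each channel in [0, 1]

interface ColorParams {
  brightness: number; // 1 = unchanged
  contrast: number;   // 1 = unchanged
  saturation: number; // 1 = unchanged, 0 = grayscale
}

// Assumption: Rec. 709 luma weights.
const LUMA: RGB = [0.2126, 0.7152, 0.0722];

const clamp01 = (v: number): number => Math.min(1, Math.max(0, v));

function applyColorEffects(color: RGB, p: ColorParams): RGB {
  let [r, g, b] = color;

  // Brightness: color.rgb * brightness
  r *= p.brightness;
  g *= p.brightness;
  b *= p.brightness;

  // Contrast: (rgb - 0.5) * contrast + 0.5
  r = (r - 0.5) * p.contrast + 0.5;
  g = (g - 0.5) * p.contrast + 0.5;
  b = (b - 0.5) * p.contrast + 0.5;

  // Saturation: mix(grayscale, color, saturation)
  const gray = r * LUMA[0] + g * LUMA[1] + b * LUMA[2];
  r = gray + (r - gray) * p.saturation;
  g = gray + (g - gray) * p.saturation;
  b = gray + (b - gray) * p.saturation;

  // Final clamp to the displayable range
  return [clamp01(r), clamp01(g), clamp01(b)];
}
```

Because every step is plain arithmetic on one pixel with no branching, the whole chain maps naturally onto a single fragment shader invocation.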

Rust/WASM Crate Structure

Crate | Role | Core Tech
compositor | GPU compositing engine: media/text/shape/line rendering | wgpu, glyphon, cosmic-text
keyframe | Keyframe interpolation: Linear/Step/Bezier | Temporal coherence cache (sequential O(1), seek O(log n))
audio-engine | AudioWorklet multi-track mixer: EQ/compressor/reverb | ~128 sample real-time PCM output
types | Shared type definitions, auto-generates TS types via tsify-next | serde + wasm-bindgen
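The "temporal coherence cache" listed for the keyframe crate can be illustrated with a small sketch. The real crate is written in Rust and its internals are not shown in the source, so the TypeScript below is only one way such a cache could work: remember the segment used on the previous frame, check it and its successor first, and fall back to binary search on a seek. It uses linear interpolation only, and all names are hypothetical.

```typescript
interface Keyframe { time: number; value: number }

class KeyframeTrack {
  private keys: Keyframe[];
  private cursor = 0; // index of the segment [cursor, cursor + 1] used last

  // `keys` must be non-empty and sorted by ascending time.
  constructor(keys: Keyframe[]) {
    this.keys = keys;
  }

  // Index i such that keys[i].time <= t < keys[i + 1].time.
  private locate(t: number): number {
    const k = this.keys;
    const last = k.length - 2;
    // Fast path for sequential playback: same segment or the next one.
    for (const i of [this.cursor, this.cursor + 1]) {
      if (i <= last && k[i].time <= t && t < k[i + 1].time) {
        this.cursor = i;
        return i;
      }
    }
    // Seek path: binary search for the greatest i with keys[i].time <= t.
    let lo = 0;
    let hi = last;
    while (lo < hi) {
      const mid = (lo + hi + 1) >>> 1;
      if (k[mid].time <= t) lo = mid;
      else hi = mid - 1;
    }
    this.cursor = lo;
    return lo;
  }

  evaluate(t: number): number {
    const k = this.keys;
    if (k.length === 1 || t <= k[0].time) return k[0].value;
    if (t >= k[k.length - 1].time) return k[k.length - 1].value;
    const i = this.locate(t);
    const a = k[i];
    const b = k[i + 1];
    return a.value + ((b.value - a.value) * (t - a.time)) / (b.time - a.time);
  }
}
```

During normal playback, time only moves forward by one frame, so the fast path almost always hits and the lookup costs a couple of comparisons. Only scrubbing or jumping on the timeline pays for the O(log n) search.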

Memory Management Strategy

  • WASM linear memory — Not counted against V8 heap. Separate from JS heap shown in DevTools
  • GPU buffers — Stored in VRAM. TextureManager reuses same-size textures by updating data only
  • Video decoding — No full loading. On-demand buffering window. Preview uses HTMLVideoElement (browser-optimized), export uses MediaBunny (frame-accurate)
  • ImageBitmap transfer — Ownership transfer from main thread to Worker (zero-copy). No duplication cost

Developer Background

Mohamad Mohebifar is Co-founder & CTO at Codemod. Previously at Meta, Brex, Shopify. 10+ years in code transformation/transpilers. WorldSkills 2015 Bronze Medal (Web Design & Development). Goal: "Photopea for video editing", covering 80% of everyday editing without installing anything.

Step-by-Step

  1. Visit tooscut.app → Editor runs directly in browser (no install)
  2. File System Access API references local files directly, so you can edit without upload/download
  3. Place video, audio, image, text, and shape clips on the timeline
  4. Apply bezier curve animation to position, scale, opacity, and effects in the keyframe panel
  5. On export, FrameRendererPool renders via parallel Workers → MP4 output

Pros

  • Zero install — professional editor via URL access only
  • GPU-accelerated real-time effects — no rendering delay on slider adjustment
  • Local-first — media files never leave your machine
  • Rust/WASM engine — 90-95% of native performance

Cons

  • WebGPU required — no Safari support, Firefox unstable (Chrome-only in practice)
  • 8K multi-track or color grading level work still belongs to DaVinci Resolve
  • Elastic License 2.0 — not OSI-definition open source (commercial hosting restricted)

Use Cases

  • Quick YouTube video cut editing with no desktop app install needed
  • When you need to edit video on a borrowed PC during travel
  • Environments with restricted app installation, like Chromebooks
  • Privacy-first workflows where files must not be uploaded to servers