background-removal-js — How AI Background Removal Works in the Browser
IS-Net model + ONNX Runtime Web + WebGPU — client-side segmentation without a server
import { removeBackground } from '@imgly/background-removal';
const blob = await removeBackground(imageUrl);
Quite a lot happens behind this one call.
Model: IS-Net
IS-Net (Intermediate Supervision Network) is a U-Net-family salient object detection model introduced in the DIS (Dichotomous Image Segmentation) paper. It is not SAM or MODNet.
It uses an encoder-decoder structure with intermediate supervision at each layer during training, which yields precise binary masks. Its purpose is to separate "salient objects" (people, animals, objects) from the background.
Three model variants: large (~176MB, FP32), medium (~88MB, FP16, the default), small (~44MB, INT8 quantized).
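The three sizes are consistent with one set of weights stored at 4, 2, and 1 bytes each. The sketch below works through that arithmetic. It assumes the file size is dominated by weights, that 1 MB is 10^6 bytes, and that all variants share the same weight count. The function names are illustrative, and the resulting figure of roughly 44 million weights is inferred from the sizes above, not taken from the library.

```typescript
// Bytes per stored weight for each precision named in the text.
const BYTES_PER_WEIGHT = { fp32: 4, fp16: 2, int8: 1 } as const;
type Precision = keyof typeof BYTES_PER_WEIGHT;

// Estimate the weight count from a file size (assumption: size is all weights).
function estimateParams(sizeMB: number, precision: Precision): number {
  return (sizeMB * 1e6) / BYTES_PER_WEIGHT[precision];
}

// Estimate the file size in MB for a given weight count and precision.
function estimateSizeMB(params: number, precision: Precision): number {
  return (params * BYTES_PER_WEIGHT[precision]) / 1e6;
}
```

Under these assumptions, halving the bytes per weight halves the download, which is why FP16 is a reasonable default for a model fetched over the network.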
Preprocessing → Inference → Postprocessing
Step 1: Convert input to RGBA Uint8Array tensor [H, W, 4]
Step 2: Resize to 1024×1024 (bilinear, ignoring aspect ratio). Drop the alpha channel and convert HWC→BCHW with normalization: float32 = (uint8 - 128) / 256. Output shape: [1, 3, 1024, 1024]
Step 3: ONNX Runtime Web inference. Output: [1024, 1024, 1] float32 alpha mask (0.0-1.0)
Step 4: Multiply by 255 → uint8, resize to original resolution, write directly to alpha channel: data[4*i+3] = mask[i]
No separate alpha matting — IS-Net output is precise enough.
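Steps 2 and 4 above can be sketched in plain TypeScript. This is a minimal illustration, not the library's code: the function names are hypothetical, the bilinear sampling uses pixel centres (the library's exact sampling convention is not stated in the text), and the ONNX inference of Step 3 is left out.

```typescript
// Bilinear resize of an interleaved (HWC) image. Aspect ratio is not preserved.
function resizeBilinear(
  src: ArrayLike<number>, srcW: number, srcH: number, channels: number,
  dstW: number, dstH: number,
): Float32Array {
  const dst = new Float32Array(dstW * dstH * channels);
  const scaleX = srcW / dstW;
  const scaleY = srcH / dstH;
  for (let y = 0; y < dstH; y++) {
    // Map the destination pixel centre back into source coordinates, clamped.
    const sy = Math.min(Math.max((y + 0.5) * scaleY - 0.5, 0), srcH - 1);
    const y0 = Math.floor(sy);
    const y1 = Math.min(y0 + 1, srcH - 1);
    const wy = sy - y0;
    for (let x = 0; x < dstW; x++) {
      const sx = Math.min(Math.max((x + 0.5) * scaleX - 0.5, 0), srcW - 1);
      const x0 = Math.floor(sx);
      const x1 = Math.min(x0 + 1, srcW - 1);
      const wx = sx - x0;
      for (let c = 0; c < channels; c++) {
        const p00 = src[(y0 * srcW + x0) * channels + c];
        const p01 = src[(y0 * srcW + x1) * channels + c];
        const p10 = src[(y1 * srcW + x0) * channels + c];
        const p11 = src[(y1 * srcW + x1) * channels + c];
        const top = p00 + (p01 - p00) * wx;
        const bottom = p10 + (p11 - p10) * wx;
        dst[(y * dstW + x) * channels + c] = top + (bottom - top) * wy;
      }
    }
  }
  return dst;
}

// Step 2: interleaved RGBA (HWC) -> planar RGB (CHW) with (v - 128) / 256.
// The alpha channel is dropped; a batch of 1 needs no extra data, so the
// result can be viewed as [1, 3, height, width].
function normalizeToBCHW(hwc: ArrayLike<number>, width: number, height: number): Float32Array {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      out[c * plane + i] = (hwc[i * 4 + c] - 128) / 256;
    }
  }
  return out;
}

// Step 4: scale the 0..1 mask to 0..255, resize it to the image size,
// and write it into the alpha byte of each RGBA pixel.
function maskToAlpha(
  mask: Float32Array, maskW: number, maskH: number,
  rgba: Uint8Array | Uint8ClampedArray, width: number, height: number,
): void {
  const scaled = Float32Array.from(mask, (v) => v * 255);
  const resized = resizeBilinear(scaled, maskW, maskH, 1, width, height);
  for (let i = 0; i < width * height; i++) {
    rgba[4 * i + 3] = Math.round(resized[i]);
  }
}
```

In a full pipeline, the input would go through resizeBilinear(rgba, W, H, 4, 1024, 1024) and then normalizeToBCHW before inference, and the model output would be passed to maskToAlpha with maskW = maskH = 1024.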
ONNX Runtime Web
WASM backend (default): WebAssembly SIMD + multithreading. WebGPU backend: enabled with device: 'gpu'. Falls back automatically to WASM if the GPU is unavailable.
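The fallback decision can be illustrated with a small feature check. This is a sketch of the general idea and not the library's actual logic: hasWebGPU, pickDevice, and the NavigatorLike type are hypothetical names, and the only real API relied on is the standard navigator.gpu.requestAdapter() call, which resolves to null when no adapter is available.

```typescript
type Device = 'gpu' | 'cpu';

// Minimal structural types so the sketch also compiles outside a browser.
interface GpuLike { requestAdapter(): Promise<unknown>; }
interface NavigatorLike { gpu?: GpuLike; }

// Cheap synchronous check: does the environment expose WebGPU at all?
function hasWebGPU(nav: NavigatorLike | undefined): boolean {
  return nav?.gpu != null;
}

// Full check: WebGPU must be exposed AND return a usable adapter.
async function pickDevice(nav: NavigatorLike | undefined): Promise<Device> {
  const gpu = nav?.gpu;
  if (gpu == null) return 'cpu';
  try {
    const adapter = await gpu.requestAdapter();
    return adapter ? 'gpu' : 'cpu';
  } catch {
    // Any failure while probing the GPU falls back to the WASM path.
    return 'cpu';
  }
}
```

In a browser, the result of pickDevice called with the global navigator could be supplied as the device option mentioned above. Since the library is described as falling back on its own, an explicit check like this is mainly useful for showing the user which backend is active.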
Model Loading
Models are hosted on the imgly CDN and split into chunks for parallel download. The chunks (up to ~176MB in total for the large model) are fetched concurrently via Promise.all.
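Parallel chunk loading can be sketched as follows. The names loadChunkedModel and concatChunks, and the injected fetchChunk callback, are assumptions made for illustration; the library's real chunk URLs and manifest format are not shown here. The point is that Promise.all starts every download at once and returns results in input order, so the chunks can be concatenated directly.

```typescript
// Join downloaded chunks, in order, into one contiguous model buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

// Download all chunks concurrently. Promise.all preserves the order of `urls`
// regardless of which request finishes first, and rejects if any chunk fails.
async function loadChunkedModel(
  urls: string[],
  fetchChunk: (url: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const chunks = await Promise.all(urls.map((url) => fetchChunk(url)));
  return concatChunks(chunks);
}
```

In a browser, fetchChunk would typically wrap fetch and return new Uint8Array(await response.arrayBuffer()). Injecting it keeps the sketch testable without network access.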
IS-Net Pipeline
Key Points
Convert input image to RGBA Uint8Array [H, W, 4] tensor
Resize to 1024×1024 bilinear → HWC→BCHW + normalize (mean=128, std=256)
IS-Net model outputs [1024, 1024, 1] float32 alpha mask (foreground=1.0, background=0.0)
Resize mask to original resolution → write directly to alpha channel
ONNX Runtime: WASM (default) or WebGPU backend — model chunks downloaded in parallel from CDN