background-removal-js: How Background Removal Actually Works in the Browser
IS-Net ONNX Model + WASM Inference + Canvas API: A Fully Client-Side Pipeline
Feed it an image, get a transparent PNG back. No server round-trip. Everything runs in the browser. ONNX Runtime's WASM backend executes a deep learning model client-side.
The pipeline is surprisingly simple when you open the code.
5-Stage Pipeline
Stage 1: Image Decode (codecs.ts)
Whether the input is a URL, Blob, or ArrayBuffer, it all converts to NdArray<Uint8Array> (RGBA, [H, W, 4]). Browser path: createImageBitmap -> Canvas -> getImageData. Node.js uses sharp.
Stage 2: Preprocessing (inference.ts + utils.ts)
Two operations: resize and normalize.
The image gets resized to 1024x1024 with bilinear interpolation. Aspect ratio is not preserved: the image is forced square. Then the HWC uint8 RGBA buffer converts to a BCHW float32 RGB tensor, dropping alpha. Normalization is (pixel - 128) / 256, so the value range becomes roughly [-0.5, 0.5].
This happens in tensorHWCtoBCHW: float32Data[j] = (imageBufferData[i] - 128) / 256, with the R, G, B channels placed in planar order via stride multiplication.
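To make the layout change concrete, here is a minimal sketch of the interleaved-to-planar conversion with the same (pixel - 128) / 256 normalization. The function name `hwcRgbaToBchwRgb` and its parameters are hypothetical; this is not the library's tensorHWCtoBCHW, only an illustration of the idea for a batch of one.

```typescript
// Hypothetical helper (not the library's code): converts interleaved RGBA
// bytes [H, W, 4] into planar float32 RGB [1, 3, H, W], dropping alpha.
function hwcRgbaToBchwRgb(
  rgba: Uint8Array | Uint8ClampedArray,
  width: number,
  height: number,
  mean = 128,
  std = 256,
): Float32Array {
  const plane = width * height; // elements per channel plane
  const out = new Float32Array(3 * plane);
  for (let p = 0; p < plane; p++) {
    const src = 4 * p; // interleaved source: 4 bytes per pixel
    out[p] = (rgba[src] - mean) / std; // R plane
    out[plane + p] = (rgba[src + 1] - mean) / std; // G plane
    out[2 * plane + p] = (rgba[src + 2] - mean) / std; // B plane
    // rgba[src + 3] (alpha) is intentionally ignored
  }
  return out;
}

// Two pixels in a 2x1 image: pure red, then mid-gray.
const demoTensor = hwcRgbaToBchwRgb(
  new Uint8Array([255, 0, 0, 255, 128, 128, 128, 255]),
  2,
  1,
);
```

The output holds all red values first, then all green, then all blue, which is the channel-first layout the model input [1, 3, 1024, 1024] expects.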
Stage 3: Model Inference (onnx.ts)
Runs IS-Net (Dichotomous Image Segmentation) via ONNX Runtime. Input tensor [1, 3, 1024, 1024], output [1024, 1024, 1] (foreground probability 0.0-1.0).
Two execution providers: wasm (default) and webgpu (when config.device is gpu). ort.env.wasm.numThreads = navigator.hardwareConcurrency enables multi-threaded WASM.
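The provider and thread choice can be sketched as a small pure function. Everything here is illustrative: `planRuntime`, `RuntimePlan`, and the `'cpu'` device value are assumptions, and the cross-origin-isolation check is an addition of this sketch, not something taken from the library. It is included because WASM threads depend on SharedArrayBuffer, which browsers expose only on cross-origin-isolated pages.

```typescript
// Hypothetical planning helper; only the provider names ('wasm', 'webgpu')
// and the idea of a thread count come from the text above.
type Device = 'cpu' | 'gpu'; // 'cpu' is an assumed name for the default

interface RuntimePlan {
  executionProviders: string[];
  numThreads: number;
}

function planRuntime(
  device: Device,
  hardwareConcurrency: number,
  crossOriginIsolated: boolean,
): RuntimePlan {
  // webgpu only when the caller asked for the GPU; wasm otherwise.
  const executionProviders = device === 'gpu' ? ['webgpu'] : ['wasm'];
  // Without cross-origin isolation there is no SharedArrayBuffer,
  // so plan for a single WASM thread.
  const numThreads = crossOriginIsolated ? Math.max(1, hardwareConcurrency) : 1;
  return { executionProviders, numThreads };
}

const isolatedPlan = planRuntime('cpu', 8, true);
const plainPagePlan = planRuntime('cpu', 8, false);
const gpuPlan = planRuntime('gpu', 8, true);
```

In a browser, the resulting values would typically be applied by assigning `numThreads` to `ort.env.wasm.numThreads` and passing `executionProviders` in the session options when creating the inference session.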
Three model sizes: isnet (FP32), isnet_fp16 (FP16, default), isnet_quint8 (INT8 quantized). Downloaded from CDN in chunks.
Stage 4: Post-processing (inference.ts)
Float32 mask converts to uint8 (x255), resizes back to original resolution. Then overwrites the original image's alpha channel:
outImageTensor.data[4 * i + 3] = alphamask.data[i]. This single line is the core of background removal. Mask value 0 (background) becomes transparent, 255 (foreground) stays.
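A minimal sketch of that alpha overwrite, assuming the mask has already been resized to the image's resolution. `applyMaskAsAlpha` is a hypothetical name, and unlike the line quoted above it takes the float mask directly and does the x255 conversion inline.

```typescript
// Hypothetical helper: writes a foreground-probability mask (0.0-1.0, one
// value per pixel) into the alpha channel of a copy of the RGBA buffer.
function applyMaskAsAlpha(
  rgba: Uint8ClampedArray,
  mask: Float32Array,
): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error('mask must have exactly one value per RGBA pixel');
  }
  const out = new Uint8ClampedArray(rgba); // copy; the input stays untouched
  for (let i = 0; i < mask.length; i++) {
    // Scale to 0-255; Uint8ClampedArray clamps anything out of range.
    out[4 * i + 3] = Math.round(mask[i] * 255);
  }
  return out;
}

// Three pixels: background, foreground, and a soft edge.
const demoPixels = new Uint8ClampedArray([
  10, 20, 30, 255,
  40, 50, 60, 255,
  70, 80, 90, 255,
]);
const cutout = applyMaskAsAlpha(demoPixels, new Float32Array([0, 1, 0.25]));
```

RGB values pass through unchanged; only every fourth byte is rewritten, which is why the background's color data is still present in the output, just fully transparent.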
Stage 5: Encode (codecs.ts)
RGBA tensor drawn to Canvas via putImageData, exported as PNG/JPEG/WebP Blob via convertToBlob.
Honest Take
The simplicity is nice, but there's a cost.
Forced 1024x1024 resize is brutal. Whether the original is 4000x3000 or 200x200, everything gets resampled to 1024x1024. No aspect ratio preservation. Tall portraits stretch horizontally, wide landscapes get squeezed horizontally. Distortion before model input can degrade mask quality. Aspect-preserving resize + padding should be the default.
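For comparison, the aspect-preserving alternative comes down to a few lines of arithmetic. This is a sketch of a common letterbox computation, not something the library provides; `letterbox` and its field names are hypothetical.

```typescript
// Hypothetical helper: fit the source inside a target x target square
// without distortion, and report how much padding centers it.
interface Letterbox {
  scale: number; // factor applied to both axes
  width: number; // resized content width
  height: number; // resized content height
  padX: number; // left padding in pixels
  padY: number; // top padding in pixels
}

function letterbox(srcW: number, srcH: number, target = 1024): Letterbox {
  const scale = target / Math.max(srcW, srcH); // longest side hits target
  const width = Math.max(1, Math.round(srcW * scale));
  const height = Math.max(1, Math.round(srcH * scale));
  const padX = Math.floor((target - width) / 2);
  const padY = Math.floor((target - height) / 2);
  return { scale, width, height, padX, padY };
}

const landscape = letterbox(4000, 3000); // 1024x768 content, 128px top pad
const portrait = letterbox(1000, 2000); // 512x1024 content, 256px left pad
```

The trade-off is extra bookkeeping after inference: the padded border has to be cropped out of the mask (using `padX`, `padY`, `width`, `height`) before it is resized back to the original resolution.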
No trimap, no alpha matting. Dedicated matting models (MODNet, RVM) predict a continuous alpha matte, and classic pipelines run trimap generation -> alpha matting, to handle semi-transparent edges like hair. This library relies on a single IS-Net pass. Result quality is 100% model-dependent. Hair boundary artifacts are a structural limitation.
Session memoization via lodash.memoize. Config JSON as key, no duplicate model loads. Reasonable. But there is no memory release mechanism: once loaded, the ONNX session holds memory until page close.
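One way the missing release path could look is a keyed cache that can also evict. This sketch swaps lodash.memoize for a plain Map so entries can be deleted; `SessionCache` and `Releasable` are hypothetical names, and the fake session in the demo stands in for a real one. ONNX Runtime's InferenceSession exposes an async release() method, which is what the `Releasable` shape is modeled on.

```typescript
// Hypothetical sketch: memoize session creation by key, with explicit release.
interface Releasable {
  release(): Promise<void>;
}

class SessionCache<S extends Releasable> {
  private readonly entries = new Map<string, Promise<S>>();
  private readonly create: (key: string) => Promise<S>;

  constructor(create: (key: string) => Promise<S>) {
    this.create = create;
  }

  // Returns the cached promise, so concurrent callers share one load.
  get(key: string): Promise<S> {
    let entry = this.entries.get(key);
    if (!entry) {
      entry = this.create(key);
      this.entries.set(key, entry);
      // Do not keep a failed load around; let the next call retry.
      entry.catch(() => {
        this.entries.delete(key);
      });
    }
    return entry;
  }

  // Drops the entry and frees the session. Returns false if nothing was cached.
  async release(key: string): Promise<boolean> {
    const entry = this.entries.get(key);
    if (!entry) return false;
    this.entries.delete(key);
    await (await entry).release();
    return true;
  }
}

// Demo with a fake session factory (stand-in for real session creation).
let created = 0;
const cache = new SessionCache(async (key: string) => {
  created += 1;
  return { key, release: async () => {} };
});
const configKey = JSON.stringify({ model: 'isnet_fp16' });
const first = cache.get(configKey);
const second = cache.get(configKey);
```

Calling `cache.release(configKey)` when the user leaves the editing view would let the page reclaim the model's memory without a reload, at the cost of a fresh load on the next use.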
AGPL-3.0 license. Commercial use requires source disclosure. Not MIT/Apache. This might be the biggest barrier for production use.
69 open issues, many unanswered. Two maintainers, last commit July 2025. Eight months of inactivity on main.
Still, the pattern of running ONNX Runtime WASM inference in the browser has real value. Run this in a Web Worker and you get background removal without blocking the UI, which is exactly the pattern from our web-workers-internals post.
Repository Status (2026-03)
6,973 stars, 453 forks. Most popular JS background removal library.
Two maintainers: DanielHauschildt (56 commits, imgly employee) and mirko314 (10 commits). Last main commit 2025-07-18. 69 open issues. 1 open PR. No GitHub Releases, npm only. AGPL-3.0. TypeScript 59% + JavaScript 31%. Latest npm version 1.5.7 (2024-11-27).
IS-Net Model Variants
| Config | Precision | Size | Speed |
|---|---|---|---|
| large | FP32 | ~176MB | Slow |
| medium (default) | FP16 | ~88MB | Normal |
| small | INT8 | ~44MB | Fast |
Key Points
Image input (URL/Blob/Buffer) -> createImageBitmap + Canvas to RGBA NdArray
1024x1024 resize (no aspect ratio) + HWC -> BCHW + (pixel-128)/256 normalization
ONNX Runtime (WASM/WebGPU) IS-Net inference -> [1024,1024,1] foreground probability mask
Mask to uint8 -> resize to original resolution -> overwrite alpha channel
Canvas putImageData -> convertToBlob for PNG/JPEG/WebP output