🖼️

background-removal-js — How AI Background Removal Works in the Browser

IS-Net model + ONNX Runtime Web + WebGPU — client-side segmentation without a server

import { removeBackground } from '@imgly/background-removal';
const blob = await removeBackground(imageUrl);

Quite a lot happens behind this one line.

Model: IS-Net

IS-Net (Intermediate Supervision Network) is a salient object detection model from the U-Net family, introduced in the DIS (Dichotomous Image Segmentation) paper. It is not SAM or MODNet.

It uses an encoder-decoder structure with intermediate supervision at each layer to produce precise binary masks. Its purpose is to separate "salient objects" (people, animals, objects) from the background.

Three model variants are available: large (~176MB, FP32), medium (~88MB, FP16, the default), and small (~44MB, INT8 quantized).
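The size and precision trade-off can be made concrete with a small selection helper. This is only a sketch: `VARIANTS` and `pickVariant` are hypothetical names, the sizes are the approximate figures quoted above, and the labels are not guaranteed to match the option values accepted by any particular library version.

```javascript
// Approximate sizes as quoted in the text (MB), ordered largest first.
// The names are illustrative labels, not confirmed library option values.
const VARIANTS = [
  { name: 'large', precision: 'FP32', sizeMB: 176 },
  { name: 'medium', precision: 'FP16', sizeMB: 88 },
  { name: 'small', precision: 'INT8', sizeMB: 44 },
];

// Hypothetical helper: return the largest variant that fits the download
// budget, falling back to the smallest variant when nothing fits.
function pickVariant(budgetMB) {
  const fitting = VARIANTS.filter((v) => v.sizeMB <= budgetMB);
  return fitting.length > 0 ? fitting[0] : VARIANTS[VARIANTS.length - 1];
}
```

A caller on a metered connection might use `pickVariant(50)` and get the INT8 variant, trading some mask precision for a download a quarter of the size.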

Preprocessing → Inference → Postprocessing

Step 1: Convert input to RGBA Uint8Array tensor [H, W, 4]
Step 2: Resize to 1024×1024 (bilinear, ignoring aspect ratio). HWC→BCHW with normalization: float32 = (uint8 - 128) / 256. Output shape: [1, 3, 1024, 1024]
Step 3: ONNX Runtime Web inference. Output: [1024, 1024, 1] float32 alpha mask (0.0-1.0)
Step 4: Multiply by 255 → uint8, resize to original resolution, write directly to alpha channel: data[4*i+3] = mask[i]
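The layout change and normalization in Step 2 can be sketched as a plain loop. This assumes the RGBA buffer has already been resized to the model's input size; `rgbaToNormalizedBCHW` is a hypothetical name, not a function from the library.

```javascript
// rgba: Uint8Array or Uint8ClampedArray of length H*W*4 in HWC order,
// assumed to be already resized to the model input size.
// Returns a Float32Array laid out as [1, 3, H, W] (batch of one).
function rgbaToNormalizedBCHW(rgba, width, height) {
  const plane = width * height;
  const out = new Float32Array(3 * plane);
  for (let i = 0; i < plane; i++) {
    for (let c = 0; c < 3; c++) {
      // Drop the alpha channel (index 3) and normalize as (uint8 - 128) / 256.
      out[c * plane + i] = (rgba[4 * i + c] - 128) / 256;
    }
  }
  return out;
}
```

Each colour channel ends up in its own contiguous plane, which is the layout the [1, 3, 1024, 1024] input shape describes.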

There is no separate alpha matting step: the IS-Net output is treated as precise enough to use directly.
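Writing the mask straight into the alpha channel can be sketched as follows. The function name `applyMaskToAlpha` is hypothetical, the mask is assumed to be already resized to the image's resolution, and rounding (rather than truncating) when converting to uint8 is an assumption.

```javascript
// mask: Float32Array of H*W values in [0, 1], already at image resolution.
// rgba: Uint8ClampedArray of H*W*4 (for example ImageData.data); modified in place.
function applyMaskToAlpha(rgba, mask) {
  for (let i = 0; i < mask.length; i++) {
    // Scale to 0..255; rounding here is an assumption, the text only says "multiply by 255".
    const a = Math.round(mask[i] * 255);
    rgba[4 * i + 3] = Math.min(255, Math.max(0, a));
  }
  return rgba;
}
```

The colour channels are left untouched; only every fourth byte changes, so background pixels simply become transparent.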

ONNX Runtime Web

WASM backend (default): WebAssembly with SIMD and multithreading. WebGPU backend: enabled with device: 'gpu'. If the GPU is unavailable, the library falls back automatically.
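An application that wants to decide on the backend itself could feature-detect WebGPU first. The sketch below is not the library's own fallback logic; `chooseDevice` is a hypothetical helper, and the `'cpu'` label for the WASM path is an assumption.

```javascript
// nav: a Navigator-like object. WebGPU-capable browsers expose `navigator.gpu`.
// Resolves to 'gpu' when an adapter is available, otherwise 'cpu'
// ('cpu' as the label for the WASM path is an assumption).
async function chooseDevice(nav) {
  if (!nav || !nav.gpu) return 'cpu';
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? 'gpu' : 'cpu';
  } catch {
    return 'cpu';
  }
}
```

In a browser this would be called as `await chooseDevice(navigator)`, and the result passed as the `device` setting.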

Model Loading

Models are hosted on the imgly CDN and split into chunks for parallel download. A model can be up to ~176MB; its chunks are fetched concurrently via Promise.all.
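The parallel chunk download can be approximated like this. It is a sketch under assumptions: `downloadChunks` and the list of chunk URLs are hypothetical, and the real manifest format used on the CDN is not shown here.

```javascript
// chunkUrls: ordered list of chunk URLs (hypothetical; the real manifest is not shown).
// fetchFn: defaults to the global fetch and can be swapped out for testing.
async function downloadChunks(chunkUrls, fetchFn = globalThis.fetch) {
  const buffers = await Promise.all(
    chunkUrls.map(async (url) => {
      const res = await fetchFn(url);
      if (!res.ok) throw new Error(`Failed to fetch ${url}: ${res.status}`);
      return new Uint8Array(await res.arrayBuffer());
    })
  );
  // Promise.all preserves input order, so the chunks concatenate correctly
  // even when they finish downloading out of order.
  const total = buffers.reduce((n, b) => n + b.length, 0);
  const model = new Uint8Array(total);
  let offset = 0;
  for (const b of buffers) {
    model.set(b, offset);
    offset += b.length;
  }
  return model;
}
```

Because `Promise.all` rejects as soon as any request fails, a single missing chunk aborts the whole load instead of producing a truncated model.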

IS-Net Pipeline

Input: URL/Blob → createImageBitmap → Canvas getImageData → RGBA [H, W, 4]

Preprocess: bilinear resize → 1024×1024 → HWC→BCHW → (pixel-128)/256 → [1, 3, 1024, 1024]

Inference: ONNX Runtime Web (WASM/WebGPU) → IS-Net → [1024, 1024, 1] alpha mask

Postprocess: ×255 → uint8 → resize to original → write alpha channel directly
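Both the preprocess and postprocess stages depend on a resize. A minimal bilinear resize for a single-channel mask might look like the sketch below; `resizeBilinear` is a hypothetical name, and the pixel-centre sampling convention is an assumption, since the library's exact convention is not stated here.

```javascript
// src: Float32Array of srcW*srcH values in row-major order.
// Returns a Float32Array of dstW*dstH bilinearly interpolated values.
function resizeBilinear(src, srcW, srcH, dstW, dstH) {
  const dst = new Float32Array(dstW * dstH);
  const scaleX = srcW / dstW;
  const scaleY = srcH / dstH;
  for (let y = 0; y < dstH; y++) {
    // Map the destination pixel centre back into source coordinates (clamped to the edges).
    const sy = Math.min(srcH - 1, Math.max(0, (y + 0.5) * scaleY - 0.5));
    const y0 = Math.floor(sy);
    const y1 = Math.min(srcH - 1, y0 + 1);
    const fy = sy - y0;
    for (let x = 0; x < dstW; x++) {
      const sx = Math.min(srcW - 1, Math.max(0, (x + 0.5) * scaleX - 0.5));
      const x0 = Math.floor(sx);
      const x1 = Math.min(srcW - 1, x0 + 1);
      const fx = sx - x0;
      // Interpolate horizontally on the two neighbouring rows, then vertically.
      const top = src[y0 * srcW + x0] * (1 - fx) + src[y0 * srcW + x1] * fx;
      const bottom = src[y1 * srcW + x0] * (1 - fx) + src[y1 * srcW + x1] * fx;
      dst[y * dstW + x] = top * (1 - fy) + bottom * fy;
    }
  }
  return dst;
}
```

Width and height are scaled independently, which matches the "ignoring aspect ratio" behaviour of the 1024×1024 resize: the mask is stretched back to the original shape on the way out.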

Key Points

1. Convert input image to RGBA Uint8Array [H, W, 4] tensor
2. Resize to 1024×1024 bilinear → HWC→BCHW + normalize (mean=128, std=256)
3. IS-Net model outputs [1024, 1024, 1] float32 alpha mask (foreground=1.0, background=0.0)
4. Resize mask to original resolution → write directly to alpha channel
5. ONNX Runtime: WASM (default) or WebGPU backend; model chunks downloaded in parallel from CDN

Use Cases

Profile photo editing: instant background removal in the browser, without a server

E-commerce product images: auto-replace the background with white

Video call virtual backgrounds: static image masking only, not real-time
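For the e-commerce case, replacing the background with white amounts to compositing the masked image over a white backdrop. The sketch below assumes non-premultiplied RGBA data with the mask already written into the alpha channel; `flattenOntoWhite` is a hypothetical helper, not part of the library.

```javascript
// rgba: Uint8ClampedArray of H*W*4, non-premultiplied, with the mask
// already stored in the alpha channel. Modified in place.
function flattenOntoWhite(rgba) {
  for (let i = 0; i < rgba.length; i += 4) {
    const a = rgba[i + 3] / 255;
    for (let c = 0; c < 3; c++) {
      // Standard "over" blend against a white (255) backdrop.
      rgba[i + c] = Math.round(rgba[i + c] * a + 255 * (1 - a));
    }
    rgba[i + 3] = 255; // the result is fully opaque
  }
  return rgba;
}
```

Fully transparent pixels become pure white, fully opaque pixels keep their colour, and soft mask edges blend smoothly between the two.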