
What Is a Vector?

A bundle of numbers with direction and magnitude — higher dimensions carry meaning

A vector is an ordered list of numbers.

2D vector — an arrow on a plane

[3, 4] means move 3 in x, 4 in y. Like saying "3km east, 4km north" on a map.

Distance between two vectors tells you "how far apart," angle tells you "how similar in direction." This angle-based similarity is cosine similarity.
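Both measures fit in a few lines of Python (a minimal sketch using only the standard library):

```python
import math

def euclidean_distance(a, b):
    # "How far apart": straight-line distance between the arrow tips.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    # "How similar in direction": cosine of the angle between the arrows.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

a = [3, 4]   # "3 east, 4 north"
b = [6, 8]   # same direction, twice as far
print(euclidean_distance(a, b))  # 5.0 -- far apart as points
print(cosine_similarity(a, b))   # 1.0 -- identical direction
```

Note how the two measures disagree: as points, [3, 4] and [6, 8] are 5 units apart, yet their cosine similarity is a perfect 1.0 because they point the same way.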

3D vector — a point in space

[3, 4, 2] is a point in 3D space. Just x, y plus a z-axis (height). Distance and angle calculations work the same as 2D.

Up to this point, humans can see vectors directly: 2D on paper, 3D with tools like Three.js that let you rotate the view.

High-dimensional vectors — the space of meaning

[0.023, -0.15, 0.41, ..., -0.33] — 1536 numbers. You can't see this anymore.

But the math doesn't change. The formula for distance between two points in 2D works identically in 1536D. More dimensions, same principle: close means similar, far means different.
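One function covers every dimension; only the length of the list changes (a sketch, with random stand-ins for real embeddings):

```python
import math
import random

def euclidean_distance(a, b):
    # The 2D formula sqrt((x1-x2)^2 + (y1-y2)^2), summed over
    # however many dimensions the vectors happen to have.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean_distance([3, 4], [0, 0]))  # 5.0 -- the classic 3-4-5 triangle

random.seed(42)
u = [random.random() for _ in range(1536)]  # stand-in for a 1536-d embedding
v = [random.random() for _ in range(1536)]
print(euclidean_distance(u, v))  # same function, 1536 dimensions
```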

Why 1536 dimensions in AI embeddings? Simple — 2-3 dimensions can express "cat and dog are similar" but not nuances like "Persian cats are luxurious while stray cats are rugged."

Visualizing high dimensions

You can't see 1536D directly, but you can "compress" it to 2D/3D.

t-SNE: Projects to 2D while keeping nearby points close; distances between far-apart clusters aren't meaningful. Great for spotting clusters.

UMAP: Faster than t-SNE, better at preserving global structure. Good for large datasets.

PCA: Keeps the 2-3 axes with the most information, discards the rest. Fastest but loses the most information.

If the compressed visualization shows "similar images clustered together," the embedding is working.
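Of the three, PCA is simple enough to sketch directly with NumPy's SVD; t-SNE and UMAP need dedicated libraries (e.g. scikit-learn's `TSNE`, the `umap-learn` package). A minimal sketch, with random data standing in for real embeddings:

```python
import numpy as np

def pca_2d(X):
    """Project rows of X (n_samples, n_dims) onto the 2 directions with the
    most variance -- the 'keep the 2 most informative axes' idea."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:2].T   # shape: (n_samples, 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1536))   # pretend these are 1536-d embeddings
X2 = pca_2d(X)
print(X2.shape)  # (100, 2) -- now plottable on a scatter chart
```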

Why it matters for RecSys

Almost every RecSys technique boils down to "computing distance between user vectors and item vectors." Understand vectors and you'll see that Matrix Factorization, Two-Tower, and embedding-based search all share the same foundation.
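As a toy illustration of that shared foundation (the vectors below are hypothetical, and the dot-product scoring mirrors what Matrix Factorization and Two-Tower models learn):

```python
import numpy as np

# Hypothetical learned vectors: one per user, one per item.
user = np.array([0.9, 0.1, 0.4])
items = {
    "action_movie":  np.array([0.8, 0.0, 0.3]),
    "romance_movie": np.array([0.1, 0.9, 0.2]),
    "documentary":   np.array([0.4, 0.2, 0.9]),
}

# Recommendation = rank items by user·item (higher score = better match).
scores = {name: float(user @ vec) for name, vec in items.items()}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

Swap the dot product for cosine similarity or a nearest-neighbor index and the same skeleton becomes embedding-based search.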

2D vector — interactive

Drag an arrowhead (●) to move a vector. The cosine similarity updates in real time.

A = [3, 4]
B = [4, 1]
์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„ 0.00

3D vector — rotate to view

Drag to rotate the space, scroll to zoom. You can examine the relationships among the three vectors in 3D.

๊ณ ์–‘์ด [0.8, 0.6, 0.3]
๊ฐ•์•„์ง€ [0.7, 0.5, 0.4]
์ž๋™์ฐจ [-0.3, 0.9, -0.2]
๊ณ ์–‘์ดโ†”๊ฐ•์•„์ง€: 0.97 | ๊ณ ์–‘์ดโ†”์ž๋™์ฐจ: 0.12

High dimensions → 2D compression (simulation)

Compressing 1536-dimensional vectors to 2D looks roughly like this: words with similar meanings cluster together.

animals
vehicles
food
emotions

How It Works

1. List numbers to get a vector (e.g., [3, 4])

2. Dimension = count of numbers (2D, 3D, 1536D...)

3. Measure similarity via distance/angle (cosine similarity)

4. Compress high dimensions to 2D/3D with t-SNE/UMAP

Pros

  • Math works identically regardless of dimension — understand 2D, same principles apply to 1536D
  • Cosine similarity alone can compare text, images, and user preferences

Cons

  • Higher dimensions lose intuitive understanding — visualization always loses information
  • Curse of dimensionality — in high dimensions all points tend to be similarly distant
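The curse of dimensionality is easy to observe numerically (a sketch with random points, not real embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_spread(dim, n=200):
    """Relative spread of pairwise distances among n random points:
    (max - min) / mean. A small spread means 'everything is about equally far'."""
    pts = rng.random((n, dim))
    sq = (pts ** 2).sum(axis=1)
    # Pairwise squared distances via |a-b|^2 = |a|^2 + |b|^2 - 2 a·b
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * pts @ pts.T, 0.0)
    d = np.sqrt(d2[np.triu_indices(n, k=1)])
    return (d.max() - d.min()) / d.mean()

print(distance_spread(2))     # large spread: near and far neighbors both exist
print(distance_spread(1000))  # small spread: distances bunch together
```

This is why high-dimensional nearest-neighbor search needs care (approximate indexes, good embeddings): the gap between the "closest" and "farthest" point shrinks as dimensions grow.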

Use Cases

2D/3D — physics simulation, games, maps
High-dim — AI embeddings, recommendation, search
Visualization — embedding quality check, cluster analysis