当前投影:身份层级 × 性别
越靠右表示越偏高层级身份,越靠上表示越偏女性特征。
After tokens enter the model, they don't remain as discrete indices but are mapped into vectors. The power of embedding lies in this: concepts with similar semantics, functions, or contexts tend to cluster together in high-dimensional space. Think of this page as a "word vector map."
Real embeddings typically have hundreds to thousands of dimensions. Here we select just two dimensions for projection. What you see is not the "complete space" but a slice of the high-dimensional semantic space onto a plane. Switching between different projections reveals how relative positions between words change.
越靠右表示越偏高层级身份,越靠上表示越偏女性特征。
Vector addition and subtraction aren't mysterious. Their meaning is typically: preserve certain relationships while replacing certain attributes. If a direction happens to encode "gender change" or "status hierarchy change," then moving along that direction may find a new corresponding word.
If two words frequently appear in similar contexts, the training process gradually moves them closer together.
Embeddings aren't learned to "look nice" but to better accomplish prediction tasks. Task requirements shape the spatial structure.
2D visualizations work well for teaching, but they're just a corner of the high-dimensional space. Real semantic relationships typically span many more dimensions.