This row is not a "final correct answer." It shows how strongly the current token draws on each other token in the sentence when updating its own representation.
Self-Attention Intuition
If each token looked only at itself, language understanding would be very fragile: pronouns would not know what they refer to, adjectives would not know what they modify, and actions would not know what they act on. The first intuition behind Self-Attention is that each token, when updating itself, first looks across the whole sentence to see which other tokens matter most to it.
Switch scenarios to see what the current token is "looking at"
Choose a linguistic scenario, then switch the current focus token. The page highlights the words in the sentence that receive the most attention and displays the corresponding row of attention weights. You can also use auto-play to make each token the "current query" in turn.
Put "who is looking at whom" into a small table
The table below collects the attention rows of several key tokens in the current scenario. Each row is a token issuing the query, and each column is a token being looked at. The highlighted row is the query you currently have selected.
How does this page relate to the QKV formula?
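The layout of such a table can be sketched in a few lines of Python. This is only an illustration: the tokens, the weights, and the `format_table` helper are hypothetical and not taken from the page. One row per query token, one column per attended token, and a marker on the selected query.

```python
# Hypothetical sentence tokens (the columns of the table).
tokens = ["the", "cat", "it"]

# Hypothetical attention rows, one per query token; each row sums to 1.
attention = {
    "cat": [0.2, 0.6, 0.2],
    "it": [0.1, 0.7, 0.2],
}

def format_table(tokens, attention, selected):
    """Render rows as queries and columns as attended tokens."""
    lines = ["query".ljust(8) + "".join(t.rjust(6) for t in tokens)]
    for query, row in attention.items():
        marker = ">" if query == selected else " "  # mark the selected query
        cells = "".join(f"{w:6.2f}" for w in row)
        lines.append(f"{marker} {query}".ljust(8) + cells)
    return "\n".join(lines)

table = format_table(tokens, attention, selected="it")
print(table)
```

Reading across a row answers "who is this token looking at," and reading down the left column answers "who is issuing the query."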
Current Focus Token
It corresponds to one Query row in the later formula. You can read it as: "I want to update myself now, so I need to find which parts of the context are relevant."
Other Attended Tokens
They correspond to the Key and Value positions in the later formula. Key is used to judge relevance, and Value supplies the actual information that gets carried over.
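To make the role of Value concrete, here is a minimal sketch of how one row of attention weights could blend Value vectors into an updated representation. The weights and vectors are made-up numbers, and `weighted_sum` is a hypothetical helper used only for illustration.

```python
def weighted_sum(weights, values):
    """Blend the Value vectors using one row of attention weights."""
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Hypothetical attention row for the query token "it" over ["the", "cat", "it"].
weights = [0.1, 0.7, 0.2]

# Hypothetical 2-dimensional Value vectors, one per token.
values = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]

# The updated representation leans toward the Value of the most-attended token.
updated_it = weighted_sum(weights, values)
print(updated_it)
```

Because "cat" carries the largest weight here, its Value vector contributes most to the result.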
This Row of Attention Bars
It is the intuitive version of one row of softmax(QK^T). The weights in a row sum to 1, and a higher value means the current token draws more on that position.
Next Page: Entering the Formula
If you accept that "a token looks back over the whole sentence and forms a row of weights," then Q, K, V and the score matrix will feel much more natural.
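As a rough sketch of where such a row could come from, the snippet below scores one query vector against every key and normalizes the scores with softmax. The vectors are hand-picked for illustration, and the function names are hypothetical. The division by the square root of the vector dimension is an assumption borrowed from the standard scaled dot-product formulation; the text above writes the unscaled form.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_row(query, keys):
    """Score one query against every key, then normalize with softmax."""
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)  # scaling is an assumption
        for key in keys
    ]
    return softmax(scores)

# Hypothetical 2-dimensional vectors, chosen by hand for illustration.
tokens = ["the", "cat", "it"]
keys = [[0.1, 0.0], [1.0, 1.0], [0.5, 0.5]]
query_for_it = [2.0, 2.0]

weights = attention_row(query_for_it, keys)
for token, w in zip(tokens, weights):
    print(f"{token:>4}: {w:.2f}")
```

With these numbers the key for "cat" aligns best with the query, so it receives the largest share of the row.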