Neural-Net Intuition, LLMs & AI Capstone

Watch: Attention Weights

You just implemented scaled dot-product attention. Here is what those matrices actually do. Attention lets each word look at every word in the sentence (including itself), score how much each one matters, and rebuild itself as a weighted blend of them.
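Written for a single word, the three steps above (score, normalize, blend) look like the following. The symbols are the conventional ones for scaled dot-product attention and are assumed here rather than taken from this lesson: q_i is the query vector of the word doing the looking, k_j and v_j are the key and value vectors of word j, and d_k is the key dimension.

```latex
\begin{aligned}
s_{ij} &= \frac{q_i \cdot k_j}{\sqrt{d_k}}
  && \text{(raw score: how much word } j \text{ matters to word } i\text{)} \\
\alpha_{ij} &= \frac{\exp(s_{ij})}{\sum_{m} \exp(s_{im})}
  && \text{(softmax: weights are positive and } \textstyle\sum_j \alpha_{ij} = 1\text{)} \\
z_i &= \sum_{j} \alpha_{ij}\, v_j
  && \text{(word } i \text{ rebuilt as a weighted blend of the values)}
\end{aligned}
```

Stacking every word's q_i, k_j and v_j into matrices Q, K and V gives the familiar compact form, softmax(QK^T / sqrt(d_k)) V.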

Press play. Watch the word "sat" attend across "the cat sat": the raw scores become softmax weights that sum to 1, the thickest line points to the word that matters most, and "sat" absorbs its context, mostly from "cat".
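The same walk-through can be reproduced by hand in a few lines of plain Python. This is a minimal sketch, not the code behind the animation: the 2-dimensional key, value and query vectors are made up for illustration and chosen only so that "sat" scores highest against "cat", and the helper names `softmax` and `attend` are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d_k = len(query)
    # Raw score per word: dot product of query and key, scaled by sqrt(d_k).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    # New representation: weighted blend of the value vectors.
    dim = len(values[0])
    blended = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, blended

# Illustrative vectors only (assumed, not learned by any model).
words = ["the", "cat", "sat"]
keys = [[0.1, 0.0], [2.0, 1.0], [1.0, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query_sat = [2.0, 1.0]  # the query vector for "sat"

weights, new_sat = attend(query_sat, keys, values)
for word, w in zip(words, weights):
    print(f"{word}: {w:.3f}")
print("new 'sat' vector:", [round(x, 3) for x in new_sat])
```

With these made-up vectors the largest weight lands on "cat", so the blended vector for "sat" sits closest to the value vector of "cat". Changing `query_sat` or the keys shifts where the weight goes, which is what the thickest line in the animation tracks.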

