The ML Around the LLM

Embedding-to-Prototype Routing

A classifier needs labelled training data. Sometimes you do not have any; you have only a handful of example queries for each intent. The cheapest router in that case is nearest-prototype: represent each intent by one vector (its "prototype", often the average embedding of a few examples), embed the incoming query, and route it to whichever prototype it points most directly at.
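As a minimal sketch of the "average embedding of a few examples" idea, the snippet below builds one prototype per intent by averaging component-wise. The helper name mean_vector, the intent names, and the 3-dimensional vectors are made up for illustration; real embeddings would have hundreds of dimensions. It uses only the standard library, and the same arithmetic applies to numpy arrays.

```python
def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors, component by component."""
    if not vectors:
        raise ValueError("need at least one example vector")
    n = len(vectors)
    # zip(*vectors) groups the i-th component of every vector together.
    return [sum(column) / n for column in zip(*vectors)]


# Hypothetical example embeddings (invented numbers, two examples per intent).
examples = {
    "billing": [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]],
    "support": [[0.0, 0.0, 1.0], [0.0, 0.5, 0.5]],
}

# One prototype per intent: the mean of that intent's example vectors.
prototypes = {intent: mean_vector(vecs) for intent, vecs in examples.items()}
print(prototypes)
```

Averaging is only one way to pick a prototype; with very few examples it is a reasonable default because it needs no training step.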

"Points most directly at" means cosine similarity: the cosine of the angle between two vectors, ignoring their length. It is the right measure for text because a long document and a short one about the same topic should still match.

cosine(a, b) = (a . b) / (|a| |b|)

A value near 1.0 means the vectors point the same way (same intent); near 0.0 means unrelated; negative values, down to -1.0, mean they point in opposing directions. The router picks the prototype with the highest cosine, and that score doubles as a confidence you can threshold later.
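To make the formula concrete, here is a small worked example, assuming toy 2-dimensional vectors chosen so the arithmetic is easy to check by hand. The function name raw_cosine is hypothetical, and it deliberately skips the zero-vector guard so that it mirrors the formula exactly.

```python
import math


def raw_cosine(a, b):
    """Cosine of the angle between a and b; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# b is a scaled copy of a: dot = 50, norms are 5 and 10, so 50 / 50.
same_direction = raw_cosine([3.0, 4.0], [6.0, 8.0])   # 1.0
# Perpendicular vectors have a dot product of zero.
orthogonal = raw_cosine([1.0, 0.0], [0.0, 5.0])       # 0.0
# Vectors pointing in opposite directions.
opposite = raw_cosine([1.0, 0.0], [-2.0, 0.0])        # -1.0

print(same_direction, orthogonal, opposite)
```

The first case shows why length is ignored: doubling a vector changes its norm but not its direction, so the score stays at 1.0.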

Honest framing: in production those vectors come from a trained embedding model (an API call, or a sentence-transformer). We cannot load one here: there are no model weights and no GPU in the browser. So in this lesson the vectors are given to you as plain numpy arrays. You are building the routing geometry, which is identical whether the numbers came from a real embedder or from a fixture. Real embeddings just fill in better numbers; the cosine-and-argmax logic is the part you ship.

Ties must be deterministic. If two prototypes score exactly the same cosine, picking "whichever happens to come first in the dict" is a latent bug: a dict iterates in insertion order, so the winner changes whenever the dict is built in a different order. Break ties by choosing the intent whose name sorts first alphabetically, so the same tie always resolves the same way.
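One way to get that behaviour, sketched here on a dict of already-computed scores, is to rank candidates by the pair (negated score, name). The function name pick_best and the intent names are assumptions for illustration only.

```python
def pick_best(scores):
    """Return the name with the highest score; exact ties go to the
    alphabetically-first name, regardless of dict insertion order."""
    # Negating the score makes min() prefer high scores; the name is the tie-break.
    return min(scores, key=lambda name: (-scores[name], name))


# The same tied scores inserted in two different orders.
order_one = {"refund": 0.8, "billing": 0.8, "cancel": 0.3}
order_two = {"cancel": 0.3, "billing": 0.8, "refund": 0.8}

print(pick_best(order_one))  # billing
print(pick_best(order_two))  # billing
```

By contrast, max(scores, key=scores.get) returns the first maximal key it encounters, so on order_one it would return "refund".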

Build two functions. cosine(a, b) returns the cosine similarity of two vectors and is zero-safe: if either vector has zero norm (all components are 0), it returns 0.0, never a NaN. route_by_prototype(query_vec, prototypes) takes the query vector and a dict {intent_name: prototype_vector} and returns a tuple (best_intent, best_score): the nearest prototype and its cosine, with the alphabetical tie-break. Press Run to route a few queries and see the confidence each one comes back with.
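For reference, one possible shape for the two functions is sketched below. It is not the only valid solution. It assumes vectors are plain sequences of floats (lists or 1-D numpy arrays both iterate this way) and that prototypes is non-empty; the prototype values are invented for the demo.

```python
import math


def cosine(a, b):
    """Cosine similarity of a and b; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # guard against 0/0, which would otherwise raise or give NaN
    return dot / (norm_a * norm_b)


def route_by_prototype(query_vec, prototypes):
    """Return (best_intent, best_score) for the highest-cosine prototype."""
    if not prototypes:
        raise ValueError("prototypes must not be empty")
    best_name, best_score = None, None
    # Visiting names in sorted order and replacing only on a strictly higher
    # score means an exact tie keeps the alphabetically-first name.
    for name in sorted(prototypes):
        score = cosine(query_vec, prototypes[name])
        if best_score is None or score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical 2-dimensional prototypes.
demo_prototypes = {"support": [0.0, 1.0], "billing": [1.0, 0.0]}

clear_win = route_by_prototype([2.0, 0.0], demo_prototypes)   # ("billing", 1.0)
exact_tie = route_by_prototype([1.0, 1.0], demo_prototypes)   # tie goes to "billing"
zero_query = route_by_prototype([0.0, 0.0], demo_prototypes)  # ("billing", 0.0)
print(clear_win, exact_tie, zero_query)
```

An all-zero query scores 0.0 against every prototype, so the tie-break decides the name; the 0.0 confidence is the signal to a later threshold that the route should not be trusted.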

Your turn

Write cosine(a, b) = a.b / (|a||b|), returning 0.0 when either vector has zero norm (no NaN). Then write route_by_prototype(query_vec, prototypes) where prototypes is a dict {name: vector}; return the tuple (best_name, best_score) for the highest cosine, breaking exact ties by the alphabetically-first name.
