The ML Around the LLM

Confidence Thresholding & Abstain

A classifier always returns an answer, even when it has no idea. Softmax outputs have to sum to 1, so a genuinely confused model still hands you a top label, just with a low probability behind it. In production that is dangerous: silently mis-routing a customer is worse than admitting you are unsure.
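To make the point concrete, here is a minimal sketch of softmax applied to nearly flat scores. The logits are made up for illustration, and `softmax` is a hand-written helper rather than a call into any particular library.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: the model barely prefers the first class.
confused = softmax([0.10, 0.05, 0.00])

# There is always a "winner", even when the margin is tiny.
top_index = max(range(len(confused)), key=lambda i: confused[i])
print(top_index, round(confused[top_index], 2))
```

The first class still wins, but with a probability of only about 0.35, which is barely above the 1/3 you would get from a uniform guess over three classes.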

The fix is a confidence threshold with an abstain option. You look at the probability of the top label; if it clears the bar you act on it, and if it does not you return a special "abstain" verdict that hands the request to a human (or escalates to a more expensive model). This is the human-handoff gate that production LLM systems commonly rely on.

probs  = [0.80, 0.12, 0.08]      # model is confident
labels = ["billing", "technical", "account"]
# top prob 0.80 >= 0.6 -> return "billing"

probs  = [0.40, 0.35, 0.25]      # model is unsure
# top prob 0.40 <  0.6 -> return "abstain"

The threshold is a dial you tune from data: raise it and you abstain more often (safer, more human work); lower it and you act more often (cheaper, riskier). The boundary itself matters, so decide up front whether a probability exactly equal to the threshold counts as confident. Here, a top probability >= threshold counts as confident, so the boundary value passes.
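One way to tune the dial is to sweep candidate thresholds over a labelled validation set and compare coverage (the share of requests answered) with accuracy on the answered requests. The sketch below assumes you already have, for each validation example, the top probability and whether the top label was correct. The function name `sweep_thresholds` and the data are hypothetical.

```python
def sweep_thresholds(examples, thresholds):
    """examples: non-empty list of (top_prob, is_correct) pairs.

    Returns one (threshold, coverage, accuracy_on_answered) row per threshold.
    accuracy_on_answered is None when every example abstains.
    """
    rows = []
    for t in thresholds:
        # Same boundary rule as the lesson: >= threshold counts as confident.
        answered = [ok for p, ok in examples if p >= t]
        coverage = len(answered) / len(examples)
        accuracy = sum(answered) / len(answered) if answered else None
        rows.append((t, coverage, accuracy))
    return rows

# Made-up validation results, for illustration only.
validation = [
    (0.95, True), (0.80, True), (0.62, True),
    (0.60, False), (0.45, False), (0.40, True),
]
rows = sweep_thresholds(validation, [0.4, 0.6, 0.9])
for threshold, coverage, accuracy in rows:
    print(threshold, coverage, accuracy)
```

On this toy data, raising the threshold lowers coverage and raises accuracy on the answered requests, which is the trade-off described above. Real data will not always be this tidy.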

Where do the probabilities come from? Usually model.predict_proba(...) from the classifier you built earlier, or the per-prototype cosine scores normalised into a distribution. Either way, the gating logic is the same small function, and it is pure and deterministic: exactly the kind of guardrail you can unit-test.
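The lesson does not say how the cosine scores are normalised. One common choice, shown here as an assumption, is a softmax with a temperature. The function name `scores_to_probs`, the `temperature` default, and the scores are all hypothetical.

```python
import math

def scores_to_probs(scores, temperature=0.1):
    """Softmax over similarity scores.

    temperature is an assumed tuning knob: smaller values sharpen the
    distribution, larger values flatten it.
    """
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

cosine_scores = [0.82, 0.41, 0.37]  # made-up per-prototype similarities
probs = scores_to_probs(cosine_scores)
print([round(p, 3) for p in probs])
```

Because the temperature changes how peaked the distribution is, a threshold tuned at one temperature does not carry over to another. Tune both against the same validation data.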

Build one function, classify_with_threshold(probs, labels, threshold): find the label with the highest probability; if that probability is >= threshold return the label, otherwise return the string "abstain". probs and labels are aligned lists. A confident case should pass through and an unsure one should abstain.

Your turn

Write classify_with_threshold(probs, labels, threshold) where probs and labels are aligned lists. Return the label whose probability is highest, but only if that probability is >= threshold; otherwise return the string "abstain". Treat a probability exactly equal to the threshold as confident.
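For reference, here is one possible solution sketch. The input check and the tie-breaking rule (the first of several equal maxima wins) are assumptions, since the task statement does not cover bad input or ties.

```python
def classify_with_threshold(probs, labels, threshold):
    """Return the top label if its probability is >= threshold, else "abstain"."""
    # Assumption: reject empty or misaligned inputs instead of guessing.
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")

    # Index of the highest probability; on ties, max() keeps the first one.
    top_index = max(range(len(probs)), key=lambda i: probs[i])

    # The boundary value passes: exactly equal to the threshold is confident.
    if probs[top_index] >= threshold:
        return labels[top_index]
    return "abstain"

labels = ["billing", "technical", "account"]
confident = classify_with_threshold([0.80, 0.12, 0.08], labels, 0.6)
unsure = classify_with_threshold([0.40, 0.35, 0.25], labels, 0.6)
boundary = classify_with_threshold([0.60, 0.30, 0.10], labels, 0.6)
print(confident, unsure, boundary)
```

The three calls cover the confident case, the unsure case, and the boundary case from the lesson. When probabilities are computed rather than typed in, floating-point rounding can leave a value marginally below the threshold, so it is worth unit-testing the boundary with realistic inputs.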
