Intent Classifier with Abstain
Welcome to your capstone build. Over four lessons you ship the layer that sits between your users and the model: an AI gateway that classifies what a request wants, routes it to the right handler, caches answers, guards against abuse, and accounts for every dollar. This is the unglamorous infrastructure that turns a clever demo into something a company can actually run, and it is exactly what a hiring manager means when a resume line reads "built an AI gateway: classify, route, cache, guardrail, cost-account."
One honest note on scope: you are building the application-layer logic of a gateway (classify, route, semantic-cache, guardrail, cost-account). A gateway you would actually run in production also needs a running service to host it, authentication, real distributed rate-limiting, and a deployment, which are beyond this in-browser capstone.
First stop: the front door. Every request that arrives is a blob of text, and before you can do anything smart with it you need to know what it is. Is this a weather question? An account problem? A billing complaint? You will build an intent classifier that learns from a handful of labelled examples and tags each new request. The catch that separates a toy from a real classifier: it must know when it does not know. A model that confidently labels gibberish is worse than useless downstream, so yours will abstain and return "unknown" when it is not sure.
You will write two functions:
train_intent(texts, labels) -> fit a TfidfVectorizer to turn the texts into vectors, then a LogisticRegression(random_state=0) on top. Return whatever bundle (a dict is fine) classify needs to make predictions. Seed the model so everyone gets identical results.
classify(model, text) -> return a (label, confidence) tuple. Get the class probabilities with predict_proba, take the top one, and if its confidence is below a threshold (about 0.5), return ("unknown", confidence) instead.
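As a rough illustration of the first function, here is a minimal sketch of train_intent. The dict keys "vec" and "clf" and the sample texts are assumptions made up for this example; any bundle that classify can unpack would work just as well.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def train_intent(texts, labels):
    """Fit a TF-IDF vectorizer, then a seeded logistic regression on top."""
    vec = TfidfVectorizer()
    X = vec.fit_transform(texts)  # sparse matrix, one row per training text
    clf = LogisticRegression(random_state=0)  # seeded for repeatable results
    clf.fit(X, labels)
    # Assumed bundle layout: the key names are arbitrary.
    return {"vec": vec, "clf": clf}


# Hypothetical training data, for illustration only.
texts = [
    "what is the weather today",
    "will it rain tomorrow",
    "reset my password",
    "I cannot log in",
    "I was charged twice",
    "refund my payment",
]
labels = ["weather", "weather", "account", "account", "billing", "billing"]

model = train_intent(texts, labels)
print(list(model["clf"].classes_))  # the labels the classifier learned
```

Returning the vectorizer alongside the classifier matters: classify has to transform new text with the same fitted vocabulary that the model was trained on.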
The trick for catching truly off-distribution input is cheap and robust. When TfidfVectorizer.transform sees text with none of the training vocabulary, the resulting row is all zeros (its .nnz is 0). That is your signal the model has nothing to go on, so abstain immediately:
X = vec.transform([text])
if X.nnz == 0:
    return ("unknown", 0.0)
probs = clf.predict_proba(X)[0]
idx = probs.argmax()
conf = float(probs[idx])
return (clf.classes_[idx], conf) if conf >= 0.5 else ("unknown", conf)
One real-world note on the model: on tiny training sets the default regularization makes every probability hover near chance, so legitimate questions never clear the threshold. Loosening regularization with C=10.0 (and bumping max_iter so it converges) sharpens the model enough that real paraphrases score confidently while junk still abstains. Press Run to train on a few labelled support questions and watch paraphrases get classified while gibberish backs off.
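To see the regularization effect for yourself, the sketch below fits the same tiny dataset twice and compares the top class probability for one query. The training texts, the query, and the helper name top_confidence are all hypothetical; the exact numbers depend on the data, so none are quoted here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data, for illustration only.
texts = [
    "what is the weather today",
    "will it rain tomorrow",
    "is it sunny outside",
    "reset my password",
    "I cannot log in",
    "change my email address",
    "I was charged twice",
    "refund my payment",
    "my invoice is wrong",
]
labels = ["weather"] * 3 + ["account"] * 3 + ["billing"] * 3

vec = TfidfVectorizer()
X = vec.fit_transform(texts)


def top_confidence(C, text):
    """Train with inverse regularization strength C; return the top probability."""
    clf = LogisticRegression(C=C, max_iter=1000, random_state=0)
    clf.fit(X, labels)
    return float(clf.predict_proba(vec.transform([text]))[0].max())


query = "refund my payment"
tight = top_confidence(1.0, query)   # scikit-learn's default C
loose = top_confidence(10.0, query)  # weaker regularization
print(f"C=1.0  -> top probability {tight:.2f}")
print(f"C=10.0 -> top probability {loose:.2f}")
```

In scikit-learn, C is the inverse of regularization strength, so a larger C lets the weights grow and the probabilities move further from the uniform 1/3 baseline.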
Write train_intent(texts, labels) using TfidfVectorizer + LogisticRegression(random_state=0) (loosen with C=10.0 and raise max_iter so small data converges), returning whatever bundle classify needs. Then write classify(model, text) returning (label, confidence): take the top predict_proba class, abstain to ("unknown", conf) below a ~0.5 threshold, and abstain immediately when the transformed row is all zeros (X.nnz == 0) so off-distribution or empty input cannot be labelled.
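One possible shape for the finished pair is sketched below, under stated assumptions: the bundle is a dict with keys "vec" and "clf", the cutoff lives in a constant named THRESHOLD, and the training texts are invented for the example. Treat it as a starting point, not the reference solution.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

THRESHOLD = 0.5  # assumed cutoff, matching the ~0.5 in the task


def train_intent(texts, labels):
    vec = TfidfVectorizer()
    # Looser regularization and more iterations for a tiny training set.
    clf = LogisticRegression(C=10.0, max_iter=1000, random_state=0)
    clf.fit(vec.fit_transform(texts), labels)
    return {"vec": vec, "clf": clf}  # assumed bundle layout


def classify(model, text):
    vec, clf = model["vec"], model["clf"]
    X = vec.transform([text])
    if X.nnz == 0:  # no known vocabulary at all: abstain immediately
        return ("unknown", 0.0)
    probs = clf.predict_proba(X)[0]
    idx = int(probs.argmax())
    conf = float(probs[idx])
    label = str(clf.classes_[idx])  # plain str instead of a NumPy string
    return (label, conf) if conf >= THRESHOLD else ("unknown", conf)


# Hypothetical training data, for illustration only.
texts = [
    "what is the weather today",
    "will it rain tomorrow",
    "is it sunny outside",
    "reset my password",
    "I cannot log in",
    "change my email address",
    "I was charged twice",
    "refund my payment",
    "my invoice is wrong",
]
labels = ["weather"] * 3 + ["account"] * 3 + ["billing"] * 3

model = train_intent(texts, labels)
for query in ["please refund my payment", "zzqx blorp", ""]:
    print(repr(query), "->", classify(model, query))
```

The last two queries share no tokens with the training vocabulary, so they take the X.nnz == 0 path and come back as ("unknown", 0.0). Whether the first query clears the threshold depends on the training data.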