Build a RAG Pipeline

Out-of-Scope Refusal Guardrail

A RAG system that always answers is dangerous. Ask your support bot "what is the weather in Tokyo?" and, if it answers anyway, it is no longer grounded in your documents; it is guessing. The fix is a refusal guardrail: before you build the prompt, check how well the retrieved chunks actually match the question. If the best match is weak, the documents probably do not cover the question, so the right move is to refuse and hand off rather than hallucinate.

The signal you already have is the top retrieval similarity. Pick a threshold. If the highest-scoring retrieved chunk is below it, refuse. This is a single, cheap check that sits between retrieval and generation, and it is one of the highest-value pieces of a production RAG system: it is the difference between "I don't know" and a confident wrong answer.

# Retrieved chunks with their similarity scores (illustrative values).
contexts = [
    {"text": "Refunds take five business days...", "score": 0.62},
    {"text": "Shipping takes 5 to 7 days...",      "score": 0.20},
]
should_refuse("how do refunds work?", contexts, 0.3)  # -> False: top score 0.62 is not below 0.3
should_refuse("what is the weather?", contexts, 0.7)  # -> True:  top score 0.62 is below 0.7
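To make "between retrieval and generation" concrete, here is a minimal sketch of where the check could sit in a pipeline. The names answer_or_hand_off, fake_retrieve, fake_generate, and REFUSAL_MESSAGE are hypothetical stand-ins invented for illustration, and the scores are made up; a real retriever and model call would replace the two stubs.

```python
REFUSAL_MESSAGE = "I can't answer that from the available documents. Handing off to a human."


def answer_or_hand_off(query, retrieve, generate, threshold):
    """Retrieve, check the top similarity, and only then generate."""
    contexts = retrieve(query)
    # default=None covers the case where nothing was retrieved at all.
    top_score = max((c["score"] for c in contexts), default=None)
    if top_score is None or top_score < threshold:
        # Weak or missing evidence: refuse before any prompt is built.
        return REFUSAL_MESSAGE
    return generate(query, contexts)


def fake_retrieve(query):
    # Hypothetical retriever stub with illustrative scores.
    if "refund" in query:
        return [
            {"text": "Refunds take five business days...", "score": 0.62},
            {"text": "Shipping takes 5 to 7 days...", "score": 0.20},
        ]
    return [{"text": "Shipping takes 5 to 7 days...", "score": 0.12}]


def fake_generate(query, contexts):
    # Hypothetical generator stub: a real system would call a model here.
    return "Answer based on: " + contexts[0]["text"]


in_scope = answer_or_hand_off("how do refunds work?", fake_retrieve, fake_generate, 0.3)
out_of_scope = answer_or_hand_off("what is the weather in Tokyo?", fake_retrieve, fake_generate, 0.3)
```

The generator is never called for the out-of-scope question, which is the point of placing the check before prompt construction.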

What to build. Write should_refuse(query, contexts, threshold) where contexts is a list of dicts each carrying a numeric "score" (the retrieval similarity). Return:

  • True when contexts is empty (nothing retrieved -> nothing to ground on).
  • True when the highest score among the contexts is strictly below threshold.
  • False otherwise (at least one chunk clears the bar).

The boundary is deliberate: a score exactly equal to the threshold is good enough to answer, so the refusal test must use strictly-less-than (<), not less-than-or-equal (<=). The query argument is part of the signature because a real system may also refuse on other signals, but in this exercise the decision rests on the scores alone. Press Run to watch in-domain and out-of-domain questions get different verdicts.
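The three rules and the boundary case can be sketched as follows. This is one possible implementation under the stated contract, not the only valid one; it assumes every context dict has a numeric "score" key, as the lesson describes.

```python
def should_refuse(query, contexts, threshold):
    """Return True when retrieval is too weak to ground an answer."""
    # query is unused here; it stays in the signature so other
    # refusal signals could be added later without changing callers.
    if not contexts:
        # Nothing retrieved, so there is nothing to ground on.
        return True
    top_score = max(c["score"] for c in contexts)
    # Strictly less than: a score equal to the threshold still answers.
    return top_score < threshold


contexts = [
    {"text": "Refunds take five business days...", "score": 0.62},
    {"text": "Shipping takes 5 to 7 days...", "score": 0.20},
]
print(should_refuse("how do refunds work?", contexts, 0.3))   # False
print(should_refuse("what is the weather?", contexts, 0.7))   # True
print(should_refuse("boundary case", contexts, 0.62))         # False
print(should_refuse("nothing retrieved", [], 0.3))            # True
```

Checking for an empty list first matters because max() raises ValueError on an empty sequence.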

Your turn

Write should_refuse(query, contexts, threshold) that returns True when contexts is empty or when the maximum "score" across the context dicts is strictly less than threshold, and False otherwise. A score exactly equal to the threshold should NOT refuse.
