Syllabus Lesson 209 of 239 · Project: Customer Support Copilot

Retrieve + Build the Grounded Prompt

Retrieval works -- you can find the right articles. Now comes the move that makes RAG trustworthy instead of a confident liar: grounding. You hand the language model the retrieved text and instruct it to answer only from that text, and to say so when the answer is not there. The model never sees your raw documents; it sees the prompt you assemble. So the prompt is the product. This lesson builds that prompt.

This is where hallucinations are won or lost. A support bot that invents a refund window is worse than useless -- it is a liability. The fix is a disciplined prompt with three parts: the retrieved chunks fenced and numbered so the model can cite them, the user's question, and an explicit escape hatch for when the context does not contain the answer.

Write one function:

assemble_prompt(query, retrieved_chunks)  # -> the full prompt string

It must, in order:

  • State the rule that the assistant answers using only the context below.
  • Include this line exactly, character for character: If the answer is not in the context, reply exactly: I do not know.
  • Open a CONTEXT: section, then list each chunk numbered and fenced, like [1] followed by the chunk text between triple quotes, then [2], and so on -- in the order given.
  • Add the user's question on a QUESTION: line.
  • End on an open answer turn -- a final ANSWER: line with nothing after it, so the model writes the answer next.
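The five steps above can be sketched as follows. This is one possible layout, not the lesson's reference solution: it assumes retrieved_chunks is a list of plain strings, that each fence sits on its own line, and that the instruction wording on the first line is free to choose. Only the escape line is fixed by the lesson. The sample chunks are made up for illustration.

```python
# The one line the lesson fixes character for character.
ESCAPE_LINE = "If the answer is not in the context, reply exactly: I do not know."


def assemble_prompt(query, retrieved_chunks):
    """Build the grounded prompt string (assumes chunks are plain strings)."""
    lines = [
        # Instruction wording is an assumption; the lesson only requires the rule.
        "Answer the question using only the context below.",
        ESCAPE_LINE,
        "",
        "CONTEXT:",
    ]
    # Number from 1, in the order given, so each chunk can be cited as [n].
    for number, chunk in enumerate(retrieved_chunks, start=1):
        lines.append(f"[{number}]")
        lines.append('"""')
        lines.append(chunk)
        lines.append('"""')
    lines.append("")
    lines.append(f"QUESTION: {query}")
    # Open answer turn: nothing follows the final label.
    lines.append("ANSWER:")
    return "\n".join(lines)


# Illustrative data only; these chunks are not from the course's knowledge base.
sample_chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Standard shipping takes 5 business days.",
]
print(assemble_prompt("How long is the refund window?", sample_chunks))
```

Building a list of lines and joining once keeps the ordering visible at a glance, and an empty chunk list simply skips the loop, leaving an empty CONTEXT: section.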

The numbering matters: it is what lets the model (and the next lesson) cite which source a claim came from. The exact escape line matters too: it tells the model to reply with the precise sentinel "I do not know.", and downstream code will look for that sentinel to detect a refusal, so a paraphrase in the prompt breaks refusal detection.
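To see why both details matter, here is a rough sketch of what downstream code could look like. The lesson does not show that code, so is_refusal and cited_sources are hypothetical helpers, and the exact-match rule (ignoring only surrounding whitespace) is an assumption.

```python
import re

# The reply the escape line asks the model to give.
SENTINEL = "I do not know."


def is_refusal(answer):
    """True when the reply is exactly the sentinel, ignoring surrounding whitespace."""
    return answer.strip() == SENTINEL


def cited_sources(answer):
    """Return the sorted, de-duplicated chunk numbers cited as [n] in the answer."""
    return sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})


print(is_refusal("I do not know."))   # exact sentinel
print(is_refusal("I don't know."))    # paraphrase, not detected
print(cited_sources("Refunds take 30 days [2]. See also [1] and [2]."))
```

A paraphrased reply slips past the exact-match check, which is the failure the fixed escape line is meant to prevent.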

Handle the missing-context case: if retrieved_chunks is empty, you still produce a valid prompt (instruction, the escape line, an empty context, the question, and the open answer turn). With nothing in the context to draw on, the model should then reply with the sentinel. The live LLM call is not part of this exercise; you are building the string it would receive. Press Run to print a grounded prompt and read it like the model would.
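Since there is no live model call here, one way to check your work is to inspect the string itself. The sketch below uses a hypothetical prompt_problems helper and a hand-written empty-context prompt; the instruction wording and blank-line placement are assumptions, not the lesson's required layout.

```python
ESCAPE_LINE = "If the answer is not in the context, reply exactly: I do not know."


def prompt_problems(prompt):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    # The escape line must appear as a whole line, unaltered.
    if ESCAPE_LINE not in prompt.splitlines():
        problems.append("escape line missing or altered")
    context_at = prompt.find("CONTEXT:")
    question_at = prompt.find("QUESTION:")
    if context_at == -1:
        problems.append("no CONTEXT: section")
    if question_at == -1:
        problems.append("no QUESTION: line")
    if -1 < question_at < context_at:
        problems.append("QUESTION: appears before CONTEXT:")
    # Trailing whitespace is tolerated here; that leniency is a design choice.
    if not prompt.rstrip().endswith("ANSWER:"):
        problems.append("prompt does not end on an open ANSWER: turn")
    return problems


# What an empty-context prompt might look like: no numbered chunks at all.
empty_context_prompt = "\n".join([
    "Answer the question using only the context below.",
    ESCAPE_LINE,
    "",
    "CONTEXT:",
    "",
    "QUESTION: Do you ship internationally?",
    "ANSWER:",
])
print(prompt_problems(empty_context_prompt))  # prints []
```

The same check can be pointed at the output of your own assemble_prompt, with or without chunks.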

Your turn

Write assemble_prompt(query, retrieved_chunks) that returns the full grounded prompt string, containing in order: an instruction to answer only from the context; the exact line "If the answer is not in the context, reply exactly: I do not know." (the text between the quotes, final period included); a CONTEXT: section with each chunk numbered [1], [2], ... and fenced in triple quotes, in the order given; a QUESTION: line; and a final open ANSWER: turn. An empty chunk list still yields a valid prompt.
