Build a RAG Pipeline

Assembling the Grounded Prompt

Retrieval found the right chunks; now you have to tell the model to use them and nothing else. This is the prompt-assembly step, and it is where grounding is enforced. A grounded prompt does three jobs: it presents the retrieved context clearly, it instructs the model to answer only from that context, and it gives the model an escape hatch so it can admit ignorance instead of inventing an answer.

Two details in the example prompt below carry most of the weight. Numbering the chunks matters for two reasons: it visually separates them so the model does not blur two passages together, and it lets the model cite "[2]" if you later ask for citations. The escape-hatch line is the other half of the refusal guardrail from the last lesson: even when retrieval is decent, the specific answer might not be in the text, and a model told exactly what to say when it does not know is far less likely to hallucinate.
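The numbering step on its own can be sketched in a few lines. This assumes Python, which the lesson does not name explicitly, and number_chunks is a hypothetical helper name used only for illustration, not part of the lesson's API.

```python
def number_chunks(contexts):
    """Prefix each chunk with a 1-based marker such as [1], one chunk per line.

    number_chunks is a hypothetical helper name, not part of the lesson's API.
    """
    return "\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(contexts, start=1)
    )


chunks = [
    "Refunds are issued within five business days...",
    "Shipping takes 5 to 7 business days...",
]
print(number_chunks(chunks))
```

Passing start=1 to enumerate keeps the markers aligned with how a reader would count the passages, so a citation like "[2]" points at the second chunk.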

System: Answer the question using only the context below.
If the answer is not in the context, reply exactly: I don't know.

Context:
[1] Refunds are issued within five business days...
[2] Shipping takes 5 to 7 business days...

Question: how long do refunds take?
Assistant:

That trailing Assistant: line is the open turn, the point where the model starts writing. (In your browser, the on-device WebLLM would take this exact string and produce the answer. You are graded only on the string you assemble, which is the deterministic part you actually engineer, never on the model's reply.)
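Because only the string is checked, the open-turn requirement can be tested without running any model. The lesson's actual grader is not shown, so the following is only a guess at how such a string-level check might look, assuming Python; ends_on_open_turn is a hypothetical name.

```python
def ends_on_open_turn(prompt, marker="Assistant:"):
    """Return True if the last non-empty line of the prompt is the marker.

    Hypothetical check for illustration; the lesson's real grader is not shown.
    """
    non_empty = [line.strip() for line in prompt.splitlines() if line.strip()]
    return bool(non_empty) and non_empty[-1] == marker


sample = "Question: how long do refunds take?\nAssistant:\n"
print(ends_on_open_turn(sample))                 # True
print(ends_on_open_turn(sample + "Five days."))  # False
```

The second call fails because text after the marker means the turn has already been answered, so the model would no longer be the one writing it.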

What to build. Write build_rag_prompt(query, contexts) where contexts is a list of chunk strings. The returned prompt must:

Contain each context chunk, numbered starting at 1 as [1] ..., [2] ... in order.
Contain the exact instruction to answer using only the context.
Contain the exact escape hatch "I don't know." with its final period (the model's instructed reply when the answer is absent).
Contain the query text.
End on an open answer turn (the last non-empty line is the Assistant: marker).
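One way to satisfy the requirements above is sketched here, assuming Python. The instruction and escape-hatch wording are copied from the lesson's example prompt; whether the grader expects exactly that wording is an assumption, and the constants INSTRUCTION and ESCAPE_HATCH are names introduced only for this sketch.

```python
# Wording copied from the lesson's example prompt; the constant names are
# illustrative and the exact text a grader expects is an assumption.
INSTRUCTION = "Answer the question using only the context below."
ESCAPE_HATCH = "I don't know."


def build_rag_prompt(query, contexts):
    """Assemble a grounded prompt that ends on an open Assistant: turn."""
    # Number the chunks from 1, preserving retrieval order.
    numbered = [f"[{i}] {chunk}" for i, chunk in enumerate(contexts, start=1)]
    lines = [
        f"System: {INSTRUCTION}",
        f"If the answer is not in the context, reply exactly: {ESCAPE_HATCH}",
        "",
        "Context:",
        *numbered,
        "",
        f"Question: {query}",
        "Assistant:",  # open turn: nothing may follow this line
    ]
    return "\n".join(lines)


prompt = build_rag_prompt(
    "how long do refunds take?",
    [
        "Refunds are issued within five business days...",
        "Shipping takes 5 to 7 business days...",
    ],
)
print(prompt)
```

Building the prompt as a list of lines and joining once keeps the layout visible in the code and avoids a stray trailing newline after the Assistant: marker.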

Press Run to assemble a prompt from two retrieved chunks and read it back.

Your turn

Write build_rag_prompt(query, contexts) that returns a grounded prompt string. It must number each context chunk as [1] ..., [2] ... in order, include an instruction to answer using only the context, include the exact escape hatch "I don't know." with its final period, include the query, and end on an open Assistant: turn.
