Guardrail Pipeline + Trace
This is the last piece, and it is the one that lets you put the gateway in front of real users: safety and observability. Before any text reaches the model you run it through a guard pipeline that blocks prompt-injection attempts and scrubs personal data. And after every call you record a trace event, then roll those events up into the summary an on-call engineer actually looks at. Together these two ideas, "guardrail" and "cost-account," complete the resume line you started this module with.
The guard. Write guard(request) that inspects the request text and returns a verdict dict like {"allowed": bool, "reason": ..., "text": ..., "redacted": [...]}. Two checks:
- Injection block: if the text matches a known attack pattern (think "ignore all previous instructions," "reveal the system prompt," "you are now..."), refuse it outright with allowed=False. A small list of case-insensitive regexes is enough here.
- PII redaction: otherwise, scrub emails and phone numbers out of the text, replacing them with [EMAIL] and [PHONE] placeholders, and return the cleaned text with allowed=True plus the list of kinds you found.
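The PII redaction check can be sketched as a small helper. This is a minimal sketch, not the lesson's reference solution: the pattern names EMAIL_RE and PHONE_RE, the kind labels "email" and "phone", and the (clean_text, kinds) return shape are all assumptions, and the regexes are deliberately loose rather than full validators.

```python
import re

# Assumed patterns: simple and permissive, not exhaustive validators.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d(?!\w)")

def redact(text):
    """Return (clean_text, kinds), where kinds lists the PII types found."""
    kinds = []
    # Replace emails first so digits inside an address are not read as a phone number.
    clean, n = EMAIL_RE.subn("[EMAIL]", text)
    if n:
        kinds.append("email")  # label name is an assumption
    clean, n = PHONE_RE.subn("[PHONE]", clean)
    if n:
        kinds.append("phone")  # label name is an assumption
    return clean, kinds
```

Using re.subn rather than re.sub gives the replacement count for free, which is enough to decide whether a kind belongs in the list. A loose phone pattern like this one will also catch other long digit runs, so a real gateway would likely want something stricter.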
import re

# Case-insensitive patterns for well-known injection phrasings.
INJECTION = re.compile(r"ignore (?:all )?previous instructions|reveal .*system prompt|you are now", re.I)

def guard(request):
    text = request["text"]
    # Refuse outright on an injection match; the text is returned unmodified.
    if INJECTION.search(text):
        return {"allowed": False, "reason": "injection", "text": text, "redacted": []}
    # redact() is the PII helper you write; it returns (clean_text, kinds).
    clean, kinds = redact(text)
    return {"allowed": True, "reason": "ok", "text": clean, "redacted": kinds}

The trace. Every request leaves a breadcrumb: which handler ran, whether it was blocked, and what it cost. Write aggregate_trace(events) over a list of dicts with keys handler, blocked, and cost, returning a summary with the total number of calls, the number of blocks, the total_cost of the calls that were actually allowed, and a by_handler breakdown of calls and cost per handler. This is a perfect job for a pandas groupby:
import pandas as pd

df = pd.DataFrame(events)
allowed = df[~df["blocked"]]  # keep only the calls that were not blocked
grouped = allowed.groupby("handler")["cost"].agg(["count", "sum"])  # calls and cost per handler

Blocked requests cost nothing and must not inflate total_cost, so sum cost only over the allowed rows. Handle the empty-trace case so a fresh gateway reports clean zeros instead of crashing: pd.DataFrame([]) has no "blocked" column, so the filter above would raise a KeyError.

Press Run to block an injection, redact a request full of contact details, let a benign one through, and print a rolled-up trace.
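To make the intended semantics concrete, here is a dependency-free sketch of the same roll-up. It swaps the pandas groupby for a plain loop, so it is a cross-check for your pandas version rather than a replacement for it. It assumes by_handler counts only allowed calls (mirroring the groupby over the allowed rows) and that each per-handler entry uses the keys "calls" and "cost"; both are assumptions, not something the lesson fixes.

```python
from collections import defaultdict

def aggregate_trace(events):
    """Plain-loop sketch: roll trace events up into a summary dict."""
    summary = {"calls": 0, "blocks": 0, "total_cost": 0.0, "by_handler": {}}
    # Assumed per-handler shape: {"calls": int, "cost": float}.
    per_handler = defaultdict(lambda: {"calls": 0, "cost": 0.0})

    for event in events:
        summary["calls"] += 1
        if event["blocked"]:
            summary["blocks"] += 1
            continue  # blocked requests add nothing to cost totals
        summary["total_cost"] += event["cost"]
        bucket = per_handler[event["handler"]]
        bucket["calls"] += 1
        bucket["cost"] += event["cost"]

    summary["by_handler"] = dict(per_handler)
    return summary
```

Because the loop never touches a column that might not exist, an empty events list falls straight through and returns the zeroed summary, which is the behaviour the empty-trace requirement asks for.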
Write guard(request) (request has a "text" key) returning a verdict dict. Block prompt-injection attempts (a small list of case-insensitive regexes) with allowed=False and reason="injection"; otherwise redact emails and phone numbers to [EMAIL]/[PHONE] placeholders and return allowed=True, the cleaned text, and the list of kinds found. Then write aggregate_trace(events) over dicts with handler, blocked, cost returning {"calls", "blocks", "total_cost", "by_handler"}, where total_cost sums only the allowed (non-blocked) calls and by_handler breaks calls/cost down per handler.