Context & Token Budgeting
Every model has a finite context window, and you pay per token. When a conversation grows past the budget, you cannot send all of it, so you decide what to keep. The standard policy: keep the system prompt (it sets the rules) plus the most recent messages that still fit, and drop the oldest.
You will build a simple budgeter. First, a rough token estimator: real tokenizers are model-specific, but a workable offline approximation for English text is about 4 characters per token. estimate_tokens(text) returns len(text) // 4, with two edge cases: 0 for empty text, and at least 1 for any non-empty text (so a 1- to 3-character string counts as 1 token rather than 0).
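The estimator described above could be sketched as follows. This is one possible reading of the rule, not the grader's reference solution; the type hints and the sample strings are illustrative assumptions.

```python
def estimate_tokens(text: str) -> int:
    """Rough offline estimate: about 4 characters per token."""
    if not text:
        return 0  # empty text costs nothing
    # Integer division rounds down, so clamp short non-empty strings to 1.
    return max(1, len(text) // 4)


print(estimate_tokens(""))        # 0
print(estimate_tokens("hi"))      # 1 (2 // 4 would be 0, so it is bumped to 1)
print(estimate_tokens("a" * 40))  # 10
```

The max(1, ...) clamp only matters for strings of 1 to 3 characters; from 4 characters on, len(text) // 4 is already at least 1.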
Then fit_context(system, messages, budget) decides what survives:
- The system prompt is counted first and always kept.
- Walk the messages from newest to oldest, keeping each one as long as the running total stays within the budget (a total exactly equal to the budget still fits); stop at the first message that would overflow.
- Return {"kept": [...], "dropped": n, "tokens": total}, with kept restored to chronological order.
# three 10-token messages, budget 20, no system
fit_context("", [m1, m2, m3], 20)
# {"kept": [m2, m3], "dropped": 1, "tokens": 20}  # newest two survive
The two subtleties the grader checks: a tighter budget keeps fewer (and the right) messages, and a non-empty system prompt eats into the budget so fewer messages fit. Reverse-iterate to pick newest-first, then reverse the kept list back so the conversation reads in order. Press Run to grade.
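A minimal sketch of the whole budgeter, under two assumptions the lesson does not state: each message is a plain string, and a running total exactly equal to the budget still counts as fitting (consistent with the 20-token example). The sample messages are made up for illustration.

```python
def estimate_tokens(text):
    """Rough offline estimate: about 4 characters per token."""
    if not text:
        return 0
    return max(1, len(text) // 4)


def fit_context(system, messages, budget):
    """Keep the system prompt plus the newest messages that fit the budget."""
    total = estimate_tokens(system)  # the system prompt is counted first, always kept
    kept = []
    for message in reversed(messages):  # newest first
        cost = estimate_tokens(message)
        if total + cost > budget:
            break  # stop at the first message that would overflow
        kept.append(message)
        total += cost
    kept.reverse()  # restore chronological order
    return {"kept": kept, "dropped": len(messages) - len(kept), "tokens": total}


# Three 40-character messages, so 10 estimated tokens each (assumed strings).
m1, m2, m3 = "a" * 40, "b" * 40, "c" * 40

no_system = fit_context("", [m1, m2, m3], 20)
print(no_system["kept"] == [m2, m3], no_system["dropped"], no_system["tokens"])
# True 1 20

with_system = fit_context("s" * 20, [m1, m2, m3], 20)  # system costs 5 tokens
print(with_system["kept"] == [m3], with_system["dropped"], with_system["tokens"])
# True 2 15
```

The second call shows the system prompt eating into the budget: with 5 tokens already spent, only the newest message fits. Using break rather than continue means an older, smaller message is never kept once a newer one has been dropped, so the kept messages always form a contiguous recent slice of the conversation.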
Write estimate_tokens(text) returning len(text) // 4 (0 for empty, at least 1 for non-empty). Write fit_context(system, messages, budget) that counts the system first, keeps the newest messages that fit (iterate in reverse), and returns {"kept", "dropped", "tokens"} with kept back in chronological order.