JSON Mode: Parse + Validate Against a Schema
The moment you want an LLM to feed a downstream system (a database, an API, a billing job), free-form prose is useless. You need structured output: the model returns JSON, and your code turns that JSON into a typed object it can trust. Many model APIs offer a "JSON mode" that makes the model emit a JSON string. But "it is valid JSON" and "it is the object I asked for" are two different things. The boundary between the model and your code is where you validate.
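To make the gap between "valid JSON" and "the object I asked for" concrete, here is a small sketch. The helper name looks_like_record and the sample strings are hypothetical, chosen only for illustration. It checks only the shape of the payload, not the field types.

```python
import json

# Each of these strings is valid JSON, but only the last one is the
# {"name": ..., "age": ...} record a caller would actually want.
candidates = [
    '[1, 2, 3]',                    # an array
    '42',                           # a bare number
    '{"name": "Ada"}',              # an object missing "age"
    '{"name": "Ada", "age": 36}',   # the record we asked for
]


def looks_like_record(text):
    """Hypothetical helper: True only for a JSON object with both fields."""
    obj = json.loads(text)  # succeeds for every candidate above
    return isinstance(obj, dict) and {"name", "age"} <= set(obj)


results = [looks_like_record(c) for c in candidates]
print(results)  # [False, False, False, True]
```

All four strings parse without error; only a check made after parsing tells them apart.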
In real code the model call looks like this (illustrative only; we do not call any model here):
resp = client.messages.create(model="...", messages=[...])
text = resp.content[0].text # a JSON string, hopefully
record = parse_and_validate(text, {"name": str, "age": int})
You are building parse_and_validate(text, schema). The schema is a dict mapping each required field to its Python type, like {"name": str, "age": int}. Your function should:
- json.loads the text into a Python object.
- Confirm it is a dict: a JSON array or bare number is not the record you asked for.
- For every field in the schema: check it is present (raise on a missing field) and of the right type (raise on a mismatch).
- Return a typed dict of just those fields.
The bool-is-int trap. In Python, bool is a subclass of int, so isinstance(True, int) is True. JSON true decodes to Python True. If your schema wants an int for age and the model sends "age": true, a naive isinstance check would wave it through. Guard against it: when the expected type is int, reject a bool value explicitly.
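The trap is easy to reproduce with the standard library alone. The following is a minimal sketch; the variable names are illustrative.

```python
import json

# JSON true decodes to Python True.
value = json.loads('{"age": true}')["age"]

print(type(value))             # <class 'bool'>
print(isinstance(value, int))  # True: the naive check waves it through

# A stricter check that excludes bool when an int is expected.
is_real_int = isinstance(value, int) and not isinstance(value, bool)
print(is_real_int)             # False
```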
if expected is int and isinstance(value, bool):
    raise ValueError("age must be int, got bool")
Raise a clear exception (a ValueError with a message naming the bad field) so a caller can log exactly what the model got wrong. Build it so two different well-formed payloads both pass, and a missing field, a wrong type, and a sneaky bool each raise.
Write parse_and_validate(text, schema) where schema is a {field: type} dict. json.loads the text, confirm it is a dict, then for each schema field check it is present and of the right type, returning the typed dict. Missing fields and wrong types must raise. Reject a JSON true where an int is required (the bool-is-int trap).
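One possible shape for the function is sketched below. It is not the only valid solution. The error message wording, the choice to raise ValueError for a non-dict payload, and the sample payloads are all assumptions made for illustration.

```python
import json


def parse_and_validate(text, schema):
    """Parse text as JSON and check it against a {field: type} schema."""
    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad JSON.
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError(f"expected a JSON object, got {type(obj).__name__}")

    record = {}
    for field, expected in schema.items():
        if field not in obj:
            raise ValueError(f"missing field: {field}")
        value = obj[field]
        # bool is a subclass of int, so reject it explicitly for int fields.
        if expected is int and isinstance(value, bool):
            raise ValueError(f"{field} must be int, got bool")
        if not isinstance(value, expected):
            raise ValueError(
                f"{field} must be {expected.__name__}, got {type(value).__name__}"
            )
        record[field] = value
    return record  # only the schema's fields; extra keys are dropped


schema = {"name": str, "age": int}

# Two different well-formed payloads both pass.
print(parse_and_validate('{"name": "Ada", "age": 36}', schema))
print(parse_and_validate('{"name": "Bo", "age": 7, "extra": 1}', schema))

# A missing field, a wrong type, a sneaky bool, and a non-object each raise.
bad_payloads = [
    '{"name": "Ada"}',
    '{"name": "Ada", "age": "36"}',
    '{"name": "Ada", "age": true}',
    '[1, 2]',
]
for bad in bad_payloads:
    try:
        parse_and_validate(bad, schema)
    except ValueError as err:
        print("rejected:", err)
```

Every failure here is a ValueError, so a caller can catch one exception type and log the message. Building a fresh record dict, instead of returning the parsed object, is what drops any keys the schema did not ask for.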