Structured Outputs & Function/Tool Calling

JSON Mode: Parse + Validate Against a Schema

The moment you want an LLM to feed a downstream system (a database, an API, a billing job), free-form prose stops being useful. You need structured output: the model returns JSON, and your code turns that JSON into a typed object it can trust. Many model APIs offer a "JSON mode" that constrains the model to emit a JSON string. But "it is valid JSON" and "it is the object I asked for" are two different things. The boundary between the model and your code is where you validate.
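To make that gap concrete, here is a small sketch using only the standard library. The payload strings are invented for illustration; all three parse as valid JSON, but only the first is the record a {"name": str, "age": int} schema would describe.

```python
import json

# Three strings that all parse as valid JSON (payloads invented for illustration).
payloads = [
    '{"name": "Ada", "age": 36}',  # the record we asked for
    '[1, 2, 3]',                    # valid JSON, but an array
    '{"name": "Ada"}',              # a JSON object, but "age" is missing
]

results = []
for text in payloads:
    obj = json.loads(text)  # succeeds for all three strings
    # Rough presence check only; type checks come later in the lesson.
    is_record = isinstance(obj, dict) and "name" in obj and "age" in obj
    results.append((type(obj).__name__, is_record))

print(results)
```

Parsing succeeds every time, so a successful json.loads alone tells you nothing about whether the fields you need are there.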

In real code the model call looks like this (illustrative only; we do not call any model here):

resp = client.messages.create(model="...", messages=[...])
text = resp.content[0].text          # a JSON string, hopefully
record = parse_and_validate(text, {"name": str, "age": int})

You are building parse_and_validate(text, schema). The schema is a dict mapping each required field to its Python type, like {"name": str, "age": int}. Your function should:

  • json.loads the text into a Python object.
  • Confirm it is a dict: a JSON array or bare number is not the record you asked for.
  • For every field in the schema: check it is present (raise on a missing field) and of the right type (raise on a mismatch).
  • Return a typed dict of just those fields.
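The four steps above can be sketched roughly as follows. This is one possible shape, not the reference solution: the exact error messages are assumptions, and the bool check inside the loop anticipates the trap described in the next paragraph.

```python
import json


def parse_and_validate(text, schema):
    """Parse a JSON string and check it against a {field: type} schema."""
    # Step 1: parse. Malformed JSON raises json.JSONDecodeError,
    # which is itself a subclass of ValueError.
    obj = json.loads(text)

    # Step 2: the top-level value must be a JSON object (a Python dict).
    if not isinstance(obj, dict):
        raise ValueError(f"expected a JSON object, got {type(obj).__name__}")

    # Step 3: every schema field must be present and of the right type.
    record = {}
    for field, expected in schema.items():
        if field not in obj:
            raise ValueError(f"missing field: {field}")
        value = obj[field]
        # bool is a subclass of int, so reject it explicitly for int fields.
        if expected is int and isinstance(value, bool):
            raise ValueError(f"{field} must be int, got bool")
        if not isinstance(value, expected):
            raise ValueError(
                f"{field} must be {expected.__name__}, got {type(value).__name__}"
            )
        record[field] = value

    # Step 4: return only the schema fields; extra keys are dropped.
    return record


print(parse_and_validate('{"name": "Ada", "age": 36, "extra": 1}',
                         {"name": str, "age": int}))
```

Copying values into a fresh dict, rather than returning the parsed object, is what keeps unrequested keys from leaking downstream.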

The bool-is-int trap. In Python, bool is a subclass of int, so isinstance(True, int) is True. JSON true decodes to Python True. If your schema wants an int for age and the model sends "age": true, a naive isinstance check would wave it through. Guard against it: when the expected type is int, reject a bool value explicitly.

if expected is int and isinstance(value, bool):
    raise ValueError("age must be int, got bool")
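A quick way to see the trap for yourself, using only the standard library. The variable names here are illustrative.

```python
import json

value = json.loads("true")  # JSON true decodes to Python True

# Naive check: passes, because bool is a subclass of int.
naive_ok = isinstance(value, int)

# Guarded check: an int that is not a bool.
guarded_ok = isinstance(value, int) and not isinstance(value, bool)

print(type(value).__name__, naive_ok, guarded_ok)  # bool True False
```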

Raise a clear exception (a ValueError with a message naming the bad field) so a caller can log exactly what the model got wrong. Build it so two different well-formed payloads both pass, and a missing field, a wrong type, and a sneaky bool each raise.
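One way to check those five cases is a small harness that takes the function under test as an argument. Everything below is a hypothetical sketch: naive_parse is a deliberately incomplete stand-in with no bool guard, included only so the harness has something to catch, and the payloads are invented.

```python
import json


def naive_parse(text, schema):
    # Hypothetical stand-in with NO bool guard, used only to exercise the harness.
    obj = json.loads(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    for field, expected in schema.items():
        if field not in obj:
            raise ValueError(f"missing field: {field}")
        if not isinstance(obj[field], expected):
            raise ValueError(f"{field} has the wrong type")
    return {field: obj[field] for field in schema}


SCHEMA = {"name": str, "age": int}

# (label, payload, should_raise); payloads are invented for illustration.
CASES = [
    ("well-formed A", '{"name": "Ada", "age": 36}', False),
    ("well-formed B", '{"name": "Bo", "age": 7}', False),
    ("missing field", '{"name": "Ada"}', True),
    ("wrong type", '{"name": "Ada", "age": "36"}', True),
    ("sneaky bool", '{"name": "Ada", "age": true}', True),
]


def failing_cases(fn):
    """Return the labels of cases where fn did not behave as expected."""
    failures = []
    for label, payload, should_raise in CASES:
        try:
            fn(payload, SCHEMA)
            raised = False
        except ValueError:
            raised = True
        if raised != should_raise:
            failures.append(label)
    return failures


print(failing_cases(naive_parse))  # ['sneaky bool']
```

Passing your own parse_and_validate to failing_cases should give an empty list once the bool guard is in place.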

Your turn

Write parse_and_validate(text, schema) where schema is a {field: type} dict. json.loads the text, confirm it is a dict, then for each schema field check it is present and of the right type, returning the typed dict. Missing fields and wrong types must raise. Reject a JSON true where an int is required (the bool-is-int trap).
