Syllabus Lesson 134 of 239 · Structured Outputs & Function/Tool Calling
Structured Outputs & Function/Tool Calling

Repairing Malformed Model JSON

Models are not perfect JSON emitters. Even in JSON mode they wrap output in ```json fences, add a chatty preamble ("Sure! Here is the data:"), leave a trailing comma, or use single quotes. A bare json.loads throws on all of these, and in production that is a failed request. A small repair step recovers most of them cheaply before you give up and re-prompt the model.

The shape of a robust path looks like this:

text = call_model(prompt)        # may be fenced / prose-wrapped / sloppy
try:
    data = json.loads(text)
except ValueError:
    data = repair_json(text)     # one cheap repair attempt before re-prompting

You are building repair_json(text). Walk it through these steps, in order, and stop as soon as json.loads succeeds:

  • Strip a code fence. If the whole thing is wrapped in ``` (optionally ```json), keep only the inside. A regex like ^```[a-zA-Z]*\s*(.*?)\s*```$ with re.DOTALL does it.
  • Cut surrounding prose. Find the first { or [ and the matching last } or ], and slice out just that span. That drops "Sure! Here is..." before and any trailing chatter after.
  • Try as-is. Many strings are valid now -> return early.
  • Fix common breakage: convert single quotes to double quotes, and remove trailing commas with re.sub(r",(\s*[}\]])", r"\1", s). Then json.loads again.
  • If it still will not parse, raise a ValueError -> never return junk. An unrepairable string is a signal to re-prompt, not to guess.

Note the honest limits: a blunt single-quote swap also rewrites apostrophes inside string values, so a real repairer is more careful. For model output, which rarely contains prose values with apostrophes, this is a pragmatic first line of defense. Make sure clean JSON passes untouched, true/false survive as Python True/False, and a string with no recoverable JSON raises.

Your turn

Write repair_json(text) that strips a ```/```json code fence, cuts any prose before the first {/[ and after the matching last }/], then tries json.loads; on failure it converts single quotes to double quotes and removes trailing commas and tries again; if it still will not parse it raises a ValueError. Clean JSON must pass through unchanged.

Spotted a problem in this lesson? Report it

Code · runs in your browser
Output