Repairing Malformed Model JSON
Models are not perfect JSON emitters. Even in JSON mode they can wrap output in ```json fences, add a chatty preamble ("Sure! Here is the data:"), leave a trailing comma, or use single quotes. A bare json.loads raises json.JSONDecodeError (a subclass of ValueError) on all of these, and in production that is a failed request. A small repair step recovers most of them cheaply before you give up and re-prompt the model.
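To make those failure modes concrete, here is a small sketch that feeds one example of each to json.loads. The sample strings and the names samples, failures, and FENCE are made up for illustration; they are not taken from any real model output.

```python
import json

FENCE = "`" * 3  # three backticks, built here to keep the sample readable

# Hypothetical examples of the four failure modes described above.
samples = {
    "fenced": f'{FENCE}json\n{{"ok": true}}\n{FENCE}',
    "preamble": 'Sure! Here is the data: {"ok": true}',
    "trailing comma": '{"ok": true,}',
    "single quotes": "{'ok': true}",
}

failures = {}
for name, text in samples.items():
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:  # JSONDecodeError subclasses ValueError
        failures[name] = exc.msg

# Every sample ends up in failures; none of them parses as-is.
for name, msg in failures.items():
    print(f"{name}: {msg}")
```

The exact error messages vary between Python versions, so the sketch only records that each sample fails.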
The shape of a robust path looks like this:
    import json

    text = call_model(prompt)  # may be fenced / prose-wrapped / sloppy
    try:
        data = json.loads(text)
    except ValueError:
        data = repair_json(text)  # one cheap repair attempt before re-prompting

You are building repair_json(text). Walk it through these steps, in order, and stop as soon as json.loads succeeds:
- Strip a code fence. If the whole thing is wrapped in ``` (optionally ```json), keep only the inside. A regex like ^```[a-zA-Z]*\s*(.*?)\s*```$ with re.DOTALL does it.
- Cut surrounding prose. Find the first { or [ and the matching last } or ], and slice out just that span. That drops "Sure! Here is..." before and any trailing chatter after.
- Try as-is. Many strings are valid now, so return early.
- Fix common breakage: convert single quotes to double quotes, and remove trailing commas with re.sub(r",(\s*[}\]])", r"\1", s). Then try json.loads again.
- If it still will not parse, raise a ValueError; never return junk. An unrepairable string is a signal to re-prompt, not to guess.
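The steps above can be sketched as follows. This is one possible reading, not a reference solution: the names FENCE_RE and TRAILING_COMMA_RE are made up here, the fence pattern writes the three backticks as `{3} (equivalent to the regex in step 1), and "matching last" is assumed to mean the last closer of the same bracket type as the first opener.

```python
import json
import re

# `{3} means three backticks; same pattern as the fence regex in step 1.
FENCE_RE = re.compile(r"^`{3}[a-zA-Z]*\s*(.*?)\s*`{3}$", re.DOTALL)
TRAILING_COMMA_RE = re.compile(r",(\s*[}\]])")


def repair_json(text):
    """Best-effort repair of model output; raises ValueError if it fails."""
    s = text.strip()

    # 1. Strip a code fence if the whole string is wrapped in one.
    match = FENCE_RE.match(s)
    if match:
        s = match.group(1)

    # 2. Cut surrounding prose: keep the span from the first opener to the
    #    last closer of the same bracket type (an assumption, see lead-in).
    starts = [i for i in (s.find("{"), s.find("[")) if i != -1]
    if starts:
        start = min(starts)
        closer = "}" if s[start] == "{" else "]"
        end = s.rfind(closer)
        if end > start:
            s = s[start:end + 1]

    # 3. Try as-is, so clean JSON is never touched by the blunt fixes.
    try:
        return json.loads(s)
    except ValueError:
        pass

    # 4. Blunt fixes: single -> double quotes, then drop trailing commas.
    fixed = s.replace("'", '"')
    fixed = TRAILING_COMMA_RE.sub(r"\1", fixed)

    # 5. Last attempt. json.loads raises JSONDecodeError, a ValueError
    #    subclass, so an unrepairable string surfaces as ValueError.
    return json.loads(fixed)
```

Because step 3 returns before any rewriting happens, valid JSON reaches the caller exactly as json.loads would have parsed it.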
Note the honest limits: a blunt single-quote swap also rewrites apostrophes inside string values, so a real repairer is more careful. For model output, which rarely contains prose values with apostrophes, this is a pragmatic first line of defense. Make sure clean JSON passes untouched, true/false survive as Python True/False, and a string with no recoverable JSON raises.
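The apostrophe limit is easy to reproduce. The sketch below uses a made-up string whose only real defect is a trailing comma; the names raw, blunt, blunt_ok, and careful are illustrative. Trying comma removal on its own is shown only as one possible refinement, not as part of the steps above.

```python
import json
import re

TRAILING_COMMA = r",(\s*[}\]])"

# Hypothetical output: valid double-quoted JSON except for the trailing comma.
raw = '{"note": "it\'s fine",}'

# Blunt path: swap quotes first, then drop trailing commas.
blunt = re.sub(TRAILING_COMMA, r"\1", raw.replace("'", '"'))
# blunt is now {"note": "it"s fine"}: the apostrophe became a stray quote.
try:
    json.loads(blunt)
    blunt_ok = True
except ValueError:
    blunt_ok = False  # unrepairable this way, so the caller would re-prompt

# Comma removal alone leaves the apostrophe intact and parses cleanly.
careful = json.loads(re.sub(TRAILING_COMMA, r"\1", raw))
```

Here the blunt path fails loudly rather than returning junk, which is the behavior the steps ask for, while the comma-only attempt recovers the value.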
Write repair_json(text) that strips a ```/```json code fence, cuts any prose before the first {/[ and after the matching last }/], then tries json.loads; on failure it converts single quotes to double quotes and removes trailing commas and tries again; if it still will not parse it raises a ValueError. Clean JSON must pass through unchanged.