Prompt Engineering for AI Engineers · Lesson 126 of 239

Self-Consistency: Sample, Parse, Vote

A single model sample can be wrong by chance. Self-consistency draws several samples for the same question, parses an answer out of each, and lets them vote. The majority answer is usually more reliable than any single run. Sampling is the model's job; parsing and voting are yours, and that part is fully deterministic.

Build two functions:

  • extract_answer(text) returns the LAST integer mentioned in the text (the model's final number after reasoning), or None if there is no integer. Find integers with a regex over the string, allowing for a leading minus sign.
  • self_consistency(samples) takes a list of sample strings, extracts an answer from each, drops the ones that parsed to None, and returns the majority value via a Counter. Break ties toward the smallest value so the result is deterministic.
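
The extraction step could look something like the sketch below. The pattern `-?\d+` is an assumption about what counts as an integer here (an optional minus sign followed by digits). Under that assumption, the hyphen in text like "3-4" is read as a minus sign, and a decimal like "3.5" is split into 3 and 5, so a real grader may need a stricter pattern.

```python
import re

# Assumed pattern: optional leading minus, then one or more digits.
_INT_PATTERN = re.compile(r"-?\d+")

def extract_answer(text):
    """Return the last integer found in text, or None if there is none."""
    matches = _INT_PATTERN.findall(text)
    if not matches:
        return None
    # The final match is taken as the model's answer after its reasoning.
    return int(matches[-1])

print(extract_answer("2 + 2 = 4, so the answer is 4"))  # 4
print(extract_answer("the result is -7"))               # -7
print(extract_answer("no digits here"))                 # None
```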

For example, given samples that parse to [4, 4, 5], the vote returns 4. Given [3, 5] (a 1-1 tie), it returns 3, the smaller. Unparseable samples are simply ignored:

self_consistency(["... so 4", "answer 4", "I think 5"])  # 4
self_consistency(["hmm no number", "= 9", "= 9", "= 2"])     # 9
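
One way the parse-and-vote step might be put together is sketched below. It is not the only valid solution: the regex `-?\d+` and the two-pass tie-break (find the top count, then take the smallest value that reaches it) are choices made for this illustration.

```python
import re
from collections import Counter

def extract_answer(text):
    # Last integer in the text (optional leading minus), or None.
    found = re.findall(r"-?\d+", text)
    return int(found[-1]) if found else None

def self_consistency(samples):
    # Parse every sample and keep only the ones that produced an integer.
    answers = [a for a in map(extract_answer, samples) if a is not None]
    if not answers:
        return None
    counts = Counter(answers)
    top = max(counts.values())
    # Among the values that share the top count, pick the smallest.
    return min(value for value, n in counts.items() if n == top)

print(self_consistency(["... so 4", "answer 4", "I think 5"]))      # 4
print(self_consistency(["hmm no number", "= 9", "= 9", "= 2"]))     # 9
print(self_consistency(["maybe 5", "or 3"]))                        # 3
```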

The deterministic tie-break matters: an eval that returns a different winner each run on tied votes is untestable and unshippable. Press Run to grade.
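
To see why an explicit tie-break is needed, compare two hypothetical helpers, `naive_vote` and `stable_vote` (both names are made up for this sketch). `Counter.most_common` orders equal counts by first appearance, so the naive version lets the arrival order of the samples decide a tie, while the stable version sorts by count and then by value.

```python
from collections import Counter

def naive_vote(answers):
    # most_common orders equal counts by first appearance, so the
    # winner of a tie depends on the order the samples arrived in.
    return Counter(answers).most_common(1)[0][0]

def stable_vote(answers):
    # Sort key: highest count first, then smallest value.
    counts = Counter(answers)
    return min(counts, key=lambda v: (-counts[v], v))

print(naive_vote([3, 5]), naive_vote([5, 3]))    # 3 5
print(stable_vote([3, 5]), stable_vote([5, 3]))  # 3 3
```

Both helpers assume a non-empty list of already-parsed integers; the empty case still has to be handled before voting.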

Your turn

Write extract_answer(text) so that it returns the last integer in the text (handle a leading minus sign), or None if the text contains no integer. Write self_consistency(samples) so that it extracts an answer from each sample, drops the None results, and returns the majority value using a Counter, breaking ties toward the smallest value. If no sample yields a parseable answer, return None.
