Syllabus Lesson 204 of 239 · Projects: Build Real Things
Projects: Build Real Things

Project: a Tiny Interpreter

This is a rite of passage: write a program that reads a language and runs it. Your language is arithmetic, but you will build it the way real interpreters are built, so 2 + 3 * 4 gives 14, not 20. Python will not help you here - you are parsing the string yourself, with no eval().

Two stages. First tokenize: turn the raw string into a clean list of pieces. "2 + 30" becomes ["2", "+", "30"]. Walk the characters, skip spaces, emit each of + - * / ( ) as its own token, and gather runs of digits (and a dot) into one number token. An unexpected character is a ValueError.

Then evaluate, using recursive descent: one small function per precedence level, each calling the tighter-binding one below it. That structure is what encodes the order of operations:

def parse_expr():      # lowest precedence: + and -
    value = parse_term()
    while peek() in ("+", "-"):
        op = advance()
        value = value + parse_term() if op == "+" else value - parse_term()
    return value

def parse_term():      # higher: * and /
    value = parse_factor()
    while peek() in ("*", "/"):
        ...

def parse_factor():    # highest: a number, or ( expr )
    if peek() == "(":
        advance(); value = parse_expr()
        # then expect and consume the matching )
        ...
    return float(advance())

Because parse_factor can call parse_expr again for a parenthesised group, parentheses fall out for free, nested ones included. Keep a small pos cursor with peek() (look at the current token) and advance() (consume it). Malformed input must raise ValueError: a dangling operator like 2 +, an unbalanced (2 + 3, or leftover tokens. The targets to hit: evaluate('2 + 3 * 4') == 14.0, evaluate('(2 + 3) * 4') == 20.0, nested parentheses, and a raise on '2 +'. Press Run to evaluate a handful of expressions.

Your turn

Write tokenize(expr) (numbers as string tokens plus the symbols + - * / ( ), spaces skipped, unknown characters rejected) and evaluate(expr) returning a float that honours * / over + - and respects parentheses (recursive descent or shunting-yard). Malformed input must raise ValueError. Targets: evaluate('2 + 3 * 4') == 14.0, evaluate('(2 + 3) * 4') == 20.0, nested parens, and a raise on '2 +'.

Spotted a problem in this lesson? Report it

Code · runs in your browser
Output