Syllabus Lesson 202 of 239 · Projects: Build Real Things
Projects: Build Real Things

Project: Server Log Analyzer

This is the kind of script you actually write on the job: a pile of server log lines comes in, and you need numbers out of it. Error rate? Slowest endpoint? Traffic by status code? You will build the whole pipeline: write a sample log to disk, parse it, load it, and report on it.

The sandbox has a real (virtual) filesystem, so first your program writes the log itself so there is something to read:

LOG = """2026-06-17 08:01 200 /home 53
2026-06-17 08:02 404 /missing 12
... about 30 lines ..."""
with open("access.log", "w") as f:
    f.write(LOG)

Each line is date time status path ms separated by spaces, where status is the HTTP code and ms is how long the request took. Real logs are messy though, so we slipped one corrupt line into the data on purpose. Your parser must survive it.

Build three functions:

  • parse_line(line) -> a dict {"date", "time", "status", "path", "ms"} with status and ms coerced to int, OR None if the line is malformed. Use line.split(), check you got five fields, and wrap the int() calls in try/except ValueError so a bad line returns None instead of crashing.
  • load(path) -> open the file, parse every line, and keep only the rows that are not None.
  • report(rows) -> a dict with "n" (row count), "error_rate" (rows with status >= 400 divided by n, rounded to 4 places), "slowest_path" (the path of the single slowest request), and "hits_by_status" (a dict counting requests per status code).

This is the parse -> load -> aggregate shape behind almost every data tool. Press Run to print the report and watch the corrupt line get quietly skipped.

Your turn

Write a sample access log to access.log (the solution embeds ~30 lines including one deliberately corrupt line). Then build parse_line(line) (split, int-coerce status and ms, return None on a malformed line via try/except), load(path) (parse the file, skip the None rows), and report(rows) returning {"n", "error_rate", "slowest_path", "hits_by_status"}.

Spotted a problem in this lesson? Report it

Code · runs in your browser
Output