Project: Analyze a CSV
Time to put it together. You will write a tiny CSV file in memory, read it back with pandas, and compute grouped aggregates, the way a real analysis starts.
pandas reads CSV text from a file or from an in-memory buffer. We use io.StringIO so nothing touches disk:
import io
import pandas as pd
csv_text = "name,dept,salary\nAda,eng,100\nBo,eng,140\n"
df = pd.read_csv(io.StringIO(csv_text))
print(df)
From there you already know the moves: filter with a mask, group with groupby(...).agg(...), pull a single value out of a Series with .loc or by indexing its label.
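As a quick refresher, here is a minimal sketch of those three moves on the same two-row table. The threshold of 120 and the variable names high_paid, by_dept, eng_mean, and eng_max are chosen only for illustration and are not part of the exercise.

```python
import io
import pandas as pd

csv_text = "name,dept,salary\nAda,eng,100\nBo,eng,140\n"
df = pd.read_csv(io.StringIO(csv_text))

# Filter with a boolean mask: keep rows whose salary exceeds 120
# (the threshold is an arbitrary illustrative value).
high_paid = df[df["salary"] > 120]

# Group and aggregate: one row per dept, with the mean and max salary.
by_dept = df.groupby("dept")["salary"].agg(["mean", "max"])

# Pull a single value out by label with .loc (row label, column label).
eng_mean = by_dept.loc["eng", "mean"]

# A Series can also be indexed directly by its label.
eng_max = by_dept["max"]["eng"]

print(high_paid)
print(eng_mean, eng_max)
```

Both .loc and direct label indexing return a single scalar here because each label matches exactly one entry.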
Your job: read the data, then build a single results dict that summarises it. Returning a plain dict is a common, testable way to hand analysis output to the rest of a program (or to a report, or to an LLM prompt). We grade the numbers in that dict, not any printout.
What goes in results
"n_rows": the number of rows in the table"total_salary": the sum of the whole salary column"avg_by_dept": a dict mapping each dept to its average salary (build it from a groupby mean, then.to_dict())"top_dept": the dept with the highest average salary
The CSV text is provided in csv_text. Read it into a DataFrame df with pd.read_csv(io.StringIO(csv_text)). Then build a dict named results with these keys: "n_rows" (row count), "total_salary" (sum of the salary column), "avg_by_dept" (a dict of dept to mean salary), and "top_dept" (the dept whose average salary is highest).
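To show the shape of such a summary without giving away the exercise, here is a sketch on a different, made-up table. The data in demo_text, the column names, and the dict keys are all hypothetical. Series.idxmax() is one possible way to find the label with the largest value, and sorting the Series would work too.

```python
import io
import pandas as pd

# Hypothetical data for illustration only; this is not the lesson's csv_text.
demo_text = "item,kind,price\napple,fruit,3\npear,fruit,5\nkale,veg,2\n"
demo = pd.read_csv(io.StringIO(demo_text))

# Mean price per kind, as a Series indexed by kind.
mean_by_kind = demo.groupby("kind")["price"].mean()

summary = {
    "n_rows": len(demo),                       # number of rows
    "total_price": int(demo["price"].sum()),   # int() turns the NumPy integer into a plain Python int
    "avg_by_kind": mean_by_kind.to_dict(),     # {kind: mean price}
    "top_kind": mean_by_kind.idxmax(),         # label of the largest mean
}

print(summary)
```

Computing the grouped mean once and reusing it for both the dict and the top label keeps the two values consistent with each other.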