Projects: Build Real Things

Project: Investigate a Dataset

You have learned pandas. Now use it the way an analyst does: get handed a table, ask it questions, and chart the answer. The starter builds a realistic 90-day spending log for you (columns date, category, amount, merchant). It is generated with random.seed(0), so everyone gets the exact same numbers and your answers are checkable.
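To see why seeding makes answers checkable, here is a minimal sketch of a seeded generator. It is not the starter's actual code: the category and merchant names, the amount range, the start date, and the one-row-per-day layout are all assumptions made for illustration. It also uses a private random.Random(seed) instance rather than the module-level random.seed(0) the lesson mentions, so the demonstration does not touch global state.

```python
import random
from datetime import date, timedelta

# Hypothetical stand-ins: the real starter's categories, merchants,
# and amount range are not shown in the lesson.
CATEGORIES = ["groceries", "transport", "dining"]
MERCHANTS = ["Corner Market", "City Transit", "Noodle Bar"]


def build_rows(seed, days=90, start=date(2026, 1, 1)):
    """Return one spending row per day, drawn from a seeded generator."""
    rng = random.Random(seed)  # same seed -> same sequence of draws
    rows = []
    for offset in range(days):
        rows.append({
            "date": (start + timedelta(days=offset)).isoformat(),
            "category": rng.choice(CATEGORIES),
            "amount": round(rng.uniform(5, 200), 2),
            "merchant": rng.choice(MERCHANTS),
        })
    return rows


first = build_rows(seed=0)
second = build_rows(seed=0)
print(first == second)  # True: two runs with the same seed match exactly
print(len(first))       # 90
```

A list of dicts like this can be handed to pd.DataFrame(rows) to get a table with the four columns the lesson describes.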

Your job is the analysis. Write three functions, then draw one chart.

  • monthly_totals(df) -> a dict like {"2026-01": 3499.72, ...}: total spend per calendar month, rounded to 2 places. Build a month key from the date with pd.to_datetime(df["date"]).dt.strftime("%Y-%m"), sum the amounts per key with df.groupby(key)["amount"].sum(), then chain .round(2).to_dict().
  • top_merchants(df, n) -> the n merchants you spent the most at, highest first, as a list of names. Group by merchant, sum amount, sort descending, and take the top n index labels.
  • weekend_vs_weekday_avg(df) -> {"weekend": ..., "weekday": ...}: the average transaction on Sat/Sun versus the rest of the week. Parse the dates with pd.to_datetime, get the weekday number with .dt.dayofweek (Monday is 0, so Saturday is 5 and Sunday is 6), and average each group.
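The three recipes above can be sketched as follows. This is one possible reading of the hints, run on a five-row toy frame invented for illustration, not on the starter's seeded log. The lesson does not say whether the weekend and weekday averages should be rounded, so this sketch leaves them unrounded; that is an assumption.

```python
import pandas as pd

# Toy data made up for illustration; it is not the starter's seeded log.
# 2026-01-03 and 2026-01-10 are Saturdays, 2026-02-01 is a Sunday.
df = pd.DataFrame({
    "date": ["2026-01-03", "2026-01-05", "2026-01-10", "2026-02-01", "2026-02-02"],
    "category": ["dining", "groceries", "dining", "groceries", "transport"],
    "amount": [10.0, 25.0, 30.0, 50.0, 45.0],
    "merchant": ["Cafe", "Market", "Cafe", "Market", "Transit"],
})


def monthly_totals(df):
    """Total spend per calendar month as a {"YYYY-MM": total} dict."""
    month = pd.to_datetime(df["date"]).dt.strftime("%Y-%m")
    return df.groupby(month)["amount"].sum().round(2).to_dict()


def top_merchants(df, n):
    """Names of the n merchants with the highest total spend, highest first."""
    spend = df.groupby("merchant")["amount"].sum().sort_values(ascending=False)
    return spend.head(n).index.tolist()


def weekend_vs_weekday_avg(df):
    """Average transaction amount on Sat/Sun versus Monday to Friday."""
    is_weekend = pd.to_datetime(df["date"]).dt.dayofweek >= 5  # 5 = Sat, 6 = Sun
    return {
        "weekend": float(df.loc[is_weekend, "amount"].mean()),
        "weekday": float(df.loc[~is_weekend, "amount"].mean()),
    }


print(monthly_totals(df))
print(top_merchants(df, 2))
print(weekend_vs_weekday_avg(df))
```

On this toy frame, January sums to 65.0 and February to 95.0, the top two merchants are Market and then Transit, and the averages are 30.0 for weekends and 35.0 for weekdays. Your numbers on the seeded data will differ.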

Then make a bar chart of the monthly totals:

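import matplotlib.pyplot as plt  # harmless if the starter already imports it
totals = monthly_totals(df)  # the dict returned by your first function
fig, ax = plt.subplots()
ax.bar(list(totals.keys()), list(totals.values()))
ax.set_title("Spending by month")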

Do not call plt.show(); the sandbox renders the figure for you. Press Run to print your findings and see the chart appear. Because the data is seeded, the busiest month and your top merchant are the same for everyone, which is exactly why the grader can check your work.
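For a chart that is easier to read, you might also label the axes. The sketch below uses made-up totals rather than the seeded data, and it selects the non-interactive Agg backend so it can run outside the lesson sandbox; that backend line is an assumption about your environment and is probably unnecessary inside the sandbox.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run outside the lesson sandbox
import matplotlib.pyplot as plt

# Hypothetical totals for illustration; in the project, use monthly_totals(df).
totals = {"2026-01": 65.0, "2026-02": 95.0, "2026-03": 80.0}

fig, ax = plt.subplots()
ax.bar(list(totals.keys()), list(totals.values()))
ax.set_title("Spending by month")
ax.set_xlabel("Month")
ax.set_ylabel("Total spend")

# The busiest month is the key with the largest total.
busiest = max(totals, key=totals.get)
print(busiest)  # 2026-02
```

Note that nothing here calls plt.show(), matching the lesson's instruction.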

Your turn

Using the seeded 90-row spending df the starter builds, write monthly_totals(df) (a {"YYYY-MM": total} dict rounded to 2 dp), top_merchants(df, n) (the n highest-spend merchants, highest first), and weekend_vs_weekday_avg(df) (a dict with "weekend" and "weekday" keys holding the average amounts for Sat/Sun and for the rest of the week). Then build a bar chart of the monthly totals with fig, ax = plt.subplots() and ax.bar(...) (no plt.show()).
