The Machine Learning Mental Model
Machine learning sounds mystical, but the core idea is plain: instead of writing the rules yourself, you show a program lots of examples and let it find the rules. Almost everything in this module rests on three ideas.
Features and labels
An example is split into two parts. The features are what you know going in (a house's size, number of bedrooms). The label is what you want to predict (its price). By long tradition, features are called X (a table, capital letter) and the label is called y (a single column, lowercase).
X (features)              y (label)
size_sqft    bedrooms     price_k
650          1            220
800          2            265
1200         3            410
Supervised vs unsupervised
- Supervised learning has labels. You know the right answer for each training example, so the model learns to map X to y. Predicting price and sorting email into spam vs not-spam are both supervised tasks.
- Unsupervised learning has no labels. You just have X and ask the model to find structure, like grouping similar customers.
This whole module is supervised.
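The difference shows up in what data you hand over. The sketch below is plain Python with no libraries. The helper names nearest_price and split_by_largest_gap are made up for illustration and are far simpler than real models. The first uses labels to answer a question about a new house; the second sees only sizes and looks for structure on its own.

```python
# Sizes and prices from the housing table above (price_k is in thousands).
sizes = [650, 800, 1200]
prices = [220, 265, 410]


def nearest_price(sizes, prices, new_size):
    """Supervised (hypothetical helper): use labelled examples to guess a label."""
    # Pick the training example whose size is closest to the new one.
    closest = min(range(len(sizes)), key=lambda i: abs(sizes[i] - new_size))
    return prices[closest]


def split_by_largest_gap(sizes):
    """Unsupervised (hypothetical helper): no labels, split sizes at the widest gap."""
    ordered = sorted(sizes)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


guess = nearest_price(sizes, prices, 780)   # needs both X and y
groups = split_by_largest_gap(sizes)        # needs only X
```

Note that nearest_price cannot run without prices, while split_by_largest_gap never sees them. That is the whole distinction.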
The fit / predict pattern
Every scikit-learn model speaks the same two verbs. You will see this shape again and again:
model = SomeModel()
model.fit(X, y) # learn from examples
model.predict(X_new) # guess labels for new rows
That is the entire ritual. The hard part is never the API; it is choosing good features and honestly measuring how well the model does, which is what the rest of this module is about.
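To see why the pattern is so uniform, it can help to write a model by hand. The class below is a hypothetical stand-in, not part of scikit-learn: MeanModel remembers the average label during fit and returns it for every row in predict. It ignores the features entirely, so it is a poor predictor, but it follows the same two-verb shape.

```python
class MeanModel:
    """Toy model for illustration only (hypothetical, not a scikit-learn class)."""

    def fit(self, X, y):
        # "Learning" here is just remembering the average label.
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X_new):
        # One guess per new row; the features are ignored.
        return [self.mean_ for _ in X_new]


# The three houses from the table earlier in this lesson.
X = [[650, 1], [800, 2], [1200, 3]]
y = [220, 265, 410]

model = MeanModel()
model.fit(X, y)                                      # learn from examples
predictions = model.predict([[900, 2], [1000, 3]])   # guess labels for new rows
```

Swapping in a smarter model would change what happens inside fit and predict, but not the three lines that call them.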
In this exercise you will not train anything yet. You will just do the unglamorous but essential first step: take a small table and carve it into X and y.
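To make the idea concrete, here is a rough sketch of the same carving in plain Python, using lists and dictionaries rather than a DataFrame. The rows reuse the three houses from the table earlier in this lesson, and label_col is a name introduced here for illustration.

```python
# The small housing table, one dictionary per row.
rows = [
    {"size_sqft": 650, "bedrooms": 1, "price_k": 220},
    {"size_sqft": 800, "bedrooms": 2, "price_k": 265},
    {"size_sqft": 1200, "bedrooms": 3, "price_k": 410},
]

feature_cols = ["size_sqft", "bedrooms"]
label_col = "price_k"  # illustrative name, not from the exercise

# X: one list of feature values per row (a table).
X = [[row[col] for col in feature_cols] for row in rows]
# y: one label per row (a single column).
y = [row[label_col] for row in rows]

n_samples = len(X)              # number of rows
n_features = len(feature_cols)  # number of input columns
```

X and y stay aligned row by row: the first entry of y is the price of the house described by the first entry of X.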
A small housing DataFrame df is given with columns size_sqft, bedrooms, and price_k. Set feature_cols = ["size_sqft", "bedrooms"], build X = df[feature_cols] (the two input columns only) and y = df["price_k"] (the label). Then set n_features to the number of feature columns and n_samples to the number of rows.