Your First Machine Learning Models

Evaluating Models: Accuracy and Error

A trained model is useless until you can say how good it is. The right score depends on what you are predicting.

Classification: accuracy

When the label is a category (spam or not, species A/B/C), the simplest score is accuracy: the fraction of predictions that were correct.

from sklearn.metrics import accuracy_score

acc = accuracy_score(y_true, y_pred)   # e.g. 0.875 means 7 of 8 right
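To make the definition concrete, here is a minimal sketch that counts the matches by hand. The eight labels are made up for illustration; on the same inputs, accuracy_score should report the same fraction.

```python
# Hypothetical labels for 8 emails: 1 = spam, 0 = not spam.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0]  # differs from y_true at index 5 only

# Accuracy = number of matching positions / total number of predictions.
correct = sum(t == p for t, p in zip(y_true, y_pred))
acc = correct / len(y_true)

print(correct, acc)  # 7 0.875
```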

Accuracy runs from 0 to 1. But "good" is relative. If 95 percent of email is not spam, a lazy model that always guesses "not spam" already scores 0.95 while catching zero spam. Always compare against that kind of naive baseline before celebrating a number.
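The spam scenario above can be sketched in a few lines. The data is invented to match the 95/5 split in the text, and the helper name accuracy is a stand-in written for this example, not a library function.

```python
# Hypothetical toy data: 100 emails, 95 "ham" (not spam) and 5 "spam".
y_true = ["ham"] * 95 + ["spam"] * 5

# A lazy baseline that ignores its input and always predicts the majority class.
y_pred = ["ham"] * len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

baseline_acc = accuracy(y_true, y_pred)

# How many of the actual spam emails did the baseline catch?
spam_caught = sum(t == "spam" and p == "spam" for t, p in zip(y_true, y_pred))

print(baseline_acc)  # 0.95
print(spam_caught)   # 0
```

Any real spam model has to beat 0.95 on data like this before its accuracy tells you anything.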

Regression: error

When the label is a continuous number (a price, a temperature), accuracy makes no sense: you will almost never predict the exact value. Instead you measure how far off you were. A common choice is mean absolute error (MAE), the average size of the miss:

from sklearn.metrics import mean_absolute_error

mae = mean_absolute_error(y_true, y_pred)   # in the same units as y

MAE is in the label's own units, which makes it easy to read: an MAE of 12 on house prices in thousands means you are off by about 12k on average. Lower is better, and 0 is perfect.
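As a rough illustration of the house-price reading, the sketch below computes the MAE by hand. The four prices are made up, chosen so the result comes out to 12; mean_absolute_error should give the same value on these inputs.

```python
# Hypothetical house prices, in thousands.
y_true = [200, 310, 150, 425]
y_pred = [212, 300, 160, 409]

# Absolute size of each miss, ignoring whether we were high or low.
abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]

# MAE = average of those misses, still in thousands.
mae = sum(abs_errors) / len(abs_errors)

print(abs_errors)  # [12, 10, 10, 16]
print(mae)         # 12.0
```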

You will compute both: an accuracy from given predictions, and an MAE from a tiny linear regression fit on a perfectly straight line (so its error should be essentially zero).
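To see why a straight line should give an error of essentially zero, here is a small sketch on a different line, y = 3x + 1. It uses the closed-form least-squares formula for a single feature instead of scikit-learn's LinearRegression, so every name in it is illustrative rather than part of the exercise.

```python
# Hypothetical data lying exactly on the line y = 3x + 1.
xs = [0, 1, 2, 3, 4]
ys = [3 * x + 1 for x in xs]

# Closed-form ordinary least squares for one feature:
# slope = covariance(x, y) / variance(x), intercept = mean(y) - slope * mean(x)
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
intercept = mean_y - slope * mean_x

# Predict on the training points and measure the average miss.
preds = [slope * x + intercept for x in xs]
mae = sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)

print(slope, intercept)  # 3.0 1.0
print(mae < 1e-9)        # True
```

Because the points sit exactly on a line, the best-fit line passes through all of them and there is nothing left to miss. Real data is noisy, so a real MAE will be above zero.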

Your turn

Part 1: arrays y_true and y_pred (8 labels each, differing in exactly one spot) are given. Set acc = accuracy_score(y_true, y_pred).

Part 2: X and target describe the line y = 2x. Fit a LinearRegression named reg on them, store its predictions on X in reg_pred, then set mae = mean_absolute_error(target, reg_pred).
