Evaluating Models: Accuracy and Error
A trained model is useless until you can say how good it is. The right score depends on what you are predicting.
Classification: accuracy
When the label is a category (spam or not, species A/B/C), the simplest score is accuracy: the fraction of predictions that were correct.
from sklearn.metrics import accuracy_score
acc = accuracy_score(y_true, y_pred)  # e.g. 0.875 means 7 of 8 right
Accuracy runs from 0 to 1. But "good" is relative. If 95 percent of email is not spam, a lazy model that always guesses "not spam" already scores 0.95 while catching zero spam. Always compare against that kind of naive baseline before celebrating a number.
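To make the baseline point concrete, here is a minimal sketch that computes accuracy by hand for an always-"not spam" guesser. The labels are made up for illustration, and the names `accuracy`, `y_baseline`, and `spam_caught` are hypothetical helpers, not part of scikit-learn. The hand-written `accuracy` is meant to mirror the definition above (fraction of correct predictions).

```python
# Hypothetical labels: 1 = spam, 0 = not spam (95 of 100 emails are not spam).
y_true = [0] * 95 + [1] * 5

# A "lazy" baseline that always predicts the majority class, "not spam".
y_baseline = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


baseline_acc = accuracy(y_true, y_baseline)

# Spam actually caught: cases where the truth is 1 and the prediction is 1.
spam_caught = sum(1 for t, p in zip(y_true, y_baseline) if t == 1 and p == 1)

print(baseline_acc)  # 0.95
print(spam_caught)   # 0
```

A real model has to beat 0.95 here before its accuracy says anything useful.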
Regression: error
When the label is a continuous number (a price, a temperature), accuracy makes no sense: you will almost never predict the exact value. Instead you measure how far off you were. A common choice is mean absolute error (MAE), the average size of the miss:
from sklearn.metrics import mean_absolute_error
mae = mean_absolute_error(y_true, y_pred)  # in the same units as y
MAE is in the label's own units, which makes it easy to read: an MAE of 12 on house prices in thousands means you are off by about 12k on average. Lower is better, and 0 is perfect.
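As a rough illustration of what "average size of the miss" means, the sketch below computes MAE by hand on four made-up house prices (in thousands). The function name `mean_abs_error` and the price lists are hypothetical and chosen only for this example. The calculation follows the definition above: take the absolute difference for each prediction, then average.

```python
def mean_abs_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    misses = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(misses) / len(misses)


# Hypothetical house prices, in thousands.
prices_true = [200, 310, 150, 420]
prices_pred = [212, 300, 165, 409]

# Individual misses are 12, 10, 15 and 11, so the average is 48 / 4.
mae = mean_abs_error(prices_true, prices_pred)
print(mae)  # 12.0
```

Because over- and under-predictions both count as positive misses, they cannot cancel each other out.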
You will compute both: an accuracy from given predictions, and an MAE from a tiny linear regression fit on a perfectly straight line (so its error should be essentially zero).
Part 1: arrays y_true and y_pred (8 labels each, differing in exactly one spot) are given. Set acc = accuracy_score(y_true, y_pred).
Part 2: X and target describe the line y = 2x. Fit a LinearRegression named reg on them, predict on X into reg_pred, then set mae = mean_absolute_error(target, reg_pred).