Your First Machine Learning Models

Project: Classify Flowers End to End

Time to put the whole pipeline together yourself, with no scaffolding. This is the exact workflow you would use on a real problem.

You are given a small, iris-like flower dataset as a DataFrame named data. Each row has two features, petal_len and petal_wid, and a label, species, that is 0, 1, or 2 for the three flower types. The three species have clearly different petal sizes, so a good model should separate them well.

Your job is the full sequence you have learned, in order:

1. Split features and label into X and y.
2. Hold out a test set with train_test_split.
3. Train a model on the training set.
4. Predict on the test set.
5. Evaluate with accuracy on that test set.

Two details make the run clean and reproducible. First, pass random_state=0 everywhere randomness appears (the split and the model). Second, pass stratify=y to train_test_split so the test set keeps the same mix of the three species as the full dataset. Without it, a random split could under-represent a species or leave it out entirely, which is a real risk when each class is small.
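
To see what stratify=y changes, here is a minimal sketch that splits the same labels twice and counts the species in each test set. The lesson's real data is not shown here, so the 36-row frame below (12 flowers per species, with placeholder feature values) is a hypothetical stand-in.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 12 flowers of each species.
# The feature values are placeholders; only the labels matter here.
y = pd.Series(np.repeat([0, 1, 2], 12), name="species")
X = pd.DataFrame({
    "petal_len": np.arange(36, dtype=float),
    "petal_wid": np.arange(36, dtype=float),
})

# Same split size and seed, without and with stratification.
_, _, _, y_test_plain = train_test_split(X, y, test_size=0.25, random_state=0)
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Count how many flowers of each species landed in each test set.
plain_counts = y_test_plain.value_counts().reindex([0, 1, 2], fill_value=0)
strat_counts = y_test_strat.value_counts().reindex([0, 1, 2], fill_value=0)

print("Without stratify:", plain_counts.tolist())
print("With stratify:   ", strat_counts.tolist())
```

With 12 flowers per species and a 25% test set, the stratified split puts exactly three of each species in the test set. The unstratified counts depend on the shuffle and may be uneven.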

This is the same loop, scaled up, that runs behind spam filters, medical screens, and recommendation systems. The model here is small and the data is friendly, but the workflow is the same one you would follow on a real problem.

Your turn

Using the given DataFrame data: set X = data[["petal_len", "petal_wid"]] and y = data["species"]. Split with train_test_split(X, y, test_size=0.25, random_state=0, stratify=y) into X_train, X_test, y_train, y_test. Train a RandomForestClassifier(n_estimators=50, random_state=0) named model on the training set, predict on X_test into predictions, and set acc = accuracy_score(y_test, predictions).
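
If you get stuck, the steps above can be sketched roughly as follows. The lesson supplies data for you, so the synthetic DataFrame built at the top is only a hypothetical stand-in (the cluster centers, the spread, and 20 rows per species are assumptions) that lets the sketch run on its own. Everything from X onward follows the task description.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the lesson's `data`: three well-separated
# clusters of petal sizes, 20 flowers per species (assumed values).
rng = np.random.default_rng(0)
centers = {0: (1.5, 0.3), 1: (4.3, 1.3), 2: (5.8, 2.1)}
rows = []
for species, (length, width) in centers.items():
    for _ in range(20):
        rows.append({
            "petal_len": length + rng.normal(0, 0.2),
            "petal_wid": width + rng.normal(0, 0.1),
            "species": species,
        })
data = pd.DataFrame(rows)

# 1. Split features and label.
X = data[["petal_len", "petal_wid"]]
y = data["species"]

# 2. Hold out a stratified test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 3. Train on the training set only.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# 4. Predict on the held-out test set.
predictions = model.predict(X_test)

# 5. Evaluate: fraction of test flowers labeled correctly.
acc = accuracy_score(y_test, predictions)
print(f"Test accuracy: {acc:.2f}")
```

Note that the model only ever sees X_train and y_train during fit. The test set is touched once, at prediction time, which is what makes the accuracy an honest estimate.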
