Your First Machine Learning Models

Logistic Regression & Probabilities

The decision tree gave you a label. But real systems usually need more than "spam" or "not spam": they need to know how sure the model is, so you can set a threshold, abstain when unsure, or route the confident cases automatically. The workhorse model for that is logistic regression, and despite the name it is a classifier, not a regressor.

Here is the one idea behind it. A linear model computes a weighted sum of the features, which can be any number from minus infinity to plus infinity. Logistic regression squashes that number through the sigmoid function into a probability between 0 and 1:

sigmoid(z) = 1 / (1 + e**(-z))
# z = -4 -> 0.018     z = 0 -> 0.5     z = 4 -> 0.982
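
As a quick check, the formula can be written in plain Python. This is a minimal sketch using only the standard library; the function name sigmoid is chosen here for illustration and is not part of scikit-learn. It prints the same three values listed above.

```python
import math


def sigmoid(z: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Evaluate the three reference points from the lesson.
for z in (-4, 0, 4):
    print(f"z = {z} -> {sigmoid(z):.3f}")
```

This naive form is fine for illustration, but math.exp can overflow for very large negative z, so libraries typically use a numerically safer variant.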

So the model outputs P(class = 1). With the default cutoff of 0.5, anything above it predicts class 1, anything below predicts class 0, and a value near 0.5 is the model saying "I am not sure." That single number is what every downstream confidence-threshold and routing trick in this course is built on.
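
To make the "abstain when unsure" idea concrete, here is one possible sketch of routing on that probability. The function name route and the 0.2 / 0.8 cutoffs are hypothetical choices for illustration, not values from this course; in practice you would tune them to your own cost of errors.

```python
def route(p_class1: float, low: float = 0.2, high: float = 0.8) -> str:
    """Turn P(class = 1) into a decision, abstaining in the uncertain middle band.

    The low/high cutoffs are illustrative assumptions, not recommended defaults.
    """
    if not 0.0 <= p_class1 <= 1.0:
        raise ValueError("p_class1 must be a probability between 0 and 1")
    if p_class1 >= high:
        return "class 1"   # confident positive: handle automatically
    if p_class1 <= low:
        return "class 0"   # confident negative: handle automatically
    return "abstain"       # too close to 0.5: send to a human or a fallback


for p in (0.95, 0.5, 0.05):
    print(p, "->", route(p))
```

Setting low and high both to 0.5 collapses this back to the plain above-or-below-0.5 rule.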

The code is the same fit/predict ritual as the tree, with one new method, predict_proba, which returns the probabilities instead of the hard labels:

from sklearn.linear_model import LogisticRegression

# X: 2-D array of features, shape (n_samples, n_features); y: 1-D array of 0/1 labels
clf = LogisticRegression()
clf.fit(X, y)
clf.predict(X)          # hard labels: 0 or 1
clf.predict_proba(X)    # one [P(0), P(1)] row per sample; each row sums to 1

You will train it on a tiny, intuitive dataset: hours studied versus whether a student passed. Few hours -> fail (0), many hours -> pass (1). Because the pattern is monotonic, the fitted model should give a higher probability of passing to someone who studied more. The two columns of predict_proba are P(class 0) and P(class 1) in the order of clf.classes_ (here [0, 1]), so column index 1 is the probability of passing.
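
The column-order rule is easier to see on a dataset with string labels. The sketch below uses a made-up toy dataset (exclamation marks per message, labelled "ham" or "spam"), invented purely for illustration. It looks up the column through clf.classes_ rather than hard-coding an index, which is one way to avoid reading the wrong column.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data for illustration: exclamation marks per message and its label.
X = np.array([[0], [1], [1], [2], [5], [6], [7], [8]])
y = np.array(["ham", "ham", "ham", "ham", "spam", "spam", "spam", "spam"])

clf = LogisticRegression().fit(X, y)

# classes_ lists the labels in sorted order; position i matches column i
# of predict_proba.
spam_col = list(clf.classes_).index("spam")

# predict_proba expects a 2-D input: one row per sample to score.
proba = clf.predict_proba(np.array([[1], [7]]))
p_few, p_many = proba[:, spam_col]

print("classes:", clf.classes_)
print("P(spam | 1 mark):", p_few)
print("P(spam | 7 marks):", p_many)
```

Because the pattern in this toy data is monotonic, the message with 7 exclamation marks should get the higher spam probability.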

Your turn

A 1-feature dataset is given: hours studied and passed (0 or 1). (1) Create a LogisticRegression() and fit it on X, y. (2) Store the hard labels for X in preds. (3) Use predict_proba to store the probability of passing (class 1) for studying 2 hours in p_low, and for 7 hours in p_high. If the fit worked, p_high should be greater than p_low, and both should be valid probabilities between 0 and 1.
