The Threshold Problem
A trained logistic regression or neural network outputs a score between 0 and 1 for each example. To make a binary prediction, we apply a threshold: if score > τ, predict positive; otherwise predict negative.
ROC curves and AUC are the standard evaluation metrics in medicine, fraud detection, and ad ranking — any domain where the costs of false positives and false negatives differ. A single accuracy number hides exactly the tradeoffs these metrics expose.
The default is often τ = 0.5, but this is arbitrary. The same model with different thresholds produces radically different precision and recall values:
| Threshold | TP | FP | TN | FN | Precision | Recall |
|---|---|---|---|---|---|---|
| 0.9 | 20 | 1 | 99 | 30 | 0.952 | 0.400 |
| 0.5 | 42 | 8 | 92 | 8 | 0.840 | 0.840 |
| 0.2 | 48 | 25 | 75 | 2 | 0.658 | 0.960 |
Which threshold is "best" depends on the application. But comparing two models using metrics at a single threshold is unfair — a model might win at threshold 0.5 and lose at threshold 0.7.
ROC curves solve this: they evaluate the model across all thresholds at once.
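The precision and recall columns in the table above follow directly from the confusion-matrix counts. A quick sketch that recomputes them (the counts are copied from the table):

```python
# Confusion-matrix counts from the table above, keyed by threshold tau.
counts = {
    0.9: dict(tp=20, fp=1, tn=99, fn=30),
    0.5: dict(tp=42, fp=8, tn=92, fn=8),
    0.2: dict(tp=48, fp=25, tn=75, fn=2),
}

for tau, c in counts.items():
    precision = c["tp"] / (c["tp"] + c["fp"])  # of predicted positives, how many are real
    recall = c["tp"] / (c["tp"] + c["fn"])     # of real positives, how many we caught
    print(f"tau={tau}: precision={precision:.3f}, recall={recall:.3f}")
```

Note how lowering τ trades precision for recall, exactly as the table shows.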
Building the ROC Curve
For each threshold value, compute:
- True positive rate: TPR = TP / (TP + FN) — plot on y-axis (this is recall)
- False positive rate: FPR = FP / (FP + TN) — plot on x-axis
As we lower the threshold from 1.0 to 0.0:
- More examples are predicted positive → TP increases (better recall) but so does FP
- The curve moves from (0, 0) toward (1, 1)
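A direct (if inefficient) way to get one ROC point per threshold is to recompute the counts from scratch at each cutoff. A minimal sketch, where `scores` and `labels` are illustrative placeholder data:

```python
def roc_point(threshold, scores, labels):
    """Return (FPR, TPR) when predicting positive for score > threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    pos = sum(labels)        # total actual positives
    neg = len(labels) - pos  # total actual negatives
    return fp / neg, tp / pos

# Illustrative data; sweeping tau from high to low traces the curve (0,0) -> (1,1).
scores = [0.9, 0.7, 0.6, 0.3, 0.1]
labels = [1, 1, 0, 1, 0]
points = [roc_point(tau, scores, labels) for tau in (1.0, 0.8, 0.65, 0.5, 0.2, 0.0)]
print(points)
```

Each lowering of the threshold flips one more example to "predicted positive," so the curve moves either up (a true positive) or right (a false positive).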
Worked Example: 5 Examples
Sorted by model score (highest to lowest):
| Example | Score | True Label | TP | FP | TPR | FPR |
|---|---|---|---|---|---|---|
| Start | — | — | 0 | 0 | 0.00 | 0.00 |
| A | 0.92 | + | 1 | 0 | 0.33 | 0.00 |
| B | 0.81 | + | 2 | 0 | 0.67 | 0.00 |
| C | 0.68 | − | 2 | 1 | 0.67 | 0.50 |
| D | 0.45 | + | 3 | 1 | 1.00 | 0.50 |
| E | 0.22 | − | 3 | 2 | 1.00 | 1.00 |
(There are 3 positives and 2 negatives total)
The ROC curve passes through: (0,0) → (0, 0.33) → (0, 0.67) → (0.5, 0.67) → (0.5, 1.0) → (1.0, 1.0)
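These points can be reproduced with scikit-learn's `roc_curve` (passing `drop_intermediate=False` so it keeps every point rather than pruning collinear ones):

```python
from sklearn.metrics import roc_curve

scores = [0.92, 0.81, 0.68, 0.45, 0.22]  # examples A-E, highest to lowest
labels = [1, 1, 0, 1, 0]                 # + -> 1, - -> 0

fpr, tpr, thresholds = roc_curve(labels, scores, drop_intermediate=False)
for f, t in zip(fpr, tpr):
    print(f"FPR={f:.2f}, TPR={t:.2f}")
```

The output matches the worked table: scikit-learn prepends the (0, 0) starting point, then adds one point per score as the threshold drops past it.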
AUC: Area Under the ROC Curve
The AUC is the area under the ROC curve, ranging from 0 to 1.
For the worked example above, the area can be computed geometrically or via the trapezoidal rule:
- Take the ROC curve points in order, from (0,0) to (1,1)
- For each pair of consecutive points, multiply the change in FPR (the width) by the average TPR (the height), then sum the segment areas
Computing for our 5-example case:
- Segment (0,0)→(0,0.33): width=0, area=0
- Segment (0,0.33)→(0,0.67): width=0, area=0
- Segment (0,0.67)→(0.5,0.67): width=0.5, avg height=0.67 (exactly 2/3), area≈0.333
- Segment (0.5,0.67)→(0.5,1.0): width=0, area=0
- Segment (0.5,1.0)→(1.0,1.0): width=0.5, avg height=1.0, area=0.5
AUC = 0 + 0 + 0.333 + 0 + 0.5 ≈ 0.833
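The same sum falls out of NumPy's trapezoidal integration applied to the curve's points (using exact fractions for the 0.33/0.67 entries):

```python
import numpy as np

# ROC points from the worked example: x = FPR, y = TPR.
fpr = np.array([0.0, 0.0, 0.0, 0.5, 0.5, 1.0])
tpr = np.array([0.0, 1/3, 2/3, 2/3, 1.0, 1.0])

# Trapezoidal rule: sum of width x average height over consecutive points.
auc = np.trapz(tpr, fpr)
print(auc)  # 5/6, i.e. approximately 0.833
```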
The Probability Interpretation
AUC has an elegant probabilistic meaning:
AUC = P(score of a random positive > score of a random negative)
In plain English: imagine picking one patient who has the disease and one who doesn't, completely at random. AUC tells you how often the model gives the sick patient a higher risk score than the healthy one. An AUC of 0.85 means the model ranks the sick patient higher 85% of the time — regardless of where you set the threshold. It measures how well the model separates the two classes, not how well it predicts at any one specific cutoff.
In our example, AUC ≈ 0.83 means: if we pick one positive and one negative at random, 83% of the time the model gives a higher score to the positive. This interpretation holds regardless of class balance or threshold.
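This can be checked by brute force on the five-example data: enumerate every (positive, negative) pair and count how often the positive scores higher (a sketch; a tie would conventionally count as half a win, though none occur here):

```python
pos_scores = [0.92, 0.81, 0.45]  # examples A, B, D (true positives)
neg_scores = [0.68, 0.22]        # examples C, E (true negatives)

# Count pairs where the positive outranks the negative.
wins = sum(1 for p in pos_scores for n in neg_scores if p > n)
pairs = len(pos_scores) * len(neg_scores)
auc = wins / pairs
print(f"{wins}/{pairs} pairs ranked correctly -> AUC = {auc:.3f}")
```

Only the pair (D=0.45, C=0.68) is mis-ranked, giving 5/6 ≈ 0.833 — the same number the trapezoidal rule produced.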
| AUC | Interpretation |
|---|---|
| 1.0 | Perfect: every positive outranks every negative |
| 0.9–1.0 | Excellent |
| 0.7–0.9 | Good |
| 0.5–0.7 | Poor but better than random |
| 0.5 | Equivalent to random guessing |
| < 0.5 | Worse than random (flip predictions) |
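The last row works because reversing the ranking reverses every pairwise comparison: a model with AUC = a becomes one with AUC = 1 − a when its scores are negated. A quick check on the worked-example data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y = np.array([1, 1, 0, 1, 0])                # labels from the worked example
s = np.array([0.92, 0.81, 0.68, 0.45, 0.22])  # model scores

auc = roc_auc_score(y, s)
flipped = roc_auc_score(y, -s)  # negating scores reverses the ranking
print(auc, flipped)             # the two values sum to 1
```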
Code: ROC and AUC in Scikit-learn
```python
from sklearn.metrics import roc_curve, roc_auc_score, RocCurveDisplay
import matplotlib.pyplot as plt

# y_true: array of 0/1 labels
# y_scores: array of model probability outputs
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
auc = roc_auc_score(y_true, y_scores)
print(f"AUC: {auc:.3f}")

# Plot the ROC curve against the chance diagonal
RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=auc).plot()
plt.plot([0, 1], [0, 1], 'k--', label='Random')
plt.title(f'ROC Curve (AUC = {auc:.3f})')
plt.legend()
plt.show()
```
Always plot the ROC curve in addition to reporting AUC — the shape of the curve contains information that the scalar AUC hides.