Classification
Lesson 8 ⏱ 14 min

ROC curves and AUC


ROC Curves - Measuring a Classifier Across Every Threshold

Why threshold-dependent metrics miss the big picture, building an ROC curve step-by-step from sorted predictions, the AUC as a probability interpretation, and when to use PR curves instead.


Quick refresher

Precision, recall, and the confusion matrix

Precision = TP/(TP+FP): how reliable positive predictions are. Recall = TP/(TP+FN): what fraction of real positives are caught. Both depend on the threshold used to convert model scores into binary predictions.

Example

A model outputs probability 0.7 for an example.

With threshold 0.5, this is predicted positive.

With threshold 0.8, it's predicted negative.

Changing the threshold changes TP, FP, TN, FN, and therefore changes precision and recall.
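As a quick sketch of that dependence, here is a minimal helper computing both metrics from confusion-matrix counts (the counts are illustrative, previewing the table in the next section, not from any particular model):

# Precision and recall from raw confusion-matrix counts
def precision_recall(tp, fp, fn):
    return tp / (tp + fp), tp / (tp + fn)

# Same model, two thresholds: a strict threshold trades recall for
# precision, a loose one does the reverse (counts are illustrative)
print(precision_recall(tp=20, fp=1, fn=30))   # ≈ (0.952, 0.400)
print(precision_recall(tp=48, fp=25, fn=2))   # ≈ (0.658, 0.960)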

The Threshold Problem

A trained logistic regression or neural network outputs a score between 0 and 1 for each example. To make a binary prediction, we apply a threshold: if score > τ, predict positive; otherwise predict negative.

ROC curves and AUC are the standard evaluation metrics in medicine, fraud detection, and ad ranking — any domain where the costs of false positives and false negatives differ. A single accuracy number hides exactly the tradeoffs these metrics expose.

The default is often τ = 0.5, but this is arbitrary. The same model with different thresholds produces radically different precision and recall values:

Threshold   TP   FP   TN   FN   Precision   Recall
0.9         20    1   99   30       0.952    0.400
0.5         42    8   92    8       0.840    0.840
0.2         48   25   75    2       0.658    0.960

Which threshold is "best" depends on the application. But comparing two models using metrics at a single threshold is unfair — a model might win at threshold 0.5 and lose at threshold 0.7.
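A minimal sketch of such a threshold sweep, using synthetic labels and scores (so the exact numbers will differ from the table above):

import numpy as np

# Synthetic stand-ins: y_true holds 0/1 labels, y_scores holds model outputs
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=150)
y_scores = np.clip(0.3 + 0.4 * y_true + 0.2 * rng.normal(size=150), 0, 1)

for tau in (0.9, 0.5, 0.2):
    pred = y_scores > tau
    tp = np.sum(pred & (y_true == 1))
    fp = np.sum(pred & (y_true == 0))
    fn = np.sum(~pred & (y_true == 1))
    print(f"tau={tau}: precision={tp / (tp + fp):.3f}, recall={tp / (tp + fn):.3f}")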

ROC curves solve this: they evaluate the model across all thresholds at once.

Building the ROC Curve

For each threshold value, compute:

  • True positive rate (TPR, the recall) = TP / (TP + FN) — plot on the y-axis
  • False positive rate (FPR) = FP / (FP + TN) — plot on the x-axis

As we lower the threshold from 1.0 to 0.0:

  • More examples are predicted positive → TP increases (better recall) but so does FP
  • The curve moves from (0, 0) toward (1, 1)

Worked Example: 5 Examples

Sorted by model score (highest to lowest):

Example   Score   True Label   TP   FP   TPR    FPR
Start       —         —         0    0   0.00   0.00
A         0.92        +         1    0   0.33   0.00
B         0.81        +         2    0   0.67   0.00
C         0.68        −         2    1   0.67   0.50
D         0.45        +         3    1   1.00   0.50
E         0.22        −         3    2   1.00   1.00

(There are 3 positives and 2 negatives total)

The ROC curve passes through: (0,0) → (0, 0.33) → (0, 0.67) → (0.5, 0.67) → (0.5, 1.0) → (1.0, 1.0)
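As a sanity check, scikit-learn's roc_curve reproduces exactly these points from the five scores (a small sketch; the y_true/y_scores names are ours):

from sklearn.metrics import roc_curve

# The five examples above: A, B, D are positive (1); C, E are negative (0)
y_true = [1, 1, 0, 1, 0]
y_scores = [0.92, 0.81, 0.68, 0.45, 0.22]

fpr, tpr, _ = roc_curve(y_true, y_scores)
for x, y in zip(fpr, tpr):
    print(f"({x:.2f}, {y:.2f})")   # (0.00, 0.00), (0.00, 0.33), ... (1.00, 1.00)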

AUC: Area Under the ROC Curve

The AUC is the area under the ROC curve, ranging from 0 to 1.

For the worked example above, the area can be computed geometrically or via the trapezoidal rule:

\text{AUC} = \sum_{i=1}^{n-1} \frac{y_i + y_{i+1}}{2} \, (x_{i+1} - x_i)

where (x_1, y_1), \dots, (x_n, y_n) are the ROC curve points from (0,0) to (1,1), and x_{i+1} - x_i is the change in FPR between consecutive points.

Computing for our 5-example case:

  • Segment (0,0)→(0,0.33): width=0, area=0
  • Segment (0,0.33)→(0,0.67): width=0, area=0
  • Segment (0,0.67)→(0.5,0.67): width=0.5, avg height=2/3 (≈0.67), area=1/3 ≈ 0.333
  • Segment (0.5,0.67)→(0.5,1.0): width=0, area=0
  • Segment (0.5,1.0)→(1.0,1.0): width=0.5, avg height=1.0, area=0.5

AUC = 0 + 0 + 0.333 + 0 + 0.5 = 0.833
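The same computation in code, as a manual trapezoidal sum with sklearn.metrics.auc as a cross-check:

from sklearn.metrics import auc

# ROC points from the worked example: x = FPR, y = TPR
fpr = [0, 0, 0, 0.5, 0.5, 1.0]
tpr = [0, 1/3, 2/3, 2/3, 1.0, 1.0]

# Trapezoidal rule, exactly as in the formula above
manual = sum((tpr[i] + tpr[i + 1]) / 2 * (fpr[i + 1] - fpr[i])
             for i in range(len(fpr) - 1))

print(f"{manual:.3f}")         # 0.833
print(f"{auc(fpr, tpr):.3f}")  # 0.833 (sklearn's trapezoidal implementation)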

The Probability Interpretation

AUC has an elegant probabilistic meaning:

AUC = P(score of a random positive > score of a random negative)

In plain English: imagine picking one patient who has the disease and one who doesn't, completely at random. AUC tells you how often the model gives the sick patient a higher risk score than the healthy one. An AUC of 0.85 means the model ranks the sick patient higher 85% of the time — regardless of where you set the threshold. It measures how well the model separates the two classes, not how well it predicts at any one specific cutoff.

In our example, AUC ≈ 0.83 means: if we pick one positive and one negative at random, 83% of the time the model gives a higher score to the positive. This interpretation holds regardless of class balance or threshold.
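This can be verified by brute force: compare every (positive, negative) pair directly, counting ties (if any) as half. A short sketch on the worked example:

from itertools import product
from sklearn.metrics import roc_auc_score

y_true = [1, 1, 0, 1, 0]
y_scores = [0.92, 0.81, 0.68, 0.45, 0.22]

pos = [s for s, y in zip(y_scores, y_true) if y == 1]
neg = [s for s, y in zip(y_scores, y_true) if y == 0]

# Fraction of (positive, negative) pairs where the positive scores higher;
# a tie contributes 0.5
pairs = list(product(pos, neg))
pairwise = sum((p > n) + 0.5 * (p == n) for p, n in pairs) / len(pairs)

print(pairwise)                          # 0.8333... (5 of 6 pairs ranked correctly)
print(roc_auc_score(y_true, y_scores))   # matches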

AUC        Interpretation
1.0        Perfect: every positive outranks every negative
0.9–1.0    Excellent
0.7–0.9    Good
0.5–0.7    Poor but better than random
0.5        Equivalent to random guessing
< 0.5      Worse than random (flip predictions)

Code: ROC and AUC in Scikit-learn

from sklearn.metrics import roc_curve, roc_auc_score, RocCurveDisplay
import matplotlib.pyplot as plt

# y_true: array of 0/1 labels
# y_scores: array of model probability outputs
# (the five-example data from above, as a runnable stand-in for your own data)
y_true = [1, 1, 0, 1, 0]
y_scores = [0.92, 0.81, 0.68, 0.45, 0.22]

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
auc = roc_auc_score(y_true, y_scores)

print(f"AUC: {auc:.3f}")

# Plot the ROC curve with the random-guessing diagonal for reference
RocCurveDisplay(fpr=fpr, tpr=tpr, roc_auc=auc).plot()
plt.plot([0, 1], [0, 1], 'k--', label='Random')
plt.title(f'ROC Curve (AUC = {auc:.3f})')
plt.legend()
plt.show()

Always plot the ROC curve in addition to reporting AUC — the shape of the curve contains information that the scalar AUC hides.

Quiz

Question 1 of 3

On a ROC curve, what do the two axes measure?