The language of uncertainty and compression
What is information? Entropy
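A minimal sketch of Shannon entropy in bits, assuming a distribution given as a plain list of probabilities (the function name and example values are illustrative, not from the text):

```python
import math

def entropy(p):
    """Shannon entropy in bits: H(p) = -sum_i p_i * log2(p_i), with 0*log 0 = 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# A fair coin is maximally uncertain: one full bit per flip.
print(entropy([0.5, 0.5]))
# A biased coin is more predictable, so each flip carries less information.
print(entropy([0.9, 0.1]))
```

The `if pi > 0` guard implements the usual convention that outcomes with zero probability contribute nothing to the sum.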
KL divergence: an asymmetric "distance" between distributions
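A short sketch of the KL divergence D(p || q) in bits, assuming both distributions share the same support order; the example distributions are hypothetical, chosen to show that swapping the arguments changes the value:

```python
import math

def kl(p, q):
    """D_KL(p || q) = sum_i p_i * log2(p_i / q_i), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]
# Not a true metric: kl(p, q) and kl(q, p) generally differ.
print(kl(p, q), kl(q, p))
# The divergence is zero exactly when the two distributions coincide.
print(kl(p, p))
```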
Mutual information
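A sketch of mutual information computed from a small joint distribution, assuming the joint is given as a nested list p(x, y) with marginals obtained by summing rows and columns (the example tables are illustrative):

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), in bits."""
    px = [sum(row) for row in joint]          # marginal over rows: p(x)
    py = [sum(col) for col in zip(*joint)]    # marginal over columns: p(y)
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Perfectly correlated fair coins: observing one reveals the other (1 bit).
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))
# Independent fair coins: the joint factorizes, so the information shared is 0.
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))
```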
Cross-entropy as negative log-likelihood
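A sketch of the identity behind this heading: the average negative log-likelihood of a sample under a model q equals the cross-entropy H(p, q) between the sample's empirical distribution p and q. The data and model values below are hypothetical stand-ins:

```python
import math
from collections import Counter

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log2(q_i), in bits; equals H(p) + D_KL(p || q)."""
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)

data = ['a', 'a', 'a', 'b']          # hypothetical observations
model_q = {'a': 0.7, 'b': 0.3}       # hypothetical model distribution

# Average negative log-likelihood of the data under the model, in bits.
nll = -sum(math.log2(model_q[x]) for x in data) / len(data)

# Empirical distribution of the data over the same symbol order.
counts = Counter(data)
empirical = [counts[s] / len(data) for s in ['a', 'b']]

# The two quantities agree, which is why minimizing cross-entropy loss
# is the same as maximizing likelihood.
print(nll, cross_entropy(empirical, [model_q['a'], model_q['b']]))
```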