Most of machine learning is inference under uncertainty.
The standard d2l prelude (plus a multinomial distribution we’ll use shortly):
%matplotlib inline
from d2l import torch as d2l
import random
import torch
from torch.distributions.multinomial import Multinomial
The chapter’s running example: tossing a fair coin. As the sample count grows, empirical frequencies converge to the true P = 0.5.
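Before reaching for a distribution object, the coin can be simulated by hand with the standard-library `random` module. This is only a minimal sketch: the helper name `toss_fair_coin` and its `seed` parameter are hypothetical and do not come from the chapter.

```python
import random

def toss_fair_coin(num_tosses, seed=None):
    """Return (heads, tails) counts for num_tosses independent fair tosses.

    Hypothetical helper for illustration; not part of d2l or torch.
    """
    rng = random.Random(seed)  # private generator, so a seed gives repeatable runs
    # rng.random() is uniform on [0, 1), so "< 0.5" is true with probability 0.5
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads, num_tosses - heads

heads, tails = toss_fair_coin(100, seed=0)
print(heads, tails)  # two counts that add up to 100
```

Each toss is one Bernoulli draw, so the counts fluctuate around 50/50 from run to run unless a seed is fixed.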
A cleaner abstraction: a Multinomial over the categories {heads, tails} with probabilities [0.5, 0.5]. One call returns the count vector for 100 tosses:
tensor([45., 55.])
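A count vector like the one above could be produced by a call along these lines. This is a sketch, assuming index 0 stands for heads and index 1 for tails; the variable name `counts_100` is illustrative, and the sampled counts will differ from run to run.

```python
import torch
from torch.distributions.multinomial import Multinomial

# Fair coin: [P(heads), P(tails)]; the heads-first ordering is an assumption
fair_probs = torch.tensor([0.5, 0.5])

# total_count=100 simulates 100 tosses in a single call and
# returns how many landed in each category
counts_100 = Multinomial(100, fair_probs).sample()
print(counts_100)  # a length-2 tensor whose entries sum to 100
```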
With 10 000 tosses, the empirical frequencies sit much closer to 0.5:
tensor([0.5022, 0.4978])
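Frequencies like these follow from dividing the counts by the number of tosses. A minimal sketch, with `num_tosses` and `freqs` as illustrative names; the exact values vary with each run.

```python
import torch
from torch.distributions.multinomial import Multinomial

fair_probs = torch.tensor([0.5, 0.5])  # [P(heads), P(tails)], ordering assumed
num_tosses = 10000

# Counts per category over all tosses, then normalize to empirical frequencies
counts = Multinomial(num_tosses, fair_probs).sample()
freqs = counts / num_tosses
print(freqs)  # two values close to 0.5 that sum to 1
```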
This is the law of large numbers: as n \to \infty the empirical mean converges to the true mean.
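For the coin, the statement can be written out explicitly. The notation here is an assumption for illustration: X_i is 1 if toss i lands heads and 0 otherwise, p is the true heads probability, and \hat{p}_n is the empirical frequency after n tosses. This is the weak form of the law.

```latex
\hat{p}_n = \frac{1}{n} \sum_{i=1}^{n} X_i ,
\qquad
\lim_{n \to \infty} P\left( \left| \hat{p}_n - p \right| > \epsilon \right) = 0
\quad \text{for every } \epsilon > 0 .
```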
Plot the running estimate of P(\text{heads}) and P(\text{tails}) vs. sample count — the curves zigzag toward 0.5:
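# Fair coin: [P(heads), P(tails)], as in the Multinomial example above
fair_probs = torch.tensor([0.5, 0.5])
# Draw 10,000 single tosses; each row is one-hot ([1, 0] or [0, 1])
counts = Multinomial(1, fair_probs).sample((10000,))
# Running totals of each outcome after every toss
cum_counts = counts.cumsum(dim=0)
# Divide each row by the number of tosses so far to get running frequencies
estimates = cum_counts / cum_counts.sum(dim=1, keepdim=True)
estimates = estimates.numpy()

d2l.set_figsize((4.5, 3.5))
d2l.plt.plot(estimates[:, 0], label="P(coin=heads)")
d2l.plt.plot(estimates[:, 1], label="P(coin=tails)")
# Dashed reference line at the true probability
d2l.plt.axhline(y=0.5, color='black', linestyle='dashed')
d2l.plt.gca().set_xlabel('Samples')
d2l.plt.gca().set_ylabel('Estimated probability')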
d2l.plt.legend();
The standard deviation of the estimate shrinks like 1/\sqrt{n} (its variance like 1/n), so doubling accuracy means quadrupling the sample budget.
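To make the scaling concrete: for a coin with heads probability p, the empirical frequency over n independent tosses has variance p(1-p)/n, so its standard deviation is \sqrt{p(1-p)/n}. The sketch below evaluates that formula; the function name `standard_error` is a hypothetical helper, not something from the chapter.

```python
import math

def standard_error(p, n):
    """Standard deviation of the empirical frequency over n independent tosses.

    Hypothetical helper: evaluates sqrt(p * (1 - p) / n).
    """
    return math.sqrt(p * (1 - p) / n)

# Each step multiplies n by 4, which halves the standard error
for n in (100, 400, 1600):
    print(n, standard_error(0.5, n))
```

At n = 100 the standard error for a fair coin is 0.05; cutting it to 0.025 takes n = 400.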