%matplotlib inline
from d2l import mxnet as d2l
from matplotlib_inline import backend_inline
from mxnet import np, npx
npx.set_np()

Training a neural net means minimizing a loss, and calculus tells us which way to step.
The geometric picture: a function and its tangent line at a point. The tangent’s slope is the derivative.
The derivative of f at x is f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}, provided the limit exists.
Take a concrete example, u = f(x) = 3x^2 - 4x (analytic derivative: f'(x) = 6x - 4, so f'(1) = 2):
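A minimal sketch of this example, assuming plain Python floats rather than the mxnet arrays imported above so that the cell runs on its own. The helper name f_prime is introduced here for illustration only.

```python
def f(x):
    """The running example: f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def f_prime(x):
    """Analytic derivative f'(x) = 6x - 4, kept for reference."""
    return 6 * x - 4


print(f(1.0))        # value of the function at x = 1
print(f_prime(1.0))  # analytic slope at x = 1
```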
At x = 1, the difference quotient \frac{f(x+h) - f(x)}{h} should approach f'(1) = 2 as h \to 0:
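One way to check this numerically is sketched below. It assumes plain Python floats (64-bit) instead of mxnet arrays, and the helper name difference_quotient is hypothetical. For this particular f, expanding the numerator gives a quotient of exactly 2 + 3h at x = 1, so the printed values should shrink toward 2 in step with h.

```python
def f(x):
    """The running example: f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


# Shrink h by a factor of 10 each step: 0.1, 0.01, ..., 0.00001.
quotients = {}
for k in range(1, 6):
    h = 10.0 ** -k
    quotients[h] = difference_quotient(f, 1.0, h)
    print(f"h={h:.5f}, difference quotient={quotients[h]:.5f}")
```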
It does, but once h gets very small the subtraction f(x+h) - f(x) suffers floating-point cancellation and the quotient loses accuracy, which is exactly the problem autograd sidesteps in the next chapter.
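The cancellation can be made visible by pushing h much further down. The sketch below again assumes 64-bit Python floats; with the 32-bit floats that mxnet uses by default, the breakdown would be expected at larger h. At h = 1e-16 the sum 1.0 + h rounds back to 1.0, so the numerator is exactly zero and the quotient collapses.

```python
def f(x):
    """The running example: f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def difference_quotient(f, x, h):
    """Slope of the secant line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


# Compare against the analytic value f'(1) = 2 as h approaches machine precision.
for k in (5, 8, 11, 14, 16):
    h = 10.0 ** -k
    q = difference_quotient(f, 1.0, h)
    print(f"h=1e-{k:02d}  quotient={q:.10f}  abs error={abs(q - 2.0):.2e}")
```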
Plot u = f(x) alongside its tangent line at x = 1, y = 2x - 3:
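A hedged sketch of such a plot follows. It calls matplotlib directly rather than the d2l helpers, whose signatures are not shown in this section. The names tangent and plot_function_and_tangent, the sampling range [0, 3), and the figure size are all assumptions made for illustration.

```python
def f(x):
    """The running example: f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def tangent(x):
    """Tangent to f at x = 1: slope f'(1) = 2 through the point (1, f(1)) = (1, -1)."""
    return 2 * x - 3


# Sample x on [0, 3) in steps of 0.1 (range chosen for illustration).
xs = [0.1 * i for i in range(30)]
curve = [f(x) for x in xs]
line = [tangent(x) for x in xs]


def plot_function_and_tangent():
    """Draw f and its tangent; matplotlib is imported lazily, only when plotting."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(3.5, 2.5))
    ax.plot(xs, curve, label="f(x)")
    ax.plot(xs, line, "--", label="Tangent line (x=1)")
    ax.set_xlabel("x")
    ax.set_ylabel("f(x)")
    ax.legend()
    ax.grid(True)
    return fig
```

Calling plot_function_and_tangent() in a notebook cell renders the figure. Since f(x) - (2x - 3) = 3(x - 1)^2, the curve never dips below the line and touches it only at x = 1.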
The tangent’s slope is f'(1) = 2: the derivative at a point is the slope of the tangent line there. (The d2l package wraps a few matplotlib helpers, namely set_figsize, plot, and set_axes, that are used throughout the book. See the source if you’re curious.)