%matplotlib inline
from d2l import tensorflow as d2l
from matplotlib_inline import backend_inline
import numpy as np

Training a neural net = minimizing a loss. Calculus tells us which way to step.
The geometric picture: a function and its tangent line at a point. The tangent’s slope is the derivative.
The derivative of f at x is f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}.
Take a concrete example: u = f(x) = 3x^2 - 4x. Its analytic derivative is f'(x) = 6x - 4, so f'(1) = 2.
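The analytic derivative quoted here follows directly from the limit definition. As a short worked derivation (using only the symbols f, x, and h already introduced), expand the difference quotient and cancel h:

```latex
\begin{aligned}
\frac{f(x+h) - f(x)}{h}
  &= \frac{3(x+h)^2 - 4(x+h) - (3x^2 - 4x)}{h} \\
  &= \frac{6xh + 3h^2 - 4h}{h} \\
  &= 6x - 4 + 3h \;\xrightarrow{\,h \to 0\,}\; 6x - 4 .
\end{aligned}
```

At x = 1 the quotient equals 2 + 3h exactly, so in exact arithmetic its distance from f'(1) = 2 shrinks linearly with h.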
At x = 1, the difference quotient \frac{f(x+h) - f(x)}{h} should approach f'(1) = 2 as h \to 0:
h=0.10000, numerical limit=2.30000
h=0.01000, numerical limit=2.03000
h=0.00100, numerical limit=2.00300
h=0.00010, numerical limit=2.00030
h=0.00001, numerical limit=2.00003
It does, but shrinking h much further runs into floating-point cancellation in f(x+h) - f(x), which is exactly the problem autograd sidesteps in the next chapter.
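The printed values can be reproduced with a short loop. What follows is a minimal sketch in plain Python, not necessarily the code that generated the output above; the helper name `difference_quotient` and the extra step sizes in the second loop are assumptions added for illustration. The second loop pushes h far below the values in the table to make the cancellation visible.

```python
def f(x):
    """The running example u = f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def difference_quotient(func, x, h):
    """Forward difference (func(x + h) - func(x)) / h."""
    return (func(x + h) - func(x)) / h


# Step sizes from the table: 0.1, 0.01, ..., 0.00001.
for k in range(1, 6):
    h = 10.0 ** -k
    print(f"h={h:.5f}, numerical limit={difference_quotient(f, 1.0, h):.5f}")

# Far smaller steps (illustrative choices): h approaches the spacing between
# doubles near 1.0, so f(1 + h) - f(1) keeps only a few significant bits.
for h in (1e-8, 1e-12, 1e-15):
    q = difference_quotient(f, 1.0, h)
    print(f"h={h:.0e}, numerical limit={q:.5f}, error={abs(q - 2.0):.1e}")
```

The first loop tracks 2 + 3h closely. In the second loop the error stops shrinking and eventually grows, because rounding error in the numerator is divided by an ever smaller h.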
Plot u = f(x) alongside its tangent at x = 1, the line y = 2x - 3:
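One way to draw this figure is sketched here with plain matplotlib rather than the d2l plotting helpers, so that it runs without the d2l package. The function names `df`, `tangent_line`, and `plot_function_and_tangent`, the x-range 0 to 3, and the figure size are all assumptions made for this example.

```python
def f(x):
    """The running example u = f(x) = 3x^2 - 4x."""
    return 3 * x ** 2 - 4 * x


def df(x):
    """Analytic derivative f'(x) = 6x - 4."""
    return 6 * x - 4


def tangent_line(func, dfunc, x0):
    """Return (slope, intercept) of the tangent to func at x0."""
    slope = dfunc(x0)
    intercept = func(x0) - slope * x0
    return slope, intercept


def plot_function_and_tangent(x0=1.0):
    """Plot f and its tangent at x0; returns the matplotlib (fig, ax) pair."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    slope, intercept = tangent_line(f, df, x0)
    xs = [i * 0.1 for i in range(31)]  # sample points from 0.0 to 3.0
    fig, ax = plt.subplots(figsize=(3.5, 2.5))
    ax.plot(xs, [f(x) for x in xs], label="f(x)")
    ax.plot(xs, [slope * x + intercept for x in xs], "--",
            label=f"Tangent line (x={x0:g})")
    ax.set_xlabel("x")
    ax.set_ylabel("f(x)")
    ax.legend()
    ax.grid(True)
    return fig, ax


if __name__ == "__main__":
    plot_function_and_tangent()
```

In a notebook the figure renders inline after the cell runs. In a script, call `fig.savefig(...)` or `plt.show()` on the returned figure. `tangent_line(f, df, 1.0)` gives slope 2 and intercept -3, which is the line y = 2x - 3.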
The tangent’s slope is f'(1) = 2 — derivatives are slopes, made geometric. (The d2l package wraps a few matplotlib helpers — set_figsize, plot, set_axes — used throughout the book. See the source if you’re curious.)