%matplotlib inline
from d2l import mxnet as d2l
from IPython import display
from mxnet import np, npx
npx.set_np()
# Plot a function in a normal range
x_big = np.arange(0.01, 3.01, 0.01)
ys = np.sin(x_big**x_big)
d2l.plot(x_big, ys, 'x', 'f(x)')
The single-variable calculus toolkit underlying gradient descent. The derivative
f'(x) = \lim_{\epsilon \to 0} \frac{f(x+\epsilon) - f(x)}{\epsilon}
is the slope of the best local linear approximation of f at x. The gradient-descent update x \leftarrow x - \eta f'(x), with learning rate \eta > 0, uses exactly this approximation: take a small step opposite the slope.
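The limit definition can be checked numerically. The sketch below is an illustration rather than part of the original deck: it uses plain Python's `math` module instead of MXNet so that it runs standalone, reuses the plotted function f(x) = \sin(x^x), and picks the point x = 2 and the list of \epsilon values as assumptions. The helper names `f`, `f_prime`, and `difference_quotient` are hypothetical.

```python
import math

def f(x):
    # Function from the first plot: f(x) = sin(x^x), defined for x > 0
    return math.sin(x ** x)

def f_prime(x):
    # Chain rule, using d/dx x^x = x^x (ln x + 1)
    return math.cos(x ** x) * (x ** x) * (math.log(x) + 1.0)

def difference_quotient(func, x, eps):
    # The expression inside the limit that defines f'(x)
    return (func(x + eps) - func(x)) / eps

x0 = 2.0  # illustrative point, chosen for this sketch
exact = f_prime(x0)
errors = []
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    approx = difference_quotient(f, x0, eps)
    errors.append(abs(approx - exact))
    print(f"eps={eps:.0e}  quotient={approx:.6f}  error={errors[-1]:.2e}")
print(f"analytic f'(2) = {exact:.6f}")
```

The error shrinks roughly in proportion to \epsilon, as expected for a one-sided difference quotient.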
This deck visualizes derivatives, linear approximation, and Taylor expansions, which extend the linear approximation to quadratic and higher-order terms.
The plot above shows f(x) = \sin(x^x) on a normal scale. Zoom in on a smooth function and it looks more and more like a straight line. That line is the tangent, and its slope is f'(x).
As the view narrows around a smooth point, curvature becomes less visible and the tangent line becomes the right local model.
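One way to quantify "looks like a straight line" is to compare the largest gap between the curve and its tangent with the height the tangent spans across the window. The sketch below does this without plotting. The centre x = 2, the window half-widths, and the helper `relative_gap` are assumptions made for illustration.

```python
import math

def f(x):
    # Function from the first plot: f(x) = sin(x^x), for x > 0
    return math.sin(x ** x)

def f_prime(x):
    # Chain rule, using d/dx x^x = x^x (ln x + 1)
    return math.cos(x ** x) * (x ** x) * (math.log(x) + 1.0)

def relative_gap(x0, half_width, n=201):
    # Largest |f(x) - tangent(x)| on [x0 - half_width, x0 + half_width],
    # divided by the height the tangent line covers over that window.
    slope = f_prime(x0)
    gap = 0.0
    for i in range(n):
        x = x0 - half_width + 2.0 * half_width * i / (n - 1)
        tangent = f(x0) + slope * (x - x0)
        gap = max(gap, abs(f(x) - tangent))
    return gap / (2.0 * half_width * abs(slope))

half_widths = [0.25, 0.025, 0.0025]  # progressively tighter zoom levels
gaps = [relative_gap(2.0, h) for h in half_widths]
for h, g in zip(half_widths, gaps):
    print(f"half-width={h}: relative gap={g:.4f}")
```

Each tenfold zoom cuts the relative gap by roughly a factor of ten or more, which is why the curve becomes visually indistinguishable from its tangent.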
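f(x + \epsilon) \approx f(x) + \epsilon f'(x) is the first-order Taylor approximation. It is valid only for small \epsilon, with an error of order \epsilon^2 when f is twice differentiable, and it is the foundation of gradient-descent analysis.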
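Substituting the gradient-descent step \epsilon = -\eta f'(x) into the approximation predicts f(x - \eta f'(x)) \approx f(x) - \eta f'(x)^2, a decrease whenever f'(x) \neq 0. The sketch below compares that prediction with the true value for a single step. The starting point x = 2 and the three learning rates are illustrative assumptions, not values from the deck.

```python
import math

def f(x):
    # Function from the first plot: f(x) = sin(x^x), for x > 0
    return math.sin(x ** x)

def f_prime(x):
    # Chain rule, using d/dx x^x = x^x (ln x + 1)
    return math.cos(x ** x) * (x ** x) * (math.log(x) + 1.0)

x0 = 2.0  # illustrative starting point
results = {}
for eta in (0.1, 0.01, 0.001):  # illustrative learning rates
    step = -eta * f_prime(x0)               # epsilon in the approximation
    predicted = f(x0) + step * f_prime(x0)  # equals f(x0) - eta * f'(x0)^2
    actual = f(x0 + step)
    results[eta] = (predicted, actual)
    print(f"eta={eta}: predicted={predicted:.5f}  actual={actual:.5f}")
print(f"f(x0) = {f(x0):.5f}")
```

With these values the largest step leaves the region where the linear model holds and f actually increases, while the smallest step tracks the prediction closely.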
The second derivative measures how the slope itself changes: f''(x) > 0 curves upward, f''(x) < 0 curves downward.
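A small numerical check of the sign rule, using a central second difference as a stand-in for f''. The functions exp and log, the point x = 1, and the step size h are choices made for this sketch. `second_difference` is a hypothetical helper name.

```python
import math

def second_difference(func, x, h=1e-3):
    # Central estimate of f''(x): (f(x+h) - 2 f(x) + f(x-h)) / h^2
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

convex = second_difference(math.exp, 1.0)   # exact second derivative is e > 0
concave = second_difference(math.log, 1.0)  # exact second derivative is -1 < 0
print(f"exp''(1) ~ {convex:.4f}, log''(1) ~ {concave:.4f}")

# Tangent lines at x = 1, evaluated at a nearby point:
# exp has value e and slope e there; log has value 0 and slope 1.
x = 1.5
exp_above_tangent = math.exp(x) > math.e + math.e * (x - 1.0)
log_below_tangent = math.log(x) < 0.0 + 1.0 * (x - 1.0)
print(exp_above_tangent, log_below_tangent)
```

Where f'' > 0 the curve bends away above its tangent, and where f'' < 0 it bends away below, which is the geometric content of "curves upward" and "curves downward".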
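f(x + \epsilon) = \sum_{k=0}^\infty \frac{f^{(k)}(x)}{k!} \epsilon^k, which holds for analytic f when \epsilon lies within the radius of convergence. Truncating the sum after degree n gives a polynomial approximation of order n: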
# Compute the exponential function
xs = np.arange(0, 3, 0.01)
ys = np.exp(xs)
# Compute Taylor polynomials of exp(x) around 0 (degrees 1, 2, and 5)
P1 = 1 + xs
P2 = 1 + xs + xs**2 / 2
P5 = 1 + xs + xs**2 / 2 + xs**3 / 6 + xs**4 / 24 + xs**5 / 120
d2l.plot(xs, [ys, P1, P2, P5], 'x', 'f(x)', legend=[
"Exponential", "Degree 1 Taylor Series", "Degree 2 Taylor Series",
"Degree 5 Taylor Series"])
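The plotted comparison can also be read off numerically. The sketch below generalises the hand-written polynomials above to any degree and reports the worst-case error on the same grid. It uses plain Python rather than MXNet so that it runs standalone, and `taylor_exp` is a hypothetical helper introduced for illustration.

```python
import math

def taylor_exp(x, degree):
    # Taylor polynomial of exp around 0: sum of x^k / k! for k = 0..degree
    return sum(x ** k / math.factorial(k) for k in range(degree + 1))

xs = [i * 0.01 for i in range(300)]  # same grid as np.arange(0, 3, 0.01)
max_error = {}
for degree in (1, 2, 5):
    max_error[degree] = max(abs(math.exp(x) - taylor_exp(x, degree)) for x in xs)
    print(f"degree {degree}: max error on [0, 3) = {max_error[degree]:.3f}")
```

The error is largest at the right edge of the interval, farthest from the expansion point, and drops as the degree increases.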