%matplotlib inline
from d2l import tensorflow as d2l
from IPython import display
import tensorflow as tf
tf.linalg.eig(tf.constant([[2, 1], [2, 3]], dtype=tf.float64))A square matrix \mathbf{A} has eigenvalue \lambda and eigenvector \mathbf{v} when
\mathbf{A}\mathbf{v} = \lambda \mathbf{v}.
Geometrically: \mathbf{A} stretches \mathbf{v} by \lambda but doesn’t rotate it. If \mathbf{A} is diagonalizable: \mathbf{A} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^{-1} — a basis change in which the action is just stretching along axes.
Why we care: matrix powers \mathbf{A}^t are governed by \lambda^t. Repeated application of \mathbf{A} aligns arbitrary inputs with the dominant eigenvector. That’s the heart of vanishing/exploding gradients in RNNs, of PageRank, and of every iterative solver.
Use a small matrix so the geometry is visible: applying \mathbf{A} to an eigenvector changes scale but not direction.
(<tf.Tensor: shape=(2,), dtype=complex128, numpy=array([1.+0.j, 4.+0.j])>,
<tf.Tensor: shape=(2, 2), dtype=complex128, numpy=
array([[-0.70710678+0.j, -0.4472136 +0.j],
[ 0.70710678+0.j, -0.89442719+0.j]])>)
Cheap eigenvalue bounds without computing them: eigenvalues lie in the union of disks centered at a_{ii} with radius \sum_{j \ne i} |a_{ij}|. Useful for stability arguments:
<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.99228525, 2.9734395 , 4.953943 , 9.080336 ], dtype=float32)>
Power iteration: keep multiplying by \mathbf{A}. The direction converges to the leading eigenvector; the norm grows like \lambda_1^t:
<tf.Tensor: shape=(5, 5), dtype=float64, numpy=
array([[ 1.03149539, 0.41232531, 0.02266882, 0.67373294, -0.87260178],
[-1.0966355 , -0.39447736, 0.57244043, -0.51296719, -0.33833431],
[-0.44988705, -1.01188798, -1.66646424, -1.84863467, -0.63040954],
[ 1.25508111, -1.64698523, 0.89168499, 0.8669014 , 0.09465951],
[ 1.191481 , -1.45418164, 0.21448572, 0.80032122, 1.53658541]])>
After repeated multiplication, normalize the vector to read off the direction; the scale factor estimates the dominant eigenvalue.
norms of eigenvalues: [0.05364470876910993, 1.8603626913408091, 1.8603626913408091, 2.2231936803962955, 2.2231936803962955]
# Rescale the matrix `A`
A /= norm_eigs[-1]
# Do the same experiment again
v_in = tf.random.normal((k, 1), dtype=tf.float64)
norm_list = [tf.norm(v_in).numpy()]
for i in range(1, 100):
v_in = tf.matmul(A, v_in)
norm_list.append(tf.norm(v_in).numpy())
d2l.plot(tf.range(0, 100), norm_list, 'Iteration', 'Value')