%matplotlib inline
from d2l import torch as d2l
import torch
from torch import nn

Sequences are everywhere: text, speech, time series, video. Three concepts set up the rest of the chapter:
This deck demos on a noisy sine wave: “predict the next value” is much easier than “predict the next 64 values.”
A noisy sine wave, 1000 time steps:
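A minimal sketch of how such a series could be generated in plain PyTorch. The frequency (0.01), the noise standard deviation (0.2), and the fixed seed are assumptions made for illustration; only the 1000 time steps come from the text above.

```python
import torch

T = 1000                      # number of time steps, as stated above
torch.manual_seed(0)          # assumed seed, only to make the sketch reproducible
time = torch.arange(1, T + 1, dtype=torch.float32)
# Assumed signal: a slow sine wave plus additive Gaussian noise
x = torch.sin(0.01 * time) + 0.2 * torch.randn(T)
```

Plotting `x` against `time` should show roughly one and a half noisy periods of the sine wave.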
Each example pairs the label x_t with the previous \tau observations as features: \mathbf{x}_t = [x_{t-\tau}, \ldots, x_{t-1}]. Train a linear regressor on the first 600 windows:
def get_dataloader(self, train):
    features = [self.x[i : self.T-self.tau+i] for i in range(self.tau)]
    self.features = d2l.stack(features, 1)
    self.labels = d2l.reshape(self.x[self.tau:], (-1, 1))
    i = slice(0, self.num_train) if train else slice(self.num_train, None)
    return self.get_tensorloader([self.features, self.labels], train, i)

Predict \hat{x}_t from the true previous \tau values. Looks great:
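The windowing and one-step-ahead prediction can also be sketched without the d2l helpers. This is an illustrative stand-in, not the deck's training loop: it fits the linear regressor in closed form with `torch.linalg.lstsq` instead of gradient descent, and the window length `tau = 4`, the signal parameters, and the seed are assumptions.

```python
import torch

torch.manual_seed(0)                      # assumed seed for reproducibility
T, tau, num_train = 1000, 4, 600          # tau = 4 is an assumed window length
time = torch.arange(1, T + 1, dtype=torch.float32)
x = torch.sin(0.01 * time) + 0.2 * torch.randn(T)  # assumed signal and noise

# Row j of `features` is [x[j], ..., x[j+tau-1]]; its label is x[j+tau]
features = torch.stack([x[i : T - tau + i] for i in range(tau)], 1)  # (T - tau, tau)
labels = x[tau:].reshape(-1, 1)                                      # (T - tau, 1)

# Closed-form least squares with a bias column, fit on the first num_train windows
X = torch.cat([features, torch.ones(T - tau, 1)], 1)
w = torch.linalg.lstsq(X[:num_train], labels[:num_train]).solution   # (tau + 1, 1)

# One-step-ahead predictions: every input window contains only true observations
onestep_preds = X @ w
test_mse = ((onestep_preds[num_train:] - labels[num_train:]) ** 2).mean().item()
```

Because each prediction is made from true past values, errors here cannot accumulate from one step to the next.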
But forecasting more than one step ahead means feeding predicted values back in as inputs, so errors compound:
def k_step_pred(k):
    features = []
    for i in range(data.tau):
        features.append(data.x[i : i+data.T-data.tau-k+1])
    # The (i+tau)-th element stores the (i+1)-step-ahead predictions
    for i in range(k):
        preds = model(d2l.stack(features[i : i+data.tau], 1))
        features.append(d2l.reshape(preds, -1))
    return features[data.tau:]

The 1- and 4-step curves track the truth; 16- and 64-step predictions decay to noise. Long-horizon forecasting is hard.
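To put a number on the compounding, the k-step rollout can be sketched in self-contained form and scored with mean squared error per horizon. As before, this is a sketch under assumptions: the least-squares fit (without a bias term, for brevity), `tau = 4`, the signal parameters, and the helper name `k_step_mse` are illustrative and not part of the original code.

```python
import torch

torch.manual_seed(0)                      # assumed seed for reproducibility
T, tau, num_train = 1000, 4, 600          # tau = 4 is an assumed window length
time = torch.arange(1, T + 1, dtype=torch.float32)
x = torch.sin(0.01 * time) + 0.2 * torch.randn(T)  # assumed signal and noise

features = torch.stack([x[i : T - tau + i] for i in range(tau)], 1)
labels = x[tau:].reshape(-1, 1)
# Linear model fit by least squares on the first num_train windows (no bias term)
w = torch.linalg.lstsq(features[:num_train], labels[:num_train]).solution

def k_step_mse(k):
    """Hypothetical helper: MSE of k-step-ahead predictions over all windows."""
    n = T - tau - k + 1                   # windows that have a true value k steps ahead
    window = features[:n].clone()         # start from true observations
    for _ in range(k):
        pred = window @ w                 # (n, 1) prediction for the next step
        # Slide the window: drop the oldest value, append the model's own prediction
        window = torch.cat([window[:, 1:], pred], 1)
    target = x[tau + k - 1 : tau + k - 1 + n]
    return ((pred.squeeze(1) - target) ** 2).mean().item()

errors = {k: k_step_mse(k) for k in (1, 4, 16, 64)}
```

Comparing the entries of `errors` shows how the error changes with the horizon: for k = 1 every input is a true observation, while for larger k more and more of the window consists of earlier predictions.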