LSTM for Time Series Prediction in PyTorch

hasnainmehdi1172@gmail.com

2 months ago

Long Short-Term Memory (LSTM) networks are a specialized type of recurrent neural network (RNN) designed to handle sequential data effectively. They excel in applications such as time series prediction and natural language processing. In this article, you will learn how to build an LSTM model for time series forecasting using PyTorch.

In particular, you will discover:

An overview of LSTM networks and how they differ from standard RNNs.
How to develop an LSTM network specifically for time series prediction.
How to train an LSTM model.

Let’s get started!

Overview

This tutorial is structured into three sections:

Understanding LSTM Networks
Building an LSTM for Time Series Prediction
Training and Evaluating Your LSTM Network

Understanding LSTM Networks

LSTM cells are building blocks for larger neural networks. Unlike a fully connected layer, which applies a single matrix multiplication to its input, an LSTM cell carries internal state from one time step to the next, which lets it retain information over long input sequences.

A typical LSTM cell takes one time step of the input tensor together with a cell state and a hidden state. Both states can be initialized to zero. Inside the cell, the input and the previous hidden state pass through several weight tensors and sigmoid and tanh activations (the forget, input, and output gates), which update the cell state and produce the hidden state for the next time step.

This gated design lets LSTMs capture long-range dependencies in the data, which standard RNNs struggle with because their gradients tend to vanish over long sequences.
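To make the step-by-step state update concrete, here is a minimal sketch that drives PyTorch's nn.LSTMCell over a short dummy sequence. The sizes and the random input are assumptions chosen only for illustration; they are not part of the airline example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration
input_size, hidden_size, seq_len, batch = 1, 4, 5, 2

cell = nn.LSTMCell(input_size, hidden_size)
x = torch.randn(batch, seq_len, input_size)  # dummy input sequence

# Both states start at zero
h = torch.zeros(batch, hidden_size)
c = torch.zeros(batch, hidden_size)

for t in range(seq_len):
    # One time step: the cell consumes x_t plus the previous (h, c)
    # and returns the updated hidden state and cell state
    h, c = cell(x[:, t, :], (h, c))

print(h.shape, c.shape)  # torch.Size([2, 4]) torch.Size([2, 4])
```

In practice nn.LSTM runs this loop for you over the whole sequence, which is what the model later in this tutorial relies on.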

Building an LSTM for Time Series Prediction

For this tutorial, you’ll focus on the international airline passengers prediction problem. The goal is to forecast the monthly number of international airline passengers, recorded in thousands. The dataset spans January 1949 to December 1960 and contains 144 monthly observations.

First, you need to load the data. Save the dataset as airline-passengers.csv and ensure the structure is as follows:

"Month","Passengers"
"1949-01",112
"1949-02",118
"1949-03",132
...

Use the following code snippet to read the data:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('airline-passengers.csv')
timeseries = df[["Passengers"]].values.astype('float32')
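As a quick sanity check, the sketch below parses only the three sample rows shown above from an in-memory string, so it runs without the CSV file. The names sample_csv, df_sample, and sample_series are hypothetical. Note that selecting with double brackets keeps a 2-D array of shape (n, 1), which matters later when shaping the LSTM input.

```python
import io
import pandas as pd

# The three sample rows from the file layout above, held in memory
sample_csv = '"Month","Passengers"\n"1949-01",112\n"1949-02",118\n"1949-03",132\n'

df_sample = pd.read_csv(io.StringIO(sample_csv))
sample_series = df_sample[["Passengers"]].values.astype("float32")

print(sample_series.shape)  # (3, 1): one row per month, one feature column
print(sample_series.dtype)  # float32
```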

Preparing the Data

Next, split the dataset into training and testing sets without shuffling, as it’s important to maintain the time order:

train_size = int(len(timeseries) * 0.67)
train, test = timeseries[:train_size], timeseries[train_size:]
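To see what the 67% split works out to for 144 observations, the following sketch applies the same slicing to a dummy array of the same length. The names dummy, train_part, and test_part are placeholders, not the real data.

```python
import numpy as np

n_obs = 144  # number of observations stated for the dataset
dummy = np.arange(n_obs, dtype="float32").reshape(-1, 1)

train_size = int(len(dummy) * 0.67)  # int() truncates 96.48 to 96
train_part, test_part = dummy[:train_size], dummy[train_size:]

print(len(train_part), len(test_part))  # 96 48
# The test part starts exactly where the training part ends, so time order is kept
print(train_part[-1, 0], test_part[0, 0])  # 95.0 96.0
```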

To create the input-output pairs for the LSTM model, define a function to form a dataset based on a specified look-back window:

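import numpy as np
import torch

def create_dataset(dataset, lookback):
    """Turn a (n_samples, 1) array into (window, next value) tensor pairs."""
    X, y = [], []
    for i in range(len(dataset) - lookback):
        feature = dataset[i:i + lookback]   # window of `lookback` time steps
        target = dataset[i + lookback]      # the value right after the window
        X.append(feature)
        y.append(target)
    # Stack into single NumPy arrays first; building a tensor from a list of arrays is slow
    return torch.tensor(np.array(X)), torch.tensor(np.array(y))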

With a look-back of, say, 1, transform the training and testing data:

lookback = 1
X_train, y_train = create_dataset(train, lookback)
X_test, y_test = create_dataset(test, lookback)
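The windowing is easier to follow on a toy series. The sketch below uses a hypothetical helper, make_windows, that mirrors the look-back logic on five made-up values, and prints the resulting shapes. Because the input array is 2-D, the features come out as (samples, lookback, 1), which is already the layout a batch-first LSTM expects.

```python
import numpy as np
import torch

def make_windows(dataset, lookback):
    # Hypothetical helper mirroring the look-back windowing described above
    X, y = [], []
    for i in range(len(dataset) - lookback):
        X.append(dataset[i:i + lookback])
        y.append(dataset[i + lookback])
    return torch.tensor(np.array(X)), torch.tensor(np.array(y))

# Made-up values, shaped (5, 1) like the real series
toy = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]], dtype="float32")
X_toy, y_toy = make_windows(toy, lookback=2)

print(X_toy.shape, y_toy.shape)  # torch.Size([3, 2, 1]) torch.Size([3, 1])
print(X_toy[0].flatten().tolist(), y_toy[0].item())  # [10.0, 20.0] 30.0
```

With the real data and a look-back of 1, the same logic would yield one fewer sample than the number of rows in each split.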

Building the LSTM Model

Define your LSTM network architecture in PyTorch, incorporating an LSTM layer along with a fully connected layer:

import torch.nn as nn

class AirModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=50, num_layers=1, batch_first=True)
        self.linear = nn.Linear(50, 1)

    def forward(self, x):
        x, _ = self.lstm(x)
        x = self.linear(x[:, -1, :])  # Only take the output from the last time step
        return x
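Before training, it can help to confirm the tensor shapes with a dummy batch. The sketch below redefines an equivalent model under the hypothetical name TinyAirModel so that it runs on its own; the batch of zeros is a stand-in for real data.

```python
import torch
import torch.nn as nn

class TinyAirModel(nn.Module):
    # Stand-in with the same layout as the model above: one LSTM layer plus a linear head
    def __init__(self, hidden_size=50):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            num_layers=1, batch_first=True)
        self.linear = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)              # out: (batch, seq_len, hidden_size)
        return self.linear(out[:, -1, :])  # keep only the last time step

dummy_batch = torch.zeros(8, 1, 1)  # (batch, lookback, features)
with torch.no_grad():
    out = TinyAirModel()(dummy_batch)

print(out.shape)  # torch.Size([8, 1])
```

The output has one prediction per sample, matching the (samples, 1) shape of the targets.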

Training the LSTM Model

Use a DataLoader to batch the dataset during training. Set up the loss function and the optimizer, then define the training loop:

from torch.utils.data import DataLoader, TensorDataset
import torch.optim as optim

train_loader = DataLoader(TensorDataset(X_train, y_train), shuffle=True, batch_size=8)

model = AirModel()
optimizer = optim.Adam(model.parameters())
loss_fn = nn.MSELoss()

n_epochs = 2000
for epoch in range(n_epochs):
    model.train()
    for X_batch, y_batch in train_loader:
        y_pred = model(X_batch)  # X_batch is already (batch, lookback, 1)
        loss = loss_fn(y_pred, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    if epoch % 100 == 0:
        print(f'Epoch {epoch}: Loss: {loss.item():.4f}')
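The loss printed in the loop above is the loss of the last batch in that epoch, which can be noisy with shuffled mini-batches. As one possible variation, the sketch below averages the loss over all samples in each epoch. It uses random stand-in data and a smaller hypothetical network (hidden size 8, three epochs) so that it runs quickly; none of these values come from the tutorial.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data with the same layout as the training tensors
X_demo = torch.rand(32, 1, 1)    # (samples, lookback, features)
y_demo = 2.0 * X_demo[:, -1, :]  # (samples, 1), an arbitrary target

lstm = nn.LSTM(input_size=1, hidden_size=8, batch_first=True)
head = nn.Linear(8, 1)
optimizer = optim.Adam(list(lstm.parameters()) + list(head.parameters()))
loss_fn = nn.MSELoss()
loader = DataLoader(TensorDataset(X_demo, y_demo), shuffle=True, batch_size=8)

epoch_losses = []
for epoch in range(3):
    running = 0.0
    for X_batch, y_batch in loader:
        out, _ = lstm(X_batch)
        loss = loss_fn(head(out[:, -1, :]), y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the epoch mean is exact
        running += loss.item() * len(X_batch)
    epoch_losses.append(running / len(X_demo))

print(epoch_losses)
```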

Verifying Model Performance

After training, evaluate the model’s performance on the test set:

model.eval()
with torch.no_grad():
    y_pred = model(X_test)  # X_test is already (samples, lookback, 1)
    test_loss = loss_fn(y_pred, y_test)
    print(f'Test Loss: {test_loss.item():.4f}')
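Because MSE is expressed in squared units, the root mean squared error (RMSE) is often easier to interpret, since it is in the same units as the passenger counts. A minimal sketch follows, with a hypothetical rmse helper and made-up numbers rather than real model output:

```python
import torch
import torch.nn.functional as F

def rmse(pred, target):
    # Square root of the mean squared error, returned as a Python float
    return torch.sqrt(F.mse_loss(pred, target)).item()

# Made-up predictions and targets, shaped (samples, 1) like the model output
pred = torch.tensor([[100.0], [110.0]])
target = torch.tensor([[103.0], [106.0]])

# Errors are -3 and 4, so MSE = (9 + 16) / 2 = 12.5 and RMSE is about 3.54
print(rmse(pred, target))
```

The same helper could be applied to the test-set predictions and targets computed inside the torch.no_grad() block.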

Conclusion

In this tutorial, you learned how to develop an LSTM model for time series prediction in PyTorch. Specifically, you covered:

How to prepare data for modeling.
The structure and components of an LSTM network.
The training process and how to verify the model’s predictions.

With this foundation in place, you can harness the power of LSTM networks to tackle various sequential prediction problems in your deep learning projects.