Building a Multiclass Classification Model in PyTorch

PyTorch is a popular library used for deep learning applications, capable of solving both regression and classification problems. In this tutorial, you will learn how to build and evaluate neural network models for multiclass classification tasks using PyTorch.

After completing this tutorial, you will be able to:

Load data from a CSV file and prepare it for PyTorch.
Prepare data for multiclass classification modeling with neural networks.
Evaluate a PyTorch neural network model on a held-out test set.

Let’s Get Started

Problem Description

In this tutorial, we will use the famous Iris dataset, a well-known benchmark in machine learning. The Iris dataset contains four numeric input features that represent measurements of iris flowers in centimeters, allowing us to classify the species based on these properties.

The dataset comprises three species of iris flowers: Iris-setosa, Iris-versicolor, and Iris-virginica. Thus, it presents a multiclass classification problem.

You can download the Iris dataset and save it as iris.csv in your working directory, or fetch it from the UCI Machine Learning Repository.

Loading the Dataset

To load and preprocess the dataset, we will use the pandas library. This will allow us to easily manage the data, including slicing and dicing the datasets as needed.

import pandas as pd

# Load the dataset
data = pd.read_csv("iris.csv", header=None)
X = data.iloc[:, 0:4]
y = data.iloc[:, 4]

In this example, X represents the feature columns, and y represents the target labels.
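
As a quick sanity check of the slicing above, the same steps can be run on a tiny in-memory sample. This is only a sketch: the three rows below are a hypothetical stand-in laid out like iris.csv (four numeric columns, one label column, no header row), and `sample_csv` is a name introduced here for illustration.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same layout as iris.csv (no header row)
sample_csv = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

data = pd.read_csv(sample_csv, header=None)
X = data.iloc[:, 0:4]  # first four columns: numeric measurements
y = data.iloc[:, 4]    # last column: species label

print(X.shape)     # number of rows, number of feature columns
print(X.dtypes)    # the feature columns should be numeric
print(y.tolist())  # the labels are still strings at this point
```

With header=None, pandas names the columns 0 through 4, which is why positional slicing with iloc is used rather than column names.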

Encoding Categorical Variables

Since the species labels are strings, we need to convert them into numerical values for model training. We can achieve this using the LabelEncoder from Scikit-learn:

from sklearn.preprocessing import LabelEncoder

# Initialize the encoder
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)

Now the class labels will be represented as integers, where each species corresponds to an integer value.

You can verify the transformation:

print(encoder.classes_)

This will display:

['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']

And printing y will show the integer codes (abbreviated here):

[0 0 0 1 1 1 2 2 2 ...]
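
The mapping also works in the other direction, which is handy when turning model predictions back into species names. The sketch below uses a short hypothetical label list rather than the full dataset; LabelEncoder assigns codes in sorted order of the class names.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, deliberately out of alphabetical order
labels = ["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"]

encoder = LabelEncoder()
codes = encoder.fit_transform(labels)  # fit and transform in one call
print(codes)

# Map integer codes (for example, predicted class indices) back to names
names = encoder.inverse_transform(np.array([2, 0]))
print(names)
```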

Preparing the Data for PyTorch

To work with the data in PyTorch, we need to convert X and y into tensors:

import torch

X = torch.tensor(X.values, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
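
It can help to check the shapes and dtypes at this point, because nn.CrossEntropyLoss expects class-index targets as a 1-D tensor of 64-bit integers, not a float column. The sketch below uses two hypothetical rows in place of the real data; `features`, `codes`, and `y_index` are illustrative names.

```python
import numpy as np
import torch

# Hypothetical stand-ins for the feature matrix and the encoded labels
features = np.array([[5.1, 3.5, 1.4, 0.2], [7.0, 3.2, 4.7, 1.4]])
codes = np.array([0, 1])

X = torch.tensor(features, dtype=torch.float32)
y = torch.tensor(codes, dtype=torch.float32).reshape(-1, 1)
print(X.shape, X.dtype)  # 2-D float tensor: one row per sample
print(y.shape, y.dtype)  # column vector of floats

# Form expected by nn.CrossEntropyLoss for class-index targets:
# drop the trailing dimension and cast to 64-bit integers
y_index = y.squeeze(1).long()
print(y_index.shape, y_index.dtype)
```

Passing the dimension to squeeze(1) keeps the batch dimension even when a batch holds a single sample, whereas a bare squeeze() would collapse such a batch to a scalar.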

Defining the Neural Network Model

Now, let’s create a neural network capable of handling the multiclass classification task. The model will consist of an input layer, one hidden layer, and an output layer.

In this case, we will create a simple feedforward neural network. The output layer produces one raw score (logit) per class. We do not apply softmax inside the model, because PyTorch’s CrossEntropyLoss applies log-softmax internally, which is more numerically stable than computing softmax and the logarithm separately.

import torch.nn as nn

class Multiclass(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)  # Input layer to first hidden layer
        self.relu = nn.ReLU()           # Activation function
        self.output = nn.Linear(8, 3)   # Hidden layer to output layer

    def forward(self, x):
        x = self.relu(self.hidden(x))   # Apply activation to hidden layer
        x = self.output(x)               # Get output from output layer
        return x

model = Multiclass()
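
Before training, a quick forward pass on random input confirms that the layers are wired as intended. The following is a sketch only: the random batch is a placeholder for real data, and softmax is applied outside the model purely to inspect probabilities.

```python
import torch
import torch.nn as nn

class Multiclass(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)
        self.relu = nn.ReLU()
        self.output = nn.Linear(8, 3)

    def forward(self, x):
        return self.output(self.relu(self.hidden(x)))

model = Multiclass()

# Placeholder batch: 5 samples with 4 features each
batch = torch.randn(5, 4)
logits = model(batch)                 # raw scores, one per class
probs = torch.softmax(logits, dim=1)  # each row now sums to 1

# Weights and biases of both linear layers: (4*8 + 8) + (8*3 + 3)
n_params = sum(p.numel() for p in model.parameters())

print(logits.shape)
print(probs.sum(dim=1))
print(n_params)
```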

Setting Up the Loss Function and Optimizer

For this multiclass classification problem, we will use cross-entropy loss. The Adam optimizer is a suitable choice for training the model:

import torch.optim as optim

loss_fn = nn.CrossEntropyLoss()  # Cross-entropy loss for multi-class classification
optimizer = optim.Adam(model.parameters(), lr=0.001)
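
To see why the model can return raw logits, the sketch below compares nn.CrossEntropyLoss against an explicit log-softmax followed by the negative log-likelihood loss. The logits and targets are made-up values chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up logits for two samples and three classes, with their true classes
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# Cross-entropy computed directly on the logits
loss_a = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps
loss_b = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss_a.item(), loss_b.item())
```

The two values agree up to floating-point error, so adding a softmax layer to the model before this loss would apply the normalization twice.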

Training the Model

We will train the model using a training loop, performing forward and backward passes to update the weights. To measure how well the model generalizes, we hold out 30% of the data as a test set and evaluate on it after every epoch.

import numpy as np
import tqdm
from sklearn.model_selection import train_test_split

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True)

# Training parameters
n_epochs = 200
batch_size = 5
batches_per_epoch = len(X_train) // batch_size

for epoch in range(n_epochs):
    model.train()
    epoch_loss = []
    with tqdm.trange(batches_per_epoch, unit="batch", mininterval=0) as bar:
        bar.set_description(f"Epoch {epoch}")
        for i in bar:
            start = i * batch_size
            X_batch = X_train[start:start + batch_size]
            y_batch = y_train[start:start + batch_size]

            # Forward pass
            y_pred = model(X_batch)
            loss = loss_fn(y_pred, y_batch.squeeze().long())

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            epoch_loss.append(loss.item())
            bar.set_postfix(loss=np.mean(epoch_loss))

    # Evaluation
    model.eval()
    with torch.no_grad():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test.squeeze().long())
        accuracy = (torch.argmax(test_pred, 1) == y_test.squeeze().long()).float().mean()
        print(f"Epoch {epoch}: Test Loss = {test_loss:.4f}, Accuracy = {accuracy:.4f}")
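
The loop above evaluates on a single hold-out split. If a k-fold cross-validation estimate is wanted instead, one possible sketch is shown below. Several things here are assumptions rather than part of the tutorial's code: the data comes from scikit-learn's bundled load_iris so the block is self-contained, the model is written with nn.Sequential using the same layer sizes, training is full-batch, and the learning rate, epoch count, and the helper names `make_model` and `train_and_score` are illustrative.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold

# Assumption: use scikit-learn's bundled copy of Iris instead of iris.csv
iris = load_iris()
X_all = torch.tensor(iris.data, dtype=torch.float32)
y_all = torch.tensor(iris.target, dtype=torch.long)

def make_model():
    # Same layer sizes as the Multiclass model: 4 -> 8 -> 3
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

def train_and_score(train_idx, test_idx, n_epochs=100):
    """Train a fresh model on one fold and return its test accuracy."""
    train_idx = torch.as_tensor(train_idx, dtype=torch.long)
    test_idx = torch.as_tensor(test_idx, dtype=torch.long)
    model = make_model()
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.01)
    model.train()
    for _ in range(n_epochs):  # full-batch updates, for brevity
        optimizer.zero_grad()
        loss = loss_fn(model(X_all[train_idx]), y_all[train_idx])
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        pred = model(X_all[test_idx]).argmax(dim=1)
    return (pred == y_all[test_idx]).float().mean().item()

# Stratified folds keep the three species balanced in every split
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = [train_and_score(tr, te) for tr, te in kfold.split(iris.data, iris.target)]
print(f"Accuracy: {np.mean(scores):.3f} (+/- {np.std(scores):.3f})")
```

A new model is created for every fold so that no fold benefits from weights trained on its own test samples.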

ROC Curve

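After training the model, you can plot a Receiver Operating Characteristic (ROC) curve for deeper performance analysis. An ROC curve is defined for binary problems, so for a multiclass model it is drawn for one class against the rest. The example below treats Iris-versicolor (class 1) as the positive class:

from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

model.eval()
with torch.no_grad():
    # Convert the logits into class probabilities
    y_prob = torch.softmax(model(X_test), dim=1).numpy()
    # One-vs-rest: class 1 is positive, every other class is negative
    y_true = (y_test.numpy().ravel() == 1).astype(int)
    fpr, tpr, thresholds = roc_curve(y_true, y_prob[:, 1])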
    plt.plot(fpr, tpr)
    plt.title("Receiver Operating Characteristic")
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.show()
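
Since an ROC curve describes a binary decision, a multiclass model is often summarized with one curve per class, each treating that class as positive and the rest as negative. The sketch below computes the area under each such curve. The labels and probabilities are hypothetical values, not output from the trained model.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

# Hypothetical true classes and predicted probabilities (each row sums to 1)
y_true = np.array([0, 0, 1, 1, 2, 2])
y_prob = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
    [0.1, 0.3, 0.6],
    [0.2, 0.4, 0.4],
])

# One indicator column per class: 1 where the sample belongs to that class
y_onehot = label_binarize(y_true, classes=[0, 1, 2])

aucs = {}
for k in range(3):
    # One-vs-rest curve for class k
    fpr, tpr, _ = roc_curve(y_onehot[:, k], y_prob[:, k])
    aucs[k] = auc(fpr, tpr)

for k, value in aucs.items():
    print(f"class {k}: AUC = {value:.3f}")
```

With real model output, y_prob would be the softmax of the logits on the test set and y_true the integer test labels.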

Complete Code

Here is the complete code encompassing all steps discussed:

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import tqdm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load the dataset
data = pd.read_csv("iris.csv", header=None)
X = data.iloc[:, 0:4]
y = data.iloc[:, 4]

# Encode the string labels as integers
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)

# Convert to PyTorch tensors
X = torch.tensor(X.values, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# Define the multiclass model
class Multiclass(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)
        self.relu = nn.ReLU()
        self.output = nn.Linear(8, 3)

    def forward(self, x):
        x = self.relu(self.hidden(x))
        x = self.output(x)
        return x

# Initialize model, loss function and optimizer
model = Multiclass()
loss_fn = nn.CrossEntropyLoss()  # Cross-entropy loss for multi-class classification
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True)

# Training loop
n_epochs = 200
batch_size = 5
batches_per_epoch = len(X_train) // batch_size

for epoch in range(n_epochs):
    model.train()
    epoch_loss = []
    with tqdm.trange(batches_per_epoch, unit="batch", mininterval=0) as bar:
        bar.set_description(f"Epoch {epoch}")
        for i in bar:
            start = i * batch_size
            X_batch = X_train[start:start + batch_size]
            y_batch = y_train[start:start + batch_size]

            # Forward pass
            y_pred = model(X_batch)
            loss = loss_fn(y_pred, y_batch.squeeze().long())

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            epoch_loss.append(loss.item())
            bar.set_postfix(loss=np.mean(epoch_loss))

    # Evaluation
    model.eval()
    with torch.no_grad():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test.squeeze().long())
        accuracy = (torch.argmax(test_pred, 1) == y_test.squeeze().long()).float().mean()
        print(f"Epoch {epoch}: Test Loss = {test_loss:.4f}, Accuracy = {accuracy:.4f}")

# ROC Curve
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

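model.eval()
with torch.no_grad():
    # Convert the logits into class probabilities
    y_prob = torch.softmax(model(X_test), dim=1).numpy()
    # One-vs-rest: class 1 (Iris-versicolor) is positive, the rest negative
    y_true = (y_test.numpy().ravel() == 1).astype(int)
    fpr, tpr, thresholds = roc_curve(y_true, y_prob[:, 1])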
    plt.plot(fpr, tpr)
    plt.title("Receiver Operating Characteristic")
    plt.xlabel("False Positive Rate")
    plt.ylabel("True Positive Rate")
    plt.show()

Summary

In this tutorial, you learned how to create a neural network model for multiclass classification using PyTorch. You covered loading and preparing the dataset, encoding categorical labels, designing and training the model, and evaluating its performance on a held-out test set and with an ROC curve. This structured approach equips you to tackle multiclass classification tasks effectively.
