Building a Multiclass Classification Model in PyTorch

The PyTorch library is a leading framework for deep learning, widely used for both regression and classification tasks. In this tutorial, you will learn how to use PyTorch to create and evaluate neural network models specifically for multiclass classification problems.

By the end of this tutorial, you will be able to:

Load data from a CSV file and prepare it for use in PyTorch.
Format data for multiclass classification modeling with neural networks.
Evaluate a PyTorch neural network model on a held-out test set.

Let’s Get Started

Problem Description

In this tutorial, you will work with the well-known Iris dataset. It is a good practice dataset: each record holds measurements of one iris flower, and the task is to classify the flower into one of three species based on four features.

The dataset contains four numeric input features, all representing measurements in centimeters. The output is the species label, making it a classic multiclass classification problem.

You can download the Iris dataset from the UCI Machine Learning Repository. Save it as iris.csv in your working directory.

Loading the Dataset

We’ll use the pandas library to read the dataset. The file has no header row, so we pass header=None, then separate the first four columns (the features) from the last column (the labels).

import pandas as pd

# Load the dataset
data = pd.read_csv("iris.csv", header=None)
X = data.iloc[:, 0:4]  # Input features
y = data.iloc[:, 4]     # Target labels
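
To see what header=None and iloc do without needing the file on disk, here is a minimal sketch that reads a few sample rows from an in-memory string. The names sample_csv, sample, features, and labels are illustrative assumptions, not part of the tutorial's code.

```python
import io
import pandas as pd

# Three sample rows in the same layout as iris.csv (no header row)
sample_csv = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)

# header=None makes pandas label the columns 0..4 instead of
# treating the first data row as column names
sample = pd.read_csv(sample_csv, header=None)

features = sample.iloc[:, 0:4]  # columns 0-3: the four measurements
labels = sample.iloc[:, 4]      # column 4: the species string

print(features.shape)
print(labels.tolist())
```

Without header=None, the first flower would be consumed as column names and silently dropped from the data.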

Encoding Categorical Variables

The target labels need to be converted from strings to numerical values, as neural networks operate on numerical data. We can use Scikit-learn’s LabelEncoder for this task:

from sklearn.preprocessing import LabelEncoder

# Initialize and fit the encoder
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)  # Transform labels to numbers

You can verify the transformation:

print(encoder.classes_)  # Displaying the original categories

This will display:

['Iris-setosa' 'Iris-versicolor' 'Iris-virginica']

Printing the encoded labels shows integers in place of the species names (abbreviated here):

[0 0 0 1 1 1 2 2 2 ...]
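
The encoder is also useful in the other direction: once a model predicts class indices, inverse_transform maps them back to species names. The sketch below uses a short made-up label list (species) and made-up predictions (predicted) purely for illustration.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# A small, deliberately unsorted list of labels
species = ["Iris-virginica", "Iris-setosa", "Iris-versicolor", "Iris-setosa"]

enc = LabelEncoder()
codes = enc.fit_transform(species)  # fit and transform in one call

# Classes are stored in sorted order, so the integer assigned to a
# label does not depend on where it first appears in the data
print(enc.classes_)
print(codes)

# Map class indices (for example, model predictions) back to names
predicted = np.array([2, 0])
print(enc.inverse_transform(predicted))
```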

Preparing the Data for PyTorch

Now that the target labels are encoded as integers, we need to convert both the features and the labels into PyTorch tensors:

import torch

X = torch.tensor(X.values, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
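
This tutorial slices mini-batches by hand, but PyTorch also provides TensorDataset and DataLoader for batching and shuffling. The following is a sketch of that alternative, not the approach used in the rest of the tutorial. It uses random stand-in tensors (X_demo, y_demo) and, unlike the code above, stores the labels as a 1-D int64 tensor, which is the form CrossEntropyLoss expects for class-index targets.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in tensors with the same shape as the Iris data (150 rows, 4 features)
X_demo = torch.randn(150, 4)
y_demo = torch.randint(0, 3, (150,))  # class indices 0..2, dtype torch.int64

# TensorDataset pairs each feature row with its label
dataset = TensorDataset(X_demo, y_demo)

# DataLoader yields shuffled mini-batches; the order changes every epoch
loader = DataLoader(dataset, batch_size=5, shuffle=True)

X_batch, y_batch = next(iter(loader))
print(X_batch.shape, y_batch.shape, y_batch.dtype)
print(len(loader))  # number of batches per epoch
```

With labels stored this way, a batch can be passed to the loss function directly, without squeezing or casting.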

Defining the Neural Network Model

Next, we will define a neural network capable of handling multiclass classification. This model will consist of an input layer, at least one hidden layer, and an output layer configured for three output classes.

A straightforward architecture has one hidden layer with ReLU activation and an output layer that returns one raw score (logit) per class. The model applies no softmax itself, because CrossEntropyLoss, used below, applies it internally:

import torch.nn as nn

class Multiclass(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)  # 4 input features to 8 neurons in a hidden layer
        self.relu = nn.ReLU()           # Activation function for hidden layer
        self.output = nn.Linear(8, 3)   # 8 hidden neurons to 3 output classes

    def forward(self, x):
        x = self.relu(self.hidden(x))   # Apply ReLU activation
        x = self.output(x)               # Output layer
        return x

model = Multiclass()
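
Because the model returns raw scores rather than probabilities, you can apply softmax yourself at inference time when you need probabilities. The sketch below assumes an untrained network with the same layer sizes, written with nn.Sequential for brevity; the names net, x, probs, and predicted_class are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Same layer sizes as the Multiclass model: 4 inputs, 8 hidden units, 3 classes
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

x = torch.tensor([[5.1, 3.5, 1.4, 0.2]])  # one flower, four measurements

with torch.no_grad():
    logits = net(x)                        # raw scores, shape (1, 3)
    probs = torch.softmax(logits, dim=1)   # non-negative, each row sums to 1
    predicted_class = probs.argmax(dim=1)  # index of the most likely class

print(logits)
print(probs)
print(predicted_class)
```

Softmax preserves the ordering of the scores, so taking argmax of the logits gives the same predicted class as taking argmax of the probabilities.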

Setting Up the Loss Function and Optimizer

For multiclass classification, use CrossEntropyLoss as the loss function. The Adam optimizer, here with a learning rate of 0.001, updates the model's weights during training:

import torch.optim as optim

loss_fn = nn.CrossEntropyLoss()  # Loss function for multiclass classification
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer
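
To make concrete what CrossEntropyLoss computes, the sketch below checks it against a manual calculation: log-softmax over the logits, followed by the mean negative log-probability of the correct class. The logits and targets are made-up values chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Two samples, three classes: raw model outputs and the true class indices
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])  # must be an integer (long) tensor of shape (N,)

# Built-in loss: takes logits directly, no softmax layer needed
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual equivalent: log-softmax, pick the log-probability of the
# true class for each sample, negate, and average
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(len(targets)), targets].mean()

print(loss_builtin.item(), loss_manual.item())
```

This is why adding a softmax layer to the model before CrossEntropyLoss would apply softmax twice and distort the loss.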

Training the Model

We will train the model with a training loop that runs forward and backward passes to update the weights. To validate the model, we hold out 30% of the data as a test set and evaluate on it after every epoch.

import numpy as np
import tqdm
from sklearn.model_selection import train_test_split

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True)

# Training parameters
n_epochs = 200
batch_size = 5
batches_per_epoch = len(X_train) // batch_size

for epoch in range(n_epochs):
    model.train()  # Set the model to training mode
    epoch_loss = []
    with tqdm.trange(batches_per_epoch, unit="batch", mininterval=0) as bar:
        bar.set_description(f"Epoch {epoch}")
        for i in bar:
            start = i * batch_size
            X_batch = X_train[start:start + batch_size]
            y_batch = y_train[start:start + batch_size]

            # Forward pass
            y_pred = model(X_batch)
            loss = loss_fn(y_pred, y_batch.squeeze().long())

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            epoch_loss.append(loss.item())
            bar.set_postfix(loss=np.mean(epoch_loss))

    # Evaluation
    model.eval()
    with torch.no_grad():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test.squeeze().long())
        accuracy = (torch.argmax(test_pred, 1) == y_test.squeeze().long()).float().mean()
        print(f"Epoch {epoch}: Test Loss = {test_loss:.4f}, Accuracy = {accuracy:.4f}")
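
A single train/test split gives an accuracy estimate that depends on which rows land in the test set. K-fold cross-validation is one way to reduce that dependence. The following is a minimal sketch under several assumptions: it uses synthetic data (X_cv, y_cv) in place of Iris, full-batch training for brevity, a hypothetical helper train_and_score, and scikit-learn's StratifiedKFold to keep the class balance in each fold.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import StratifiedKFold

torch.manual_seed(0)

# Synthetic stand-in data: 3 classes, 4 features, 30 samples per class
y_cv = torch.arange(3).repeat_interleave(30)         # 0...0, 1...1, 2...2
X_cv = torch.randn(90, 4) + y_cv.unsqueeze(1) * 3.0  # shift each class apart

def train_and_score(X_tr, y_tr, X_te, y_te, n_epochs=50):
    """Train a fresh model on one fold and return its test accuracy."""
    model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.01)
    model.train()
    for _ in range(n_epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X_tr), y_tr)  # full-batch update for brevity
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        correct = model(X_te).argmax(dim=1) == y_te
        return correct.float().mean().item()

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, test_idx in kfold.split(X_cv.numpy(), y_cv.numpy()):
    # Convert the NumPy index arrays to tensors before indexing
    tr, te = torch.as_tensor(train_idx), torch.as_tensor(test_idx)
    scores.append(train_and_score(X_cv[tr], y_cv[tr], X_cv[te], y_cv[te]))

print(f"Mean accuracy over {len(scores)} folds: {sum(scores) / len(scores):.3f}")
```

Each fold trains a new model from scratch, so no fold's test data influences the weights that are scored on it.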

Complete Code

Here’s the consolidated code for building a multiclass classification model in PyTorch:

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import tqdm
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Load the dataset
data = pd.read_csv("iris.csv", header=None)
X = data.iloc[:, 0:4]  # Features
y = data.iloc[:, 4]     # Labels

# Convert labels to integers
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)

# Convert to PyTorch tensors
X = torch.tensor(X.values, dtype=torch.float32)
y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)

# Define the model
class Multiclass(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 8)
        self.relu = nn.ReLU()
        self.output = nn.Linear(8, 3)  # 3 classes for output

    def forward(self, x):
        x = self.relu(self.hidden(x))
        x = self.output(x)
        return x

# Initialize model, loss function, and optimizer
model = Multiclass()
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, shuffle=True)

# Training loop
n_epochs = 200
batch_size = 5
batches_per_epoch = len(X_train) // batch_size

for epoch in range(n_epochs):
    model.train()
    epoch_loss = []
    with tqdm.trange(batches_per_epoch, unit="batch", mininterval=0) as bar:
        bar.set_description(f"Epoch {epoch}")
        for i in bar:
            start = i * batch_size
            X_batch = X_train[start:start + batch_size]
            y_batch = y_train[start:start + batch_size]

            # Forward pass
            y_pred = model(X_batch)
            loss = loss_fn(y_pred, y_batch.squeeze().long())

            # Backward pass
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            epoch_loss.append(loss.item())
            bar.set_postfix(loss=np.mean(epoch_loss))

    # Evaluation
    model.eval()
    with torch.no_grad():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test.squeeze().long())
        accuracy = (torch.argmax(test_pred, 1) == y_test.squeeze().long()).float().mean()
        print(f"Epoch {epoch}: Test Loss = {test_loss:.4f}, Accuracy = {accuracy:.4f}")

# Optionally, implement additional evaluation methods or visualizations.
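
One such additional evaluation is a confusion matrix, which shows which species get mistaken for which. The sketch below uses a hypothetical helper, confusion_matrix, and made-up labels and predictions. With the tutorial's variables you would pass y_test.squeeze().long() and torch.argmax(test_pred, 1) instead.

```python
import torch

def confusion_matrix(y_true, y_pred, n_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    matrix = torch.zeros(n_classes, n_classes, dtype=torch.long)
    for t, p in zip(y_true.tolist(), y_pred.tolist()):
        matrix[t, p] += 1
    return matrix

# Made-up example: one class-1 sample is misclassified as class 2
y_true = torch.tensor([0, 0, 1, 1, 2, 2])
y_pred = torch.tensor([0, 0, 1, 2, 2, 2])

cm = confusion_matrix(y_true, y_pred, n_classes=3)

# Per-class accuracy: correct predictions (diagonal) over samples per true class
per_class_acc = cm.diag().float() / cm.sum(dim=1).float()

print(cm)
print(per_class_acc)
```

Overall accuracy can hide a class that the model handles poorly; the per-class figures make that visible.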

Summary

In this tutorial, you learned how to build a neural network model for multiclass classification using PyTorch. You covered data loading and preprocessing, label encoding, model design, training, and evaluation. These steps carry over directly to other multiclass classification tasks.
