In our previous discussion, we explored the k-means clustering algorithm as an unsupervised machine learning technique designed to group similar data points into distinct clusters, thereby revealing patterns within the data.
Thus far, we’ve applied the k-means algorithm both to a simple two-dimensional dataset containing clear clusters and to the task of image color quantization.
In this tutorial, you will learn how to use OpenCV’s k-means clustering algorithm for image classification.
After completing this tutorial, you will understand:
- Why k-means clustering is effective for image classification.
- How to apply the k-means clustering algorithm to the digit dataset in OpenCV for classification.
- Methods to reduce digit variations due to skew, enhancing the accuracy of the k-means clustering algorithm.
Let’s dive in!
Tutorial Overview
This tutorial consists of two main parts:
- Understanding k-Means Clustering as an Unsupervised Learning Technique
- Applying k-Means Clustering to Image Classification
Understanding k-Means Clustering as an Unsupervised Learning Technique
The k-means clustering algorithm allows us to automatically group data into distinct categories (or clusters), where data within each cluster exhibit similarity while differing from other clusters. This process aims to uncover hidden patterns that might not be observable prior to clustering.
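To make the mechanics concrete, here is a toy, NumPy-only sketch of the two steps k-means alternates between: assign each point to its nearest center, then move each center to the mean of its assigned points. The data, blob positions, and helper name here are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic 2D blobs: one around (0, 0), one around (5, 5)
pts = np.vstack([rng.normal(0, 0.5, (50, 2)),
                 rng.normal(5, 0.5, (50, 2))])

def kmeans_toy(data, k, iters=10):
    # Toy seeding (valid for k=2 here): one center from each blob,
    # so this illustrative run is deterministic
    centers = data[[0, -1]]
    for _ in range(iters):
        # Assignment step: nearest center for every point
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        centers = np.array([data[assign == j].mean(axis=0) for j in range(k)])
    return assign, centers

assign, centers = kmeans_toy(pts, 2)
```

OpenCV’s `kmeans` function performs these same two steps internally, with configurable initialization (e.g., `KMEANS_RANDOM_CENTERS`) and termination criteria.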
Previously, we applied the k-means algorithm to a two-dimensional dataset comprising five clusters to categorize the data points accordingly. We also utilized this algorithm for color quantization, reducing the number of distinct colors in an image.
In this tutorial, we will leverage the power of k-means clustering once again, this time to group similar handwritten digit images from the OpenCV digits dataset without relying on ground-truth labels.
Applying k-Means Clustering to Image Classification
We will start by loading the OpenCV digits image, segmenting it into individual sub-images that display handwritten digits from 0 to 9, and creating the corresponding ground truth labels. This will allow us to evaluate the performance of the k-means clustering algorithm later on:
```python
# Load the digits image and divide it into sub-images
img, sub_imgs = split_images('Images/digits.png', 20)

# Generate the ground truth labels
imgs, labels_true, _, _ = split_data(20, sub_imgs, 1.0)
```
The resultant `imgs` array contains 5,000 sub-images arranged row-wise, each represented as a flattened vector of 400 pixels:
```python
# Display the shape of the 'imgs' array
print(imgs.shape)  # Output: (5000, 400)
```
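The `split_images` and `split_data` helpers come from the earlier tutorials in this series. As a rough, hypothetical sketch of what the splitting step does (assuming `digits.png` is a grid of 20×20-pixel cells read as a single grayscale array), it amounts to cutting the grid row by row and flattening each cell into a feature vector:

```python
import numpy as np

def split_into_cells(img, cell_size):
    # Hypothetical re-implementation: cut a grid image into flattened
    # cells, row by row, so each cell becomes one feature vector
    rows, cols = img.shape
    cells = [img[r:r + cell_size, c:c + cell_size].ravel()
             for r in range(0, rows, cell_size)
             for c in range(0, cols, cell_size)]
    return np.array(cells)

# Toy stand-in for digits.png: a 40x60 image yields six 20x20 cells
toy = np.arange(40 * 60).reshape(40, 60)
cells = split_into_cells(toy, 20)
print(cells.shape)  # (6, 400)
```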
Next, we will set up the k-means algorithm with parameters similar to those we used for color quantization, but with the `imgs` array as input data and the number of clusters, K, set to 10 (one for each digit):
```python
# Import the required names (assuming the from-style imports used in this series)
from cv2 import kmeans, TERM_CRITERIA_MAX_ITER, TERM_CRITERIA_EPS, KMEANS_RANDOM_CENTERS
from numpy import float32

# Define the algorithm’s termination criteria
criteria = (TERM_CRITERIA_MAX_ITER + TERM_CRITERIA_EPS, 10, 1.0)

# Execute the k-means clustering algorithm on the image data
compactness, clusters, centers = kmeans(data=imgs.astype(float32), K=10,
                                        bestLabels=None, criteria=criteria,
                                        attempts=10, flags=KMEANS_RANDOM_CENTERS)
```
The `kmeans` function returns a `centers` array containing a representative image for each cluster. The `centers` array has shape 10 × 400, so we’ll need to reshape it back into 20×20 pixel images for visualization:
```python
from matplotlib.pyplot import subplots, show

# Reshape the array into 20x20 images
imgs_centers = centers.reshape(-1, 20, 20)

# Visualize the cluster centers
fig, ax = subplots(2, 5)
for axis, center in zip(ax.flat, imgs_centers):
    axis.imshow(center)
show()
```
The representative images generated by the k-means algorithm should closely resemble the handwritten digits from the OpenCV digits dataset.
You may notice that the order of cluster centers does not necessarily align with the digits from 0 to 9, as the k-means algorithm groups similar data together without acknowledging their numerical order. This discrepancy can complicate the comparison of predicted labels with the ground truth ones. To address this issue, we need to reorder the cluster labels appropriately:
```python
from numpy import array, zeros

# Define the cluster labels found
labels = array([2, 0, 7, 5, 1, 4, 6, 9, 3, 8])
labels_pred = zeros(labels_true.shape, dtype='int')

# Reorder the cluster labels
for i in range(10):
    mask = clusters.ravel() == i
    labels_pred[mask] = labels[i]
```
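The permutation above is read off the visualized centers by hand. When ground-truth labels are available (as they are here for evaluation), the mapping can also be derived automatically with a per-cluster majority vote. Here is a NumPy-only sketch; the helper name is ours, and note that two clusters can end up mapped to the same digit if the clustering is poor:

```python
import numpy as np

def majority_label_map(cluster_ids, labels_true, k=10):
    # For each cluster id, pick the most common ground-truth label
    # among the samples assigned to it
    mapping = np.zeros(k, dtype=int)
    for i in range(k):
        members = labels_true[cluster_ids.ravel() == i]
        mapping[i] = np.bincount(members).argmax()
    return mapping

# Toy check: cluster 0 holds mostly digit 7, cluster 1 mostly digit 3
toy_clusters = np.array([0, 0, 0, 1, 1, 1])
toy_truth = np.array([7, 7, 1, 3, 3, 3])
print(majority_label_map(toy_clusters, toy_truth, k=2))  # [7 3]
```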
Now we’re prepared to calculate the accuracy of the algorithm by determining the percentage of predicted labels that match the ground truth:
```python
# Calculate the accuracy of the algorithm
accuracy = (sum(labels_true == labels_pred) / labels_true.size) * 100

# Print the accuracy
print("Accuracy: {0:.2f}%".format(accuracy))
```
Improving Accuracy with Deskewing
Initially, we can expect an accuracy of around 54.80%. To improve on this, we can correct for skew in the digit images by applying an affine transformation whose shear is estimated from the image moments.
Here’s a basic outline of the deskewing algorithm:
```python
from cv2 import moments, warpAffine, INTER_CUBIC
from numpy import float32

def deskew_image(img):
    # Calculate the moments of the image
    img_moments = moments(img)

    if abs(img_moments['mu02']) > 1e-2:
        # Estimate the skew from the second-order central moments
        img_skew = img_moments['mu11'] / img_moments['mu02']

        # Shear the image to compensate, re-centering by half the image height
        m = float32([[1, img_skew, -0.5 * img.shape[0] * img_skew], [0, 1, 0]])
        img_deskew = warpAffine(src=img, M=m, dsize=img.shape, flags=INTER_CUBIC)
    else:
        img_deskew = img.copy()

    return img_deskew
```
Applying this function to each image in the dataset before clustering reduces within-digit variation due to skew and can noticeably improve the accuracy of the k-means algorithm.
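Concretely, plugging the correction into the pipeline is a per-row reshape/transform/flatten pass over the `imgs` array before re-running `kmeans`. Here is a small sketch with a generic transform argument (`apply_per_image` is our name; pass `deskew_image` for the real run):

```python
import numpy as np

def apply_per_image(imgs_flat, transform, side=20):
    # Reshape each flattened row to side x side, transform it,
    # and flatten it back, preserving the (n_samples, side*side) layout
    return np.array([transform(row.reshape(side, side)).ravel()
                     for row in imgs_flat], dtype=np.float32)

# Toy check with an identity transform: shape and values are preserved
toy = np.arange(2 * 400, dtype=np.float32).reshape(2, 400)
out = apply_per_image(toy, lambda img: img)
print(out.shape)  # (2, 400)
```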
Conclusion
In this tutorial, you explored how to apply OpenCV’s k-means clustering algorithm to classify handwritten digits. You learned about:
- The applicability of k-means clustering to image classification.
- How to implement k-means clustering with the OpenCV digits dataset for image classification.
- Techniques to reduce digit variations due to skew, thereby improving classification accuracy.
Further Reading
For additional resources, consider the following:
Books:
- Machine Learning for OpenCV, 2017
- Mastering OpenCV 4 with Python, 2019