In our previous discussion, we explored the k-means clustering algorithm as an unsupervised machine learning technique designed to group similar data points into distinct clusters, thereby revealing patterns within the data.
Thus far, we’ve applied the k-means algorithm both to a simple two-dimensional dataset containing clear clusters and to the task of image color quantization.
In this tutorial, you will learn how to use OpenCV’s k-means clustering algorithm for image classification.
After completing this tutorial, you will understand:
- Why k-means clustering is effective for image classification.
- How to apply the k-means clustering algorithm to the digit dataset in OpenCV for classification.
- Methods to reduce digit variations due to skew, enhancing the accuracy of the k-means clustering algorithm.
Let’s dive in!
Tutorial Overview
This tutorial consists of two main parts:
- Understanding k-Means Clustering as an Unsupervised Learning Technique
- Applying k-Means Clustering to Image Classification
Understanding k-Means Clustering as an Unsupervised Learning Technique
The k-means clustering algorithm allows us to automatically group data into distinct categories (or clusters), where data within each cluster exhibit similarity while differing from other clusters. This process aims to uncover hidden patterns that might not be observable prior to clustering.
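To make the mechanics concrete, here is a toy, NumPy-only sketch of the two steps k-means alternates between: assign each point to its nearest center, then move each center to the mean of its assigned points. The data, blob positions, and helper name here are made up purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic 2D blobs: one around (0, 0), one around (5, 5)
pts = np.vstack([rng.normal(0, 0.5, (50, 2)),
                 rng.normal(5, 0.5, (50, 2))])

def kmeans_toy(data, k, iters=10):
    # Toy seeding (valid for k=2 here): one center from each blob,
    # so this illustrative run is deterministic
    centers = data[[0, -1]]
    for _ in range(iters):
        # Assignment step: nearest center for every point
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        centers = np.array([data[assign == j].mean(axis=0) for j in range(k)])
    return assign, centers

assign, centers = kmeans_toy(pts, 2)
```

OpenCV’s `kmeans` function performs these same two steps internally, with configurable initialization (e.g., `KMEANS_RANDOM_CENTERS`) and termination criteria.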
Previously, we applied the k-means algorithm to a two-dimensional dataset comprising five clusters to categorize the data points accordingly. We also utilized this algorithm for color quantization, reducing the number of distinct colors in an image.
In this tutorial, we will leverage the power of k-means clustering once again, this time to group similar handwritten digit images from the OpenCV digits dataset without relying on ground-truth labels.
Applying k-Means Clustering to Image Classification
We will start by loading the OpenCV digits image, segmenting it into individual sub-images that display handwritten digits from 0 to 9, and creating the corresponding ground truth labels. This will allow us to evaluate the performance of the k-means clustering algorithm later on:
```python
# Load the digits image and divide it into sub-images
img, sub_imgs = split_images('Images/digits.png', 20)

# Generate the ground truth labels
imgs, labels_true, _, _ = split_data(20, sub_imgs, 1.0)
```
The resultant `imgs` array contains 5,000 sub-images arranged row-wise, each represented as a flattened vector of 400 pixels:
```python
# Display the shape of the 'imgs' array
print(imgs.shape)  # Output: (5000, 400)
```
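The `split_images` and `split_data` helpers come from the earlier tutorials in this series. As a rough, hypothetical sketch of what the splitting step does (assuming `digits.png` is a grid of 20×20-pixel cells read as a single grayscale array), it amounts to cutting the grid row by row and flattening each cell into a feature vector:

```python
import numpy as np

def split_into_cells(img, cell_size):
    # Hypothetical re-implementation: cut a grid image into flattened
    # cells, row by row, so each cell becomes one feature vector
    rows, cols = img.shape
    cells = [img[r:r + cell_size, c:c + cell_size].ravel()
             for r in range(0, rows, cell_size)
             for c in range(0, cols, cell_size)]
    return np.array(cells)

# Toy stand-in for digits.png: a 40x60 image yields six 20x20 cells
toy = np.arange(40 * 60).reshape(40, 60)
cells = split_into_cells(toy, 20)
print(cells.shape)  # (6, 400)
```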
Next, we will set up the k-means algorithm with parameters similar to those we used for color quantization, but with the `imgs` array as input data and the number of clusters, K, set to 10 (one for each digit):
```python
# Import the required names (assuming the from-style imports used in this series)
from cv2 import kmeans, TERM_CRITERIA_MAX_ITER, TERM_CRITERIA_EPS, KMEANS_RANDOM_CENTERS
from numpy import float32

# Define the algorithm’s termination criteria
criteria = (TERM_CRITERIA_MAX_ITER + TERM_CRITERIA_EPS, 10, 1.0)

# Execute the k-means clustering algorithm on the image data
compactness, clusters, centers = kmeans(data=imgs.astype(float32), K=10,
                                        bestLabels=None, criteria=criteria,
                                        attempts=10, flags=KMEANS_RANDOM_CENTERS)
```
The `kmeans` function returns a `centers` array containing a representative image for each cluster. The `centers` array has shape 10 × 400, so we’ll need to reshape it back into 20×20 pixel images for visualization:
```python
from matplotlib.pyplot import subplots, show

# Reshape the array into 20x20 images
imgs_centers = centers.reshape(-1, 20, 20)

# Visualize the cluster centers
fig, ax = subplots(2, 5)
for axis, center in zip(ax.flat, imgs_centers):
    axis.imshow(center)
show()
```
The representative images generated by the k-means algorithm should closely resemble the handwritten digits from the OpenCV digits dataset.
You may notice that the order of cluster centers does not necessarily align with the digits from 0 to 9, as the k-means algorithm groups similar data together without acknowledging their numerical order. This discrepancy can complicate the comparison of predicted labels with the ground truth ones. To address this issue, we need to reorder the cluster labels appropriately:
```python
from numpy import array, zeros

# Define the cluster labels found
labels = array([2, 0, 7, 5, 1, 4, 6, 9, 3, 8])
labels_pred = zeros(labels_true.shape, dtype='int')

# Reorder the cluster labels
for i in range(10):
    mask = clusters.ravel() == i
    labels_pred[mask] = labels[i]
```
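The permutation above is read off the visualized centers by hand. When ground-truth labels are available (as they are here for evaluation), the mapping can also be derived automatically with a per-cluster majority vote. Here is a NumPy-only sketch; the helper name is ours, and note that two clusters can end up mapped to the same digit if the clustering is poor:

```python
import numpy as np

def majority_label_map(cluster_ids, labels_true, k=10):
    # For each cluster id, pick the most common ground-truth label
    # among the samples assigned to it
    mapping = np.zeros(k, dtype=int)
    for i in range(k):
        members = labels_true[cluster_ids.ravel() == i]
        mapping[i] = np.bincount(members).argmax()
    return mapping

# Toy check: cluster 0 holds mostly digit 7, cluster 1 mostly digit 3
toy_clusters = np.array([0, 0, 0, 1, 1, 1])
toy_truth = np.array([7, 7, 1, 3, 3, 3])
print(majority_label_map(toy_clusters, toy_truth, k=2))  # [7 3]
```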
Now we’re prepared to calculate the accuracy of the algorithm by determining the percentage of predicted labels that match the ground truth:
```python
# Calculate the accuracy of the algorithm
accuracy = (sum(labels_true == labels_pred) / labels_true.size) * 100

# Print the accuracy
print("Accuracy: {0:.2f}%".format(accuracy))
```
Improving Accuracy with Deskewing
Initially, we can expect an accuracy of around 54.80%. To improve on this, we can correct for skew in the digit images by applying an affine transformation whose shear is estimated from the image moments.
Here’s a basic outline of the deskewing algorithm:
```python
from cv2 import moments, warpAffine, INTER_CUBIC
from numpy import float32

def deskew_image(img):
    # Calculate the moments of the image
    img_moments = moments(img)

    if abs(img_moments['mu02']) > 1e-2:
        # Estimate the skew from the second-order central moments
        img_skew = img_moments['mu11'] / img_moments['mu02']

        # Shear the image to compensate, re-centering by half the image height
        m = float32([[1, img_skew, -0.5 * img.shape[0] * img_skew], [0, 1, 0]])
        img_deskew = warpAffine(src=img, M=m, dsize=img.shape, flags=INTER_CUBIC)
    else:
        img_deskew = img.copy()

    return img_deskew
```
Applying this function to each image in the dataset before clustering reduces within-digit variation due to skew and can noticeably improve the accuracy of the k-means algorithm.
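Concretely, plugging the correction into the pipeline is a per-row reshape/transform/flatten pass over the `imgs` array before re-running `kmeans`. Here is a small sketch with a generic transform argument (`apply_per_image` is our name; pass `deskew_image` for the real run):

```python
import numpy as np

def apply_per_image(imgs_flat, transform, side=20):
    # Reshape each flattened row to side x side, transform it,
    # and flatten it back, preserving the (n_samples, side*side) layout
    return np.array([transform(row.reshape(side, side)).ravel()
                     for row in imgs_flat], dtype=np.float32)

# Toy check with an identity transform: shape and values are preserved
toy = np.arange(2 * 400, dtype=np.float32).reshape(2, 400)
out = apply_per_image(toy, lambda img: img)
print(out.shape)  # (2, 400)
```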
Conclusion
In this tutorial, you explored how to apply OpenCV’s k-means clustering algorithm to classify handwritten digits. You learned about:
- The applicability of k-means clustering to image classification.
- How to implement k-means clustering with the OpenCV digits dataset for image classification.
- Techniques to reduce digit variations due to skew, thereby improving classification accuracy.
Further Reading
For additional resources, consider the following:
Books:
- Machine Learning for OpenCV, 2017
- Mastering OpenCV 4 with Python, 2019