Image Compression Using Principal Component Analysis (PCA)

Machine Learning and Deep Learning help technological industries solve some of their most challenging tasks. Every organization is collecting data, and the amount is increasing daily. Image data in particular takes a lot of bandwidth to store and transmit because of its bulky nature. In such situations, we need a technique that can keep nearly the same information while using less memory.

In this blog, we will build our image data compressor using an unsupervised learning technique, PCA. We will be discussing these topics:

  • Image Types and Quantization
  • PCA Overview
  • Step-wise implementation of PCA for image compression: an ~82% reduction in size while compromising only ~2% of the information.

Let's begin with the basics of image data first.

Image Types

Most of us have used WhatsApp and know that the images we send over this chat platform are automatically reduced in size. This is done so that more information can be stored within the same amount of storage. Many websites also take a bulky image as input and claim to produce the same image at a reduced size.

We represent an image via a matrix with pixel values as the entries that represent the color. Based on the number of channels present, we can classify images into two types:

  1. Grayscale image and
  2.  RGB image. 

A grayscale image can be depicted in a single matrix with pixel values of just one channel, while a colored image is usually a matrix of three stacked RGB channels (Red, Green, and Blue).

Quantization of Images

13 Color shades of a gray image

The above figure shows 13 clearly distinguishable shades of gray. While our eye can easily distinguish between thousands of colors, it can reliably tell apart only about a dozen different shades of gray. Each of these 13 shades can be subdivided further into many more shades, but as more shades are added, the difference between two neighboring shades becomes harder to see. This is the fundamental property exploited while performing image compression.

In general, 8 bits represent the shades of a pixel (0 to 255, i.e., 0 to 2⁸ - 1). Mapping nearby pixel values to a common value makes no visible difference in quality. To understand it better, suppose we replace every odd pixel value (1, 3, 249, etc.) with the next even value (2, 4, 250, etc.). The change in the image will not be visible or distinguishable. This is called quantization and is one of the simplest ways of performing image compression.
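To see quantization in action, here is a minimal NumPy sketch (a toy example, not the compression method we will build later) that replaces odd pixel values with the next even value and, more aggressively, snaps every value to the nearest multiple of 10:

import numpy as np

# Toy 3x3 grayscale patch with 8-bit pixel values
pixels = np.array([[1, 3, 249],
                   [10, 57, 128],
                   [254, 4, 33]], dtype=np.uint8)

# Replace every odd value with the next even value (1 -> 2, 3 -> 4, 249 -> 250, ...)
even_quantized = np.where(pixels % 2 == 1, pixels + 1, pixels)

# A coarser quantization: snap every value to the nearest multiple of 10
coarse_quantized = (np.round(pixels / 10.0) * 10).astype(np.uint8)

print(even_quantized)    # [[  2   4 250] [ 10  58 128] [254   4  34]]
print(coarse_quantized)  # [[  0   0 250] [ 10  60 130] [250   0  30]]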

How quantization of image pixels works

In this blog, however, we will use Principal Component Analysis (PCA) to compress the image. PCA-based compression works on more or less the same principle as quantization, but instead of projecting the pixel values onto a fixed set of thresholds, we project them along the principal components. One example of threshold projection is replacing each pixel value with the nearest multiple of 10. This project demands a basic understanding of PCA, so we recommend reading our PCA blog first.

PCA Overview

PCA is an unsupervised dimensionality reduction technique where we try to transform the existing features into new features such that 

  1. Every new feature is orthogonal to the others. 
  2. The features are arranged in decreasing order of the information (variance) they carry.  

For example, suppose we have ten features in a dataset and use PCA. We can then create (say) a five-feature dataset that retains most of the information from the original ten features. To understand the working of PCA in greater detail, please read this blog.
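As a quick, self-contained illustration of this idea (on a random toy dataset, not the image data used in this project):

import numpy as np
from sklearn.decomposition import PCA

# Toy dataset: 100 samples with 10 features
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 10))

# Keep only the top 5 principal components
pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X)      # shape (100, 5)

# Fraction of the original variance retained by the 5 new features
print(pca.explained_variance_ratio_.sum())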

What is the purpose of this project?

The purpose of image compression is to reduce the size of the image so that its transmission consumes less bandwidth. An image is also a signal transmitted from one place to another. As bandwidth is limited, we don't want to send the entire image with 100% of the information; instead, we send a compressed version that retains as much information as possible. The receiver then reconstructs the same image (say, 512x512) from the transmitted data.

Theoretical Steps of PCA for RGB Image Compression

These are the steps involved in PCA-based image compression:

  1. The computer reads our image as a matrix of shape height*width*channel. For grayscale images, the number of channels is 1. If the image is in RGB format with a resolution of 1024*1024, the matrix shape will be 1024*1024*3. To perform PCA on an RGB image, we have to consider each channel separately. 
  2. For every channel, the shape of the matrix will be 1024*1024. We compute the mean and covariance of this matrix and, from the covariance, the eigenvalues and eigenvectors. These are then arranged in decreasing order of eigenvalue because the eigenvector corresponding to the highest eigenvalue carries the most information.
  3. Select the first few components for the compression and discard the rest, as they do not carry much information. 
  4. Project the data along the selected components and add back the mean of the image to obtain the final compressed version of the original image.

PCA steps for Image compression in sequential manner

In the above pipeline, the image is reconstructed from the dominant eigenvectors obtained through Principal Component Analysis.
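To make steps 2-4 concrete, here is a minimal NumPy sketch of the eigen-decomposition idea for a single small channel. This is only a toy illustration; the actual implementation below relies on scikit-learn's PCA:

import numpy as np

# Toy single-channel "image" (8x8) just to illustrate steps 2-4
channel = np.random.randint(0, 256, size=(8, 8)).astype(float)

# Step 2: mean-center the data and compute the covariance matrix
mean = channel.mean(axis=0)
centered = channel - mean
cov = np.cov(centered, rowvar=False)

# Eigen-decomposition; eigh returns eigenvalues in ascending order
eig_vals, eig_vecs = np.linalg.eigh(cov)
order = np.argsort(eig_vals)[::-1]          # sort in decreasing order of eigenvalue
eig_vecs = eig_vecs[:, order]

# Step 3: keep only the top-k components
k = 3
top_vecs = eig_vecs[:, :k]

# Step 4: project onto the kept components, then reconstruct and add back the mean
scores = centered @ top_vecs                # shape (8, k)
reconstructed = scores @ top_vecs.T + mean  # shape (8, 8)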

Now it's time to build the image compression project end-to-end.

Color Image Compression Using PCA In Python

Step 1: Import the necessary libraries in Python

Here, we will import four basic libraries required for this application:

  • PIL (Python Imaging Library): It enables reading and manipulating image files. 
  • NumPy: It enables mathematical computations and manipulations of images (arrays/matrices).
  • PCA: We will use PCA from the scikit-learn framework to make our process faster. It is available under sklearn.decomposition.
  • Matplotlib: It is used for the visualization of various plots.
from PIL import Image, ImageOps               # Image manipulation
from sklearn.decomposition import PCA         # PCA implementation
import numpy as np                            # Numerical computation
import os                                     # File-size information
import matplotlib.pyplot as plt               # Visualization

Step 2: Reading the original image

Let's take a sample image of a random person who does not exist. Yes! There is an AI-based website, thispersondoesnotexist.com, that generates a random face of a person who does not exist in the real world. One can choose any other image from the mentioned website.

Original image with its properties

We need to read our image and extract some critical properties from it. For that, we can use the Image module from the PIL library.

def img_data(imgPath, disp=True):
    
    # Read the image from disk
    orig_img = Image.open(imgPath)
    
    # File size in kilobytes
    img_size_kb = os.stat(imgPath).st_size/1024
    
    # Pixel values as a 3-D numpy array (rows, columns, channels)
    data = orig_img.getdata()    
    ori_pixels = np.array(data).reshape(*orig_img.size, -1)
    img_dim = ori_pixels.shape
    
    # Optionally display the image
    if disp:
        plt.imshow(orig_img)
        plt.show()
    
    # Collect the properties in a dictionary
    data_dict = {}
    data_dict['img_size_kb'] = img_size_kb
    data_dict['img_dim'] = img_dim
    
    return data_dict

The above function takes the image file's location as input and returns a Python dictionary containing image properties, such as the size and dimensions of the image. In our case, the image is an RGB image with dimensions 1024*1024*3, and the size is around 466 KB.
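A quick sanity check of this function could look like the following (girl_1024.jpg is a hypothetical local filename for the downloaded image):

props = img_data("girl_1024.jpg")   # hypothetical local path of the downloaded image
print(props)                        # e.g. {'img_size_kb': 466.1, 'img_dim': (1024, 1024, 3)}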

Our image compression project aims to reduce the file size (in KB) while keeping the overall dimensions the same (1024*1024). Let's compute the components.

Step 3: Computing the principal components for the input image

The input image has three channels, i.e., Red, Green, and Blue. In this step, we will have to decompose the image into separate channels and then fit the PCA algorithm on each channel.

def pca_compose(imgPath):
    
    # 1. Read the image
    orig_img = Image.open(imgPath)
    
    # 2. Convert the reading into a 2D numpy array
    img = np.array(orig_img.getdata())
    
    # 3. Reshape the 2D array into a 3D array 
    # The asterisk (*) operator unpacks a sequence/collection as positional arguments. 
    # So, instead of passing the elements by index separately, we can use * and act on them at once.
    # print(orig_img.size) = (1024, 1024) --> print(*orig_img.size) = 1024 1024
    img = img.reshape(*orig_img.size, -1)
    
    # Separate the channels from the image and use PCA on each channel
    pca_channel = {}
    img_t = np.transpose(img)    # shape (channel, 1024, 1024)
    
    for i in range(img.shape[-1]):    # For each RGB channel, compute the PCA
        
        channel = img_t[i].reshape(*img.shape[:-1])  # obtain the (1024, 1024) channel matrix
        
        pca = PCA(random_state = 42)                 # initialize PCA
        
        fit_pca = pca.fit_transform(channel)         # fit PCA and transform the channel
        
        pca_channel[i] = (pca, fit_pca)              # save the PCA model for each channel
        
    return pca_channel

In the above function, we take the image location as input and convert the image data into a numpy array. Each channel in that array (R, G, and B) is fitted individually with a separate instance of PCA. The models and the transformed data are saved and returned in a dictionary. Please note that for an image of size 1024*1024, we will have 1024 components, as PCA transforms the existing features into the same number of new features arranged in descending order of importance.
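Calling this function is straightforward (again using the hypothetical girl_1024.jpg path):

pca_channel = pca_compose("girl_1024.jpg")   # hypothetical local path
print(len(pca_channel))                      # 3 -> one (pca, fit_pca) pair per RGB channel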

Now, we have fit the PCA model corresponding to each channel. Let's understand how we will reconstruct the final compressed image.

Step 4: Reconstructing the image using a reduced number of components

In this stage, the saved PCA models for each channel obtained in the previous step are used for reconstruction. The variable n_components decides how many of the top 1024 principal components will be retained for the reconstruction. Note that if we retain all 1024 components, there won't be any size reduction. Our objective is to reduce the size of the image at the cost of a very slight information loss. In the case of an image, we can sense that information from the clarity of the output image.

# Function to select the desired number of components
def pca_transform(pca_channel, n_components):
    
    temp_res = []
    
    # Looping over all the channels we created from pca_compose function
    
    for channel in range(len(pca_channel)):
        
        pca, fit_pca = pca_channel[channel]
        
        # Transformed data (scores) corresponding to the first n components
        pca_pixel = fit_pca[:, :n_components]
        
        # First n principal components (eigenvectors)
        pca_comp = pca.components_[:n_components, :]
        
        # Reconstruct the channel from the selected components and add back the mean (un-centering)
        compressed_pixel = np.dot(pca_pixel, pca_comp) + pca.mean_
        
        # Stacking channels corresponding to Red Green and Blue
        temp_res.append(compressed_pixel)
            
    # transforming (channel, width, height) to (height, width, channel)
    compressed_image = np.transpose(temp_res)
    
    # Forming the compressed image
    compressed_image = np.array(compressed_image,dtype=np.uint8)
    
    return compressed_image

The transformed data is projected onto the chosen eigenvectors, and the mean is added to obtain the final compressed image. We can choose to retain around 5% of 1024 (~50) components for the original image and project the data onto them.
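Putting the two functions together, a rough end-to-end sketch (with the same hypothetical file names) looks like this:

pca_channel = pca_compose("girl_1024.jpg")                 # hypothetical input path
compressed = pca_transform(pca_channel, n_components=50)   # keep ~5% of the 1024 components

# Save the reconstructed image and check its size on disk
Image.fromarray(compressed).save("girl_1024_compressed.jpg")
print(os.stat("girl_1024_compressed.jpg").st_size / 1024, "KB")

plt.imshow(compressed)
plt.show()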

Comparison of output image with input image

When we use just 5% of the principal components, we obtain a reduction of ~380 KB (81.34%) while retaining 97.778% of the variance/information of the original image. But one interesting question to ask is,

How do we decide the value of n_components?

Let's understand how we can find the number of components to retain so that we don't lose much information in the final image. For that, we first need to know what percentage of the information each component carries. We already know that PCA arranges the components in decreasing order of importance, so once we know the exact percentage contained by the individual components, we can easily decide how many to retain to keep most of the information.

Variance Explained vs. Principal Components

The function below tells us how much information is retained when we select n components. This n_components is an input parameter of the function, and we can vary it to observe the changes.

# Function to tell the percentage of explained variance retained by the first n components
def explained_var_n(pca_channel, n_components):
    
    var_exp_channel = []; var_exp = 0
    
    for channel in pca_channel:
        
        pca, _ = pca_channel[channel]
        
        # Cumulative explained variance ratio for this channel
        var_exp_channel.append(np.cumsum(pca.explained_variance_ratio_))
        
        # Variance explained by the first n_components of this channel
        var_exp += var_exp_channel[channel][n_components - 1]
        
    # Average over all channels
    var_exp = var_exp / len(pca_channel)
    
    return var_exp

In the figure below, the first plot shows the percentage of information carried individually by every principal component, and the second plot shows the aggregate information held by the first n_components.

Individual components and their contribution (cumulative & individual)

One can see that the first principal component alone retains ~43% of the total variance. By the 20th principal component, an individual component carries less than 1% of the total variance. The cumulative variance retained grows with the number of principal components taken: if we keep increasing the number of components, we retain more information but end up with a poorer compression percentage.
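A plot like the one above can be produced directly from the fitted PCA models. The sketch below assumes the pca_channel dictionary returned by pca_compose and, for simplicity, uses only the first (red) channel:

pca, _ = pca_channel[0]                      # PCA fitted on the first (red) channel
var_ratio = pca.explained_variance_ratio_

plt.figure(figsize=(10, 4))

# Individual contribution of each principal component
plt.subplot(1, 2, 1)
plt.plot(var_ratio)
plt.xlabel("Principal component index")
plt.ylabel("Individual explained variance")

# Cumulative contribution of the first n components
plt.subplot(1, 2, 2)
plt.plot(np.cumsum(var_ratio))
plt.xlabel("Number of principal components")
plt.ylabel("Cumulative explained variance")

plt.tight_layout()
plt.show()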

Similarly, we can plot the percentage reduction in the size of the compressed image (KBs) vs. the number of components we will retain. 

File Size (kB) vs. Principal Components

The percentage reduction in size decreases as the number of principal components increases. If we want more reduction, we should use fewer principal components.

In our example, we can reduce the image to 50% of its original size by selecting only six principal components. We need at least 25 principal components to retain 96% of the variance of the original image. There is a tradeoff between retaining more information and achieving better compression. We should always keep an optimum number of components that balances the explained variance and image quality against the desired size.
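One simple (if brute-force) way to generate such a curve is to sweep over several values of n_components, save the reconstructed image each time, and record the resulting file size. The sketch below assumes the pca_channel dictionary from pca_compose and a hypothetical temporary file name:

component_counts = [2, 6, 12, 25, 50, 100, 200]
sizes_kb = []

for n in component_counts:
    compressed = pca_transform(pca_channel, n_components=n)
    Image.fromarray(compressed).save("tmp_compressed.jpg")        # hypothetical temporary file
    sizes_kb.append(os.stat("tmp_compressed.jpg").st_size / 1024)

plt.plot(component_counts, sizes_kb, marker="o")
plt.xlabel("Number of principal components")
plt.ylabel("Compressed file size (KB)")
plt.show()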

One can find the complete code in EnjoyAlgorithm's Machine Learning GitHub repository.

Possible Interview Questions on Image Compression Using PCA

In case readers plan to put this project on their resume, these are possible questions an interviewer can ask:

  • Question: What is the basic principle used in Image Compression?
    Answer: Quantization is the most fundamental principle used to perform image compression. 
  • Question: What is Image Quantization?
    Answer: In quantization, we map/replace image pixel values with representative threshold values.
  • Question: Explain the step-wise working of PCA.
    Answer: Please follow our PCA blog for a detailed explanation.
  • Question: Can we perform compression without losing any information?
    Answer: The short answer is no: in PCA-based compression, discarding components always loses some information, so we trade off the compressed image size against the number of components retained.
  • Question: How do we decide the number of components required to achieve the compression?
    Answer: PCA arranges the components in descending order w.r.t. the information they carry. If we progressively increase the number of components and simultaneously observe the percentage reduction in size, we can stop at an optimum value.
  • Question: Why do we add the mean of the image for reconstruction?
    Answer: PCA centers the data by subtracting the mean before computing the components. Hence, at the time of reconstruction, we bring the image pixels back into the original space by adding the mean.

Conclusion

In this complete machine learning project blog, we used an unsupervised learning approach, PCA, to reduce the size of an image. We learned how to decide the number of components to keep while retaining as much information as possible. We walked through the implementation step by step and discussed the reconstruction of the image after performing PCA. As a bonus, we also shared the complete code of this implementation. We hope you enjoyed the article.

Enjoy Learning! Enjoy Algorithms!
