Getting started with OpenCV in Python: Basics of Image Processing

Mar 5, 2023

Image processing is a technique used to enhance or manipulate digital images. It has become an important aspect of modern-day technology and is widely used in various applications such as medical imaging, remote sensing, robotics, and many others. Python has become a popular language for image processing due to its ease of use and powerful libraries such as OpenCV. In this article, we will discuss the basics of image processing in Python with OpenCV.

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is a library of programming functions primarily designed for real-time computer vision applications. It is widely recognized as one of the most comprehensive image processing libraries and offers an extensive range of functionality.

The OpenCV-Python library allows users to access OpenCV’s functionality from the Python programming language. It provides an easy-to-use and efficient interface for image processing, making it a good choice for beginners and experienced developers alike. The library handles both still images and video streams, which makes it a versatile tool for a wide range of applications.

# Import necessary libraries
import cv2                 
import numpy as np              
from dataPath import DATA_PATH  # dataPath module for data path to the images
import matplotlib.pyplot as plt         
%matplotlib inline

Image Reading and Displaying

The first step in image processing is reading an image. OpenCV can read many image formats (JPG, PNG, etc.) and can load images as grayscale, as color, or with their alpha channel. To read an image, we use the cv2.imread() function. It takes the image file path as an argument, optionally followed by a flag that selects the read mode, and returns the image data as a NumPy array, because in OpenCV-Python a digital image is represented as a NumPy array. If the file cannot be read, cv2.imread() returns None instead of raising an error.

imagePath = DATA_PATH + "/images/number_zero.jpg"

# Read image in Grayscale format
img = cv2.imread(imagePath, cv2.IMREAD_GRAYSCALE)

print("Image Dimensions = {}\n".format(img.shape))
print(img)

Python display of an image as a NumPy 2D array

We print the 2-D array to see what the image contains. We can make out that the image depicts a `0`. Whenever cv2.imread() succeeds, the variable (img) is a NumPy array that holds the intensity value of each pixel. For an 8-bit image, the pixel values range from 0 (completely black) to 255 (completely white).

We can display the image using the imshow() function. The cv2.imshow() function takes the window name and the image data as arguments.

cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

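However, I want to display the image in a Jupyter Notebook, so I will go with the matplotlib approach. For a grayscale image, passing cmap='gray' makes matplotlib show the same picture as cv2.imshow(); without it, matplotlib applies its default colormap.

plt.imshow(img, cmap='gray')  # matplotlib function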

number_zero.jpg

Besides this, it is important to look at color images, because they introduce a new dimension: channels. A color image is a 3-dimensional array, where the third dimension indexes the channels of the image. A JPEG image is a simple Red Green Blue (RGB) image with 3 channels, whereas a PNG image can have an additional “alpha” (transparency) channel on top of RGB. Note that cv2.imread() drops the alpha channel by default; pass cv2.IMREAD_UNCHANGED to keep it.

In OpenCV, the order of the R, G, and B channels is reversed, i.e., in the image matrix the Blue channel is indexed first, followed by the Green channel and finally the Red channel. Because I am using matplotlib, which expects RGB, to show the image, it is important to convert from the BGR to the RGB color space with the cv2.cvtColor() function so that the colors are displayed properly.

imagePath = DATA_PATH + "/images/musk.jpg"

# Read the image
img = cv2.imread(imagePath)
print("image Dimension ={}".format(img.shape))

# Convert BGR to RGB colorspace
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(imgRGB)       # matplotlib function

Basic Image Operations

Image operations are a fundamental aspect of image processing and computer vision, and OpenCV provides many functions for them. They can be grouped into two main categories: pixel-level operations and geometric operations. Pixel-level operations change the value of each pixel, whereas geometric operations change the position of pixels within the image.

Pixel-level operations include image arithmetic, image thresholding, and color space conversions. Image arithmetic includes operations like addition, subtraction, multiplication, and division between two images. Image thresholding is used to separate objects from the background based on their intensity levels. Color space conversions are used to convert an image from one color space to another, such as from RGB to grayscale.

Geometric operations include image rotation, scaling, and translation. Image rotation is used to rotate an image by a certain angle. Image scaling is used to increase or decrease the size of an image. Image translation is used to move an image in a certain direction.

Let’s look at an example of how to perform some of these operations in Python with OpenCV. We will start by importing the necessary libraries:

import cv2
import numpy as np

To perform pixel-level operations, we can use the built-in functions provided by OpenCV. For example, to perform image arithmetic, we can use the cv2.add() function, which adds two images or an image and a scalar value. When we add two images with cv2.add(), the corresponding pixel values are added together. However, it is important to note that the sum may exceed the maximum value that the data type of the input images can represent. In such cases, cv2.add() saturates the result, i.e., clips it to that maximum value.

img1 = cv2.imread('image1.jpg', 0)  # Load the first grayscale image
img2 = cv2.imread('image2.jpg', 0)  # Load the second grayscale image

img3 = cv2.add(img1, img2)  # Add the two images

In this case, the pixel values of img1 and img2 are added together to form the pixel values of img3. If a resulting pixel value would be greater than 255 (the maximum value of an 8-bit grayscale image), it is clipped to 255.

To perform geometric operations, we can use the built-in functions provided by OpenCV. For example, to rotate an image by a multiple of 90 degrees, we can use the cv2.rotate() function:

import cv2

# Load the image
img = cv2.imread('image.jpg')

# Rotate the image by 90 degrees clockwise
# (cv2.rotate() only supports 90, 180, and 270 degree rotations)
rotated_img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)

# Display the original and rotated images
cv2.imshow('Original Image', img)
cv2.imshow('Rotated Image', rotated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()

The cv2.rotate() function rotates an image by a multiple of 90 degrees. It does not accept an arbitrary angle.

The function takes two arguments: the input image and a rotation code, which is one of cv2.ROTATE_90_CLOCKWISE, cv2.ROTATE_180, or cv2.ROTATE_90_COUNTERCLOCKWISE.

When the image is rotated by 90 degrees in either direction, the width and height of the output are swapped; a 180-degree rotation leaves the size unchanged. No pixels are lost or interpolated, because every pixel maps exactly onto another pixel position.

To rotate by an arbitrary angle such as 45 degrees, we use an affine transformation instead, which involves matrix multiplication to map the pixel coordinates. The transformation matrix is generated with the cv2.getRotationMatrix2D() function, where a positive angle rotates the image counterclockwise, and it is applied with cv2.warpAffine(). In that case the output size is whatever we pass to cv2.warpAffine(), so the corners of the rotated image are cropped unless we enlarge the output canvas.

To scale an image, we can use the cv2.resize() function. We can specify the output in two ways: either by giving the width and height of the output image explicitly, or by giving scaling factors. We can also choose the interpolation method (the default is cv2.INTER_LINEAR), which determines the quality and computation cost of the output. There is always a trade-off between speed and quality of results.

image = cv2.imread(DATA_PATH + "/images/boy.jpg")

# Set rows and columns
resizeDownWidth = 300
resizeDownHeight = 200
resizedDown = cv2.resize(image, (resizeDownWidth, resizeDownHeight), interpolation= cv2.INTER_LINEAR)

# Resize without preserving the aspect ratio
resizeUpWidth = 600
resizeUpHeight = 900
resizedUp = cv2.resize(image, (resizeUpWidth, resizeUpHeight), interpolation= cv2.INTER_LINEAR)

plt.figure(figsize=[15,15])
plt.subplot(131);plt.imshow(image[:,:,::-1]);plt.title("Original Image")
plt.subplot(132);plt.imshow(resizedUp[:,:,::-1]);plt.title("Scaled Up Image")
plt.subplot(133);plt.imshow(resizedDown[:,:,::-1]);plt.title("Scaled Down Image")

Next, we resize with scaling factors, so that the aspect ratio is preserved:

# Scaling Up the image 1.5 times by specifying both scaling factors
scaleUpX = 1.5
scaleUpY = 1.5

# Scaling Down the image 0.6 times specifying a single scale factor.
scaleDown = 0.6

scaledDown = cv2.resize(image, None, fx= scaleDown, fy= scaleDown, interpolation= cv2.INTER_LINEAR)

scaledUp = cv2.resize(image, None, fx= scaleUpX, fy= scaleUpY, interpolation= cv2.INTER_LINEAR)

# We can also use the following syntax for displaying image
plt.figure(figsize=[15,15])
plt.subplot(121);plt.imshow(scaledDown[...,::-1]);plt.title("Scaled Down Image, size = {}".format(scaledDown.shape[:2]));
plt.subplot(122);plt.imshow(scaledUp[...,::-1]);plt.title("Scaled Up Image, size = {}".format(scaledUp.shape[:2]));

The image on the right appears to be of higher quality than the image on the left. The two images have different sizes, but we intentionally display them at the same size. The larger image therefore packs more pixels into the same area and looks sharper, whereas the smaller image has fewer pixels to cover that area, so the blur is obvious.

In conclusion, image operations are a fundamental aspect of image processing and computer vision. OpenCV provides many functions for performing image operations in Python. By understanding the basics of image operations, we can build more advanced image processing and computer vision applications.

Important links:

https://circuitdigest.com/tutorial/getting-started-with-opencv-image-processing

Image Operations — OpenCV with Python for Image and Video Analysis 4 — YouTube

OpenCV: Getting Started with Images

OpenCV: Basic Operations on Images


Written by Naveed Ul Mustafa
