A 2D image can be represented as a rectangular grid, composed of many square cells, called pixels.
How do these pixels look like?
Just try to imagine a chessboard:
Just like the black and white squares on the chessboard, pixels are nicely aligned in straight lines, both horizontally and vertically. We will refer to the horizontal ones as rows and to the vertical ones as columns. It is easy to see that a chessboard has 8 rows and 8 columns.
But can you guess how many rows and columns of pixels this image has?
To find this out using Python and OpenCV:
1) Save the image in the folder where you wish to run your Python code.
2) Import OpenCV library in your Python interpreter or .py file (if you are not familiar with running Python code, just look it up, it should be pretty easy).
3) Load the image in memory (we will refer to this operation as reading an image).
image = cv2.imread('<replace_with_image_name>.jpg')
4) Print the image dimensions.
The program should print
(232, 223, 3). The first two numbers are the number of rows and number of columns. So, 232 x 223 = 51736 pixels in total. For now, don’t worry about the third number, it represents the number of colors or, more formally, the number of color channels. We will talk about it in the last section.
Now that you know how to find the number of pixels in an image, how do you differentiate between them?
First of all through their address. In real life we use street names, house numbers, city, country etc. Pixels could also be seen as having a street name and a house number.
Street Name = How many rows are there upwards of the pixel?
House Number = How many columns are there to the left of the pixel?
A pixel location is formally defined as a pair (row, column), both indexed from 0 (their numbering starts from 0, instead of 1). See if you can guess the location of the red, green and blue pixels on the following grid.
The right answers are:
Red pixel at (3,5)
Green pixel at (13,8)
Blue pixel at (10,21)
In Python, for accessing a pixel at location
(row, column), we would have to specify the image that we are referring to, and the row and column in square brackets. For example, the pixel in the center of the image would be accessed as follows:
import cv2 image = cv2.imread('<replace_with_image_name>.jpg') no_rows = image.shape no_cols = image.shape # Remember that image.shape returns (no_rows, no_cols, no_channels) #  fetches the first number in brackets,  - the second one and so on row = no_rows/2 col = no_cols/2 print image[row, col]
If you run this on the image with Spiderman, the script should print
[ 63 75 141].
What color do you expect it to have?
We can visualize the subimage that contains this only one pixel. Generally, if you want to crop a rectangular portionwith top-left pixel at
(start_row, start_col) and bottom-right pixel at
(end_row, end_col), you would use:
sub_image = image[start_row:end_row + 1, start_col:end_col + 1]
and for our particular case:
pixel_image = image[row:row+1, col:col+1]
Because this is just a single-pixel image, it is more difficult to visualize, so I recommend resizing the image before displaying it. This can be done as follows:
desired_no_rows = 100 desired_no_cols = 100 pixel_image = cv2.resize(pixel_image, (desired_no_cols, desired_no_rows))
Now, if you just want to display the image obtained, there is the
cv2.imshow(window_name, image) function, which is perfect especially for debugging, when you change an image a number of times and you want to make sure that you are doing the right thing after each step.
For writing an image to disk, you would use the
cv2.imwrite(image_name, image) function.
Code for our case:
cv2.imshow('This looks like one pixel', pixel_image) cv2.waitKey() # For resuming running the code once a key is pressed cv2.imwrite('pixel.png', pixel_image)
And the pixel that I got is this one
Looks reddish, which makes sense because the center of the image falls on a red patch from Spiderman’s costume.
Another useful tool when analyzing images is the iteration through every pixel. This might turn out helpful in finding pixels or regions with certain properties, or in couting certain events that occur in the image.
Row by row (one street at a time):
no_rows = image.shape no_cols = image.shape for row in range(no_rows): for col in range(no_cols): print image[row, col], 'at row', row, 'column', col
Column by column (one house number at a time):
no_rows = image.shape no_cols = image.shape for col in range(no_cols): for row in range(no_rows): print image[row, col], 'at row', row, 'column', col
Let us get back to the chessboard analogy. Each cell was either black or white. Pixels are the same, but instead of just 2 possible values to describe one pixel
we have 256 possibilities for in-between, transitional colors for grayscale images
and 256 x 256 x 256 possibilities for colored(BGR) images.
So the color of pixels can be represented in a 3-dimensional space.
[ 63 75 141]?
The description of the pixel in the center.
And the third number in
(232, 223, 3)?
Which showed the number of color channels.
Well, one component (axis) is to measure the strength of blue(63), one to measure the strength of green(75) and one to measure the strength of red(141). It turns out that combinations of these 3 colors are enough to create most of the visually perceivable colors by humans. Thus, the BGR color space can be viewed as a cube, where 3 opposite points represent blue, green and red. All the other points represent combinations of these 3 core colors.
Where blue(255,0,0) and red(0,0,255) meet, we get magenta(255,0,255).
Where red(0,0,255) and green(0,255,0) meet, we get yellow(0,255,255).
Where green(0,255,0) and blue(255,0,0) meet, we get cyan(255,255,0).
What can you notice about colors in the BGR color space?
Their components are summed up independently.
What BGR components do you think black has?
What BGR components do you think white has?
Have a look at the highlighted region in the image with Spiderman…
…put under the microscope
If you look closely, you will notice that I only used black, blue, green and red pixels. Therefore no yellow and no white. They are simply an illusion. Feel free to inspect the white regions and see that white is created by strong blue, strong green and strong red. On the other hand, black is the result of the absence of blue, green and red components. This is why the BGR color space is called additive. The more color you add, the brighter the result is.
Another way to come to the same conclusion is to display one component at a time. That is, if I want to see the green component of an image, I’ll simply make the other two components equal zero. And the same with blue and green independently.
blue_component = image.copy() green_component = image.copy() red_component = image.copy() # First copy image into 3 separate images # Then make other components equal 0 # Can use the code for iterating an image no_rows = image.shape no_cols = image.shape # Isolate blue component for row in range(no_rows): for col in range(no_cols): blue = blue_component[row, col, 0] blue_component[row, col] = (blue, 0, 0) cv2.imwrite('blue.jpg', blue_component) # Isolate green component for row in range(no_rows): for col in range(no_cols): green = green_component[row, col, 1] green_component[row, col] = (0, green, 0) cv2.imwrite('green.jpg', green_component) # Isolate red component for row in range(no_rows): for col in range(no_cols): red = red_component[row, col, 2] red_component[row, col] = (0, 0, red) cv2.imwrite('red.jpg', red_component)
The wiser alternative is to use
cv2.merge as follows:
zeros = np.zeros((no_rows, no_cols), np.uint8) B,G,R = cv2.split(image) blue_component = cv2.merge((B, zeros, zeros)) green_component = cv2.merge((zeros, G, zeros)) red_component = cv2.merge((zeros, zeros, R))
And the result is:
The last image is in a different colorspace, called grayscale. We have seen that this space has only 256 variations of light intensity (or the color gray). Having just one component (in this case the intensity component) can turn out very convenient, because it can enable us to view an image as a landscape, which can provide many valuable intuitions for certain image processing tasks.
Conversion from BGR to grayscale:
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
Viewed from above it is very similar to the initial image
From the sides we can start seeing the dark spots as valleys
And the white spots as mountains, peaks or plateaus
Transitions can be smooth or abrupt and help us distinguish borders and regions
After reading this article you should know or keep in mind the following:
- How images have a grid-like representation, with small square cells named pixels.
- Using cv2.imread to store image in memory.
- Using shape to get image dimensions.
- Using cv2.imshow to display an image, good for debugging purposes.
- Using cv2.imwrite to save an image to disk.
- Getting the pixel at a specific location, given by row and column.
- Iterating through an image.
- Extracting a rectangular region from an image.
- Using cv2.resize to adapt an image to certain dimensions.
- Color representation: grayscale vs. BGR.
- Using cv2.cvtColor to convert between different color spaces.
- Separation of the blue, green and red components in BGR images (cv2.split, cv2.merge).
- Landscape representation of grayscale images.
1.. Mirror the Image
Using the iteration process of an image, flip an image around the vertical axis. Now flip it around the horizontal axis. If you are already familiar with writing such for loops, note that you can use the
cv2.flip(image, flipCode) function.
# Run: python mirror.py <image_name> import cv2 import sys # flips around the horizontal axis, done manually(with iteration) def flipHorizontallyWithIteration(image): flipped = image.copy() no_rows = flipped.shape no_cols = flipped.shape for col in range(no_cols): halfway = no_rows/2 for row in range(halfway): pix = flipped[row, col].copy() flipped[row, col] = flipped[no_rows - row - 1, col].copy() flipped[no_rows - row - 1, col] = pix return flipped # flips around the vertical axis, done manually(with iteration) def flipVerticallyWithIteration(image): flipped = image.copy() no_rows = flipped.shape no_cols = flipped.shape for row in range(no_rows): halfway = no_cols/2 for col in range(halfway): pix = flipped[row, col].copy() flipped[row, col] = flipped[row, no_cols - col - 1].copy() flipped[row, no_cols - col - 1] = pix return flipped # flips around the horizontal axis def flipHorizontally(image): flipped = cv2.flip(image, 0) return flipped # flips around the vertical axis def flipVertically(image): flipped = cv2.flip(image, 1) return flipped # fetch the name of the image and store it in memory image_name = sys.argv image = cv2.imread(image_name) vertical = flipVerticallyWithIteration(image) horizontal = flipHorizontallyWithIteration(image) # output the images cv2.imwrite('vertical_flip.jpg', vertical) cv2.imwrite('horizontal_flip.jpg', horizontal)
2.. Make a Puzzle
Write a method that first divides the image into square blocks with a given length(a certain number of pixels on each side). To make sure each block contains the same number of pixels from the initial image, adjust the image dimensions to a multiple of the requested length. Your method should then scramble the blocks in a random or different order. To make sure you correctly divide the image into blocks, add borders to them before merging them back together. You could use the
cv2.copyMakeBorder(image, pixels_on_top, pixels_on_bottom, pixels_on_left, pixels_on_right) function.
Example result (60 pixels):
Example result (30 pixels):
Example result (15 pixels):
# Run: python puzzle.py <image_name> <no pixels/block side> <border thickness in pixels> import cv2 import numpy as np import sys import random # Input: which image to scramble, # the number of pixels on one side of a square block, # the number of pixels for border between blocks # Output: scrambled image def scrambleImage(image, pixs, addBorder): scrambled = image.copy() no_rows = scrambled.shape no_cols = scrambled.shape rows_n = (no_rows/pixs)*pixs cols_n = (no_cols/pixs)*pixs # resize to make sure each block has the same number of pixels from the initial image scrambled = cv2.resize(scrambled, (cols_n, rows_n)) # store all the blocks in a list, which we shuffle later and remerge allBlocks =  # for loops to extract all blocks for row in range(rows_n-pixs+1): for col in range(cols_n-pixs+1): if (row % pixs == 0 and col % pixs == 0): block = scrambled[row:row+pixs, col:col+pixs].copy() allBlocks.append(block) # randomly shuffle random.shuffle(allBlocks) # recompute the number of rows and columns after border is added hm_rows = rows_n/pixs hm_cols = cols_n/pixs rows_n = rows_n + 2*hm_rows*addBorder cols_n = cols_n + 2*hm_cols*addBorder scrambled = np.zeros((rows_n, cols_n, 3), np.uint8) pixs = pixs + 2*addBorder next_block = 0 # reassemble the blocks for row in range(rows_n-pixs+1): for col in range(cols_n-pixs+1): if (row % pixs == 0 and col % pixs == 0): cur_block = allBlocks[next_block].copy() next_block += 1 with_borders = cv2.copyMakeBorder(cur_block, addBorder,addBorder,addBorder,addBorder,cv2.BORDER_CONSTANT,value = 0) scrambled[row:row+pixs, col:col+pixs] = with_borders return scrambled image_name = sys.argv pix_per_square = int(sys.argv) border_thick = int(sys.argv) image = cv2.imread(image_name) scrambled = scrambleImage(image, pix_per_square, border_thick) cv2.imwrite(str(pix_per_square) + '_scrambled_' + image_name, scrambled)
3.. Replace by the Average Pixel
Write a method that replaces a given rectangular region specified by the top-left pixel and bottom-right pixel with the average of the pixels contained.
Example result (region consisting of top 15 rows replaced by average):
Example result (all rows replaced by their average):
Example result (all columns replaced by their average):
# Run: python average.py <image_name> <start_row> <end_row> <start_col> <end_col> import cv2 import numpy as np import sys # replaces a given rectangluar portion by its average def replaceRectByAverage(image, startRow, endRow, startCol, endCol): blue_sum = 0.0 green_sum = 0.0 red_sum = 0.0 pixel_count = 0 rectangle = image[startRow:endRow+1, startCol:endCol+1] rows = rectangle.shape cols = rectangle.shape for row in range(rows): for col in range(cols): pixel = rectangle[row, col] blue_sum += float(pixel) green_sum += float(pixel) red_sum += float(pixel) pixel_count += 1 #print blue_sum, green_sum, red_sum, pixel_count average_blue = int(blue_sum/float(pixel_count)) average_green = int(green_sum/float(pixel_count)) average_red = int(red_sum/float(pixel_count)) #print average_blue, average_green, average_red new_image = image.copy() new_rectangle = np.zeros((1,1,3), np.uint8) new_rectangle[0,0] = (average_blue, average_green, average_red) new_rectangle = cv2.resize(new_rectangle, (cols, rows)) new_image[startRow:endRow+1, startCol:endCol+1] = new_rectangle return new_image # replaces each row by the average pixel in the row def replaceEachRowByAverage(image): average_rows = image.copy() no_rows = image.shape no_cols = image.shape for row in range(no_rows): average_rows = replaceRectByAverage(average_rows, row, row, 0, no_cols - 1) cv2.imwrite('average_rows.jpg', average_rows) # replaces each column by the average pixel in the column def replaceEachColByAverage(image): average_cols = image.copy() no_rows = image.shape no_cols = image.shape for col in range(no_cols): average_cols = replaceRectByAverage(average_cols, 0, no_rows - 1, col, col) cv2.imwrite('average_cols.jpg', average_cols) image_name = sys.argv sr = int(sys.argv) # start row er = int(sys.argv) # end row sc = int(sys.argv) # start col ec = int(sys.argv) # end col image = cv2.imread(image_name) average_image = replaceRectByAverage(image, sr, er, sc, ec) cv2.imwrite('average_' + image_name, average_image) replaceEachRowByAverage(image) replaceEachColByAverage(image)
If you found this useful please take a moment to share this with your friends.