Gathering Information from Color
In this article I would like to further explore what you can do with OpenCV by analyzing how often certain colors appear, and in what proportions, in different types of images. I will also introduce the concept of histograms and go through some interesting applications. Histograms naturally link image processing to probability theory. And since probability and statistics are among the core mathematical theories here, learning parts of them can improve your ability to apply your own ideas to computer vision. Note that the theoretical content of this article is considerably more advanced than in the previous ones, although it requires less coding. Here are some questions to get you started.
How reliable do you think color is?
What can it tell us about an image?
Color is very sensitive to lighting conditions (illumination). While a colorful object may look black at night, a piece of dark metal can appear white and shiny on a sunny day. As humans, we account for this automatically, but for a machine it is much harder to grasp.
Consequently, grayscale (colorless) images are used more often when applying object recognition algorithms. However, using color as a simple feature which can be extracted from images is a very good way to prepare yourself before learning about more complex features.
Take a look at this image. All you can see is sand.
The sand is perceived differently under different illumination conditions:
Normal sand – control condition, yellowish – light brownish color
Sand covered by water – reflects more light and thus appears shinier, with a slight tendency towards blue
Sand in the shadow – receives less light from the sun and thus appears darker, with a strong tendency towards black
| Condition | Average light intensity |
| --- | --- |
| Sand covered by water | 76% |
| Sand in the shadow | 30% |
The sand can be visually perceived in variations of yellow, blue, black and possibly other colors too. This shows just how unreliable color can be when it comes to automated detection by computers. The difficulty of using color comes from the uncertainty originating in the variable structure of the input and the conditions that affect it. And since probability theory can be used to solve problems involving uncertainty, you can already foresee its utility.
# Run: python save_rectangles.py <image_name>
# Gets several rectangular regions from an image using mouse clicks
# The image is scaled in memory to have at most 500 rows
# 4 clicks select one rectangle; the process can be repeated
# The program quits when the 'q' key is pressed
# Obtained images are saved under: rectangle_i.jpg, highlighted.jpg
import cv2
import numpy as np
import sys

# Saves the rectangle after the 4th point is clicked
def save_rectangle():
    global real_image, highlighted_region, rect_count
    low_x = min(click_x)
    high_x = max(click_x)
    low_y = min(click_y)
    high_y = max(click_y)
    print('rows:', low_y, high_y)
    print('cols:', low_x, high_x)
    sub_image = real_image[low_y:high_y, low_x:high_x]
    highlighted_region = real_image.copy()
    cv2.rectangle(highlighted_region, (low_x, low_y), (high_x, high_y), (0, 255, 0), 2)
    cv2.imwrite('rectangle_' + str(rect_count) + '.jpg', sub_image)
    cv2.imwrite('highlighted.jpg', highlighted_region)
    rect_count += 1
    real_image = highlighted_region.copy()
    # Comment out the previous line if you wish to show all extracted rectangles

# Cleans up the image for further rectangle extraction
def wash_image():
    global display_image
    display_image = highlighted_region.copy()

# Mouse event action
def make_click(event, x, y, flags, param):
    global click_count
    if event == cv2.EVENT_LBUTTONUP:
        if click_count == 0:
            wash_image()
            cv2.imshow('image', display_image)
        click_x[click_count] = x
        click_y[click_count] = y
        cv2.circle(display_image, (x, y), 8, (0, 255, 0), 2)
        cv2.imshow('image', display_image)
        click_count += 1
        if click_count == 4:
            save_rectangle()
            click_count = 0

click_count = 0
rect_count = 0
click_x = [0, 0, 0, 0]
click_y = [0, 0, 0, 0]

image_name = sys.argv[1]
real_image = cv2.imread(image_name)
rows, cols, no_channels = real_image.shape

# Scale the image down so that it has at most 500 rows
maxsz = 500.0
if rows > maxsz:
    ratio = float(rows) / maxsz
    n_rows = int(float(rows) / ratio)
    n_cols = int(float(cols) / ratio)
    real_image = cv2.resize(real_image, (n_cols, n_rows))

display_image = real_image.copy()
highlighted_region = real_image.copy()

cv2.namedWindow('image')
cv2.setMouseCallback('image', make_click)
while True:
    cv2.imshow('image', display_image)
    key = cv2.waitKey()
    if key == ord('q'):
        break
cv2.destroyAllWindows()
# Extracts the average intensity for a number of images
# The result is scaled to the interval [0,1]
# Run: python average_intensity.py <image_name_1> <image_name_2> ...
import cv2
import sys
import numpy as np

# Get the number of images in the input
no_images = len(sys.argv) - 1

# Fetch all image names
image_name = []
for i in range(no_images):
    image_name.append(sys.argv[i + 1])

# Iterate through all images
for i in range(no_images):
    image = cv2.imread(image_name[i], 0)
    no_rows = image.shape[0]
    no_cols = image.shape[1]
    sumall = 0.0
    # Iterate through all pixels and compute the sum of their intensities
    for row in range(no_rows):
        for col in range(no_cols):
            sumall += float(image[row, col])
    average = sumall / float(no_rows * no_cols)
    # Scale the result to the interval [0,1] and print it
    print(average / 255.0)
Short version (using NumPy's average):
# Iterate through all images
for i in range(no_images):
    image = cv2.imread(image_name[i], 0)
    average = np.average(image)
    # Scale the result to the interval [0,1] and print it
    print(average / 255.0)
Before you read on, take a moment to think about the following questions.
How do you distinguish between the foreground and background of an image?
How do you know whether a photo is taken during the day or during the night?
Or even whether Picasso’s paintings are from his Blue or Rose Period?
Histograms are the best way to visualize the different amounts of color/intensity in an image. Do you know how to create a histogram? It is much like sorting objects into categories and then discarding all information specific to the individual objects, keeping only the per-category counts. The pixels in an image have a certain color and position. If the categories are colors, then the image histogram loses the information about where pixels are located and can only tell you how many pixels of each color there are.
Long story short, a histogram counts the number of pixels of a certain type and displays their numbers in a graph.
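In NumPy terms, a grayscale histogram is just a per-intensity pixel count. A minimal sketch on a tiny synthetic image:

```python
import numpy as np

# A tiny "image": four pixels of intensity 0, two of intensity 255
image = np.array([[0, 0, 255],
                  [0, 0, 255]], dtype=np.uint8)

# Count how many pixels fall into each of the 256 intensity categories;
# positions are discarded, only the per-category counts survive
hist = np.bincount(image.ravel(), minlength=256)

print(hist[0], hist[255])   # 4 2
```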
Have you noticed, while looking at the two histograms, the two peaks and the valley between them?
What do you think they might represent?
In the image with the coins, the first peak stands for the pixels representing coins, the foreground, while the second one represents the background pixels. Notice that the gap between them is more visible than in the image with the hand.
Why do you think there are more pixels in the gap between the peaks in the second image?
The pixels that are associated with the gap have transitional intensities between the brightest and darkest objects in the image. We can notice them in the left part of the image, where the hand is slightly shadowed. Still, foreground and background can be separated quite distinctly.
These two-peaked histograms are called bimodal. The name comes from statistics, where it refers to probability distributions. A mode is a local maximum of the distribution: a distribution with two maxima has two modes and is called bimodal, one with three maxima is called trimodal, and so on.
This section contains all the technical things that you might need to compute and plot histograms.
Short version – one image and one channel:
cv2.calcHist([image], [channel], mask, [no_bins], [start_bin_inclusive, end_bin_exclusive])
Typical grayscale example:
cv2.calcHist([image], [0], None, [256], [0, 256])
If you are working with a grayscale image, simply set the channel to 0. With BGR images, you can specify 0, 1 or 2 for the blue, green and red channels respectively.
Long version – more images and more channels:
cv2.calcHist([image1, image2, ...], [channel1, channel2, ...], mask, [no_bins_channel1, no_bins_channel2, ...], [low_channel1, high_channel1, low_channel2, high_channel2, ...])
This will return a no_bins_channel_1 x no_bins_channel_2 x no_bins_channel_3 x ... array. The number of dimensions is equal to the length of the arrays in the 2nd and 4th arguments.
Note that the images specified must have the same size and depth.
For grayscale images, 0 will refer to the only channel in the first image, 1 will refer to the only channel in the second one and so on. For BGR images, 0, 1 and 2 will be used for blue, green and red channels in the first image, 3, 4 and 5 for the second image and so on.
This is an optional field, but if specified, the mask must have the same size as the images. Pixels at positions where the mask is 0 are discarded when computing the histogram.
An array specifying into how many bins/categories the pixels in the corresponding channel are divided.
This refers to the part [low_channel1, high_channel1, low_channel2, high_channel2, ...]. For color counting, the lows will be 0 and the highs 256. But if you work with something else, this changes. For example, if you wish to count only the brighter half of the color spectrum, specify low as 128 and high as 256. Note that this will also affect the number of bins.
For a more detailed explanation, take a look at the documentation.
The simplest way to create an image from the obtained histogram is to use the plot function from the Matplotlib library.
Grayscale image histogram:
from matplotlib import pyplot as plt
import cv2
import sys

image_name = sys.argv[1]
image = cv2.imread(image_name, 0)
hist = cv2.calcHist([image], [0], None, [256], [0, 256])

# Plotting the histogram and saving the figure
plt.plot(hist)
plt.savefig('hist_' + image_name)
Matplotlib even has a method that can plot the histogram directly:
plt.hist(image.ravel(), 256, [0, 256], color='gray')
plt.savefig('hist_' + image_name)
Alternatively, this can also be done using an OpenCV method:
cv2.polylines(image, points, is_closed, color)
but you will need to convert the histogram into an array of points. The code here should do exactly this.
Having shown what histograms are and how to obtain them in Python, let us see whether we can distinguish images from the Blue and Rose Periods, and photos taken during daytime from those taken at night, just by looking at their histograms.
For our first example, we need to define what bluish and reddish pixels are and count them. How would you define them?
To keep things simple, count any pixel with blue_component >= 100 as bluish and any pixel with red_component >= 100 as reddish. This is a little imprecise, but it proves the point for our case. In the following parts we will come up with a better definition.
Count (bluish) -> 77027 pixels
Count (reddish) -> 178872 pixels
Percentage (bluish) -> 19%
Percentage (reddish) -> 45%
Count (bluish) -> 184052 pixels
Count (reddish) -> 54010 pixels
Percentage (bluish) -> 46%
Percentage (reddish) -> 13%
You can see that, even with our clumsy definitions of blue and red, it is already obvious from these four numbers which image belongs to which period.
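Counts like these can be reproduced with a few NumPy operations. A sketch on a synthetic image, using the same >= 100 rule defined earlier:

```python
import numpy as np

# Synthetic BGR image: left half strongly blue, right half strongly red
image = np.zeros((100, 200, 3), dtype=np.uint8)
image[:, :100, 0] = 150   # blue channel (index 0 in BGR)
image[:, 100:, 2] = 150   # red channel (index 2 in BGR)

total = image.shape[0] * image.shape[1]
bluish = np.count_nonzero(image[:, :, 0] >= 100)
reddish = np.count_nonzero(image[:, :, 2] >= 100)

print(100 * bluish // total, '% bluish')    # 50 % bluish
print(100 * reddish // total, '% reddish')  # 50 % reddish
```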
Before plotting the histograms of the two images, it is advisable either to scale them so that they contain approximately the same number of pixels, or to divide the counts by the total number of pixels (i.e. compute percentages). Otherwise, a small patch of blue in the image that actually contains less blue might make you think it contains more; this can happen, for instance, if the image with the blue patch is much larger.
To resize an image with no_rows rows and no_cols columns so that it contains min_no_pix pixels, we look for a scaling ratio such that:

new_no_rows * new_no_cols = min_no_pix   (1)

new_no_rows = no_rows / ratio   (2)

new_no_cols = no_cols / ratio   (3)

By replacing (2) and (3) in (1), we get
(no_rows * no_cols)/ratio^2 = min_no_pix
which leads to
ratio = sqrt((no_rows * no_cols)/min_no_pix)
# Run: python scale_same_no_pix.py <image_name_1> <image_name_2> ...
import cv2
import sys
import math

# Get the number of images provided
no_images = len(sys.argv) - 1

# Get all the names of the images that we want to resize to a common number of pixels
image_name = []
for i in range(no_images):
    image_name.append(sys.argv[i + 1])

image = []
for i in range(no_images):
    image.append(cv2.imread(image_name[i]))

# Initialize the minimum number of pixels to the one from the first image
min_no_pix = image[0].shape[0] * image[0].shape[1]

# Find the minimum number of pixels over all images
for i in range(no_images):
    no_rows = image[i].shape[0]
    no_cols = image[i].shape[1]
    min_no_pix = min(min_no_pix, no_rows * no_cols)

# Resize all images so that they contain about the same number of pixels as the minimum
for i in range(no_images):
    no_rows = image[i].shape[0]
    no_cols = image[i].shape[1]
    ratio = math.sqrt((float(no_rows) * float(no_cols)) / float(min_no_pix))
    n_rows = int(float(no_rows) / ratio)
    n_cols = int(float(no_cols) / ratio)
    image[i] = cv2.resize(image[i], (n_cols, n_rows))
    cv2.imwrite(str(i) + '.jpg', image[i])
For our second example, intensity matters more than color, so let us work with the grayscale version of the image when plotting the histogram. We can use a definition of bright pixels similar to the one above:
grayscale_component >= 100.
Percentage (bright) -> 53%
Percentage (bright) -> 12%
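A bright-pixel percentage like the ones above can be computed in a few lines (same >= 100 rule; the image here is synthetic):

```python
import numpy as np

# Synthetic "day" image: mostly bright pixels with a dark strip on top
image = np.full((10, 10), 180, dtype=np.uint8)
image[:2, :] = 20

# Count pixels with grayscale_component >= 100 and turn it into a percentage
bright = np.count_nonzero(image >= 100)
percentage = 100.0 * bright / image.size

print(round(percentage), '% bright')   # 80 % bright
```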
If you want to look at images and histograms at the same time, you might find this useful:
# Run: python concatenate_images.py <border_length> <image_name_1> <image_name_2> ...
# Concatenates images horizontally
# All images are resized so that their number of rows equals the minimum among all images
import cv2
import sys
import numpy as np

border_len = int(sys.argv[1])
no_images = len(sys.argv) - 2

image_name = []
for i in range(no_images):
    image_name.append(sys.argv[i + 2])

image = []

# Find the number of rows in the concatenated image
min_no_rows = 5000
for i in range(no_images):
    cur_image = cv2.imread(image_name[i], 1)
    image.append(cur_image)
    min_no_rows = min(min_no_rows, cur_image.shape[0])

# Find the number of columns in the concatenated image
no_cols_total = (no_images - 1) * border_len
for i in range(no_images):
    cur_image = image[i]
    no_rows = cur_image.shape[0]
    no_cols = cur_image.shape[1]
    ratio = float(no_rows) / float(min_no_rows)
    new_no_cols = int(float(no_cols) / ratio)
    no_cols_total += new_no_cols
    image[i] = cv2.resize(cur_image, (new_no_cols, min_no_rows))

# Copy each resized image into the concatenation, leaving borders in between
concatenation = np.zeros((min_no_rows, no_cols_total, 3), np.uint8)
col_start = 0
for i in range(no_images):
    if i != 0:
        col_start += border_len
    cur_image = image[i]
    no_cols = cur_image.shape[1]
    concatenation[:, col_start:col_start + no_cols, :] = cur_image.copy()
    col_start += no_cols

# Add an outer border around the final image
concatenation = cv2.copyMakeBorder(concatenation, border_len, border_len,
                                   border_len, border_len,
                                   cv2.BORDER_CONSTANT, value=0)
cv2.imwrite('concatenation.jpg', concatenation)
In the next post I shall talk about how to improve image search using histograms and how to count pixels more efficiently. Some problem solving coming up, too. Stay tuned.
If you found this useful please take a moment to share this with your friends.