All images are essentially matrices with a variable number of dimensions where each element represents the value of one pixel. The different dimensions and the pixel values can have very different meanings depending on the type of image considered, but the structure is the same.
Python does not allow by default to gracefully handle multi-dimensional data. In particular it is not desgined to handle matrix operations. Numpy was developed to fill in this blank and offers a very similar framework as the one offered by Matlab. It is underlying a large number of packages and has become abolsutely essential to Python scientific programming. In particular it underlies the functions of scikit-image. The latter in turn forms the basis of other software like CellProfiler. It is thus essential to have a good understanding of Numpy to proceed.
Instead of introducing Numpy in an abstract way, we are going here to present it through the lense of image processing in order to focus on the most useful features in the context of this course.
Some test images are provided directly in skimage, so let us look at one (we'll deal with the details of image import later). First let us import the necessary packages.
import numpy as np
import skimage
import matplotlib.pyplot as plt
plt.gray(); # MZ: nsure it will use gray scale for the plotting
image = skimage.data.coins()
# submodule skimage.data => provide images
# MZ: added to have all outputs
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
a=5
a
b=2
b
# => will print 5 and 2 and not only 2
The first thing we can do with the image is simply look at the output:
image # MZ: it is a numpy arrray
We see that Numpy tells us we have an array
and we don't have a simple list of pixels, but a list of lists representing the fact that we are dealing with a two-dimensional object. Each list represents one row of pixels. Numpy smartly only shows us the first/last rows/columns. We can use the .shape
method to check the size of the array:
image.shape # MZ: give the dimension
This means that we have an image of 303 rows and 384 columns. We can also visualize the image using matplotlib:
plt.imshow(image);
%matplotlib inline
# %matplotlib notebook
# with notebook -> you can zoom, convenient for notebook
# MZ: magic lines for jupyter with %
image
In the output above we see that we have one additional piece of information: the array has dtype = uint8
, which means that the image is of type unsigned integer 8 bit. We can also get the type of an array by using:
image.dtype # MZ: dtype is an attribute of "image" (// shape)
Standard formats we are going to see are 8bit (uint8), 16bit (uint16) and non-integers (usually float64). The type of the image pixels set what values they can take. For example 8bit means values from $0$ to $2^8 -1= 256-1 = 255$. Just like for example in Fiji, one cane change the type of the image. If we know we are going to do operations requiring non-integers we can turn the pixels into floats trough the .astype()
function.
# MZ:
# a bit more careful with types of images !
# if integer or not it really matters !
# numpy different from Python philosophy and dynamic typing
# be careful, e.g. if values > 255 -> can behave weird
image_float = image.astype(float)
Notice the '.':
image_float
image_float.dtype
The importance of the image type goes slightly against Python's philosophy of dynamics typing (no need to specify a type when creating a variable), but a necessity when handling images. We are going to see now what types of operations we can do with arrays, and the importance of types is going to be more obvious.
Numpy is written in a smart way such that it is able to handle operations between arrays of different sizes. In the simplest case, one can combine a scalar and an array, for example through an addition:
image
image+10 # add 10 to each element of the array
# MZ: advantage of using nupy ! will not work with list ! here it works pixel-wise
Here Numpy automatically added the scalar 10 to each element of the array. Beyond the scalar case, operations between arrays of different sizes are also possible through a mechanism called broadcasting. This is an advanced (and sometimes confusing) features that we won't use in this course but about which you can read for example here.
The only case we are going to consider here is operations between arrays of same size. For example we can multiply the image by itself. We use first the float version of the image:
image_sq = image_float*image_float
# MZ:
# does not perform matrix multiplication !, but multiply each pixel with each pixel at the same position
# (will not perform like in linear algebra) (will have to use other numpy functions)
image_sq
image_float
Looking at the first row we see $47^2 = 2209$ and $123^2=15129$ etc. which means that the multiplication operation has happened pixel-wise. Note that this is NOT a classical matrix multiplication. We can also see that the output has the same size as the original arrays:
image_sq.shape
image_float.shape
Let's see now what happens when we square the original 8bit image:
image*image
We see that we don't get at all the expected result. Since we multiplied two 8bit images, Numpy assumes we want an 8bit output. And therefore the values are bound between 0-255. For example the first value is just the remainder of the modulo 256:
# MZ:
# what is above 255 get reassigned to a 0-255 value
# as numpy assumed that we have 8bit int !!!
# if you want > 255 values -> first make the matrix as float
2209%256
The same thing happens e.g. if we add an integer scaler to the matrix:
print(image+230)
Clearly something went wrong as we get values that are smaller than 230. Again any value "over-flowing" above 255 goes back to 0.
This problem can be alleviated in different ways. For example we can combine a integer array with a float scaler and Numpy will automatically give a result using the "most complex" type:
image_plus_float = image+230.0
print(image_plus_float) # MZ: e.g. has removed 256: 277-256 = 21
To be on the safe side we can also explicitely change the type when we know we might run into this kind of trouble. This can be done via the .astype()
method:
# MZ:
# combine integer with float -> Python logic, use the most complex type
# will convert int to float and the output will be float
image_float = image.astype(float)
image_float.dtype
Again, if we combine floats and integers the output is going to be a float:
image_float+230
A set of important operations when processing images are logical (or boolean) operations that allow to create masks for features to segment. Those have a very simple syntax in Numpy. For example, let's compare pixel intensities to some value a:
threshold = 100
image > threshold
We see that the result is again a pixel-wise comparison with a, generating in the end a boolean or logical matrix. We can directly assign this logical matrix to a variable and verify its shape and type and plot it:
image_threshold = image > threshold
image_threshold.shape
image_threshold.dtype
image_threshold
plt.imshow(image_threshold);
Of course other logical operator can be used (<, >, ==, !=) and the resulting boolean matrices combined:
threshold1 = 70
threshold2 = 100
image_threshold1 = image > threshold1
image_threshold2 = image < threshold2
# MZ
# logical: often use of masks
# e.g. you have a mask for dog and a mask for houses -> apply the masks to the images using logicals
# MZ: here we deal with logical matrices
image_AND = image_threshold1 & image_threshold2 # MZ: True in the 2 matrices
image_XOR = image_threshold1 ^ image_threshold2 # MZ: what is True in 1 matrix but not in the other one
# MZ: multiple panels on matplot
plt.figure(figsize=(15,15)) # set the figure sizes
plt.subplot(1,4,1) # how many 1 row, 4 columns, and what is the 1st element
plt.imshow(image_threshold1)
plt.subplot(1,4,2) # in the subplot where 1 row and 4 columns, what should be the 2nd element
plt.imshow(image_threshold2)
plt.subplot(1,4,3)
plt.imshow(image_AND)
plt.subplot(1,4,4)
plt.imshow(image_XOR);
To broadly summarize, one can say that Numpy offers three types of operations: 1. Creation of various types of arrays, 2. Pixel-wise modifications of arrays, 3. Operations changing array dimensions, 4. Combinations of arrays.
Often we are going to create new arrays that later transform them. Functions creating arrays usually take arguments spcifying both the content of the array and its dimensions.
Some of the most useful functions create 1D arrays of ordered values. For example to create a sequence of numbers separated by a given step size:
np.arange(0,20,2) # MZ: from where to where in step of what
Or to create an array with a given number of equidistant values:
np.linspace(0,20,5)
In higher dimensions, the simplest example is the creation of arrays full of ones or zeros. In that case one only has to specify the dimensions. For example to create a 3x5 array of zeros:
np.zeros((3,5))
Same for an array filled with ones:
np.ones((3,5))
Until now we have only created one-dimensional lists of 2D arrays. However Numpy is designed to work with arrays of arbitrary dimensions. For example we can easily create a three-dimensional "ones-array" of dimension 5x8x4:
array3D = np.ones((2,6,5))
array3D
array3D.shape
# MZ: you should decide which dimension is the channel/volume (usually the 1st or the last)
# MZ: numpy functions can easily deal with any dimension
# (e.g. it is easy to convert code written for 2D to code for 3D objects)
And all operations that we have seen until now and the following ones apply to such high-dimensional arrays exactly in the same way as before:
array3D*5
We can also create more complex arrays. For example an array filled with numbers drawn from a normal distribution:
np.random.standard_normal((3,5))
As mentioned before, some array-creating functions take additional arguments. For example we can draw samples from a gaussian distribution whose mean and variance we can specify.
np.random.normal(10, 2, (5,2))
# MZ: NB "Tab" for auto-completion; "Shift+Tab" to get the help for the function
Numpy has a large trove of functions to do all common mathematical operations matrix-wise. For example you can take the cosine of a matrix:
angles = np.random.random_sample(5)
angles
np.cos(angles)
Or to calculate exponential values:
np.exp(angles)
And many many more.
Some functions are accessible in the form of method, i.e. they are called using the dot notation. For example to find the maximum in an array:
angles.max() # MZ: return the max value inside the array
Alternatively there's also a maximum function:
np.max(angles) # MZ: same as above but calling directly as a function
The max
function like many others (min, mean, median etc.) can also be applied to a given axis. Let's imagine we have a 3D image (multiple planes) of 10x10x4 pixels:
volume = np.random.random((10,10,4))
#volume
If we want to do a maximum projection along the third axis, we can specify:
projection = np.max(volume, axis = 2)
# MZ: specify an axis
# 0 1 2
# maximum along the 3 -> axis = 2
# creates a projection
projection.shape
projection2 = np.max(volume, axis = 0)
projection2.shape
projection3 = np.max(volume, axis = 1)
projection3.shape
We see that we have indeed a new array with one dimension less because of the projection.
Finally arrays can be combined in multiple ways. For example if we want to assemble to images with the same size into a stack, we can use the stack function:
image1 = np.ones((4,4))
image2 = np.zeros((4,4))
stack = np.stack([image1, image2],axis = 2)
stack.shape
Just like broadcasting, the selection of parts of arrays by slicing or indexing can become very sophisticated. We present here only the very basics to avoid confusion. There are often multiple ways to do slicing/indexing and we favor here easier to understant but sometimes less efficient solutions.
To simplify the visualisation, we use here a natural image included in the skimage package.
image = skimage.data.chelsea()
image.shape # MZ: 300x451 pixels and 3 planes: RGB
We see that the image has three dimensions, probably it's a stack of three images of size 300x400. Let us try to have a look at this image hoping that dimensions are handled gracefully:
plt.imshow(image); # MZ: if pass an image with 3 planes as last dim -> implicitly assumes it is an RGB image
So we have an image of a cat with dimensions 300x400. The image being in natural colors, the three dimensions probably indicate an RGB (red, green, blue) format, and the plotting function just knows what to do in that case.
Let us now just look at one of the three planes composing the image. To do that, we are going the select a portion of the image array by slicing it. One can give:
What portion is selected has to be specified for each dimensions of an array. In our particular case, we want to select all rows, all columns and a single plane of the image:
image.shape
image[:,:,1].shape # MZ: select only the 2nd plane
plt.imshow(image[:,:,0],cmap='gray')
# MZ: cmap argument -> here redondant with plt.gray();
# different colormaps provided by matplotlib (map pixel-values to colors)
plt.title('First plane: Red');
We see now the red layer of the image. We can do the same for the others by specifying planes 0, 1, and 2:
plt.figure(figsize=(10,10))
plt.subplot(1,3,1)
plt.imshow(image[:,:,0],cmap='gray')
plt.title('First plane: Red')
plt.subplot(1,3,2)
plt.imshow(image[:,:,1],cmap='gray')
plt.title('Second plane: Green')
plt.subplot(1,3,3)
plt.imshow(image[:,:,2],cmap='gray')
plt.title('Third plane: Blue');
# MZ:
# no physical meaning to the colormaps, you can put what ever you want as colors
# is only the rendering of the pixel values
Logically intensities are high for the red channel and low for the blue channel as the image has red/brown patterns. We can confirm that by measuring the mean of each plane. To do that we use the same function as above but apply it to a singel sliced plane:
image0 = image[:,:,0] # MZ: retain only 1st dim
np.mean(image0) # MZ: mean of all pixels
and for all planes using a comprehension list:
[np.mean(image[:,:,i]) for i in range(3)] # MZ: calculat the mean of every plane
To look at some more details let us focus on a smaller portion of the image e.g. one of the cat's eyes. For that we are going to take a slice of the red image and store it in a new variable and display the selection. We consider pixel rows from 80 to 150 and columns from 130 to 210 of the first plane (0).
image_red = image[80:150,130:210,0]
plt.imshow(image_red,cmap='gray');
There are different ways to select parts of an array. For example one can select every n'th element by giving a step size. In the case of an image, this subsamples the data:
image_subsample = image[80:150:3,130:210:3,0]
plt.imshow(image_subsample,cmap='gray');
In addition to slicing an array, we can also select specific values out of it. There are many different ways to achieve that, but we focus here on two main ones.
First, one might have a list of pixel positions and one wishes to get the values of those pixels. By passing two lists of the same size containing the rows and columns positions of those pixels, one can recover them:
row_position = [0,1,2,3]
col_position = [0,1,0,1]
print(image_red[0:5,0:5])
# MZ: pass the 2 lists -> assumes that you mean the pixels you want
image_red[row_position,col_position]
# MZ: output is just a list of pixels, not in 3 dim anymore ! output is 1D
# MZ => you can extract either with 3-dot notation or by passing a list
Alternatively, one can pass a logical array of the same dimensions as the original array, and only the True pixels are selected. For example, let us create a logical array by picking values above a threshold:
threshold_image = image_red>120
Let's visualize it. Matplotlib handles logical arrays simply as a binary image:
plt.imshow(threshold_image)
plt.title('Thresholded logical image');
We can recover the value of all the "white" (True) pixels in the original image by indexing one array with the other:
selected_pixels = image_red[threshold_image]
# MZ:
# create a mask with logical array
# pass another image, of the same size, should be a boolean array and
# instead of passing explicit lists of rows/columns -> direct pass an array
# output is again a list
# useful e.g. for segmentation (create a mask where you have the cells only to extract
# from other panes where you have light emission and average the light emission)
print(selected_pixels)
And now ask how many pixels are above threshold and what their average value is.
len(selected_pixels)
np.mean(selected_pixels)
threshold_image # MZ: mask is a boolean array 2D
np.argwhere(threshold_image)
# MZ: 2 dim arrays -> gives where are the True values in x,y coordinates
# MZ: to have all attributes and functions associated with an object
#dir(threshold_image)
# MZ: same works for packages
#dir(np)
We now know that there are 2585 pixels above the threshold and that their mean is 153.6
# to plot with transprency: (e.g. to plot 1 fig on the top of another)
# imshow(alpha=0.5)