We have already seen how to create arrays and how to modify their dimensions. One last operation we can perform is combining multiple arrays. There are two ways to do that: by assembling arrays of matching dimensions (concatenation, stacking, etc.) or by combining arrays of different dimensions using broadcasting. As in the previous chapter, we illustrate these operations with small arrays and a real image.
import numpy as np
import matplotlib.pyplot as plt
import skimage
plt.gray();
image = skimage.data.chelsea()
Let's start by creating a few 2D arrays:
array1 = np.ones((10,5))
array2 = 2*np.ones((10,3))
array3 = 3*np.ones((10,5))
The first operation we can perform is concatenation, i.e. assembling 2D arrays into a larger 2D array. Of course, we have to be careful with the size of each dimension. For example, if we try to concatenate array1 and array2 along the first dimension, we get an error:
np.concatenate([array1, array2])
Both arrays have 10 rows, but one has 5 columns and the other 3. We can therefore only concatenate them along the second dimension:
array_conc = np.concatenate([array1, array2], axis = 1)
array_conc.shape
plt.imshow(array_conc, cmap = 'gray');
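The shape constraint can be checked directly. The following is a minimal sketch that recreates the two arrays from above; it assumes that NumPy reports the mismatch as a ValueError, as current releases do, and the variable axis0_ok is only introduced here for illustration.

```python
import numpy as np

array1 = np.ones((10, 5))
array2 = 2 * np.ones((10, 3))

# Along axis 0 the arrays would need the same number of columns (5 vs 3)
try:
    np.concatenate([array1, array2], axis=0)
    axis0_ok = True
except ValueError as err:
    axis0_ok = False
    print("axis=0 failed:", err)

# Along axis 1 only the number of rows has to match (10 vs 10)
array_conc = np.concatenate([array1, array2], axis=1)
print(array_conc.shape)  # (10, 8)
```

All dimensions except the one along which we concatenate have to match, which is why only axis=1 works here.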
Using our real image, we can for example concatenate the first two channels of the RGB image, either along the first dimension (the default) or along the second:
plt.imshow(np.concatenate([image[:,:,0], image[:,:,1]]));
plt.imshow(np.concatenate([image[:,:,0], image[:,:,1]], axis=1));
If we have several arrays with exactly the same shape, we can also stack them, i.e. assemble them along a new dimension. For example, we can create a 3D stack out of two 2D arrays:
array_stack = np.stack([array1, array3])
array_stack.shape
We can select the dimension along which to stack, again by using the axis keyword. For example, if we want the new dimension to be the third axis, we can write:
array_stack = np.stack([array1, array3], axis = 2)
array_stack.shape
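To see how the axis keyword determines where the new dimension ends up, here is a small sketch that loops over the three possible positions for the two arrays defined above. The dictionary name shapes is only used for this illustration.

```python
import numpy as np

array1 = np.ones((10, 5))
array3 = 3 * np.ones((10, 5))

# The new dimension (of size 2, one entry per stacked array) is inserted at position `axis`
shapes = {}
for axis in (0, 1, 2):
    shapes[axis] = np.stack([array1, array3], axis=axis).shape
    print(axis, shapes[axis])

# With axis=2, the original arrays are recovered by indexing the last dimension
array_stack = np.stack([array1, array3], axis=2)
print(np.array_equal(array_stack[:, :, 1], array3))  # True
```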
With our real image, we can for example stack the different channels in a new order (note that the same reordering can be obtained more directly by indexing the last axis, e.g. image[:,:,[2,0,1]]):
image_stack = np.stack([image[:,:,2], image[:,:,0], image[:,:,1]], axis=2)
plt.imshow(image_stack);
As we placed the red channel, which has the highest intensity, at the position of the green one (second position), our image is now dominated by green tones.
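Reordering channels by stacking can be compared with plain integer-array indexing on the last axis. The sketch below uses a small random array, fake_image, as a stand-in for the real image (an assumption made to keep the example self-contained) and checks that both approaches give the same array.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the RGB image: 4 x 6 pixels, 3 channels
fake_image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Reorder channels by stacking them explicitly along a new last axis
stacked = np.stack(
    [fake_image[:, :, 2], fake_image[:, :, 0], fake_image[:, :, 1]], axis=2
)

# Same reordering by indexing the channel axis with a list of indices
indexed = fake_image[:, :, [2, 0, 1]]

print(np.array_equal(stacked, indexed))  # True
```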
NumPy has a powerful feature called broadcasting. This is the feature that allows you, for example, to write:
2 * array1
Here we just combined a single number with an array, and NumPy re-used, or broadcast, the element with fewer dimensions (the number 2) across the entire array1. This works not only with single numbers but also with arrays of different dimensions. Broadcasting can become very complex, so we limit ourselves here to a few common examples.
The general rule is that in an operation with arrays of different dimensions, missing dimensions or dimensions of size 1 get repeated to create two arrays of the same size. Note that dimension sizes are compared starting from the last dimension, and each pair of sizes must either be equal or contain a 1. For example, if we have a 1D array and a 2D array:
array1D = np.arange(4)
array1D
array2D = np.ones((6,4))
array2D
array1D * array2D
Here array1D, which has a single row, got broadcast over each row of the 2D array array2D. Note that the size of each dimension is important. If array1D had, for example, a different number of columns, broadcasting would not work:
array1D = np.arange(3)
array1D
array1D * array2D
As mentioned above, the comparison of dimension sizes starts from the last dimension, so if array1D had, for example, a length of 6, like the first dimension of array2D, broadcasting would still fail:
array1D = np.arange(6)
array1D.shape
array2D.shape
array1D * array2D
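The rule applied in the three examples above can be written out explicitly. The function broadcast_result_shape below is a hypothetical helper, not part of NumPy; it is a minimal sketch of the comparison from the last dimension backwards, and it only handles two shapes.

```python
import numpy as np

def broadcast_result_shape(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape, or None if incompatible."""
    # Missing leading dimensions behave like dimensions of size 1
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(padded_a, padded_b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            return None  # sizes differ and neither is 1
    return tuple(result)

# The three 1D lengths tried above, each combined with the (6, 4) array
for shape in [(4,), (3,), (6,)]:
    print(shape, "with (6, 4) ->", broadcast_result_shape(shape, (6, 4)))

# Cross-check the compatible case against NumPy itself
numpy_shape = np.broadcast(np.empty((4,)), np.empty((6, 4))).shape
print(numpy_shape)  # (6, 4)
```

Only the length-4 array is compatible: its single dimension is aligned with the last dimension of the 2D array, which has size 4.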
Broadcasting also works in higher-dimensional cases. Imagine for example that you have an RGB image with dimensions $N \times M \times 3$. If you want to modify each channel independently, for example to rescale them, you can use broadcasting. We can again use our real image:
image.shape
scale_factor = np.array([0.5, 0.1, 1])
scale_factor
rescaled_image = scale_factor * image
rescaled_image
plt.imshow(rescaled_image.astype(int))
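To confirm that each channel really gets its own factor, the following sketch repeats the operation on a small random array, fake_image, used here as a stand-in for the real image (an assumption made to keep the example self-contained), and compares the result channel by channel.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the RGB image: 4 x 6 pixels, 3 channels
fake_image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
scale_factor = np.array([0.5, 0.1, 1])

# Shapes (3,) and (4, 6, 3): the last dimensions match, so broadcasting works
rescaled = scale_factor * fake_image

# Each channel is multiplied by its own factor
for channel, factor in enumerate(scale_factor):
    assert np.allclose(rescaled[:, :, channel], factor * fake_image[:, :, channel])

print(rescaled.shape, rescaled.dtype)  # (4, 6, 3) float64
```

The result is a float array even though the input holds integers, which is why the display step above converts it back with astype(int).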
Note that if the image has the dimensions $3 \times N \times M$ (RGB planes in the first dimension), we encounter the same problem as before, a mismatch in size for the last dimension:
image2 = np.rollaxis(image, axis=2)
image2.shape
scale_factor.shape
scale_factor * image2
As seen above, if there is a mismatch in dimension sizes, the broadcasting mechanism doesn't work. To salvage such cases, we can add axes of size 1 ("empty" axes) to an array so that its remaining dimension lines up with the matching dimension of the other array.
In the above example our arrays have the following shapes:
image2.shape
scale_factor.shape
So we need to add two "empty" axes after the single dimension of scale_factor:
scale_factor_corr = scale_factor[:, np.newaxis, np.newaxis]
scale_factor_corr.shape
image2_rescaled = scale_factor_corr * image2
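As a check, the sketch below applies the same fix to a small random array, fake_image, standing in for the real image (an assumption made to keep the example self-contained). It uses np.moveaxis to build the channels-first version and verifies that the result agrees with the channels-last computation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the RGB image: 4 x 6 pixels, 3 channels
fake_image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
fake_image2 = np.moveaxis(fake_image, 2, 0)   # channels first: (3, 4, 6)
scale_factor = np.array([0.5, 0.1, 1])

# Two axes of size 1 turn shape (3,) into (3, 1, 1)
scale_factor_corr = scale_factor[:, np.newaxis, np.newaxis]
rescaled2 = scale_factor_corr * fake_image2   # (3, 1, 1) with (3, 4, 6) -> (3, 4, 6)

# reshape gives the same (3, 1, 1) array as the two np.newaxis insertions
same_as_reshape = np.array_equal(scale_factor_corr, scale_factor.reshape(3, 1, 1))

# Moving the channel axis back reproduces the channels-last result
matches = np.allclose(np.moveaxis(rescaled2, 0, 2), scale_factor * fake_image)

print(rescaled2.shape, same_as_reshape, matches)  # (3, 4, 6) True True
```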