3.3 Manipulating image data

Images as Numpy arrays

With AsPyLib, image data is stored in Numpy array variables.

Numpy arrays are multi-dimensional arrays, very similar to the arrays in Matlab. A tutorial explaining what Numpy arrays are and how to manipulate them can be found here. The differences from Matlab arrays are explained there.

One peculiarity is that Numpy arrays have a data type. There are for instance Numpy arrays of integers, of single-precision reals, of double-precision reals, of booleans, etc. In AsPyLib, the image data is always stored as single-precision reals (32 bits). This is a strong improvement over using integers for calculations, because the truncation errors of integer arithmetic are avoided and the accuracy is improved.
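The effect of the data type can be illustrated with a minimal sketch. It uses only Numpy, not AsPyLib, and the pixel values and variable names are made up for the illustration: the same average of two frames is computed once with integers and once with single-precision reals.

```python
import numpy as np

# Two small synthetic "frames" with integer pixel values (made-up numbers).
frame_a = np.array([1000, 1001], dtype=np.uint16)
frame_b = np.array([1001, 1003], dtype=np.uint16)

# Integer arithmetic: the average is truncated to a whole number.
mean_int = (frame_a + frame_b) // 2

# Same calculation after conversion to single-precision reals (32 bits).
mean_float = (frame_a.astype(np.float32) + frame_b.astype(np.float32)) / 2

print(mean_int)          # the half ADU of the first pixel is lost
print(mean_float)        # the fractional part is kept
print(mean_float.dtype)  # float32
```

Single-precision reals still have a finite precision (about 7 significant digits), which is far finer than the truncation seen with the integers above.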

There is another very strong advantage in using Numpy arrays for image data: the Numpy and Scipy modules (two “official” Python modules), which contain all the mathematical functions you can ever dream of, can be used directly with the image data in AsPyLib.
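As a short sketch of what this makes possible, the code below applies a few standard Numpy functions to an array of single-precision reals. The array is synthetic and merely stands in for image data loaded with AsPyLib; the variable names are chosen for the illustration only.

```python
import numpy as np

# Synthetic stand-in for the data of one image (random values, float32).
rng = np.random.RandomState(0)
image = rng.normal(loc=100.0, scale=5.0, size=(4, 6)).astype(np.float32)

# Ordinary Numpy functions work directly on the image array.
background = np.median(image)   # typical sky level
noise = np.std(image)           # pixel-to-pixel scatter

# Element-wise operations return a new array with the same shape.
stretched = np.sqrt(np.clip(image - background, 0, None))

print(background)
print(noise)
print(stretched.shape)
```

Scipy functions can be applied to the same array in exactly the same way.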

Images and image lists

The same instruction can be used to load the data of a single image, or the data of a list of images. As the script below shows, the image data returned by the loading functions is stored in a 2-dimensional or a 3-dimensional array:

# -*- coding: iso_8859_1 -*-
import numpy as np
from aspylib import astro

#------ image names ------
Folder = u"C:\\Images\\"
Images = [Folder + "Images-" + str(i+1) + ".fit" for i in range(5)]

#--- loads data of first image in the list, and prints size of data array to console ---
data0 = astro.get_imagedata([Images[0]])
print np.shape(data0)

#--- loads data of the full image list, and prints size of data array to console ---
data = astro.get_imagedata(Images)
print np.shape(data)

raw_input()

Here is the text that is displayed in the console when executing the script:

--- loading FITS image ---
Images-1.fit
(1266, 1676)

--- loading FITS image ---
Images-1.fit
Images-2.fit
Images-3.fit
Images-4.fit
Images-5.fit
(5, 1266, 1676)

In AsPyLib, most functions work exactly the same whether their input parameter is the 3D array that corresponds to the data of an image list, or the 2D array that corresponds to the data of a single image. We simply do not have to pay attention to this.
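One way such behaviour can be obtained is sketched below. The function subtract_median is hypothetical and is not part of AsPyLib; it only illustrates that a function which always works on the last two axes of its input handles 2D and 3D arrays alike.

```python
import numpy as np

def subtract_median(data):
    """Hypothetical helper (not an AsPyLib function): subtracts from each
    image its own median level, for a 2D image or a 3D image list."""
    data = np.asarray(data, dtype=np.float32)
    # The last two axes are always the image axes, whether or not
    # an image-number axis is present in front of them.
    level = np.median(data, axis=(-2, -1), keepdims=True)
    return data - level

single = np.ones((4, 6), dtype=np.float32)    # one synthetic image
stack = np.ones((5, 4, 6), dtype=np.float32)  # synthetic list of 5 images

print(subtract_median(single).shape)  # (4, 6)
print(subtract_median(stack).shape)   # (5, 4, 6)
```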

It may also happen that a 3D array contains a single image. For instance, if the following code is added to the script:

#--- extracts the first image as a 3D sub-array, and prints size of data array to console ---
data1 = data[0:1,:,:]
print np.shape(data1)

Then this text is displayed:

(1, 1266, 1676)

A 3D array with a single image is of course also perfectly accepted by the AsPyLib functions. It is not required, but the extra dimension (of length 1) can always be removed with the functions available in the Numpy module, for instance by adding the following code:

#--- reshape the image ---
import numpy as np
data1 = np.squeeze(data1)
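The difference between slicing, integer indexing and np.squeeze can be checked on a small synthetic array. This is a sketch using only Numpy; the array sizes and variable names are arbitrary.

```python
import numpy as np

data = np.zeros((5, 4, 6), dtype=np.float32)  # synthetic image list

kept = data[0:1, :, :]       # slice: the image-number axis is kept
dropped = data[0, :, :]      # integer index: the axis is dropped
squeezed = np.squeeze(kept)  # removes every axis of length 1
restored = squeezed[np.newaxis, :, :]  # puts the image-number axis back

print(kept.shape)      # (1, 4, 6)
print(dropped.shape)   # (4, 6)
print(squeezed.shape)  # (4, 6)
print(restored.shape)  # (1, 4, 6)
```

Since np.squeeze removes all axes of length 1, np.squeeze(kept, axis=0) is the safer form if an image could itself be a single row or a single column.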

Accessing image pixels (indexing) or subimages (slicing)

With Numpy, the first dimension of an image is vertical. This may seem unusual, but it is a convention that comes from well-established Python modules (Numpy, Pyfits) and we deliberately chose to keep it in AsPyLib. The three dimensions of the Numpy array of an image list are thus as follows:

(image number, vertical dimension, horizontal dimension)

This organisation is actually quite logical: in the shape of a Numpy array, the innermost dimension is written on the right, and each additional dimension is added on the left.
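The way dimensions are added on the left can be seen by building an array step by step. This is a sketch with Numpy only; the sizes are arbitrary and much smaller than a real image.

```python
import numpy as np

# One horizontal line of 6 pixels.
line = np.zeros(6, dtype=np.float32)
print(line.shape)        # (6,)

# Grouping 4 lines gives an image: the new dimension appears on the left.
image = np.array([line] * 4)
print(image.shape)       # (4, 6)

# Grouping 5 images gives an image list: again a new dimension on the left.
image_list = np.array([image] * 5)
print(image_list.shape)  # (5, 4, 6)
```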

When accessing elements in a Numpy array, the indices start at 0: for a 1D array with 4 elements, the indices are 0, 1, 2, 3. The technique used to select sub-arrays (called slicing) may be confusing at first sight, because the start index is included while the stop index is excluded. For instance, the columns 11 to 20 (indices 10 to 19 along the second axis of the array) of every image in an image list are obtained with:

col_10_to_20 = images[:,10:20,:]

The indices 10 and 20 mean the following:

  • you pick up the first 20 columns from the full image
  • you remove the first 10 columns from this selection

In much the same way, the last 4 columns, minus the last one, are obtained with:

col_4last_except1 = images[:,-4:-1,:]

which means:

  • you pick up the last 4 columns from the full image
  • you remove the last column from this selection

If you need only the last column:

col_last = images[:,-1:,:]

If you need all columns except the last one:

col_all_exceptlast = images[:,:-1,:]

etc, etc.
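The slices above can be verified on a small synthetic array in which every pixel value equals its index along the second axis. This is a sketch using only Numpy; the array sizes are arbitrary.

```python
import numpy as np

# Synthetic list of 2 images; each pixel holds its index along the second axis.
index = np.arange(30, dtype=np.float32)
images = np.tile(index[np.newaxis, :, np.newaxis], (2, 1, 3))
print(images.shape)                  # (2, 30, 3)

col_10_to_20 = images[:, 10:20, :]   # indices 10 to 19
print(col_10_to_20[0, :, 0])

col_4last_except1 = images[:, -4:-1, :]  # indices 26, 27, 28
print(col_4last_except1[0, :, 0])

col_last = images[:, -1:, :]         # index 29 only, axis kept
print(col_last.shape)                # (2, 1, 3)

col_all_exceptlast = images[:, :-1, :]   # indices 0 to 28
print(col_all_exceptlast.shape)      # (2, 29, 3)
```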

Naming convention: what are the X and Y directions?

Due to the internal structure of the standard Python modules (Numpy, Pyfits, etc.), the image dimensions are as follows:

(image number, vertical dimension, horizontal dimension)

In AsPyLib 1.1.1 we call the vertical dimension X, and the horizontal dimension Y. Then the image dimensions are as follows:

(image number, X dimension, Y dimension)

and the programming is very easy: X is always before Y.

We think this convention is the right choice because, when writing scripts, one does not have to switch between “Y before X” for manipulating image arrays and “X before Y” for the user interface. Choosing a vertical X axis gives the same ordering of X and Y at both levels: the programming level and the user level.
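In array terms, the convention can be sketched as follows. The example uses only Numpy, with a synthetic image and illustrative variable names: positions read from the array come out directly in (X, Y) order.

```python
import numpy as np

# Synthetic image: 4 pixels along X (vertical), 6 pixels along Y (horizontal).
image = np.zeros((4, 6), dtype=np.float32)
image[1, 5] = 100.0   # bright pixel at array indices X = 1, Y = 5

# The image size is read in the same order: X first, then Y.
x_size, y_size = image.shape

# Position of the brightest pixel, returned as (X, Y) array indices.
x, y = np.unravel_index(np.argmax(image), image.shape)

print((x_size, y_size))  # (4, 6)
print((int(x), int(y)))  # (1, 5)
```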

Depending on user feedback, the convention may be changed in the future, though this is not being considered at the moment.