2D/3D image registration with C++ library ITK

Comparison between mineral heterogeneity from a iliac creat bone sample recorded using (a) micro-radiography and (b) micro-tomography

The objective of this project was to assess if a bench micro-CT (Skyscan 1174) could give a good estimation of the heterogeneity of bone mineral density.

Microradiography is one of the gold standard methods to measure bone mineral density at the tissue level. Thin section of bone tissue with calibrated thickness are radiographed at high resolution using a monochromatic X-ray source. The density of the tissue is measure thanks to  calibrated standards.

To compare the two acquisition modalities, the mineral density of a cylindrical bone sample was first recorded using microtomography providing a three-dimensional image of the sample. A thin section was then extracted from the same sample for microradiography measurement giving a two dimensional evaluation of the mineral density of bone sample.

Finally, 2D/3D image registration was used to retrieve the two-dimension image from the micro-radiography measurement inside the three-dimension micro-tomography image to compare the two acquisition methods. The registration have been carried out using the segmentation and registration toolkit ITK implemented in an homemade C++ program.


Microradiography measurement of a trabecular bone sample next to an aluminum standard used for the calibration of bone mineral density

Contact microradiography of 100 µm-thick bone sections was performed to assess the degree of mineralization of bone (DMB) using an X-ray diffraction unit PW 1830/40 (Philips, Limeil Brévannes, France). The nickel-filtered copper Kα radiation was used under 25 kV and 25 mA. Both bone sample and an aluminum standard X-ray absorption was recorded on a high-resolution film exposed for 20 min (Geola, Slavich International Wholesale Office, Vilnius, Lithuania). Acquisitions of microradiographs were performed with a digital camera (image pixel size: 5.64 µm). After calibration using the aluminum standard, the mean gray level of BSUs were converted into degree of mineralization values (in g mineral/cm3).

Microtomography measurements

(a) Iliac crest biopsy used for the experiment. (b) X-ray absorption before reconstruction of the 3D image. (c) axial and (d) sagittal view of the 3D reconstruction

First, bone samples mineral density have been measured using a high resolution CT-scanner (Skyscan 1174). This laboratory scanner uses a conical X-ray


source that is recorded on a CCD camera combined with an X-ray scintillator. The CCD camera has a resolution of 1024×1024 pixels. The scanner can achieve a final voxel size between 6 and 30 μm  for a maximal resolution of 12 μm (10% of the modulation transfer function or MTF).

In this study, the image acquisition voxel size has been assigned to 8 μm which proved to be the best compromised between image resolution and image quality (less imaging artifacts).

A complete calibration of the sys

tem has been performed before the study allowing for a numerical correction of sample rotation alignment, a calibration of image voxel size, an adjustment of the camera focus point and , a flat-field correction to minimize camera-scintillator system inter pixel sensitivity.  The micro-scanner source was setup  to 50kV and 800 μA. Image acquisition have been performed on 180 degrees with a 0.4 degree step size with an averaging of two image per step. Before reconstruction, images have been smoothed with a gaussian filter (σ = 1, k = 2) to minimize high frequency noise and an ring artifact correction has been applied.

Image registration

To make a comparison between the two modality, it is necessary to compare the measurement of the same parts of the sample. Therefore, we used a 2D/3D registration algorithm to find the 2D microtomography section into the 3D volume obtained by microtomography. Once the 2D section has been found in the 3D volume, a comparison of the two modalities could be achieved.

Image registration principle

Image registration consist in putting to image in relation to compare them of combine their information. It therefore consist in finding a geometrical transformation which allow to transform one image into the other. Image can be from different modalities, CT-scan and MRI for example.

Image registration is performed in several steps as illustrated in the following image:

The reference image f(X) is fixed. The second image, or moving image m(X), will be subjected to the transformations. Image registration is an optimization process aiming at finding the geometric transform T(X) that will allow to align the moving image to the reference image. In mathematical language the registration problem can be written as follow:


  • f and m, the fixed and moving images,
  • T, the Transform,
  • E, the ensemble of transforms in which T belongs to,
  • S, a similarity criteria,
  • min, the minimum function which represent the optimization criteria.

The transform can be linear, which includes rigid transforms (rotation, translation), similarity, affine and projection. Transform can also be non-linear (finite element type registration). Increasing registration parameters and image dimensions will lead to more complex and time consuming registrations, and lower chance of success.

Images are generally defined in a discrete space and divided into pixels (or voxels in 3D). To minimize the problem due to different image pixel size, images are converted into a continuous physical space.

The metric is comparing the intensity values between the two images after the transform has been applied. When a point is converted from one space to the other, this point will most likely not correspond to a point of the second image.

The intensity at the coincidence point is computed by interpolation. The interpolation method can impact the results. Some common interpolation algorithm are the nearest neighbor, the linear interpolation or the B-spline interpolation.

The metric objective is to evaluate quantitatively the similarity between the moving image and the fixed image after transformation by comparing their intensities. It is a crucial element of the registration process and it should be chosen with care depending on the type of problem to solve (image type and similarity criteria (contours or grey level).

The optimizer determine the parameters of the transform that allow an optimization of the metric. The optimization algorithm depend on the type of transform to optimize and the criteria given by the metric.

Choice of registration components

All algorithms are part of the C++ library, The Insight Toolkit (ITK) which is dedicated to image segmentation and registration. This library is principally dedicated to the development of software for the processing and analysis of medical images.

Among the transforms proposed in ITK, we used the transforms named VersorRigid3DTransform et Similarity3DTransform . The former translate into a rigid body transform in a 3D space and can be defined by a rotation expressed by a versor (3 paramerters) followed by a translation (3 parameters) necessitating  the optimization of 6 parameters. The second transform add a scaling factor which bring the total of parameters to optimize to 7.

For the interpolator algorithm, we used a simple linear interpolation algorithm (LinearInterpolateImageFunction).

Despite both image acquisition method being based on X-ray absorption, the two modalities are significantly different. Even if we anticipate that both systems image intensities are related to bone mineral density, the relationship between the gray levels and the mineral density varies from one system to the other. We therefore assumed that the images contained different intensities information and used the multimodal metric named  MattesMutulaInformationImageToImageMetric. Mutlimodal metrics quantify mutual information between image A and B by measuring a relationship between a random variable (in term of intensity) from A with a random variable from B. Since the relationship between the two images does not need to be specified, images from completely different acquisition systems can be compared (ex. CT-scanner and MRI) as long a there is a similarity between the two images. The mutual information is defined in term of enthropy.

We used a classic optimizer using a step gradient algorithm named RegularStepGradientDescent. This algorithm evolve step by step in the direction opposed to the gradient. This is based on the observation that, if the function to optimize F(X) can be derived in a point a, then F(X) decrease more rapidly in the direction opposite to F gradient in a. By defining b as:

with γ>0, the algorithm step, we have F(a) ≥F(b). By initializing the algorithm to x0, the first approximation of F local minimum:

The series (xn) converge towards the local minimum researched as illustrated in the following figure:

Evolution of the step gradient algorithm on a bowl function.

On top of all the classical registration elements, we used a multi-resolution approach which allow to improve the accuracy and reliability of the results while reducing computing times. It consist in performing a first registration at low image resolution and use the results to initialize a registration at higher resolution. This allow to increase success chances by removing local minimum at lower resolutions.


#include "itkImage.h"
#include "itkImageFileReader.h"
#include "itkImageFileWriter.h"
#include "itkResampleImageFilter.h"
#include "itkCastImageFilter.h"

Followed by the headers dedicated to image registration

#include "itkSimilarity3DTransform.h"
#include "itkLinearInterpolateImageFunction.h"
#include "itkNormalizedCorrelationImageToImageMetric.h"
#include "itkRegularStepGradientDescentOptimizer.h"

The component for multi-resolution analysis are also necessary. The filter RecursiveMultiResolutionPyramidImageFilter is used to defined pyramidal analysis parameters.

#include "itkMultiResolutionImageRegistrationMethod.h"
#include "itkRecursiveMultiResolutionPyramidImageFilter.h"

And finally, math function that are always useful:

#include "math.h"

The code describing how to follow the evolution of the registration process (function itkCommand.h) is not described here.  More information can be found in the ITK documentation (www.ITK.org).

With the transform function used here, the transform needs to be described by a versor. The image processing software we used to get the initial solution of the registration give a transform using Euler’s angles. We therefore implemented a conversion module form Euler’s angles to a versor.

const double degreesToRadians = vcl_atan(1.0) / 45.0;
double c1 = cos(rx*degreesToRadians/2);
double s1 = sin(rx*degreesToRadians/2);
double c2 = cos(ry*degreesToRadians/2);
double s2 = sin(ry*degreesToRadians/2);
double c3 = cos(rz*degreesToRadians/2);
double s3 = sin(rz*degreesToRadians/2);

aw = c1*c2*c3 – s1*s2*s3;
ax = c1*c2*s3 + s1*s2*c3;
ay = s1*c2*c3 + c1*s2*s3;
az = c1*s2*c3 – s1*c2*s3;
angle = 2 * acos(aw);
double norm = ax*ax+ay*ay+az*az;

If all Euler angles are equal to zero, the versor angle is also equal to zero. In that case, the direction vector is set to an arbitrary value to avoid division by zero.

if (norm < 0.001)
ax = 1;
ay = az = 0;
norm = sqrt(norm);
ax /= norm;
ay /= norm;
az /= norm;

Then , the images are created. The micro-radiography is used as the reference image and is kept fixed during the registration process. In ITK, the registration must be done in a fixed dimension space. Therefore, it is not possible to perform a 2D/3D registration as is. To do so, the 2D image has to be converted in the 3D space by using a third unit dimension. Both initial images are defined as 8 bits grey level images:

const unsigned int Dimension = 3;
typedef unsigned short PixelType;
typedef itk::Image< PixelType, Dimension > FixedImageType;
typedef itk::Image< PixelType, Dimension > MovingImageType;

To improve registration precision, the images are converted into 32 bit during the transformations:

typedef float InternalPixelType;
typedef itk::Image < InternalPixelType, Dimension > InternalImageType;

The registration component are then declareded:

typedef itk::Similarity3DTransform< double > TransformType;
typedef itk::LinearInterpolateImageFunction< InternalImageType, double > InterpolatorType;
typedef itk::NormalizedCorrelationImageToImageMetric< InternalImageType, InternalImageType > MetricType;
typedef itk::RegularStepGradientDescentOptimizer OptimizerType;

Followed by the multi-resolution components:

typedef itk::MultiResolutionImageRegistrationMethod< InternalImageType, InternalImageType > RegistrationType;
typedef itk::RecursiveMultiResolutionPyramidImageFilter< InternalImageType, InternalImageType > FixedImagePyramidType;
typedef itk::RecursiveMultiResolutionPyramidImageFilter< InternalImageType, InternalImageType > MovingImagePyramidType;

Each component are then initialized:

TransformType::Pointer transform = TransformType::New();
OptimizerType::Pointer optimizer = OptimizerType::New();
InterpolatorType::Pointer interpolator = InterpolatorType::New();
RegistrationType::Pointer registration = RegistrationType::New();
MetricType::Pointer metric = MetricType::New();
FixedImagePyramidType::Pointer fixedImagePyramid =
MovingImagePyramidType::Pointer movingImagePyramid =

The components are connected to the registration method:

registration->SetTransform( transform );
registration->SetInterpolator( interpolator );
registration->SetMetric( metric );
registration->SetOptimizer( optimizer );
registration->SetFixedImagePyramid( fixedImagePyramid );
registration->SetMovingImagePyramid( movingImagePyramid );

To be able to read and writw images, the classes ImageFileReader and ImageFileWriter are used.

typedef itk::ImageFileReader< FixedImageType > FixedImageReaderType;
typedef itk::ImageFileReader< MovingImageType > MovingImageReaderType;
FixedImageReaderType::Pointer fixedImageReader = FixedImageReaderType::New();
MovingImageReaderType::Pointer movingImageReader = MovingImageReaderType::New();
fixedImageReader->SetFileName( fixedImageName );
movingImageReader->SetFileName( movingImageName );

The function CastImageFilter is used to convert input images to 32 bits:

typedef itk::CastImageFilter< FixedImageType, InternalImageType > FixedCastFilterType;
typedef itk::CastImageFilter< MovingImageType, InternalImageType > MovingCastFilterType;
FixedCastFilterType::Pointer fixedCaster = FixedCastFilterType::New();
MovingCastFilterType::Pointer movingCaster = MovingCastFilterType::New();
fixedCaster->SetInput( fixedImageReader->GetOutput() );
movingCaster->SetInput( movingImageReader->GetOutput() );

The 32 bits images are then used as input to the registration function:

registration->SetFixedImage( fixedCaster->GetOutput() );
registration->SetMovingImage( movingCaster->GetOutput() );
fixedCaster->GetOutput()->GetBufferedRegion() );

The image pixel spacing is important so that images are represented at their original scale. The pixel spacing is determined during image acquisition, after proper calibration:

FixedImageType::Pointer fixedImage = fixedImageReader->GetOutput();
FixedImageType::SpacingType fixedSpacing;
fixedSpacing[0] = 2.85; // Spacing along X
fixedSpacing[1] = 2.85; // Spacing along Y
fixedSpacing[2] = 2.85; // Spacing along Z
MovingImageType::Pointer movingImage = movingImageReader->GetOutput();
MovingImageType::SpacingType movingSpacing;
movingSpacing[0] = 8.07; // Spacing along X
movingSpacing[1] = 8.07; // Spacing along Y
movingSpacing[2] = 8.07; // Spacing along Z

By default, to create multi-resolution pyramids, the resloution of the pyramid at the level n-1 is reached by dividing by 2 the number of pixel in each dimension and by doubing the pixel spacing. Therfore for in image with 1024 x 1024 x 800 voxels with a 8.07 μm, at the lower level, the image size would be 512 x 512 x 400 voxels with a pixel spacing of 16.14 μm. We used the automatic method for the 3D microtomography image. However, for the 2D image, the third added dimension has remain fixed to 1 pixel.


Level        Pyramid 2D     Pyramid 3D
Level 1         [4 4 1]                [4 4 4]
Level 2         [2 2 1]                [2 2 2]
Level 3         [1 1 1]                [1 1 1]


The number of levels in the pyramid is defined by the parameter nScales. The following code is used to define the sub-sampling of the 2D image:

typedef FixedImagePyramidType::ScheduleType FixedScheduleType;
FixedScheduleType fixedSchedule = fixedImagePyramid->GetSchedule();
for ( unsigned int dim = 0; dim < FixedImageType::ImageDimension; dim++ )
fixedSchedule[ nScales-1 ][ dim ] = maxScale;
for ( int level=nScales-2; level >= 0; level-- )
for ( unsigned int dim = 0; dim < FixedImageType::ImageDimension; dim++ )
fixedSchedule[ level ][ dim ] = 2*fixedSchedule[ level+1 ][ dim ];

The schedule is then applied to the 2D pyramid

typedef FixedImageRegionType::IndexType FixedIndexType;
FixedIndexType inputStart2D = fixedRegion.GetIndex();
for ( unsigned int level=0; level < nScales; level++ )
FixedImageSizeType size;
FixedIndexType start;
sampledRes2D[ level ] = 0.;
for ( unsigned int dim = 0; dim < FixedImageType::ImageDimension; dim++ )
float scaleFactor = static_cast<float>( fixedSchedule[ level ][ dim ] );
size[ dim ] = static_cast<FixedImageSizeType::SizeValueType>(
floor( static_cast<float>( fixedSize[ dim ] ) / scaleFactor ) );
if( size[ dim ] < 1 )
size[ dim ] = 1;
fixedSchedule[ level ][ dim ] = 1;
std::cout << level << " " << dim << " "
<< size[ dim ] << " " << fixedSchedule[ level ][ dim ]
<< std::endl;
scaleFactor = static_cast<float>( fixedSchedule[ level ][ dim ] );
start[ dim ] = static_cast<FixedIndexType::IndexValueType>(
ceil( static_cast<float>( inputStart2D[ dim ] ) / scaleFactor ) );
if ( dim < 2 )
sampledRes2D[ level ] += scaleFactor*fixedResolution[ dim ]
*scaleFactor*fixedResolution[ dim ];
sampledRes2D[ level ] = sqrt( sampledRes2D[ level ] );
fixedImageRegionPyramid[ level ].SetSize( size );
fixedImageRegionPyramid[ level ].SetIndex( start );
fixedImagePyramid->SetSchedule( fixedSchedule );

For the 3D pyramid, the function SetNumberOfLevels compute the schedule automatically:

movingImagePyramid->SetNumberOfLevels( nScales );

To initialize the transform, it is necessary to define the center of rotation of the two images. The center is set as the geometric center

double movingOrigin[ Dimension ];

const itk::Vector<double, 3> movingResolution =

typedef MovingImageType::RegionType MovingImageRegionType;
typedef MovingImageRegionType::SizeType MovingImageSizeType;

MovingImageRegionType movingRegion =
MovingImageSizeType movingSize = movingRegion.GetSize();

movingOrigin[0] = movingResolution[0]*((double) movingSize[0])/2.;
movingOrigin[1] = movingResolution[1]*((double) movingSize[1])/2.;
movingOrigin[2] = movingResolution[2]*((double) movingSize[2])/2.;

TransformType::InputPointType center;

center[0] = cx + movingOrigin[0];
center[1] = cy + movingOrigin[1];
center[2] = cz + movingOrigin[2];


double fixedOrigin[ Dimension ];

const itk::Vector<double, 3> fixedResolution =

typedef FixedImageType::RegionType FixedImageRegionType;
typedef FixedImageRegionType::SizeType FixedImageSizeType;

FixedImageRegionType fixedRegion =
FixedImageSizeType fixedSize = fixedRegion.GetSize();
fixedOrigin[0] = movingOrigin[0]
– fixedResolution[0]*((double) fixedSize[0])/2.;
fixedOrigin[1] = movingOrigin[1]
– fixedResolution[1]*((double) fixedSize[1])/2.;
fixedOrigin[2] = movingOrigin[2];

fixedImageReader->GetOutput()->SetOrigin( fixedOrigin );

The rotation is defined by a versor with a unit vector [ax ay az] and a rotation angle angle. The stranslation is defined by [tx ty tx] and the scaling factor by the parameter scale.

An initial guess can be used as the starting poing of the registration.

typedef TransformType::VersorType VersorType;
typedef VersorType::VectorType VectorType;

VersorType rotation;
VectorType axis;

axis[0] = ax;
axis[1] = ay;
axis[2] = az;

rotation.Set( axis, angle );

typedef TransformType::OutputVectorType VectorType;

VectorType translation;

translation[0] =tx;
translation[1] =ty;
translation[2] =tz;

The parameters are given to the registration process:

transform->SetRotation( rotation );


The rotation and translation parameters do not evolve with the same dynamic. The rotation varies in the range [-1, 1] while the translation can typically vary between 0 and 100. This dynamic difference can lower significantly the performances of the optimizer. To compensate this, it is possible to give some parameter scaling factors to the optimizer in order to normalize the parameters before the optimization. Here we applied a factor 1.0 to the rotation and scaling parameters and a factor 1/1000 to the translation parameters.

typedef OptimizerType::ScalesType OptimizerScalesType;

OptimizerScalesType optimizerScales( transform->GetNumberOfParameters() );

const double rotationWeight = 1.0;
const double translationWeight = 1.0 / 1000.0;
const double scalingWeight = 1.0;

optimizerScales[0] = rotationScale;
optimizerScales[1] = rotationScale;
optimizerScales[2] = rotationScale;
optimizerScales[3] = translationScale;
optimizerScales[4] = translationScale;
optimizerScales[5] = translationScale;
optimizerScales[6] = scalingWeight;

optimizer->SetScales( optimizerScales );

The optimizer maximum iteration number, a relaxation factor, and a maximum and minimum step. These values have been set to 20’000, 0.9, 1.0 and 0.01 respectively.

optimizer->SetNumberOfIterations( 20000 );
optimizer->SetRelaxationFactor( 0.9 );

The number of level in the pyramid is given to the registration pocess:

registration->SetNumberOfLevels( nScales );

The registration has been statred inside a try loop:

catch( itk::ExceptionObject & err )
std::cout << "ExceptionObject caught !" << std::endl;
std::cout << err << std::endl;

The registration results is given is a table containing all the optimized transform paramters.

typedef RegistrationType::ParametersType ParametersType;
ParametersType finalParameters = registration->GetLastTransformParameters();

const double versorX = finalParameters[0];
const double versorY = finalParameters[1];
const double versorZ = finalParameters[2];
const double TranslationX = finalParameters[3];
const double TranslationY = finalParameters[4];
const double TranslationZ = finalParameters[5];
const double ScaleFactor = finalParameters[6];


The following images show the final images after the registration process. To get comparable images, we took the average of 12 images adjacent to the image found after registration. This allow us to simulate a 97 μm thick microradiography (12 images with a thickness of 1 voxel measuring 8.07 μm).  The comparison has been performed in ImageJ (U.S. National Institutes of Health, Bethesda, Maryland, USA,
http://imagej.nih.gov/ij). Image extracted from the 3D volume have been sesgmented to remove the background and eroded by 1 pixel to remove the surface artifacts. The contrast was normalized for both images to minimize differences.



3 thoughts on “2D/3D image registration with C++ library ITK

  1. You actually make it seem so easy with your presentation but I find
    this matter to be really something which I believe I might by no means understand.
    It kind of feels too complex and very huge for me.

    I’m taking a look ahead on your subsequent submit, I will
    attempt to get the grasp of it!

    1. You are right, image registration is not an easy task. A registration instance require several components that are influencing each other and the choice of each of these component is crucial in getting meaningful results. However tools like ITK make this work relatively easy. The documentation is excellent and you shouldn’t have too much difficulty in choosing the right components. Another important thing to look at is the quality of the images and how similar the information is between the two images you are trying to match. To give you a rough idea, if you cannot distinguish similar features in the two modalities, chances are that the registration process will be difficult.

  2. Hi, I read about your work and It is so interesting. I work in the same subject but I have some questions about 2D/3D registration in your optimizer ITK recommend used Optimizer with versor, in this case, VersorRigid3DTransformOptimizer and the first thing that API said is
    “Versors are not in a vector space, for that reason, the classical gradient descent algorithm has to be modified in order to be applicable to Versors (unit quaternions)… ” but you used itkRegularStepGradientDescentOptimizer I used Fletcher-Reeves-Polak-Ribiere Optimizer So What is the difference If you would use the VersorRigid3DTransformOptimizer.

Leave a Reply

Your email address will not be published.