From PanoTools.org Wiki
Revision as of 00:03, 25 April 2008 by Bruno (talk | contribs) (Really needs a rewrite)
Jump to: navigation, search


MatchPoint is a next generation CP generator. The result of a GSoC2007 project SoC2007_project_Feature_Descriptor. Currently it offers only keypoint detection but not matching (matching was part of other GSoC 07 project that was not carried out) and can currently be used only as a replacement for generatekeys from autopano-sift or autopano-sift-C.

A current version of matchpoint can be obtained via the hugin project.

Goal of this MatchPoint is to create a complete control point suite that will be used by hugin as a replacement(or at least an alternative) to existing autopano suites.

If you want to use MatchPoint in the process of creating panoramic images now(when not all features are implemented) you will have to use autopano-sift also.

// first extract features from the first image and output them to the image1.key:
MatchPoint image1.jpg image1.key

// for second image:
MatchPoint image2.jpg image2.key

// match features from the two generated files using autopano:
autopano project.pto image1.key image2.key

// open the project file in hugin:
hugin project.pto

Command line usage


MatchPoint [options] image1.jpg output.key


verbose output
generate keypoint file for matlab test suite(file name is generated using formula: image1.jpg.key)


Path to image to be analyzed.
Output keypoint file.

Algorithm description

Integral images

Before detection process images are integrated. Each element-pixel (x,y) of integral image represents sum of pixels from (0,0) to (x,y) on initial image. This makes calculating sum of a region much faster. In addition, convolution at any scale has equal time complexity. This part is neccessary only when using box filters for detection.

Feature detection

Points are detected with Hessian Detector using box filters. Here is a description of detection process over time. This may not be entirely compatible flow with code's flow, because some parts are simultaneous(e.q. first two steps).

Convolution of pixels
Convolution of a pixel at certain scale is carried out with approximation of Gauss 2D function - this is called box filter detection. Each kernel has 4 convolution filters(3 of them are unique - xy and yx filters are the same). The resulting value is then the determinant of Hessian matrix where elements represent convolution values of 4 filters.

Kernel sizes for convolution are:
9,15,21,27, (1st octave)
15,27,39,51, (2nd octave)
21,45,69,93 (3rd octave)

MatchPoint features also offers two ways of convolution:

  • box filter: faster and preferable way
  • sliding window(implemented for convenience but needs debug): slower, more accurate but also more sensitive to noise

Finding maxima
In order to achieve invariance to scaling, detection needs to find maxima of Hessian determinant across many scales. To handle this, octaves were introduced. Octaves are interpolation of determinants at various scales(usually two scales). MatchPoint offers detection at max 3 octaves(by setting a parameter). E.q. at first octave a point can be detected at scale 1.6(=((9+15/2)*1.2)/9 where 1.2 is initial scale and 9 is initial kernel size) or 3.2(=((21+27/2)*1.2)/9. Keypoint's main scale is then selected according to the highest value of Hessian determinant.

Selecting points

Next step is to select points with high values of their determinants(regardless of scale where points where detected) that represent invariant keypoints that will be used further processing. This is achieved using a fixed threshold which should be set low(because otherwise it may happen that no points would be detected).

Then non-maxima suppression is applied(only points with local maxima of determinants are selected).

At this point we have a list of points that can vary in length, because threshold from previous step is hardcoded. This can result in over 200.000 points for larger images and that is too much(regardless of image size we need the same amount of control points for all images - min 3 control point/overlapping images), so we need to strip down the list below 10.000 points (this number is set intuitively and it is based on consumed time). (Note: this work is progress).

Feature description

Orientation assignment
Scale invariance is achieved by assigning main orientation of the interest point using special algorithm proposed by Herbert Bay. Efficiency of this algorithm has not been yet tested, therefore the executable of MatchPoint does not use any orientation assignment.

Shape Context descriptors
To each interest point a 36 element descriptor is assigned. Elements of this descriptors are organized according to Shape Context descriptor proposed by S. Belongie, J. Malik and J. Puzicha and adapted by K. Mikolajczyk for the purpose of finding control points.

Steps in creating descriptors:

  1. detect edges in the region around the interest point(region size depends on the scale at which point was detected). MatchPoint uses vigra's API(Canny edge detection) for this.
  2. based on relative location of edge elements create a histogram with 36 values. Use log-polar values. Edge point contribution to the histogram is based on it's magnitude and orientation.

See reference papers for details.