Historical:MatchPoint
== Introduction ==

MatchPoint is a next-generation control point (CP) generator, the result of the GSoC 2007 project [[SoC2007_project_Feature_Descriptor]]. It currently offers only keypoint detection, not matching (matching was part of another GSoC 2007 project that was not carried out), and can currently be used only as a replacement for generatekeys from [[autopano-sift]] or [[autopano-sift-C]].
  
A current version of MatchPoint can be obtained via the [[hugin]] project.

The goal of MatchPoint is to create a complete control point suite that will be used by [[hugin]] as a replacement for (or at least an alternative to) existing autopano suites.
  
 
If you want to use MatchPoint to create panoramic images now (while not all features are implemented), you will also have to use autopano-sift.
 
== Command line usage ==

Usage:

 MatchPoint [options] image1.jpg output.key
 
Parameters:
 
;-v: verbose output
;-t: generate a keypoint file for the MATLAB test suite (the file name is generated using the formula: image1.jpg.key)
 
Arguments:
 
;image1.jpg: Path to the image to be analyzed.
;output.key: Output keypoint file.

=== Example Workflow ===

 # first extract features from the first image and write them to image1.key:
 MatchPoint image1.jpg image1.key
 
 # for the second image:
 MatchPoint image2.jpg image2.key
 
 # match features from the two generated files using autopano:
 autopano project.pto image1.key image2.key
 
 # open the project file in Hugin:
 hugin project.pto
 
== Algorithm description ==

=== Integral images ===

Before the detection process, images are integrated. Each element-pixel (x,y) of the integral image represents the sum of the pixels from (0,0) to (x,y) in the initial image. This makes calculating the sum of a region much faster. In addition, convolution at any scale has equal time complexity. This part is necessary only when using box filters for detection.
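
As an illustration of the idea - a minimal NumPy sketch built only from the definitions above, not MatchPoint's actual (C++) code:

 import numpy as np
 
 def integral_image(img):
     """Entry (y, x) holds the sum of img[0:y+1, 0:x+1]."""
     return img.cumsum(axis=0).cumsum(axis=1)
 
 def region_sum(ii, top, left, bottom, right):
     """Sum of img[top:bottom+1, left:right+1] in constant time,
     by inclusion-exclusion on four integral-image entries."""
     total = ii[bottom, right]
     if top > 0:
         total -= ii[top - 1, right]
     if left > 0:
         total -= ii[bottom, left - 1]
     if top > 0 and left > 0:
         total += ii[top - 1, left - 1]
     return total
 
 img = np.arange(16.0).reshape(4, 4)
 ii = integral_image(img)
 assert region_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()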
  
 
=== Feature detection ===

Points are detected with a Hessian detector using box filters. Here is a description of the detection process over time. This may not correspond exactly to the code's flow, because some parts run simultaneously (e.g. the first two steps).
==== Convolution of Pixels ====

Convolution of a pixel at a certain scale is carried out with an approximation of the 2D Gaussian function - this is called box filter detection. Each kernel has 4 convolution filters (3 of them are unique - the xy and yx filters are the same). The resulting value is then the determinant of the Hessian matrix whose elements are the convolution values of the 4 filters.
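
For illustration, a hedged sketch of the determinant step: each of the three unique filter responses would be assembled from a handful of constant-time region_sum() calls (see the integral-image sketch above). The 0.9 weight on the mixed term is the value suggested in the SURF paper - an assumption here, not verified against MatchPoint's source:

 def hessian_determinant(Dxx, Dyy, Dxy, w=0.9):
     """Determinant of the approximated Hessian at one pixel and scale.
 
     Dxx, Dyy, Dxy are the box-filter responses; w compensates for
     the box approximation of the Gaussian (0.9 as in SURF - an
     assumption, not checked against MatchPoint itself)."""
     return Dxx * Dyy - (w * Dxy) ** 2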
  
Kernel sizes for convolution are:<br />
9, 15, 21, 27 (1st octave)<br />
15, 27, 39, 51 (2nd octave)<br />
21, 45, 69, 93 (3rd octave)
 
MatchPoint also offers two ways of convolution:

* box filter: the faster and preferred way
* sliding window (implemented for convenience but needs debugging): slower and more accurate, but also more sensitive to noise
==== Finding Maxima ====

In order to achieve invariance to scaling, detection needs to find the maxima of the Hessian determinant across many scales. To handle this, octaves were introduced. Octaves are interpolations of determinants at various scales (usually two scales). MatchPoint offers detection over at most 3 octaves (selected by a parameter). E.g. at the first octave a point can be detected at scale 1.6 (= ((9+15)/2 * 1.2)/9, where 1.2 is the initial scale and 9 is the initial kernel size) or 3.2 (= ((21+27)/2 * 1.2)/9). The keypoint's main scale is then selected according to the highest value of the Hessian determinant.
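
A small sketch of that scale arithmetic, using only the numbers given above (treating the interpolated scale of two adjacent kernel sizes as their mean - this reading is an assumption, not verified against MatchPoint's source):

 BASE_SCALE = 1.2  # scale assigned to the smallest (9x9) kernel
 BASE_SIZE = 9
 
 def interpolated_scale(size_a, size_b):
     """Scale of a response interpolated between two adjacent
     kernel sizes within one octave."""
     return (size_a + size_b) / 2 * BASE_SCALE / BASE_SIZE
 
 # first octave, kernel sizes 9, 15, 21, 27:
 assert abs(interpolated_scale(9, 15) - 1.6) < 1e-9   # as in the text
 assert abs(interpolated_scale(21, 27) - 3.2) < 1e-9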
  
==== Selecting Points ====

The next step is to select points with high determinant values (regardless of the scale at which the points were detected); these represent the invariant keypoints that will be used in further processing. This is achieved using a fixed threshold, which should be set low (because otherwise it may happen that no points are detected).

Then non-maxima suppression is applied (only points that are local maxima of the determinant are selected).

At this point we have a list of points that can vary in length, because the threshold from the previous step is hard-coded. This can result in over 200,000 points for larger images, and that is too much (regardless of image size we need the same number of control points for all images - a minimum of 3 control points per pair of overlapping images), so we need to strip the list down to below 10,000 points (this number was set intuitively and is based on the time consumed). (Note: this is work in progress.)
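
As an illustrative sketch of the thresholding and non-maxima suppression steps on a single-scale determinant map (MatchPoint also compares against neighbouring scales; this simplified single-scale version and all names in it are assumptions for clarity):

 import numpy as np
 
 def select_points(det, threshold):
     """Keep pixels whose determinant exceeds the threshold and is
     the unique maximum of their 3x3 neighbourhood."""
     h, w = det.shape
     points = []
     for y in range(1, h - 1):
         for x in range(1, w - 1):
             v = det[y, x]
             if v <= threshold:
                 continue
             patch = det[y - 1:y + 2, x - 1:x + 2]
             # strict local maximum: no neighbour matches or exceeds v
             if v >= patch.max() and (patch == v).sum() == 1:
                 points.append((x, y))
     return points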
  
 
=== Feature description ===

==== Orientation Assignment ====

Rotation invariance is achieved by assigning a main orientation to the interest point using a special algorithm proposed by Herbert Bay. The efficiency of this algorithm has not yet been tested, therefore the MatchPoint executable does not use any orientation assignment.
  
==== Shape Context descriptors ====

To each interest point a 36-element descriptor is assigned. The elements of this descriptor are organized according to the Shape Context descriptor proposed by [http://www.eecs.berkeley.edu/Research/Projects/CS/vision/shape/sc_digits.html S. Belongie, J. Malik and J. Puzicha] and adapted by [http://www.robots.ox.ac.uk/~vgg/research/affine/index.html K. Mikolajczyk] for the purpose of finding control points.

Steps in creating a descriptor:

# Detect edges in the region around the interest point (the region size depends on the scale at which the point was detected). MatchPoint uses vigra's API (Canny edge detection) for this.
# Based on the relative locations of the edge elements, create a histogram with 36 values, using log-polar coordinates. An edge point's contribution to the histogram is based on its magnitude and orientation (a minimal sketch follows the list).
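
A minimal sketch of such a 36-bin log-polar histogram (9 angular × 4 radial bins here; the bin layout, radii and magnitude-only weighting are this author's assumptions, not verified against MatchPoint's source, and the edge-orientation contribution is left out for brevity):

 import numpy as np
 
 def shape_context_histogram(edges, center, r_max,
                             n_angle=9, n_radius=4):
     """Accumulate edge elements into a 36-bin log-polar histogram.
 
     edges: iterable of (x, y, magnitude) edge elements
     center: (x, y) of the interest point
     r_max: radius of the described region (assumed > 1 px)
     """
     cx, cy = center
     hist = np.zeros(n_angle * n_radius)
     for x, y, mag in edges:
         dx, dy = x - cx, y - cy
         r = np.hypot(dx, dy)
         if r == 0 or r > r_max:
             continue
         a = (np.arctan2(dy, dx) + np.pi) / (2 * np.pi)  # 0..1
         angle_bin = min(int(a * n_angle), n_angle - 1)
         # log-polar radial binning: logarithmic spacing up to r_max
         rad_bin = min(int(np.log(r) / np.log(r_max) * n_radius),
                       n_radius - 1)
         rad_bin = max(rad_bin, 0)
         hist[angle_bin * n_radius + rad_bin] += mag
     return hist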
 
See reference papers for details.
  
== References ==
 
* [http://www.vision.ee.ethz.ch/~surf/ Speeded Up Robust Features - SURF]
* [http://www.robots.ox.ac.uk/~vgg/research/affine MATLAB suite for testing; papers on detection, descriptors and evaluation by K. Mikolajczyk]
* [http://www.eecs.berkeley.edu/Research/Projects/CS/vision/shape/sc_digits.html Shape Context descriptors]
 
[[Category:Community:Project]]
[[Category:Software:Platform:Windows]]
[[Category:Software:Platform:Linux]]
[[Category:Software:Platform:Mac OS X]]
[[Category:Software:Hugin]]
