Historical:SoC 2008 Masking in GUI
Revision as of 08:40, 12 May 2008

Masking in GUI project blog: http://maskingingui.blogspot.com/

Introduction

The objective of this project is to provide the user with an easy-to-use interface for quickly creating blending masks. After the images are aligned and shown in the preview window, users will have the option of creating blending masks. The current goal is to provide mask creation in the preview window: since it already shows the aligned images, it is the natural place for users to create appropriate masks.

[Image: Preview_window_toolbar.png]

Project Outline

Implementation will be done in two phases. In the first phase, the basic framework will be implemented. Users will be able to mark regions as either positive or negative: positive regions (alpha value 255) will be kept in the final result, while negative regions (alpha value 0) will be ignored. Based on the marked region, a polygonal outline of that region will be created. Since the outline may not always be accurate (e.g. at a low-contrast edge), a polygon editing option will be provided (the idea is from [2]). It should be noted that the actual mask representation is pixel-based; the polygonal outline only assists with creating the mask (this is easier to understand from the video in [2]).

Finally, when the user chooses to create the panorama, the masks are generated as output. These masks can be incorporated into the alpha channel of the resulting TIFF files using enblend-mask (from Panotools-Script, http://search.cpan.org/dist/Panotools-Script/) or a similar tool.
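To illustrate what that incorporation amounts to, here is a minimal sketch using NumPy (the function name and array layout are assumptions for illustration, not the Panotools-Script implementation):

```python
import numpy as np

def apply_mask_as_alpha(rgb, mask):
    """Attach a binary mask (255 = keep, 0 = ignore) as the alpha
    channel of an RGB image, in the spirit of what enblend-mask does
    for the intermediate TIFF files."""
    h, w, _ = rgb.shape
    assert mask.shape == (h, w), "mask must match the image size"
    alpha = np.where(mask > 0, 255, 0).astype(np.uint8)
    return np.dstack([rgb, alpha])  # shape (h, w, 4): RGBA

# toy 2x2 image: keep the left column, discard the right
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[255, 0], [255, 0]], dtype=np.uint8)
rgba = apply_mask_as_alpha(rgb, mask)
```

Enblend then treats pixels with alpha 0 as unavailable when choosing seam placement, which is how the user's negative regions are excluded from the blend.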

The editing features that will be available in this phase are:

  • Defining masks by drawing brushstrokes
  • Zooming in/out
  • Setting the brush stroke size (may not be necessary)
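The brushstroke interaction above can be sketched as painting circular dabs directly into the pixel-based mask (illustrative only; the function name and data layout are hypothetical):

```python
def paint_stroke(mask, cx, cy, radius, positive=True):
    """Paint one circular brush dab into a pixel-based mask.
    Positive strokes set alpha 255 (keep); negative set 0 (ignore).
    A full brushstroke is a series of dabs along the mouse path."""
    value = 255 if positive else 0
    h, w = len(mask), len(mask[0])
    for y in range(max(0, cy - radius), min(h, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(w, cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                mask[y][x] = value

# 5x5 mask, initially unmarked
mask = [[0] * 5 for _ in range(5)]
paint_stroke(mask, 2, 2, 1, positive=True)  # mark the centre as positive
```

The brush radius here is what the "brush stroke size" option would control.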

The second phase will focus on improving fine-tuning options and general usability. Support for 3D segmentation (2D spatial, with the sequence of images as the third dimension) will also be provided. This will be helpful if an object moves across the images and we want to exclude it from the final scene. A straightforward approach is to mask that object in every image, but this requires a lot of work; 3D segmentation will simplify it by automatically creating masks in successive images.

The features that will be provided in this stage are:

  • Undo/Redo
  • Support for fine-tuning boundaries
  • Automatically segmenting a sequence of images

Timeline

1. Before Start of Coding Phase:

  • Determine input data type, format and how the user will interact
  • Construct a preliminary design of the software
  • Outline of how the algorithm will work
  • Finalize the scope of the project
  • Start porting the existing implementation of image segmentation to wxWidgets and VIGRA (this will be used for testing the basic functionality)

2. Coding Phase:

2.1 Before Mid Term Evaluation

  • Stand-alone application for testing the basic framework

Implement a basic framework that can:

  • Take a set of aligned images of a particular format
  • Allow users to mark regions
  • Incorporate an algorithm to learn the color model from the user-defined area
  • Start implementing support for polygon-mode editing
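The color-model step can be illustrated with a single Gaussian fitted to the marked pixels (a simplified sketch using NumPy; all names are hypothetical, and methods like [1]/[2] typically use histograms or mixtures of Gaussians rather than one Gaussian):

```python
import numpy as np

def fit_color_model(pixels):
    """Fit one Gaussian to the RGB values of user-marked pixels.
    pixels: (n, 3) float array sampled from the user's strokes."""
    mean = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False) + 1e-6 * np.eye(3)  # regularise
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(pixel, model):
    """Squared Mahalanobis distance: low = pixel fits the model well.
    This kind of score is what feeds the data term of the energy."""
    mean, inv_cov = model
    d = np.asarray(pixel, dtype=float) - mean
    return float(d @ inv_cov @ d)

# positive strokes are reddish, negative strokes bluish
fg = fit_color_model(np.array([[200, 10, 10], [210, 20, 15], [190, 5, 20]], float))
bg = fit_color_model(np.array([[10, 10, 200], [20, 15, 210], [5, 20, 190]], float))
pixel = [205, 12, 14]  # unmarked pixel, clearly red
```

An unmarked pixel is then scored against both models, and the better fit biases the segmentation toward that label.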

2.2 After Mid Term Evaluation

  • Fix issues from the first phase (bugs, usability, etc.)
  • Perform 3D image segmentation
  • Implement a custom max-flow/min-cut algorithm (the one to be used initially is under a research-only license)
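A license-free starting point for that custom solver could be a textbook max-flow algorithm such as Edmonds-Karp; the sketch below is illustrative only, not the planned implementation (production graph-cut code would use a solver tuned for grid graphs):

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow: repeatedly push flow along the shortest
    augmenting path (found by BFS) until none remains.  By the
    max-flow/min-cut theorem the returned value equals the weight of
    the minimum cut, which is what graph-cut segmentation minimises.
    capacity: dict {u: {v: cap}} describing a directed graph."""
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():          # add reverse edges
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:   # BFS for an augmenting path
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        path, v = [], sink                    # walk back to the source
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:                     # update residual capacities
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# tiny graph: s -> a -> t and s -> b -> t, plus a cross edge a -> b
graph = {'s': {'a': 3, 'b': 2}, 'a': {'b': 1, 't': 2}, 'b': {'t': 3}}
```

In the segmentation setting, s and t are the foreground/background terminals, the other nodes are pixels, and the edges cut by the minimum cut define the mask boundary.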

Deliverables

The final deliverables will be:

  • An extensible framework for GUI-based mask editing.
  • An extensible interaction system that supports the primary forms of interaction.
  • A graph-cut library that can be used for implementing other graph-cut based techniques and for solving different problems (e.g. HDR deghosting by specifying the desired image to use for a certain region [3], or constraining control points by marking regions where control points should not be generated).

Design

Algorithm

The main algorithm to be used is outlined in [1]. There are other image segmentation techniques, such as SIOX, but one advantage of [1] is that it extends easily to N-D image segmentation, and its result can be iteratively improved in the way demonstrated in [2]. The underlying graph-cut optimization can also be reused to implement other features.
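Concretely, [1] poses segmentation as minimising an energy made of a data term (how well each pixel fits its label's color model) plus a smoothness term (a penalty for neighbouring pixels with different labels); graph cut finds the labeling with the globally minimal energy. A toy sketch on a 1-D "image" (all names and models here are illustrative):

```python
def segmentation_energy(pixels, labels, fg_cost, bg_cost, smoothness):
    """Energy of a candidate labeling in the style of [1]:
    E(L) = sum of per-pixel data costs under the assigned label
         + a smoothness penalty for each neighbouring pair of
           pixels that receive different labels."""
    data = sum(fg_cost(p) if l == 1 else bg_cost(p)
               for p, l in zip(pixels, labels))
    smooth = sum(smoothness for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth

# 1-D "image": dark pixels on the left, bright on the right
pixels = [10, 12, 11, 200, 205, 198]
fg = lambda p: abs(p - 200) / 100   # cheap when the pixel is bright
bg = lambda p: abs(p - 10) / 100    # cheap when the pixel is dark
coherent = [0, 0, 0, 1, 1, 1]       # follows the intensity edge
noisy    = [0, 1, 0, 1, 0, 1]       # fights both data and smoothness
```

The coherent labeling has lower energy than the noisy one, which is exactly the preference the min-cut solver encodes when it picks the optimal labeling.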

Architecture

Test Dataset

[test dataset contributed by the community]


Glossary of Terms

  • Graph Cut: Graph cut is an optimization technique. Problems in computer vision and image processing, such as image restoration and segmentation, are posed as optimization problems. In graph-cut optimization, these are represented as a min-cut problem, which is solved using a max-flow/min-cut algorithm. For some problems, like binary segmentation (i.e. segmenting into foreground and background), graph cut provides a globally optimal solution; for others (e.g. multi-label segmentation) it provides an approximate solution.
  • Image Segmentation: Image segmentation can be considered a labeling problem where different regions of an image are labeled differently. For instance, in binary segmentation the foreground and background objects are labeled as foreground and background respectively.
  • Multi-label Image Segmentation: In this kind of segmentation problem, multiple labels are assigned. For instance, different regions of an image can be labeled based on their content, e.g. people, trees, sky, water, etc.

References

[1] Yury Boykov and Marie-Pierre Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images," Proc. Eighth IEEE ICCV, vol. 1, pp. 105-112, 2001. webpage

[2] Yin Li, Jian Sun, Chi-Keung Tang and Heung-Yeung Shum, "Lazy Snapping," ACM Transactions on Graphics (Proceedings of SIGGRAPH), vol. 23, no. 3, April 2004. paper, video

[3] Aseem Agarwala, Mira Dontcheva, Maneesh Agrawala, Steven Drucker, Alex Colburn, Brian Curless, David Salesin, Michael Cohen, "Interactive Digital Photomontage," ACM Transactions on Graphics (Proceedings of SIGGRAPH), 2004. webpage