SoC2009 Seth Berrier
Simple Masking & Bracketed/HDR Exposure Stacks
Panoramas are, by their nature, vast, offering a wide view of the world; and the world, unfortunately, is full of things that move. These moving things (animals, well-meaning family members and friends, cars and airplanes) inevitably step into your frame at the most inopportune moments and show up in only one of the overlapping source images. For this reason above all else, masking out pixels is a vital step in the panoramic creation process. Much of the Hugin family of tools is already set up to support this, and only a small amount of extra editing is necessary after remapping the images and before blending. But it is not trivial, and it is high time this extra step was eliminated.
Simple tools that allow drawing into the image's alpha channel would suffice for masking out these artifacts. I propose extending the panoramic previewer to enable this type of editing. Registering each source image as a separate layer and allowing the user to mask out pixels from each individual layer would provide considerable flexibility for this and other simple photo editing tasks. A layer-based editing interface like this would eliminate the need to fire up Photoshop or GIMP and strong-arm the stitching process except in the most unusual of circumstances.
This also has implications for bracketed image stacks and HDR exposure sequences. The presence of an anomaly in a single exposure of a sequence will cause ghosting when the LDR sequence is combined to make an HDR image. While automatic techniques exist for removing this ghosting, it seems prudent to give the user tools to intervene when the automatic techniques fail. The simple masking interface proposed above seems ideally suited to this. By masking out the anomaly from the exposure in question, a user can simply remove the extraneous data, at the cost of a lower sampling of the dynamic range of the scene at that specific point.
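The idea above, that a masked-out pixel is simply excluded from the HDR merge rather than averaged in, can be sketched as follows. This is a minimal illustration, not Hugin's actual merging code: it assumes pre-linearized grayscale exposures and flat weighting, whereas real HDR assembly weights samples by the camera response curve.

```python
def merge_hdr(exposures, times, masks):
    """Combine aligned LDR exposures (linear grayscale values) into a
    single radiance estimate per pixel.  Each exposure carries a mask;
    masked-out pixels are excluded from the weighted average, which
    removes a ghost at the cost of sampling that pixel's dynamic
    range with fewer exposures."""
    height, width = len(exposures[0]), len(exposures[0][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, weight = 0.0, 0.0
            for img, t, mask in zip(exposures, times, masks):
                if not mask[y][x]:     # pixel masked out: skip it
                    continue
                w = 1.0                # flat weighting for brevity
                total += w * (img[y][x] / t)   # radiance ~ value / time
                weight += w
            out[y][x] = total / weight if weight else 0.0
    return out
```

Masking the anomalous exposure at a pixel leaves the remaining exposures to determine its radiance, exactly the trade-off described above.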
As such, I also propose adding the ability to group images into stacks of related exposures so that they are treated as such throughout the Hugin interface. Beyond masking out artifacts in exposure sequences, awareness of related exposures allows other stages in the stitching process to work more intelligently. For example, aligning a stack of exposures that frame the exact same scene is a process that can be fully automated, and has been in many other tools; it does not require the interaction necessary for stitching adjacent images. Furthermore, comparing control points across exposure levels is conceptually incorrect, and avoiding this requires knowledge of which images are stacked and which are adjacent. All of this is initially contingent on Hugin supporting the grouping of images into exposure stacks, and this initial framework is what I propose developing alongside the simple masking interface. These two tasks together make a single proposal for the 2009 Google Summer of Code.
Masking as a physical technique and as a software tool is described below. Specific software that employs masking is then examined.
Historically, masking is an artistic technique. Careful placement of tape and stencils enhances the precision of otherwise fast and imprecise tools such as fat brushes and rollers. Use of this technique in industrial paint application has led to referring to painter's tape as 'masking' tape. Ultimately, this is the intended purpose of any masking process; namely, making fast, imprecise tools useful in tasks that require precision. (1)
The use of masking in software has extended beyond this original intention. Focusing particularly on painting and photo-editing software, we see masking used as an integral part of layer-based tasks. By placing a particular mask on a layer above the item being edited, effects can be applied to specific regions: selective transparency and blending, special filters, or even just selective deletion. (2)
But the mask is still used in a form similar to its use in the arts. Selection masks are a key element in paint and photo-editing software and their purpose is to provide a sharp edge over which the effects of an editing tool cannot pass. Quick, imprecise tools become usable in precision situations when a selection mask is employed.
To further understand the two ways masks are used in modern editing software we will examine several such programs: Benjamin Moore's Personal Color Viewer, Adobe Photoshop and the GIMP.
Existing Masking Software
Several commercial software tools have emerged that allow the user to edit a photograph and re-paint the surfaces present with the paint color of a particular manufacturer. One such tool available from Benjamin Moore paints is called Personal Color Viewer.
Personal Color Viewer starts with a photograph (usually containing exterior or interior walls). The user carefully uses an interface to mask off the specific components in the photograph that will be painted. These can include walls, trim and accent pieces such as doors and molding. Once the image has been masked an interface is provided where the user can apply any Benjamin Moore paint color to the various masked regions and visualize the end result.
While the effectiveness of this tool is largely dependent on the quality of the input image, it is also affected by the quality of the masking. Poor masking causes the applied color to bleed over into the wrong areas of the photograph, which diminishes the realism of the visualization. While Personal Color Viewer does not aim to provide a complete set of editing tools, it does provide two different ways of creating the paint masks.
First, a standard painting brush is given with variable diameter. This tool allows the user to stroke over areas that belong to the mask being defined. Pixels under the brush become part of the mask and will receive paint color in the next step while those not touched will remain unchanged. This type of tool requires a raster to store the mask information generated by the brush as individual pixels are touched by the user.
The second masking tool is a polygon tool. The user can define a large region by specifying the vertices of a closed polygon. Each click adds a vertex to the polygon and is connected by a straight line with the previous vertex. The polygon is closed by connecting the last vertex to the first vertex defined. This approach allows the user to follow the clean, sharp edges that are usually present in architecture between walls and molding, trim or doors. It gives more precision to the user but is only useful if the lines in the photograph are straight. Behind the interface, a different way of storing the mask is required. Either the polygon specified must be rasterized and added to the mask created by the paintbrush tool (a process that will lose some of the precision gained by this tool) or the rest of the software must support specifying some masking regions as polygon vertices and render them accordingly.
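The rasterization option mentioned above, converting a vertex list into the same per-pixel mask the paintbrush produces, can be sketched with a standard even-odd scanline fill. This is an illustrative sketch, not code from any of the tools discussed:

```python
def rasterize_polygon(vertices, width, height):
    """Rasterize a closed polygon (list of (x, y) vertices) into a
    boolean mask using the even-odd rule: a pixel is inside if a
    horizontal scanline through its center crosses the polygon
    boundary an odd number of times before reaching it."""
    mask = [[False] * width for _ in range(height)]
    n = len(vertices)
    for y in range(height):
        py = y + 0.5  # sample at the pixel center
        # collect the x coordinates where this scanline crosses an edge
        crossings = []
        for i in range(n):
            (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
            if (y1 <= py < y2) or (y2 <= py < y1):
                t = (py - y1) / (y2 - y1)
                crossings.append(x1 + t * (x2 - x1))
        crossings.sort()
        # fill between successive pairs of crossings
        for j in range(0, len(crossings) - 1, 2):
            for x in range(width):
                if crossings[j] <= x + 0.5 < crossings[j + 1]:
                    mask[y][x] = True
    return mask
```

The precision loss noted above is visible here: the crisp vertex coordinates are reduced to a grid of boolean pixels.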
Both of these masking tools have pros and cons and complement one another. The paintbrush approach is fast and allows the specification of free-form shapes while the polygon tool takes advantage of the regular geometry that is likely to be present in these input photographs.
Photo Editing Software
Beyond the simplicity of Personal Color Viewer there is a plethora of photo editing programs that range from simple tools like Microsoft Paint to the mother of all digital editing tools, Photoshop (and everything in-between). We see these programs employing masking in two different ways: selection masking and layer masking.
Photoshop and GIMP both provide several tools that enable the user to select a particular sub-region of the photo being edited. These tools include basic geometric masks such as the rectangular mask, circular/ellipse mask and single line masks. They also have freeform selection tools that allow you to draw the selection shape or to define a polygon (the lasso and polygon-lasso tools). Rounding out the set of selection tools are the magic wand (which attempts to find continuous regions of similar color) and the quick selection tool (which uses the user's motion to refine a continuous color region selection).
All of these tools can be combined to add or subtract from the overall mask being defined. Together they fill-in the gaps left by the simpler tools provided by Personal Color Viewer at the cost of adding complexity to the overall masking process.
This particular type of masking (often called selection masking) defines a region whose purpose is similar to the masks in Personal Color Viewer. Once the mask is defined any tool used on the photo will only affect pixels inside the mask. Those outside the mask remain untouched. Just like a painter with a large area to fill, agility is preferred above precision by digital photo editors. Taking the time to create a precise mask will allow the use of coarse, quick tools that cover large areas with ease.
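The core behavior of a selection mask described above, that any tool only affects pixels inside the mask, reduces to a simple per-pixel guard. A minimal sketch (the `operation` callback stands in for whatever editing tool is active):

```python
def apply_inside_mask(image, mask, operation):
    """Apply a pixel operation only where the selection mask is set;
    pixels outside the mask are left untouched, no matter how
    coarsely the tool is swept across the image."""
    return [
        [operation(p) if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]
```

With a precise mask in place, even a "fill everything" operation becomes safe, which is exactly the agility-over-precision trade described above.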
Advanced digital photo editing software almost universally uses a layer-based editing model. This means that a single photo is separated into layers that may be created and filled by the user as they edit the photograph. Layers are ordered by the user so that closer layers obscure deeper ones while not destroying the content in the lower layers that they obscure. While this simple conceptual model is itself a powerful addition to any digital editing task, the full advantage comes when the user can control how the layers are blended together.
Of particular interest is the use of a masking layer. In this case, the masking layer defines rules for blending two other layers into one. Typically, a masking layer contains only black and white pixels (or shades of gray). Where it is white, layer A is chosen; where it is black, layer B is chosen (gray pixels linearly blend between the contents of A and B). The masking layer, like any other layer, contains an arbitrary raster of pixel information which can be created using any of the tools available in the digital editing software. By painting into the mask layer, distinctive blending effects can be created, allowing subtle control over the transition between the source layers A and B.
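The white-chooses-A, black-chooses-B rule above is just per-pixel linear interpolation driven by the mask value. A minimal sketch for 8-bit grayscale layers:

```python
def blend_with_mask(layer_a, layer_b, mask):
    """Blend two equally sized grayscale layers through a mask layer.
    Where the mask is 255 (white) the result is layer A, where it is
    0 (black) the result is layer B, and intermediate gray values
    interpolate linearly between the two."""
    out = []
    for row_a, row_b, row_m in zip(layer_a, layer_b, mask):
        out.append([
            round((m / 255) * a + (1 - m / 255) * b)
            for a, b, m in zip(row_a, row_b, row_m)
        ])
    return out
```

Because the mask is an ordinary raster, painting a soft-edged brush stroke into it yields a correspondingly soft transition between the two source layers.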
Here we see masking taking on a different purpose but one that still serves the original artistic intent. It complements the layered conceptual model giving precise controls over blending of layers while still employing broad, agile tools that lack precision.
Masking Considerations in Hugin
For Hugin, other project members have studied the tradeoffs and considerations for masking including Bruno Postle and Tom Sharpless. It was determined that the simplest way to introduce masking to the current interface and workflow of Hugin would be with masks defined as polygons instead of bitmaps.(**)
Polygons are uniquely defined by their vertices and therefore are easy to store and pass between the different components of Hugin. They are also easy to define with a simple point-and-click interface. Bruno developed some scripts to test this idea (nona-mask, enblend-mask and process-mask) and used an SVG editor to generate the polygon data that was supplied to these scripts.
Bitmap masks require more memory, proportional to the size of the images, and since it is common to work with exceptionally large images in Hugin, this is a significant requirement. Furthermore, this extra data must be carried throughout the pipeline until it can be provided to enblend at the end, which makes the ultimate decision about what to include from each image in the final pano.
On the other hand, there are some drawbacks to using polygonal masks. First, the remapping of images from their source format (anything from rectilinear to cylindrical or fisheye) to the final equirectangular format (or some other projection) is non-linear. The straight edges of the polygon will become curved in this transformation. If only the vertices of the polygon are stored and passed along (with the edges being linearly interpolated between them), the resulting polygon would not be a proper re-projection. Thus, care must be taken either to define the polygon in the final re-projected space or to sample it densely enough that the linear approximation still covers the features that are being removed.
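The dense-sampling option mentioned above amounts to subdividing each polygon edge into short segments before the non-linear remapping is applied, so the short linear pieces can follow the curve. A hedged sketch (the remapping itself is out of scope here; this only prepares the vertex list):

```python
import math

def densify_polygon(vertices, max_step):
    """Insert evenly spaced intermediate points along each edge of a
    closed polygon so that no segment is longer than max_step pixels.
    After a non-linear remapping of every point, the short linear
    segments approximate the curved image of each original edge."""
    dense = []
    n = len(vertices)
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        length = math.hypot(x2 - x1, y2 - y1)
        steps = max(1, math.ceil(length / max_step))
        for s in range(steps):   # include each edge's start point but
            t = s / steps        # not its end, to avoid duplicates
            dense.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return dense
```

The cost is a larger vertex list in the project file, but it remains far smaller than a full bitmap mask.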
In addition, there is less agility in drawing a polygon around the artifact than stroking a brush over its pixels. A bitmap brush interface is a very different interface and all of the masking programs discussed in the background saw fit to include both a bitmap brush and a polygon tool. It is these tools together that make the most flexible and user-friendly system for masking.
(**) Much of this discussion is extrapolated from the original SoC project idea and from Bruno's comments available here.
Outline of Work
Given the experience of the main Hugin team members, it is best to start with their conclusions. From there, other interfaces and work can be added once their impact and practicality can be better evaluated.
- Polygon Masking
- Implement a simple point-and-click interface to define a masking polygon on the cropping tab of the Hugin interface.
- Allow an arbitrary number of masking polygons to be associated with each input image.
- Store the polygon vertex data in the .pto file for access by other components of the stitching pipeline.
- Implement any changes necessary to nona to read in the masking polygon data and adjust the remapped images accordingly.
To extend this project to the scope expected for a SoC project, I took another SoC idea and propose getting it started with the following changes:
- Bracket/HDR Exposure Stack Awareness
- Make changes to the Images tab in Hugin to allow grouping of images into exposure stacks.
- Add interface widgets to allow the user to manually identify these stacks.
Beyond these two tasks further extensions can be pursued, time permitting:
- Bitmap Masking
- Explore the possibility of bitmap masks and the additional data required to accommodate them.
- Introduce a masking paintbrush to the cropping tab or to the pano previewer for masking out artifacts.
- Make changes necessary to store and process this bitmap data throughout the Hugin pipeline.
- Implement a simple algorithm to automatically identify images belonging to the same exposure stack:
- Randomly sample N pixels in every image. The same pixels are sampled from each image.
- Convert each pixel to CIE Lab color space.
- Compute the difference in the a and b dimensions for each pairing of pictures summing this difference across the sampled pixels.
- Pairs with the lowest sum of differences (below a chosen threshold) are taken to be images of the same exposure stack.
- A Gaussian filter may be necessary prior to sampling the images in order to account for very small misalignments.
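The detection steps above can be sketched as follows. This is a proof-of-concept sketch, not production code: it uses a standard sRGB-to-Lab conversion (D65 white point), flat random sampling, and omits the Gaussian pre-filter.

```python
import random

def srgb_to_lab(r, g, b):
    """Convert an sRGB pixel (0-255 per channel) to CIE Lab (D65)."""
    def lin(c):  # undo the sRGB gamma curve
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    rl, gl, bl = lin(r), lin(g), lin(b)
    # linear RGB -> XYZ, normalized by the D65 white point
    x = (0.4124 * rl + 0.3576 * gl + 0.1805 * bl) / 0.95047
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = (0.0193 * rl + 0.1192 * gl + 0.9505 * bl) / 1.08883
    def f(t):
        return t ** (1 / 3) if t > 0.008856 else 7.787 * t + 16 / 116
    fx, fy, fz = f(x), f(y), f(z)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)

def stack_distance(img_a, img_b, n_samples=100, seed=0):
    """Sum the chromatic (a, b) difference over the same n randomly
    chosen pixel positions in both images.  Exposures of the same
    scene differ mostly in lightness (L), so a low sum suggests the
    two images belong to the same exposure stack."""
    rng = random.Random(seed)  # fixed seed: same pixels in each image
    h, w = len(img_a), len(img_a[0])
    total = 0.0
    for _ in range(n_samples):
        yy, xx = rng.randrange(h), rng.randrange(w)
        _, a1, b1 = srgb_to_lab(*img_a[yy][xx])
        _, a2, b2 = srgb_to_lab(*img_b[yy][xx])
        total += abs(a1 - a2) + abs(b1 - b2)
    return total
```

Computing this distance for every pair of input images and thresholding it yields candidate exposure stacks, which the interface widgets proposed above would then let the user confirm or correct.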
- 1) Masking in Painting - http://en.wikipedia.org/wiki/Masking_(in_art)#In_painting
- 2) Layer editing - http://en.wikipedia.org/wiki/Layers_(digital_image_editing)#Layer_Mask