Image Inpainting and Texture Synthesis

Hamilton Chong (hchong@fas.harvard.edu)

Abstract

This project explores several combinations of image inpainting and texture synthesis techniques for repairing damaged digital pictures or removing unwanted objects.  Overall, the plain, information-theoretically motivated texture synthesis approach does best across a wide range of test cases.  Some of the other approaches, which introduce pixel-flow ideas from image inpainting, perform better in certain cases but worse in general.  The method that uses flow to choose the important neighbors usually gives up little in quality and runs faster most of the time.

Introduction and Previous Work

Image inpainting refers to the process of changing an image so that the change is not noticeable to an observer.  It is usually applied to the task of restoring damaged images, and is considered an art because of the careful eye it takes to get all the details right.  However, Bertalmío et al. show that much of the process can nonetheless be reduced to a mathematical expression, so that computers can greatly reduce the amount of human hand-work required [1].  Their approach is to use differential equations to describe the isophotes (lines of the same color) and connect them across a masked-out region of the image.  The steps of such inpainting are interspersed with anisotropic diffusion to get rid of unwanted noise while preserving the major edges.  The anisotropic diffusion they propose also helps curve the isophotes according to the curvature at the boundaries.  Paul Harrison takes a texture synthesis approach to removing unwanted objects within the image [3].  This approach turns out to perform surprisingly well, making attempts to add refinements using isophote flow irrelevant in many cases.  The method assigns pixels information-theoretic constraints based on how much additional information neighboring pixels yield in predicting them (along with some assumed distribution, of course).  This categorization into neighboring pixels that constrain the central pixel turns out to account already for much of the information that flow techniques seem to provide (although not always).  The authors of [2] noted that with a guiding mask, texture synthesis could yield very nice object removal.

More Details

Image inpainting requires a number of iterations before it converges to an acceptable answer.  By experiment, repeated blocks of 200 inpainting iterations followed by 1 round of diffusion do fine.  The paper [1] mentions 15 and then 1; the difference is likely due to the use of a very different anisotropic diffusion algorithm (I used the one by Robert Estes Jr.).  A good time step, as noted in [1], is 0.1.  The inpainting algorithm works by finding the vector perpendicular to the gradient at the interface (boundary) between the image and the mask.  It then evolves this interface by growing in the direction of that vector.  This leads to a good amount of diffusion at interfaces between colors that differ greatly, because the vector’s magnitude is relatively large there.  Elsewhere, where the differences between neighboring pixels may be small, little spreading of color occurs.  In a couple of cases, with simple pictures or thin masks, the anisotropic diffusion just happened to conspire against the inpainting process, making all flows perpendicular to the direction of the mask.  So even if the process were run for a very long time, the masked region would not get completely filled.  This could possibly be remedied by considering other smoothness measures (a very simple discrete Laplacian was used).  But since the texture synthesis method remedied this anyway, no other smoothness measures were explored.  Here are some results:

Orchids. The left image was scribbled on; the bright green mask covers the damaged portion.  On the right we have the restored image after 20 iterations (4 and a half minutes) of the inpainting algorithm.

If you look carefully where the scribbles were made, you can see the blurring of the image at various color boundaries.  However, the inpainting performs remarkably well for such thin scribbles over digital photos.  In practice, the algorithm proves robust under even much more scribbling than what is shown here.  The inpainting algorithm also has the advantage that it runs in time proportional to the size of the masked out region.  So even for very large images, the algorithm can do its job quickly when the masked region is small.       

Stripes.  On the left we have a masked out region.  On the right we have inpainting after about 110 iterations (almost 6 minutes).

 

It’s apparent from this example that the inpainting algorithm has shortcomings.  For masked out regions that are not thin and/or on backgrounds with texture or detailed patterns, the algorithm doesn’t perform so well.  It tends to blur the resulting image.  Perhaps a better anisotropic diffusion algorithm would prevent this loss of edge definition, but ultimately the method still provides no way of replicating texture for large regions.  The method, however, does maintain the major edges, which is still a good thing.      
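As a rough illustration of the inpainting update described earlier, the sketch below computes the isophote direction (the gradient rotated by 90 degrees) and a simple four-neighbor discrete Laplacian, then moves each masked pixel by the change in smoothness along the isophote.  It is a simplification under stated assumptions, not the code in the distribution: the image is a grayscale list of lists, the slope limiting used in [1] and the anisotropic diffusion rounds are omitted, and the names `laplacian`, `isophote_direction` and `inpaint_step` are made up for this example.

```python
def laplacian(img, x, y):
    """Four-neighbor discrete Laplacian at (x, y), used as the smoothness measure."""
    return (img[y][x - 1] + img[y][x + 1] + img[y - 1][x] + img[y + 1][x]
            - 4.0 * img[y][x])


def isophote_direction(img, x, y):
    """Central-difference gradient rotated by 90 degrees.

    The result points along lines of constant intensity, and its magnitude
    equals the gradient magnitude, so it is large at strong edges.
    """
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return (-gy, gx)


def inpaint_step(img, mask, dt=0.1):
    """One explicit update of the masked pixels (mask[y][x] is True where damaged).

    Each masked pixel changes by dt times the change in smoothness along the
    isophote direction.  A two-pixel border is skipped so that all central
    differences stay inside the image (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            if not mask[y][x]:
                continue
            # Gradient of the smoothness measure, by central differences.
            dlx = (laplacian(img, x + 1, y) - laplacian(img, x - 1, y)) / 2.0
            dly = (laplacian(img, x, y + 1) - laplacian(img, x, y - 1)) / 2.0
            nx, ny = isophote_direction(img, x, y)
            out[y][x] = img[y][x] + dt * (dlx * nx + dly * ny)
    return out
```

Only masked pixels are visited, which matches the observation that the running time depends on the size of the mask rather than the size of the image.  A real implementation would keep a list of masked pixels instead of scanning the whole grid.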

 

 

Texture synthesis comes in here to help restore these larger regions of damage.  The inpainting algorithm is still valuable, since it performs quite well for small scratches and runs relatively fast.  The texture synthesis method comes with extra overhead and runs in time proportional to the size of the image (not the mask).  It is, however, better in just about all tests tried (with inpainting performing better only in rather artificial setups designed to highlight its abilities).  Since the next couple of algorithms are based on the texture synthesis approach discussed in Harrison’s PhD thesis [3], it is worth going over the main ideas now.

 

The algorithm can effectively be broken into 2 stages:

(1)   calculating the information content of pixels in relation to neighboring pixels (pixels will be chosen based on this weighting).

(2)   finding a pixel with matching neighbors by evaluating some distance measure

 

The algorithm goes like this:

While there exist undetermined pixels

            Choose the highest priority one based on weighting in (1)

            Choose the closest approximation by distance measure in (2)

            Color the pixel and update the weightings of neighbors

  

1.      Calculating weighting.  The details are given in [3].  The first stage of the repair process (the one that incurs the cost proportional to image size) loops through all offsets from a pixel within some set radius.  For each offset, we loop through all possible base pixels (the pixels whose offsets we are examining: the entire image minus a radius-sized border) and look at the most significant bits of each base pixel.  The most significant bits from each channel (for example, R, G and B) are grouped so that we can consider them together.  We then split the distribution of pixel values at the offset pixels into partitions based on the most significant bits of the base pixel.  How many partitions to create is decided by an information-theoretic argument.  First, we guess that the distribution is unimodal.  We then look for the number of bins that would let us encode the pixel value distribution using the fewest bits.  This corresponds to choosing a high information content for the pixels in relation to their neighbors.  Once this number of partitions is calculated (using formulas explained in an unpublished paper referenced in [3]), we can assign weightings based on the information content of an offset pixel, and normalize to reduce the repeated reproduction of the same highly weighted pixels.

2.      Choosing a match.  The distance function is simply the sum of the absolute values of the differences in each channel.  The neighborhood of known pixels around a base pixel is compared (using a kd-tree for searching) against other neighborhoods to choose a match.  The distance function is subject to various weighting parameters as well.  A random channel is also used to add some apparent non-determinism to the process.
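The loop and the distance measure above can be sketched roughly as follows.  This is a toy version under loud assumptions, not Harrison’s implementation: the priority is simply the number of already-known neighbors (a stand-in for the information-theoretic weighting of stage 1), the search is brute force over a 3x3 neighborhood instead of a kd-tree, the weighting parameters and the random channel are left out, and the names `known_neighbors`, `distance` and `fill` are invented for the example.

```python
OFFSETS = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dx, dy) != (0, 0)]


def known_neighbors(known, x, y):
    """Offsets of the in-bounds neighbors of (x, y) whose color is already known."""
    h, w = len(known), len(known[0])
    return [(dx, dy) for dx, dy in OFFSETS
            if 0 <= x + dx < w and 0 <= y + dy < h and known[y + dy][x + dx]]


def distance(img, ax, ay, bx, by, offsets):
    """Sum of absolute channel differences between two neighborhoods."""
    total = 0
    for dx, dy in offsets:
        pa = img[ay + dy][ax + dx]
        pb = img[by + dy][bx + dx]
        total += sum(abs(ca - cb) for ca, cb in zip(pa, pb))
    return total


def fill(img, mask):
    """Fill masked pixels in place; img holds color tuples, mask holds booleans.

    Assumes at least one unmasked interior pixel has a fully unmasked 3x3
    neighborhood, so the list of candidate sources is not empty.
    """
    h, w = len(img), len(img[0])
    known = [[not m for m in row] for row in mask]
    sources = [(x, y) for y in range(1, h - 1) for x in range(1, w - 1)
               if not mask[y][x]
               and all(not mask[y + dy][x + dx] for dx, dy in OFFSETS)]
    todo = {(x, y) for y in range(h) for x in range(w) if mask[y][x]}
    while todo:
        # Stand-in priority: the undetermined pixel with the most known neighbors.
        x, y = max(sorted(todo), key=lambda p: len(known_neighbors(known, *p)))
        offsets = known_neighbors(known, x, y)
        # Closest source neighborhood under the sum-of-absolute-differences measure.
        sx, sy = min(sources, key=lambda s: distance(img, x, y, s[0], s[1], offsets))
        img[y][x] = img[sy][sx]
        known[y][x] = True
        todo.remove((x, y))
    return img
```

Because each filled pixel immediately becomes a known neighbor for the pixels around it, the order in which pixels are chosen matters, which is why the real algorithm spends effort on the weighting in stage 1.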

 

This algorithm performs quite well indeed:

Texture synthesis fix, against the same mask used above.

As seen here, the texture synthesis algorithm performs admirably.  It’s still noticeable that something fishy went on where the black silhouette gets dented in the hemisphere in the image.  Since the inpainting algorithm was able to preserve major edges (although the image above shows they don’t remain sharp), perhaps some modification to account for flow may prove beneficial.  Since the information-content-motivated choice of which pixel to color next seemed quite reasonable (although flow ideas could be integrated there too), the three versions below focus on integrating flow ideas into the second part: the distance measure.

Idea 1.  With a firm belief that flow can help, the first idea is to sum up the flows from neighboring pixels, weighting each term by the strength of the flow (found by taking the dot product of the flow direction with the neighbor vector).  Then, after the sum is found, force each neighbor pixel to be the averaged color obtained from the mixture of all incoming flows.  That way a pixel would have to be pretty much whatever that color is.  Unfortunately, this method allows noise, or single pixels that differ by a lot, to dominate the process and get copied multiple times as the flow progresses.  So these distinct pixels get too great a share.  In the inpainting algorithm, the anisotropic diffusion mitigated this.
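The flow-weighted average in Idea 1 might look something like the sketch below.  The details are assumptions made for illustration: a neighbor is given by its offset from the pixel, its flow vector and its color; the “neighbor vector” is taken to be the unit vector from the neighbor toward the pixel; flow pointing away from the pixel is clamped to zero; and the names `flow_strength` and `flow_weighted_color` are hypothetical.

```python
import math


def flow_strength(flow, offset):
    """Dot product of a neighbor's flow with the unit vector toward the pixel.

    offset is the neighbor's position relative to the pixel, so the vector
    from the neighbor to the pixel is the negated offset.  Flow pointing
    away from the pixel counts as zero (an assumption of this sketch).
    """
    dx, dy = offset
    length = math.hypot(dx, dy)
    toward = (-dx / length, -dy / length)
    return max(0.0, flow[0] * toward[0] + flow[1] * toward[1])


def flow_weighted_color(neighbors):
    """Average the neighbor colors, weighted by flow strength.

    neighbors is a list of (offset, flow, color) triples.  Returns None when
    no neighbor sends any flow toward the pixel, in which case the caller
    would fall back to plain texture synthesis.
    """
    total = 0.0
    acc = None
    for offset, flow, color in neighbors:
        s = flow_strength(flow, offset)
        if acc is None:
            acc = [0.0] * len(color)
        for i, channel in enumerate(color):
            acc[i] += s * channel
        total += s
    if acc is None or total == 0.0:
        return None
    return tuple(a / total for a in acc)
```

The weakness described above is visible in the arithmetic: a single outlier neighbor with a large flow vector gets a large weight, so its color dominates the average and is then propagated further.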

Idea 2.  This time, we use only neighbors that contribute at least a minimum amount of flow (determined through experiment) to the pixel of interest.  No averaging goes on, and neighbors are not forced to take on an averaged value.  Since we potentially include fewer neighbors (because not all neighbors contribute strong flow), this method sometimes produces results faster (because the second part of the algorithm runs in time proportional to the number of pixels that need fixing).  Also, one noisy pixel won’t completely kill everything, because there are other pixels that don’t share its outcast value.  In practice, this proves to be considerably faster, because during the entire search of the kd-tree we only need to compare the channels of the reduced number of neighbor pixels.  Although this is arguably just a constant factor, in practice the constant is not so negligible.  This one also does a pretty good job in this case:

 

Texture Synthesis with flow.  Again, we use the same mask.

Again, the method isn’t perfect.  We still see visual artifacts, but the silhouette is slightly better preserved.  The performance of this algorithm and the purely texture synthesis one are almost on par.  But they have strengths in slightly different areas.  This one will reproduce rough textures a little better because the flow tends toward doing so.  The other algorithm will work slightly better on smooth surfaces.  However, a strong categorization was hard with the tests run. 
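To make the neighbor selection of Idea 2 concrete, here is a small sketch.  It assumes the same notion of flow strength as in Idea 1 (the dot product of a neighbor’s flow vector with the unit vector from the neighbor toward the pixel); the threshold value, the dictionary layout and the names `strong_flow_offsets` and `reduced_distance` are hypothetical, and the kd-tree search is not shown.

```python
import math


def strong_flow_offsets(flows, min_strength):
    """Keep only the neighbor offsets whose flow toward the pixel is strong enough.

    flows maps a neighbor offset (dx, dy) to that neighbor's flow vector
    (fx, fy).  min_strength would be tuned by experiment.
    """
    kept = []
    for (dx, dy), (fx, fy) in flows.items():
        length = math.hypot(dx, dy)
        # Flow projected onto the unit vector pointing from neighbor to pixel.
        strength = (fx * -dx + fy * -dy) / length
        if strength >= min_strength:
            kept.append((dx, dy))
    return kept


def reduced_distance(target, candidate, offsets):
    """Sum of absolute channel differences over the selected offsets only.

    target and candidate map offsets to color tuples.
    """
    return sum(abs(a - b)
               for off in offsets
               for a, b in zip(target[off], candidate[off]))
```

With fewer offsets in play, every distance evaluation during the search touches fewer channels, which is where the constant-factor speedup comes from.  When `strong_flow_offsets` returns an empty list, the default texture synthesis comparison over all known neighbors would be used instead.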

Idea 3.  Here we find the undetermined pixels with the strongest flows into them (determined by the magnitude of the vector perpendicular to the gradient and its dot product with the neighbor direction vector).  Then we pair up the two that most closely oppose each other (if one has flow from the right, the other should have flow from the left, top or bottom, but preferably the left) and whose flows have almost matching colors.  Since we then know the endpoint pixels with strong, matching-colored flows, and have the tangent vectors at each endpoint, we use a Hermite-like curve to interpolate and determine the approximate curved path.  We then use this as a suggestion for the rest of the texture synthesis procedure.  But because there are hackish elements here, sometimes bad “opposing pixels” are matched up.  This usually does not provide great results.  In digital photos, it tends to be quite harmful indeed.  It only seems to do fine in rather artificial pictures with smooth backgrounds and clear objects in the foreground that need some curve fitting.
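For reference, the interpolation step can be sketched with a standard cubic Hermite curve, which is fully determined by two endpoints and their tangent vectors.  The project only says a “hermite-like” curve is used, so treat this as one plausible form rather than the exact curve in the source code; the function names and the sample count are made up for the example.

```python
def hermite_point(p0, t0, p1, t1, s):
    """Point on the cubic Hermite curve from p0 to p1 at parameter s in [0, 1].

    t0 and t1 are the tangent vectors at the two endpoints.
    """
    h00 = 2 * s**3 - 3 * s**2 + 1
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return tuple(h00 * a + h10 * ta + h01 * b + h11 * tb
                 for a, ta, b, tb in zip(p0, t0, p1, t1))


def hermite_path(p0, t0, p1, t1, samples=16):
    """Pixel coordinates along the curve, with consecutive duplicates removed."""
    path = []
    for i in range(samples + 1):
        x, y = hermite_point(p0, t0, p1, t1, i / samples)
        pixel = (round(x), round(y))
        if not path or path[-1] != pixel:
            path.append(pixel)
    return path
```

For example, `hermite_path((0, 0), (0, 8), (8, 8), (8, 0))` traces a path that leaves the first endpoint heading down and arrives at the second heading right.  The pixels along such a path would then serve as the suggestion passed to the rest of the texture synthesis procedure.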

Comparison.  The top right corner of a sphere was masked out.  Left: texture synthesis.  Middle: Idea 2, texture synthesis with flow.  Right: Idea 3, texture synthesis with Hermite curve.

As seen here, the Hermite curve was a good guideline in forcing the other pixels to conform.  But because it occasionally gives false matches, it doesn’t perform so well overall.  Furthermore, the simple-minded method of finding matches doesn’t do well with disconnected mask components.  Perhaps a better matching algorithm could improve this one’s performance.

It hasn’t been made explicit above, but when no flows exist, or no strong ones, or no matches are found, the default texture synthesis method is used.  In Idea 3, the Hermite curve is rarely used.  In Ideas 1 and 2, the fallback to default texture synthesis happens much less often (seldom, in the tests tried).

One last example:

(a) hmm...

(b) don't want that stuff

(c) that's better

Give it a try!

The distribution comes with source code.  Part of it is under the GNU license.  Some of it is under the “you take responsibility” license.  Read the readme for details. The program works with TGA files.

Program options:

1 – corresponds to Idea 3, not so recommended unless you encounter cases like the good one presented above.

2 – texture synthesis with strong flows included only (Idea 2); this one works nicely; fast (maybe 1/3 the time of regular texture synthesis); but lower quality in general.

3 – this one is Idea 1. Not recommended. Just included for the curious.

4 – texture synthesis.  Performs well.  Takes a while.  Patience is a virtue.  

5 – inpainting.  This is nice when you have thin scratches.  Scales nicely.  Can’t handle texture though.

ok, enough rambling: here it is!


Update! Experimental project with full source.  Note that for practical use, the version above is a better choice.

Acknowledgements:

Thanks to AJ Shankar, Wing Yung, Robert Estes Jr., and Paul Harrison for releasing code for me to use.  Also, thanks to Professor Steven Gortler and Danil Kirsanov for guidance in CS 276r.

Papers:

[1] M. Bertalmío, G. Sapiro, V. Caselles and C. Ballester.  “Image Inpainting”.
      Proceedings of SIGGRAPH 2000, New Orleans, USA, July 2000.
         

[2] H. Igehy, L. Pereira. “Image Replacement through Texture Synthesis”.
      Proceedings of IEEE International Conference on Image Processing, 1997.

[3] Harrison, Paul. “A Non-hierarchical Procedure for Re-synthesis of Complex Textures”
      http://www.csse.monash.edu.au/~pfh/resynthesizer/ (ref: May 22, 2002), July 2000.