102

How can I detect and peel the label from the jar below (POV, cylinder radius, jar contents are all unknown)

photo of the jar with the label

to get something like this, which is the original label before it was stuck on the jar?

the original flat label

Dr. belisarius
  • Do you have any prior knowledge of the label ? – image_doctor May 17 '12 at 14:15
  • @acl I tried eating the jar contents. Regretfully I own only a photograph, so it wasn't very pleasant. – Dr. belisarius May 17 '12 at 14:16
  • @image_doctor Nope – Dr. belisarius May 17 '12 at 14:16
  • @belisarius but if you really have no knowledge of the label or {curvature of jar, point of view}, isn't this obviously impossible? – acl May 17 '12 at 14:17
  • @acl I think some assumptions can be made regarding the curvature by examining the eccentricity of the lid. – rcollyer May 17 '12 at 14:19
  • @acl well ... I do know it is rectangular – Dr. belisarius May 17 '12 at 14:19
  • @belisarius And that the jar is cylindrical? Not a cone or fancy shape? – Szabolcs May 17 '12 at 14:20
  • @Szabolcs Yes, just a plain ol'cylinder – Dr. belisarius May 17 '12 at 14:21
  • @belisarius so it isn't certain we are looking for a label ? – image_doctor May 17 '12 at 14:22
  • @image_doctor Identifying the label is not the main task (although it could be very nice). I can manually pre-process the image and change the label color to something easy to extract. The geometric transform is the main issue. – Dr. belisarius May 17 '12 at 14:27
  • An envelope warp (distort; transform) from the outline of the label to a rectangle would get you quite close. Cylindrical distortion would remain. Stretching the resulting image proportional to the local curvature of the original outline of the label would then get you very close. This would be accurate enough for design work, but not for precise measurement. What is your application? – Mr.Wizard May 17 '12 at 14:44
  • @Mr. Just image processing. I need to be able to print a few of these labels (a classical marmalade counterfeiting organization contracted me). No need for ultimate accuracy. – Dr. belisarius May 17 '12 at 14:47
  • @Mr.Wizard Anyway, the whole scale is not needed (and impossible to find out). Just the rectangle proportions and content. – Dr. belisarius May 17 '12 at 14:49
  • You had me concerned for a moment before I remembered your singular sense of humor. :-) – Mr.Wizard May 17 '12 at 14:49
  • I'm going to pass on this question because I think version 8 is better equipped for this kind of manipulation, but I believe my comment above lays out a reasonable solution for anyone who cares to implement it. (It will work in a graphics app like Photoshop, for example.) – Mr.Wizard May 17 '12 at 14:51
  • @Mr.Wizard yes, that is exactly what's needed. how to do it with minimal effort is the question – acl May 17 '12 at 14:57

4 Answers

96

This answer evolved over time and got quite long in the process. I've created a cleaned-up, restructured version as an answer to a very similar question on dsp.stackexchange.

Here's my quick & dirty solution. It's a bit similar to @azdahak's answer, but it uses an approximate mapping instead of cylindrical coordinates. On the other hand, there are no manually adjusted control parameters; the mapping coefficients are all determined automatically:

The label is bright in front of a dark background, so I can find it easily using binarization:

src = Import["https://i.stack.imgur.com/rfNu7.png"];
binary = FillingTransform[DeleteBorderComponents[Binarize[src]]]

binarized image

I simply pick the largest connected component and assume that's the label:

labelMask = Image[SortBy[ComponentMeasurements[binary, {"Area", "Mask"}][[All, 2]], First][[-1, 2]]]

largest component

Next step: find the top/bottom/left/right borders using simple derivative convolution masks:

topBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{1}, {-1}}]];
bottomBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{-1}, {1}}]];
leftBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{1, -1}}]];
rightBorder = DeleteSmallComponents[ImageConvolve[labelMask, {{-1, 1}}]];

the detected top, bottom, left, and right border pixels

This is a little helper function that finds all white pixels in one of these four images and converts their indices to coordinates. (Position returns matrix indices, which are 1-based {y, x} pairs with y = 1 at the top of the image, but the image processing functions expect coordinates, which are 0-based {x, y} pairs with y = 0 at the bottom of the image.)

{w, h} = ImageDimensions[topBorder];
maskToPoints = Function[mask, {#[[2]]-1, h - #[[1]]+1} & /@ Position[ImageData[mask], 1.]];

Now I have four separate lists of coordinates of the top, bottom, left, right borders of the label. I define a mapping from image coordinates to cylinder coordinates:

Clear[mapping];
mapping[{x_, y_}] := {c1 + c2*x + c3*y + c4*x*y, c5 + c6*y + c7*x + c8*x^2}

This mapping is obviously only a crude approximation to cylinder coordinates. But it's very simple to optimize the coefficients c1..c8:

minimize =
  Flatten[{
    (mapping[#][[1]])^2 & /@ maskToPoints[leftBorder],
    (mapping[#][[1]] - 1)^2 & /@ maskToPoints[rightBorder],
    (mapping[#][[2]] - 1)^2 & /@ maskToPoints[topBorder],
    (mapping[#][[2]])^2 & /@ maskToPoints[bottomBorder]
    }];
solution = NMinimize[Total[minimize], {c1, c2, c3, c4, c5, c6, c7, c8}][[2]]

This optimizes the mapping coefficients so that points on the left border are mapped to {0, [anything]}, points on the right border to {1, [anything]}, points on the top border to {[anything], 1}, and points on the bottom border to {[anything], 0}.
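
As a quick sanity check (my addition, not part of the original post), you can plug a border point back into the fitted mapping; its mapped coordinates should land near the corresponding edge of the unit square:

mapping[First[maskToPoints[leftBorder]]] /. solution  (* first component should be close to 0 *)
mapping[First[maskToPoints[topBorder]]] /. solution   (* second component should be close to 1 *)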

The actual mapping looks like this:

contours of the fitted mapping overlaid on the source image

Show[src,
 ContourPlot[mapping[{x, y}][[1]] /. solution, {x, 0, w}, {y, 0, h}, 
  ContourShading -> None, ContourStyle -> Red, 
  Contours -> Range[0, 1, 0.1], 
  RegionFunction -> Function[{x, y}, 0 <= (mapping[{x, y}][[2]] /. solution) <= 1]],
 ContourPlot[mapping[{x, y}][[2]] /. solution, {x, 0, w}, {y, 0, h}, 
  ContourShading -> None, ContourStyle -> Red, 
  Contours -> Range[0, 1, 0.2],
  RegionFunction -> Function[{x, y}, 0 <= (mapping[{x, y}][[1]] /. solution) <= 1]]]

Now I can pass the mapping directly to ImageForwardTransformation:

ImageForwardTransformation[src, mapping[#] /. solution &, {400, 300}, DataRange -> Full, PlotRange -> {{0, 1}, {0, 1}}]

the unrolled label

The artifacts are already present in the source image. Do you have a higher-resolution version of it? The distortion on the left side is due to the approximate mapping. It could probably be reduced with an improved mapping function, but I can't think of one that's better and still simple enough for the minimization right now.

ADD:

I've tried the same algorithm on the high-res image you linked to in the comments; the result looks like this:

fitted mapping and unrolled result for the high-res image

I had to make minor changes to the label-detection part (DeleteBorderComponents first, then FillingTransform), and I've added extra terms to the mapping formula for perspective (which wasn't noticeable in the low-res image). At the borders you can see that the 2nd-order approximation doesn't fit perfectly, but this might be good enough.
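
The modified mapping formula isn't shown in the post; as a guess at the kind of extra terms meant (my assumption, not necessarily what was actually used), one could add mixed terms that let the horizontal scale and the curvature vary with y:

Clear[mapping];
mapping[{x_, y_}] := {
   c1 + c2*x + c3*y + c4*x*y + c9*x*y^2,  (* extra x*y^2 term: horizontal scale changes with y *)
   c5 + c6*y + c7*x + c8*x^2 + c10*x^2*y  (* extra x^2*y term: curvature changes with y *)
  }

The new coefficients c9 and c10 would simply be appended to the NMinimize variable list.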

You might also want to invert the mapping function symbolically and use ImageTransformation instead of ImageForwardTransformation, because the latter is really slow.
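
Here is a sketch of that speed-up for the simple 8-coefficient mapping above (my addition; the branch handling and the DataRange/PlotRange values are assumptions, so treat it as a starting point rather than tested code):

(* invert the fitted forward mapping once, symbolically; Solve returns several branches *)
inverseBranches = Solve[(mapping[{x, y}] /. solution) == {u, v}, {x, y}];

(* for each output pixel, pick a real branch that lands inside the source image *)
inverse[{u0_?NumericQ, v0_?NumericQ}] := With[
   {candidates = ({x, y} /. inverseBranches) /. {u -> u0, v -> v0}},
   Re@SelectFirst[candidates,
     Max[Abs[Im[#]]] < 10^-6 && 0 <= Re[#[[1]]] <= w && 0 <= Re[#[[2]]] <= h &,
     {0, 0}]]

ImageTransformation[src, inverse, {400, 300},
 DataRange -> Full, PlotRange -> {{0, 1}, {0, 1}}]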

ADD 2:

I think I've found a mapping that eliminates the cylindrical distortion (more or less, at least):

arcSinSeries = Normal[Series[ArcSin[α], {α, 0, 10}]]
Clear[mapping];
mapping[{x_, y_}] := 
   {
    c1 + c2*(arcSinSeries /. α -> (x - cx)/r) + c3*y + c4*x*y, 
    top + y*height + tilt1*Sqrt[Clip[r^2 - (x - cx)^2, {0.01, ∞}]] + tilt2*y*Sqrt[Clip[r^2 - (x - cx)^2, {0.01, ∞}]]
   }

This is a real cylindrical mapping. I used the Taylor series to approximate the arc sine, because I couldn't get the optimization working with ArcSin directly. The Clip calls are my ad-hoc attempt to prevent complex numbers during the optimization. Also, I couldn't get NMinimize to optimize the coefficients, but FindMinimum works just fine if I give it good start values. And I can estimate good start values from the image, so it should still work for any image (I hope):

leftMean = Mean[maskToPoints[leftBorder]][[1]];
rightMean = Mean[maskToPoints[rightBorder]][[1]];
topMean = Mean[maskToPoints[topBorder]][[2]];
bottomMean = Mean[maskToPoints[bottomBorder]][[2]];
minimize =
  Flatten[{
    (mapping[#][[1]])^2 & /@ maskToPoints[leftBorder],
    (mapping[#][[1]] - 1)^2 & /@ maskToPoints[rightBorder],
    (mapping[#][[2]] - 1)^2 & /@ maskToPoints[topBorder],
    (mapping[#][[2]])^2 & /@ maskToPoints[bottomBorder]
    }];
solution = 
 FindMinimum[
   Total[minimize], 
    {{c1, 0}, {c2, rightMean - leftMean}, {c3, 0}, {c4, 0}, 
     {cx, (leftMean + rightMean)/2}, 
     {top, topMean}, 
     {r, rightMean - leftMean}, 
     {height, bottomMean - topMean}, 
     {tilt1, 0}, {tilt2, 0}}][[2]]

Resulting mapping:

the fitted cylindrical mapping overlaid on the source image

Unrolled image:

the unrolled label, using the cylindrical mapping

The borders now fit the label outline quite well. The characters all seem to have the same width, so I think there's not much distortion, either. The solution of the optimization can also be checked directly: it estimates the cylinder radius r and the x-coordinate of the cylinder center cx, and the estimated values are only a few pixels off the real positions in the image.
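
One way to eyeball that check (my addition): draw vertical lines at the estimated cylinder edges cx ± r on top of the source image and see whether they coincide with the sides of the jar:

Show[src, Graphics[{Thick, Green,
    Line[{{cx - r, 0}, {cx - r, h}}],
    Line[{{cx + r, 0}, {cx + r, h}}]} /. solution]]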

ADD 3:

I've tried the algorithm on a few images I found using Google image search, with no manual interaction except occasional cropping. The results look promising:

eight example result images

As expected, the label detection is the least stable step. (Hence the cropping.) If the user marked points inside and outside the label, a watershed-based segmentation would probably give better results.

I'm not sure if the mapping optimization is always numerically stable. But it worked for every image I've tried as long as the label detection worked. The original approximate mapping with the 2nd order terms is probably more stable than the improved cylindrical mapping, so it could be used as a "fallback".
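
A sketch of that fallback idea (my addition; cylObjective, polyObjective and startValues are hypothetical names standing for the two objective sums and the start-value list built above):

solution = Check[
   FindMinimum[cylObjective, startValues][[2]],                      (* try the cylindrical mapping first *)
   NMinimize[polyObjective, {c1, c2, c3, c4, c5, c6, c7, c8}][[2]]]  (* fall back to the 2nd-order mapping if FindMinimum complains *)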

For example, in the 4th sample, the radius cannot be estimated from the curvature of the top/bottom border (because there is almost no curvature), so the resulting image is distorted. In this case, it might be better to use the "simpler" mapping, or to have the user select the left/right borders of the glass (not the label) manually and set the center/radius explicitly, instead of estimating them by optimizing the mapping coefficients.

ADD 4:

@Szabolcs has written interactive code that can un-distort these images.

My alternative suggestion for improving this interactively would be to let the user select the left and right edges of the jar, for example using a locator pane:

LocatorPane[Dynamic[{{xLeft, y1}, {xRight, y2}}], 
 Dynamic[Show[src, 
   Graphics[{Red, Line[{{xLeft, 0}, {xLeft, h}}], 
     Line[{{xRight, 0}, {xRight, h}}]}]]]]

LocatorPane

Then I can explicitly calculate r and cx instead of optimizing for them:

manualAdjustments = {cx -> (xLeft + xRight)/2, r -> (xRight - xLeft)/2};
solution = 
  FindMinimum[
   Total[minimize /. manualAdjustments], 
    {{c1, 0}, {c2, rightMean - leftMean}, {c3, 0}, {c4, 0}, 
     {top, topMean}, 
     {height, bottomMean - topMean}, 
     {tilt1, 0}, {tilt2, 0}}][[2]]
solution = Join[solution, manualAdjustments]

Using this solution, I get almost distortion-free results:

(Almost) no distortion

second result image

Niki Estner
  • That's very nice. +1 – azdahak May 17 '12 at 20:47
  • @nikie Thanks for your answer, I'll experiment a bit with the code. As for better images, it's relatively easy to find a few on the Internet. For example: http://www.kitchengardenpreserves.co.uk/newsb/wp-content/uploads/2011/12/Pear-and-Elderflower-cropped.jpg – Dr. belisarius May 17 '12 at 21:25
  • good idea with using the known fact that the label is a rectangle... +1 – acl May 17 '12 at 21:58
  • +1 This is a nice practical example of using Mathematica's image processing capabilities. They seem quite comprehensive, but I haven't seen them used much around here (and I find the built in help a bit lacking). – wxffles May 18 '12 at 01:05
  • How general is this? Have you tried it on many images, or does it only work on a few? I realize the label detection part is more trouble, so a practical solution would be something that doesn't require a lot of precise manual intervention, but allows some. E.g. we could manually crop the images to a rectangular area around the labels. If we do this, will it work automatically for peeling off many different kinds of labels? – Szabolcs May 18 '12 at 07:49
  • (In other words, is the question ready to be shared with a wider audience to amaze them? :-) ) – Szabolcs May 18 '12 at 07:50
  • @Szabolcs: I really don't know how general/stable it is. I've tried it on a few images (see Add 3) and it worked quite well as long as the label was brighter than the rest of the image (because of the binarization). – Niki Estner May 18 '12 at 08:29
  • @nikie Thanks for adding those pictures, it's very nice, and it just made your answer a lot cooler! The remaining problem is the distortion on the extreme left and extreme right, clearly visible on the very last example. I'm not really sure how to tackle that without making the method more fragile and potentially introducing even greater distortions ... – Szabolcs May 18 '12 at 09:15
  • If we look at the label exactly from the front and centre, so it appears as a rectangle, then the outline provides no information about how much we'd need to correct for the distortion near the left and right edges. The label could still be very small (covering a small angle of the cylinder, small distortion), or large, covering 150° of the cylinder (big distortion). I guess there's no way out... – Szabolcs May 18 '12 at 09:15
  • Could you perhaps summarize your answer in a new answer to this question and also link back here? Then I could remove my answer. – Szabolcs May 18 '12 at 13:49
  • @nikie That's a great answer, thanks! It teaches a lot about image processing in Mathematica. (I just looked in my cupboard, and the only jam I could find has one of these octagonal labels... :) – cormullion May 18 '12 at 14:26
  • @Szabolcs: Done. I'll leave the "evolution version" of the answer here and link to the "intelligent design" version on dsp, which makes it look like I had known how to do it right away. – Niki Estner May 18 '12 at 14:43
  • @nikie It seems Mathematica can be used for machine vision type image processing techniques. I saw this answer and added an account for Mathematica just because of my interest in this. Thanks for an excellent demonstration. – JYelton Oct 12 '12 at 15:37
  • @JYelton: I use it almost exclusively for that. I've tried both Matlab and Mathematica; they're more or less equal in image processing/computer vision capabilities (if you buy the right MatLab toolboxes), but the Mathematica language is way more powerful. – Niki Estner Oct 12 '12 at 15:52
  • Can you explain where you got this term 'c1 + c2x + c3y + c4xy' ? – David Doria Jan 06 '18 at 02:21
  • @NikiEstner You say: This minimizes the mapping coefficients, so the points on the left border are mapped to {0, [anything]}, the points on the top border are mapped to {[anything], 1} and so on. Why are the points on the top border not mapped to have their column be 0 (not 1)? And what about the bottom and right borders? How do you know their values? (i.e. how do you know the width/height of the output image?) – David Doria Jan 06 '18 at 02:26
  • Do you happen to have a reference to the cylindrical mapping? And also perhaps for an inverse mapping? – Ita Jun 04 '18 at 17:27
  • @Ita: No, that was an ad-hoc invention. The idea was: The angle on the cylinder should be ArcSin[x*c1+y*c2+c3] for some constants c1..c3, if we ignore perspective distortion. But fitting ArcSin didn't work well with FindMinimum, so I used a Taylor series that's close enough in the angle range I'm interested in, and real for all x and y. – Niki Estner Jun 04 '18 at 17:53
15

Here's my stab at it, using a cylindrical projection and TextureCoordinateFunction with a fitting parameter. Replace IMG in the code with the actual photo. The last command is a manual crop.

result = With[{para = 1.69},
  ParametricPlot3D[{Cos[u], Sin[u], v}, {u, 0, Pi}, {v, 0, Pi},
   PlotStyle -> Texture[ImageReflect[IMG, Left -> Right]],
   Mesh -> None, PlotRange -> All, ViewPoint -> Front,
   Lighting -> "Neutral",
   TextureCoordinateFunction -> ({#4, para Tan[#5]} &),
   AspectRatio -> para]]

Show[ImageTrim[Rasterize[result], {{45, 75}, {232, 165}}], AspectRatio -> 1]

cropped, unrolled label from the cylindrical projection

Here's a modified version of the above that uses a few more control parameters and gives a projection onto a more general elliptical cylinder. It also corrects the lighting.

result = ParametricPlot3D[{Cos[u], .21 Sin[0.9 u], v + .08 Sin[u - .05]},
  {u, 0, \[Pi]}, {v, 0, 1},
  PlotStyle -> Texture[ImageReflect[IMG, Left -> Right]],
  Mesh -> None, PlotRange -> All, ViewPoint -> Front,
  Lighting -> {{"Ambient", White}}];

Show[ImageTrim[Rasterize[result], {{50, 53}, {320, 120}}],
 AspectRatio -> 1, ImageSize -> 200]

result of the elliptical-cylinder projection

azdahak
  • I think the issue is that the photo was taken from above, and the vertical axis is offset from the picture by a small angle. If you can take that into account, possibly by examining the ellipse made by the top, then I'd think you could get a better projection. – rcollyer May 17 '12 at 16:46
  • Right. I made a modification above which uses a more general projection. Of course, it's still all hand-tuned. – azdahak May 17 '12 at 19:04
  • We could try to combine this with minimising the mismatch between corresponding points (found as shown in my answer), e.g. by producing the plots you have with the viewpoint etc. as input, rasterizing the output, and locating corresponding points. – acl May 17 '12 at 19:06
  • That's much better, there's still some curvature and distortion, but that was present already in the image. So, +1, for that. The values for the cylinder look hand tuned, can that be automated? – rcollyer May 17 '12 at 19:09
10

@nikie gave a very nice answer. This is a complement to it.

One remaining challenge is compensating for the distortion close to the left and right edges of the image, visible for example here (image taken from nikie's post):

Mathematica graphics

The magnitude of the distortion cannot be estimated in the general case without having some information about what's on the label. If the photo of the label is taken directly from the front, it will always appear as a rectangle. Therefore it's impossible to tell what angular region of the cylinder (jar) the label takes up simply by looking at its photo. It could be half a jar (180°) or it could be negligibly small (e.g. 30°).

One solution could be letting the user compensate for the distortion manually (as a person can judge the magnitude of the distortion from e.g. the text on the label). We could use a Manipulate. The challenge in implementing this was making the Manipulate responsive enough for practical use. ImageTransformation tends to be too slow, so I used ParametricPlot instead to get an approximation for the angle $2 \times \alpha$ taken up by the label.

img = Import["https://i.stack.imgur.com/9axDH.png"]

Manipulate[
 ParametricPlot[{ArcSin[u], v}, {u, -Sin[α], Sin[α]}, {v, 0, 1}, 
    PlotStyle -> {Opacity[1], Texture[img]},
    Mesh -> mesh, 
    AspectRatio -> Divide @@ Reverse@ImageDimensions[img], 
    Frame -> False, Axes -> False, BoundaryStyle -> None, 
    PlotPoints -> {ControlActive[30, 300], 2}, Exclusions -> None],
 {α, 0.1, Pi/2}, {{mesh, False}, {True, False}}]

Mathematica graphics

Note: I assumed parallel projection to keep things simple.

After finding an approximation for $\alpha$ by dragging the slider and looking at the output, we can compute a better quality "undistorted" image using ImageTransformation:

α = 1.235;

ImageTransformation[img, 
 Function[p, {(Sin[Rescale[p[[1]], {0, 1}, α {-1, 1}]] + Sin[α])/(2 Sin[α]), p[[2]]}]]

Mathematica graphics

Szabolcs
9

This approach looks promising:

labelphoto = Import["/Users/acl/Desktop/labelimage.png"]
label = Import["/Users/acl/Desktop/FMALS.gif"]

Mathematica graphics

and then locate corresponding points as follows:

images = {label, labelphoto};
matches = ImageCorrespondingPoints @@ images
MapThread[
 Show[#1, Graphics[{Red, 
     MapIndexed[Inset[#2[[1]], #1] & , #2]}]] &, {images, matches}]

Mathematica graphics

So now all we need is a routine to do arbitrary rotations of an image projected onto the surface of a cylinder, then optimize matching of points with respect to cylinder radius, viewpoint, rotation etc. (and it would be nice to have more than 3 points, but that would probably be the case for larger images).
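
Here is a rough sketch of that optimization (my addition, untested; the model, parameter names and start values are all made up for illustration), assuming an orthographic cylinder with no tilt or perspective:

{labelPts, photoPts} = matches;  (* corresponding points in the flat label and in the photo *)

(* model: the flat label is wrapped around a cylinder of radius r centered at x = cx,
   seen in orthographic projection; a converts label pixels to angle, b is the vertical scale *)
project[{u_, v_}] := {cx + r Sin[a (u - u0)], b v + dy};

residual = Total[#.# & /@ (project /@ labelPts - photoPts)];

fit = FindMinimum[residual,
  {{r, 100}, {cx, 150}, {a, 0.01}, {u0, Mean[labelPts][[1]]}, {b, 0.5}, {dy, 0}}]

With only the three matches found above this is barely constrained, so in practice you'd want more correspondences (and extra terms for tilt and perspective) before trusting the fit.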

As an aside, it sounds like FindGeometricTransform should do what we want, but it's not general enough as it is:

timage = ImageTransformation[label, Sqrt];
images = {label, timage};
matches = ImageCorrespondingPoints @@ images;
MapThread[
 Show[#1, Graphics[{Red, 
     MapIndexed[Inset[#2[[1]], #1] & , #2]}]] &, {images, matches}]
{terr, t} = FindGeometricTransform[label, timage];
ImagePerspectiveTransformation[ImageTransformation[label, Sqrt], t, 
 DataRange -> Full]

Mathematica graphics

So it'll take some more work.

acl