Making 3-D imaging 1,000 times better | MIT News

Algorithms exploiting light’s polarization boost resolution of commercial depth sensors 1,000-fold.

Larry Hardesty | MIT News Office
December 1, 2015

MIT researchers have shown that by exploiting the polarization of light — the physical phenomenon behind polarized sunglasses and most 3-D movie systems — they can increase the depth resolution of conventional 3-D imaging devices as much as 1,000 times.

The technique could lead to high-quality 3-D cameras built into cellphones, and perhaps to the ability to snap a photo of an object and then use a 3-D printer to produce a replica.

Further out, the work could also abet the development of driverless cars.

“Today, they can miniaturize 3-D cameras to fit on cellphones,” says Achuta Kadambi, a PhD student in the MIT Media Lab and one of the system’s developers. “But they make compromises to the 3-D sensing, leading to very coarse recovery of geometry. That’s a natural application for polarization, because you can still use a low-quality sensor, and adding a polarizing filter gives you something that’s better than many machine-shop laser scanners.”

The researchers describe the new system, which they call Polarized 3D, in a paper they’re presenting at the International Conference on Computer Vision in December. Kadambi is the first author, and he’s joined by his thesis advisor, Ramesh Raskar, associate professor of media arts and sciences in the MIT Media Lab; Boxin Shi, who was a postdoc in Raskar’s group and is now a research fellow at the Rapid-Rich Object Search Lab; and Vage Taamazyan, a master’s student at the Skolkovo Institute of Science and Technology in Russia, which MIT helped found in 2011.

When polarized light gets the bounce

If an electromagnetic wave can be thought of as an undulating squiggle, polarization refers to the squiggle’s orientation. It could be undulating up and down, or side to side, or somewhere in-between.

Polarization also affects the way in which light bounces off of physical objects. If light strikes an object squarely, much of it will be absorbed, but whatever reflects back will have the same mix of polarizations that the incoming light did. At wider angles of reflection, however, light within a certain range of polarizations is more likely to be reflected.

This is why polarized sunglasses are good at cutting out glare: Light from the sun bouncing off asphalt or water at a low angle features an unusually heavy concentration of light with a particular polarization. So the polarization of reflected light carries information about the geometry of the objects it has struck.
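As a rough illustration of why glancing reflections are so strongly polarized, the short Python sketch below evaluates the textbook Fresnel reflection coefficients for an air-to-water interface. The refractive indices and sample angles are illustrative assumptions, not values from the paper.

```python
import numpy as np

def fresnel_reflectance(theta_i_deg, n1=1.0, n2=1.33):
    """Fresnel power reflectance for s- and p-polarized light.

    theta_i_deg : angle of incidence in degrees (0 = head-on)
    n1, n2      : refractive indices (defaults: air into water)
    """
    theta_i = np.radians(theta_i_deg)
    theta_t = np.arcsin(n1 * np.sin(theta_i) / n2)   # Snell's law
    rs = (n1 * np.cos(theta_i) - n2 * np.cos(theta_t)) / \
         (n1 * np.cos(theta_i) + n2 * np.cos(theta_t))
    rp = (n1 * np.cos(theta_t) - n2 * np.cos(theta_i)) / \
         (n1 * np.cos(theta_t) + n2 * np.cos(theta_i))
    return rs**2, rp**2

for angle in (0, 30, 53, 75):        # 53 degrees is near Brewster's angle for water
    Rs, Rp = fresnel_reflectance(angle)
    print(f"{angle:2d} deg:  Rs = {Rs:.3f}   Rp = {Rp:.3f}")
# Head-on, both polarizations reflect equally; at grazing angles the
# s-polarized component dominates -- the glare that polarized sunglasses cut.
```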

This relationship has been known for centuries, but it's been hard to do anything with it, because of a fundamental ambiguity about polarized light. Light with a particular polarization, reflecting off of a surface with a particular orientation and passing through a polarizing lens, is indistinguishable from light with the opposite polarization reflecting off of a surface with the opposite orientation.

This means that for any surface in a visual scene, measurements based on polarized light offer two equally plausible hypotheses about its orientation. Canvassing all the possible combinations of either of the two orientations of every surface, in order to identify the one that makes the most sense geometrically, is a prohibitively time-consuming computation.
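The ambiguity is easy to see in a toy calculation. Under a rotating polarizer, the measured intensity depends on the surface's polarization azimuth only through a term of the form cos(2(φ_pol − φ)), so an azimuth of φ and one flipped by 180 degrees produce identical readings. The sketch below is my own illustration of that point, not the authors' code.

```python
import numpy as np

def polarizer_intensity(phi_pol, azimuth, dop=0.4, i_un=1.0):
    """Intensity behind a linear polarizer at angle phi_pol (radians) for
    partially polarized light with polarization azimuth `azimuth` and
    degree of polarization `dop` (the classic transmitted-radiance sinusoid)."""
    return 0.5 * i_un * (1.0 + dop * np.cos(2.0 * (phi_pol - azimuth)))

pol_angles = np.radians([0, 45, 90])   # three filter rotations, as in the experiment
azimuth = np.radians(30)               # one candidate surface azimuth
flipped = azimuth + np.pi              # the 180-degree-flipped alternative

print(polarizer_intensity(pol_angles, azimuth))
print(polarizer_intensity(pol_angles, flipped))
# Both rows are identical: from the polarization measurements alone,
# the two surface orientations cannot be told apart.
```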

Polarization plus depth sensing

To resolve this ambiguity, the Media Lab researchers use coarse depth estimates provided by some other method, such as the time a light signal takes to reflect off of an object and return to its source. Even with this added information, calculating surface orientation from measurements of polarized light is complicated, but it can be done in real-time by a graphics processing unit, the type of special-purpose graphics chip found in most video game consoles.
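One simple way to use the coarse depth, sketched below under the assumption that a rough surface normal can be extracted from the depth map's gradients, is to keep whichever of the two polarization-consistent normals agrees better with that rough normal. This is a simplified stand-in for the general idea, not the authors' exact algorithm.

```python
import numpy as np

def normal_from_angles(azimuth, zenith):
    """Unit surface normal from polarization azimuth and zenith angles (radians)."""
    return np.array([np.sin(zenith) * np.cos(azimuth),
                     np.sin(zenith) * np.sin(azimuth),
                     np.cos(zenith)])

def disambiguate(azimuth, zenith, coarse_normal):
    """Choose between the two polarization-consistent normals by comparing
    each against a coarse normal taken from the depth sensor."""
    candidates = [normal_from_angles(azimuth, zenith),
                  normal_from_angles(azimuth + np.pi, zenith)]
    scores = [float(np.dot(c, coarse_normal)) for c in candidates]
    return candidates[int(np.argmax(scores))]

# Toy example: a noisy Kinect-style normal agrees better with the flipped candidate.
coarse = np.array([-0.45, -0.30, 0.85])
coarse /= np.linalg.norm(coarse)
print(disambiguate(np.radians(30), np.radians(40), coarse))
```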

The researchers’ experimental setup consisted of a Microsoft Kinect — which gauges depth using reflection time — with an ordinary polarizing photographic lens placed in front of its camera. In each experiment, the researchers took three photos of an object, rotating the polarizing filter each time, and their algorithms compared the light intensities of the resulting images.
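The article doesn't specify the rotation angles or the arithmetic, but a standard way to turn three polarizer-rotated exposures into per-pixel polarization information is the Stokes-parameter calculation below, assuming filter angles of 0, 45, and 90 degrees.

```python
import numpy as np

def polarization_from_three(i0, i45, i90):
    """Per-pixel degree and angle of linear polarization from three images
    taken through a polarizer at 0, 45, and 90 degrees (standard Stokes math)."""
    s0 = i0 + i90                    # total intensity
    s1 = i0 - i90                    # 0-vs-90-degree preference
    s2 = 2.0 * i45 - i0 - i90        # 45-vs-135-degree preference
    dop = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-9)
    aop = 0.5 * np.arctan2(s2, s1)   # polarization angle, radians
    return dop, aop

# Synthetic check against a known ground truth.
true_aop, true_dop = np.radians(25.0), 0.3
i0, i45, i90 = (0.5 * (1 + true_dop * np.cos(2 * (a - true_aop)))
                for a in np.radians([0, 45, 90]))
dop, aop = polarization_from_three(i0, i45, i90)
print(dop, np.degrees(aop))          # ~0.3 and ~25 degrees
```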

On its own, at a distance of several meters, the Kinect can resolve physical features as small as a centimeter or so across. But with the addition of the polarization information, the researchers’ system could resolve features in the range of tens of micrometers, or one-thousandth the size (a centimeter is 10,000 micrometers, so a thousandfold improvement in depth resolution corresponds to features roughly 10 micrometers across).

For comparison, the researchers also imaged several of their test objects with a high-precision laser scanner, which requires that the object be inserted into the scanner bed. Polarized 3D still offered the higher resolution.

A mechanically rotated polarization filter would probably be impractical in a cellphone camera, but grids of tiny polarization filters that can overlay individual pixels in a light sensor are commercially available. Capturing three pixels’ worth of light for each image pixel would reduce a cellphone camera’s resolution, but no more than the color filters that existing cameras already use.
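For a sense of what a per-pixel filter grid implies for the software, the toy routine below splits a raw capture into one image per filter angle. The repeating 2×2 pattern and the specific angles are assumptions for illustration, analogous to how a Bayer color mosaic is split into channels.

```python
import numpy as np

def split_polarizer_mosaic(raw):
    """Split a raw image whose pixels sit behind a repeating 2x2 grid of
    micro-polarizers (assumed angles: 0, 45, 90, 135 degrees) into four
    quarter-resolution images, one per filter angle."""
    return {
        0:   raw[0::2, 0::2],
        45:  raw[0::2, 1::2],
        90:  raw[1::2, 1::2],
        135: raw[1::2, 0::2],
    }

raw = np.arange(16.0).reshape(4, 4)          # stand-in for a raw capture
channels = split_polarizer_mosaic(raw)
print({angle: img.shape for angle, img in channels.items()})
# Each channel has half the resolution in each dimension -- the trade-off the
# article compares to the color-filter arrays existing cameras already use.
```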

The new paper also offers the tantalizing prospect that polarization systems could aid the development of self-driving cars. Today’s experimental self-driving cars are, in fact, highly reliable under normal illumination conditions, but their vision algorithms go haywire in rain, snow, or fog. That’s because water particles in the air scatter light in unpredictable ways, making it much harder to interpret.

The MIT researchers show that in some very simple test cases — which have nonetheless bedeviled conventional computer vision algorithms — their system can exploit information contained in interfering waves of light to handle scattering. “Mitigating scattering in controlled scenes is a small step,” Kadambi says. “But that’s something that I think will be a cool open problem.”

“The work fuses two 3-D sensing principles, each having pros and cons,” says Yoav Schechner, an associate professor of electrical engineering at Technion — Israel Institute of Technology in Haifa, Israel. “One principle provides the range for each scene pixel: This is the state of the art of most 3-D imaging systems. The second principle does not provide range. On the other hand, it derives the object slope, locally. In other words, per scene pixel, it tells how flat or oblique the object is.”

“The work uses each principle to solve problems associated with the other principle,” Schechner explains. “Because this approach practically overcomes ambiguities in polarization-based shape sensing, it can lead to wider adoption of polarization in the toolkit of machine-vision engineers.”
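Schechner's point, that range data and slope data patch each other's weaknesses, can be illustrated with a one-dimensional toy: integrating slopes recovers fine relative shape but not absolute position, while coarse depth samples supply the absolute position but not the detail. The least-squares sketch below is a simplified illustration of that fusion idea, not the authors' reconstruction algorithm.

```python
import numpy as np

def fuse_depth_and_slope(coarse_depth, slopes, dx, w_depth=0.1):
    """1-D fusion toy: solve for a profile z whose finite differences match
    the fine (but relative) slopes while staying near the coarse (but
    absolute) depth samples, as a small linear least-squares problem."""
    n = len(coarse_depth)
    rows, rhs = [], []
    for i in range(n - 1):                   # slope constraints: z[i+1] - z[i] = s[i] * dx
        r = np.zeros(n)
        r[i], r[i + 1] = -1.0, 1.0
        rows.append(r)
        rhs.append(slopes[i] * dx)
    for i in range(n):                       # weak pull toward the coarse depth
        r = np.zeros(n)
        r[i] = w_depth
        rows.append(r)
        rhs.append(w_depth * coarse_depth[i])
    z, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return z

x = np.linspace(0.0, 1.0, 50)
dx = x[1] - x[0]
true_z = 2.0 + 0.05 * np.sin(8 * x)          # fine ripples on a flat surface
coarse = np.round(true_z, 1)                 # "centimeter-level" depth: ripples vanish
slopes = np.diff(true_z) / dx                # ideal slopes, standing in for polarization normals
fused = fuse_depth_and_slope(coarse, slopes, dx)
print(np.abs(coarse - true_z).max(), np.abs(fused - true_z).max())
# The fused profile recovers the ripples the coarse depth alone cannot see.
```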

Reader comments

Thank you for the most informative news! Hats off to the MIT researchers for this work on 3-D imaging. (Geethapriya, SGBC India, Online Digital Revolution)

Now make it an app that works on tablets/phones with regular cameras and give it out free to make it the next great revolution since fire and the wheel. Thanks!

I would love to see that Kinect software open-sourced!

Does it work with texture-rich surfaces?

Why not just put an LCD in front of the sensor to rotate the polarization?

Bravo! Keep breaking barriers and innovating. Just amazing!

Can polarization be used to improve medical imaging like MRIs?

Does this work on texture- or color-rich surfaces?

I believe "texture" introduces some "noise" into the polarization of the reflected light. As the article mentions, the detected polarization feeds into estimating the local slope of the object's surface, "how flat or oblique the object is." Polarization largely ignores surface coloration, although the effect of metal surfaces on reflectance suggests that some types of surface chemistry might be pertinent. Kinect SDKs (software development kits) and their surrounding tools, by the way, have been available for some time now.

I'd also love to see the Kinect software open-sourced. 1,000x clearer 3-D scanning will be great for 1,000x clearer printing!

3-D projectors, please. What's taking so long?

"they can increase the resolution of conventional 3-D imaging devices as much as 1,000 times"and"Kinect can resolve ... as small as a centimeter ..., the researchers’ system could resolve features in the range of tens of micrometers"----First of all, one centimeter divided by 1000 would be tens of milimeters, not micrometers.

Second, it says they managed to multiply 1000 times the 3D resolution, not a single dimension, what means rather that it is multiplied 10 times per dimension.

Or am I missing something?

We did this 3 years ago using the Microsoft Kinect and open source software. Here's a link to the abstract.

http://scitation.aip.org/conte...

