Photometric Stereo

👁️ Computer Vision 🟡 Intermediate 👁 1 views

📖 Quick Definition

A computer vision technique that reconstructs 3D surface shapes by analyzing how an object reflects light under varying illumination directions.

## What is Photometric Stereo? Photometric stereo is a classic yet powerful technique in computer vision used to recover the detailed 3D geometry of an object’s surface. Unlike standard stereo vision, which relies on multiple camera viewpoints to triangulate depth, photometric stereo uses a single fixed camera and varies the position of the light sources. By capturing several images of the same static scene from different lighting angles, the algorithm can deduce the orientation of every pixel on the object's surface. Think of it like running your fingers over a textured piece of paper in the dark versus under a bright lamp. In the dark, you feel the bumps and valleys (depth). Under a single light, you see shadows but might miss fine details. However, if you move the light around while keeping your eyes still, the changing highlights and shadows reveal the precise shape of the texture. Photometric stereo automates this process mathematically, converting changes in brightness into precise surface normals—the vectors perpendicular to the surface at each point. This method is particularly effective for objects with Lambertian reflectance, meaning they scatter light equally in all directions (like matte paint or skin), rather than shiny surfaces that create specular highlights. It allows for high-resolution reconstruction without the need for expensive depth sensors or complex multi-camera rigs. ## How Does It Work? The core principle relies on the relationship between light intensity, surface orientation, and material properties. For a Lambertian surface, the observed brightness $I$ at a specific pixel is determined by the dot product of the light direction vector $\mathbf{l}$ and the surface normal vector $\mathbf{n}$, scaled by the albedo (reflectivity) $\rho$: $$ I = \rho (\mathbf{n} \cdot \mathbf{l}) $$ To solve for the unknowns ($\mathbf{n}$ and $\rho$), we need more equations than variables. If we take three images of the same object using three different known light directions, we create a system of linear equations. In matrix form, this looks like: $$ \begin{bmatrix} I_1 \\ I_2 \\ I_3 \end{bmatrix} = \begin{bmatrix} l_{1x} & l_{1y} & l_{1z} \\ l_{2x} & l_{2y} & l_{2z} \\ l_{3x} & l_{3y} & l_{3z} \end{bmatrix} \begin{bmatrix} n_x \\ n_y \\ n_z \end{bmatrix} \rho $$ By inverting the light direction matrix, we can solve for the scaled normal vector $\mathbf{g} = \rho \mathbf{n}$. Once we have $\mathbf{g}$, we normalize it to get the unit surface normal $\mathbf{n}$, and the magnitude gives us the albedo $\rho$. Finally, integrating these normals across the entire image yields the 3D height map of the surface. ## Real-World Applications * **Industrial Quality Control**: Detecting microscopic scratches, dents, or printing defects on flat surfaces like silicon wafers, metal sheets, or packaging materials where standard lighting fails to reveal subtle irregularities. * **Digital Heritage Preservation**: Creating highly detailed 3D models of artifacts, coins, or ancient manuscripts with intricate engravings without touching or damaging the fragile objects. * **Face Recognition and Analysis**: Enhancing facial recognition systems by separating identity-specific features (shape/albedo) from lighting conditions, making recognition robust against varying environmental lights. * **Augmented Reality (AR)**: Generating realistic relighting effects for virtual objects placed in real-world scenes by understanding the underlying geometry of the physical environment. ## Key Takeaways * **Single Camera, Multiple Lights**: The technique requires only one camera but multiple images taken under different, known lighting positions. * **Surface Normals First**: It primarily computes surface normals (directions) rather than direct depth, which are then integrated to form a 3D shape. * **Lambertian Assumption**: It works best on matte, non-shiny surfaces; shiny or transparent objects require more complex variations of the technique. * **High Resolution**: It can achieve much higher spatial resolution than traditional stereo vision because it operates at the pixel level rather than relying on feature matching. ## 🔥 Gogo's Insight **Why It Matters**: In an era dominated by deep learning, photometric stereo remains relevant because it provides ground-truth geometric data efficiently. While neural networks can guess depth from a single image, they often lack physical accuracy. Photometric stereo offers a physically grounded, deterministic solution that is computationally lightweight compared to training large 3D reconstruction models. **Common Misconceptions**: Many assume this technique works on any object. In reality, standard photometric stereo fails on specular (shiny) or translucent surfaces because the simple Lambertian model doesn't account for reflected highlights or subsurface scattering. Specialized variants are needed for those cases. **Related Terms**: * **Shape from Shading**: A related but harder problem that estimates shape from a *single* image, requiring strong assumptions about lighting and shape. * **Albedo Estimation**: The process of determining the intrinsic color/reflectivity of a surface, independent of lighting. * **Normal Map**: A texture map used in rendering that stores surface normal information, often the direct output of photometric stereo algorithms.

🔗 Related Terms

← PersonalizationPhotonic Acceleration →

🤖 See AI tools in action

Explore real-world applications and compare AI tools

AI Use Cases → Compare Tools →