# 3D Human Modeling: Parametric Body Models, Mesh Recovery, and Digital Avatars

## TL;DR
3D human modeling reconstructs the human body in three dimensions from images and video, enabling virtual try-on, markerless motion capture, and realistic digital avatars. Built on parametric body models such as SMPL and deep learning-based mesh recovery, the field has moved from laboratory multi-camera setups to methods that work from a single smartphone photo.

## Core Explanation
The problem: given an image or video of a person, recover their 3D body shape and pose. The key distinction from 2D pose estimation is the output: 3D modeling produces a complete 3D mesh (6890 vertices for SMPL), or the model parameters that generate it, which can be animated and rendered from any viewpoint.

The SMPL model has two parameter sets. Shape parameters (beta) are typically 10 PCA coefficients learned from thousands of body scans, capturing height, weight, and proportions. Pose parameters (theta) are 23 joint rotations plus 1 global orientation, each stored as a 3D axis-angle vector, for 72 values in total. The model is differentiable, so it can be optimized via gradient descent and integrated into deep learning pipelines.
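To make the parameter layout concrete, here is a minimal NumPy sketch of the shapes involved. It is not the SMPL implementation: the template mesh and shape basis are random placeholders (the real ones come from the SMPL model file), the names `rodrigues` and `shaped_vertices` are hypothetical, and pose-dependent corrections and skinning are left out entirely.

```python
import numpy as np

NUM_VERTS = 6890   # SMPL mesh resolution
NUM_BETAS = 10     # shape coefficients
NUM_JOINTS = 24    # 23 body joints + 1 global (root) orientation


def rodrigues(axis_angle):
    """Convert (N, 3) axis-angle vectors to (N, 3, 3) rotation matrices."""
    axis_angle = np.asarray(axis_angle, dtype=np.float64)
    angle = np.linalg.norm(axis_angle, axis=1, keepdims=True)   # (N, 1)
    axis = axis_angle / np.maximum(angle, 1e-12)                # unit axis, or 0 for no rotation
    x, y, z = axis[:, 0], axis[:, 1], axis[:, 2]
    zeros = np.zeros_like(x)
    # Skew-symmetric cross-product matrix K for each axis.
    K = np.stack([zeros, -z, y, z, zeros, -x, -y, x, zeros], axis=1).reshape(-1, 3, 3)
    s = np.sin(angle)[:, :, None]
    c = np.cos(angle)[:, :, None]
    return np.eye(3)[None] + s * K + (1.0 - c) * (K @ K)


def shaped_vertices(template, shape_dirs, betas):
    """Linear shape blend: template (V, 3) + shape_dirs (V, 3, B) @ betas (B,)."""
    return template + shape_dirs @ betas


# Placeholder data standing in for the real model file (assumption).
rng = np.random.default_rng(0)
template = rng.normal(size=(NUM_VERTS, 3))
shape_dirs = 0.01 * rng.normal(size=(NUM_VERTS, 3, NUM_BETAS))

betas = np.zeros(NUM_BETAS)        # mean body shape
theta = np.zeros(NUM_JOINTS * 3)   # 72 values: rest pose

rotations = rodrigues(theta.reshape(NUM_JOINTS, 3))   # (24, 3, 3)
verts = shaped_vertices(template, shape_dirs, betas)  # (6890, 3)
```

Every step is a composition of differentiable operations (linear blends, sines, cosines, matrix products), which is what allows gradients to flow from the mesh back to beta and theta.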

## Detailed Analysis
HMR (2018): a CNN encoder extracts image features, and an iterative 3D regression module predicts SMPL parameters from them. SPIN (2019): alternates between regression and optimization. The network's prediction initializes SMPLify, which optimizes SMPL parameters to fit 2D keypoints, and the optimized fit in turn supervises the network. This self-improving loop boosts accuracy. PARE (2021): a part-guided attention mechanism learns which body parts are visible and which are occluded, so predictions for occluded parts can draw on visible regions. TokenHMR (2024): transformer-based; represents body pose as discrete tokens from a learned codebook and predicts those tokens from image features.

Applications: virtual try-on (Zalando, Amazon), markerless motion capture (Move AI, Plask), fitness form analysis from single-camera video, and AR/VR avatars from a selfie.

Key limitations: clothing (SMPL models the minimally clothed body; clothed bodies require separate models such as CAPE or SCARF) and monocular depth ambiguity. Single-view 3D reconstruction is fundamentally ill-posed, because many different 3D bodies project to the same 2D image.
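The depth ambiguity can be illustrated with a small sketch. It assumes a simple pinhole camera with a made-up focal length and principal point, and it uses a plain mean squared reprojection error as a simplified stand-in for the 2D keypoint term that optimization-based fitting minimizes; the function names are hypothetical.

```python
import numpy as np


def project(points_3d, focal=1000.0, center=(512.0, 512.0)):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixel coordinates."""
    points_3d = np.asarray(points_3d, dtype=np.float64)
    xy = points_3d[:, :2] / points_3d[:, 2:3]   # perspective divide by depth
    return focal * xy + np.asarray(center)


def reprojection_error(points_3d, keypoints_2d):
    """Mean squared pixel distance between projected joints and 2D keypoints."""
    diff = project(points_3d) - keypoints_2d
    return float(np.mean(np.sum(diff ** 2, axis=1)))


# Three illustrative joints roughly 3 m in front of the camera (made-up values).
joints = np.array([[0.0, -0.8, 3.0],
                   [0.2,  0.0, 3.1],
                   [-0.2, 0.9, 2.9]])
observed = project(joints)

# A body twice as large and twice as far away yields the same 2D keypoints.
scaled = 2.0 * joints
error_scaled = reprojection_error(scaled, observed)

# Changing only the depth of the same body does change the image.
shifted = joints + np.array([0.0, 0.0, 1.0])
error_shifted = reprojection_error(shifted, observed)
```

Here `error_scaled` is zero up to floating-point precision while `error_shifted` is not, so 2D evidence alone cannot separate body size from distance. This is one reason methods lean on learned priors over shape and pose.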