## TL;DR
AI is reshaping retail end to end: cashierless stores let shoppers pick up items and walk out, dynamic pricing adjusts prices in near real time, and virtual try-on shows how clothes look on a shopper's body before purchase. Computer vision, reinforcement learning, and recommendation systems together make shopping more seamless, more personalized, and increasingly automated.
## Core Explanation
The retail AI stack has four layers:

1. In-store: computer vision for inventory tracking (shelf cameras detect out-of-stock items), theft detection, and cashierless checkout (Amazon Just Walk Out).
2. Pricing: dynamic pricing algorithms adjust prices based on supply and demand, scraped competitor prices, and customer willingness to pay. Reinforcement learning (RL) is used for markdown optimization, learning the best discount trajectory for seasonal items (initial price, then successive markdowns, then clearance).
3. Personalization: recommendation systems suggest products based on purchase and browsing history (collaborative filtering combined with content-based methods). Visual search lets a shopper upload a photo and find similar products. Virtual try-on overlays clothing on the user's body with AR.
4. Supply chain: demand forecasting (per SKU, per store) drives inventory allocation and replenishment.
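
To make the collaborative filtering idea in the personalization layer concrete, here is a minimal sketch of item-based collaborative filtering on binary purchase data. The users, products, and the `recommend` helper are hypothetical and chosen only for illustration; production recommenders typically combine this kind of signal with content-based features and far larger, sparser data.

```python
from collections import defaultdict
from math import sqrt

# Toy purchase history (hypothetical data): user -> set of purchased product IDs.
PURCHASES = {
    "alice": {"jeans", "sneakers", "t-shirt"},
    "bob": {"jeans", "sneakers"},
    "carol": {"jeans", "t-shirt", "jacket"},
    "dave": {"sneakers", "socks"},
}


def item_similarities(purchases):
    """Cosine similarity between items, treating each item as a binary buyer vector."""
    buyers = defaultdict(set)  # item -> set of users who bought it
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    sims = defaultdict(dict)
    for a in buyers:
        for b in buyers:
            if a != b:
                overlap = len(buyers[a] & buyers[b])
                if overlap:
                    sims[a][b] = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
    return sims


def recommend(user, purchases, top_n=2):
    """Rank items the user has not bought by summed similarity to items they have."""
    sims = item_similarities(purchases)
    owned = purchases[user]
    scores = defaultdict(float)
    for item in owned:
        for other, sim in sims[item].items():
            if other not in owned:
                scores[other] += sim
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("bob", PURCHASES))  # ['t-shirt', 'jacket']
```

Bob is recommended the t-shirt first because both of his purchases (jeans and sneakers) were also bought by people who bought t-shirts.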
## Detailed Analysis
Amazon Just Walk Out: overhead cameras track each shopper using multi-camera 3D pose tracking, while weight sensors on shelves detect when items are removed. The system fuses camera and weight data to associate each item removal with a specific shopper. Deep learning handles occlusion (shoppers blocking a camera's view) and crowded scenes. Privacy: the system does not use facial recognition; it relies on body appearance features and device association instead.

Dynamic pricing: RL-based approaches (Q-learning, DQN) model pricing as a sequential decision process. State: current inventory, time to end of season, competitor prices, demand forecast. Action: set the price. Reward: revenue. Constraints: minimum margin and brand price image. Fast fashion (Zara, H&M) and grocery (dynamic markdowns on perishables) are primary adopters.

Visual search: CLIP-based embeddings map product images and text descriptions into a shared space, so a query photo can be matched to catalog items by nearest-neighbor search.

Virtual try-on: VITON-HD and DCTON are GAN-based models that generate realistic try-on images; more recent methods use diffusion models.
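
The dynamic pricing formulation above (state, action, reward) can be sketched with tabular Q-learning on a toy markdown problem. Everything numeric here is an assumption made for illustration: the three-step price ladder, the per-unit purchase probabilities, the four-week season, and the hyperparameters. The state is reduced to (weeks left, stock), so competitor prices and demand forecasts are omitted, and the lowest rung of the ladder stands in for the minimum-margin constraint.

```python
import random

PRICES = [10.0, 7.0, 4.0]     # hypothetical ladder: full price, markdown, clearance floor
SELL_PROB = [0.2, 0.5, 0.9]   # assumed weekly purchase probability per unit at each price
WEEKS, START_STOCK = 4, 3     # toy season length and initial inventory


def step(stock, action, rng):
    """Simulate one week: each unit in stock sells independently at the chosen price."""
    sold = sum(rng.random() < SELL_PROB[action] for _ in range(stock))
    return stock - sold, sold * PRICES[action]  # (next stock, revenue as reward)


def train(episodes=20000, alpha=0.05, gamma=1.0, epsilon=0.1, seed=0):
    """Tabular Q-learning over states (weeks_left, stock) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(w, s): [0.0] * len(PRICES)
         for w in range(WEEKS + 1) for s in range(START_STOCK + 1)}
    for _ in range(episodes):
        weeks_left, stock = WEEKS, START_STOCK
        while weeks_left > 0 and stock > 0:
            state = (weeks_left, stock)
            if rng.random() < epsilon:
                action = rng.randrange(len(PRICES))          # explore
            else:
                action = max(range(len(PRICES)), key=lambda a: q[state][a])  # exploit
            next_stock, reward = step(stock, action, rng)
            next_state = (weeks_left - 1, next_stock)
            # Terminal states are never updated, so their Q stays 0:
            # units left unsold at season end are worth nothing.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            weeks_left, stock = next_state
    return q


def best_price(q, weeks_left, stock):
    """Greedy price for a state under the learned Q-table."""
    values = q[(weeks_left, stock)]
    return PRICES[max(range(len(PRICES)), key=lambda a: values[a])]


if __name__ == "__main__":
    q = train()
    for weeks_left in range(WEEKS, 0, -1):
        print(weeks_left, "weeks left ->", best_price(q, weeks_left, START_STOCK))
```

With these toy numbers, exact dynamic programming favors full price early in the season and deeper discounts as the end approaches. The learned table should approximate that trajectory, though it can differ in states where two prices have nearly equal value. A DQN replaces the table with a neural network when the state includes continuous features such as competitor prices.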