Text-to-SQL: Natural Language Database Querying with Large Language Models

## TL;DR
Text-to-SQL (NL2SQL) translates natural language questions into executable SQL queries, enabling anyone in an organization to query databases without knowing SQL. With LLMs, the technology has moved from lab benchmarks to production deployments, handling complex multi-table joins, nested subqueries, and domain-specific business logic from plain English questions.

## Core Explanation
The Text-to-SQL problem: given a natural language question ("What were the top 5 products by revenue in Q3 2024?") and a database schema (tables, columns, relationships), generate a syntactically valid SQL query whose result answers the question. Key challenges: (1) Schema linking -- mapping natural language terms to the correct table and column names ("revenue" might map to "sales_amount" in the "transactions" table); (2) Complex SQL constructs -- handling GROUP BY, HAVING, nested subqueries, and JOINs across multiple tables; (3) Value disambiguation -- "Q3 2024" must be translated to the correct date range (July 1 through September 30, 2024, if quarters follow the calendar year); (4) Domain-specific terminology -- business jargon must be mapped to technical schema elements.
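To make the example question concrete, here is a minimal sketch of the SQL a system would need to produce, run against an in-memory SQLite database. The `transactions` table, its `product`, `sales_amount`, and `sale_date` columns, and the sample rows are all assumptions made for illustration, and "Q3" is assumed to mean the calendar quarter.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (product TEXT, sales_amount REAL, sale_date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("widget", 120.0, "2024-07-15"),
        ("widget", 80.0, "2024-09-30"),
        ("gadget", 150.0, "2024-08-01"),
        ("gizmo", 500.0, "2024-06-30"),  # Q2: must be excluded
        ("gizmo", 40.0, "2024-10-01"),   # Q4: must be excluded
    ],
)

# Question: "What were the top 5 products by revenue in Q3 2024?"
# Schema linking: "revenue" -> SUM(transactions.sales_amount).
# Value disambiguation: "Q3 2024" -> half-open range [2024-07-01, 2024-10-01).
TARGET_SQL = """
SELECT product, SUM(sales_amount) AS revenue
FROM transactions
WHERE sale_date >= '2024-07-01' AND sale_date < '2024-10-01'
GROUP BY product
ORDER BY revenue DESC
LIMIT 5
"""

top_products = conn.execute(TARGET_SQL).fetchall()
print(top_products)
```

The half-open date range is one reasonable choice: it keeps September 30 in scope even when the column also stores a time of day, which a closed `BETWEEN` on dates alone could miss.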

## Detailed Analysis
LLM-based approaches (2023-present): (1) Few-shot prompting -- provide the schema and a few example (question, SQL) pairs in the prompt. Simple, but large schemas can exceed the context window; (2) Decomposition -- break a complex question into sub-questions, generate SQL for each, and combine the pieces; (3) Self-correction -- generate SQL, execute it, inspect the outcome for errors, and retry.

The Nature 2026 framework demonstrates a robust pipeline: schema representation encoding (embedding database metadata), retrieval-augmented schema linking (retrieving the relevant table and column descriptions), SQL generation with chain-of-thought reasoning, and execution-based self-correction (execute the candidate SQL, check syntax and result cardinality, and regenerate if a check fails).

Benchmarks: Spider (standard, ~10K questions), BIRD (challenging, external knowledge required), WikiSQL (simple single-table), and SParC (interactive, multi-turn). Current SOTA: GPT-4 with self-correction achieves ~86% on Spider and ~65% on BIRD.

Production tools: Dataherald, Aito, and commercial offerings from Snowflake (Cortex Analyst), Databricks (Genie), and Google BigQuery (Data QnA) are bringing NL2SQL to enterprise data warehouses.
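As an illustration of the few-shot prompting and execution-based self-correction ideas in this section, the sketch below builds a prompt from a schema and example pairs, executes the generated SQL, and feeds any error back into the next attempt. It is not the pipeline of any specific framework or product: `build_prompt`, `generate_with_self_correction`, and the stand-in `fake_model` (which replaces a real LLM call) are hypothetical names, and treating an empty result as a failure is a simplifying assumption, since an empty result can be a legitimate answer.

```python
import sqlite3
from typing import Callable, Optional


def build_prompt(schema: str, examples: list, question: str,
                 feedback: Optional[str] = None) -> str:
    """Few-shot prompt: schema, example (question, SQL) pairs, then the question."""
    parts = ["Database schema:", schema, ""]
    for example_question, example_sql in examples:
        parts += [f"Question: {example_question}", f"SQL: {example_sql}", ""]
    if feedback:
        parts += [f"Previous attempt failed: {feedback}", ""]
    parts += [f"Question: {question}", "SQL:"]
    return "\n".join(parts)


def generate_with_self_correction(conn: sqlite3.Connection,
                                  generate_sql: Callable[[str], str],
                                  schema: str, examples: list, question: str,
                                  max_attempts: int = 3):
    """Generate SQL, execute it, and retry with error feedback on failure."""
    feedback = None
    for _ in range(max_attempts):
        sql = generate_sql(build_prompt(schema, examples, question, feedback))
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            feedback = f"{sql} raised: {exc}"
            continue
        if not rows:  # crude cardinality check (assumption: empty means wrong)
            feedback = f"{sql} returned no rows"
            continue
        return sql, rows
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts")


# Demo with a hypothetical schema and a scripted stand-in for the LLM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (product TEXT, sales_amount REAL)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)",
                 [("widget", 100.0), ("gadget", 150.0)])

SCHEMA = "transactions(product TEXT, sales_amount REAL)"
EXAMPLES = [("How many transactions are there?",
             "SELECT COUNT(*) FROM transactions")]
prompts_seen = []


def fake_model(prompt: str) -> str:
    """First answer uses a nonexistent column; the retry fixes it."""
    prompts_seen.append(prompt)
    if "Previous attempt failed" in prompt:
        return ("SELECT product, SUM(sales_amount) AS total FROM transactions "
                "GROUP BY product ORDER BY total DESC")
    return "SELECT product, SUM(revenue) FROM transactions GROUP BY product"


final_sql, rows = generate_with_self_correction(
    conn, fake_model, SCHEMA, EXAMPLES, "Which products earn the most revenue?")
print(final_sql)
print(rows)
```

In a real deployment the generator would call an actual model, and the execution step would typically run against a read-only connection with a row limit, since generated SQL is untrusted input.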