Kolmogorov-Arnold Networks (KANs): Learnable Activation Functions as MLP Alternatives

## TL;DR
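Kolmogorov-Arnold Networks (KANs) replace the fixed activation functions on the neurons of a Multi-Layer Perceptron (MLP) with learnable univariate functions, typically B-splines, placed on the edges. On small-scale scientific tasks such as symbolic regression and PDE solving, this design has been reported to reach comparable or higher accuracy with far fewer parameters, which makes KANs a serious challenger to the decades-long dominance of the MLP.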

## Core Explanation
MLP: each edge carries a scalar weight w, and each node applies a fixed activation function σ (ReLU, GELU, etc.). Operation (bias omitted): h_{l+1} = σ(W_l · h_l).

KAN: each edge carries a learnable univariate function φ(x), parameterized as a B-spline, and each node performs a simple summation. Operation: h_{l+1,j} = Σ_i φ_{l,i,j}(h_{l,i}), where φ_{l,i,j} is the function on the edge from input i to output j in layer l.

This inverts the standard design: the learnable parameters live on the edges as functions, and the nonlinearity comes from the shapes of those functions rather than from a fixed σ. B-splines are smooth, locally supported basis functions; each φ is a linear combination of them with learnable coefficients, so the output is differentiable in the coefficients and the network can be trained with ordinary gradient-based optimization.
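The KAN operation above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the pyKAN implementation: the names `bspline_basis` and `KANLayer` are hypothetical, the knot grid is uniform and fixed on [-1, 1], the coefficients are randomly initialized, and the extra SiLU residual term that the original paper adds to each edge function is left out for brevity.

```python
import numpy as np

def bspline_basis(x, grid, k):
    """Evaluate all order-k B-spline basis functions at x (Cox-de Boor recursion).

    x:    (n,) input values
    grid: (m,) strictly increasing knot vector
    returns: (n, m - k - 1) array of basis values
    """
    x = x[:, None]
    # Degree 0: indicator function of each knot interval [t_i, t_{i+1})
    B = ((x >= grid[:-1]) & (x < grid[1:])).astype(float)
    for d in range(1, k + 1):
        left = (x - grid[:-(d + 1)]) / (grid[d:-1] - grid[:-(d + 1)])
        right = (grid[d + 1:] - x) / (grid[d + 1:] - grid[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B

class KANLayer:
    """Hypothetical minimal KAN layer: one spline phi_{i,j} per edge, summed at nodes."""

    def __init__(self, n_in, n_out, grid_size=5, k=3, x_range=(-1.0, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        a, b = x_range
        step = (b - a) / grid_size
        self.k = k
        # Uniform grid extended by k knots on each side -> grid_size + k basis functions
        self.grid = a + step * np.arange(-k, grid_size + k + 1)
        # One coefficient vector per edge (i -> j); these are the learnable parameters
        self.coef = 0.1 * rng.standard_normal((n_in, n_out, grid_size + k))

    def __call__(self, h):
        batch, n_in = h.shape
        basis = bspline_basis(h.reshape(-1), self.grid, self.k)
        basis = basis.reshape(batch, n_in, -1)          # (batch, n_in, G + k)
        # phi_{i,j}(h_i) = sum_c coef[i, j, c] * B_c(h_i); the node then sums over i
        return np.einsum("bic,ijc->bj", basis, self.coef)

layer = KANLayer(n_in=2, n_out=3)
x = np.random.default_rng(1).uniform(-1, 1, size=(4, 2))
print(layer(x).shape)  # (4, 3)
```

Because the output is linear in `coef`, gradients with respect to the coefficients are just the basis values, which is what makes gradient-based training straightforward. Inputs outside `x_range` fall where fewer basis functions are active, so a practical implementation would need to rescale inputs or adapt the grid.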

## Detailed Analysis
Advantages of KANs:

1. Parameter efficiency: roughly 10-100x fewer parameters for equivalent accuracy on symbolic regression and PDE solving. These results come from small-scale problems.
2. Interpretability: each edge function φ(x) can be visualized and inspected, and a trained network can be simplified (by pruning and by matching edge functions to symbolic forms) into compact mathematical formulas.
3. Reduced catastrophic forgetting: B-spline bases are locally supported, so training on one region of the input changes only the nearby coefficients and leaves what was learned elsewhere largely intact. This has mainly been demonstrated on low-dimensional toy problems.

Limitations:

1. Slower training: B-spline evaluation is more expensive than a dense matrix multiplication. Variants such as FastKAN (Gaussian radial basis functions) and ChebyKAN (Chebyshev polynomials) swap the spline basis for a cheaper one to reduce this cost.
2. Scale: KANs have not yet been shown to be competitive with standard MLP-based Transformers for large-scale language modeling.
3. Hyperparameter sensitivity: the B-spline order (k) and the grid size both affect accuracy and cost.

The 2024 paper drew wide attention: the official repository has gathered 15,000+ GitHub stars, and follow-up work includes Convolutional KANs, Graph KANs, and Fourier KANs. Physics applications (PDE solving, operator learning) and scientific computing benefit most from KANs' interpretability and efficiency.
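To make the parameter-efficiency and hyperparameter points concrete, the sketch below counts parameters per layer. It assumes a KAN edge stores one coefficient per B-spline basis function, i.e. grid size + k values, and ignores any extra per-edge scale weights an implementation may add. The helper names and the network widths are hypothetical and chosen only for illustration; they are not results from the paper and say nothing about the accuracy either network would reach.

```python
def mlp_layer_params(n_in, n_out):
    # Weight matrix plus bias vector
    return n_in * n_out + n_out

def kan_layer_params(n_in, n_out, grid_size, k):
    # One spline per edge, each with (grid_size + k) coefficients
    return n_in * n_out * (grid_size + k)

def total_params(widths, layer_fn, **kwargs):
    # Sum the per-layer counts over consecutive width pairs
    return sum(layer_fn(a, b, **kwargs) for a, b in zip(widths[:-1], widths[1:]))

# Same layer shape: the KAN layer is the more expensive one
print(mlp_layer_params(64, 64))                     # 4160
print(kan_layer_params(64, 64, grid_size=5, k=3))   # 32768

# Hypothetical networks: a wide MLP versus a narrow KAN
print(total_params([2, 64, 64, 1], mlp_layer_params))                 # 4417
print(total_params([2, 5, 1], kan_layer_params, grid_size=5, k=3))    # 120
```

At equal width a KAN layer carries (grid size + k) times as many weights as an MLP layer, so any parameter savings have to come from needing a much smaller network for the same accuracy. The same formula shows why grid size and order are sensitive choices: both scale the parameter count of every edge.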

## Further Reading
- pyKAN: Official KAN Implementation (GitHub: KindXiaoming/pykan)
- EfficientKAN and FastKAN Optimizations
- KAN for Time Series Forecasting and Scientific ML