Recent Advances in KAN-Based Numerical PDE Solvers
Kolmogorov-Arnold Networks (KANs), introduced in 2024, have rapidly become one of the most active frontiers in scientific machine learning for solving partial differential equations (PDEs) (Liu et al., 2024). Unlike Multi-Layer Perceptrons (MLPs), which apply fixed activation functions at nodes, KANs place learnable univariate activation functions on edges, grounded in the Kolmogorov-Arnold representation theorem: every continuous multivariate function can be expressed as a composition of univariate functions and summations. This structural difference gives KANs two key properties relevant to PDE numerics — higher interpretability and parameter efficiency — making them an appealing successor to MLP-based Physics-Informed Neural Networks (PINNs).
From 2024 through early 2026, researchers have published dozens of frameworks combining KANs with classical numerical concepts (spectral methods, operator learning, energy-stable time-stepping, neural operators) and targeting problems ranging from single PDEs to high-dimensional systems with hundreds of variables.
Overview #
The KAN-for-PDEs landscape organises into several interrelated research threads:
- Physics-Informed KAN Frameworks (PIKANs / KINN) — direct replacements of MLP layers in PINNs with KAN layers, using strong, energy, and inverse PDE formulations.
- Spectral-Basis and Wavelet-Enriched KANs — embedding orthogonal polynomial or wavelet bases to combat spectral bias.
- KAN-Based Neural Operators — KAN sub-networks inside DeepONet, FNO, and pseudo-differential operator frameworks for learning PDE solution maps.
- Time-Dependent and Evolutionary KANs — energy-stable schemes, KAN-ODEs, and moving-boundary solvers.
- Discontinuities, Shock Waves, and Turbulence — specialised architectures for sharp transitions.
- High-Dimensional PDEs — separable and tensor-product KAN surrogates scaling to hundreds of dimensions.
- Data-Driven Discovery and Inverse Problems — interpretability-driven model identification.
| Architecture | Key Strength | Representative Work |
|---|---|---|
| KINN | Forward/inverse problems, strong/energy/inverse forms | Wang et al., 2024 |
| ChebPIKAN | Fluid mechanics PDEs, orthogonal basis | Cui et al., 2024 |
| KANO | Symbolic operator recovery, variable-coefficient PDEs | arXiv:2509.16825 |
| EvoKAN | Long-horizon time evolution, energy stability | arXiv:2503.01618 |
| Anant-KAN | High-dimensional PDEs (up to 300D) | arXiv:2505.03595 |
| DPINN | Shock waves and discontinuities | arXiv:2507.08338 |
Background #
The Kolmogorov-Arnold Representation Theorem #
The theoretical foundation of KANs is the Kolmogorov-Arnold theorem: any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as
$$f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),$$
where $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ are univariate continuous functions. In contrast to MLPs — where activations are fixed and weights are learned — KANs parameterise the activation functions themselves (typically as B-splines or orthogonal polynomials) on each edge of the network graph.
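The edge-wise parameterisation can be made concrete with a minimal sketch. The toy layer below (NumPy only; Gaussian RBFs stand in for the B-splines of the original paper, and all names and sizes are illustrative) computes $y_j = \sum_i \phi_{j,i}(x_i)$, with each edge function a learned combination of shared basis functions:

```python
import numpy as np

def rbf_basis(x, centers, width):
    # Gaussian radial basis functions at scalar inputs x.
    # x: (n,), centers: (k,) -> (n, k)
    return np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)

def kan_layer(x, coeffs, centers, width):
    """One KAN layer: output_j = sum_i phi_{j,i}(x_i), where each edge
    function phi_{j,i} is a learnable combination of k basis functions
    with coefficients coeffs[j, i, :]."""
    n, d_in = x.shape
    d_out = coeffs.shape[0]
    y = np.zeros((n, d_out))
    for j in range(d_out):
        for i in range(d_in):
            y[:, j] += rbf_basis(x[:, i], centers, width) @ coeffs[j, i]
    return y

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(5, 2))        # 5 samples, 2 inputs
centers = np.linspace(-1, 1, 8)            # shared basis grid
coeffs = rng.normal(size=(3, 2, 8)) * 0.1  # 3 outputs, 2 inputs, 8 basis fns
y = kan_layer(x, coeffs, centers, width=0.5)
print(y.shape)  # (5, 3)
```

Training a KAN amounts to optimising `coeffs` (one coefficient vector per edge) rather than scalar weights, which is what makes the learned univariate functions inspectable afterwards.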
Physics-Informed Neural Networks (PINNs) — The Starting Point #
PINNs (Raissi, Perdikaris, & Karniadakis, 2019) embed physical laws directly into the neural network loss function. For a PDE $\mathcal{N}[u] = f$ on domain $\Omega$ with boundary condition $\mathcal{B}[u] = g$ on $\partial\Omega$, the PINN loss is
$$\mathcal{L} = \underbrace{\frac{1}{N_r}\sum_{i=1}^{N_r}\big|\mathcal{N}[u_\theta](x_i)\big|^2}_{\text{PDE residual}} + \underbrace{\frac{1}{N_b}\sum_{j=1}^{N_b}\big|\mathcal{B}[u_\theta](x_j) - g(x_j)\big|^2}_{\text{boundary condition}}.$$
Replacing the MLP in this framework with a KAN is the basic idea behind all PIKAN architectures.
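A dependency-free sketch of this loss for the 1D Poisson problem $-u'' = f$ with zero Dirichlet data (the two-mode ansatz and all names are illustrative; real PINNs obtain $u''$ by automatic differentiation, approximated here with central differences):

```python
import numpy as np

def u_theta(x, theta):
    # Toy trainable ansatz: u(x) = theta0*sin(pi x) + theta1*sin(2 pi x).
    return theta[0] * np.sin(np.pi * x) + theta[1] * np.sin(2 * np.pi * x)

def pinn_loss(theta, f, x_r, x_b, g_b, h=1e-4):
    # PDE residual for -u'' = f via central differences (autodiff in practice).
    u_xx = (u_theta(x_r + h, theta) - 2 * u_theta(x_r, theta)
            + u_theta(x_r - h, theta)) / h**2
    residual = np.mean((-u_xx - f(x_r)) ** 2)
    # Boundary term: enforce u(x_b) = g_b.
    boundary = np.mean((u_theta(x_b, theta) - g_b) ** 2)
    return residual + boundary

f = lambda x: np.pi**2 * np.sin(np.pi * x)   # exact solution is u = sin(pi x)
x_r = np.linspace(0.05, 0.95, 19)            # interior collocation points
x_b = np.array([0.0, 1.0])
g_b = np.zeros(2)

loss_exact = pinn_loss(np.array([1.0, 0.0]), f, x_r, x_b, g_b)  # ~0
loss_wrong = pinn_loss(np.array([0.5, 0.3]), f, x_r, x_b, g_b)  # > 0
print(loss_exact, loss_wrong)
```

The loss vanishes (up to finite-difference error) exactly when $u_\theta$ solves the PDE and its boundary conditions, which is what gradient-based training exploits.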
Recent Developments #
1. Physics-Informed KAN Frameworks #
KINN — The Foundational Framework #
The Kolmogorov-Arnold-Informed Neural Network (KINN) is the primary physics-informed framework replacing MLP layers in PINNs with KAN layers (Wang et al., 2024). KINN supports three PDE formulations: the strong form (collocating the PDE residual directly), the energy form (minimising a variational energy functional), and the inverse form (recovering unknown parameters from observations).
Systematic benchmarks demonstrate that KINN significantly outperforms MLP-based PINNs in accuracy and convergence speed for multi-scale problems, stress concentration, singularities, nonlinear hyperelasticity, and heterogeneous materials. The one domain where MLP remains competitive is complex geometry problems. Published in Computer Methods in Applied Mechanics and Engineering (2024), KINN has become the canonical reference for subsequent KAN-PDE research.
Chebyshev and Polynomial Basis PIKANs #
A major architectural refinement has been substituting B-spline basis functions with orthogonal polynomial bases. The ChebPIKAN model leverages the orthogonality of Chebyshev polynomials and integrates physics-informed loss functions for fluid-mechanics benchmarks including the Allen-Cahn, Burgers, and Helmholtz equations and the Kovasznay, cylinder-wake, and cavity flows (Cui et al., 2024). ChebPIKAN significantly outperforms vanilla KAN by embedding essential physical information and alleviating overfitting.
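A minimal sketch of a Chebyshev-basis KAN layer (NumPy only; names and sizes are illustrative): each edge activation is a truncated Chebyshev series, with a `tanh` squashing inputs into the polynomials' natural domain $[-1, 1]$, as Chebyshev PIKAN variants commonly do:

```python
import numpy as np

def cheb_features(x, degree):
    """Chebyshev polynomials T_0..T_degree at x in [-1, 1], via the
    stable identity T_n(x) = cos(n * arccos(x))."""
    n = np.arange(degree + 1)
    return np.cos(n[None, :] * np.arccos(np.clip(x, -1.0, 1.0))[:, None])

def cheb_kan_layer(x, coeffs):
    # x: (n, d_in); coeffs: (d_out, d_in, degree+1).
    # tanh keeps all inputs inside the Chebyshev domain.
    x = np.tanh(x)
    d_out, d_in, k = coeffs.shape
    # T: (n, d_in, k) basis evaluations, one stack per input dimension.
    T = np.stack([cheb_features(x[:, i], k - 1) for i in range(d_in)], axis=1)
    # Each edge (j, i) contracts its own coefficient vector with T.
    return np.einsum('nik,jik->nj', T, coeffs)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))
coeffs = rng.normal(size=(2, 3, 5)) * 0.1  # 2 outputs, 3 inputs, degree 4
out = cheb_kan_layer(x, coeffs)
print(out.shape)  # (4, 2)
```

Unlike local B-splines, each Chebyshev coefficient influences the whole domain, which is the source of both the faster spectral convergence and the rank-collapse risk that AC-PKAN targets.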
The AC-PKAN (Attention-Enhanced Chebyshev PKAN) further addresses the rank collapse problem in Chebyshev-based KANs by integrating wavelet-activated MLPs with an internal attention mechanism, provably preserving a full-rank Jacobian and approximating PDEs of arbitrary order (arXiv:2505.08687). An external Residual Gradient Attention (RGA) mechanism dynamically re-weights individual loss terms based on gradient norms, stabilising training of stiff PDE systems.
The Legendre-KAN method applies Legendre polynomial orthogonality to solve the fully nonlinear Monge-Ampère equation with Dirichlet boundary conditions, demonstrating effectiveness on both smooth and singular solutions across various dimensions and in the optimal transport problem.
Hybrid KAN–MLP and Augmented Lagrangian Approaches #
The AL-PKAN introduces a hybrid encoder-decoder architecture where the decoder maps hidden variable features from high-dimensional latent space into trainable univariate activation functions via KAN (Zhang et al., 2025). An augmented Lagrangian function treats penalty factors and Lagrangian multipliers as learnable parameters to dynamically balance constraint terms. This approach typically improves prediction accuracy by one to two orders of magnitude compared to traditional neural networks.
The HPKM-PINN combines MLP and KAN branches with a trainable convex mixing parameter to blend features optimally across subdomains, especially effective for multi-scale problems.
2. Spectral-Basis and Wavelet-Enriched KANs #
Wav-KAN incorporates wavelet functions into the KAN structure, capturing both high-frequency and low-frequency components via continuous dyadic wavelet transforms for multiresolution analysis. This directly addresses the spectral bias problem inherent in standard neural networks, which struggle to resolve high-frequency features in PDE solutions.
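The idea can be sketched with a Mexican-hat mother wavelet (one common Wav-KAN choice): an edge function becomes a small sum of dilated and translated wavelets, each dilation covering a different frequency band (all parameters below are illustrative):

```python
import numpy as np

def mexican_hat(x, scale, shift):
    # Second-derivative-of-Gaussian ("Mexican hat") mother wavelet,
    # translated by `shift` and dilated by `scale`.
    z = (x - shift) / scale
    return (1.0 - z**2) * np.exp(-z**2 / 2.0)

def wavelet_edge(x, weights, scales, shifts):
    # One wavelet-parameterised edge function: a learned combination of
    # dilated/translated wavelets; small scales capture high frequencies.
    return sum(w * mexican_hat(x, s, t)
               for w, s, t in zip(weights, scales, shifts))

x = np.linspace(-3, 3, 7)
y = wavelet_edge(x, weights=[1.0, 0.5], scales=[1.0, 0.25], shifts=[0.0, 1.0])
print(y.shape)  # (7,)
```

In a full Wav-KAN layer the weights, scales, and shifts are all trainable, so each edge can allocate resolution where the solution actually has fine-scale structure.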
PIKANs have been extended to multi-resolution spectral hybridisations (HWF-PIKAN), combining wavelet and Fourier features to explicitly counteract spectral bias and accelerate convergence for advection-dominated and kinetic equations.
A unified benchmark published in February 2026 provides a systematic, controlled comparison between MLP-based PINNs and KAN-based PIKANs across a representative collection of ODEs and PDEs (arXiv:2602.15068). The results show that PIKANs consistently achieve more accurate solutions, converge in fewer iterations, and yield superior gradient estimates.
3. KAN-Based Neural Operators #
Neural operators learn mappings between infinite-dimensional function spaces, enabling generalisation across families of PDEs. KANs are increasingly embedded in operator architectures.
DeepOKAN replaces MLP sub-networks in the Deep Operator Network (DeepONet) framework with KAN sub-networks using Gaussian Radial Basis Functions (Abueidda et al., 2024). The branch and trunk networks of DeepONet are re-implemented as RBF-KAN layers. Evaluated on 1D sinusoidal waves, 2D orthotropic elasticity, and transient Poisson problems, DeepOKAN consistently achieves lower training losses and more accurate predictions compared to standard DeepONet.
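The branch-trunk composition can be sketched as follows (a minimal NumPy illustration, not the authors' implementation; the single-expansion `rbf_kan` block and all sizes are assumptions):

```python
import numpy as np

def rbf_kan(x, coeffs, centers, width):
    # Minimal RBF-KAN block: each output is a learned combination of
    # Gaussian basis evaluations of every input component.
    phi = np.exp(-((x[..., None] - centers) / width) ** 2)  # (..., d, k)
    return np.einsum('...dk,odk->...o', phi, coeffs)

def deepokan(u_sensors, y_query, branch_c, trunk_c, centers, width):
    """DeepONet-style operator evaluation: the branch net encodes the
    input function u sampled at fixed sensors, the trunk net encodes
    the query location y; their dot product gives G(u)(y)."""
    b = rbf_kan(u_sensors, branch_c, centers, width)  # (p,) latent code
    t = rbf_kan(y_query, trunk_c, centers, width)     # (p,) query code
    return b @ t

rng = np.random.default_rng(2)
m, p, k = 10, 8, 6                    # sensors, latent width, basis size
centers = np.linspace(-1, 1, k)
branch_c = rng.normal(size=(p, m, k)) * 0.1
trunk_c = rng.normal(size=(p, 1, k)) * 0.1

u = np.sin(np.linspace(0, np.pi, m))  # input function sampled at sensors
val = deepokan(u, np.array([0.3]), branch_c, trunk_c, centers, width=0.5)
print(float(val))
```

Once `branch_c` and `trunk_c` are trained on many (input function, solution) pairs, evaluating the operator at a new input costs only two small forward passes and a dot product.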
PO-CKAN (Physics-informed Deep Operator KAN with Chunk Rational Structure) integrates PDE residual loss into a DeepONet-style branch–trunk architecture using Chunkwise Rational KAN sub-networks (arXiv:2510.08795). On Burgers’ equation with viscosity $\nu = 0.01$, PO-CKAN reduces mean relative $L^2$ error by approximately 48% compared to PI-DeepONet.
KANO (Kolmogorov-Arnold Neural Operator) is the most theoretically ambitious framework, jointly parameterising operators in both spectral and spatial bases within a pseudo-differential operator framework (arXiv:2509.16825). KANO overcomes the pure-spectral bottleneck of Fourier Neural Operators (FNO): while FNO remains practical only for spectrally sparse operators, KANO remains expressive over generic variable-coefficient PDEs. Crucially, KANO achieves symbolic recovery of the learned operator, enabling closed-form extraction of governing equations. On the quantum Hamiltonian learning benchmark, KANO attains state infidelity $\approx 6 \times 10^{-6}$ compared to FNO’s $\approx 1.5 \times 10^{-2}$.
KAN-ONets embeds adaptive, learnable B-spline activations from KAN into FNO (yielding FNO-KAN for uniform grids) and into the attention-based GNOT (yielding GNOT-KAN for arbitrary grids). Across seven challenging PDE benchmarks, KAN-ONets achieves MSE reductions of 10.2–30.2% compared to existing models.
4. Time-Dependent and Evolutionary KANs #
EvoKAN (Evolutionary Kolmogorov-Arnold Network, March 2025) introduces a novel paradigm: rather than retraining repeatedly, EvoKAN encodes only the PDE’s initial state during an initial learning phase, then evolves the network parameters numerically, governed by the same PDE (arXiv:2503.01618). KAN weights are treated as time-dependent functions updated through time steps, enabling prediction over arbitrarily long time horizons.
EvoKAN integrates the Scalar Auxiliary Variable (SAV) method to guarantee unconditional energy stability: at each time step, SAV requires only solving decoupled linear systems with constant coefficients. EvoKAN has been validated on the 1D and 2D Allen-Cahn equations (phase-field phenomena with sharp interfaces) and the 2D Navier-Stokes equations (turbulent flows), closely matching analytical references.
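The SAV mechanism can be illustrated on a scalar gradient flow $u' = -E'(u)$ with a double-well energy, far simpler than EvoKAN's PDE setting but showing the two defining features: a linear update and an unconditionally non-increasing modified energy $r^2$ (the closed-form solve below is specific to this scalar toy problem):

```python
import numpy as np

def sav_step(u, r, dt, dE, E1, C=1.0):
    """One scalar-auxiliary-variable (SAV) step for u' = -E'(u). The
    auxiliary variable r tracks sqrt(E1(u) + C); the coupled update is
    linear in (u, r), and r^2 never increases, at any step size."""
    b = dE(u) / np.sqrt(E1(u) + C)
    r_new = r / (1.0 + 0.5 * dt * b * b)  # closed form of the linear solve
    u_new = u - dt * r_new * b
    return u_new, r_new

# Double-well energy E(u) = (u^2 - 1)^2 / 4, as in Allen-Cahn dynamics.
E1 = lambda u: 0.25 * (u * u - 1.0) ** 2
dE = lambda u: u ** 3 - u

u, r = 0.4, np.sqrt(E1(0.4) + 1.0)
energies = []
for _ in range(200):
    u, r = sav_step(u, r, dt=0.1, dE=dE, E1=E1)
    energies.append(r * r)

print(round(u, 3))   # relaxes toward the well at u = 1
monotone = all(a >= b for a, b in zip(energies, energies[1:]))
print(monotone)      # modified energy decays at every step
```

In EvoKAN the same structure applies with the KAN weights in place of `u`, so each time step reduces to decoupled linear systems rather than a nonlinear retraining problem.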
KAN-ODEs apply KANs as the backbone of neural ordinary differential equation (ODE) frameworks, enabling data-driven discovery of governing dynamics with greater interpretability compared to MLP-based neural ODEs (arXiv:2407.04192).
Shallow-KAN addresses Stefan-type moving boundary problems (melting, solidification) by approximating the temperature distribution and moving interface while enforcing governing PDEs, phase equilibrium, and the Stefan condition through physics-informed residuals (arXiv:2601.09818). A key finding is that two hidden layers with tens of learnable parameters suffice — far fewer than the nearly one million parameters required by standard MLP-based PINNs for the same problem.
5. Discontinuities, Shock Waves, and Turbulence #
A known weakness of smooth neural networks is difficulty resolving sharp spatial transitions and discontinuities such as shock waves. Two specialised frameworks address this:
DPINN (Discontinuity-aware PINN) incorporates a discontinuity-aware KAN for modelling shock-wave properties, combined with an adaptive Fourier-feature embedding layer to mitigate spectral bias, mesh transformation for complex geometries, and learnable local artificial viscosity to stabilise the algorithm near discontinuities (arXiv:2507.08338). Numerical experiments on the inviscid Burgers’ equation and transonic/supersonic airfoil flows demonstrate superior accuracy over existing methods.
A Physics-Infused KAN for Turbulence (2026) targets turbulent flow prediction integrated with CFD, applying KAN within the Reynolds-Averaged Navier-Stokes (RANS) framework. It addresses the information bottleneck phenomenon in multi-output KANs and proposes pruning-based network optimisation, achieving high prediction accuracy for Navier-Stokes equations.
6. High-Dimensional PDEs and the Curse of Dimensionality #
High-dimensional PDEs (tens to hundreds of dimensions) are where conventional grid-based numerical methods become intractable, since their cost scales exponentially with dimension. KANs have shown early promise here.
Anant-Net (2025) is a scalable neural surrogate employing a tensor product formulation with dimension-wise sweeps and selective automatic differentiation (arXiv:2505.03595). Benchmarked on the Poisson, Sine-Gordon, Allen-Cahn, and transient heat equations, Anant-Net solves PDEs in up to 300 dimensions on a single GPU within a few hours. The framework includes Anant-KAN, an interpretable KAN-based variant offering deeper insights into the learned solution structure.
Separable PIKANs (SPIKANs) decompose the PDE solution into products of one-dimensional KAN networks, drastically reducing computational complexity for high-dimensional problems while retaining accuracy and interpretability.
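The separable ansatz can be sketched in a few lines (toy rank-2 factors stand in for trained 1D KANs; all names are illustrative): evaluating $u(x, y) = \sum_r f_r(x)\, g_r(y)$ on an $n_x \times n_y$ grid needs only $n_x + n_y$ one-dimensional network evaluations, which is the source of the complexity reduction:

```python
import numpy as np

def separable_eval(x, y, fx, gy):
    """Separable surrogate u(x, y) = sum_r f_r(x) * g_r(y). Each factor
    is a 1D network (a KAN in SPIKANs); the full 2D field is assembled
    from 1D factor evaluations by an outer-product contraction."""
    return fx(x) @ gy(y).T  # (n_x, R) @ (R, n_y) -> (n_x, n_y)

# Rank-2 toy factors standing in for trained 1D KANs (illustrative only).
fx = lambda x: np.stack([np.sin(np.pi * x), x], axis=-1)
gy = lambda y: np.stack([np.sin(np.pi * y), y ** 2], axis=-1)

x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 40)
u = separable_eval(x, y, fx, gy)
print(u.shape)  # (50, 40): full grid from 1D evaluations only
```

The same contraction generalises to $d$ dimensions, where a rank-$R$ product of 1D networks replaces a single network over an exponentially large tensor grid.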
7. Data-Driven Discovery and Inverse Problems #
KANs are especially powerful for scientific discovery tasks where interpretability of the learned function is critical.
Data-driven model discovery with KANs has been demonstrated on complex dynamical systems — including the Ikeda map and optical-cavity systems — where sparse optimisation methods fail due to non-sparse governing equations (arXiv:2409.15167). KAN captures complex behaviour while offering interpretability through its edge-wise univariate functions, providing insight into governing dynamics inaccessible in black-box MLPs.
PI-KAN-PointNet extends PIKAN to simultaneously solve inverse problems over multiple irregular geometries within a single training run, demonstrated on natural convection over 135 geometries with sparse data. KINN for Inverse Problems enables identification of unknown material parameters in heterogeneous or hyperelastic materials from partial observations. KANHedge applies KANs to high-dimensional BSDE solvers for option pricing, demonstrating improved hedging performance over MLP-based deep BSDE solvers (arXiv:2601.11097).
8. Comparative Analysis: KAN vs. MLP for PDEs #
A comprehensive comparison between MLP and KAN representations for differential equations establishes nuanced findings (arXiv:2406.02917):
| Architecture | Shallow Networks | Deep Networks | Robustness | Interpretability |
|---|---|---|---|---|
| KAN (B-spline) | Superior accuracy | Comparable to MLP | Lower (may diverge with different seeds) | High — symbolic extraction possible |
| KAN (Chebyshev/Legendre) | High accuracy | Competitive | Moderate — rank collapse risk | High |
| MLP/PINN | Moderate accuracy | Robust | High | Low |
| PIKAN (optimised) | Superior | Superior or comparable | Moderate | High |
Key findings: KANs in shallow settings significantly outperform MLPs, leveraging per-edge nonlinear expressiveness. In deep settings, KANs do not consistently outperform MLPs, but when properly optimised (e.g., with L-BFGS or Self-Scaled Broyden second-order optimisers), they achieve superior accuracy. JAX-based PIKAN implementations have achieved up to 84× training speedup over original NumPy/PyTorch KANs.
Open Problems #
Despite rapid progress, several challenges remain:
Computational cost. Spline function evaluation involves multiple iterations, making KANs significantly slower per parameter than MLPs. Variants like PowerMLP propose more efficient formulations (arXiv:2412.13571), but a satisfactory solution to raw training speed at scale is still outstanding.
Scalability to complex geometries. KINN and standard PIKANs underperform MLPs on irregular geometry problems. This remains a practical bottleneck for engineering applications involving complex domains.
Gradient instability in deep KANs. Deep PIKANs face vanishing/exploding gradient challenges, motivating Glorot-like initialisation strategies and residual-gated architectures.
Theoretical guarantees. Generalisation bounds for KANs trained on PDE collocation have been studied — bounds scale with $\ell_1$ norms of spline coefficients — but practical understanding of how architecture choices affect convergence and generalisation remains incomplete (arXiv:2410.08026).
Operator learning completeness. While KANO achieves symbolic operator recovery, the theoretical relationship between KAN architecture depth/width and approximation of PDE solution operators is still under active development.
The trajectory is clear: KAN-based PDE solvers are moving from proof-of-concept demonstrations on canonical benchmarks toward production-ready frameworks for engineering simulation, turbulence modelling, inverse problems, and high-dimensional scientific computing. The combination of interpretability, parameter efficiency, and growing theoretical foundations positions KANs as a genuinely transformative architecture for numerical PDEs.
References #
Abueidda, D. W., Pantidis, P., & Mobasher, M. E. (2024). DeepOKAN: Deep operator network based on Kolmogorov Arnold networks for mechanics problems. arXiv:2405.19143. https://www.alphaxiv.org/overview/2405.19143v3
Cui, Z., et al. (2024). Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics. Physics of Fluids, 37(9), 095120. https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431
Knottenbelt, W., et al. (2026). KANHedge: Efficient hedging of high-dimensional options using Kolmogorov-Arnold network-based BSDE solver. arXiv:2601.11097. https://arxiv.org/abs/2601.11097
Kovachki, N., et al. (2023). Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89), 1–97.
Li, Z., et al. (2025). Discontinuity-aware KAN-based physics-informed neural networks. arXiv:2507.08338. https://arxiv.org/html/2507.08338v1
Liu, Z., et al. (2024). KAN: Kolmogorov–Arnold Networks. arXiv:2404.19756. https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf
Liu, Z., et al. (2024). A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks. arXiv:2406.02917. https://arxiv.org/abs/2406.02917
Liu, Z., et al. (2026). A unified benchmark of physics-informed neural networks and Kolmogorov-Arnold networks. arXiv:2602.15068. https://arxiv.org/html/2602.15068v1
Peng, W., et al. (2025). KANO: Kolmogorov-Arnold Neural Operator. arXiv:2509.16825. https://arxiv.org/abs/2509.16825
Shukla, K., et al. (2025). Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates for high-dimensional PDEs. arXiv:2505.03595. https://arxiv.org/html/2505.03595v3
Tang, K., et al. (2025). AC-PKAN: Attention-enhanced and Chebyshev polynomial-based Kolmogorov-Arnold networks. arXiv:2505.08687. https://arxiv.org/html/2505.08687v2
Wang, Z., et al. (2025). EvoKAN: Energy-dissipative evolutionary Kolmogorov-Arnold networks for complex PDE systems. arXiv:2503.01618. https://arxiv.org/abs/2503.01618
Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov–Arnold Networks. Computer Methods in Applied Mechanics and Engineering. arXiv:2406.11045. https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722
Xu, Y., et al. (2026). Shallow-KAN based solution of moving boundary PDEs. arXiv:2601.09818. https://arxiv.org/html/2601.09818v1
Yang, L., et al. (2025). KAN-ODEs: Kolmogorov-Arnold network ordinary differential equations for learning dynamical systems and hidden physics. arXiv:2407.04192. https://arxiv.org/html/2407.04192v1
Zhang, Z., et al. (2025). Physics-informed neural networks with hybrid Kolmogorov-Arnold networks. PMC. https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/
Zuo, Q., et al. (2025). Data-driven model discovery with Kolmogorov-Arnold networks. arXiv:2409.15167. https://arxiv.org/abs/2409.15167