Recent Advances in KAN-Based Numerical PDE Solvers
Kolmogorov-Arnold Networks (KANs), introduced in 2024, have rapidly become one of the most active frontiers in scientific machine learning for solving partial differential equations (PDEs) (Liu et al., 2024). Unlike Multi-Layer Perceptrons (MLPs), which apply fixed activation functions at nodes, KANs place learnable univariate activation functions on edges, grounded in the Kolmogorov-Arnold representation theorem: every continuous multivariate function can be expressed as a composition of univariate functions and summations. This structural difference gives KANs two key properties relevant to PDE numerics — higher interpretability and parameter efficiency — making them an appealing successor to MLP-based Physics-Informed Neural Networks (PINNs).
From 2024 through early 2026, researchers have published dozens of frameworks combining KANs with classical numerical concepts (spectral methods, operator learning, energy-stable time-stepping, neural operators) and targeting problems ranging from single PDEs to high-dimensional systems with hundreds of variables.
Overview #
The KAN-for-PDEs landscape organises into several interrelated research threads:
- Physics-Informed KAN Frameworks (PIKANs / KINN) — direct replacements of MLP layers in PINNs with KAN layers, using strong, energy, and inverse PDE formulations.
- Spectral-Basis and Wavelet-Enriched KANs — embedding orthogonal polynomial or wavelet bases to combat spectral bias.
- KAN-Based Neural Operators — KAN sub-networks inside DeepONet, FNO, and pseudo-differential operator frameworks for learning PDE solution maps.
- Time-Dependent and Evolutionary KANs — energy-stable schemes, KAN-ODEs, and moving-boundary solvers.
- Discontinuities, Shock Waves, and Turbulence — specialised architectures for sharp transitions.
- High-Dimensional PDEs — separable and tensor-product KAN surrogates scaling to hundreds of dimensions.
- Data-Driven Discovery and Inverse Problems — interpretability-driven model identification.
| Architecture | Key Strength | Representative Work |
|---|---|---|
| KINN | Forward/inverse problems, strong/energy/inverse forms | Wang et al., 2024 |
| ChebPIKAN | Fluid mechanics PDEs, orthogonal basis | Cui et al., 2024 |
| KANO | Symbolic operator recovery, variable-coefficient PDEs | arXiv:2509.16825 |
| EvoKAN | Long-horizon time evolution, energy stability | arXiv:2503.01618 |
| Anant-KAN | High-dimensional PDEs (up to 300D) | arXiv:2505.03595 |
| DPINN | Shock waves and discontinuities | arXiv:2507.08338 |
Background #
The Kolmogorov-Arnold Representation Theorem #
The theoretical foundation of KANs is the Kolmogorov-Arnold theorem: any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as
$$f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left(\sum_{p=1}^{n} \phi_{q,p}(x_p)\right),$$
where $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ are univariate continuous functions. In contrast to MLPs — where activations are fixed and weights are learned — KANs parameterise the activation functions themselves (typically as B-splines or orthogonal polynomials) on each edge of the network graph.
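The edge-wise parameterisation can be made concrete with a minimal sketch. The toy layer below (NumPy only; Gaussian RBFs stand in for the B-splines of the original paper, and all names and sizes are illustrative) computes $y_j = \sum_i \phi_{j,i}(x_i)$, with each edge function a learned combination of shared basis functions:

```python
import numpy as np

def rbf_basis(x, centers, width):
    # Gaussian radial basis functions at scalar inputs x.
    # x: (n,), centers: (k,) -> (n, k)
    return np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)

def kan_layer(x, coeffs, centers, width):
    """One KAN layer: output_j = sum_i phi_{j,i}(x_i), where each edge
    function phi_{j,i} is a learnable combination of k basis functions
    with coefficients coeffs[j, i, :]."""
    n, d_in = x.shape
    d_out = coeffs.shape[0]
    y = np.zeros((n, d_out))
    for j in range(d_out):
        for i in range(d_in):
            y[:, j] += rbf_basis(x[:, i], centers, width) @ coeffs[j, i]
    return y

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(5, 2))        # 5 samples, 2 inputs
centers = np.linspace(-1, 1, 8)            # shared basis grid
coeffs = rng.normal(size=(3, 2, 8)) * 0.1  # 3 outputs, 2 inputs, 8 basis fns
y = kan_layer(x, coeffs, centers, width=0.5)
print(y.shape)  # (5, 3)
```

Training a KAN amounts to optimising `coeffs` (one coefficient vector per edge) rather than scalar weights, which is what makes the learned univariate functions inspectable afterwards.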
Physics-Informed Neural Networks (PINNs) — The Starting Point #
PINNs (Raissi, Perdikaris, & Karniadakis, 2019) embed physical laws directly into the neural network loss function. For a PDE $\mathcal{N}[u] = f$ on domain $\Omega$ with boundary condition $\mathcal{B}[u] = g$ on $\partial\Omega$, the PINN loss is
$$\mathcal{L} = \underbrace{\frac{1}{N_r}\sum_{i=1}^{N_r}\big|\mathcal{N}[u_\theta](x_i)\big|^2}_{\text{PDE residual}} + \underbrace{\frac{1}{N_b}\sum_{j=1}^{N_b}\big|\mathcal{B}[u_\theta](x_j) - g(x_j)\big|^2}_{\text{boundary condition}}.$$
Replacing the MLP in this framework with a KAN is the basic idea behind all PIKAN architectures.
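A dependency-free sketch of this loss for the 1D Poisson problem $-u'' = f$ with zero Dirichlet data (the two-mode ansatz and all names are illustrative; real PINNs obtain $u''$ by automatic differentiation, approximated here with central differences):

```python
import numpy as np

def u_theta(x, theta):
    # Toy trainable ansatz: u(x) = theta0*sin(pi x) + theta1*sin(2 pi x).
    return theta[0] * np.sin(np.pi * x) + theta[1] * np.sin(2 * np.pi * x)

def pinn_loss(theta, f, x_r, x_b, g_b, h=1e-4):
    # PDE residual for -u'' = f via central differences (autodiff in practice).
    u_xx = (u_theta(x_r + h, theta) - 2 * u_theta(x_r, theta)
            + u_theta(x_r - h, theta)) / h**2
    residual = np.mean((-u_xx - f(x_r)) ** 2)
    # Boundary term: enforce u(x_b) = g_b.
    boundary = np.mean((u_theta(x_b, theta) - g_b) ** 2)
    return residual + boundary

f = lambda x: np.pi**2 * np.sin(np.pi * x)   # exact solution is u = sin(pi x)
x_r = np.linspace(0.05, 0.95, 19)            # interior collocation points
x_b = np.array([0.0, 1.0])
g_b = np.zeros(2)

loss_exact = pinn_loss(np.array([1.0, 0.0]), f, x_r, x_b, g_b)  # ~0
loss_wrong = pinn_loss(np.array([0.5, 0.3]), f, x_r, x_b, g_b)  # > 0
print(loss_exact, loss_wrong)
```

The loss vanishes (up to finite-difference error) exactly when $u_\theta$ solves the PDE and its boundary conditions, which is what gradient-based training exploits.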
Recent Developments #
1. Physics-Informed KAN Frameworks #
KINN — The Foundational Framework #
The Kolmogorov-Arnold-Informed Neural Network (KINN) is the primary physics-informed framework replacing MLP layers in PINNs with KAN layers (Wang et al., 2024). KINN supports three PDE formulations: the strong form (collocating the PDE residual directly), the energy form (minimising a variational energy functional), and the inverse form (recovering unknown parameters from observations).
Systematic benchmarks demonstrate that KINN significantly outperforms MLP-based PINNs in accuracy and convergence speed for multi-scale problems, stress concentration, singularities, nonlinear hyperelasticity, and heterogeneous materials. The one domain where MLP remains competitive is complex geometry problems. Published in Computer Methods in Applied Mechanics and Engineering (2024), KINN has become the canonical reference for subsequent KAN-PDE research.
Chebyshev and Polynomial Basis PIKANs #
A major architectural refinement has been substituting B-spline basis functions with orthogonal polynomial bases. The ChebPIKAN model leverages the orthogonality of Chebyshev polynomials and integrates physics-informed loss functions for fluid-mechanics benchmarks including the Allen-Cahn, Burgers, and Helmholtz equations and the Kovasznay, cylinder-wake, and cavity flows (Cui et al., 2024). ChebPIKAN significantly outperforms vanilla KAN by embedding essential physical information and alleviating overfitting.
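A minimal sketch of a Chebyshev-basis KAN layer (NumPy only; names and sizes are illustrative): each edge activation is a truncated Chebyshev series, with a `tanh` squashing inputs into the polynomials' natural domain $[-1, 1]$, as Chebyshev PIKAN variants commonly do:

```python
import numpy as np

def cheb_features(x, degree):
    """Chebyshev polynomials T_0..T_degree at x in [-1, 1], via the
    stable identity T_n(x) = cos(n * arccos(x))."""
    n = np.arange(degree + 1)
    return np.cos(n[None, :] * np.arccos(np.clip(x, -1.0, 1.0))[:, None])

def cheb_kan_layer(x, coeffs):
    # x: (n, d_in); coeffs: (d_out, d_in, degree+1).
    # tanh keeps all inputs inside the Chebyshev domain.
    x = np.tanh(x)
    d_out, d_in, k = coeffs.shape
    # T: (n, d_in, k) basis evaluations, one stack per input dimension.
    T = np.stack([cheb_features(x[:, i], k - 1) for i in range(d_in)], axis=1)
    # Each edge (j, i) contracts its own coefficient vector with T.
    return np.einsum('nik,jik->nj', T, coeffs)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3))
coeffs = rng.normal(size=(2, 3, 5)) * 0.1  # 2 outputs, 3 inputs, degree 4
out = cheb_kan_layer(x, coeffs)
print(out.shape)  # (4, 2)
```

Unlike local B-splines, each Chebyshev coefficient influences the whole domain, which is the source of both the faster spectral convergence and the rank-collapse risk that AC-PKAN targets.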
The AC-PKAN (Attention-Enhanced Chebyshev PKAN) further addresses the rank collapse problem in Chebyshev-based KANs by integrating wavelet-activated MLPs with an internal attention mechanism, provably preserving a full-rank Jacobian and approximating PDEs of arbitrary order (arXiv:2505.08687). An external Residual Gradient Attention (RGA) mechanism dynamically re-weights individual loss terms based on gradient norms, stabilising training of stiff PDE systems.
The Legendre-KAN method applies Legendre polynomial orthogonality to solve the fully nonlinear Monge-Ampère equation with Dirichlet boundary conditions, demonstrating effectiveness on both smooth and singular solutions across various dimensions and in the optimal transport problem.
Hybrid KAN–MLP and Augmented Lagrangian Approaches #
The AL-PKAN introduces a hybrid encoder-decoder architecture where the decoder maps hidden variable features from high-dimensional latent space into trainable univariate activation functions via KAN (Zhang et al., 2025). An augmented Lagrangian function treats penalty factors and Lagrangian multipliers as learnable parameters to dynamically balance constraint terms. This approach typically improves prediction accuracy by one to two orders of magnitude compared to traditional neural networks.
The HPKM-PINN combines MLP and KAN branches with a trainable convex mixing parameter to blend features optimally across subdomains, especially effective for multi-scale problems.
2. Spectral-Basis and Wavelet-Enriched KANs #
Wav-KAN incorporates wavelet functions into the KAN structure, capturing both high-frequency and low-frequency components via continuous dyadic wavelet transforms for multiresolution analysis. This directly addresses the spectral bias problem inherent in standard neural networks, which struggle to resolve high-frequency features in PDE solutions.
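The idea can be sketched with a Mexican-hat mother wavelet (one common Wav-KAN choice): an edge function becomes a small sum of dilated and translated wavelets, each dilation covering a different frequency band (all parameters below are illustrative):

```python
import numpy as np

def mexican_hat(x, scale, shift):
    # Second-derivative-of-Gaussian ("Mexican hat") mother wavelet,
    # translated by `shift` and dilated by `scale`.
    z = (x - shift) / scale
    return (1.0 - z**2) * np.exp(-z**2 / 2.0)

def wavelet_edge(x, weights, scales, shifts):
    # One wavelet-parameterised edge function: a learned combination of
    # dilated/translated wavelets; small scales capture high frequencies.
    return sum(w * mexican_hat(x, s, t)
               for w, s, t in zip(weights, scales, shifts))

x = np.linspace(-3, 3, 7)
y = wavelet_edge(x, weights=[1.0, 0.5], scales=[1.0, 0.25], shifts=[0.0, 1.0])
print(y.shape)  # (7,)
```

In a full Wav-KAN layer the weights, scales, and shifts are all trainable, so each edge can allocate resolution where the solution actually has fine-scale structure.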
PIKANs have been extended to multi-resolution spectral hybridisations (HWF-PIKAN), combining wavelet and Fourier features to explicitly counteract spectral bias and accelerate convergence for advection-dominated and kinetic equations.
A unified benchmark published in February 2026 provides a systematic, controlled comparison between MLP-based PINNs and KAN-based PIKANs across a representative collection of ODEs and PDEs (arXiv:2602.15068). The results show that PIKANs consistently achieve more accurate solutions, converge in fewer iterations, and yield superior gradient estimates.
3. KAN-Based Neural Operators #
Neural operators learn mappings between infinite-dimensional function spaces, enabling generalisation across families of PDEs. KANs are increasingly embedded in operator architectures.
DeepOKAN replaces MLP sub-networks in the Deep Operator Network (DeepONet) framework with KAN sub-networks using Gaussian Radial Basis Functions (Abueidda et al., 2024). The branch and trunk networks of DeepONet are re-implemented as RBF-KAN layers. Evaluated on 1D sinusoidal waves, 2D orthotropic elasticity, and transient Poisson problems, DeepOKAN consistently achieves lower training losses and more accurate predictions compared to standard DeepONet.
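The branch-trunk composition can be sketched as follows (a minimal NumPy illustration, not the authors' implementation; the single-expansion `rbf_kan` block and all sizes are assumptions):

```python
import numpy as np

def rbf_kan(x, coeffs, centers, width):
    # Minimal RBF-KAN block: each output is a learned combination of
    # Gaussian basis evaluations of every input component.
    phi = np.exp(-((x[..., None] - centers) / width) ** 2)  # (..., d, k)
    return np.einsum('...dk,odk->...o', phi, coeffs)

def deepokan(u_sensors, y_query, branch_c, trunk_c, centers, width):
    """DeepONet-style operator evaluation: the branch net encodes the
    input function u sampled at fixed sensors, the trunk net encodes
    the query location y; their dot product gives G(u)(y)."""
    b = rbf_kan(u_sensors, branch_c, centers, width)  # (p,) latent code
    t = rbf_kan(y_query, trunk_c, centers, width)     # (p,) query code
    return b @ t

rng = np.random.default_rng(2)
m, p, k = 10, 8, 6                    # sensors, latent width, basis size
centers = np.linspace(-1, 1, k)
branch_c = rng.normal(size=(p, m, k)) * 0.1
trunk_c = rng.normal(size=(p, 1, k)) * 0.1

u = np.sin(np.linspace(0, np.pi, m))  # input function sampled at sensors
val = deepokan(u, np.array([0.3]), branch_c, trunk_c, centers, width=0.5)
print(float(val))
```

Once `branch_c` and `trunk_c` are trained on many (input function, solution) pairs, evaluating the operator at a new input costs only two small forward passes and a dot product.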
PO-CKAN (Physics-informed Deep Operator KAN with Chunk Rational Structure) integrates PDE residual loss into a DeepONet-style branch–trunk architecture using Chunkwise Rational KAN sub-networks (arXiv:2510.08795). On Burgers’ equation with viscosity $\nu = 0.01$, PO-CKAN reduces mean relative $L^2$ error by approximately 48% compared to PI-DeepONet.
KANO (Kolmogorov-Arnold Neural Operator) is the most theoretically ambitious framework, jointly parameterising operators in both spectral and spatial bases within a pseudo-differential operator framework (arXiv:2509.16825). KANO overcomes the pure-spectral bottleneck of Fourier Neural Operators (FNO): while FNO remains practical only for spectrally sparse operators, KANO remains expressive over generic variable-coefficient PDEs. Crucially, KANO achieves symbolic recovery of the learned operator, enabling closed-form extraction of governing equations. On the quantum Hamiltonian learning benchmark, KANO attains state infidelity $\approx 6 \times 10^{-6}$ compared to FNO’s $\approx 1.5 \times 10^{-2}$.
KAN-ONets embeds adaptive, learnable B-spline activations from KAN into FNO (yielding FNO-KAN for uniform grids) and into the attention-based GNOT (yielding GNOT-KAN for arbitrary grids). Across seven challenging PDE benchmarks, KAN-ONets achieves MSE reductions of 10.2–30.2% compared to existing models.
4. Time-Dependent and Evolutionary KANs #
EvoKAN (Evolutionary Kolmogorov-Arnold Network, March 2025) introduces a novel paradigm: rather than retraining repeatedly, EvoKAN encodes only the PDE’s initial state during an initial learning phase, then evolves the network parameters numerically, governed by the same PDE (arXiv:2503.01618). KAN weights are treated as time-dependent functions updated through time steps, enabling prediction over arbitrarily long time horizons.
EvoKAN integrates the Scalar Auxiliary Variable (SAV) method to guarantee unconditional energy stability: at each time step, SAV requires only solving decoupled linear systems with constant coefficients. EvoKAN has been validated on the 1D and 2D Allen-Cahn equations (phase-field phenomena with sharp interfaces) and the 2D Navier-Stokes equations (turbulent flows), closely matching analytical references.
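The SAV mechanism can be illustrated on a scalar gradient flow $u' = -E'(u)$ with a double-well energy, far simpler than EvoKAN's PDE setting but showing the two defining features: a linear update and an unconditionally non-increasing modified energy $r^2$ (the closed-form solve below is specific to this scalar toy problem):

```python
import numpy as np

def sav_step(u, r, dt, dE, E1, C=1.0):
    """One scalar-auxiliary-variable (SAV) step for u' = -E'(u). The
    auxiliary variable r tracks sqrt(E1(u) + C); the coupled update is
    linear in (u, r), and r^2 never increases, at any step size."""
    b = dE(u) / np.sqrt(E1(u) + C)
    r_new = r / (1.0 + 0.5 * dt * b * b)  # closed form of the linear solve
    u_new = u - dt * r_new * b
    return u_new, r_new

# Double-well energy E(u) = (u^2 - 1)^2 / 4, as in Allen-Cahn dynamics.
E1 = lambda u: 0.25 * (u * u - 1.0) ** 2
dE = lambda u: u ** 3 - u

u, r = 0.4, np.sqrt(E1(0.4) + 1.0)
energies = []
for _ in range(200):
    u, r = sav_step(u, r, dt=0.1, dE=dE, E1=E1)
    energies.append(r * r)

print(round(u, 3))   # relaxes toward the well at u = 1
monotone = all(a >= b for a, b in zip(energies, energies[1:]))
print(monotone)      # modified energy decays at every step
```

In EvoKAN the same structure applies with the KAN weights in place of `u`, so each time step reduces to decoupled linear systems rather than a nonlinear retraining problem.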
KAN-ODEs apply KANs as the backbone of neural ordinary differential equation (ODE) frameworks, enabling data-driven discovery of governing dynamics with greater interpretability compared to MLP-based neural ODEs (arXiv:2407.04192).
Shallow-KAN addresses Stefan-type moving boundary problems (melting, solidification) by approximating the temperature distribution and moving interface while enforcing governing PDEs, phase equilibrium, and the Stefan condition through physics-informed residuals (arXiv:2601.09818). A key finding is that two hidden layers with tens of learnable parameters suffice — far fewer than the nearly one million parameters required by standard MLP-based PINNs for the same problem.
5. Discontinuities, Shock Waves, and Turbulence #
A known weakness of smooth neural networks is difficulty resolving sharp spatial transitions and discontinuities such as shock waves. Two specialised frameworks address this:
DPINN (Discontinuity-aware PINN) incorporates a discontinuity-aware KAN for modelling shock-wave properties, combined with an adaptive Fourier-feature embedding layer to mitigate spectral bias, mesh transformation for complex geometries, and learnable local artificial viscosity to stabilise the algorithm near discontinuities (arXiv:2507.08338). Numerical experiments on the inviscid Burgers’ equation and transonic/supersonic airfoil flows demonstrate superior accuracy over existing methods.
A Physics-Infused KAN for Turbulence (2026) targets turbulent flow prediction integrated with CFD, applying KAN within the Reynolds-Averaged Navier-Stokes (RANS) framework. It addresses the information bottleneck phenomenon in multi-output KANs and proposes pruning-based network optimisation, achieving high prediction accuracy for Navier-Stokes equations.
6. High-Dimensional PDEs and the Curse of Dimensionality #
High-dimensional PDEs (tens to hundreds of dimensions) are where conventional grid-based numerical methods become intractable, since their cost scales exponentially with dimension. KANs have shown early promise here.
Anant-Net (2025) is a scalable neural surrogate employing a tensor product formulation with dimension-wise sweeps and selective automatic differentiation (arXiv:2505.03595). Benchmarked on the Poisson, Sine-Gordon, Allen-Cahn, and transient heat equations, Anant-Net solves PDEs in up to 300 dimensions on a single GPU within a few hours. The framework includes Anant-KAN, an interpretable KAN-based variant offering deeper insights into the learned solution structure.
Separable PIKANs (SPIKANs) decompose the PDE solution into products of one-dimensional KAN networks, drastically reducing computational complexity for high-dimensional problems while retaining accuracy and interpretability.
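The separable ansatz can be sketched in a few lines (toy rank-2 factors stand in for trained 1D KANs; all names are illustrative): evaluating $u(x, y) = \sum_r f_r(x)\, g_r(y)$ on an $n_x \times n_y$ grid needs only $n_x + n_y$ one-dimensional network evaluations, which is the source of the complexity reduction:

```python
import numpy as np

def separable_eval(x, y, fx, gy):
    """Separable surrogate u(x, y) = sum_r f_r(x) * g_r(y). Each factor
    is a 1D network (a KAN in SPIKANs); the full 2D field is assembled
    from 1D factor evaluations by an outer-product contraction."""
    return fx(x) @ gy(y).T  # (n_x, R) @ (R, n_y) -> (n_x, n_y)

# Rank-2 toy factors standing in for trained 1D KANs (illustrative only).
fx = lambda x: np.stack([np.sin(np.pi * x), x], axis=-1)
gy = lambda y: np.stack([np.sin(np.pi * y), y ** 2], axis=-1)

x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 40)
u = separable_eval(x, y, fx, gy)
print(u.shape)  # (50, 40): full grid from 1D evaluations only
```

The same contraction generalises to $d$ dimensions, where a rank-$R$ product of 1D networks replaces a single network over an exponentially large tensor grid.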
7. Data-Driven Discovery and Inverse Problems #
KANs are especially powerful for scientific discovery tasks where interpretability of the learned function is critical.
Data-driven model discovery with KANs has been demonstrated on complex dynamical systems — including the Ikeda map and optical-cavity systems — where sparse optimisation methods fail due to non-sparse governing equations (arXiv:2409.15167). KAN captures complex behaviour while offering interpretability through its edge-wise univariate functions, providing insight into governing dynamics inaccessible in black-box MLPs.
PI-KAN-PointNet extends PIKAN to simultaneously solve inverse problems over multiple irregular geometries within a single training run, demonstrated on natural convection over 135 geometries with sparse data. KINN for Inverse Problems enables identification of unknown material parameters in heterogeneous or hyperelastic materials from partial observations. KANHedge applies KANs to high-dimensional BSDE solvers for option pricing, demonstrating improved hedging performance over MLP-based deep BSDE solvers (arXiv:2601.11097).
8. Comparative Analysis: KAN vs. MLP for PDEs #
A comprehensive comparison between MLP and KAN representations for differential equations establishes nuanced findings (arXiv:2406.02917):
| Architecture | Shallow Networks | Deep Networks | Robustness | Interpretability |
|---|---|---|---|---|
| KAN (B-spline) | Superior accuracy | Comparable to MLP | Lower (may diverge with different seeds) | High — symbolic extraction possible |
| KAN (Chebyshev/Legendre) | High accuracy | Competitive | Moderate — rank collapse risk | High |
| MLP/PINN | Moderate accuracy | Robust | High | Low |
| PIKAN (optimised) | Superior | Superior or comparable | Moderate | High |
Key findings: KANs in shallow settings significantly outperform MLPs, leveraging per-edge nonlinear expressiveness. In deep settings, KANs do not consistently outperform MLPs, but when properly optimised (e.g., with L-BFGS or Self-Scaled Broyden second-order optimisers), they achieve superior accuracy. JAX-based PIKAN implementations have achieved up to 84× training speedup over original NumPy/PyTorch KANs.
Open Problems #
Despite rapid progress, several challenges remain:
Computational cost. Spline function evaluation involves multiple iterations, making KANs significantly slower per parameter than MLPs. Variants like PowerMLP propose more efficient formulations (arXiv:2412.13571), but a satisfactory solution to raw training speed at scale is still outstanding.
Scalability to complex geometries. KINN and standard PIKANs underperform MLPs on irregular geometry problems. This remains a practical bottleneck for engineering applications involving complex domains.
Gradient instability in deep KANs. Deep PIKANs face vanishing/exploding gradient challenges, motivating Glorot-like initialisation strategies and residual-gated architectures.
Theoretical guarantees. Generalisation bounds for KANs trained on PDE collocation have been studied — bounds scale with $\ell_1$ norms of spline coefficients — but practical understanding of how architecture choices affect convergence and generalisation remains incomplete (arXiv:2410.08026).
Operator learning completeness. While KANO achieves symbolic operator recovery, the theoretical relationship between KAN architecture depth/width and approximation of PDE solution operators is still under active development.
The trajectory is clear: KAN-based PDE solvers are moving from proof-of-concept demonstrations on canonical benchmarks toward production-ready frameworks for engineering simulation, turbulence modelling, inverse problems, and high-dimensional scientific computing. The combination of interpretability, parameter efficiency, and growing theoretical foundations positions KANs as a genuinely transformative architecture for numerical PDEs.
References #
Abueidda, D. W., Pantidis, P., & Mobasher, M. E. (2024). DeepOKAN: Deep operator network based on Kolmogorov Arnold networks for mechanics problems. arXiv:2405.19143. https://www.alphaxiv.org/overview/2405.19143v3
Cui, Z., et al. (2024). Physics-informed Kolmogorov–Arnold network with Chebyshev polynomials for fluid mechanics. Physics of Fluids, 37(9), 095120. https://pubs.aip.org/aip/pof/article-abstract/37/9/095120/3361431
Knottenbelt, W., et al. (2026). KANHedge: Efficient hedging of high-dimensional options using Kolmogorov-Arnold network-based BSDE solver. arXiv:2601.11097. https://arxiv.org/abs/2601.11097
Kovachki, N., et al. (2023). Neural operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89), 1–97.
Li, Z., et al. (2025). Discontinuity-aware KAN-based physics-informed neural networks. arXiv:2507.08338. https://arxiv.org/html/2507.08338v1
Liu, Z., et al. (2024). KAN: Kolmogorov–Arnold Networks. arXiv:2404.19756. https://storage.prod.researchhub.com/uploads/papers/2024/05/04/2404.19756.pdf
Liu, Z., et al. (2024). A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks. arXiv:2406.02917. https://arxiv.org/abs/2406.02917
Liu, Z., et al. (2026). A unified benchmark of physics-informed neural networks and Kolmogorov-Arnold networks. arXiv:2602.15068. https://arxiv.org/html/2602.15068v1
Peng, W., et al. (2025). KANO: Kolmogorov-Arnold Neural Operator. arXiv:2509.16825. https://arxiv.org/abs/2509.16825
Shukla, K., et al. (2025). Anant-Net: Breaking the curse of dimensionality with scalable and interpretable neural surrogates for high-dimensional PDEs. arXiv:2505.03595. https://arxiv.org/html/2505.03595v3
Tang, K., et al. (2025). AC-PKAN: Attention-enhanced and Chebyshev polynomial-based Kolmogorov-Arnold networks. arXiv:2505.08687. https://arxiv.org/html/2505.08687v2
Wang, Z., et al. (2025). EvoKAN: Energy-dissipative evolutionary Kolmogorov-Arnold networks for complex PDE systems. arXiv:2503.01618. https://arxiv.org/abs/2503.01618
Wang, Z., et al. (2024). Kolmogorov–Arnold-Informed neural network: A physics-informed deep learning framework for solving forward and inverse problems based on Kolmogorov–Arnold Networks. Computer Methods in Applied Mechanics and Engineering. arXiv:2406.11045. https://www.sciencedirect.com/science/article/abs/pii/S0045782524007722
Xu, Y., et al. (2026). Shallow-KAN based solution of moving boundary PDEs. arXiv:2601.09818. https://arxiv.org/html/2601.09818v1
Yang, L., et al. (2025). KAN-ODEs: Kolmogorov-Arnold network ordinary differential equations for learning dynamical systems and hidden physics. arXiv:2407.04192. https://arxiv.org/html/2407.04192v1
Zhang, Z., et al. (2025). Physics-informed neural networks with hybrid Kolmogorov-Arnold networks. PMC. https://pmc.ncbi.nlm.nih.gov/articles/PMC11950322/
Zuo, Q., et al. (2025). Data-driven model discovery with Kolmogorov-Arnold networks. arXiv:2409.15167. https://arxiv.org/abs/2409.15167