Building a General PDE Solving Framework with Symbolic-Numeric Scientific Machine Learning

Building a General PDE Solving
Framework with Symbolic-Numeric
Scientific Machine Learning
Chris Rackauckas
VP of Modeling and Simulation, JuliaHub
Research Affiliate, MIT CSAIL

Connect with SciML
Follow on Twitter
@ChrisRackauckas
Follow on LinkedIn
https://www.linkedin.com/in/chrisrackauckas/
Star the GitHub repositories
https://github.com/SciML
Join our community chatrooms
julialang.org/slack
Dr. Chris Rackauckas
VP of Modeling and Simulation
JuliaHub
Research Affiliate
MIT CSAIL
Director of Scientific Research
Pumas-AI

PDEs Done Wrong! Writing Loops for a Single PDE

Fast Simulation:
Good Algorithms + Fast Implementation
(“Fast” + Implicit Euler / Newmark Beta? Get Outta Here!)

All good PDE methods can be expressed as lower level primitives:
1. Symbolic transformations to standard computable forms
1. Numerical methods which exploit structure on standard computable
forms
ModelingToolkit PDE
Hypothesis:

, Christopher, and Qing Nie. "Confederated modular differential equation APIs for accelerated algorithm
development and benchmarking." Advances in Engineering Software 132 (2019): 1-6.
SciML: Open Source
Tooling
For Modeling and Simulation
https://github.com/SciML/SciMLBenchmarks.jl
Rackauckas, Christopher, and Qing Nie. "Differentialequations.jl–a performant
and feature-rich ecosystem for solving differential equations in julia." Journal
of Open Research Software 5.1 (2017).
Rackauckas, Christopher, and Qing Nie. "Confederated modular differential
equation APIs for accelerated algorithm development and benchmarking."
Advances in Engineering Software 132 (2019): 1-6.
1. Speed
2. Stability
3. Stochasticity
4. Adjoints and Inference
5. Parallelism
Non-Stiff ODE: Rigid Body System
8 Stiff ODEs: HIRES Chemical Reaction Network
Generally:
• 50x faster than
SciPy
• 50x faster than
MATLAB
• 100x faster than
R’s deSolve
• Methods that
outperform classic
C and Fortran
methods (CVODE,
DASSL)
• Specialized
situational
methods (IMEX,
stage parallel, etc.)
Active Developer Base:
• ~100 contributors in a month across SciML
• Paid summer programs training ~10 new devs
• MIT Lab (~20 people) driven development

NonlinearSolve.jl: Robust Nonlinear Root-Finding in Julia
NonlinearSolve.jl outperforms standard softwares like
Sundials & MINPACK by 10-100x. Built-In Sparsity
Detection & Automatic Differentiation Support enables
scaling to large systems.
Pal, Avik, Holtorf, Larsson, Loman, Utkarsh, Schaefer, Qu,
Alan Edelman, and Chris Rackauckas. "NonlinearSolve.jl:
High-Performance and Robust Solvers for Systems of
Nonlinear Equations in Julia." Manuscript in Preparation.
MINPACK
is the
slowest
NonlinearSolve.jl has
the fastest solvers
Sundials is also
very slow!
NonlinearSolve.jl
successfully solves
novel problems
where other software
fail.

Modeling and Simulation Tools Across the Spectrum of Mathematics
https://scimlbase.sciml.ai/dev/
The SciML Common Interface for Julia Equation Solvers
LinearSolve.jl: Unified Linear Solver Interface
NonlinearSolve.jl: Unified Nonlinear Solver
Interface
DifferentialEquations.jl: Unified Interface for all
Differential Equations
Optimization.jl: Unified Optimization Interface
Integrals.jl: Unified Quadrature Interface
Unified Partial Differential Equation Interface

What is Scientific Machine Learning (SciML)?
Scientific Computing ↔ Machine Learning
Machine Learning
● Neural Nets
● Bayesian
Modeling
● Automatic
Differentiation
Scientific Computing Scientific Machine
Learning
● Model Building
● Robust Solvers
● Control Systems
● Differentiable Simulators
● Surrogates and ROM
● Inverse Problems &
Calibration
● Automatic Equation Discovery
● Applicable to Small Data
and more ….
10

JuliaSim Model Discovery: Autocompleting Models with SciML

Universal Differential Equations Predict Chemical Processes
UDEs in advection-diffusion transform the learning problem to
low dimensional spaces where small data is sufficient

Universal Differential Equations Predict Chemical Processes
Santana, V. V., Costa, E., Rebello, C. M., Ribeiro, A. M., Rackauckas, C., &
Nogueira, I. B. (2023). Efficient hybrid modeling and sorption model discovery for
non-linear advection-diffusion-sorption systems: A systematic scientific machine
learning approach. arXiv preprint arXiv:2303.13555.
Recovers equations with the
same 2nd order Taylor
expansion

PDE Discretization
via ModelingToolkit Is Transformaiton

modular differential equation APIs for accelerated algorithm development and benchmarking." Advances in
Engineering Software 132 (2019): 1-6.
SciML’s Modeling Ecosystem:
ModelingToolkit’s Numerical
Counterpart
Every improvement by every package
developer feeds into one pipeline
Rackauckas, Christopher, and Qing Nie. "Confederated modular differential
equation APIs for accelerated algorithm development and benchmarking."
Advances in Engineering Software 132 (2019): 1-6.
Youtube: Differential Equations in 2021
SciML’s Common Interface:
• One consistent interface for all numerics
• Symbolic modeling for all forms
• Automated inverse problems and adjoints
• Composes across the whole package
ecosystem
• Fully embraces generic programming
• Uses and embraces the work of other
developers

Heat Equation
using MethodOfLines, ModelingToolkit, DomainSets
@parameters t, x
@variables u(..)
Dt = Differential(t); Dx = Differential(x)
Dxx = Differential(x)^2
α = 1.1
eq = Dt(u(t, x)) ~ α * Dxx(u(t, x))
domain = [x ∈ Interval(0.0, 10.0),
t ∈ Interval(0.0, 1.0)]
ic_bc = [u(0.0, x) ~ exp(-(x - 4.0)^2) + exp(-(x - 6.0)^2),
u(t, 0.0) ~ 0.0,
u(t, 10.0) ~ 0.0]
@named sys = PDESystem(eq, ic_bc, domain, [t, x], [u(t, x)])
SciML/MethodOfLines.jl

# Method of lines discretization
dx = 0.1
order = 2
discretization = MOLFiniteDifference([x => dx], t, approx_order=order)
# Convert the PDE problem into an ODE problem
prob = discretize(sys, discretization)
using OrdinaryDiffEq
# Solve ODE problem
sol = solve(prob, Tsit5(), saveat=0.1)

solu = sol[u(t,x)]
x_grid = sol[x]
t_grid = sol[t]
using Plots
heatmap(t_grid, x_grid, transpose(solu), …)

anim = @animate for i in 1:length(t_grid)
plot(x_grid, solu[i, :], …)
end
gif(anim, "heat.gif", fps = 8)

Burger’s Equation

@parameters x t
@variables u(..)
Dx = Differential(x)
Dxx = Dx^2
Dt = Differential(t)
x_min = -1.0
x_max = 1.0
t_min = 0.0
t_max = 3.0
v = 0.01
eq = Dt(u(t, x)) ~ -u(t, x) * Dx(u(t, x)) + v*Dxx(u(t,x))
bcs = [u(0, x) ~ 0.11*cospi(4x),
u(t, x_min) ~ u(t, x_max)]
domains = [t ∈ Interval(t_min, t_max),
x ∈ Interval(x_min, x_max)]
@named pdesys = PDESystem(eq, bcs, domains, [t, x], [u(t, x)])
Burger’s Equation

Implemented schemes
● Centered difference scheme to arbitrary order, for even ordered spatial
derivative terms.
● Upwind difference scheme (hardcoded to first order) for odd ordered spatial
derivatives.
● WENO Scheme
● Integral schemes
● Nonlinear laplacian scheme.
● Spherical laplacian scheme.
● Discrete callbacks
● Staggered Grid

ModelingToolkit PDEs: Method Of Lines Finite Difference

ModelingToolkit PDEs: Physics-Informed Neural Networks
Easy and Customizable PINN PDE Solving with NeuralPDE.jl, JuliaCon
2021

ModelingToolkit PDEs: Extensible PDE Interface
Many extra forms have partial implementations
being productionized:
Finite Volume methods (FiniteVolumeMethods.jl)
Spectral methods (?)
Finite element methods (via Ferrite.jl)
Discrete Galerkin methods (via Trixi.jl)
Exterior Calculus (via Decapodes.jl)
…
Attempting to unify the difference PDE
discretization methods into a one interface
compatible with component-based modeling!

But a general PDE solver interface means nothing if it’s slow.
How we take a general PDE discretization engine and make it as
fast as a code that is hand-optimized to specific PDEs?

ModelingToolkit Symbolic-Numeric Optimizations

, Christopher, and Qing Nie. "Confederated modular differential equation APIs for accelerated algorithm development and benchmarking." Advances in Engineering Software 132 (2019): 1-6.
Symbolics.jl
MTK
DiffEq

Example of Tearing Nonlinear Systems

It automatically reduced your 5 equation system to 1!

Only solves one equation numerically
But can generate the other
variables

Inline Integration: Tearing of Internal Solves
Perform tearing on the internal nonlinear solve of an implicit method

Structural Simplify Performance: DAE vs AE
DAE AE
Derivative Form Analytic Computational Analytic Computational
Alias Elimination 18 20 18 18
Structural
Simplify
6 12 4 8

Structural Simplify Performance: DAE vs AE
Mode
l
Inpu
t
Outp
ut
Eqs/State
s
DAE AE
f4 x ddy 97 Alias Elimination: 52
Structural Simplify: 27 eqs/26
states
50
11

Inlined Linear Solver Optimization

Improving Automated Solvers with Sparsity Detection
Symbolics.jl:
Automated Sparsity Analysis
on Code
Will be a part of the automatic solver choice!

What Kinds of Transformations Do You Get? DAE Index Reduction
Not solvable by standard
numerical solvers!
Differentiate the last equation
twice, do a few substitutions…
Easy to solve!
If you don’t know the details about why this makes a better numerical simulation, then you should be using
ModelingToolkit.

What Kinds of Transformations Do You Get? DAE Index Reduction
Let me fix that for you…
Structural_simplify:
Pantelides algorithm
For index reduction

But that form has numerical drift

Dummy Derivative: Include the Implicit Constraints

1. Many “specialized CFD” methods are simply a standard ODE solver after
Pantelides transformation of discretized Navier-Stokes
2. Inlined Linear Solver optimizations covers many simple semi-implicit schemes
3. Tearing of nonlinear systems on finite difference is a generalization of ghost
points
4. While this already hits good runtimes, there’s compilation issues I’ve glossed
over
Some immediate
consequences that would
take too long to prove:

Interested in Symbolic-Numerics?
Lots of starter projects!
Get in touch

1. Implicit-Explicit schemes, ADI, etc. all handled by this aspect.
1. We have yet to find a PDE scheme that cannot be expressed by some
combination of the previous features
1. Scaling and automatic parallelism scheduling are among the current ongoing
research questions
Some immediate consequences
that would take too long to prove:

sys = structural_simplify(complete_motor)
JuliaSimCompiler: Accelerated ModelingToolkit
2 lines of code to turn on, enables enormous scalability improvements
Solves a major scaling problem in acasual systems
Conclusion: ModelingToolkit is a widely used open modeling platform, and with
JuliaSim it’s also the most scalable.
using JuliaSimCompiler
complete_motor_ir = IRSystem(complete_motor)
sys_ir = structural_simplify(complete_motor_ir)
MTK
JuliaSimCompiler

Loop Rerolling
systems = @named begin
sine = Sine(frequency = 10)
source = Voltage()
resistors[1:n] = Resistor()
capacitors[1:n] = Capacitor()
ground = Ground()
end;

Loop Rerolling
resistors_1₊v(t) ~ resistors_1₊p₊v(t) - resistors_1₊n₊v(t)
0 ~ resistors_1₊p₊i(t) + resistors_1₊n₊i(t)
resistors_1₊i(t) ~ resistors_1₊p₊i(t)
resistors_1₊v(t) ~ resistors_1₊R*resistors_1₊i(t)
resistors_2₊v(t) ~ -resistors_2₊n₊v(t) + resistors_2₊p₊v(t)
0 ~ resistors_2₊p₊i(t) + resistors_2₊n₊i(t)
resistors_2₊i(t) ~ resistors_2₊p₊i(t)
resistors_2₊v(t) ~ resistors_2₊R*resistors_2₊i(t)
Variable classes:
{{resistors_1₊v, resistors_2₊v, …},
{resistors_1₊p₊v, resistors_2₊p₊v}, …}
Equation classes:
{0 = f₁(x, y, z) = x - (y - z),
0 = f₂(x, y) = x + y, …}

Loop Rerolling
for var"%33" = 1:97
var"%34" = var"%33" - var"%32"
var"%35" = var"%29" + var"%34"
var"%36" = Base.getindex(var"###in 1###", var"%35")
var"%37" = var"%30" + var"%34"
var"%39" = var"%31" + var"%34"
var"%41" = var"%24" * var"%38"
var"%42" = var"%41" + var"%36"
var"%43" = var"%40" + var"%42"
var"%44" = var"%31" + var"%33"
var"%45" = Base.setindex!(var"###out###", var"%43", var"%44")
end

Loop Rerolling on JuliaSimBattery (Single-Particle Model (SPM), Lithium Nickel
Manganese Cobalt Oxide (NMC))

Exploiting Structure in ODE Solvers

“Refinements” to ODE Problems can fall into different classes

Approximate Matrix Factorization = Generalized ADI

Preliminary results indicate upto 7x
acceleration on PDE solutions with upto 2x
lesser memory consumption**
Idea: Non-linear solve is typically
the most time consuming step in
PDE simulations. Perform non-
linear solves in mixed precision
with GPUs and also without
affecting convergence?*
32-bit precision non-linear solve can
achieve machine precision accuracy!
* Kelley, C. T. "Newton's Method in Mixed
Precision." SIAM Review 64.1 (2022): 191-211.
**Manuscript in preparation.
GPU Accelerated Mixed Precision Methods for Semi-discretized PDEs

Solving Semilinear Parabolic PDEs (Nonlinear Black-Scholes) by
Learning Structured Stochastic Differential Equations
Simplified:
• Transform the PDE into a Backwards SDE: a
stochastic boundary value problem. The
unknown is a function!
• Learn the unknown function via neural
network.
• Once learned, the PDE solution is known.
Solving high-dimensional partial differential equations
using deep learning, 2018, PNAS, Han, Jentzen, E

Solving high-dimensional partial differential equations using deep learning, 2018, PNAS, Han, Jentzen, E

• This is equivalent to training the universal SDE:
• Solves a 100 dimensional PDE in minutes on a laptop!
• As a universal SDE, we can solve this with high order (less neural network evaluations),
adaptivity, etc. methods using DiffEqFlux.jl. Thus we automatically extend the deep BSDE
method to better discretizations by using this interpretation!

Extending the Method:
Learning Nonlinearities
While Solving?

Differentiation for Inverse Problems:
Issues with Adjoints

Advancements in Structural Identifiability Analysis
of ODE control systems
Novel autonomous algorithms based on
differential algebra that uncover the source
of non-identifiability
Classic problem: impossible to deduce
if we only measure and 𝑏 is unknown
Also: first steps towards PDE identifiability in "Algebraic identifiability
of partial differential equation models," arxiv 2024.
Solution: model reparametrization.
- Fewer parameters.
- All new variables are identifiable.
is not identifiable
Reparametrization is already available
in SciML/StructuralIdentifiability.jl:
- Fully automatic.
- Domain agnostic.
- Works with non-linear ODEs of
dimensions around 20.

The adjoint equation is an ODE!
How do you get z(t)? One suggestion:
Reverse the ODE
Timeseries is not
stored, therefore
O(1) in memory!
Machine Learning Neural Ordinary Differential Equations
Chen, Ricky TQ, et al. "Neural ordinary differential equations." Advances in neural information
processing systems. 2018.

Rackauckas, Christopher, and Qing Nie. "Differentialequations. jl–a performant and feature-
rich ecosystem for solving differential equations in julia." Journal of Open Research Software
5.1 (2017).
Rackauckas, Christopher, and Qing Nie. "Confederated modular differential equation APIs for
accelerated algorithm development and benchmarking." Advances in Engineering Software
132 (2019): 1-6.
“Adjoints by reversing” also is
unconditionally unstable on some
problems!
Advection Equation:
Approximating the derivative in x has two choices: forwards or
backwards
If you discretize in the wrong direction you get unconditional
instability
You need to understand the engineering principles and the
numerical simulation properties of domain to make ML stable on
it.
Result: the method in the PyTorch
library is unconditionally unstable!!!

Problems With Naïve Adjoint Approaches On Stiff Equations
Kim, Suyong, Weiqi Ji, Sili Deng, and Christopher Rackauckas. "Stiff neural
ordinary differential equations." Chaos (2021).
How do you get u(t) while solving backwards?
3 options!
1.
2. Store u(t) while solving forwards (dense output)
3. Checkpointing
Unstable
High memory
More Compute
Each choices has an engineering trade-off!

Problems With Naïve Adjoint Approaches On Stiff Equations
Kim, Suyong, Weiqi Ji, Sili Deng, and Christopher Rackauckas. "Stiff
neural ordinary differential equations." Chaos (2021).

Gives about a 10x performance improvement for adjoint calculations on PDEs
“Adjoint” can mean many
many different things…
Ma, Yingbo, Vaibhav Dixit, Michael J. Innes, Xingjian
Guo, and Chris Rackauckas. "A comparison of
automatic differentiation and continuous sensitivity
analysis for derivatives of differential equation
solutions." In 2021 IEEE High Performance Extreme
Computing Conference (HPEC), pp. 1-9. IEEE, 2021.

Julia Derivative Results
All methods compute the same
derivative to many digits

Jax Implementation of the Test Problem
Jax requires a lot more code

Jax Derivative Results: BacksolveAdjoint
BacksolveAdjoint works here, but it’s
the unstable one!

Jax Derivative Results: RecursiveCheckpointingAdjoint and DirectAdjoint
Jax’s other methods don’t
converge, >60% error!

Memory-efficient Continuous Adjoint Sensitivity Method for Universal Ordinary
Differential Equations
In comparison to QuadratureAdjoint, GaussAdjoint
● is much more memory efficient!
● will work on GPUs
● will be compatible with SDEs
Goal: compute the gradient of a loss function with
respect to many parameters
Sapienza et al., "Differentiable Programming for Differential Equations: A Review" in
preparation (2024).
This gradient can be computed as follows
- Solve the differential equation to obtain u(t):
- Solve the adjoint differential equation to obtain λ(t):
- Solve the integral:
Idea: GaussAdjoint computes the integral in a memory-
efficient manner by accumulating the Gauss quadrature
approximation of the integral over time!
This circumvents the need of a continuous solution for λ(t)!

The PyTorch library uses a numerically unstable method
The Jax library uses a non-convergent method.
The Julia library is numerically stable and convergent.
Calculating derivatives is hard.
Conclusion:

ModelOrderReduction: Intrusive MOR done Nonintrusively

Using structural PDE information reduces
the amount of data required for the
surrogates to achieve 5% error by 100x!
Pestourie, R., Mroueh, Y., Rackauckas, C. et al. Physics-enhanced deep surrogates for partial differential
equations. Nat Mach Intell 5, 1458–1465 (2023). https://doi.org/10.1038/s42256-023-00761-y
Physics-Enhanced Deep Neural Surrogates (PEDS)

Steady State Surrogate Jet Engine Pipeline
C Binary
Simulat
or
Model Ingestion Parallel Data Generation using JuliaHub
Simulat
or
Simulat
or
Simulator
Sampled
Config
Mass Model Experimentation using JuliaHub Deployment

Surrogates which converge to a user-chosen accuracy
Provisional patent filed
JuliaSimSurrogates Auto-Train
Fully automated training infrastructure for surrogates

JuliaSim Surrogates Cloud Deployment
Automating the parallelization in SciML training
Surrogates have multiple points of natural parallelization:
■ Neural network training should take place on large
GPU systems
■ Surrogate data training is an embarrassingly-
parallel process, generally on CPU
Optimal parallelism thus requires different machines for
different parts of the process, and expertise than can
optimize the hardware choices automatically.

■ 2D electrochemical model of a single battery
○ 300 equations, 5 ms solve time
■ Battery pack: repeat the model 200 times
○ Very long simulation time
○ Most tools cannot scale physically-
accurate
models to full packs
JuliaSim Batteries
overview

Digital Echo on Live Battery Models

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Machine Learning

Recommended

Recommended

More Related Content

Similar to Building a General PDE Solving Framework with Symbolic-Numeric Scientific Machine Learning

Similar to Building a General PDE Solving Framework with Symbolic-Numeric Scientific Machine Learning (20)

Recently uploaded

Recently uploaded (20)

Building a General PDE Solving Framework with Symbolic-Numeric Scientific Machine Learning