Integrating equation solvers with probabilistic programming through differentiable programming


Part of the COMPUTATIONAL ABSTRACTIONS FOR PROBABILISTIC AND DIFFERENTIABLE PROGRAMMING WORKSHOP

Abstract: Many probabilistic programming languages (PPLs) attempt to integrate with equation solvers (differential equations, nonlinear equations, partial differential equations, etc.) from the inside, i.e. the developers of PPLs like Stan provide differential equation solver choices as part of the suite. However, as equation solvers are an entire discipline unto themselves with many active development communities and subfields, this places an immense burden on PPL developers to keep up with the changing landscape of tens of thousands of independent researchers. In this talk we will explore how Julia PPLs such as Turing.jl support equation solvers from the outside, i.e. how the tools of differentiable programming allow equation solver libraries to be compatible with PPLs … READ MORE

Direct Automatic Differentiation of (Differential Equation) Solvers vs Analytical Adjoints: Which is Better?


Automatic differentiation of a “solver” is a subject with many details that matter for doing it effectively. For this reason, there are a lot of talks and courses that go into depth on the topic. I recently gave a talk on some of the latest work in differentiable simulation with the American Statistical Association, and have some detailed notes on such adjoint derivations as part of the 18.337 Parallel Computing and Scientific Machine Learning graduate course at MIT. And there are entire organizations like my SciML Open Source Software Organization which work day in and day out on the development of new differentiable solvers.

I’ll give a brief summary of all my materials below.

Continuous vs Discrete Differentiation of Solvers

AD of a solver can be done in essentially two different ways: either directly performing automatic … READ MORE

Is Differentiable Programming Actually Necessary? Can’t you just train the neural networks separately?


October 4 2022 in Scientific ML | Author: Christopher Rackauckas

Is differentiable programming actually necessary, or can you just train the neural network in isolation against data and then stick the trained neural network into the simulation? We looked at this problem in detail in our new manuscript titled Capturing missing physics in climate model parameterizations using neural differential equations.

The goal of this project is to understand temperature mixing in large eddy simulations, essentially columns of water in the ocean. I.e., can we take a “true” 3D Navier-Stokes and use that to build very quick and accurate models for how heat flows up and down in the water?

This isn’t a … READ MORE

Composing Modeling and Simulation with Julia (2021 Modelica Conference)


In this paper we introduce JuliaSim, a high-performance programming environment designed to blend traditional modeling and simulation with machine learning. JuliaSim can build accelerated surrogates from component-based models, such as those conforming to the FMI standard, using continuous-time echo state networks (CTESN). The foundation of this environment, ModelingToolkit.jl, is an acausal modeling language which can compose the trained surrogates as components within its staged compilation process. As a complementary factor we present the JuliaSim model library, a standard library with differential-algebraic equations and pre-trained surrogates, which can be composed using the modeling system for design, optimization, and control. We demonstrate the effectiveness of the surrogate-accelerated modeling and simulation approach on HVAC dynamics by showing that the CTESN surrogates accurately capture the dynamics of an HVAC … READ MORE

Engineering Trade-Offs in Automatic Differentiation: from TensorFlow and PyTorch to Jax and Julia


To understand the differences between automatic differentiation libraries, let’s talk about the engineering trade-offs that were made. I would personally say that none of these libraries is “better” than another; they all simply make engineering trade-offs based on the domains and use cases they were aiming to satisfy. The easiest way to describe these trade-offs is to follow the evolution and see how each new library tweaked the trade-offs made by the previous one.

Early TensorFlow used a graph building system, i.e. it required users to essentially define variables in a specific graph language separate from the host language. You had to define “TensorFlow variables” and “TensorFlow ops”, and the AD would then be performed on this static graph. Control flow constructs were limited to the constructs that could be represented statically. For example, an `ifelse` function statement is very different from … READ MORE

Learning Epidemic Models That Extrapolate, AI4Pandemics


I think this talk was pretty good so I wanted to link it here!

Title: Learning Epidemic Models That Extrapolate

Speaker: Chris Rackauckas, https://chrisrackauckas.com/

Abstract:
Modern techniques of machine learning are uncanny in their ability to automatically learn predictive models directly from data. However, they do not tend to work beyond their original training dataset. Mechanistic models utilize characteristics of the problem to ensure accurate qualitative extrapolation but can lack in predictive power. How can we build techniques which integrate the best of both approaches? In this talk we will discuss the body of work around universal differential equations, a technique which mixes traditional differential equation modeling with machine learning for accurate extrapolation from small data. We will showcase how incorporating different variations of the technique, such … READ MORE

Useful Algorithms That Are Not Optimized By Jax, PyTorch, or Tensorflow


In some previous blog posts we described in detail how one can generalize automatic differentiation to automatically give stability enhancements and all sorts of other niceties by incorporating graph transformations into code generation. However, one of the things which we didn’t go into too much is the limitation of these types of algorithms. This limitation is what we have termed “quasi-static”, which is the property that an algorithm can be reinterpreted as some static algorithm. It turns out that for very fundamental reasons, this is the same limitation that some major machine learning frameworks, such as Jax or TensorFlow, impose on the code that they can fully optimize. This led us to the question: are there algorithms which are not optimizable within this mindset, and why? The answer is now published at ICML 2021, so let’s dig into … READ MORE

ModelingToolkit, Modelica, and Modia: The Composable Modeling Future in Julia


Let me take a bit of time here to write out a complete canonical answer on ModelingToolkit and how it relates to Modia and Modelica. This question comes up a lot: why does ModelingToolkit exist instead of building on tooling for Modelica compilers? I’ll start out by saying I am a huge fan of Martin and Hilding’s work and we work very closely with them on the direction of Julia-based tooling for modeling and simulation. ModelingToolkit, being a new system, has some flexibility in the design space it explores, and while we are following a different foundational philosophy, we have many of the same goals.

Composable Abstractions for Model Transformations

Everything in the SciML organization is built around a principle of confederated modular development: let other packages influence the capabilities of your own. This is highlighted in a … READ MORE

Generalizing Automatic Differentiation to Automatic Sparsity, Uncertainty, Stability, and Parallelism


Automatic differentiation is a “compiler trick” whereby a code that calculates f(x) is transformed into a code that calculates f'(x). This trick and its two forms, forward and reverse mode automatic differentiation, have become the pervasive backbone behind all of the machine learning libraries. If you ask what PyTorch or Flux.jl is doing that’s special, the answer is really that it’s doing automatic differentiation over some functions.
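To make the idea of turning code for f(x) into code for f'(x) concrete, here is a minimal sketch of forward-mode automatic differentiation using dual numbers in Python. It relies on operator overloading rather than a true compiler transformation, and it is not how PyTorch or Flux.jl implement AD. The names `Dual`, `derivative`, `sin`, and the example function `f` are illustrative assumptions, not part of any library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input (hypothetical helper)."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, i.e. with derivative zero."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """sin extended to dual numbers via the chain rule."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(float(x), 1.0)).der


def f(x):
    # Ordinary-looking code for f(x) = 3x^2 + sin(x); no derivative is written by hand.
    return 3 * x * x + sin(x)


if __name__ == "__main__":
    # Compare against the hand-derived f'(x) = 6x + cos(x).
    print(derivative(f, 2.0), 6 * 2.0 + math.cos(2.0))
```

The same unmodified `f` produces both the value and the derivative, which is the sense in which the program for f(x) is reinterpreted as a program for f'(x). Reverse mode applies the same per-operation rules but propagates derivatives from the output back to the inputs.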

What I want to dig into in this blog post is a simple question: what is the trick behind automatic differentiation, why is it always differentiation, and are there other mathematical problems we can focus this trick towards? While very technical discussions on this can be found in our recent paper titled “ModelingToolkit: A Composable Graph Transformation System For Equation-Based Modeling” and descriptions of methods like intrusive uncertainty quantification, I want … READ MORE

COVID-19 Epidemic Mitigation via Scientific Machine Learning (SciML)


Chris Rackauckas
Applied Mathematics Instructor, MIT
Senior Research Analyst, University of Maryland, Baltimore School of Pharmacy

This was a seminar talk given to the COVID modeling journal club on scientific machine learning for epidemic modeling.

Resources:

https://sciml.ai/
https://diffeqflux.sciml.ai/dev/
https://datadriven.sciml.ai/dev/
https://docs.sciml.ai/latest/
https://safeblues.org/