We introduce the concept of Numerical Gaussian Processes, which we define as Gaussian Processes with covariance functions resulting from temporal discretization of time-dependent partial differential equations. Numerical Gaussian processes, by construction, are designed to deal with cases where: (1) all we observe are noisy data on black-box initial conditions, and (2) we are interested in quantifying the uncertainty associated with such noisy data in our solutions to time-dependent partial differential equations. Our method circumvents the need for spatial discretization of the differential operators by proper placement of Gaussian process priors. This is an attempt to construct structured and data-efficient learning machines, which are explicitly informed by the underlying physics that possibly generated the observed data. The effectiveness of the proposed approach is demonstrated through several benchmark problems involving linear and nonlinear time-dependent operators. In all examples, we are able to recover accurate approximations of the latent solutions, and consistently propagate uncertainty, even in cases involving very long time integration.
Let us try to convey the main ideas of this work by considering the Burgers’ equation in one space dimension,
$$u_t + u u_x = \nu u_{xx},$$
along with Dirichlet boundary conditions, where $u(t,x)$ denotes the unknown solution and $\nu$ is a viscosity parameter. Let us assume that all we observe are noisy measurements of the black-box initial function $u(0,x)$. Given such measurements, we would like to solve the Burgers’ equation while propagating through time the uncertainty associated with the noisy initial data.
Let us apply the backward Euler scheme to the Burgers’ equation. This can be written as
$$u^n + \Delta t\, u^n \frac{d}{dx} u^n - \nu \Delta t\, \frac{d^2}{dx^2} u^n = u^{n-1},$$
where $u^n(x) = u(t^n, x)$ and $\Delta t = t^n - t^{n-1}$ is the step size.
Similar to ideas we have presented in earlier work, we would like to place a Gaussian process prior on $u^n$. However, the nonlinear term $u^n \frac{d}{dx} u^n$ is causing problems simply because the product of two Gaussian processes is no longer Gaussian. Hence, we will approximate the nonlinear term with $\mu^{n-1} \frac{d}{dx} u^n$, where $\mu^{n-1}$ is the posterior mean of the previous time step. Therefore, the backward Euler scheme can be approximated by
$$u^n + \Delta t\, \mu^{n-1} \frac{d}{dx} u^n - \nu \Delta t\, \frac{d^2}{dx^2} u^n = u^{n-1}.$$
Let us make the prior assumption that
$$u^n(x) \sim \mathcal{GP}\!\left(0, k(x, x'; \theta)\right)$$
is a Gaussian process with $\theta$ denoting the hyper-parameters of the kernel $k$. This enables us to obtain the following Numerical Gaussian Process
$$\begin{bmatrix} u^n \\ u^{n-1} \end{bmatrix} \sim \mathcal{GP}\!\left(0, \begin{bmatrix} k^{n,n} & k^{n,n-1} \\ k^{n-1,n} & k^{n-1,n-1} \end{bmatrix}\right),$$
where the covariance functions $k^{n,n-1}$, $k^{n-1,n}$, and $k^{n-1,n-1}$ result from applying the linearized backward Euler operator above to the arguments of $k^{n,n} = k$.
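To make this construction concrete, the following symbolic sketch (my own illustration, not the authors' code) pushes the linearized backward-Euler operator through both arguments of a squared-exponential kernel to obtain the remaining covariance functions. Treating the previous posterior mean $\mu^{n-1}$ as a constant symbol is a simplifying assumption; in the actual method it is a function of $x$.

```python
# Derive k^{n-1,n} and k^{n-1,n-1} from k^{n,n} by applying the linearized
# backward-Euler operator  u -> u + dt*mu*du/dx - nu*dt*d2u/dx2  to each
# kernel argument in turn. Simplification: mu (the previous posterior mean)
# is treated as a constant rather than a function of x.
import sympy as sp

x, xp, dt, nu, mu, ell = sp.symbols('x xp dt nu mu ell', real=True)
k = sp.exp(-(x - xp)**2 / (2 * ell**2))          # k^{n,n}(x, x')

def apply_op(expr, var):
    """Apply the linearized backward-Euler operator to one kernel argument."""
    return expr + dt * mu * sp.diff(expr, var) - nu * dt * sp.diff(expr, var, 2)

k_n1_n  = sp.simplify(apply_op(k, x))            # k^{n-1,n}(x, x')
k_n1_n1 = sp.simplify(apply_op(k_n1_n, xp))      # k^{n-1,n-1}(x, x')
print(k_n1_n1)
```

Note that $k^{n-1,n-1}$ comes out symmetric in $x$ and $x'$, as a covariance function must, and all derived kernels collapse back to $k$ as $\Delta t \to 0$.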
The hyper-parameters $\theta$ and the noise parameters $\sigma_n^2$ and $\sigma_{n-1}^2$ can be trained by minimizing the Negative Log Marginal Likelihood resulting from
$$\begin{bmatrix} u^n_b \\ u^{n-1} \end{bmatrix} \sim \mathcal{N}(0, K),$$
where $\{x^n_b, u^n_b\}$ are the (noisy) data on the boundary and $\{x^{n-1}, u^{n-1}\}$ are artificially generated data to be explained later. Here,
$$K = \begin{bmatrix} k^{n,n}(x^n_b, x^n_b) + \sigma_n^2 I & k^{n,n-1}(x^n_b, x^{n-1}) \\ k^{n-1,n}(x^{n-1}, x^n_b) & k^{n-1,n-1}(x^{n-1}, x^{n-1}) + \sigma_{n-1}^2 I \end{bmatrix}.$$
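To illustrate the training step in isolation, here is a minimal sketch of Negative Log Marginal Likelihood minimization for a generic zero-mean GP with a squared-exponential kernel. This is my own illustration with made-up data, not the paper's code, and for brevity it uses a single kernel block rather than the structured matrix $K$ above.

```python
# Minimize the negative log marginal likelihood of a zero-mean GP over the
# log length-scale and log noise variance. Synthetic 1D data for illustration.
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def rbf(xa, xb, ell):
    """Squared-exponential kernel matrix between two point sets."""
    return np.exp(-0.5 * (xa[:, None] - xb[None, :])**2 / ell**2)

def nlml(log_params, x, y):
    ell, sigma2 = np.exp(log_params)
    K = rbf(x, x, ell) + sigma2 * np.eye(len(x))
    L = cho_factor(K, lower=True)
    alpha = cho_solve(L, y)
    # 0.5*y^T K^{-1} y + 0.5*log|K| + (n/2)*log(2*pi)
    return (0.5 * y @ alpha + np.sum(np.log(np.diag(L[0])))
            + 0.5 * len(x) * np.log(2 * np.pi))

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(np.pi * x) + 0.05 * rng.standard_normal(30)
res = minimize(nlml, x0=np.log([1.0, 0.1]), args=(x, y))
ell_hat, sigma2_hat = np.exp(res.x)
```

Optimizing in log space keeps both parameters positive without explicit bound constraints; the Cholesky factorization gives the determinant and the solve in one numerically stable pass.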
Prediction & Propagating Uncertainty
In order to predict $u^n(x^n_*)$ at a new test point $x^n_*$, we use the following conditional distribution
$$u^n(x^n_*) \,\Big|\, \begin{bmatrix} u^n_b \\ u^{n-1} \end{bmatrix} \sim \mathcal{N}\!\left(q^\top K^{-1} \begin{bmatrix} u^n_b \\ u^{n-1} \end{bmatrix},\; k^{n,n}(x^n_*, x^n_*) - q^\top K^{-1} q\right),$$
where $q^\top = \begin{bmatrix} k^{n,n}(x^n_*, x^n_b) & k^{n,n-1}(x^n_*, x^{n-1}) \end{bmatrix}$.
Now, one can use the resulting posterior distribution to obtain the artificially generated data $\{x^n, u^n\}$ for the next time step with
$$u^n \sim \mathcal{N}(\mu^n, \Sigma^{n,n}).$$
Here, $\mu^n$ is the posterior mean evaluated at the points $x^n$ and $\Sigma^{n,n}$ is the corresponding posterior covariance matrix.
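The conditioning step behind this prediction can be sketched numerically as follows. This is a schematic illustration with a generic RBF kernel and synthetic values (the specific points, kernel, and data are my own choices, not the paper's), showing how the posterior mean and covariance at fresh points become the "artificial data" carried to the next step.

```python
# Condition a zero-mean GP on observed data, then evaluate the posterior
# mean and full covariance at a fresh set of points x^n for the next step.
import numpy as np

def rbf(xa, xb, ell=0.5):
    return np.exp(-0.5 * (xa[:, None] - xb[None, :])**2 / ell**2)

x_data = np.array([-1.0, 0.0, 1.0])       # boundary + artificial points
y_data = np.cos(np.pi * x_data)           # synthetic observed values
sigma2 = 1e-4                             # noise variance

x_next = np.linspace(-1, 1, 5)            # points x^n for the next step
K = rbf(x_data, x_data) + sigma2 * np.eye(len(x_data))
q = rbf(x_next, x_data)                   # cross-covariance block
mu_next = q @ np.linalg.solve(K, y_data)                        # mu^n
Sigma_next = rbf(x_next, x_next) - q @ np.linalg.solve(K, q.T)  # Sigma^{n,n}
```

Keeping the full covariance matrix `Sigma_next`, rather than just the pointwise variances, is what allows uncertainty to be propagated consistently from one time step to the next.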
Higher Order Time Stepping
Let us consider linear partial differential equations of the form
$$u_t = \mathcal{L}_x u,$$
where $\mathcal{L}_x$ is a linear operator and $u(t,x)$ denotes the latent solution.
Linear Multi-step Methods
The trapezoidal time-stepping scheme
$$u^n = u^{n-1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n}$$
can be equivalently written as
$$u^n = u^{n-\frac{1}{2}} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-\frac{1}{2}}, \qquad u^{n-1} = u^{n-\frac{1}{2}} - \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-\frac{1}{2}}.$$
By assuming $u^{n-\frac{1}{2}}(x) \sim \mathcal{GP}\!\left(0, k(x, x'; \theta)\right)$, we can capture the entire structure of the trapezoidal rule in the resulting joint distribution of $u^n$ and $u^{n-1}$.
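For a concrete linear operator, say $\mathcal{L}_x = \nu \frac{d^2}{dx^2}$, the covariances of $u^n$ and $u^{n-1}$ follow by applying the two half-step operators to the arguments of the prior kernel on $u^{n-\frac{1}{2}}$. The following symbolic sketch is my own illustration of that bookkeeping, with an assumed squared-exponential kernel:

```python
# Derive cov(u^n, u^n), cov(u^n, u^{n-1}), and cov(u^{n-1}, u^{n-1}) from a
# prior kernel on the intermediate state u^{n-1/2}, for L_x = nu * d2/dx2.
import sympy as sp

x, xp, dt, nu, ell = sp.symbols('x xp dt nu ell', real=True)
k = sp.exp(-(x - xp)**2 / (2 * ell**2))   # prior kernel on u^{n-1/2}

def half_step(expr, var, sign):
    """Apply u -> u + sign*(dt/2)*nu*d2u/dx2 to one kernel argument."""
    return expr + sign * (dt / 2) * nu * sp.diff(expr, var, 2)

k_nn   = half_step(half_step(k, x, +1), xp, +1)   # cov(u^n(x),     u^n(x'))
k_nn1  = half_step(half_step(k, x, +1), xp, -1)   # cov(u^n(x),     u^{n-1}(x'))
k_n1n1 = half_step(half_step(k, x, -1), xp, -1)   # cov(u^{n-1}(x), u^{n-1}(x'))
print(sp.simplify(k_nn1))
```

All three derived kernels reduce to $k$ as $\Delta t \to 0$, and the diagonal blocks remain symmetric, as covariance functions must be.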
The trapezoidal time-stepping scheme can be equivalently written as a representative member of the class of Runge-Kutta methods, with stages $\tau_1 = 0$ and $\tau_2 = 1$:
$$u^n = u^{n-1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_2},$$
$$u^{n-1+\tau_2} = u^{n-1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_1} + \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_2},$$
$$u^{n-1+\tau_1} = u^{n-1}.$$
Rearranging the terms we obtain
$$u^{n-1} = u^n - \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_1} - \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_2},$$
$$u^{n-1} = u^{n-1+\tau_2} - \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_1} - \frac{\Delta t}{2}\,\mathcal{L}_x u^{n-1+\tau_2},$$
$$u^{n-1} = u^{n-1+\tau_1}.$$
Making prior assumptions similar to the ones above, we can capture the entire structure of the trapezoidal rule in the resulting joint distribution of $u^n$, $u^{n-1+\tau_2}$, $u^{n-1+\tau_1}$, and $u^{n-1}$.
We have presented a novel machine learning framework for encoding physical laws described by partial differential equations into Gaussian process priors for nonparametric Bayesian regression. The proposed algorithms can be used to infer solutions to time-dependent and nonlinear partial differential equations, and effectively quantify and propagate uncertainty due to noisy initial or boundary data. Moreover, to the best of our knowledge, this is the first attempt to construct structured learning machines which are explicitly informed by the underlying physics that possibly generated the observed data. Exploiting this structure is critical for constructing data-efficient learning algorithms that can effectively distill information in the data-scarce scenarios routinely encountered in the study of complex physical systems.
This work received support by the DARPA EQUiPS grant N66001-15-2-4055 and the AFOSR grant FA9550-17-1-0013. All data and codes are publicly available on GitHub.