Surrogates

Forward simulation can be very expensive, for instance when we keep adding compartments to an ODE-based model, increase the number of agents in an agent-based model (ABM), or make the rules within an ABM more complex. When a simulation-based model becomes so costly to evaluate that running the thousands of simulations required for analysis, optimization, or inference is impractical, surrogate models can be used. A surrogate aims to approximate the original complex model while being much faster to compute. It then replaces the original model in the wider pipeline, be it Bayesian optimization, Bayesian inference, active learning or another task.
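To make the workflow concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than part of the original text: `simulate_peak_infected` is a toy Euler-integrated SIR model standing in for an expensive simulator, and the surrogate is a deliberately simple piecewise-linear interpolant, not a Gaussian process or neural network. The pattern is the same regardless of surrogate type: run the simulator at a few design points, fit a cheap approximation, then query the approximation instead of the simulator.

```python
from bisect import bisect_right


def simulate_peak_infected(beta, gamma=0.1, days=160, dt=0.01):
    """Toy stand-in for an expensive simulator (hypothetical example).

    Integrates an SIR model with explicit Euler steps and returns the
    peak infected fraction for transmission rate `beta`.
    """
    s, i = 0.99, 0.01
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak


def fit_interpolating_surrogate(xs, ys):
    """Return a piecewise-linear function through the points (xs, ys).

    `xs` must be sorted in increasing order. Queries outside the design
    range are clamped to the nearest end point.
    """
    def surrogate(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        k = bisect_right(xs, x) - 1
        t = (x - xs[k]) / (xs[k + 1] - xs[k])
        return ys[k] + t * (ys[k + 1] - ys[k])
    return surrogate


# 1. Run the simulator at a small design of parameter values.
design = [0.15 + 0.05 * k for k in range(8)]
outputs = [simulate_peak_infected(beta) for beta in design]

# 2. Fit the cheap surrogate to the (input, output) pairs.
surrogate = fit_interpolating_surrogate(design, outputs)

# 3. Use the surrogate in place of the simulator downstream.
beta_new = 0.33
approx = surrogate(beta_new)
exact = simulate_peak_infected(beta_new)  # only computed here for comparison
print(f"surrogate: {approx:.4f}, simulator: {exact:.4f}")
```

In a real pipeline step 3 would be called thousands of times by the optimizer or sampler, which is where the savings come from.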

Gaussian processes as surrogates

Gaussian processes have long been popular choices for creating surrogate models [Gramacy, 2020]. They provide a probabilistic framework that not only predicts an output value but also quantifies the uncertainty associated with that prediction. They also offer flexibility in handling noisy data and capturing nonlinear relationships between inputs and outputs.
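As a rough illustration of predicting both a value and its uncertainty, the sketch below implements textbook Gaussian process regression with NumPy. The RBF kernel, its fixed hyperparameters (`lengthscale`, `variance`, `noise`) and the `toy_simulator` function are assumptions chosen for the example; in practice the hyperparameters would be fitted to data, typically with a dedicated GP library.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.3, variance=1.0):
    """Squared-exponential (RBF) kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)

    # Cholesky factorisation for a numerically stable solve.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha

    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def toy_simulator(x):
    """Hypothetical stand-in for an expensive simulator."""
    return np.sin(3.0 * x) + 0.5 * x


# A handful of simulator runs serves as training data.
x_train = np.linspace(0.0, 2.0, 8)
y_train = toy_simulator(x_train)

# Two test points inside the training range and one far outside it.
x_test = np.array([0.5, 1.0, 3.5])
mean, var = gp_posterior(x_train, y_train, x_test)

for x, m, s in zip(x_test, mean, np.sqrt(var)):
    print(f"x = {x:.1f}: mean = {m:.3f}, std = {s:.3f}")
```

The predictive standard deviation is small near the training inputs and grows back towards the prior far away from them, which is the signal that tasks such as Bayesian optimization and active learning use to decide where to run the simulator next.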

Neural networks as surrogates

More recently, neural networks have emerged as another powerful tool for building surrogate models. Deep learning techniques, in particular, have shown remarkable success in approximating highly complex systems. Neural networks excel at capturing intricate patterns and nonlinearities in large datasets, which makes them suitable for problems with high-dimensional input spaces. They also scale well to very large amounts of training data, which makes them attractive when many simulation runs are available.
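The sketch below shows the idea at its smallest: a one-hidden-layer network trained with full-batch gradient descent in NumPy to mimic a simulator. The `toy_simulator` function, the network size, the learning rate and the number of steps are all assumptions made for illustration; a real surrogate would usually be built with a deep learning framework, a held-out validation set and a more capable optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)


def toy_simulator(x):
    """Hypothetical stand-in for an expensive simulator."""
    return np.sin(3.0 * x) + 0.5 * x


# Training data: simulator outputs at randomly sampled inputs.
X = rng.uniform(0.0, 2.0, size=(256, 1))
y = toy_simulator(X)

hidden = 16
params = {
    "W1": rng.normal(0.0, 1.0, size=(1, hidden)),
    "b1": np.zeros(hidden),
    "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, 1)),
    "b2": np.zeros(1),
}


def forward(params, X):
    """Return hidden activations and network predictions."""
    H = np.tanh(X @ params["W1"] + params["b1"])
    return H, H @ params["W2"] + params["b2"]


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


initial_loss = mse(forward(params, X)[1], y)

learning_rate = 0.01
for _ in range(2000):
    H, pred = forward(params, X)
    # Backpropagate the mean-squared-error loss by hand.
    d_pred = 2.0 * (pred - y) / len(X)
    d_H = d_pred @ params["W2"].T
    d_Z = d_H * (1.0 - H**2)  # derivative of tanh
    grads = {
        "W2": H.T @ d_pred,
        "b2": d_pred.sum(axis=0),
        "W1": X.T @ d_Z,
        "b1": d_Z.sum(axis=0),
    }
    for name in params:
        params[name] -= learning_rate * grads[name]

final_loss = mse(forward(params, X)[1], y)
print(f"MSE before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Once trained, `forward(params, X_new)[1]` is a cheap replacement for calling the simulator. Note that, unlike the Gaussian process, this plain network returns point predictions only.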

Which surrogate to choose?

Gaussian processes (GPs) are well suited to smaller datasets and provide uncertainty estimates, but exact GP inference scales cubically with the number of training points, which limits how much simulation data they can absorb. Neural networks shine with larger datasets and more complex functional forms. The choice between the two often depends on the specific characteristics of the problem at hand, including the size and nature of the available data and the computational resources.