My current research is in numerical optimization techniques, for efficient and robust optimization of the likelihood surface of a Gaussian Process model, which we are using to emulate computational expensive computer simulator output.

Broadly speaking, Optimization is a mathematical technique used to determine the maximum or minimum value of a function. Optimization problems arise in almost every field, from mathematics, science and engineering, to finance and risk analysis. The general optimization problem is as follows: $\underset{x \in \Omega}{min} \ f(x)$ where $f$ is the objective function, that is confined to the feasible region $\Omega$. The feasible region defines the search space for the optimization algorithm, parameterized by upper and lower bound constraints, as well as general equality and inequality constraints. We should note that in order to search for the maximum value, $\underset{x \in \Omega}{max} \ f(x)$, one must simply negate the objective function $f$.

An excellent 2-dimensional function for testing the performance of optimization algorithms. Notice the multiple local and global optima present.

My post-doc supervisor, Thomas Humphries, uses a simulated oil reservoir, implemented with the use of MatLab’s Reservoir Simulation Toolbox (MRST). His primary goal is to find the optimum well position and control parameters that will return the largest net present value (NPV) of the extracted oil over a certain forecast horizon. The NPV function equates to the total amount of oil recovered minus the costs required to obtain the oil, such as the cost of injecting or disposing of water. As a result, he is experimenting with several global, gradient-free optimization techniques, in search for the algorithm that is best suited for this problem.

Of course, computer simulators are very useful tools for modeling or imitating the complexity of real world systems, such as the the simulation of an oil reservoir, in which the continuously changing properties and sheer size of the field would be too cumbersome to physically observe. As a solution, we input vast amounts of historical data, statistical approximations and known parameters into a simulator, in order to best replicate its behaviour. Unfortunately, computer simulators can become very expensive to compute, and when called multiple times can greatly reduce our computational performance. In a recent paper by Ranjan et al. (2012, University of Acadia) a Gaussian process (GP) model, GPfit, was proposed in order to build a statistical surrogate that will replace having to make expensive calls to the simulator. This method involves optimizing the likelihood function which, in turn, minimizes the deviance in our approximation, in order to provide the best true function estimates. Ranjan proposed both a genetic algorithm (GA), implemented in Matlab, as well as a multi-start, gradient based BFGS algorithm, implemented in $R$, in order to optimize the log-likelihood function. I am also currently looking into replacing the genetic algorithm with faster, more robust algorithms. We analyze the performance of the algorithms in terms of their ability to find a suitable global optimum to within an optimal number of function evaluations (the classic law of diminishing return approach). Stay tuned for posts that will discuss and further clarify the Guassian Process and optimization of a likelihood surface.

Gaussian Model fit of a 1-Dimensional function.

Until next time,

AB