Tutorial for running the Python Optimization Toolbox (POPT)¶
As an illustrative example we choose a 2D-field with one producer and two (water) injectors. The figure below shows the permeability field and the well positions. The grid is 100x100, and the porosity is 0.18. The optimization problem is to find the pressure control for the wells in order to maximize net present value.
The first step is to load neccessary external and local modules.
# Set width
from IPython.display import display, HTML
display(HTML("<style>.container { width:50% !important; }</style>"))
# Import global modules
import os
import glob
import shutil
import logging
from glob import glob
from copy import deepcopy
import numpy as np
from scipy.optimize import minimize
import matplotlib.pyplot as plt
from scipy.stats import expon
# Import local modules
from input_output import read_config # functions for reading input
from popt.misc_tools import optim_tools as ot # help functions for optimization
from popt.loop.optimize import Optimize # this class contains the iterative loop
from popt.loop.ensemble import Ensemble # this class contains the control pertrubations and gradient calculations
from simulator.opm import flow # the simulator we want to use
from popt.update_schemes.enopt import EnOpt # the standard EnOpt method
from popt.update_schemes.smcopt import SmcOpt # the sequential Monte Carlo method
from popt.cost_functions.npv import npv # the cost function
Set the random seed:
np.random.seed(101122)
The plottting function is used to display the objective function vs. iterations for different methods:
def plot_obj_func():
# Collect all results
path_to_files = './'
path_to_figures = './' # Save here
if not os.path.exists(path_to_figures):
os.mkdir(path_to_figures)
files = os.listdir(path_to_files)
results = [name for name in files if "optimize_result" in name]
num_iter = len(results)
mm = []
for iter in range(num_iter):
info = np.load(str(path_to_files) + 'optimize_result_{}.npz'.format(iter))
if 'best_func' in info:
mm.append(-info['best_func'])
else:
mm.append(-info['obj_func_values'])
f = plt.figure()
plt.plot(mm, 'bs-')
plt.xticks(range(num_iter))
plt.xticks(fontsize = 14)
plt.yticks(fontsize = 14)
plt.xlabel('Iteration no.', size=14)
plt.ylabel('NPV', size=14)
plt.title('Objective function', size=14)
f.tight_layout(pad=2.0)
plt.show()
Remove old results and folders, if present:
for folder in glob('En_*'):
shutil.rmtree(folder)
for file in glob('optimize_result_*'):
os.remove(file)
Read inputfile. In this tutorial the input file is written as a .toml file, and consists of three main keys: ensemble, optim and fwdsim. The first part contains keys related to the ensemble, the second part contains the options for the optimization algorithm and the third part are options related to the forward simulation model and objective function. The description of all keys are provided in the printouts of method docstrings below.
!cat init_optim.toml
ko, kf, ke = read_config.read_toml('init_optim.toml')
[ensemble] disable_tqdm = true ne = 10 state = ["injbhp","prodbhp"] prior_injbhp = [ ["mean","init_injbhp.npz"], ["var",6250.0], ["limits",100.0,500.0] ] prior_prodbhp = [ ["mean","init_prodbhp.npz"], ["var",6250.0,], ["limits",20.0,300.0] ] num_models = 1 transform = true [optim] maxiter = 5 tol = 1e-06 alpha = 0.2 beta = 0.1 alpha_maxiter = 3 resample = 0 optimizer = 'GA' nesterov = true restartsave = true restart = false hessian = false inflation_factor = 10 savedata = ["alpha","obj_func_values"] [fwdsim] npv_const = [ ["wop",283.05], ["wgp",0.0], ["wwp",37.74], ["wwi",12.58], ["disc",0.08], ["obj_scaling",-1.0e6] ] parallel = 2 simoptions = [ ['mpi', 'mpirun -np 3'], ['sim_path', '/usr/bin/'], ['sim_flag', '--tolerance-mb=1e-5 --parsing-strictness=low'] ] sim_limit = 5.0 runfile = "3well" reportpoint = [ 1994-02-09 00:00:00, 1995-01-01 00:00:00, 1996-01-01 00:00:00, 1997-01-01 00:00:00, 1998-01-01 00:00:00, 1999-01-01 00:00:00, ] reporttype = "dates" datatype = ["fopt","fgpt","fwpt","fwit"]
Set initial pressure (note that the filenames correspond to the names given in the input file above):
init_injbhp = np.array([300.0,250.0])
init_prodbhp = np.array([100.0])
np.savez('init_injbhp.npz', init_injbhp)
np.savez('init_prodbhp.npz', init_prodbhp)
Initialize the simulator with simulator keys.
sim = flow(kf)
print(flow.__init__.__doc__)
The inputs are all optional, but in the same fashion as the other simulators a system must be followed. The input_dict can be utilized as a single input. Here all nescessary info is stored. Alternatively, if input_dict is not defined, all the other input variables must be defined. Parameters ---------- input_dict : dict, optional Dictionary containing all information required to run the simulator. - parallel: number of forward simulations run in parallel - simoptions: options for the simulations - mpi: option to use mpi (always use > 2 cores) - sim_path: Path to the simulator - sim_flag: Flags sent to the simulator (see simulator documentation for all possibilities) - sim_limit: maximum number of seconds a simulation can run before being killed - runfile: name of the simulation input file - reportpoint: these are the dates the simulator reports results - reporttype: this key states that the report poins are given as dates - datatype: the data types the simulator reports filename : str, optional Name of the .mako file utilized to generate the ECL input .DATA file. Must be in uppercase for the ECL simulator. options : dict, optional Dictionary with options for the simulator. Returns ------- initial_object : object Initial object from the class ecl_100.
Initialize the ensemble with ensemble keys, the simulator, and the chosen objective function. Here we also extract the initial state (x0), the covariance (cov), and the bounds.
ensemble = Ensemble(ke, sim, npv)
print(Ensemble.__init__.__doc__)
x0 = ensemble.get_state()
cov = ensemble.get_cov()
bounds = ensemble.get_bounds()
Parameters ---------- keys_en : dict Options for the ensemble class - disable_tqdm: supress tqdm progress bar for clean output in the notebook - ne: number of perturbations used to compute the gradient - state: name of state variables passed to the .mako file - prior_<name>: the prior information the state variables, including mean, variance and variable limits - num_models: number of models (if robust optimization) (default 1) - transform: transform variables to [0,1] if true (default true) sim : callable The forward simulator (e.g. flow) obj_func : callable The objective function (e.g. npv)
Example using EnOpt. The input and available options are given below. During optimization, useful information is written to the screen. The same information is also written to a log-file named popt.log.
print(EnOpt.__init__.__doc__)
EnOpt(ensemble.function, x0, args=(cov,), jac=ensemble.gradient, hess=ensemble.hessian, bounds=bounds, **ko)
Parameters ---------- fun: callable objective function x: ndarray Initial state args: tuple Initial covariance jac: callable Gradient function hess: callable Hessian function bounds: list, optional (min, max) pairs for each element in x. None is used to specify no bound. options: dict Optimization options - maxiter: maximum number of iterations (default 10) - restart: restart optimization from a restart file (default false) - restartsave: save a restart file after each successful iteration (defalut false) - tol: convergence tolerance for the objective function (default 1e-6) - alpha: step size for the steepest decent method (default 0.1) - beta: momentum coefficient for running accelerated optimization (default 0.0) - alpha_maxiter: maximum number of backtracing trials (default 5) - resample: number indicating how many times resampling is tried if no improvement is found - optimizer: 'GA' (gradient accent) or Adam (default 'GA') - nesterov: use Nesterov acceleration if true (default false) - hessian: use Hessian approximation (if the algorithm permits use of Hessian) (default false) - normalize: normalize the gradient if true (default true) - cov_factor: factor used to shrink the covariance for each resampling trial (defalut 0.5) - savedata: specify which class variables to save to the result files (state, objective function value, iteration number, number of function evaluations, and number of gradient evaluations, are always saved)
2023-12-14 10:16:03,832 : INFO : popt.loop.optimize : ====== Running optimization - EnOpt ====== 2023-12-14 10:16:03,833 : INFO : popt.loop.optimize : {'alpha': 0.2, 'alpha_maxiter': 3, 'beta': 0.1, 'datatype': ['fopt', 'fgpt', 'fwpt', 'fwit'], 'hessian': False, 'inflation_factor': 10, 'maxiter': 5, 'nesterov': True, 'optimizer': 'GA', 'resample': 0, 'restart': False, 'restartsave': True, 'savedata': ['alpha', 'obj_func_values'], 'tol': 1e-06} 2023-12-14 10:16:03,833 : INFO : popt.loop.optimize : iter alpha_iter obj_func step-size cov[0,0] 2023-12-14 10:16:03,834 : INFO : popt.loop.optimize : 0 -1.9083e-01 2023-12-14 10:16:13,088 : INFO : popt.loop.optimize : 1 0 -3.1499e-01 2.00e-01 3.97e-02 2023-12-14 10:16:21,543 : INFO : popt.loop.optimize : 2 0 -3.7002e-01 2.00e-01 3.97e-02 2023-12-14 10:16:33,583 : INFO : popt.loop.optimize : 3 0 -3.7252e-01 2.00e-01 3.95e-02 2023-12-14 10:16:43,758 : INFO : popt.loop.optimize : 4 0 -3.7253e-01 2.00e-01 3.93e-02 2023-12-14 10:16:53,204 : INFO : popt.loop.optimize : 5 0 -3.7260e-01 2.00e-01 3.97e-02 2023-12-14 10:16:53,211 : INFO : popt.loop.optimize : Optimization converged in 5 iterations 2023-12-14 10:16:53,213 : INFO : popt.loop.optimize : Optimization converged with final obj_func = -0.3726 2023-12-14 10:16:53,216 : INFO : popt.loop.optimize : Total number of function evaluations = 6 2023-12-14 10:16:53,218 : INFO : popt.loop.optimize : Total number of jacobi evaluations = 5 2023-12-14 10:16:53,221 : INFO : popt.loop.optimize : Total elapsed time = 0.84 minutes 2023-12-14 10:16:53,227 : INFO : popt.loop.optimize : ============================================
<popt.update_schemes.enopt.EnOpt at 0x7f44d18aae80>
Finally, plot the objective function using the function defined above:
Example using the sequential Monte Carlo method. Here we also save the best state (which is not the mean of the ensemble simulations) and the corresponding best objective function value:
ko_smc = deepcopy(ko)
ko_smc['savedata'] += ["best_state", "best_func"]
print(SmcOpt.__init__.__doc__)
SmcOpt(ensemble.function, x0, args=(cov,), sens=ensemble.calc_ensemble_weights, bounds=bounds, **ko_smc)
Parameters ---------- fun: callable objective function x: ndarray Initial state sens: callable Ensemble sensitivity bounds: list, optional (min, max) pairs for each element in x. None is used to specify no bound. options: dict Optimization options - maxiter: maximum number of iterations (default 10) - restart: restart optimization from a restart file (default false) - restartsave: save a restart file after each successful iteration (defalut false) - tol: convergence tolerance for the objective function (default 1e-6) - alpha: weight between previous and new step (default 0.1) - alpha_maxiter: maximum number of backtracing trials (default 5) - resample: number indicating how many times resampling is tried if no improvement is found - cov_factor: factor used to shrink the covariance for each resampling trial (defalut 0.5) - inflation_factor: term used to weight down prior influence (defalult 1) - savedata: specify which class variables to save to the result files (state, objective function value, iteration number, number of function evaluations, and number of gradient evaluations, are always saved)
2023-12-14 10:17:21,181 : INFO : popt.loop.optimize : ====== Running optimization - SmcOpt ====== 2023-12-14 10:17:21,182 : INFO : popt.loop.optimize : {'alpha': 0.2, 'alpha_maxiter': 3, 'beta': 0.1, 'datatype': ['fopt', 'fgpt', 'fwpt', 'fwit'], 'hessian': False, 'inflation_factor': 10, 'maxiter': 5, 'nesterov': True, 'optimizer': 'GA', 'resample': 0, 'restart': False, 'restartsave': True, 'savedata': ['alpha', 'obj_func_values', 'best_state', 'best_func'], 'tol': 1e-06} 2023-12-14 10:17:21,182 : INFO : popt.loop.optimize : iter alpha_iter obj_func step-size 2023-12-14 10:17:21,183 : INFO : popt.loop.optimize : 0 -1.9083e-01 2023-12-14 10:17:30,338 : INFO : popt.loop.optimize : 1 0 -3.6047e-01 2.00e-01 2023-12-14 10:17:39,446 : INFO : popt.loop.optimize : 2 0 -3.7190e-01 2.00e-01 2023-12-14 10:17:49,220 : INFO : popt.loop.optimize : 3 0 -3.7443e-01 2.00e-01 2023-12-14 10:17:58,141 : INFO : popt.loop.optimize : 4 0 -3.7443e-01 2.00e-01 2023-12-14 10:18:10,110 : INFO : popt.loop.optimize : 5 0 -3.7443e-01 2.00e-01 2023-12-14 10:18:10,112 : INFO : popt.loop.optimize : Optimization converged in 5 iterations 2023-12-14 10:18:10,113 : INFO : popt.loop.optimize : Optimization converged with final obj_func = -0.3672 2023-12-14 10:18:10,113 : INFO : popt.loop.optimize : Total number of function evaluations = 6 2023-12-14 10:18:10,114 : INFO : popt.loop.optimize : Total number of jacobi evaluations = 5 2023-12-14 10:18:10,114 : INFO : popt.loop.optimize : Total elapsed time = 0.83 minutes 2023-12-14 10:18:10,114 : INFO : popt.loop.optimize : ============================================
<popt.update_schemes.smcopt.SmcOpt at 0x7f453c9638b0>
Plot the objective function using the function defined above:
Example using ensemble gradient approximation with the conjugate gradient (CG) method from scipy.minimize:
res = minimize(ensemble.function, x0, args=(cov,), method='CG', jac=ensemble.gradient, tol=ko['tol'],
callback=ot.save_optimize_results, bounds=bounds, options=ko)
print(res)
message: Maximum number of iterations has been exceeded. success: False status: 1 fun: -0.37443170946723164 x: [-1.023e-01 -1.007e-01 -2.088e-01] nit: 5 jac: [ 0.000e+00 2.207e-06 0.000e+00] nfev: 21 njev: 21
Example calling EnOpt through scipy.minimize (this does exactly the same as running EnOpt, but with a different random seed since we do not reset the seed):
for file in glob('optimize_result_*'):
os.remove(file)
minimize(ensemble.function, x0, args=(cov,), method=EnOpt, jac=ensemble.gradient, hess=ensemble.hessian,
bounds=bounds, options=ko)
2023-12-14 10:22:29,796 : INFO : popt.loop.optimize : ====== Running optimization - EnOpt ====== 2023-12-14 10:22:29,797 : INFO : popt.loop.optimize : {'alpha': 0.2, 'alpha_maxiter': 3, 'beta': 0.1, 'callback': None, 'constraints': (), 'datatype': ['fopt', 'fgpt', 'fwpt', 'fwit'], 'hessian': False, 'hessp': None, 'inflation_factor': 10, 'maxiter': 5, 'nesterov': True, 'optimizer': 'GA', 'resample': 0, 'restart': False, 'restartsave': True, 'savedata': ['alpha', 'obj_func_values'], 'tol': 1e-06} 2023-12-14 10:22:29,797 : INFO : popt.loop.optimize : iter alpha_iter obj_func step-size cov[0,0] 2023-12-14 10:22:29,798 : INFO : popt.loop.optimize : 0 -1.9083e-01 2023-12-14 10:22:38,680 : INFO : popt.loop.optimize : 1 0 -3.0921e-01 2.00e-01 3.90e-02 2023-12-14 10:22:47,943 : INFO : popt.loop.optimize : 2 0 -3.6664e-01 2.00e-01 3.90e-02 2023-12-14 10:23:00,146 : INFO : popt.loop.optimize : Optimization converged in 2 iterations 2023-12-14 10:23:00,147 : INFO : popt.loop.optimize : Optimization converged with final obj_func = -0.3666 2023-12-14 10:23:00,147 : INFO : popt.loop.optimize : Total number of function evaluations = 3 2023-12-14 10:23:00,147 : INFO : popt.loop.optimize : Total number of jacobi evaluations = 2 2023-12-14 10:23:00,148 : INFO : popt.loop.optimize : Total elapsed time = 0.52 minutes 2023-12-14 10:23:00,148 : INFO : popt.loop.optimize : ============================================
<popt.update_schemes.enopt.EnOpt at 0x7f453c96d310>
Plot the objective function using the function defined above:
Setting up the .mako file¶
The optimization relies on a .mako file for writing the current control variables to the flow simulator input. In this case, the flow simulator is opm-flow opm-projects.org, and the input file is provided as a text file (.DATA file). The .mako file is created by replacing the keywords WCONINJE and WCONPROD in the .DATA file with:
WCONINJE
'INJ-1' WATER 'OPEN' BHP 2* ${injbhp[0]} /
'INJ-2' WATER 'OPEN' BHP 2* ${injbhp[1]} /
/
WCONPROD
'PRO-1' 'OPEN' BHP 5* ${prodbhp[0]} /
/
Running locally¶
It is recommended to run the notebook from a virtual environment. Follow these steps to run this notebook on your own computer:
Step 1: Create virtual environment as normal
python3 -m venv pet_venv
Then activate the environment using:
source pet_venv/bin/activate
Step 2: Install Jupyter Notebook into virtual environment
python3 -m pip install ipykernel
Step 3: Install PET in the virtual environment, see PET installation
Step 4: Allow Jupyter access to the kernel within the virtual environment
python3 -m ipykernel install --user --name=pet_venv
Start jupyter notebook, and load tutorial_popt.ipynb (this file). On the jupyter notebook toolbar, select ‘Kernel’ and ‘Change Kernel’. The new kernel is now be available in the list for selection: