Tutorial for running the Python Inverse Problem Toolbox (PIPT)¶

As an illustrative example we choose a small 3D-field with three producers and three (water) injectors. The figure below shows the true (data generating) permeability field and the well positions. The grid is 10x10x2, and the porosity is 0.2. The inverse problem is to find the permeability for the reservoir by assimilation produced water and oil and injected water.

The first step is to load neccessary external and local modules.

In [1]:

Copied!





# Set width
from IPython.display import display, HTML
display(HTML("<style>.container { width:50% !important; }</style>"))

# Import global modules
import numpy as np
from glob import glob
import os
import matplotlib.pyplot as plt  

# Import local modules
from pipt.loop.ensemble import Ensemble # this class contains the data
from pipt.loop.assimilation import Assimilate # this class contains the iterative assimilation loop
from subsurface.multphaseflow.opm import flow # the simulator we want to use
from input_output import read_config # functions for reading input
from pipt import pipt_init # script for initializing the module with the data assimilation method 
from plot_objective_function import combined # plot the data mismatch
from plot_parameters import plot_layer, export_to_grid # plot the parameters
from plot_data import plot_prod # plot the production data
# Set width
from IPython.display import display, HTML
display(HTML(""))

# Import global modules
import numpy as np
from glob import glob
import os
import matplotlib.pyplot as plt  

# Import local modules
from pipt.loop.ensemble import Ensemble # this class contains the data
from pipt.loop.assimilation import Assimilate # this class contains the iterative assimilation loop
from subsurface.multphaseflow.opm import flow # the simulator we want to use
from input_output import read_config # functions for reading input
from pipt import pipt_init # script for initializing the module with the data assimilation method 
from plot_objective_function import combined # plot the data mismatch
from plot_parameters import plot_layer, export_to_grid # plot the parameters
from plot_data import plot_prod # plot the production data

Set the random seed:

In [2]:

Copied!

np.random.seed(10)
np.random.seed(10)

Remove old results and folders, if present:

In [3]:

Copied!





for folder in glob('En_*'):
    shutil.rmtree(folder)
for file in glob('debug_analysis_step_*'):
    os.remove(file)
for folder in glob('En_*'):
    shutil.rmtree(folder)
for file in glob('debug_analysis_step_*'):
    os.remove(file)

Read inputfile. In this tutorial the input file is written as a .toml file, and consists of two main keys: dataassim and fwdsim. The first part contains the options for the data assimilation algorithm and the second part are options related to the forward simulation model. The description of all keys are provided in the printouts of method docstrings below.

In [4]:

Copied!

!cat 3D_ESMDA.toml
kd, kf, ke = read_config.read_toml('3D_ESMDA.toml')
!cat 3D_ESMDA.toml
kd, kf, ke = read_config.read_toml('3D_ESMDA.toml')

[ensemble]
ne = 50.0
state = "permx"
prior_permx = [["vario", "sph"], ["mean", "priormean.npz"], ["var", 1.0], ["range", 10.0], ["aniso", 1.0],
               ["angle", 0.0], ["grid", [10.0, 10.0, 2.0]]]
               
[dataassim]
daalg = ["esmda", "esmda"]
analysis = "approx"
energy = 98.0
obsvarsave = "yes"
restartsave = "no"
analysisdebug = ["pred_data", "state", "data_misfit", "prev_data_misfit"]
restart = "no"
obsname = "days"
truedataindex = [400, 800, 1200, 1600, 2000, 2400, 2800, 3200, 3600, 4000]
truedata = "true_data.csv"
assimindex = [0,1,2,3,4,5,6,7,8,9]
datatype = ["WOPR PRO1", "WOPR PRO2", "WOPR PRO3", "WWPR PRO1", "WWPR PRO2",
            "WWPR PRO3", "WWIR INJ1", "WWIR INJ2", "WWIR INJ3"]
staticvar = "permx"
datavar = "var.csv"
mda = [ ["tot_assim_steps", 3], ['inflation_param', [2, 4, 4]] ]

[fwdsim]
reporttype = "days"
reportpoint = [400, 800, 1200, 1600, 2000, 2400, 2800, 3200, 3600, 4000]
replace = "yes"
saveforecast = "yes"
sim_limit = 300.0
rerun = 1
runfile = "runfile"
datatype = ["WOPR PRO1", "WOPR PRO2", "WOPR PRO3", "WWPR PRO1", "WWPR PRO2",
            "WWPR PRO3", "WWIR INJ1", "WWIR INJ2", "WWIR INJ3"]
parallel = 4
startdate = "1/1/2022"

Initialize the simulator with simulator keys.

In [5]:

Copied!

sim = flow(kf)
print(flow.__init__.__doc__)
sim = flow(kf)
print(flow.__init__.__doc__)

The inputs are all optional, but in the same fashion as the other simulators a system must be followed.
The input_dict can be utilized as a single input. Here all nescessary info is stored. Alternatively,
if input_dict is not defined, all the other input variables must be defined.

Parameters
----------
input_dict : dict, optional
Dictionary containing all information required to run the simulator.

- parallel: number of forward simulations run in parallel
- simoptions: options for the simulations
- mpi: option to use mpi (always use > 2 cores)
- sim_path: Path to the simulator
- sim_flag: Flags sent to the simulator (see simulator documentation for all possibilities)
- sim_limit: maximum number of seconds a simulation can run before being killed
- runfile: name of the simulation input file
- reportpoint: these are the dates the simulator reports results
- reporttype: this key states that the report poins are given as dates
- datatype: the data types the simulator reports

filename : str, optional
Name of the .mako file utilized to generate the ECL input .DATA file. Must be in uppercase for the
ECL simulator.

options : dict, optional
Dictionary with options for the simulator.

Returns
-------
initial_object : object
Initial object from the class ecl_100.

Print the Ensemble options:

In [6]:

Copied!

print(Ensemble.__init__.__doc__)
print(Ensemble.__init__.__doc__)

        Parameters
        ----------
        keys_da : dict
            Options for the data assimilation class

            - daalg: spesification of the method, first the main type (e.g., "enrml"), then the solver (e.g., "gnenrml")
            - analysis: update flavour ("approx", "full" or "subspace")
            - energy: percent of singular values kept after SVD
            - obsvarsave: save the observations as a file (default false)
            - restart: restart optimization from a restart file (default false)
            - restartsave: save a restart file after each successful iteration (defalut false)
            - analysisdebug: specify which class variables to save to the result files
            - truedataindex: order of the simulated data (for timeseries this is points in time)
            - obsname: unit for truedataindex (for timeseries this is days or hours or seconds, etc.)
            - truedata: the data, e.g., provided as a .csv file
            - assimindex: index for the data that will be used for assimilation
            - datatype: list with the name of the datatypes
            - staticvar: name of the static variables
            - datavar: data variance, e.g., provided as a .csv file

        keys_en : dict
            Options for the ensemble class

            - ne: number of perturbations used to compute the gradient
            - state: name of state variables passed to the .mako file
            - prior_<name>: the prior information the state variables, including mean, variance and variable limits

        sim : callable
            The forward simulator (e.g. flow)

Example using ESMDA. The input and available options are given below. During assimilation, useful information is written to the screen. The same information is also written to a log-file named pet_logger.log.

In [7]:

Copied!

analysis = pipt_init.init_da(kd, ke, sim)

print(analysis.__init__.__doc__)
assimilation = Assimilate(analysis)
assimilation.run()
analysis = pipt_init.init_da(kd, ke, sim)

print(analysis.__init__.__doc__)
assimilation = Assimilate(analysis)
assimilation.run()

Single entry for VARIO will be copied to all 2 layers
Single entry for VAR will be copied to all 2 layers
Single entry for ANISO will be copied to all 2 layers
Single entry for ANGLE will be copied to all 2 layers
Single entry for RANGE will be copied to all 2 layers

        The class is initialized by passing the keywords and simulator object upwards in the hierarchy.

        Parameters
        ----------
        keys_da['mda']: list
            - tot_assim_steps: total number of iterations in MDA, e.g., 3
            - inflation_param: covariance inflation factors, e.g., [2, 4, 4]

        keys_en : dict

        sim : callable

Iterations (Obj. func. val: ):   0%|                                                                                                                                                                                             | 0/4 [00:00<?, ?it/s]

  0%|          | 0/50 [00:00<?, ?it/s]

Iterations (Obj. func. val: ):  25%|█████████████████████████████████████████████▎                                                                                                                                       | 1/4 [00:06<00:19,  6.58s/it]

  0%|          | 0/50 [00:00<?, ?it/s]

Iterations (Obj. func. val:749926.2 Reduced: 95 %):  50%|████████████████████████████████████████████████████████████████████████████████                                                                                | 2/4 [00:13<00:13,  6.71s/it]

  0%|          | 0/50 [00:00<?, ?it/s]

Iterations (Obj. func. val:163815.9 Reduced: 78 %):  75%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████                                        | 3/4 [00:20<00:06,  6.88s/it]

  0%|          | 0/50 [00:00<?, ?it/s]

Iterations (Obj. func. val:14396.2 Reduced: 91 %): 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:27<00:00,  6.76s/it]

Convergence was met. Obj. function reduced from 14994770.0 to 14396.2

Plot the data mismatch:

In [8]:

Copied!

combined()
combined()

No description has been provided for this image

Plot the prior and posterior permeability in the upper layer:

In [9]:

Copied!

export_to_grid('permx')
plot_layer('permx', [2, 10, 10])
export_to_grid('permx')
plot_layer('permx', [2, 10, 10])

In [10]:

Copied!

plot_prod()
plot_prod()

WOPR PRO1

WOPR PRO2

WOPR PRO3

WWPR PRO1

WWPR PRO2

WWPR PRO3

WWIR INJ1

WWIR INJ2

WWIR INJ3

Setting up the .mako file¶

The data assimilation relies on a .mako file for writing the current state variables to the flow simulator input. In this case, the flow simulator is opm-flow opm-projects.org, and the input file is provided as a text file (.DATA file). The .mako file is created by replacing the keywords PERMX in the .DATA file with:

PERMX
% for i in range(0, len(permx)):
% if permx[i] < 6:
${"%.3f" %(np.exp(permx[i]))}
% else:
${"%.3f" %(np.exp(6))}
% endif
% endfor
/

Running locally¶

It is recommended to run the notebook from a virtual environment. Follow these steps to run this notebook on your own computer:

Step 1: Create virtual environment as normal

python3 -m venv pet_venv

Then activate the environment using:

source pet_venv/bin/activate

Step 2: Install Jupyter Notebook into virtual environment

python3 -m pip install ipykernel

Step 3: Install PET in the virtual environment, see PET installation

Step 4: Install Plotting in the virtual environment, see Plotting installation

Step 5: Allow Jupyter access to the kernel within the virtual environment

python3 -m ipykernel install --user --name=pet_venv

Start jupyter notebook, and load tutorial_popt.ipynb (this file). On the jupyter notebook toolbar, select ‘Kernel’ and ‘Change Kernel’. The new kernel is now be available in the list for selection:

In [ ]: