Calibration
Pedro R. Andrade, Antonio O. Gomes Jr, Claus Aranha, Washington S. França
This tutorial works with TerraME version 2.0.0-RC7 or greater.
- Introduction
- Exploring a given parameter
- Repeating simulations
- Output directory
- Automatic calibration
Introduction
After implementing a Model and verifying that it works properly, it is usually necessary to run experiments across several simulations. This process allows the modeler to investigate whether the model always converges to a stable state, to explore the possible effects of specific parameters, and to calibrate the model to better fit real data. Package calibration provides functionalities that help achieve these objectives.
Exploring a given parameter
Type MultipleRuns can simulate a given Model several times. It has strategies to help the user execute multiple simulations according to common modeling needs. For example, one of the basic simulation procedures is to explore different values for a given parameter while leaving all the others unchanged.
The code below uses the Model Daisyworld to investigate how sunLuminosity affects the final outcomes of the model. It simulates the model 121 times, using sunLuminosity from 0.4 to 1.6 with a step of 0.01 (0.4, 0.41, 0.42, ..., 1.6). Note that lower step values imply more simulations and therefore more execution time. In the end, an object mr is created with the outputs of all simulations.
import("sysdyn")
import("calibration")
mr = MultipleRuns{
    model = Daisyworld,
    parameters = {
        sunLuminosity = Choice{min = 0.4, max = 1.6, step = 0.01},
    }
}
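The count of 121 simulations can be checked with a few lines of plain Lua that do not depend on TerraME. This is only a sketch of how a range with min, max, and step expands into values; the helper name sweepValues is hypothetical and is not part of package calibration.

```lua
-- Plain Lua sketch (no TerraME required): list the values that a range
-- with min, max and step expands to. The helper name is hypothetical.
local function sweepValues(min, max, step)
    -- Round the number of steps to avoid floating-point drift
    local steps = math.floor((max - min) / step + 0.5)
    local values = {}
    for i = 0, steps do
        values[#values + 1] = min + i * step
    end
    return values
end

local values = sweepValues(0.4, 1.6, 0.01)
print(#values)                     -- 121 values, one simulation each
print(#sweepValues(0.4, 1.6, 0.005)) -- halving the step gives 241
```

Halving the step roughly doubles the number of simulations, which is why small steps increase execution time.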
After the simulations finish, it is possible to save the output, as shown below. An object of type MultipleRuns has an attribute output with all the results. The saved file will have 122 lines: one per simulation plus a header.
mr.output:save("result.csv")
It is also possible to plot the output of the simulations in a Chart using sunLuminosity as xAxis. The same output object can be used as target. Note that each point in this Chart is the final output of one simulation.
chart = Chart{
    target = mr.output,
    select = {"blackArea", "whiteArea", "emptyArea"},
    xAxis = "sunLuminosity"
}
Repeating simulations
When working with models that use random numbers, it is necessary to repeat the simulations in order to investigate whether they always converge to a stable state. MultipleRuns has an argument repetition to indicate the number of times a given simulation must be repeated. By default, each simulation is executed only once. The source code below executes model Fire from package ca, simulating different values for argument empty and repeating each of them five times. The function summary is executed at the end of the repetitions of a given simulation. In this case, it computes the average value of forest at the end of the simulation.
import("ca")
import("calibration")
mr = MultipleRuns{
    model = Fire,
    repetition = 5,
    parameters = {
        empty = Choice{min = 0.2, max = 1, step = 0.05},
        dim = 30
    },
    forest = function(model)
        return model.cs:state().forest or 0
    end,
    summary = function(result)
        local sum = 0
        -- each value of result.forest is obtained from the function above
        forEachElement(result.forest, function(_, value)
            sum = sum + value
        end)
        return {average = sum / #result.forest}
    end
}
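The two functions above can be illustrated in plain Lua, without TerraME. The sketch below assumes that the state counts behave like a table with one entry per state that is present, so that a simulation with no forest left yields nil, which is why the code above uses "or 0". The helpers countStates and average and the sample data are hypothetical.

```lua
-- Hypothetical stand-in for the counts used above: a table that maps each
-- state name to the number of cells in that state. A state with no cells
-- has no entry, so indexing it yields nil.
local function countStates(cells)
    local counts = {}
    for _, state in ipairs(cells) do
        counts[state] = (counts[state] or 0) + 1
    end
    return counts
end

-- Arithmetic mean, the same computation done by the summary function
local function average(values)
    local sum = 0
    for _, value in ipairs(values) do
        sum = sum + value
    end
    return sum / #values
end

-- Made-up final states of three repetitions (not real simulation output)
local repetitions = {
    {"forest", "forest", "burned", "empty"},
    {"burned", "burned", "empty", "empty"}, -- no forest left
    {"forest", "burned", "burned", "empty"}
}

local forest = {}
for i, cells in ipairs(repetitions) do
    forest[i] = countStates(cells).forest or 0 -- nil becomes 0
end

print(average(forest)) -- (2 + 0 + 1) / 3
```

Without the "or 0" fallback, the second repetition would contribute nil and the sum would raise an error.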
Finally, it is possible to plot the result of the averages, as shown below:
chart = Chart{
    target = mr.summary,
    select = "average",
    label = "average in the end",
    xAxis = "empty",
    color = "red"
}
Output directory
When running a simulation, all output files are created in the current directory by default. If one runs several simulations this way, each simulation overwrites the output of the previous one. MultipleRuns has an argument folderName that describes a directory where the output data will be saved. For each simulation, it creates a directory identified by the parameters of the simulation and stores the output there. This way, no file is overwritten by a following simulation during the experiment.
The code below saves the map created by each simulation. The code to save is implemented in function save, but it could be implemented in any other function within MultipleRuns as well as within the model itself. Note that, to allow saving the maps, it is necessary to use hideGraphics = false as argument to MultipleRuns.
import("ca")
import("calibration")
local m = MultipleRuns{
    model = Fire,
    hideGraphics = false,
    repetition = 2,
    folderName = "output",
    parameters = {
        empty = Choice{min = 0.2, max = 0.4, step = 0.1},
        finalTime = 5,
        dim = 20
    },
    save = function(model)
        model.map:save("map.png")
    end
}
The figure below shows the six created directories (three values of empty times two repetitions). Each of them has a file map.png with the output of the given simulation. Note that, as the simulations stop at the end of time five (see finalTime in the code above), the burning process is still taking place.
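The number of directories follows from the experiment design: one per combination of parameter value and repetition. The plain Lua sketch below enumerates these combinations. The directory names it prints are hypothetical and only illustrate the idea of one directory per run; the names that MultipleRuns actually creates may differ.

```lua
-- Plain Lua sketch: enumerate the (parameter value, repetition) pairs of
-- the experiment above. The name pattern is an assumption for illustration.
local function listRuns(min, max, step, repetition)
    local runs = {}
    local steps = math.floor((max - min) / step + 0.5)
    for i = 0, steps do
        local empty = min + i * step
        for r = 1, repetition do
            runs[#runs + 1] = string.format("empty_%.1f_run_%d", empty, r)
        end
    end
    return runs
end

local runs = listRuns(0.2, 0.4, 0.1, 2)
print(#runs) -- 3 values x 2 repetitions = 6
for _, name in ipairs(runs) do
    print(name)
end
```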
Automatic calibration
Sometimes it is useful to apply an automatic approach to gain insights into the parameters of the model. Package calibration provides SAMDE, an automatic calibration method based on genetic algorithms. It requires a goodness-of-fit function, passed as argument fit, that gets the result of a simulation and returns a value with the difference between reality and simulation. This value allows comparing the results of different simulations. The parameters it gets as argument define the search space (Choice values) as well as static values (other parameters). By default, SAMDE will try to find a set of parameters that produces an approximate minimum fit in a computationally reasonable time.
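The idea of searching a parameter space for the smallest fit can be sketched in plain Lua with an exhaustive grid search. This is not how SAMDE works internally; it is a much simpler technique, used here only to show the roles of the search space and of the fit function. The helper gridSearch and the toy fit function are hypothetical.

```lua
-- Exhaustive grid search over one parameter. This is a stand-in for
-- illustration only; it is not the algorithm implemented by SAMDE.
local function gridSearch(min, max, step, fit)
    local best, bestFit = nil, math.huge
    local steps = math.floor((max - min) / step + 0.5)
    for i = 0, steps do
        local candidate = min + i * step
        local value = fit(candidate)
        if value < bestFit then -- a smaller fit means closer to the data
            best, bestFit = candidate, value
        end
    end
    return best, bestFit
end

-- Toy fit: squared distance between the parameter and a target value of 7
local best, bestFit = gridSearch(0, 10, 1, function(x)
    return (x - 7) ^ 2
end)
print(best, bestFit)
```

A grid search evaluates every candidate, so its cost grows quickly with the number of parameters. An automatic method aims at an approximate minimum with far fewer simulations.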
The example below uses a Susceptible-Infected-Recovered (SIR) model from package sysdyn. It uses real fluData to compute the fit as the sum of the squares of the differences between observed and simulated values.
import("sysdyn")
import("calibration")
fluData = {3, 7, 25, 72, 222, 282, 256, 233, 189, 123, 70, 25, 11, 4}
fluSimulation = SAMDE{
    model = SIR,
    maxGen = 50,
    parameters = {
        contacts = Choice{min = 2, max = 50, step = 1},
        probability = Choice{min = 0, max = 1},
        duration = Choice{min = 1, max = 20},
        finalTime = 13,
        susceptible = 763,
        infected = 3
    },
    fit = function(model)
        local dif = 0
        forEachOrderedElement(model.finalInfected, function(idx, att)
            dif = dif + math.abs(att - fluData[idx]) ^ 2
        end)
        return dif
    end
}
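The arithmetic of the fit function can be tried in plain Lua, independently of SAMDE. The sketch below uses a hypothetical helper sumOfSquares and a made-up short series; only fluData comes from the example above. Since each difference is squared, taking its absolute value first does not change the result, so the helper omits it.

```lua
-- Sum of the squares of the differences between two series of equal length
local function sumOfSquares(simulated, observed)
    local dif = 0
    for idx, value in ipairs(simulated) do
        dif = dif + (value - observed[idx]) ^ 2
    end
    return dif
end

local fluData = {3, 7, 25, 72, 222, 282, 256, 233, 189, 123, 70, 25, 11, 4}

-- A simulation that reproduces the data exactly has a fit of zero
print(sumOfSquares(fluData, fluData))

-- Made-up short series, only to show the arithmetic: 0^2 + 2^2 + 3^2 = 13
print(sumOfSquares({1, 4, 6}, {1, 2, 3}))
```

Squaring penalizes large deviations more than small ones, so a single badly matched time step can dominate the fit.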
The final fit and the parameters of the best simulation are stored in the output of SAMDE, in attributes fit and instance, respectively, as shown below:
print("Difference between data and best simulation: "..fluSimulation.fit)
local modelF = fluSimulation.instance
print("Parameters of best simulation:")
print("duration: "..modelF.duration)
print("contacts: "..modelF.contacts)
print("probability: "..modelF.probability)
From the output, it is possible to repeat the best simulation using the selected parameters:
instance = SIR{
    duration = modelF.duration,
    contacts = modelF.contacts,
    probability = modelF.probability,
    susceptible = 763,
    infected = 3,
    finalTime = modelF.finalTime
}
instance:run()
data = DataFrame{data = fluData, infected = instance.finalInfected}
chart = Chart{
    target = data,
    select = {"data", "infected"},
    label = {"Data", "Best simulation"},
    title = "Infected"
}
The output of this script is shown below:
If you have comments, questions, or suggestions related to this document, please send feedback to pedro.andrade <at> inpe.br.