Calibration

Multiple Simulations

Pedro R. Andrade, Antonio O. Gomes Jr, Claus Aranha, Washington S. França

This tutorial works with TerraME version 2.0.0-RC7 or greater.

Introduction

After implementing a Model and verifying that it works properly, it is usually necessary to run experiments consisting of several simulations. This process allows the modeler to investigate whether the model always converges to a stable state, to explore the effects of specific parameters, and to calibrate the model to better fit real data. Package calibration provides functionalities that help achieve these objectives.

Exploring a given parameter

Type MultipleRuns can simulate a given Model several times. It has strategies to help the user execute multiple simulations according to common modeling needs. For example, one of the basic simulation procedures is to explore different values for a given parameter while leaving all others unchanged. The code below uses the Model Daisyworld to investigate how sunLuminosity affects the final outcomes of the model. It simulates the model 121 times, using sunLuminosity from 0.4 to 1.6 with a step of 0.01 (0.4, 0.41, 0.42, ..., 1.6). Note that lower step values imply more simulations and therefore more execution time. In the end, an object mr is created with the outputs of all simulations.

import("sysdyn")
import("calibration")

mr = MultipleRuns{
    model = Daisyworld,
    parameters = {
        sunLuminosity = Choice{min = 0.4, max = 1.6, step = 0.01},
    }
}
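The count of 121 simulations follows directly from min, max, and step. The plain Lua sketch below reproduces that arithmetic; it does not use TerraME, and the helper enumerate is a hypothetical name introduced only for illustration, not part of package calibration.

```lua
-- Hypothetical helper (not a TerraME function): lists the values
-- implied by a min/max/step range such as the Choice above.
local function enumerate(min, max, step)
    -- Round the number of steps to avoid floating-point error
    -- in (max - min) / step.
    local count = math.floor((max - min) / step + 0.5) + 1
    local values = {}

    for i = 0, count - 1 do
        -- Multiply instead of accumulating to limit rounding drift.
        values[#values + 1] = min + i * step
    end

    return values
end

local values = enumerate(0.4, 1.6, 0.01)
print(#values) -- number of simulations for this range
```

Halving the step to 0.005 would roughly double the number of values, and therefore the number of simulations.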

After that, it is possible to save the output, as shown below. An object of type MultipleRuns has an attribute output with all the results. The saved file will have 122 lines: one header plus one line for each of the 121 simulations.

mr.output:save("result.csv")

It is also possible to plot the output of the simulations in a Chart using sunLuminosity as xAxis. The same output object can be used as target. Note that each point in this Chart is the final output of one simulation.

chart = Chart{
    target = mr.output,
    select = {"blackArea", "whiteArea", "emptyArea"},
    xAxis = "sunLuminosity"
}

Repeating simulations

When working with models that use random numbers, it is necessary to repeat the simulations in order to investigate whether they always converge to a stable state. MultipleRuns has an argument repetition to indicate the number of times a given simulation must be repeated. By default, each simulation is executed only once. The source code below executes model Fire from package ca, simulating different values for argument empty and repeating each of them five times. The function summary is executed after all repetitions of a given simulation have finished. In this case, it computes the average value of forest at the end of the simulations.

import("ca")
import("calibration")

mr = MultipleRuns{
    model = Fire,
    repetition = 5,
    parameters = {
        empty = Choice{min = 0.2, max = 1, step = 0.05},
        dim = 30
    },
    forest = function(model)
        return model.cs:state().forest or 0
    end,
    summary = function(result)
        local sum = 0

        -- each value of result.forest is obtained from the function above
        forEachElement(result.forest, function(_, value) 
            sum = sum + value
        end)

        return {average = sum / #result.forest}
    end
}
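To make the role of summary concrete, the plain Lua sketch below aggregates the results of five repetitions outside TerraME. The forest counts are hypothetical values chosen for illustration, not real output of model Fire, and the function summarize is an assumed name. Besides the average, it also reports the minimum and maximum, which gives a rough idea of how much the repetitions differ.

```lua
-- Plain Lua sketch; summarize is a hypothetical helper, not part of TerraME.
local function summarize(values)
    local sum, min, max = 0, math.huge, -math.huge

    for _, value in ipairs(values) do
        sum = sum + value
        if value < min then min = value end
        if value > max then max = value end
    end

    return {average = sum / #values, min = min, max = max}
end

-- Hypothetical forest counts from five repetitions of one parameter set.
local forest = {412, 398, 405, 420, 390}
local result = summarize(forest)

print(result.average, result.min, result.max)
```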

Finally, it is possible to plot the result of the averages, as shown below:

chart = Chart{
    target = mr.summary,
    select = "average",
    label = "average in the end",
    xAxis = "empty",
    color = "red"
}

Output directory

When running a simulation, all output files are created in the current directory by default. If one runs several simulations this way, the output of consecutive simulations will be overwritten. MultipleRuns has an argument folderName that describes a directory where the output data will be saved. For each simulation, it creates a directory identified by the parameters of the simulation and stores the output there. This way, no file is overwritten by a subsequent simulation during the experiment.

The code below saves the map created by each simulation. The code to save is implemented in function save, but it could be implemented in any other function within MultipleRuns as well as within the model itself. Note that, to allow saving the maps, it is necessary to use hideGraphics = false as argument to MultipleRuns.

import("ca")
import("calibration")

local m = MultipleRuns{
    model = Fire,
    hideGraphics = false,
    repetition = 2,
    folderName = "output",
    parameters = {
        empty = Choice{min = 0.2, max = 0.4, step = 0.1},
        finalTime = 5,
        dim = 20
    },
    save = function(model)
        model.map:save("map.png")
    end
}
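The experiment above combines three values of empty (0.2, 0.3, 0.4) with two repetitions, so six simulations are executed, each needing its own output location. The plain Lua sketch below enumerates these combinations. The label format is purely hypothetical and is not the directory naming scheme used by MultipleRuns.

```lua
-- Plain Lua sketch; the labels are hypothetical and do not reproduce
-- the directory names created by TerraME.
local emptyValues = {0.2, 0.3, 0.4}
local repetition = 2
local labels = {}

for _, empty in ipairs(emptyValues) do
    for run = 1, repetition do
        -- One distinct label per (parameter value, repetition) pair.
        labels[#labels + 1] = string.format("empty=%.1f/run=%d", empty, run)
    end
end

for _, label in ipairs(labels) do
    print(label)
end
```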

The figure below shows the six created directories (three values of empty times two repetitions). Each of them has a file map.png with the output of the given simulation. Note that, as the simulations stop at the end of time five (see finalTime in the code above), the burning process is still taking place.

Automatic calibration

Sometimes it is useful to apply an automatic approach to gain insight into the parameters of a model. Package calibration provides SAMDE, an automatic calibration method based on genetic algorithms. The parameters it gets as argument define the search space (Choice values) as well as static values (other parameters). It also requires a goodness-of-fit function, fit, that receives the result of a simulation and returns a value with the difference between reality and simulation, which allows comparing the results of different simulations. By default, SAMDE tries to find a set of parameters that produces an approximately minimal fit in a computationally reasonable time. The example below uses a Susceptible-Infected-Recovered (SIR) model from package sysdyn. It uses real fluData to compute the fit as the sum of the squares of the differences.

import("sysdyn")
import("calibration")

fluData = {3, 7, 25, 72, 222, 282, 256, 233, 189, 123, 70, 25, 11, 4}

fluSimulation = SAMDE{
	model = SIR,
	maxGen = 50,
	parameters = {
		contacts = Choice{min = 2, max = 50, step = 1},
		probability = Choice{min = 0, max = 1},
		duration = Choice{min = 1, max = 20},
		finalTime = 13,
		susceptible = 763,
		infected = 3
	},
	fit = function(model)
		local dif = 0

		forEachOrderedElement(model.finalInfected, function(idx, att)
			dif = dif + math.abs(att - fluData[idx]) ^ 2
		end)

		return dif
	end
}
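The fit function above can be checked in isolation. The plain Lua sketch below computes the same sum of squared differences without TerraME; sumOfSquares is an assumed helper name, and the simulated series is a hypothetical one built by adding 1 to every observation, so that the expected result is easy to verify by hand.

```lua
-- Observed data, copied from the example above.
local fluData = {3, 7, 25, 72, 222, 282, 256, 233, 189, 123, 70, 25, 11, 4}

-- Hypothetical helper (not part of TerraME): sum of squared differences
-- between a simulated series and the observed one.
local function sumOfSquares(simulated, observed)
    local dif = 0

    for idx, value in ipairs(simulated) do
        dif = dif + (value - observed[idx]) ^ 2
    end

    return dif
end

-- Hypothetical simulation that overestimates every observation by one.
local simulated = {}
for idx, value in ipairs(fluData) do
    simulated[idx] = value + 1
end

print(sumOfSquares(fluData, fluData))   -- a perfect match gives zero
print(sumOfSquares(simulated, fluData)) -- 14 differences of 1, squared and summed
```

Because the differences are squared, large deviations at the peak of the epidemic weigh much more than small deviations at its tails.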

The final fit and the parameters of the best simulation are stored in the output of SAMDE, in attributes fit and instance respectively, as shown below:

print("Difference between data and best simulation: "..fluSimulation.fit)
local modelF = fluSimulation.instance

print("Parameters of best simulation:")
print("duration:    "..modelF.duration)
print("contacts:    "..modelF.contacts)
print("probability: "..modelF.probability)

From the output, it is possible to repeat the best simulation using the selected parameters:

instance = SIR{
	duration    = modelF.duration,
	contacts    = modelF.contacts,
	probability = modelF.probability,
	susceptible = 763,
	infected    = 3,
	finalTime   = modelF.finalTime
}

instance:run()

data = DataFrame{data = fluData, infected = instance.finalInfected}

chart = Chart{
	target = data,
	select = {"data", "infected"},
	label = {"Data", "Best simulation"},
	title = "Infected"
}

The output of this script is shown below:

If you have comments, questions, or suggestions related to this document, please send feedback to pedro.andrade <at> inpe.br.
