Risk Assessment

The accurate assessment of risks in order to evaluate their severity and to develop appropriate responses and compensations is a key function of managers in a range of fields. Simgua includes built-in support for risk assessment using the Dynamic Risk Assessment Method (DRAM).

The DRAM procedure allows you to specify a number of “thresholds” or bounds that carry a cost if the value of a primitive violates or crosses those bounds. For instance, if you were simulating the temperature of a chemical process, there might be a negative effect if the temperature rose above a certain threshold and also if the temperature fell below a certain threshold.

One way to assess the risk would be to just run the model and see if the temperature rose above or fell below the specified thresholds. This might give you a general idea of the risk, but it would not be very precise and it would be difficult to accurately assess the risk for very rare events. DRAM works by converting the results of a model simulation into a probability model that allows you to obtain precise values for the risk in the model.

Example case for DRAM in which there are both upper and lower thresholds

It is important to note that DRAM works by assuming that on some level the value of the variable you are studying is a random variable. That means DRAM must not be applied to variables where there are clearly visible trends or periodic functions in the value of the variable. If used in such cases, DRAM will generate meaningless results.

Computation Procedure

The DRAM procedure requires the execution of the following stages:

1. Obtain a time-based data series through simulation.

2. Specify a number of thresholds and the cost associated with violating each threshold. Both upper and lower bound thresholds may be specified.

3. Determine the probability distribution that best models the data in one of two ways:

a. User specification.

b. Kolmogorov-Smirnov Goodness of Fit tests.

4. Perform an autocorrelation analysis of the data. Locate the smallest separation interval at which the autocorrelation falls below a user-specified parameter. Divide the total number of data-points by this separation to obtain the number of approximately independent data-points.

5. For each threshold, calculate the probability of the threshold being violated during the simulation period based on the probability distribution and the number of independent observations.

6. Calculate risk by multiplying the probability of violation by the cost for each threshold and summing to arrive at total risk. Risks for time periods other than those used in the analysis may be determined by linear extrapolation.

 

Detailed Risk Assessment Case Study

We will now explore the steps of the DRAM procedure in more detail. A simulation of a sample model will be used to illustrate the procedure. This model predicts the level of infection of Cryptosporidium, a waterborne pathogen, within a community. The variable that this analysis focuses on is the percent of the population that is infected with the pathogen.

Step 1

The infection model was implemented in Simgua, and predicted infection levels for a specific community and set of conditions were obtained by running the simulation. The following figure is an example of a portion of the simulation results from the model. It is important to note that the level of infected population includes all those infected with the disease. This value includes those with only mild symptoms in addition to asymptomatic individuals (approximately 65% of the number of infected).

Simulated infection level data

Step 2

Thresholds based on the level of population infected and the associated costs to society were entered into the DRAM interface. Studies indicate that costs due to infection increase exponentially in relation to infection levels. All thresholds were upper bound thresholds given the nature of the problem. For other problems, such as the temperature or pressure of a boiler, one could conceivably have both upper and lower bound thresholds.

Step 3

The Simgua implementation of DRAM includes three probability models: the Normal, Lognormal and Gamma distributions. The user may specify which model to use or may let the algorithm choose the best-fit distribution based on Kolmogorov-Smirnov Goodness of Fit tests. The test works by comparing the theoretical Cumulative Distribution Function (CDF) of a candidate distribution against the empirical CDF of the dataset. The following figure contains two Kolmogorov-Smirnov Goodness of Fit diagrams for the infection level data and the Normal and Lognormal distributions. Clearly, the Lognormal Distribution better approximates the observed data, as the two curves are visibly closer together.

Two Kolmogorov-Smirnov Goodness of Fit Tests

The KS parameter is determined by taking the maximum distance between the theoretical and empirical distributions. The results of this test are then reported as a hypothesis test where the null hypothesis is that the data follow the distribution (the distribution is accepted), and the alternative hypothesis is that they do not (the distribution is rejected). Alpha is the probability of incorrectly rejecting the distribution. The current implementation of DRAM displays the results of this test for alphas of 20%, 10%, 5% and 1%. When choosing the best distribution automatically, DRAM selects the distribution that results in the smallest KS value when the test is applied to the time series data.
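The selection rule described above can be sketched in a few lines of Python. This is only an illustration and not Simgua's implementation: the function names are hypothetical, the distribution parameters are estimated by simple moment matching (an assumption), and the Gamma distribution is left out to keep the sketch within the standard library. The data must be strictly positive for the Lognormal fit.

```python
import math
from statistics import NormalDist, fmean, pstdev


def ks_statistic(data, cdf):
    """Maximum distance between the empirical CDF of data and a theoretical cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, f - i / n, (i + 1) / n - f)
    return d


def best_fit(data):
    """Return the candidate with the smallest KS value, plus all KS values."""
    # Assumption: parameters are estimated from the sample mean and
    # standard deviation of the data (Normal) and of its logs (Lognormal).
    normal = NormalDist(fmean(data), pstdev(data))
    logs = [math.log(x) for x in data]
    log_normal = NormalDist(fmean(logs), pstdev(logs))
    candidates = {
        "Normal": normal.cdf,
        "Lognormal": lambda x: log_normal.cdf(math.log(x)) if x > 0 else 0.0,
    }
    scores = {name: ks_statistic(data, cdf) for name, cdf in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical infection levels (percent of population), for illustration only.
sample = [0.8, 1.1, 0.9, 1.6, 2.4, 1.2, 0.7, 3.1, 1.0, 1.4]
name, scores = best_fit(sample)
print(name, scores)
```

Picking the smallest KS value only ranks the candidates against each other. Whether the winning distribution is acceptable at a given alpha is a separate question that this sketch does not answer.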

Step 4

The fundamental problem encountered when applying random number statistical theory to dynamic simulation results is that the simulation results are not (in the vast majority of cases) random variables. There is often a high level of autocorrelation between sequential data points in the resulting time series. Thus the sequential observations are not strictly independent. To compensate for this fact, the DRAM algorithm analyzes the autocorrelation within the dataset to approximate the total number of independent observations.

First, a correlogram (autocorrelation plot) for the data series is constructed. The correlogram plots the autocorrelation within the dataset for different offsets. The algorithm then locates the first offset at which the autocorrelation falls below a set threshold. The recommended threshold is 0.707, the point at which R² is 0.5, indicating that the majority of an observation is no longer explained by the previous observations. This offset is recorded as the critical offset. The total number of data-points is then divided by the critical offset, and the result is the number of approximately independent data points. The following figure illustrates such a correlogram for the disease model.

Sample correlogram; the critical offset in this analysis is found to be 10
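The autocorrelation step can be sketched as follows. This is a minimal illustration under stated assumptions, not Simgua's code: the function names are hypothetical, the autocorrelation uses the common estimator that divides by the total sum of squares, and the example series is a synthetic autoregressive process rather than output from the infection model.

```python
import random
from statistics import fmean


def autocorrelation(series, lag):
    """Sample autocorrelation of series at the given offset (lag)."""
    mean = fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum(
        (series[i] - mean) * (series[i + lag] - mean)
        for i in range(len(series) - lag)
    )
    return num / denom


def critical_offset(series, cutoff=0.707):
    """Smallest offset at which the autocorrelation falls below cutoff."""
    if len(series) < 2:
        raise ValueError("need at least two data points")
    for lag in range(1, len(series)):
        if autocorrelation(series, lag) < cutoff:
            return lag
    return len(series)  # never fell below the cutoff: treat as one observation


def independent_count(series, cutoff=0.707):
    """Approximate number of independent data points in series."""
    return len(series) / critical_offset(series, cutoff)


# Hypothetical autocorrelated series: x[t] = 0.95 * x[t-1] + noise.
rng = random.Random(42)
series = [0.0]
for _ in range(999):
    series.append(0.95 * series[-1] + rng.gauss(0.0, 1.0))

print(critical_offset(series), independent_count(series))
```

The result may be fractional. It is used as an effective sample size rather than a literal count of data points.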

The success of the method is seen in the way the number of independent observations changes as we change the simulation time step. The simulation time step is a parameter of system dynamics models which is used by numerical solvers. The smaller the time step, the more accurate the resulting simulation but the longer it takes to complete the simulation. Theoretically, reducing the time step does not change the probability that thresholds will be exceeded (assuming it is already at a relatively fine-grained level and the level of randomness in the model is not based on the time step), thus the number of independent data-points should remain constant. Cutting the time step in half (while holding constant the duration of the simulation) results in a doubling of the number of observations. In our tests for the infection model, however, the number of independent observations calculated by DRAM only changed by approximately 0.5% when the number of real observations was doubled.
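The reasoning behind this invariance can be illustrated with an idealized calculation. The sketch below assumes, purely for illustration, that the autocorrelation decays exponentially with elapsed time with a time constant tau. That is an assumption and not a property of the infection model, and all names and values are hypothetical. Under this assumption, halving the time step doubles both the number of observations and the critical offset, so their ratio is unchanged.

```python
import math


def critical_offset_exponential(dt, tau, cutoff=0.707):
    """Smallest offset at which exp(-offset * dt / tau) falls below cutoff."""
    lag = 1
    while math.exp(-lag * dt / tau) >= cutoff:
        lag += 1
    return lag


def independent_count(duration, dt, tau, cutoff=0.707):
    """Observations in the run divided by the critical offset."""
    n_obs = round(duration / dt)
    return n_obs / critical_offset_exponential(dt, tau, cutoff)


# Same simulated duration, time step cut in half (hypothetical values).
coarse = independent_count(duration=10_000, dt=1.0, tau=100.0)
fine = independent_count(duration=10_000, dt=0.5, tau=100.0)
print(coarse, fine)
```

With real simulation output the critical offset is an integer estimated from noisy data, so small deviations such as the 0.5% reported above are to be expected.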

Step 5

For a random variable, the probability of a lower bound threshold not being violated or crossed in one trial is denoted q_l and is determined with equations described in the Simgua manual.
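The exact equations are given in the Simgua manual and are not reproduced here. As a rough sketch of the idea, the code below uses the standard result for independent trials: if q is the probability that a threshold is not violated in one trial, the probability of at least one violation in n independent trials is 1 - q^n. The function names, the use of a Normal distribution, and all parameter values are assumptions made for illustration.

```python
from statistics import NormalDist


def prob_no_violation_single(cdf, threshold, kind):
    """Probability that one observation does not violate the threshold."""
    if kind == "upper":
        return cdf(threshold)        # value stays at or below the upper bound
    if kind == "lower":
        return 1.0 - cdf(threshold)  # value stays above the lower bound
    raise ValueError("kind must be 'upper' or 'lower'")


def prob_violation(cdf, threshold, kind, n_independent):
    """Probability of at least one violation in n_independent trials."""
    q = prob_no_violation_single(cdf, threshold, kind)
    return 1.0 - q ** n_independent


# Hypothetical fitted distribution and threshold.
dist = NormalDist(mu=2.0, sigma=0.5)
p = prob_violation(dist.cdf, threshold=3.5, kind="upper", n_independent=100)
print(p)
```

Because n_independent comes from dividing the number of data-points by the critical offset, it may be fractional. The expression 1 - q^n remains well defined in that case.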

Step 6

The probability of violation is calculated separately for each threshold. Combined risk is calculated by multiplying cost by probability for each threshold and then summing the totals.

The following figure illustrates example results from a DRAM analysis for the disease simulation. Hypothetical costs were assumed for testing the algorithm. Calculated risk for the simulated community of 50,000 individuals over the course of 5 years is $9,123.33. This cost may be scaled linearly for different time periods resulting in a yearly risk of: R = $1,824.66.


Sample results from DRAM analysis
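The final calculation can be sketched as below. The function names and the threshold probabilities and costs are hypothetical and serve only to show the arithmetic. The scaling example reuses the five-year figure quoted above.

```python
def total_risk(thresholds):
    """Sum of probability * cost over (probability, cost) pairs."""
    return sum(probability * cost for probability, cost in thresholds)


def scale_risk(risk, analysed_period, target_period):
    """Linearly rescale a risk from the analysed period to another period."""
    return risk * target_period / analysed_period


# Hypothetical thresholds: (probability of violation, cost of violation).
thresholds = [(0.10, 1_000.0), (0.01, 50_000.0)]
print(total_risk(thresholds))

# Scaling the five-year risk quoted above to a single year.
print(scale_risk(9123.33, analysed_period=5, target_period=1))
```

Linear scaling is a convenience for comparing periods. It treats risk as accruing evenly over time, which is an approximation.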