Probability Distributions for Simulation
For experienced modelers, the most challenging task in creating a simulation model is usually not identifying the key inputs and outputs, but selecting an appropriate probability distribution and parameters to model the uncertainty of each input variable. For example, Risk Solver Platform software provides over 40 analytic probability distributions -- which one should you use? The answer depends on your application, but some general guidelines can be given.
Discrete Vs. Continuous Distributions
If you must choose or create your own distribution, the first step is to determine whether to use a discrete or continuous form.
Bounded Vs. Unbounded Distributions
Another characteristic that distinguishes probability distributions is the range of sample values they can generate.
At times, you may find that the most appropriate distribution (say, the Normal) is unbounded, but you know that the realistic values of the physical process are bounded, or your model is designed to handle values only up to some realistic limit. Your software may allow you to truncate an unbounded distribution. For example, in Risk Solver Platform you can impose bounds on any distribution by passing the PsiTruncate property function as an argument to the distribution function.
Analytic Vs. Custom Distributions
A third characteristic of probability distributions is whether they are analytic (also called parametric) or custom (sometimes called non-parametric) distributions.
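To make the analytic/custom distinction concrete, here is a minimal Python sketch (not Risk Solver Platform code). It assumes that one simple form of custom distribution is resampling directly from past observations; the `observed` values and the function names are made up for illustration.

```python
import random

# Hypothetical past observations of an uncertain input (made-up numbers).
observed = [92.0, 97.5, 101.0, 103.5, 99.0, 110.0, 95.5, 104.0]

def sample_analytic(rng, mean=100.0, sd=10.0):
    """Analytic (parametric): a Normal defined entirely by its two parameters."""
    return rng.gauss(mean, sd)

def sample_custom(rng, data=observed):
    """Custom (non-parametric): resample one of the past observations."""
    return rng.choice(data)

rng = random.Random(42)  # fixed seed so the run is repeatable
analytic_draws = [sample_analytic(rng) for _ in range(1000)]
custom_draws = [sample_custom(rng) for _ in range(1000)]

if __name__ == "__main__":
    print("analytic range:", min(analytic_draws), max(analytic_draws))
    print("custom range:  ", min(custom_draws), max(custom_draws))
```

The custom draws never leave the range of the recorded data, while the analytic Normal can produce values beyond anything observed. Which behavior is preferable depends on whether the Normal's theoretical assumptions actually hold for the input being modeled.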
Generally speaking, you should choose an analytic distribution if -- and only if -- the theoretical assumptions truly apply in your situation.
More Hints and Warnings
Using a Triangular Distribution. If you have only estimates of the minimum, maximum, and most likely values of an uncertain variable -- and no other past data or literature references -- a popular approach is to create a Triangular distribution from these three numbers. This is unlikely to be a highly accurate representation of the uncertainty, but it will allow you to get started, and it is far better than a single average that is subject to the Flaw of Averages. If your 'minimum' and 'maximum' values are really low- and high-percentile estimates rather than the absolute lowest and highest values that can occur, consider using a 'generalized Triangular' distribution (PsiTriangGen in Risk Solver Platform) instead.
Define Each Uncertain Variable Only Once. Often, you'll need to use the same uncertain variable in several different formulas in your model. A very common error is to enter the same distribution function, with the same parameters (say PsiNormal(100, 10) in Risk Solver Platform), several times in a model, in the belief that these instances will yield the same value on each trial. This is incorrect: by doing this, you've actually defined several independent uncertain variables that may well sample different values on each trial. You should instead define =PsiNormal(100, 10) only once (for example in a cell such as A1), and use A1 in every formula where the uncertain variable is needed.
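The effect of duplicating a distribution can be sketched outside the spreadsheet. The Python example below is an illustration under assumptions, not Risk Solver Platform code: the profit formula, `PRICE`, `UNIT_COST`, and the function names are hypothetical, and `rng.gauss(100, 10)` stands in for PsiNormal(100, 10).

```python
import random
import statistics

PRICE, UNIT_COST = 5.0, 3.0  # hypothetical values chosen for illustration

def profit_shared(rng):
    """Correct: draw demand once per trial and reuse it (like referencing A1)."""
    demand = rng.gauss(100, 10)
    return PRICE * demand - UNIT_COST * demand

def profit_duplicated(rng):
    """Incorrect: each formula makes its own independent draw."""
    revenue = PRICE * rng.gauss(100, 10)
    cost = UNIT_COST * rng.gauss(100, 10)
    return revenue - cost

def run(trials=10_000, seed=1):
    """Return the standard deviation of profit under each approach."""
    rng = random.Random(seed)
    shared = [profit_shared(rng) for _ in range(trials)]
    duplicated = [profit_duplicated(rng) for _ in range(trials)]
    return statistics.stdev(shared), statistics.stdev(duplicated)

if __name__ == "__main__":
    sd_shared, sd_duplicated = run()
    print(f"std dev, demand defined once:  {sd_shared:.1f}")
    print(f"std dev, demand defined twice: {sd_duplicated:.1f}")
```

Both versions have the same average profit, but the duplicated version treats revenue and cost as if they were driven by unrelated demands, so its spread is much wider. In theory the standard deviation is (5 - 3) × 10 = 20 when demand is shared, versus √(5² + 3²) × 10 ≈ 58 when it is drawn twice.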