
How do you deal with unknown unknowns in environmental modelling?

Summary

How do you deal with unknown unknowns in environmental modelling? When asked to represent error, we leap to quantifying different aspects of data error and proudly proclaim a 95% confidence interval. However, this neglects epistemic error – the Rumsfeldian unknown unknowns.

When we are modelling purely physical processes, this may not matter too much because the science is quite mature. But predicting biological or social outcomes is fraught with epistemic error because the science of causal response is not as well developed.

By ignoring epistemic error we ignore the limits of our understanding of processes and system interactions.

Author: Nick Marsh

How do we deal with confidence in NRM actions?

We confidently present a range of NRM investment opportunities whenever there is an open grant round. If pressed, we will give predictions of the outcomes. What we do poorly is to qualify those predictions of outcomes from our on-ground actions. When a measure of confidence in our outcome prediction is demanded, we engage a mathematician:

  • The mathematician talks to a statistician.
  • Laborious data collation from any vaguely related on-ground action then ensues.
  • Collected data is cleaned.
  • Data distributions are analysed, and data further cleaned.
  • Data variance is calculated.
  • 95% confidence intervals are proclaimed (see the sketch below).
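A minimal sketch of where that workflow typically lands, assuming some hypothetical, already-cleaned monitoring data and using Python with NumPy and SciPy:

```python
# Sketch of the "easy bit": turning cleaned observations of an outcome measure
# into a 95% confidence interval. The data below are hypothetical.
import numpy as np
from scipy import stats

# Hypothetical observations (e.g. tonnes of sediment retained per treated hectare).
observations = np.array([4.1, 3.8, 5.2, 4.6, 3.9, 4.4, 5.0, 4.2])

mean = observations.mean()
sem = stats.sem(observations)  # standard error of the mean

# 95% confidence interval using the t-distribution (small sample size).
low, high = stats.t.interval(0.95, len(observations) - 1, loc=mean, scale=sem)

print(f"Predicted outcome: {mean:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Nothing in that interval says whether the assumed causal relationship behind the prediction is right; it only describes the spread of the data we happened to collect.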

Those confidence intervals represent only part of the story: the easy bit. There are many potential sources of uncertainty. Most fall under different ways of getting the data wrong (sample size, methods, lab errors) and are bundled into the variance calculation described above. Data uncertainty is great because it is satisfyingly quantifiable.

The real killer when getting a grip on uncertainty is epistemic error. Epistemic error is our inability to understand the system. For example, our measure used to represent social benefits from an on-ground action assumes some relationship, which could be completely wrong. Error is most easily described using the Rumsfeldian lingo. Data errors are the ‘known unknowns’ and epistemic errors are the ‘unknown unknowns’. The trick is how to quantify the unknown unknowns.

Truii’s approach

When asked to quantify the epistemic error (the unknown unknowns), we want to know who is asking, and why.

When quantifying the likely relative confidence in the outcomes of NRM actions to inform investment, the absolute unknowns are less important than the relative (between actions) unknowns. We don't necessarily need to know the absolute error in predicting the outcomes of action A, just that it is smaller than the error in predicting the outcome of action B because the science for action A is more mature.

Our approach in Natural Capital Region has been to use a survey to rank the data error (three questions) and the epistemic error (one question) for every relationship between an action and its predicted outcome. An action then has a different level of confidence for each outcome measure. For example, we may have high confidence in the erosion control benefit of contour banks but low confidence in their predicted local economic benefit, because there is less science to support that relationship.

Figure 1: Confidence survey with epistemic error questions highlighted.
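To make the ranking concrete, here is a toy sketch of how answers to a survey like the one in Figure 1 might be collapsed into a relative confidence score. The 1-5 scale, the example scores, and the choice to let the single epistemic question cap the averaged data score are illustrative assumptions, not a description of the actual survey or its weighting.

```python
# Illustrative only: survey scale, scores and aggregation rule are assumptions.

# Responses per (action, outcome) relationship on a 1 (low) to 5 (high) scale:
# three data-error questions plus one epistemic-error question.
responses = {
    ("contour banks", "erosion control"):        {"data": [4, 5, 4], "epistemic": 4},
    ("contour banks", "local economic benefit"): {"data": [3, 4, 3], "epistemic": 1},
}

def relative_confidence(answer):
    """Collapse one set of survey answers into a single relative score.

    Averaging the data-error questions and capping the result at the
    epistemic score is one possible rule; it stops good data from
    masking a poorly understood causal relationship.
    """
    data_score = sum(answer["data"]) / len(answer["data"])
    return min(data_score, answer["epistemic"])

for (action, outcome), answer in sorted(
    responses.items(), key=lambda kv: relative_confidence(kv[1]), reverse=True
):
    print(f"{action} -> {outcome}: confidence {relative_confidence(answer):.1f}/5")
```

Because the scores are only used to compare action-outcome pairs against each other, the absolute numbers matter far less than the ordering they produce.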

This isn’t the last word on error, uncertainty and confidence; it is simply our pragmatic implementation of a method to illustrate the relative confidence between action outcome predictions without needing to produce a quantitative error prediction.
