Maitri’s Introduction to Geostatistics and Uncertainty: Part I

This is the first of several posts that will form a gentle tutorial on geostatistics and uncertainty from the point of view of an applied-geophysics practitioner. Please feel free to ask questions and leave comments, clarifications and corrections below. This post will also be shared in the Subsurf Wiki as preparation material for the Agile Geophysics Hackathon being held in Houston, Texas on September 21st and 22nd.

The bulk of my work these days is the inversion of seismic data for reservoir characteristics that, along with logs, cores, maps, outcrop studies and production history, feed into the generation of outputs such as hydrocarbon reserves, fluid flow and drilling locations. It is akin to reconstructing a complete internal picture of an individual human from x-rays, MRIs, etc. to get a picture of the circulatory system and then glean the amount of blood, the rate of blood flow and where to operate or place a needle. When a geophysicist is fortunate enough to have well data from a penetration in the reservoir (think: biopsy with blood sample), he or she is then faced with the challenge of extrapolating that data outward from the well to the rest of the area of interest and also interpolating the data between two or more well penetrations. Thus, a lot of my time is spent dealing with uncertainty and employing geostatistics, i.e. statistics specifically for the earth.

Again, the earth, much like the human body, is a complex, heterogeneous, anisotropic and discontinuous entity. In this case, it has sedimentary rock units that change shape and size in x, y, z space and whose internal characteristics (such as rock type, depth, age, thickness, grain size, pore distribution, cementation, diagenetic history, layering, fluid type, fluid viscosity and pressure difference) and resultant porosity and permeability change within the extent of each unit. These characteristics are often related to one another, but not always in a singular and straightforward fashion. It is then the geoscientist’s job to determine if all of the data correlate; if not, which data do we believe more and how much more? Furthermore, as an exact description of a large, remote system is virtually impossible, our models need to be as accurate and repeatable as possible but also manageable and not computationally costly. Ultimately, we wish to achieve the “simulation of flow at a reasonable scale.”

As 100% model accuracy is never achieved, what is always left is uncertainty. My favorite definition of uncertainty goes as follows: “Uncertainty of a measured value is an interval around that value such that any repetition of the measurement will produce a new result that lies within this interval.” There are a couple of important points to keep in mind about uncertainty estimation in geostatistics:

  • “Uncertainty is not an intrinsic property of the [system]; it is the result of incomplete knowledge by the observer.” The rock is not uncertain, you are uncertain about the rock.
  • Since it is a spatial interpolation of properties in a static system, and not a forecast or some other kind of classical statistical exercise (like where will a hurricane make landfall or who will win the 2016 presidential election), any geostatistical analysis must involve the data in their entirety. In other words, “geostatistics uses the sampling location of every measurement. [And] unless the measurements show spatial correlation, the application of geostatistics is pointless.”
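One common way to check whether measurements show spatial correlation is an empirical semivariogram: half the average squared difference between pairs of samples, grouped by how far apart they are. The sketch below is a minimal illustration, not a production tool. The function name `empirical_semivariogram`, the synthetic 1D "property" (a sine curve) and the lag settings are all assumptions made for the example. The same values are then shuffled across locations to show what a spatially uncorrelated dataset looks like.

```python
import math
import random


def empirical_semivariogram(coords, values, lag_width, n_lags):
    """Return (lag_centers, gamma, pair_counts) using every pair of samples.

    coords: list of coordinate tuples, e.g. (x,) or (x, y)
    values: measured property at each coordinate
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])   # separation distance
            k = int(h // lag_width)               # which lag bin the pair falls in
            if k < n_lags:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    centers = [(k + 0.5) * lag_width for k in range(n_lags)]
    gamma = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return centers, gamma, counts


# Hypothetical data: 100 samples along a line, smoothly varying in space.
xs = [(float(x),) for x in range(100)]
smooth = [math.sin(x[0] / 20.0) for x in xs]

# Same values, but assigned to random locations: spatial correlation destroyed.
shuffled = smooth[:]
random.Random(0).shuffle(shuffled)

centers, gamma_corr, counts = empirical_semivariogram(xs, smooth, 5.0, 5)
_, gamma_shuf, _ = empirical_semivariogram(xs, shuffled, 5.0, 5)

for c, g1, g2 in zip(centers, gamma_corr, gamma_shuf):
    print(f"lag {c:5.1f}: correlated {g1:.4f}   shuffled {g2:.4f}")
```

For the smooth data the semivariance starts near zero and climbs with distance, which is the signature of spatial correlation. For the shuffled data it is roughly flat at every lag, and that is the case where geostatistics has nothing to work with.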

The fact that every sample location is used means the problem-specific application of spatially weighted geostatistical techniques, deterministic or probabilistic/stochastic, such as kriging, sequential Gaussian simulation, Markov chain analysis, genetic models, cellular automata and multi-point statistics, to name the hot few. Uncertainty can then be reported along with a probability.
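To make "spatially weighted" and "uncertainty reported along with the estimate" a bit more concrete, here is a minimal sketch of ordinary kriging along a 1D line between wells. Everything specific in it is an assumption for illustration: the exponential covariance model, its sill and range, the three well positions and their porosity values, and the helper names `exp_cov`, `solve` and `ordinary_krige`. In real work the covariance model would be fitted to the data rather than chosen by hand.

```python
import math


def exp_cov(h, sill=4e-4, corr_range=300.0):
    """Exponential covariance model (assumed); sill in porosity units squared."""
    return sill * math.exp(-3.0 * abs(h) / corr_range)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_krige(xs, zs, x0, cov=exp_cov):
    """Return (estimate, kriging variance) at location x0 from samples (xs, zs)."""
    n = len(xs)
    # Data-to-data covariances, plus the unbiasedness constraint (weights sum to 1).
    a = [[cov(xs[i] - xs[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Data-to-target covariances.
    b = [cov(x - x0) for x in xs] + [1.0]
    sol = solve(a, b)
    w, mu = sol[:n], sol[n]          # kriging weights and Lagrange multiplier
    est = sum(wi * zi for wi, zi in zip(w, zs))
    var = cov(0.0) - sum(wi * bi for wi, bi in zip(w, b[:n])) - mu
    return est, max(var, 0.0)        # clamp tiny negative round-off


# Hypothetical wells: position along a line (m) and measured porosity.
well_x = [0.0, 400.0, 1000.0]
well_phi = [0.18, 0.22, 0.15]

for x0 in (400.0, 450.0, 700.0):
    est, var = ordinary_krige(well_x, well_phi, x0)
    print(f"x = {x0:6.1f} m: porosity {est:.3f} +/- {math.sqrt(var):.3f}")
```

At a well the estimate honors the measurement and the kriging variance collapses to zero; away from the wells the variance grows. Note that this variance depends only on the sample geometry and the assumed covariance model, not on the measured values, so it is only as trustworthy as that model.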

The fact that every sample location is used does not mean that the model is right and/or that the geostatistical method was robust. As many teachers of the subject ought to point out, many bad reservoir models hide behind the curtain of geostatistical jargon and poor usage. So, it is not measurement error that we should worry about, but any analysis error introduced by faulty geostatistics, including in initial upscaling. And, even if the most statistically sound methods were used, they “can help us, but most of us don’t know what we’re doing with statistics (be honest). Do we just need more data? No. More expensive analysis equipment? No. No, none of this will help. You cannot beat uncertainty. You just have to deal with it.”

A note on “error.” This short but awesome college science guide to Precision, Accuracy, Error and Uncertainty doesn’t waste time telling you what it thinks of the concept:

You may be amazed to discover that error is not that important in the discussion of experimental results … Do not write “human error” as any part of your lab report. It is in the first place embarrassing, and in our experience as faculty members, it is rarely the source of experimental problems. (Well over half of problems producing bad laboratory results are due to analysis errors in the report! Look here first.) … Uncertainty, rather than error, is the important term to the working scientist.


A few mornings ago, I woke up thinking, “So much uncertainty associated with my various inversion products. I really need to bone up on my geostatistics!” By the end of the day, I had been asked to chair the SEG IQ Earth Forum session on Geostatistics and Uncertainty, coming up in August. Following this is the Agile Geophysics Hackathon in which participants will compete to develop apps that convey error and uncertainty in applied geophysics, right before the annual SEG conference. Moral of the story: Be careful what you fill your waking thoughts with. It may come true.


2 comments
  • Clay August 1, 2013, 8:15 AM

    “Uncertainty of a measured value is an interval around that value such that any repetition of the measurement will produce a new result that lies within this interval.”

    Love that definition.

    Last semester I finally got into some serious statistics. One of the things that engineers are finally moving towards (especially in ocean structures) is probabilistic loading and strength. In the past, engineers figured out the governing loading (e.g. shear from a lifting operation on a beam), designed for that, and called it a day. With ocean structures especially, storm loading in the worst case is unknowable, so we resort to stochastic models of waves and (this is a real WHOA moment for a lot of engineers) stochastic models of strength, and make sure that the two bell curves have enough offset. Very trippy when it was first introduced to me.

  • Evan August 29, 2013, 10:59 AM

    I quite like the term ‘expectation value’ that is used a lot in quantum mechanics. Even though the name suggests that it is a single value, it is actually a density of probabilities that take their form from certain functions (PDF) within the system.

    “Statisticians can reuse their data to quantify the uncertainty of complex models” – Cosma Shalizi, American Scientist, 98(10), 2010.

    As far as our job goes, that of describing reality with sparse and imprecise information, maybe uncertainty is indeed inherent to the system? At least in the remote and sparse way in which we will ever interact with it. I do not mean to say that there are earth materials in a would-be reservoir that remain in an indeterminate state of existence until the box is opened (like Schrödinger’s mythical cat). But I think our experiments should reflect that notion of an uncertain relationship that only collapses to a concise state when we produce this thing called an observation.

    You are bang on when you say uncertainty is inherently tied to the experiment. Better science, data stewardship, and holding truth as an ideal means documenting the known sources of uncertainty. This is something the oil and gas industry, and the technology companies that support it, have not done yet. It’s time to start. In a geo-spatial parameter space, I think this needs to take the form of visualization tricks: fault swarms instead of fault sticks, horizon clouds instead of horizon surfaces, that vary and modulate with some measure of reliability with the experiment, and have some form of memory or index to the data from which they came.

    Which brings me to two open questions:
    How do we capture and relay metrics of geo-spatial subsurface uncertainty?
    What are some graphical and map elements that we can utilize for displaying, visualizing, or annotating uncertainty?
