Editor's Note: HeadFi member Macedionian Hero may be a headphone geek by night, but he's the Director of Engineering at a MIL/AERO electronics manufacturing firm by day. He's also a "Lean Six Sigma Black Belt" (don't know what he had to do to get that, but I bet there's a lot of math), and one of his specialities is characterizing the precision and accuracy of electronic systems. He volunteered to evaluate the precision of my headphone frequency response measurements.
Did some nail biting over this one, I tell ya.
Introduction
When I joined Headfi several years back, one of the first things that drew my attention was the wonderful headphone measurements offered by Headroom, and now most recently by innerfidelity.com (both thanks to Tyll). Being an engineer, I guess I'm kind of naturally drawn to this stuff like a bee to honey. But one question was constantly in the back of my mind: how accurate and relevant are these measurements?
In my 15+ year career in electronics manufacturing, I've measured many things, but I've also learned that just because one measures something and applies a "number" to it, doesn't mean that the story ends there. Measurement systems have inaccuracies built into them and introduce variations as well. Please note that I am defining "measurement systems" to include the person(s) taking the measurements as well.
How Operators are a Source of Variance
Every September for Bring Your Kids to Work Day, I run a quick little experiment with the grade 9 students. I bring a 30cm ruler, a 1m stick and a 25 foot tape measure along with me. I break the students into groups of 3, give each group one of the measuring devices from my list above, and ask them to measure a 7 foot table. The results are the same every year: each group measures a different length for the exact same 7 foot long table. The 25 foot tape measure gives the most accurate result, then the 1m stick, and the 30cm ruler gives the worst.
The kids with the 30cm ruler argued that they had the most difficult job because they had to constantly move the ruler across the table and use their fingers/pencils/pens to mark each position as they worked along the table length. So I had each group change measuring devices. Even when the same device was used by 3 different groups, the results still showed the same kind of variability in the measured lengths for each of the three measurement devices. The kids learned from this demonstration that the person(s) measuring were also a source of variability, not just the gage. In case you're interested, this is a very simplified gage R&R (Repeatability & Reproducibility).
We've seen from above that measurement systems rely on a few key factors:
 The suitability of the gage to perform the measurement.
 The suitability of the training of the person(s) taking the measurement.
Both of the above can and do introduce variations within the measurements themselves.
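To make the two sources of variation a little more concrete, here is a minimal Python sketch of the table experiment. Everything in it is an assumption made up for illustration: the error per ruler placement, the number of placements, the per-group bias, and the names `DEVICES`, `GROUP_BIAS`, `measure` and `spread_by_device` are all hypothetical, and none of the numbers come from the actual classroom measurements.

```python
import random
import statistics

# Hypothetical setup: every number below is an assumption for illustration,
# not data from the classroom experiment.
TRUE_LENGTH_CM = 213.36  # a 7 foot table expressed in centimetres

# Assumed number of times each device must be placed to span the table, and
# the assumed random error (standard deviation, in cm) added per placement.
DEVICES = {
    "30cm ruler": {"placements": 8, "error_per_placement": 0.3},
    "1m stick": {"placements": 3, "error_per_placement": 0.3},
    "25ft tape": {"placements": 1, "error_per_placement": 0.3},
}

# Assumed personal bias of each group (cm), e.g. how they read the scale.
GROUP_BIAS = {"group A": -0.2, "group B": 0.0, "group C": 0.25}


def measure(device, group, rng):
    """Simulate one measurement of the table by one group with one device."""
    spec = DEVICES[device]
    gage_error = sum(
        rng.gauss(0.0, spec["error_per_placement"])
        for _ in range(spec["placements"])
    )
    return TRUE_LENGTH_CM + GROUP_BIAS[group] + gage_error


def spread_by_device(trials=200, seed=1):
    """Return the standard deviation of all readings taken with each device."""
    rng = random.Random(seed)
    result = {}
    for device in DEVICES:
        readings = [
            measure(device, group, rng)
            for group in GROUP_BIAS
            for _ in range(trials)
        ]
        result[device] = statistics.stdev(readings)
    return result


if __name__ == "__main__":
    for device, sd in spread_by_device().items():
        print(f"{device}: spread of about {sd:.2f} cm")
```

Under these assumptions the ruler shows the largest spread and the tape the smallest, but even the tape does not reach zero, because the groups themselves still differ from one another.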
Basic Statistics
So I've used the words "variation" and "variability" in my introduction. But what does this mean from a statistical point of view? Variability is everywhere: in manufacturing processes, in materials used, and even in subsequent measurements of these processes, materials and products. Variance represents the entire variability of a process or product. We can quantify it through the calculation of a standard deviation (strictly speaking, the standard deviation is the square root of the variance, so the two carry the same information). That is to say that the standard deviation calculated from a set of measurements is an estimate of the true variability, and as the number of measurements increases, this estimate becomes more and more accurate. The standard deviation is also referred to as a "sigma" or by the following symbol: "σ".
The other terms that I'm going to use are "mean" and "average". Both are the same thing. So the average length of the table from the same gage that the grade 9 students measured was simply calculated by the formula:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

$\bar{X}$ represents the average/mean

$X_i$ represents the individual measurements

$n$ represents the number of measurements taken

I won't throw any more statistical parameters/equations at you than these two. These should be sufficient for the purposes of this article.
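For readers who prefer code to symbols, here is a short Python sketch of the average and the standard deviation. The readings are hypothetical, and the use of the sample form of the standard deviation (dividing by n - 1) is my assumption, since the text above does not say which form is meant.

```python
import math


def mean(values):
    """Average: the sum of the individual measurements divided by n."""
    return sum(values) / len(values)


def sample_std_dev(values):
    """Sample standard deviation (divides by n - 1, an assumed convention)."""
    m = mean(values)
    squared_deviations = sum((x - m) ** 2 for x in values)
    return math.sqrt(squared_deviations / (len(values) - 1))


# Hypothetical table lengths in inches, one reading per group.
readings = [83.5, 84.0, 84.5]

average = mean(readings)
sigma = sample_std_dev(readings)
print(f"average = {average} in, sigma = {sigma} in")
```

Python's standard library offers the same calculations as `statistics.mean` and `statistics.stdev`; they are written out here only to show the arithmetic.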
Now back to the standard deviation (remember this is an estimate of the variance). Most things in nature follow what's called a "Normal Distribution" (you might have heard the term "Gaussian Distribution" also used; both are equivalent). A quick example would be people's heights. If we were to plot Number of Persons on the Y-axis and Height Ranges on the X-axis, we would end up with the following type of curve:
The "0" point in this graph would represent the average height (say 5'10" for the average man). You'll also notice +/- 1 σ marked on the graph; this means +/- 1 standard deviation. The area under the curve between these two points represents the percentage of the population whose heights fall within +/- 1 standard deviation of the average. In a normal (Gaussian) distribution, this percentage is roughly 68.3%. Two standard deviations cover roughly 95%. The term Six Sigma refers to +/- 6 standard deviations and, under the usual Six Sigma convention that allows the process mean to drift by 1.5 σ, corresponds to roughly 3.4 defects out of a million.
For the purposes of this study, I will use +/- 2 "sigmas" or "standard deviations." That is to say that if we measured the same pair of headphones 100 times, about 95 of the measurements would fall within this range.
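The percentages quoted above can be checked with a few lines of Python, assuming an ideal normal distribution. The function names are my own. The 3.4 per million figure is reproduced here using the common Six Sigma convention of a 1.5 σ shift in the mean, which leaves 4.5 σ to the nearest limit; that convention is an assumption on my part and is not spelled out in the text.

```python
import math


def fraction_within(k):
    """Fraction of a normal population within +/- k standard deviations."""
    return math.erf(k / math.sqrt(2.0))


def one_sided_tail_ppm(k):
    """Parts per million lying more than k standard deviations above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2.0)) * 1e6


for k in (1, 2, 3):
    print(f"+/- {k} sigma covers {100 * fraction_within(k):.2f}% of the population")

# Assumed convention: 6 sigma limit minus a 1.5 sigma shift leaves 4.5 sigma.
print(f"beyond 4.5 sigma on one side: {one_sided_tail_ppm(4.5):.1f} per million")
```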
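As a rough illustration of what a 2 sigma range means in practice, the sketch below builds the band from a set of repeated measurements and counts how many of them fall inside it. The measurements are simulated, not real headphone data: the 90 dB level, the 0.5 dB spread and the helper names are all hypothetical.

```python
import random
import statistics


def two_sigma_band(measurements):
    """Return (low, high): the average minus and plus 2 standard deviations."""
    centre = statistics.mean(measurements)
    sigma = statistics.stdev(measurements)
    return centre - 2 * sigma, centre + 2 * sigma


def fraction_inside(measurements, band):
    """Fraction of the measurements that fall inside the band."""
    low, high = band
    inside = sum(1 for m in measurements if low <= m <= high)
    return inside / len(measurements)


# Simulated repeat measurements of one level in dB (hypothetical values).
rng = random.Random(42)
simulated = [rng.gauss(90.0, 0.5) for _ in range(10000)]

band = two_sigma_band(simulated)
share = fraction_inside(simulated, band)
print(f"band: {band[0]:.2f} to {band[1]:.2f} dB, share inside: {share:.1%}")
```

With normally distributed measurements the share inside the band should land close to 95%, which is the "95 out of 100" described above.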