Beyond Neutral: Measurements, Blind Testing, Subjective Experience, and Personal Pleasure in Headphone Listening
It appears this is a subject I'll be embroiled in for quite some time. I'm not the first, nor will I be the last, audio writer thus burdened. Steve Guttenberg dragged me into the controversy recently, but he's not to blame. In fact, I thank him, because it was bound to happen sooner or later. My work laptop has been sent to the mothership for IT Tomfoolery, so I'm stuck outside the corporate firewall for a few days, unable to upload pix and graphs to the servers, so I figured I'd spend a little time waxing poetic on the subjectivist/objectivist debate, and Steve's recent comments elsewhere. To offer my views, I'll be a guest on The 404 Friday, June 1 with Steve Guttenberg, and on Scott Wilkinson's "The Home Theater Geeks" Monday, June 18.
It seems to me there's no better way to figure out exactly what one's point of view is than to write it down for all to see. I'm nowhere near as qualified to speak on the subject as John Atkinson or Peter Aczel, but my position at InnerFidelity fairly requires that I establish a position. And since I both measure gear and provide subjective impressions in an effort to help you find just the right headphone, I reckon I do need to formulate and publish a position on the meaningfulness of headphone measurements, blind testing, subjective evaluation, and their relationship to your listening pleasure. Here goes...
The word "neutral" means the sound reproduced by an audio system is identical to the sound signal the system is being fed for reproduction. Measurements indicate the degree to which a system's reproduction meets or deviates from this standard of neutral.
I believe strongly that there is a solid relationship between headphone measurements and what we hear. There's simply no doubt in my mind that when one headphone measures as having more bass energy than mid-range energy, you'll hear it as a bass-emphatic headphone. If you align two headphones' frequency response graphs so that both read the same level at, let's say, 500Hz, and headphone A shows more bass than headphone B, I strongly believe you'll hear headphone A as having more bass. Occasionally you might find someone who disagrees, but in my opinion, they'd just be wrong. Objective measurements are measuring something real, and when you put the headphones on your head, your ears will be presented with that reality.
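As a rough illustration of what "zeroing two graphs together at 500Hz" means numerically, here is a minimal Python sketch. The frequency points and levels are invented for the example, the function names are hypothetical, and the plain average over sample points is a crude stand-in for eyeballing the bass region; it is not how the InnerFidelity graphs are actually produced.

```python
import math

def level_at(freqs_hz, levels_db, target_hz):
    """Interpolate the level (dB) at target_hz on a log-frequency axis."""
    for i in range(len(freqs_hz) - 1):
        f0, f1 = freqs_hz[i], freqs_hz[i + 1]
        if f0 <= target_hz <= f1:
            t = math.log10(target_hz / f0) / math.log10(f1 / f0)
            return levels_db[i] + t * (levels_db[i + 1] - levels_db[i])
    raise ValueError("target frequency outside the measured range")

def zero_at(freqs_hz, levels_db, ref_hz=500.0):
    """Shift a curve so that it reads 0 dB at ref_hz."""
    offset = level_at(freqs_hz, levels_db, ref_hz)
    return [level - offset for level in levels_db]

def mean_level(freqs_hz, levels_db, lo_hz, hi_hz):
    """Plain average of the sample points between lo_hz and hi_hz."""
    band = [l for f, l in zip(freqs_hz, levels_db) if lo_hz <= f <= hi_hz]
    if not band:
        raise ValueError("no measurement points in the requested band")
    return sum(band) / len(band)

# Invented example data (Hz, dB); not real measurements.
freqs = [20, 50, 100, 200, 500, 1000, 2000]
headphone_a = [96.0, 97.0, 96.0, 94.0, 92.0, 91.0, 92.0]
headphone_b = [84.0, 86.0, 87.0, 88.0, 88.0, 87.5, 88.5]

a_zeroed = zero_at(freqs, headphone_a)
b_zeroed = zero_at(freqs, headphone_b)

# Positive means headphone A shows more bass than B once both are
# aligned at 500 Hz, regardless of their absolute measured levels.
bass_difference_db = (mean_level(freqs, a_zeroed, 20, 200)
                      - mean_level(freqs, b_zeroed, 20, 200))
print(f"A minus B, 20-200 Hz: {bass_difference_db:+.2f} dB")
```

The alignment step removes the overall level difference between the two measurements, so only the shape of each curve relative to its own 500Hz point is compared.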
Our ears are all different from each other, of course, and different from the ears on my measurement dummy head, as well. So there will be differences in exactly what is heard from person to person, or between a particular individual and the measured values, but in terms of measuring a variety of headphones on InnerFidelity's measurement system, I have a high degree of confidence that a fairly accurate picture of the relative differences between headphones is being revealed. "Fairly accurate" needs to be discussed, however.
Objectively, headphone measurements are only moderately accurate. In fact, it's very difficult to identify what "accuracy" even means when discussing headphone measurements. With speakers, we can capture the emitted sound at some distance (usually a few meters) from the speaker using a calibrated microphone and special signals that remove the room from the equation, and we are thereby able to record and characterize with fairly good confidence the performance of the speaker relative to flat ("flat" means essentially the same thing as "neutral"). With electronics, there are no room effects in the way, and simply by careful interconnection and grounding we are able to measure performance with a high degree of accuracy. Headphones, unfortunately, are a dramatically different beast.
A headphone on a head forms an acoustic coupler. The headphone, head, ear, ear canal, and eardrum are a single acoustic system and must be characterized as a whole. Because the wavelengths of audible sound range from much longer than the dimensions of the acoustic space inside a headphone coupler to roughly the same size, we cannot view the headphone driver as simply propagating sound towards the ear. Rather, we must think of the driver being coupled to the eardrum through the elastic and potentially acoustically resonant air trapped within the space. You cannot separate or remove any one part from another without changing the nature and performance of the coupler. In other words, there is no place in the coupler where one could say, "Here is where you can measure how flat this system is." We can only deliver electrical signals to the driver and look at the response at the eardrum, and understand that these measurements are a convoluted result of sound going through a complex coupler. I'm afraid that with headphones, finding a very accurate measure of neutral, flat, or transparent will forever elude us. But the data is reliable enough that rational inferences can be made, and by observing the measured performance of very good sounding headphones, a reasonable approximation of "neutral" and "accurate" can become moderately apparent to the educated eye.
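To put rough numbers on the wavelength argument, the short Python sketch below computes wavelength as speed of sound divided by frequency and compares it to a cavity size. The 343 m/s figure is the usual approximation for air at about 20 degrees C, and the 5 cm cavity dimension is purely an assumed, illustrative value, not a measurement of any particular headphone.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at about 20 degrees C
CAVITY_SIZE_M = 0.05        # assumed, illustrative ear-cup dimension

def wavelength_m(frequency_hz):
    """Wavelength in meters for a given frequency in Hz."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_SOUND_M_S / frequency_hz

for f in (20, 100, 1000, 3000, 10000, 20000):
    lam = wavelength_m(f)
    ratio = lam / CAVITY_SIZE_M
    print(f"{f:>6} Hz: wavelength {lam:8.4f} m, "
          f"about {ratio:6.1f}x the assumed cavity size")
```

At the bottom of the audio band the wavelength is hundreds of times larger than the assumed cavity, while near the top it is smaller than the cavity, which is the range of regimes the paragraph above describes.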
Repeatability is another story, however. Macedonian Hero and I went to some lengths to characterize the objective repeatability of my headphone measurement system in this article. In it, we demonstrate that the system's repeatability is within 1dB from 10Hz to about 3kHz, good to a couple of dB between 3kHz and 7.5kHz, and fairly poor above 7.5kHz, varying on average by about 8dB. I'll hasten to add this is true only for the Sennheiser HD 800 used in the experiment, and on-ear headphones may have poorer repeatability in the lower ranges. Regardless, I'm convinced that the system's repeatability is quite good enough to make meaningful relative comparisons between headphones of similar type.
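One simple way to summarize repeatability by band is sketched below in Python. It assumes several measurement runs of the same headphone (reseated between runs) and reports the average max-minus-min spread in each band. The data are invented, the function names are hypothetical, and the spread metric is my own assumption; the repeatability article may well have used a different statistic.

```python
def spread_per_frequency(runs_db):
    """Max-minus-min level across runs at each frequency point."""
    return [max(column) - min(column) for column in zip(*runs_db)]

def band_average_spread(freqs_hz, runs_db, lo_hz, hi_hz):
    """Average spread (dB) over points with lo_hz <= f < hi_hz."""
    spreads = spread_per_frequency(runs_db)
    in_band = [s for f, s in zip(freqs_hz, spreads) if lo_hz <= f < hi_hz]
    if not in_band:
        raise ValueError("no measurement points in the requested band")
    return sum(in_band) / len(in_band)

# Invented example: three reseated runs of one headphone (Hz, dB).
freqs = [100, 1000, 5000, 10000]
runs = [
    [90.0, 90.0, 92.0, 85.0],
    [90.5, 89.5, 93.0, 80.0],
    [90.0, 90.0, 91.0, 88.0],
]

# Band edges taken from the ranges quoted in the text above.
BANDS = [
    ("10 Hz to 3 kHz", 10, 3000),
    ("3 kHz to 7.5 kHz", 3000, 7500),
    ("above 7.5 kHz", 7500, float("inf")),
]

report = {name: band_average_spread(freqs, runs, lo, hi)
          for name, lo, hi in BANDS}
for name, spread in report.items():
    print(f"{name}: average spread {spread:.1f} dB")
```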
What do I mean by "meaningful"? I have found that measured data from headphones have a fundamental but limited relationship with the subjective listening experience. Here are some areas where I think measurements may inform us about the way a particular headphone sounds:
- Overall Tonal Balance - Because the frequency response measurements are fairly reliable from 10Hz to 3kHz, I find it fairly easy to look at the measured frequency response and determine the basic tonal quality of the cans. You can see if they are emphatic or lean in the bass; if they have an emphasized or a sucked-out, hollow mid-range; and, up to about 3kHz, if they have a smooth or uneven response.
- Bass Tightness - A preliminary read on how tight and punchy the bass and low-mids are can be seen in the linearity of the 30Hz square wave top, and in the degree and starting point of a rise in the low frequencies of the THD+noise plot.
- General Upper Treble Balance - While the frequency response data above 10kHz is all over the place, by visually averaging the amount of energy in the top octave and comparing it to the response below 3kHz, we can see the general balance between the upper treble and the mids and bass.
- Imaging - A preliminary and fairly coarse read on the ability of a headphone to produce an interpretable audio image can be gleaned by looking at the clarity of the leading edge of the 300Hz square wave. A single leading edge with some mild overshoot and quick settling to the waveform top indicates clear acoustic edges with which the brain can accurately determine the arrival time of signals and thereby form a convincing sense of depth and space.
- Harshness - Again looking at the 300Hz square wave, if the overshoot is extreme, or if the signal rings strongly after the leading edge, you can usually expect that the headphone will sound "biting" or "harsh."
- Blackness - Though much better observed using Cumulative Spectral Decay plots (CSD, waterfall), residual ringing and noise created by the cans after a sound has been played can be seen in the impulse response plot by observing how long it takes for the headphones to go silent after the impulse. This is a fairly good measure of how "black between the notes" a headphone will sound, or sometimes how "confused" it sounds.
- Isolation - Though not directly related to sound quality, the measurement system does an excellent job of measuring the amount of isolation a headphone can provide.
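Several of the readings in the list above are done by eye, but rough numerical stand-ins are easy to sketch. The Python below shows three of them: top-octave level versus the response below 3kHz, overshoot on a square wave's leading edge, and how long an impulse response takes to fall below a threshold. All data are invented, the function names are hypothetical, and the choices of "last quarter of the half-cycle" for the settled level and a -20dB decay threshold are assumptions for illustration, not InnerFidelity's method.

```python
def mean_level(freqs_hz, levels_db, lo_hz, hi_hz):
    """Plain average of the sample points between lo_hz and hi_hz."""
    band = [l for f, l in zip(freqs_hz, levels_db) if lo_hz <= f <= hi_hz]
    if not band:
        raise ValueError("no measurement points in the requested band")
    return sum(band) / len(band)

def treble_balance_db(freqs_hz, levels_db):
    """Top-octave average minus the average below 3 kHz (dB)."""
    return (mean_level(freqs_hz, levels_db, 10000, 20000)
            - mean_level(freqs_hz, levels_db, 10, 3000))

def overshoot_percent(half_cycle):
    """Peak overshoot relative to the settled top of a square wave.

    The settled level is assumed to be the mean of the last quarter
    of the positive half-cycle samples.
    """
    tail = half_cycle[-max(1, len(half_cycle) // 4):]
    settled = sum(tail) / len(tail)
    return (max(half_cycle) - settled) / settled * 100.0

def decay_time_s(impulse, sample_rate_hz, threshold_ratio=0.1):
    """Time from the peak until the response stays below the threshold.

    threshold_ratio=0.1 corresponds to -20 dB relative to the peak.
    """
    magnitudes = [abs(s) for s in impulse]
    peak = max(magnitudes)
    peak_index = magnitudes.index(peak)
    last_loud = max(i for i, m in enumerate(magnitudes)
                    if m >= threshold_ratio * peak)
    return (last_loud - peak_index) / sample_rate_hz

# Invented example data; not real measurements.
freqs = [100, 1000, 3000, 12000, 16000]
levels = [90.0, 90.0, 93.0, 84.0, 82.0]
square_half_cycle = [0.0, 0.6, 1.2, 0.9, 1.05, 1.0, 1.0, 1.0]
impulse = [0.0, 1.0, -0.5, 0.25, -0.12, 0.05, 0.0, 0.0]

print(f"treble balance: {treble_balance_db(freqs, levels):+.1f} dB")
print(f"overshoot: {overshoot_percent(square_half_cycle):.1f} %")
print(f"decay time: {decay_time_s(impulse, 48000) * 1e6:.1f} microseconds")
```

Numbers like these are only useful for comparing headphones measured the same way; none of them replaces looking at the plots themselves.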
The meaningfulness of measurements described above allows us to do three things:
- Coarsely Gradate Headphone Performance - Many, if not most, headphones available for purchase are absolute junk. They sound bad, and measurements will show them deviating strongly from neutral. Careful observation and interpretation will allow you to coarsely sort headphones into three piles: miserable fail; reasonably competent; and potentially very good.
- General Character - Using the observations listed above (overall tonal balance, bass tightness, imaging, harshness, etc.) one can get a preliminary read on the general sound qualities one might expect while listening. Basic information about whether a headphone is warm, bright, or balanced can be seen in the data.
- Selecting Personally Suitable Product for Audition - It is my belief the most important thing the InnerFidelity database of headphone measurements offers is the ability to look through a wide variety of headphones for those that might suit your personal listening tastes. I like slightly warm but very linear headphones that enunciate well, but am strongly averse to harsh or treble-emphasized headphones. Your listening tastes may differ. By becoming familiar with the basic nature and interpretation of the headphone measurements, you and I could sort through the list of cans and pick those that might please our personal tastes. If we are good at interpreting the data and know our preferences, then our lists of headphones would differ depending on our tastes.
I'll readily admit forum dialog and the general consensus developed therein is a very good evaluation tool, as well--better in some ways than measurements in that it includes subjective experience. Unfortunately, that consensus is developed in conversation that can be long, convoluted, spread out over numerous threads, and difficult to find. Even a forum regular like myself will often find it difficult to integrate the forum dialog into a clear picture of a particular headphone's performance. For the general buying public, forum dialog can be a mire of random opinion and noise. A headphone measurement database, by contrast, is dense with meaning for those who can interpret it; is located in one place; and provides an excellent ability to compare headphones directly against each other. Of course, for the general public, headphone measurements can be just as impenetrable as forum yabber. Job security for me, I guess.
My strong conviction that headphone measurements are meaningful and valuable makes me an objectivist of sorts. You might rationally expect that I'm about to espouse the virtues of audio blind testing for evaluative purposes...not really gonna happen.