Harman Researchers Make Important Headway in Understanding Headphone Response

What should a headphone sound like?
You wouldn't believe how complicated it is to answer that question. The problem, in a nutshell, is this: put a calibrated measurement microphone in front of an ideal speaker and measure the frequency response and you get a flat line. Stick your head where that calibrated mic was and look at the signal at your ear drum and what you get is far from a flat line. Boundary gain from your torso adds a bit of mid-range energy; the shape of your pinnae acts to amplify the signal between roughly 2kHz and 5kHz; and ear canal resonances make mincemeat of the highs.

And it gets worse....

As you move your head from side to side, or up and down, relative to the speaker, the response at the ear drum changes significantly. It's a good thing, because all these changes are what allow you to tell where a sound is coming from. But it makes measuring a pair of headphones and being able to tell if they're neutral a veritable nightmare. How the hell do you do that!?

This family of response curves, describing what our ear drum hears as sound comes from various angles, is as a class called the Head Related Transfer Function (HRTF). Audio engineers have long established two curves as important for compensating for the HRTF: Free Field and Diffuse Field. The Free Field curve models what is heard at the ear drum with sound coming from directly in front of you and without any reflected energy, as if you were listening to a speaker in an anechoic chamber. The Diffuse Field curve models what you'd hear at the ear drum if you were in a very live room (all concrete walls) with speakers placed in numerous places around the room pumping it full of energy; in other words, what's heard at the ear drum with flat sound coming at you from all directions.

The problem, as I see it, is that neither of these compensations models what would be heard at the ear drum listening to two good speakers in an acoustically treated listening room, which, I would assume, is what headphones are supposed to be mimicking. But as much sense as that might make to a simple journalist like me, it doesn't move the state of the art forward. Someone with serious science chops, lots of money to throw at the problem, and all the gear to do the studies right has to step up to the plate and do the hard work needed to establish a really good target curve for headphone reproduction. One of those someones is Dr. Sean E. Olive, Director of Acoustic Research at Harman International.

Approaching the Problem
Having been aware for some time of Dr. Olive's work at Harman in developing target response curves for headphones, I contacted him about the possibility of a visit to his lab while in L.A. for T.H.E. Show Newport. I was delighted when he agreed, and simply tickled pink when he also invited me to join him and Dr. Floyd E. Toole for dinner during the show.

Dr. Toole led a team of researchers, which included Sean Olive, in a very important body of work dubbed the "Athena Project", which started in 1986 and was a partnered effort between Canada's National Research Council (NRC) and a non-profit consortium of five Canadian audio companies. The project's task was to develop a set of target specifications for speaker performance known to be pleasing to listeners. The basic premise of the research was that speakers that sound good will sell well, and Toole's job was to find out what "sound good" meant. He didn't simply settle on "flat is best"; he set out to determine what people actually preferred by developing a comprehensive regime of subjective testing to very carefully relate listener preferences to the technical performance of speakers.

Spoiler alert: Flat, neutral response was preferred.

But it's not nearly as simple as that. Off-axis response, evenness of response, room reflections, bass reinforcement, bass extension, and a wide variety of other factors strongly come into play. It would be unfair of me to over-simplify the work or insights that arose from a very serious effort to critically evaluate the subjective impressions that led them to their conclusions, so I'll point you to Dr. Toole's marvelous book, "Sound Reproduction: Loudspeakers and Rooms." In it, Dr. Toole takes the reader on a grand tour of the world of audio reproduction and his life's work investigating the subject. His written voice is clear, humorous, and eye-opening, making accessible to the avid enthusiast the nuts and bolts of what makes for a great listening experience. I've skimmed the book and have now started reading it in earnest, and I find it a great treat for my inner audio geek. Highly recommended.

The take-away point for this article is this: When research using very carefully controlled double-blind testing is done to evaluate audio performance, results may point toward the simple conclusion that neutral is preferred, but the data will also be rich with nuance—nuance that may provide the information needed to build not just good audio transducers, but great ones.

Current Work
The current work being led by Dr. Olive at Harman, with the help of researchers Todd Welti and Elisabeth McMullin, to determine a target headphone response is similar in its approach to the previous work done by Dr. Toole at the NRC. And though still in its early stages, some interesting results have begun to emerge, which are documented in three AES papers that Dr. Olive provided me for this article. (I am an AES member, so it's all kosher.) I'll briefly summarize these papers below.

The Relationship between Perception and Measurement of Headphone Sound Quality - Presented at the 133rd AES Convention Oct 2012, San Francisco
For this paper, double-blind tests were performed on six popular circumaural headphones (AKG K701 and K550; Audeze LCD-2 v2; Beats by Dre Studio; Bose QuietComfort 15; and V-Moda Crossfade LP). Test subjects were unable to see the headphones being placed on their head by the test administrator, and small handles were installed on the headphones so the subject could adjust the headphones on their head without any tactile cues as to which cans they were wearing. This was a two-part test in which subjects were asked to provide responses on various sound quality attributes (spectral, spatial, dynamics, and distortions) in the first part, and on perceived spectral balance and comfort in the second.

The paper explains in excruciating detail the efforts to develop an experiment that was free of bias, which include: level matching techniques; the selection of music used; the use of short clips; the development of listener feedback metrics and comments; and selection of qualified listeners. It goes on to describe the rigorous application of statistical analysis to the resulting raw data, which not only held the grading information on headphone performance, but also made it possible to determine the reliability of the various subjects' impressions.

Sitting in my catbird seat, with long experience of all the headphones included in the test, I have to tell you that it's quite amusing to see all this science tease out observations that resonate strongly with my experience. If I had a billion bucks I'd love to perform such tests for all InnerFidelity headphone reviews...but I don't, so I guess we'll just have to be satisfied with trusting reviewers for the task.

The paper goes on to compare the subjective listening test results with the measured performance of the headphones using standardized test equipment. But it also takes the measurements one step further: some subjects had small microphones inserted into their ear canals so the headphones could be measured on real heads.

Six conclusions were drawn from the experiment:

  1. Subjects perceived significant differences between the headphones in the areas of comfort, preferred attributes of sound quality, and spectral balance.
  2. The most preferred headphones had the least deviation from flat and neutral in spectral balance rating.
  3. Sound quality attribute preferences and overall spectral balance ratings occurred in separate parts of this two-part test, but there was a strong correlation between perceptions of poor spectral balance and comments associated with low preference rating.
  4. The more preferred headphones had measurements showing flatter, smoother amplitude response, and better extended bass. The measured amplitude response was generally a good predictor of perceived spectral balance and preference rating.
  5. The most preferred headphone did not have the 12dB peak at 3kHz which exists in the diffuse field standard curve. Two headphones which did have this peak were judged to be too thin and bright.
  6. In-ear measurements showed significant variations in amplitude response depending upon the listener and model of headphone used. Some headphones varied more than others. How these headphone/listener variations affect the accuracy, reliability, and validity of subjective testing will be explored in future work.

While it was fun to watch the preferences unfold and match my experience, the most interesting thing about the paper for me was seeing the gray areas appear. For example, the AKG K550 is well known in the headphone enthusiast community to have fit problems, particularly in the ear-pad seal below the ear, that affect the low frequency performance. It seemed to me this caused the K550 to have a wider set of subjective responses than some of the other models. It strikes me that researchers aren't likely to have these tidbits of information as they conduct tests, and will have to ferret the problems out for themselves when observing variations in the data. In other words, measured performance into a coupler is not necessarily how the ear will hear that headphone, and knowing that those differences might exist and figuring out where those differences come from will need to include study of the headphone/head physical interface. Not easy.

Listener Preference for Different Headphone Target Response Curves - Presented at the 134th AES Convention May 2013, Rome, Italy
You can't build a bridge over a river from only one side. Sure, science can be done by simply amassing data and then observing which way the data points, but sometimes it's more efficient to take a stab at what you think is right (a hypothesis) and then test it. Like I said before, it just seems obvious to me that headphones should sound like good speakers in a good room, but until you test that hypothesis you can't know if that's true or not.

In this paper, the Harman team set out to test that idea. A new headphone compensation curve was developed using calibrated speakers in a room, which were then "listened to" by a dummy head with a calibrated "ear". A GRAS 43AG coupler was mounted in a head at the listening position, and the head was rotated to three different positions (-30 deg, 0 deg, +30 deg), with a measurement taken at each, to spatially average the response. This gave the researchers a starting target curve for what the ear hears in a room with speakers.
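Here's a rough sketch of what spatially averaging three head positions could look like. It assumes each position yields a magnitude response in dB on a shared frequency grid, and it averages in the power domain; the paper's actual averaging method isn't described here, and the function name and all the numbers are made up for illustration.

```python
import math

def power_average_db(curves_db):
    """Average several magnitude responses (in dB), point by point, in the power domain."""
    n_points = len(curves_db[0])
    averaged = []
    for i in range(n_points):
        # Convert each dB value to linear power, take the mean, convert back to dB.
        mean_power = sum(10 ** (curve[i] / 10) for curve in curves_db) / len(curves_db)
        averaged.append(10 * math.log10(mean_power))
    return averaged

# Hypothetical ear-drum responses (dB) at 100 Hz, 1 kHz, 3 kHz and 10 kHz,
# one list per head position. These are NOT measured data.
freqs_hz = [100, 1000, 3000, 10000]
position_curves_db = [
    [0.0, 1.0, 9.0, -4.0],
    [0.0, 2.0, 12.0, -2.0],
    [0.0, 1.5, 14.0, 1.0],
]

target_db = power_average_db(position_curves_db)
for f, level in zip(freqs_hz, target_db):
    print(f"{f:>6} Hz: {level:+.1f} dB")
```

Averaging power rather than dB values keeps one deep notch at a single head angle from dragging the whole average down, which is one reason it is a common choice; treat it as an assumption, not as what Harman did.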

Then two different headphones (Audeze LCD-2 v2 and Sennheiser HD 518) were measured on a coupler, and inverse filters were calculated to make the headphones measure perfectly flat on the coupler. Having a filter that could flatten the headphones let the researchers add various target EQs to the headphones to determine which was most preferred in subjective listening tests.
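The flatten-then-add-a-target idea can be sketched in a few lines if you only think about magnitude, in dB, at a handful of frequencies. This is a simplification under that assumption (real inverse filters also deal with phase and are implemented in DSP), and the function names and numbers below are hypothetical.

```python
def correction_db(measured_db, target_db):
    """Per-band EQ gain (dB) that first flattens the headphone, then adds the target curve."""
    # The inverse filter is just the negated measurement: measured + flatten == 0 dB.
    flatten_db = [-m for m in measured_db]
    return [f + t for f, t in zip(flatten_db, target_db)]

def apply_eq(measured_db, eq_db):
    """Response heard after the EQ is applied (gains in dB simply add)."""
    return [m + e for m, e in zip(measured_db, eq_db)]

# Hypothetical coupler measurement and target curve, five bands from bass to treble.
measured = [3.0, 0.0, -2.0, 6.0, -5.0]
target = [5.0, 2.0, 0.0, -2.0, -5.0]

eq = correction_db(measured, target)
result = apply_eq(measured, eq)
print("EQ gains (dB): ", eq)
print("Resulting (dB):", result)
```

Because the flattening step is shared, swapping in a different target curve only changes the second term, which is what makes comparing many targets on the same pair of headphones practical.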

The target EQs tested were:

  • Diffuse-field based on Hammershoi & Moller.
  • Diffuse-field based on Moller
  • Modified Diffuse-field based on Lorho
  • Free-field based on Hammershoi & Moller
  • In-room speaker response based on measurements in Harman Reference Listening Room.
  • Modified in-room response with less bass and treble.
  • Unmodified sound of the headphones.

To make a long story short, the results of the test showed a strong preference for the speaker-in-room curves, with the modified curve being most preferred. (This modified speaker-in-room curve is essentially a tilted straight line with the bass end about 10dB above the treble end.) The diffuse-field curves came in next, with listeners describing them as too bright or thin, and lacking bass. The free-field curve was least preferred, being described as thin in the bass, harsh in the treble, with too much mid-range emphasis. The unequalized Audeze LCD-2 was described as dull and lacking presence or energy around 1-2kHz; the unequalized HD 518 was described as dull, boomy, and colored.

I'll quote the paper here:

"The underlying premise or hypothesis was quite simple: since stereo recordings are optimized to sound good through loudspeakers in a room, they will only sound good through headphones that simulate the response of a loudspeaker system in a room. This study provides empirical evidence that this premise is well grounded"

Damn! Dontcha just love it when common sense prevails? I'm going to be watching the continuing work on this subject with great interest and hope one day in the future I'll have a new curve to compensate my headphone measurements. Thank you Olive et al!

A Virtual Headphone Listening Test Methodology - not yet published, will be presented at AES 51st International Conference, August 2013, Helsinki, Finland
In this last paper I'll report on here, the Harman team takes the Sennheiser HD 518, measures it, creates an inverse filter, and then applies it to the headphone to make it flat. They then take measurements of the six headphones in the first paper, and create filters that represent their response. These new filters are then applied to the flattened HD 518 so as to mimic the sound of the other headphones. Basically, the researchers are making the HD 518 sound like the other headphones in the test. The paper includes a series of graphs comparing the amplitude response of the real and virtual headphones...and the similarity is remarkable in all but the top octave.

Above 10kHz, there are significant difficulties getting reliable and stable amplitude response measurements, as even very small positional changes on the measurement coupler cause wild swings in measured response. The researchers chose to not apply any filters above 10kHz.

Now, with six virtual headphones available on one headphone, the researchers can repeat the double-blind testing of sound quality without the difficulty of having a facilitator placing the various headphones on the subject's head, and without the additional factors of fit, weight, and comfort differences between cans skewing evaluations of sound quality. The paper then goes on to describe the test design, methods, and data reduction in great detail.

A number of results were observed:

  • Overall, the preference ratings of virtual headphones were lower than with the real headphones, but the distribution of preference was wider with the virtual cans than the real ones. It's a little complicated, so you'd have to read the paper to understand the justifications, but it is thought that subjects having the ability to select from all six headphones at the touch of a button allowed listeners to give a wider and more stable scaling of their headphone preferences.
  • Broadly, there was good agreement between the headphone preference ratings in the standard and virtual test (correlation coefficient r=0.85).
  • There was also very good agreement on the perceived spectral balance of four of the six headphones. Listeners were essentially asked to draw a frequency response curve while listening to both the real and virtual headphones. These perceived curves were compared and correlation coefficients were very good at: r=0.98; 0.91; 0.83; and 0.80.
  • Two of the headphones had poor correlation coefficients (r=0.05 and 0.69). The AKG K550 is known to have fit problems, and the heavy weight and higher clamping force of the Audeze LCD-2 v2 may have made it identifiable in the standard test. These physical and comfort issues are thought to have skewed the listening test results.
  • Statistical methods allowed the researchers to determine that the virtual test was significantly less prone to variations from influences other than sound quality, making it a more reliable indicator of sonic preference by listeners.
  • The virtual test method required far fewer trials and delivered more discriminating results, making it a far more efficient method for testing headphone sound quality than double-blind testing with real headphones.
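
For readers who haven't bumped into the correlation coefficient r quoted in the results above, here is a minimal sketch of how a Pearson r between two sets of ratings is computed. The ratings below are invented purely for illustration; they are not data from the paper, and the function name is my own.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical mean preference ratings for six headphones (NOT the paper's data).
real_ratings = [7.1, 6.4, 5.0, 4.2, 3.9, 2.5]
virtual_ratings = [6.5, 6.0, 3.8, 4.4, 3.0, 1.9]

r = pearson_r(real_ratings, virtual_ratings)
print(f"r = {r:.2f}")
```

An r near 1 means the two tests rank and space the headphones almost identically; an r near 0 means knowing one rating tells you next to nothing about the other.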

The paper concludes that while the virtual method of testing headphones does not address some important factors in headphone satisfaction (fit, comfort, bass leakage), it does provide a superior method for subjective testing of audio performance. It's not only more discriminating than testing the actual headphones, but it provides a method that is much more efficient (cheaper, faster) than the standard testing methods.

I'll quote the authors' concluding remarks:

"The authors hope this paper and the virtual method will encourage more headphone manufacturers and researchers to do proper headphone listening tests. Such tests can provide new information and guidance needed to improve the measurement, design, and sound quality of future headphones."

A final note on this paper: Some of the more technical InnerFidelity readers will be aware that virtualizing headphone performance using DSP techniques does not easily take into account non-linear distortions of the headphones themselves. The authors acknowledge this issue, and believe it may be significant when testing headphones at higher listening levels, but also believe the effects of non-linear distortion are quite small and likely have little effect on test results. Fodder for future research no doubt.

My Closing Thoughts
I believe the work being done by Dr. Olive and his team, brought out in these three papers, represents a significant step forward in the development of better headphones for us all. To summarize:

  • The first paper shows that there is a direct relationship between measured performance of headphones and the listening experience. Listeners strongly tend to prefer neutral, even amplitude response, and good bass extension.
  • The second paper shows that the long-standing diffuse-field target curve is not preferred by listeners. People prefer headphones that mimic the sound of good speakers in a good room.
  • The final paper describes a method for virtual double-blind headphone testing that is efficient and discriminating, allowing manufacturers and researchers a less onerous method for evaluating headphone performance.

A hearty round of virtual applause from all the InnerFidelity headphone geeks to Sean Olive, Todd Welti, Elisabeth McMullin, and Harman International. Great work!

I'll add that if you're curious about this type of testing, you should come to T.H.E. Show Newport next year. I haven't announced it officially yet, but InnerFidelity and Head-Fi members will be producing a display area called, "Candemonium: A Headphone Sideshow", which will appear in the "Headphonium" area of next year's show. You can think of it as a cross between a carnival sideshow and a science exhibit having a bunch of hands-on experiences with headphones in various ways. (Candemonium is being developed with the intention of drawing audiophiles not yet fully aware of the headphone hobby, and new blood from the youthful public who have an interest in headphones, further into the activity.) Dr. Olive has agreed to sponsor some listening stations in the Candemonium area that will allow show-goers the opportunity to participate in informal listening tests very similar to those described in the papers above. (We'll put some Beats in the mix and see how they do.)

Okie dokie, 'nuf of the technical stuff. Turn the page and I'll show you some pictures of my visit to Harman's research labs....

Sean Olive's blog is very informative.
Link to the two papers available in the AES library here and here. (Papers are available to non-members for $20 and for members $5.)
Head-Fi thread discussing the first paper; Sean Olive (as Tonmeister2008) and Todd Welti (as MeatusMaximus) contribute to the lively and informative dialog.
Head-Fi thread started by Todd Welti discussing whether or not pinnae should be used on couplers when measuring headphones.

Harman International Industries, Incorporated.
8500 Balboa Blvd.
Northridge, CA 91329

Seth195208 wrote:

..but the "virtual headphone" thing is almost exactly the same technique that the Accudio program uses to simulate other headphones. Nothing new there. 

mward wrote:

Yes, it's been done before. But what the paper demonstrates (at least, per my understanding of Tyll's writeup) is that test results for virtual headphones are likely to match test results for real headphones. Therefore, if you're designing a set of headphones and want to try 10 different voicings, you don't have to make 10 prototypes; you can make one prototype (or even use an off-the-shelf headphone), use processing to reproduce the voicings, and get results that match what you would have gotten with physical instead of virtual headphones. 

So in the end, this particular paper represents a methodology that makes testing easier, not anything that's informative about how we listen. 

Tyll Hertsens wrote:

Yes, this has been done before, but as far as I'm aware, Olive et al. are the first to show that this method is valid.

The march of science is slow, and looking at these papers it sometimes seems that each one takes only a very small step. But the importance is that they are sure steps that can be built upon with a good degree of confidence.

GonzaloM wrote:

Yes Tyll, the rhythm of science is slow; physical theories have to be demonstrated, and this has always been the slowest part. Now the steps are bigger and quicker than ever.

This is very hopeful for the manufacturers, and I think it will help them a lot, if I understood it properly.

This reminds me that when I was in college one of my teachers told us that engineers are the Oompa-Loompas (Charlie and the Chocolate Factory) of the physicists, trying to bring their theories from paper to the real world.



Jazz Casual wrote:

Seems like another example of science confirming what we instinctively know. Interesting read, thanks Tyll.

mward wrote:

Fantastically interesting work by the researchers, and great reporting by Tyll. Summarizing scientific papers, particularly for a hobbyist audience, is tough work!

Tyll, a question—the virtual headphone methodology is focused on frequency response, right? So it doesn't take into account the differences in time domain performance between the headphones?

If I understand correctly, this is potentially interesting. One could hypothesize that this means that frequency response is much more important than time domain performance to listener preference. Or, perhaps the headphones here are all sufficiently similar in time domain performance that no conclusions about the importance of time domain performance can be drawn, since they're all single-driver models that don't have big time domain issues such as group delay due to crossovers.

It's a question I find very interesting, especially given that JH seems to have wowed you guys with their phase coherence tech.

ultrabike wrote:

Probably not something to worry too much about. For a particular Frequency domain response there is a unique Time domain response. Furthermore, an ideally flat and smooth Frequency response corresponds to a smooth and "fast" Time domain response. They sort of go hand in hand.

Music perception might be a bit more involved than that though. But that's probably a different subject.

As far as cross-overs, their issues show both in Time and Frequency domain, but from what I have read, the (Time and Frequency) issues might be a function of position as well.


Really interesting article Tyll!!!! and very happy to see serious research done on this subject. 

BTW, did they share how the AKG K1000 on "Sidney" performed? ><

Tonmeister wrote:

You're correct. For minimum-phase systems, an improvement in the frequency response will lead to an improvement in the time domain since the two domains are mathematically related via the Fourier/Hilbert transforms. A high-Q resonance in the frequency response will exhibit itself as a ringing in the time domain. Equalize the resonance with an inverse filter so that the frequency response is now flat, and the ringing will disappear in the time domain.
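
A small numerical sketch of that last point, under stated assumptions: the "resonance" here is modeled as a single peaking biquad using the widely published Audio EQ Cookbook formulas (+12 dB at 3 kHz, Q of 8, 48 kHz sample rate, all chosen arbitrarily), and the inverse filter is the same biquad with the gain negated. It is a toy minimum-phase system, not a model of any real headphone, and the function names are illustrative.

```python
import math

def peaking_biquad(f0_hz, gain_db, q, fs_hz):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * amp, -2 * math.cos(w0), 1 - alpha * amp]
    a = [1 + alpha / amp, -2 * math.cos(w0), 1 - alpha / amp]
    return [c / a[0] for c in b], [c / a[0] for c in a]

def run_biquad(b, a, x):
    """Filter the sequence x with a direct-form I biquad."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

FS_HZ = 48000  # assumed sample rate
impulse = [1.0] + [0.0] * 255

resonance = peaking_biquad(3000, +12.0, 8.0, FS_HZ)  # the high-Q resonance
inverse = peaking_biquad(3000, -12.0, 8.0, FS_HZ)    # its inverse filter

ringing = run_biquad(*resonance, impulse)    # impulse response with the resonance
corrected = run_biquad(*inverse, ringing)    # same signal after the inverse filter

print("largest tail sample with resonance:", max(abs(v) for v in ringing[1:]))
print("largest tail sample after EQ:      ", max(abs(v) for v in corrected[1:]))
```

With this filter form, a cut with the same frequency and Q swaps the numerator and denominator of the boost, so the cascade collapses back to a bare impulse and the tail vanishes to within rounding error.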

I think Floyd Toole shows an example in his book where he equalizes the low frequency response of his loudspeaker in his living room, showing the before/after improvement in both time and frequency domains. There are some cases in loudspeakers where non-minimum phase may occur in certain frequency regions (example: woofer/mid-tweeter cross-over regions), but studies show the resulting group delay is generally near or below the threshold of audibility. You can hear the effects mostly on clicks in anechoic conditions, but with music reproduced in semi-reflective rooms the reflections will mask the group delay effects, rendering them inaudible.

The binaural mannikin in the picture was recently named "Sidney" by a PhD student of mine working on Cinema Audio System Calibration, I guess out of respect for our former CEO, Sidney Harman. She also added a pair of boxer shorts to his lower naked torso to meet HR dress code requirements.

I'm not sure we've actually measured the AKG K1000 on our binaural mannikin so you've just added an item to my "to-do" list :)


Sean Olive

Tyll Hertsens wrote:

I must say that my understanding of this topic is limited, but I discussed this exact issue with Sean and Floyd at dinner that night. Both immediately blurted out, "No, headphones are minimum phase devices."

From Wikipedia: "In control theory and signal processing, a linear, time-invariant system is said to be minimum-phase if the system and its inverse are causal and stable."

Bottom line: As you drive a headphone through the frequency range, the phase error remains modest. In this type of system, what Ultrabike says is true: fix the frequency response, and you fix time domain response as well.

But the gear Harman used to correct the cans was quite a bit more sophisticated than a simple EQ. The development of the Infinite Impulse Response (IIR) filters is not described in great detail, but I'll quote the paper:

"The IIR flattening and virtual headphone filters were both designed using the auto-EQ function in the Harman Audio Test System (HATS), and implemented using the BSS Audio BLU-800 DSP."

So, the Harman Audio Test System is able to look at a system's response to various test signals and automatically create the appropriate inverse filter to flatten the system's response.

As a side note, and I don't know all the names and acronyms, in many of the audio systems I heard at Harman that day, DSP systems were in use that corrected the in-room response of speakers. For example, the home theater system I heard had in-room DSP corrections applied. The correction system is able to optimise performance in-room to significantly broaden the "sweet spot" for home theater listeners. My experience of the sound in that room was simply stunning, and it seemed quite obvious to me that the DSP corrections were working VERY well. Anyway, simply correcting EQ in headphones is peanuts compared to correcting and modifying speaker in-room response.

AstralStorm wrote:

While it is true that headphones are minimum phase, a headphone is not a linear system.

Therefore, simple IIR equalization will not always do, as that applies only to *linear* time invariant systems. Specifically, the headphone might have identical time response at different frequencies (as seen on CSD), but different magnitude responses. That's nonlinearity and it's pretty hard to correct - some "lag" might be taken away by the IIR, but you'll quickly run into problems designing the filter as you go towards linear phase and time with differing magnitude, which does happen in good headphones.

This means that typical IIR filters would introduce phase and time response errors. Acausal allpass corrections won't do, since the filter has to be realtime. (Like in this paper: http://www.ejournal.unam.mx/cys/vol10-04/CYS010000401.pdf
Even such IIR filters cannot be stable, causal and phase linear all at the same time.)

FIR filters are much better for this purpose, since you can tune phase/time response separately from magnitude response, all the way to linear phase and beyond. The cost, being compute power, is negligible nowadays. Preringing is usually also negligible, if present at all.

Andre wrote:

Non-linearity will affect both FIR and IIR since the analysis of systems for use with both kinds of filters assume a linear system.  In other words, if you are using a system in a non-linear domain, all bets are off.  For example, to do a Fourier analysis on a signal, you have to assume that the system is linear.

For minimum phase systems, the phase changes introduced by IIR will exactly correct the phase in the MP system since the MP system's amplitude changes have an accompanying phase change.  In other words, you want the phase changes introduced by IIR because they are necessary to the function of the IIR filter.

Tyll, are multi-element headphones (like some of the higher-end IEMs) still minimum phase?  It is my understanding that one of the main reasons speakers are not minimum phase is because the same frequencies are reproduced by different drivers because crossover slopes are not infinitely steep.  So there are multiple paths for the same frequency through the system and often with different delays, and this is why speakers in general are not MP.  It seems that multi-driver IEMs could be considered not MP too for the same reasons.

ultrabike wrote:

I don't think headphones are fully minimum phase though. Notches in the frequency response may indicate localized non-minimum phase behavior... I could be wrong in my approach, but I think at one point I got the roots of the HD558 and K701 IRs and got zeros outside the unit circle in some places in the treble. A headphone might be mostly minimum-phase though...

Furthermore, perfect zero cancellation using an all-pole IIR filter may be problematic if positional changes have an effect on the location of the zeros. Perhaps in practice this is not a big problem... Dunno.

An alternative approach to an all-pole IIR could be to create an FIR equalizer whose coefficients are calculated so as to minimize the MSE between an ideal IR and the headphone IR response. Dunno if this would be a great idea in the end, but some of these approaches might automatically take care of places with deep notches by not attempting to boost them (given low SNR). One could potentially make this a problem that accounts for some positional changes by optimizing for the average response as a function of position... There might be many other ways to....

As far as non-linearity, headphones should be mostly linear (especially at moderate listening volume). Not sure what is meant by "the headphone might have identical time response at different frequencies (as seen on CSD), but different magnitude responses" tho.

AstralStorm wrote:

By time response I mean decay. To be truly minimum phase as a whole, the response would be describable by a quite low order rational polynomial function. This is not what is seen, especially with some kinds of ringing.

If you see ringing corresponding to a dip, that's exactly where the system is not minimum phase. Peak combined with ringing is fine on the other hand, as long as it decays in polynomial relation to decay at other frequencies.

It is true that headphones are mostly minimum phase though.

inarc wrote:

This is exactly the kind of content I am looking for here. Thank you.

Cami's picture

Thanks, Tyll!

itsastickup's picture

hmmm, so what does that mean for diffuse field Etymotics which have a 12 dB bump at 3000 Hz, as mentioned by Tyll, but no bump in the compensated line? Should I be EQing out the 12 dB bump?

I'm trying -12 dB at 2750 Hz (following the innerfidelity graph) with a Q of 2.8 (which I believe is about the right width for this). Interesting. Might need to play for a while and then switch back.
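For reference, a parametric cut like the one above can be sketched as a single peaking biquad. The coefficients below follow the widely used Audio EQ Cookbook (Robert Bristow-Johnson) peaking formula; the 44.1 kHz sample rate and the function names are assumptions, and a given player's EQ may define Q differently, so the width may not match exactly.

```python
import cmath
import math

def peaking_biquad(f0, gain_db, q, fs=44100.0):
    """Peaking-EQ biquad coefficients (Audio EQ Cookbook form), normalized so a[0] = 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a = [1.0 + alpha / A, -2.0 * math.cos(w0), 1.0 - alpha / A]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def gain_db_at(b, a, f, fs=44100.0):
    """Magnitude response of the biquad at frequency f, in dB."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

# The setting quoted above: -12 dB at 2750 Hz with Q = 2.8
b, a = peaking_biquad(2750.0, -12.0, 2.8)
for f in (100.0, 2000.0, 2750.0, 4000.0, 10000.0):
    print(f, round(gain_db_at(b, a, f), 2))
```

Printing the gain at a few frequencies gives a quick feel for how wide the cut really is before trying it by ear.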

I think I'll play a sinesweep to see what's happening....

Tyll Hertsens's picture

...I think that's one of the conclusions drawn. Interested in what you hear as you play with the 3kHz bump...but I'd encourage you to try reducing the 3kHz bump in various amounts. 12 dB reduction may be too much.

Tonmeister's picture

It's important to remember that the diffuse-field calibrations derived from Moller, and Hammershoi and Moller, didn't include any low frequency room gain. So, when we concluded listeners didn't prefer diffuse-field calibrated headphones in papers 1 and 2, those calibrations didn't have any room gain. If you want a clue how reducing the 12 dB peak at 3 kHz down to 3 dB will sound, look at the modified DF curve by Lorho (developed when he was at Nokia, now at Apple). It didn't score so well in our tests and similar tests conducted last year by Fleischmann et al. from Fraunhofer. In our test, listeners said it sounded too dull and muffled.

In paper 2, if you look at the most preferred headphone response based on the measured loudspeaker response in the reference listening room,  it had a combination of low frequency room gain and a slightly modified DF response above 3 kHz. 

This is just the early stages of our research and we are further investigating what the preferred target response is using both trained and untrained listeners from different demographics. I'm particularly interested in testing headphone sound preferences of young kids because many audio marketing folks and some audio journalists believe kids like boomy, colored headphones based on sales numbers alone. My hypothesis is that this has nothing to do with headphone sound quality, but is all about marketing and what's perceived by kids as a hip and cool headphone to own.

The audio industry has to figure out how to market good sound as something that is hip and cool, and then actually deliver it.


Sean Olive

itsastickup's picture

I can hear the bump on the sinesweep although I would estimate it as less than 12 dB.

Adjusting my EQ I seem happier with music at -8 dB than -12 dB. I do think it is an improvement though it's quite a different sound. I usually have my Etymotics at bass +9 dB at 30 Hz Q 1.4 (using Rockbox on a Clip+ for neutrality), but the bass was overwhelming and even a touch too strong having removed the bass boost. So much for weak bass on the Etymotics; seems it's the 3000 Hz bump rather than weak bass.

intense sinesweep session later....

Sinesweeping seems flatter with -5 dB, with the center nearer 3000 Hz and a width of Q 1.7 (quite wide). (-12 dB results in a significant dip at 3000 Hz, so the uncompensated graph isn't giving the picture I'm hearing.) I've put my bass up +3 dB at 30 Hz with a narrower Q of 2.8 (to give bass extension rather than overall bass, which I don't need anymore). My own ears also need -7 dB at 8500 Hz, Q 4.8, for ear canal resonance, which may be a working setting for you to adjust with a sinesweep of your own (7 kHz to 10 kHz log, with 'markers').
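A logarithmic sweep like the 7 kHz to 10 kHz one mentioned above can be generated along these lines. This is a rough sketch: the 10 second duration, 44.1 kHz sample rate, and function names are assumptions, and the 'marker' helper only reports when the sweep passes a given frequency rather than adding an audible cue.

```python
import numpy as np

def log_sweep(f_start, f_end, duration, fs=44100):
    """Exponential (log) sine sweep whose frequency rises from f_start to f_end."""
    t = np.arange(int(duration * fs)) / fs
    k = np.log(f_end / f_start)
    # Integral of the instantaneous frequency f_start * exp(k * t / duration)
    phase = 2.0 * np.pi * f_start * duration / k * (np.exp(k * t / duration) - 1.0)
    return t, np.sin(phase)

def time_of_frequency(f, f_start, f_end, duration):
    """Time (in seconds) at which the sweep passes through frequency f."""
    return duration * np.log(f / f_start) / np.log(f_end / f_start)

# Assumed settings: a 10 s sweep covering the ear canal resonance region
t, sweep = log_sweep(7000.0, 10000.0, 10.0)
marker_8500 = time_of_frequency(8500.0, 7000.0, 10000.0, 10.0)
print("sweep passes 8500 Hz at about", round(float(marker_8500), 2), "s")
```

Keep the playback level low when listening to sweeps in this range; the samples are full scale and would need scaling before being written to a file.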

I think I like what I'm hearing.

Andre's picture

FWIW, Siegfried Linkwitz talked about EQing headphones here: http://www.linkwitzlab.com/reference_earphones.htm

It includes his EQ for the ER-4S as determined empirically by listening to a sine wave sweep. I tried figuring out the bumps on my Sennheiser HD-600s with much cruder equipment (no analog controlled sine sweep with a finely-adjustable big dial, but +- buttons on my keyboard), and ended up notching out 2.8 kHz (-4 dB, Q=10) and 6.6 kHz (-3.75 dB, Q=10), which does sound more natural.

On a Mac, I use Audio Hijack Pro with the very high quality Core Audio parametric EQ modules that come with OS X to do the notching. I used ToneGen to kind of do the sweep. I'd rather rent an analog sine wave generator to provide the sine wave, but I'm sure there are better software versions of what I found.

itsastickup's picture

Thanks for that, Tyll. I had my reply to myself written and waiting to be checked before I saw your reply. Your comment seems confirmed; only a mere -5 dB to (subjectively) flatten. I'm sure I'm not so accurate as I didn't spend huge amounts of time on it; but it sounds flat enough. It's also made the treble a little more sparkly, which is something I always felt was a weakness with un-EQ'd Etymotics despite the seemingly forward/harsh treble.

sue4's picture

Hello Tyll! Thank you for this article. It gave me insight into DSP software such as Accudio and Dirac. But I still wonder what your opinion of them is.

Tyll Hertsens's picture

Gotta get off my butt and check them out.

Inks's picture

Unfortunately, Accudio is not exactly accurate. There are just too many variables the app overlooks and doesn't state. I assume all the IEMs are inserted at the reference plane, but some aren't physically capable of such insertion in some ears, and tips aren't mentioned, which can have quite an impact. Even with that matched, though, the emulator fails to be accurate.


Tyll, would you change your compensation to perhaps the Olive-Welti curve? I always felt the ID curve wasn't the best choice here. Diffuse-Field curve may not be preferred [It's certainly too bright and thin] but at least it has scientific significance and is vital in representing an accurate orthotelephonic gain of a headphone. I think the Olive-Welti would certainly be a step-forward.

Tyll Hertsens's picture

There will come a time when I have graphs automated for web display (like HeadRoom's), at that point the user will be able to select from a number of compensation curves, one of which, I hope, will be the target headphone response curve developed by Harman. 

Paul Barton is doing similar work up at the NRC, and I hope to include his target curves when developed.

Inks's picture

So are Lorho from Nokia and Fleischmann of Fraunhofer. Now let's see which gets enacted for international standards...

zobel's picture

How does your dummy head compare in calibration to others, such as the one Harman uses? I think it would be interesting to do a live recording with your dummy (does she/he have a name?) and play the recording back through a flat system to see how it fares. It would be informative to listen through your dummy's ears in different situations, such as concerts or ambient in room sounds; or how about listening to what the dummy "hears" with different headphones on, then compare the actual sound of the headphones directly to the headphone/dummy head loop?

This is fun!

Let's have a "name-that-dummy-head" contest.

zzffnn's picture

Question for Dr. Olive or Tyll:

The virtual headphone experiment is a better blinded test. Regarding the headphones with poor correlation coefficients (r=0.05 and 0.69), did the virtual LCD-2 vs real LCD-2 get r=0.69? Did the rank (among listener ratings) of the virtual LCD-2 remain the same as that of the real LCD-2?

Also is distortion of different headphones considered in the virtual test? LCD-2 has significantly less bass distortion than the other 5 dynamic headphones. Did your test songs contain enough low bass frequency? And what do you think is the audible threshold of bass distortion? 5%? 10%?

Thank you in advance.

donunus's picture

I wonder if this means that we can finally get a "correct" sounding version of the k7xx family :D 

This test should have been done a long time ago. I've been talking about this method of measurement and speaker comparison with my friends for quite some time.

Hifihedgehog's picture

By this, I mean to ask if this is a subtle hint that a bona fide new headphone from AKG is soon to be released, not an umpteenth rebranding and repackaging of the veteran K701, especially like the new K712 "Trick or Treat Edition." ;) By the way, the K712 appears to be a Q701 by another name and face, which also has a definite bass boost of about 3 dB compared to the classic K701.

Tyll Hertsens's picture

....to get from research to finished product. I actually asked this question of Sean, and, as I recall, he said two or three years before products may appear that benefit from these studies.

I also mentioned to their senior headphone product manager that rebadging the K701 incessantly was pissing off the enthusiasts. Don't know if that will change anything though. They've got to play with the cards in their hand, I s'pose.

AGB's picture

Many moons ago in a couple of Inner Fidelity comments I suggested that the LCD-2 be EQ'd using, for example, the parametric EQ supplied with FIDELIA software...a technique I've been using since. There is no question that the improvement is dramatic.

Whatever reservations one may have had about soft bass and a rolled-off top vanish - predictably.

Whether parametric EQ improves phase is immaterial for me - I have no way to measure anything other than using my relatively-experienced ears.

I'll assume for the moment that EQing the FR improves phase too. Good enough for practical purposes. If it doesn't, ask me if I care.

What counts is the sound I'm getting.

The LCD-2 flat and equalized cannot be recognized one from the other.

All I can say is that just about every recording you're listening to has been touched with a heavy hand. For those who have issues about EQ, I'd worry about other things. For example, Iran going nuclear or an invasion by extraterrestrials wearing headphones.

Concerning the dummy head, the only one I've become intimately acquainted with is the one I was born with.

itsastickup's picture

Well, I've spent some time on equalizing my etymotics (HF2). I found the audio had a certain thickness having equalized down the 3000Hz hump; and discovered it at about 650Hz. I also worked somewhat on bringing out the low bass. I couldn't get 30Hz and below to come through but 40Hz and up seems good.

The settings I'm pleased with (and seem to be the flattest on a full sine sweep) are 40 Hz 5 dB Q=1.4, 650 Hz -4.5 dB Q=2.8, 2900 Hz -5 dB Q=1.4 (resonance: 8500 Hz -8 dB Q=4.8). I also take 5 dB off the treble (Rockbox). These are settings for a loud listener. Doubtless both treble and bass will need adjusting to taste with something of an increase for quiet listeners. It's quite marvelous to hear that low bass.

I'm now getting the sort of pleasure I get with my full-sized headphones which just a simple bass increase couldn't do. Still no soundstage, but ah well.