Break-In, Part Deux

By Tyll Hertsens

Many of you will remember my previous attempt at analyzing break-in from this post. A lot of good comments and suggestions were made after the post, and I'd encourage those not familiar with this previous attempt to check it out.

Deciding that my time away at T.H.E. Show Newport would be the perfect opportunity to run some more break-in tests, I programmed my headphone measurement system to take frequency response, square wave, impulse response, and Total Harmonic Distortion data every hour for 100 hours.

Man! That's a lot of numbers ... I'm glad I have a computer.

Preparation
I had some unfortunate computer glitches that resulted in incomplete tests after the first experiment, but I did get enough data from these preparatory tests to suggest that, indeed, the headphone pads may be settling onto the head for the first day or two, as was suggested by xnor, IIRC. I also came to the conclusion that there was a chance, however slim, that the voice coils were suffering from heat build-up during the previous experiment, since pink noise was played continuously right up until the time of each testing sequence. I knew this was going to be a sensitive test, and that I really had to rule out as many things as possible.

In preparing for this round of testing, I placed the headphones on the head 40 hours before beginning the test to let them settle in. I also wrote into the testing program 15 minutes of silence before each test so that the headphones had some time to cool off before the measurement sequence began.

The Test
The headphones used were a brand-new, fresh-out-of-the-box pair of black AKG Quincy Jones Q701 headphones --- a model believed by headphone enthusiasts to require lots of break-in.

The test itself was programmed to run for 100 cycles. During each cycle, the system would measure frequency response, 30Hz and 300Hz square wave response, impulse response, and THD+noise spectra. Then the system would play pink noise into the headphones at about 90dB SPL, followed by 15 minutes of silence, and then would run through the cycle again. The complete test took 125 hours ... just over 5 days.
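
(For the programming-minded, the sequencing boils down to a loop like the Python sketch below. The measure_*, play_pink_noise, and save_results functions are hypothetical stand-ins for my measurement system's actual commands; this illustrates the protocol, not the real control code.)

    # Sketch of the 100-cycle break-in protocol. Every function below is a
    # hypothetical stand-in for the real measurement system's commands.
    import time

    def measure_frequency_response(): ...
    def measure_square_wave(freq_hz): ...
    def measure_impulse_response(): ...
    def measure_thd_plus_noise(): ...
    def play_pink_noise(minutes, level_db_spl): ...
    def save_results(cycle, data): ...

    CYCLES = 100        # 100 cycles; the whole run took about 125 hours
    SILENCE_MIN = 15    # quiet time so the voice coils cool before measuring

    for cycle in range(CYCLES):
        save_results(cycle, {
            "fr": measure_frequency_response(),
            "sq30": measure_square_wave(30),      # 30Hz square wave
            "sq300": measure_square_wave(300),    # 300Hz square wave
            "impulse": measure_impulse_response(),
            "thd": measure_thd_plus_noise(),
        })
        play_pink_noise(minutes=60, level_db_spl=90)  # break-in signal
        time.sleep(SILENCE_MIN * 60)                  # 15 minutes of silence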

Results
Not wanting to get complaints about "fruit salad" multi-color graphs again, this time I crunched the numbers on my corporate MacBook Pro that has a current copy of Excel so that I could display the results on 3D surface graphs. Oooooo ... so cool! Here we go!

Frequency Response

Fig 1. This chart shows the left-channel differences in frequency response. Data from the last measurement (the 100th hour) is subtracted from all previous measurements to show how the headphones change over time relative to that final measurement.

In the graph above, the frequency spectrum runs from 10Hz to 22kHz in 511 steps, which you see on the axis to the right labeled 1 through 481. The series of 100 hourly measurements runs right to left on the axis closest to you, labeled "Series 1" and "Series 51." The vertical scale is the difference in dB SPL, with a maximum of +/-2dB.

Fig 2. This chart is similar to Figure 1, but shows the right-channel differences in frequency response.
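
(If you want to build this sort of chart from your own data, the arithmetic is just a subtraction and a surface plot. Below is a minimal numpy/matplotlib sketch; the 100 x 511 matrix is random stand-in data, not my actual measurements, which I crunched in Excel as described above.)

    # Difference-vs-final-measurement surface, in the spirit of Figs. 1 and 2.
    # 'sweeps' is hours x frequency bins; random stand-in data for illustration.
    import numpy as np
    import matplotlib.pyplot as plt

    hours, bins = 100, 511
    rng = np.random.default_rng(0)
    sweeps = rng.normal(90.0, 0.2, size=(hours, bins))  # fake dB SPL sweeps

    diff = sweeps - sweeps[-1]          # subtract the final (100th-hour) sweep

    X, Y = np.meshgrid(np.arange(bins), np.arange(hours))
    ax = plt.figure().add_subplot(projection="3d")
    ax.plot_surface(X, Y, diff, cmap="coolwarm")
    ax.set(xlabel="frequency bin", ylabel="hour", zlabel="difference (dB SPL)")
    ax.set_zlim(-2, 2)                  # +/-2dB vertical scale, as in the charts
    plt.show()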

The first thing to note is that virtually all the data (with the exception of some of the highest frequencies) is within 1/2 dB of the last measurement. Really, I couldn't be more pleased with the system's ability to record the observed trends so accurately and with so little noise.

Both charts show remarkable similarity between the left and right channels over the course of the test. The charts show some wavy motion over time; I suspect this is temperature differences between night and day over the course of the test, and I doubt it's a function of anything actually changing in the headphones themselves.

At the low end of the audio spectrum (closest to you) you can see that the bass rises over time, and then falls fairly rapidly at about hour 70.

The set of ridges at about 150 on the right scale is the primary resonance of the driver. You'll notice that for about the first 20 hours there are small ripples, then between about hour 20 and hour 70 the features get larger, and after hour 70 they seem to settle out. Also note that the features between roughly 250 and 450, in the upper mid-range and lower treble, likewise get stronger during the 20-to-70-hour period.

In both channels, data features appear in bands: 0 to 20 hours is fairly settled; 20 to 70 hours is bumpy; and the last 30 hours of the test settle down quite a bit.

Impulse Response

Fig 3. This chart shows the differences over time of the impulse response for the left channel. Time of the impulse runs from left rear to right front. The series of 100 hourly tests runs from bottom left to top right.

Fig 4. This chart shows the differences over time of the impulse response for the right channel.

The differences in impulse response are also remarkably similar between the left and right channel. Notice again the characteristic nature of the data in the bands from hour 0 to 20, hour 20 to 70, and the last 30 hours.

THD+noise Differences

Fig 5. This chart shows the differences over time of the THD+noise for the left channel. Frequency runs from 20Hz (closest to you) to 7kHz. The series of 100 hourly tests runs from right to left.

Unfortunately, I had a problem with the right-channel data, but it looked fairly similar to the left channel shown above. The only things of note here (to me) are the dip in distortion in the lows between 20 and 70 hours, and the elevated distortion between samples 200 and 350 in the last 30 hours. (The line of spikes at about sample 380 is a measurement artifact of the range-change relays in the Audio Precision.)

The extra distortion features in the last 30 hours of the test lead me to believe that further burn-in testing might be required to see if these artifacts settle out over time.

Summary
Well ... do we see something burning in here? I don't know. What do you think?

It certainly does seem like something is happening between hours 20 and 70. I haven't opened the chamber and taken the headphones off the dummy yet, so I really think I need to run the test for another 100 hours to see what happens. That way we can compare these data with a new set of graphs. It's going to tie up my headphone testing for a while, but I think it's worth it. I'm gonna go push the button.

I'm also going to defeat my automatic thermostat in the house to try to keep the temperature more constant over the test.

Lastly, I want to remind people of two things:

  1. The magnitude of the changes observed is very small. If the features seen are evidence of break-in, the effect is small. When you buy a pair of headphones and listen to them right out of the box, that's pretty much what they're going to sound like after a year.
  2. Check out "World Access for the Blind." This is an organization that teaches blind people how to echolocate their way through the world by clicking their tongue and listening for the echo. When they hear the return echo from their environment they can get a "picture" of the physical objects around them, and can navigate using it. These blind folks are riding bikes and playing basketball completely unsighted. I'm pretty sure this almost magical ability is indicative of the extraordinarily exquisite sensitivity and processing power of the human senses and brain. Just because the measured data of headphone break-in may be vanishingly small does not mean it can't be sensed.

Okay, what I'm really interested in is your comments. I got a lot of good ideas from them last time, and I like the feedback. So, if anything strikes you, pipe up; I'm interested --- even if it seems a little off the wall, throw it out there. You never know.

Oh yeah! The contest!

Contest Winner
In the last break-in article I offered up a pair of Q701s for the comment I found most interesting. This one kept ringing in my ears:

By Shike:
Unlike the headphones, I much enjoyed seeing you run pink noise through them for the sake of at least trying to get to the core of the burn-in argument.

The problem, unfortunately, is this is only one down. The curse this headphone has left is one you and your inbox will feel, as people start questioning whether their own headphones burn-in. "Do the HD800, HE6, LCD-2" they'll cry. The 701's will be long gone, but you will be stuck with requests locking you up in the measuring chamber for eternity.

Oh, and Gatepc forgot the main ingredient for boutique cables: virgin tears gathered on the top of the Himalayas . . . that's how they achieve "quantum tunneling" you know?

Okay buddy, you win. And yes, I know this adventure is going to keep tying up my measurement chamber ... such is life. Someone's got to do it, why not me?

Email on the way! Congrats buddy!

Comments
maverickronin
Echolocation

Well, blind people can devote a much larger portion of their brains to their hearing. Unless you want to sacrifice your sight, I wouldn't expect to be able to learn that.

I think I'm going to try it myself though...

inarc

Thanks for the experiment. I personally feel assured by the results that headphone burn-in is mostly psychological and not physical in nature: the differences, if not due to measurement inaccuracies or other side effects, were small; our echoic memory is poor; and burn-in is always said to tend to improve the sound, so there is already an expectation bias to begin with.

sgrossklass
Future candidates

You see, I'm also a radio nut. Now, which kind of receiver do you think you can learn the most from? The top-end model that has near-perfect specs and performance all around? Certainly not. A far more basic model, with its share of images, intermodulation products, spurs, and frequency instability, plus noticeably imperfect IF filtering, will be far better suited for learning about receiver technology.

Hence in this case I'd grab some far more basic cans for which people have reported noticeable changes, commonly within the first few hours. DT231s/235s, RP-HJE900s (both reported to gain noticeably on the bottom end within a short time), HD238s (reported to pick up in terms of low-frequency level handling / distortion), stuff like that.

Sure you could subject a pile of the usual audiophile cans to the same procedure, but that might well keep you busy until next life with nothing much in terms of breathtaking results.

xnor
Quote:
Hence in this case I'd grab some far more basic cans for which people have reported noticeable changes, commonly within the first few hours.
People have reported day/night differences (scnr) between stock and 100h burned-in K701's.

donunus
My Burn In Experience

I have always believed in burn-in, but 90% of the time a headphone that sounds bad out of the box is not likely to become a great one after hundreds of hours of use.

I believe that headphones get better over time due to a combination of the drivers burning in and the human brain getting used to what it's processing. The weirder and more unnatural-sounding headphones might actually be the ones that need more brain acclimation, though, so those are the likely candidates for being called the "headphones that take a long time to burn in."

As far as the drivers burning in, I didn't actually notice as much of a change in the infamous k701 as in my old Sennheiser hd555. In the case of my old Senn, I really believe they turned from zero to hero due to the drivers burning in, because out of the box they were just ridiculously bright and full of reverb compared to my other headphone at the time, the px100. After having had the hd555 for months, the reverb and extremely blurred imaging tamed down quite a bit. The hd555s sounded warm and a little dry after a few months of use, and with the same reference for comparison (the px100), they didn't have the huge difference in brightness I remembered them having before.

This is a really good article because it proves to the skeptics that headphone drivers do change over time. Whether a piece of crap out of the box is going to turn into an audiophile's wet dream is another story, though, and is something I have yet to experience in this lifetime. Hmmm, although that hd555 I had did change quite a bit...

thegr8brian
Temperature / humidity monitoring?

Hey Tyll,

Great article as usual. It may be too late now, but I would suggest setting up a simple temperature sensor in the chamber, and perhaps even a humidity sensor, and tracking these values during testing so you can graph them later if you find a correlation with temp/humidity. I imagine something like this is available for purchase, but it would not be too hard to get something up and running with any sort of project-based microcontroller.

Looking forward to your comments on the new LCD-2 you've tested!

dalethorn
Burn-in

I tried with the Beyer DTX-300p, a very, very economically constructed headphone. I expected the new set to sound slightly different than the one I had played for a couple hundred hours, just because of the, er, economic construction. It didn't sound any different to me. Still, if headphone drivers aren't made of metal, they should be able to change shape or something, slightly. The scientific part of my brain says this test, or at least a small subset of it, should be run on a few different models by different manufacturers. It's entirely possible that the K701 is built in such a way as to self-compensate for changes, either like those self-healing new materials that are being manufactured, or in some other way. The fact that there are small but consistent (non-random) changes in the 701 at least implies that other headphones could have larger changes. One thing I do notice in extended listening to any headphone is how very small differences in frequency response, for example, can be hugely important. If I listen to one headphone whose bass is just OK, and then listen to another whose bass is just one dB weaker, it can sound like a world of difference, since the new bass is not "just OK".

donunus

"One thing I do notice in extended listening to any headphone is how very small differences in frequency response, for example, can be hugely important. If I listen to one headphone whose bass is just OK, and then listen to another whose bass is just one db weaker, it can sound like a world of difference, since the new bass is not "just OK"."

I agree with you here, Dale. Take the px100-II and the Koss Portapro. Put Koss Portapro pads on both cans, and I tell you, if you switch between headphones and only listen for a few seconds each, there is a big possibility one can't even tell the difference between the two of them. Listening longer, or when music that shows where their differences lie is played, one can tell that the px100-II has more upper mids and lower highs than the Koss, which is a little more recessed in that region, starting its climb around 10kHz and behaving more like a Beyer in this area. Of course, the Portapro also has much more power handling, so playing music very loud will instantly reveal that the Senn is more distorted than the Koss :) I still prefer the Senn to the Koss overall though.

Back to the topic: little differences in measurement can affect the performance enough to make a person like a headphone that he once hated out of the box. I know from experience, since it has happened to me in the past. Of course, my brain isn't exactly perfect and is still capable of adapting to the sound, so it is probably not purely due to changes in the headphone alone. Heck, when I'm drunk, something could totally sound killer even when it was total crap while listening sober :)

xnor

Quote:
If I listen to one headphone whose bass is just OK, and then listen to another whose bass is just one dB weaker, it can sound like a world of difference, since the new bass is not "just OK".

I really have doubts here. If we take a closer look at http://www.innerfidelity.com/images/AKGK701.pdf for example, and zoom in on the gray lines in the FR graph, we can see how really small changes in positioning can result in 2+ dB changes.

Since it's unrealistic that you'll manage to place the headphones precisely (let's say within a +/-1 mm margin) on your own head, you'll get bigger changes solely from placement than the really small changes Tyll measured after tens or hundreds of hours of 'burn-in'.
Additionally, the ear-pads compress and get dirty over time, which again will change the FR regardless of driver 'burn-in'.

What's also very, very questionable about 'burn-in' is that reviewers practically always describe how it improved not only the sound balance (FR) but also the sound quality! This only makes sense to me from a psychological point of view.

dalethorn
Good points

Good points, xnor, concerning variances with just one headphone. My experience with certain sealed headphones has shown me that not only does positioning make a significant difference, but how well the 'pleather' pads seal on the ear has a very noticeable effect on the bass. Matter of fact, some highly regarded reviewers have reported weak bass with headphones that don't have the problem when a good seal is achieved. So since I've fallen into that trap as well, I've learned to control the variables and get the "right" fit and other conditions, such as ambient noise and my own noise level, set before launching a serious listening session. But then, when you get these preparations down, those differences in the headphone itself tend to jump out at you.

donunus

I agree with Dale's statement there to a certain degree, which is why I quoted it and compared the Koss Portapro and the px100-II. But 1dB changes here and there? Hmm, maybe not that small of a difference hehehe

dalethorn
Donunus' experiences

I have followed the postings on the low-priced PX-100ii and Portapros, and noted the differences that occur with different earpads or other mods. I even did some of that myself successfully. I would vote with Donunus that the $100 price range has some really good headphones, and given that they don't have the tight Q/C of the expensive headphones (and don't look now, but they certainly don't get a manufacturer's burn-in that the $500 headphones *might* be getting hint hint), they should be a more fertile ground for investigating break-in or burn-in. We're all scientists here, yes? Logic suggests exploring the lower price ranges where we can be sure that the headphones come right off the assembly line and onto the distribution truck without hours of burn-in that's described as "Finishing and Q/C".

PMM
Re: eternity

Regarding "you will be stuck with requests locking you up in the measuring chamber for eternity," I say you should turn lemons into ramune and make it a planned event.

Purely speculative suggestion off the top of my head:

Step 1: Create a poll. Announce that you're not opposed to conducting this sort of test again at some point in the future, but only if you can get X number of people to vote for a single model to be tested the next time you've got a week free, and only if the manufacturer of that model is interested in sending a brand-new model for testing. This puts a hard limit on whether you can or can't do it.

Step 2: Come up with a friendly boilerplate email explanation of the situation, and link to the poll. This will save you the trouble of having to hand-respond to everyone.

The key problem I see with this is that the poll will be difficult to manage. I'm not sure if the CMS that Source Interlink uses is capable of presenting a poll, for starters. And then even if it is, do people only get to vote one time ever? Can they undo their vote? Can they vote for multiple things? The answers to those questions would skew results.

Regardless of how you handle it, I believe that you can approach and embrace this challenge in a way that won't feel like a curse.

DigitalFreak
Leave the Kool-Aid behind and rock out in the peanut gallery

I'm enjoying your article and the hard work you're putting into it, Tyll. Unfortunately, the final outcome will be people either ignoring your findings or going out of their way to discredit your work. Whether the final results prove burn-in is a mental placebo or a true factor in an HP's final sound signature, people will pick away at your data nonstop, bringing up the most far-flung excuses under the sun.

I'm quietly sitting here in the shadows of the peanut gallery, awaiting the first posts from the various factions of the Kool-Aid-drinking crowds, with comments such as "each HP is different and therefore will have different reactions sonically to break-in," "mod the cord and burn-in will be more apparent," "your equipment is faulty or not sensitive enough and therefore not recording the proper data for your graphing," or "your graphs can say what they want, I trust my ears and they say there's no changes." In the end, for some people it doesn't matter what you show them; they only choose to see the Kool-Aid pitcher in front of them.

On the plus side, it can be quite fun sitting in the shadows of the peanut gallery. You should try it sometime, Tyll. The peanuts are fresh and salty and the beer is smooth and nicely chilled. In my book there's nothing more enjoyable than rocking out to tunes on your HPs with some chilled beer and peanuts. Bring in some good pizza and life would become utopian.

dalethorn
Not scientific

DigitalFreak, it's not scientific to make your mind up in advance of the data, nor is it very scientific to base conclusions about many headphones from tests of just one headphone. You may be absolutely right, or very close, but I would like to see other models tested - at least the frequency response tests.

EDIT: I would also like to respond to the notion of "I trust my ears and they say there's no changes".
I did ear measurements of a headphone that was graphically measured and found huge differences. I don't automatically trust my ears - in fact I tend to trust measurement by scientific equipment, being a long-time owner of half a million dollars in Hewlett-Packard scientific computers. So when I did the ear measurements with fixed tones from several sources, comparing with different techniques I described in detail, I could feel confident that the measured differences were more in error than my listening tests. I feel so confident of this, in fact, that I am on the lookout for possible explanations. And no, there is no possibility that I could be off by much. I can tell what a one dB difference is, having equalized headphones to within that difference many times.

PMM
OK, going over the data.

On the impulse response, the Z axis is still voltage, right? You've made efforts to stabilize the voltage for this second test, compared to the first test, and indeed the measured voltage differences are an order of magnitude smaller. This confirms what I ultimately suspected after wrapping my brain around the last test.

You accidentally glossed over the square wave results in this article, by the way -- but it's probably no loss, since they'll just be sympathetic with the impulse results anyway.

Could I see a PDF measurement sheet from this set? I don't need all of the hundreds of results; just one will do. I'd like to compare the exact distribution of left and right channel differences on the impulse response (where precisely the red humps and blue humps are distributed) -- it will tell me how different the driver pairing quality is from one set to another. What does this have to do with burn-in? Well, if the differences between one brand new pair of drivers and another brand new pair of drivers is more evident than what little we suspect is due to burn-in, then that would provide a sense of perspective.

Alright, so. In the end, with your newly standardized testing, it appears that all of the measured differences are inconsequential. We are unlikely to suss out any more details from these measurement methods, and if anyone is going to find incontrovertible proof of meaningful burn-in effects in the future, they'll come from different measurement methods entirely.

jaggervm
Cumulative Frequency Response

If I read the graphs correctly, they are showing change in the frequency response in relation to the last measurement, right? In that scenario, what about the cumulative change in the frequency response curve of the headphones at the 0, 50, and 100-hour marks? The smaller changes of 0.5 dB at every cycle between 20 and 70 hours would accumulate to a much larger change in the curve. Would it be possible for you to show this change in the curve?

PMM
"last measurement"

No worries, there. He meant "last" as in the 100th hour, as opposed to the immediate previous measurement.

dalethorn
One headphone only

It appears, in the informal poll we have here already, that the readers are ready to close the book on burn-in with only *one* headphone tested. Tsk tsk.

PMM
Aw.

I'm the guy who, in the comments of part 1, was asking about waterfall plots, and impulses at different voltages, and recordings of recordings, and wavelet transforms. I'm up for more exotic testing methods if anyone has a good suggestion and Tyll is willing to do them. I'm just also tossing out an idea as to how he might take the edge off of the ensuing years of emails from uninitiated people saying HAY TEST MY FAVRITE HEADPHONES PLZX

As you said in the comments of the first test (I'll link to it, although for some reason the link doesn't take me directly to the comment that it's supposed to): "Four, we assume the freq. response measurement is the end-all of sonic difference. Five, we assume that what we can't hear consciously we also don't hear subliminally."

I don't disagree. As I said recently:
"We are unlikely to suss out any more details from these measurement methods, and if anyone is going to find incontrovertible proof of meaningful burn-in effects in the future, they'll come from different measurement methods entirely."

If the subliminal aspect is quantifiable, we don't currently know how. That's all I'm thinkin'.

SAS
Raw data

The 3D graphs are a big improvement. They make the argument for break-in much more convincing.

Would you consider making the raw data available for others to analyze?

Tyll Hertsens
Mmmmm..
.... sure. Email me at tyll(at)tyllhertsens.com

donunus
My Current Burn In Adventure

Talking about burn-in, I have some Fischer FA-003s cooking here right now. I am now at approximately the 50th hour since they were plugged in, and they are playing a different track every few hours, with different patterns of vibration and beats, to exorcise the demons out of their hellish-sounding drivers. When I feel I need to tame certain annoying parts of the mids, for example, I play a song that makes them spit out those annoying frequencies. A little paranoid and non-scientific, but it makes me feel good to burn cans in this way hehehe.

These Fischers are really quite nasty sounding out of the box. They were very undefined and full of reverb at first listen, and I was having a hard time hearing certain details in the music that even my el cheapo AKG k44 was revealing to me. Right now they are not as raucous as they were out of the box, but I still feel they are a long way from being done. I sure hope LFF and I have similar tastes, since he praised these very highly on Head-Fi. If these change quite a bit within the next week, I will surely be reassured again that burn-in is real. There is no way any psychological effect is going to make me hear details that I could not hear out of the box when I am mentally aware of the specific details I am looking for. Hmmm, we'll see...

xnor

Simply concentrating on a different frequency range or instrument or singer will let you hear details you never heard before (when you listened to a track 'as a whole').
Hearing is not a measurement process. Hearing is perception.
I'm sure you know a couple of optical illusions that are actually trivial but make us look like fools. The same applies to our hearing.
Our sensory impressions are also affected by indeterminable variables, our subconsciousness etc. and result in unique interpretations. This may sound paranoid, but sometimes we can hear things that don't even exist, like when doing EQ fine tuning... until you notice that bypass was checked the whole time.
And 1 dB is a small difference. Sure, in an ABX test with software support a 1 dB change can easily be spotted, but try to do an ABX test with headphones. All of the problems mentioned above will kick in, plus manufacturing variations that most probably cause bigger changes between two headphones than what Tyll measured above.

If you listen to the Fischer you'll get used to its traits eventually, and it will probably sound more acceptable compared to your initial impression.
A better approach would be to listen to the stock headphones for a couple of hours spread over a few days, then torture them with pink noise, and finally listen again. Ideally, you shouldn't use any other headphones during the whole process, and all listening sessions should be equally spread over time, at the same time of day. :D

dalethorn
Not exactly

Not exactly, xnor. Listening to most music, perhaps. Listening to test tones in multiple test situations you can corroborate your results to less than a decibel. It's possible to do this with music as well, but it's more difficult and I don't know if anyone is claiming to have done rigorous tests with music.

Edit: I EQ'd a Beyer DT48 in 1974 to sound exactly like a Koss ESP9, and the apparent EQ did not change from day to day. Some things are true - such EQ works only on a per-song basis, because the music is complex and the number of EQ center frequencies was limited. But the technical part of it can be made nearly exact, by ear.

donunus

Test tones are definitely easier to work with, I agree. With music, you have to know which cues in which parts of a song to listen for in order to discern between imagined and non-imagined differences in sound.

dalethorn
Test tones, music, measuring

What astounds me is how people can look at a response curve and, noting a 20 dB drop in the bass and a 35 (35!) dB drop in the lower highs, just assume that's real because the measurement says so. And BTW, I am not saying that the measurement equipment didn't actually see those values - I'm fairly sure it did. But when my ears, after much, much effort and cross-comparison, say the actuals are a small fraction of that 20 and 35, any reasonable person should be asking *why*. I suppose there's either a lack of curiosity about the measuring process, or a blind faith in the results. Not good. That being the case, we should be looking at other ways to evaluate burn-in that aren't purely (or mainly) subjective.

donunus

Maybe the new dt48 pads are made to work only with human skin for a good seal and the right sound. Tyll's dummy head could be giving a different sound than what we might be able to hear on our own heads.

Tyll Hertsens
If anything ...
... the headphones will seal better on the dummy head than on human skin --- hair and whatnot getting in the way. That's why I measured the DT48 with a poor seal as well.

dalethorn
The measurement effort

The measurement effort for the DT48 was extraordinary, which makes the mystery aspect as legendary as the, umm, legend I suppose.