Headphone Measurements Explained - Square Wave Response

Read about square waves and the most common thing you'll hear is they are made of a fundamental and an infinite set of odd harmonic tones in a very particular amplitude and phase relationship. That's true, of course, but in the real world and with regard to testing headphones, it may not be the best way to think about them. It's a decent place to start though.

Viewing Square Waves as Made of a Fundamental and Odd Harmonic Series
To show you how square waves are built from a fundamental tone and its odd harmonics, I created a little Excel spreadsheet (you can download the Excel file here). I used some simple math to create a series of columns holding data that represented the appropriate sine waves, which I could then add together to form the square wave. The formulas also had some variables with which I could control the relative amplitude and phase of the odd harmonics.

The top plot to the left shows the fundamental tone and the first four odd harmonics. The next plot shows the result of simply adding together the amplitude of the signals at each point in time. You can see clearly that the result begins to look quite like a square wave. The third plot shows the result using the first 11 odd harmonics, and you can see it does a better job of making a facsimile of a perfect square wave. As more and more odd harmonics are used, the square wave gets closer and closer to the ideal. In bandwidth-limited systems--like a DAC that might have a limit of 22kHz for 16/44 CD playback--only a limited number of odd harmonics are available, and square wave reproduction will look quite like the third plot.
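The summation described above can be sketched in a few lines of Python instead of a spreadsheet. This is only a minimal illustration, not the formulas from the Excel file: the function names are made up for this example, and it assumes the textbook recipe in which the nth odd harmonic has 1/n the amplitude of the fundamental and all components start in phase.

```python
import math

def square_wave_partial_sum(t, f0, n_harmonics):
    """Fundamental plus the first `n_harmonics` odd harmonics at time t (seconds)."""
    total = 0.0
    for k in range(n_harmonics + 1):       # k = 0 is the fundamental
        n = 2 * k + 1                      # harmonic numbers 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * f0 * t) / n   # amplitude falls off as 1/n
    return 4 / math.pi * total             # scaling so the ideal square wave swings +/-1

def odd_harmonics_in_band(f0, f_limit):
    """Count the sine components (fundamental included) at or below f_limit."""
    return len(range(1, int(f_limit // f0) + 1, 2))

if __name__ == "__main__":
    f0 = 300.0
    quarter_period = 1 / (4 * f0)          # middle of the positive half cycle
    for n_harmonics in (4, 11):
        level = square_wave_partial_sum(quarter_period, f0, n_harmonics)
        print(n_harmonics, "odd harmonics -> mid-plateau level", round(level, 4))
    print("components of a 300Hz square wave below 22kHz:",
          odd_harmonics_in_band(300, 22000))
```

Sampling `square_wave_partial_sum` across one period and plotting the result should give curves like the second and third plots. `odd_harmonics_in_band` gives a rough idea of how many components a band-limited system has to work with at a given square wave frequency.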

There are two other things that can misshape the square wave response: phase and relative amplitude errors between the fundamental and harmonics. In the fourth plot, the phase (time relationship) of the odd harmonics is advanced as the frequency gets higher. In audio electronics gear, all frequencies are delayed a little, but high frequencies typically move through the circuits a little more quickly than low frequencies. The words "group delay" are used to identify this phenomenon. Group delay creates phase error in the timing relationship of all the odd harmonics of a square wave. In the fourth plot to the left, I've introduced some phase error, and you can see how it changed the shape of the square wave some and introduced an overshoot on the leading edge.

Errors in the amplitude relationship of all the odd harmonics can occur if the system's frequency response is not flat. In the second to last plot at the left, I've simulated a tipped-up frequency response (less bass, more treble), and in the last plot I've simulated the opposite with more bass and less treble.
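To experiment with phase and amplitude errors outside a spreadsheet, here is a hedged Python sketch. The error models are assumptions chosen for illustration, not the ones behind the plots: the phase error grows by a fixed number of degrees per octave above the fundamental, and the amplitude error is a constant dB-per-octave tilt. The function and parameter names are hypothetical.

```python
import math

def shaped_square_wave(t, f0, n_harmonics,
                       phase_deg_per_octave=0.0, tilt_db_per_octave=0.0):
    """Odd-harmonic sum with an assumed phase and amplitude error on each harmonic."""
    total = 0.0
    for k in range(n_harmonics + 1):
        n = 2 * k + 1                                         # 1, 3, 5, ...
        octaves = math.log2(n)                                # distance above the fundamental
        gain = 10 ** (tilt_db_per_octave * octaves / 20)      # amplitude error
        phase = math.radians(phase_deg_per_octave * octaves)  # positive = advanced
        total += gain * math.sin(2 * math.pi * n * f0 * t + phase) / n
    return 4 / math.pi * total

def one_period(f0, samples=2000, **errors):
    """Sample one full period using the fundamental plus 11 odd harmonics."""
    return [shaped_square_wave(i / (samples * f0), f0, 11, **errors)
            for i in range(samples)]

if __name__ == "__main__":
    cases = {
        "flat": {},
        "phase advanced": {"phase_deg_per_octave": 20.0},
        "tipped up (more treble)": {"tilt_db_per_octave": 3.0},
        "tipped down (more bass)": {"tilt_db_per_octave": -3.0},
    }
    for name, errors in cases.items():
        wave = one_period(300.0, **errors)
        print(name, "-> peak level", round(max(wave), 3))
```

Plotting the four lists against each other shows how each kind of error reshapes the edges and the flat top. The specific values of 20 degrees and 3 dB per octave are arbitrary starting points.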

This is a valid way to look at square waves for some applications, but it makes some assumptions about the sine waves being continuous forever that make it a little misleading in terms of the type of square wave testing I do on headphones. For example, in the third plot down, you can see little spikes before the transitions. This is called "pre-ringing." How could an analog system know the signal is about to change and create the pre-ring? Answer: it can't. That's one of the reasons why it's better to look at square wave response as a series of step responses.

Step Response and Square Wave Response
Step response is simply a measure of how the device under test responds to an instantaneous shift in level, from 0 volts to 1 volt, for example. As long as the frequency of the square wave is lower than the lowest frequency of interest, a square wave is simply a series of step responses.

Step response is used broadly in audio electronics and electronics in general to indicate a variety of performance characteristics in the time domain, such as rise time, overshoot, settling time, and ringing. But step response also contains phase and frequency response information. For example, it is one of the speaker characteristics John Atkinson measures to look at the phase coherence of a multi-driver speaker. (His very interesting explanation is here.)

You can also think of step response as a measure of frequency response where the leading edge slew rate indicates the high-frequency limit, and the length of time it can keep the step at the new level is an indication of its low-frequency limit. At every point between, you can think of the level of the top of the step response as related to the frequency response at the frequency whose quarter wavelength is equal to the elapsed time since the leading edge of the step/square wave.
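The link between the two frequency limits and the shape of the step can be made concrete with a toy model. The sketch below assumes the device behaves like a single first-order high-pass (corner `f_low`) in series with a single first-order low-pass (corner `f_high`). Real headphones are considerably more complicated, so treat this as an illustration of the idea only; the function names are invented for this example.

```python
import math

def rise_time_10_90(f_high):
    """10%-90% rise time of a first-order low-pass with corner f_high (about 0.35 / f_high)."""
    return math.log(9) / (2 * math.pi * f_high)

def plateau_level(t, f_low):
    """Fraction of the step still held t seconds after the edge by a first-order high-pass."""
    return math.exp(-2 * math.pi * f_low * t)

def step_response(t, f_low, f_high):
    """Unit step response of the two first-order sections in series (f_low != f_high)."""
    tau_low = 1 / (2 * math.pi * f_low)     # governs how long the top is held
    tau_high = 1 / (2 * math.pi * f_high)   # governs how fast the edge rises
    scale = tau_low / (tau_low - tau_high)
    return scale * (math.exp(-t / tau_low) - math.exp(-t / tau_high))

if __name__ == "__main__":
    half_period_30hz = 1 / 60               # time each level is held in a 30Hz square wave
    for f_low in (5, 20, 80):
        held = plateau_level(half_period_30hz, f_low)
        print("low corner", f_low, "Hz -> level at end of half cycle", round(held, 3))
    for f_high in (5000, 20000):
        print("high corner", f_high, "Hz -> rise time",
              round(rise_time_10_90(f_high) * 1e6, 1), "microseconds")
```

In this model, raising the low-frequency corner makes the top of the step sag sooner, and lowering the high-frequency corner slows the leading edge, which is the qualitative behavior described above.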

Another way to look at it is similar to the summation of a series of odd harmonics we talked about above. You can think of a step response as the amplitude response of a continuous series of equal amplitude narrow-band pulses.


This graphic illustrates the idea that the step response can, in part, be thought of as the plot of amplitude maxima of a series of narrow band pulses swept through the frequency range of interest.

This isn't the whole picture, though, as superimposed on the frequency component of the step response are time domain artifacts like ringing and phase information. The bugaboo about measurements is that if you test for information in one domain to make it perfectly clear, the information in the other domains disappears from view. You can't see time information in the frequency response plot, but it's there and you can calculate it. And you can't see frequency response in the impulse response, but again, it's there. The cool thing about step and square wave response is that you get a nice, albeit hazy and sometimes difficult to interpret, mix of both time and frequency information that, for me, feels a bit more naturally accessible and rich.

Now let's look at the special case of 30Hz and 300Hz square wave testing of headphones, and how to interpret the results.


dalethorn's picture

Reading page 2 here brought tears to my eyes in a couple places, but when I got to the PS-1000 - yikes! Glad I didn't buy that one.

Jazz Casual's picture

So the PS1000's square wave response isn't up to snuff. And it has an abundance of treble, a slightly recessed midrange compared to the Prestige and Reference series Grados and a mid-bass hump. But the PS1000's treble is not at all sharp and nor is it bass lite to my ears. I've read criticisms of this phone for having too much bass. Having heard the Fostex TH900, which has a measurably better square wave response, I found it suffered from a noticeably recessed midrange and overpowering bass. So let's not make the mistake of judging a headphone on how well it measures alone. Imagine if you had judged your Grado PS500 on that basis. You would have denied yourself the opportunity of hearing a headphone, that compelled you to share your positive experience in a forum thread at this very site.  

Alondite's picture

The problem with your ears is that perception differs from person to person, and even within one person's perception as they "adapt" to a particular signature. I occasionally have to check to make sure I don't have the bass boost turned on on my amp with my AD900s, but they certainly have light bass. My GR07s are much closer to neutral in terms of quantity, but when I go to them from my AD900s they sound like nothing but bass. The opposite is true when I do it in reverse. 

These graphs show the objective qualities of the cans. That is, they show how they actually sound regardless of perception. Now even Tyll has said that they don't tell the whole story, but they can give you an idea of the sound characteristics of any given headphone. Unless your ears and personal preferences are identical to mine, we are likely going to perceive headphones differently. Such is the problem with subjective experience: it is only valuable to you as an individual. The same is true for any subjective experience.

However, perceived relativism can still be valuable. You may not find the PS1000 treble to be piercing, but I'm sure you will still find cans that measure as having less treble presence than the PS1000 to, in fact, have less treble. For example, your experience with the TH900 as being recessed in the midrange is likely relative to your own experience rather than to neutral (though there is a bit of a notch in the midrange).

dalethorn's picture

Now here's a thought Sir Jazz - I just picked up a Sennheiser Amperior at the Apple store in Akron Ohio, where the manager and asst. manager were curious about the headphone business. I pointed them here. But back to the Amperior - very rich sound - the sort of sound I would expect of a Grado PS1000. My wife tried them and remarked "Wow - the bass is great, the highs are rich, and you can hear every detail in the mids." So I was thinking, if you could borrow these for 2 or 3 days and give them a thorough shakedown, I wonder how the PS1000 might sound after that.

At $350 USD you wouldn't expect a lot of refinement, but there's still a lot to like...

Jazz Casual's picture

I'd be happy to audition it Mr Thorn. I'll see what I can do. However, being a closed design I wouldn't expect it to have the clarity, airiness and grand soundstage of the PS1000.  

dalethorn's picture

Definitely not the soundstage. No comparison there. But since these properties are interrelated (soundstage does affect perception of clarity and airiness too), that might make the suggestion moot.

Jazz Casual's picture

I've found that closed-back headphones lack the openness and clarity of open-backed models, which I suppose isn't all that surprising. Grados excel at conveying this and the PS1000 even more so.  

frenchbat's picture

Great piece Tyll. Is it your "As I see it" piece ?

Anyway I surely do understand your methodology much better now. Thanks a lot.

Phos's picture

I half suspect if you were to fully disassemble the XB500 you'd find an inductor in the signal path somewhere.


Take apart the solo HD and you'd probably find a talking action figure in each cup.  

Tyll Hertsens's picture

"Take apart the solo HD and you'd probably find a gold tooth and a 40 of Old English 800 in each cup."

yuriv's picture

Sounds pretty good too. Better than the UE700 IMO. Time for a proper review?

Also, almost all cheapo IEMs have that extreme elevated bass response with the peaky treble, like you have in the last graph for the Turbines and Beats. For example, JVC, Panasonic, and Philips IEMs. A notable exception is the Monoprice MEP-933.

In some cases, there's an easy mod that fixes the response: Make the vent hole bigger until you get the bass response you want. But if you do only that, you get more noticeable treble peaks, which sound harsh. For that, place a tiny amount of acoustic dampener material in front of the nozzle opening. Tips like Comply TX400 have a wax guard that can hold the absorber in place. (Actually the TX400 by itself helps a lot.)

The result is a much, much better sounding cheap IEM. It works for the Panasonic HJE120 and the Philips 3580. I wonder what the square waves will look like? Maybe we can send you some modded ones for measurements?


Tyll Hertsens's picture

I've measured some of the UEs and heard more at RMAF last year. I thought they were remarkably good.

Sure, if you want to send me some modded cans, I'm always ready to run them through. Contact me at tyll(at)tyllhertsens.com.

donunus's picture

very cool article, among the best and most informative I've seen!

dalethorn's picture

Yes, this is the best presentation I've seen so far for interpreting square waves etc.

ClieOS's picture

Fully agree. Another excellent writeup, Tyll!

Willakan's picture

Wonderful article! I would be very interested to hear more about your rationale behind links between blips on the edges of the waveform and imaging ability.

Tyll Hertsens's picture

The primary mechanism that contributes to our ability to create an aural image is interaural time difference (ITD). The ITD is the arrival time difference between ears for an off-axis signal. For a speaker 30 degrees off-axis to the right, the left ear hears the audio 400 microseconds after the right. Your brain detects ITDs by listening to the "edges" in the sound, typically in the upper midrange and low treble. But if your headphone is adding a second edge to each feature at about the normal ITD, it's likely going to confuse your brain as it searches for exactly where the delay is. The second blip in the LCD-2 is about 300 microseconds after the initial edge, so it's right in the region of ITDs needed to build an audio image between two speakers.

I'll add that if you look at all the 300Hz square waves measured so far, the great majority do have significant features after the leading edge, and generally headphone imaging is fairly poor. I'll also mention that the HD800, a headphone that has an extraordinarily clean leading edge with few secondary features, is well known to image very well.

Hope that helps.

Willakan's picture

It did indeed. I hope that the forthcoming "Headphone Measurements Explained" maintains this level of detail throughout, because this is great.

Tyll Hertsens's picture

Wouldn't have it any other way.

It took me a week to research and write though ... I learned a hell of a lot in the process too. Unfortunately it won't do a lot for long term page views, but I think it's terribly important to improve the level of understanding among the headphone faithful. My hope with this stuff is to raise the collective wisdom of headphone enthusiast dialog, so sometimes I feel like I have to do things that don't directly increase pageviews.

You'll understand though, I hope, if these types of posts aren't quite as frequent as we all might like. 

ultrabike's picture

I see visually significant differences in the 30Hz and 300Hz SW responses between the Grados and a DT770. Same could be said about the PortaPros and the Philips L1's. More importantly, the explanations and discussion in the article regarding the audio qualities assigned to each of the characterization plots are invaluable, as they seem well correlated to the real world audio experience.

That said, some SW curves are a little more difficult for me to differentiate. I was comparing the SW responses between the HD650 and the DT880-600ohm and they seem more similar than different. They sort of seem to fall between the HD800 and the DT770. Yet, the HD650 has been described as sort of dark compared to the DT880-600 and the FR seems to back this up. Same could be said when comparing the SkullcandyHesh2 and the Noontech Zoro. Similar SW responses in my opinion, yet very different distortion and FR curves.

My comments here are not geared towards nit picking. In fact, because I'm more familiar with other characterization measurements, I have learned quite a bit from this and many other articles here at IF. I'm an avid reader of them :). My point is that, as with any characterization tool, it is important to understand its limitations, so as to not go out with a feeling that these and other measurements tell the whole story. Different sets of measurements provide us with different views, and understanding, of a system's behaviour.

I also would like to add that I very much appreciate your discussion regarding headphone FR phase impact.

Tyll Hertsens's picture

"My point is that, as with any characterization tool, it is important to understand it's limitations, so as to not go out with a feeling that these and other measurements tell the whole story. "

Absolutely. One really has to scan the whole page of graphs to get a reasonable picture. Even then it's missing things like CSD plots that are extremely valuable. 

I think one of the most valuable things measurements provide is something to have in mind when you do listening tests. You can make observations from the data and then see if you can hear what the measurements indicate. Most times you can but sometimes you can't. Listening is such a different thing than measuring, and it can be quite disorienting to try to objectively parse a subjective experience. The measurements provide a little road map for headphone evaluation, but it's in the listening we actually travel the territory.

HammerSandwich's picture

Fantastic article, Tyll.

Think I found a typo in the LCD2 section. When you mention "the 30Hz wave form" & its 2nd spike, aren't you talking about the 300Hz?

Also, do you believe that imaging is more dependent on clean transients or channel matching?  And how the hell could we test that without a zillion samples?




Tyll Hertsens's picture

Man, that little zero messes me up ... a lot. 

"Also, do you believe that imaging is more dependent on clean transients or channel matching?"

I've read studies where researchers would make a left speaker a little louder but advance the right signal in time so its signal arrived first at the ears. What they found was the interaural time difference was something like ten times more powerful than level differences in developing localization experience. So I believe clean transients are significantly more important than level matching.

AstralStorm's picture

Not to mention that while linear effects like frequency response are easy to correct, nonlinear ones like phase issues or ringing (visible on CSD) are really hard to fix, if not impossible altogether.

ultrabike's picture

FR phase and CSD issues are linear since they are derived from the impulse response through linear operators.

However, the fact that these issues are linear does not mean they are fixable. In the digital domain, impulse response issues due to zeros outside the unit circle cannot be corrected (doing so requires an unstable filter). Severe notches may not be correctable either, as the signal may be attenuated below the noise floor if not completely absent... Furthermore, the fact that a headphone's impulse response varies a bit with position complicates things.

That said, a good equalizer can go a long way in fixing some FR magnitude/phase and CSD issues.

dalethorn's picture

EQ'ing small frequency response deviations may be easy when they're small and the fix is simple, but when a bigger fix is needed the simpler fixes tend to create large narrow peaks and dips between the sliders' center frequencies. So then you get to 30 or more band equalizers and a lot of tedium. I think these big equalizers were made for loudspeakers and people with sound meters where most of the process can be automated. With headphones I don't think the fixes are easy because you really have to go by your hearing and not a meter, and you're equalizing the device and your ears at the same time.

Tyll Hertsens's picture

Much better, IMHO.

HammerSandwich's picture

Meant to ask another question: why do you consider an initial overshoot to be ideal?  Is this because the electronics send that signal, so measuring it implies an accurate headphone?  Or do you believe that characteristic indicates that the whole system is more accurate overall?

Tyll Hertsens's picture

Remember that while I have HRTF curves to compensate the frequency response, I don't have compensation information to correct the time based signals. I'm thinking that the small overshoot feature is a result of sound being modified by the pinnae. But the main reason for the observation is that when I do see a headphone that has a 300Hz square wave with a nice square leading edge without overshoot, it tends not to sound fast enough to me. So it's purely an empirical thought at the moment and I have no technical justification.

HammerSandwich's picture