AES Headphone Technology Conference: Sensory Profiling

The image above shows an organized set of words used in sensory profiling to describe subjectively sensed audio. More about the "Sound Wheel" in this article.

Sensory profiling is the study of how we experience things...headphones, in this case. There were many papers on the subject; here are a few just to whet your appetite.

Descriptive Analysis of Binaural Rendering with Virtual Loudspeakers Using a Rate-All-That-Apply Approach

[Image: AESHeadphoneConference_SenoryProfiling_Photo_Speakers]

Spatial audio content for headphones is often created using binaural rendering of a virtual loudspeaker array. It is important to understand the effect of this choice on the sound quality. A sensory profiling evaluation was used to assess the perceived differences between direct binaural rendering and virtual loudspeaker rendering of a single sound source with and without head tracking and using anechoic and reverberant binaural impulse responses. A subset of the Spatial Audio Quality Inventory (SAQI) was used. Listeners first selected only attributes that they felt applied to the given stimuli. Initial analysis shows that tone colour and source direction are most affected by the use of this technique, but source extent, distance, and externalisation are also affected. Further work is required to analyse the sparse attribute rating data in depth.

The most common way to place a trumpet in a stereo image left-to-right is by varying its signal level between the left and right speaker, that is, by panning it. The current paper looks at whether there's a difference between creating two virtual speakers in headphones and panning a signal left and right versus using a binaural rendering of the sound source and moving it left-to-right (virtually synthesizing a single source and then moving its position with HRTFs). Researchers also investigated which perceived attributes were most sensitive to the two conditions.
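To make the two conditions a bit more concrete, here is a minimal sketch of the difference. It is not the researchers' code: the sine/cosine (constant-power) pan law is my assumption, since the paper summary doesn't say which pan law was used, and the function names and HRIR arguments (head-related impulse responses, given as a left-ear/right-ear pair of equal-length lists) are hypothetical.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for position in [-1, 1].

    -1 is hard left, +1 is hard right. Assumes a sine/cosine pan law,
    which keeps left_gain**2 + right_gain**2 equal to 1.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1 and 1")
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def convolve(signal, impulse_response):
    """Plain direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_virtual_speakers(signal, position, left_speaker_hrir, right_speaker_hrir):
    """Pan the signal between two fixed virtual speakers.

    Each *_hrir argument is a hypothetical (left_ear, right_ear) pair of
    impulse responses; all four responses are assumed to be the same length.
    """
    gain_l, gain_r = constant_power_pan(position)
    ears = []
    for ear in (0, 1):
        from_left = convolve(signal, left_speaker_hrir[ear])
        from_right = convolve(signal, right_speaker_hrir[ear])
        ears.append([gain_l * a + gain_r * b for a, b in zip(from_left, from_right)])
    return ears[0], ears[1]


def render_direct(signal, source_hrir):
    """Render the source with the HRIR pair for its own direction."""
    return convolve(signal, source_hrir[0]), convolve(signal, source_hrir[1])
```

In the virtual-speaker case every source position is a weighted mix of the same two speaker responses; in the direct case the response itself changes with the source direction, which is where differences such as tonal coloration could plausibly creep in.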

The table above shows how many times listeners chose a particular sound attribute to distinguish between the virtual speaker display and the binaural display, both with head-tracking and without. The higher up an attribute is in the table, the more often it was useful. Researchers found:

  • It was very clear that there is a difference between panning a sound between two virtual speakers and moving a virtual sound (binaural rendering) left-to-right.
  • Tonal coloration was the most commonly reported difference, but the study was not designed to determine which was tonally more correct. Comb filter effects were the second most commonly described tonality differentiator, but came in a distant seventh as a discrimination tool.
  • Horizontal position was a strong discriminator, but interestingly was more powerful without head tracking. Participants commonly remarked the sound source appeared to move with head movement; researchers feel this is likely due to generic HRTFs being used rather than participant-specific HRTFs.
  • The perceived sound width was consistently wider, and the perceived distance was reduced, for the virtual loudspeaker presentation. This was particularly true when head tracking was used.
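For the curious, the kind of tally behind a table like the one described above can be sketched in a few lines. This is only an illustration of counting how often each attribute was selected; the function name and the input layout (one list of selected attribute names per listener response) are my assumptions, not the paper's actual data format.

```python
from collections import Counter


def tally_selected_attributes(responses):
    """Count how many responses selected each attribute.

    responses: an iterable of lists, each holding the attribute names a
    listener ticked for one stimulus (hypothetical layout).
    Returns (attribute, count) pairs, most selected first; ties are
    broken alphabetically so the order is repeatable.
    """
    counts = Counter()
    for selected in responses:
        # set() guards against the same attribute being listed twice
        # within a single response.
        counts.update(set(selected))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

Attributes near the top of the returned list are the ones listeners reached for most often when telling the two renderings apart.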

The message I get from this research is that it's better to synthesize each sound in a particular position rather than synthesize sounds in space by panning them between a fixed number of virtual speakers. What that means to me is that we're likely to see headphone virtual audio completely leap-frog the various surround standards and go directly to a completely virtualized audio experience with each sound coming from its own particular direction.

I suspect this will largely be driven by game manufacturers, and that youngsters will come to expect that level of binaural rendering in all their media consumption, putting pressure on music sellers to deliver more immersive experiences. As an example: Record each instrument separately, and then place them in virtual space as they would be in real space.

I'm afraid virtual audio of the future will begin to reduce the number of music recordings that are native two-channel recordings. In future, an increasing amount of two-channel audio content will be down-mixed from far more complex virtual audio soundscapes.

Modeling Perceptual Characteristics of Prototype Headphones
[Image: AESHeadphoneConference_SenoryProfiling_Photo_Perceptions]

This study tested a framework for modelling of sensory descriptors (words) differentiating headphones. Six descriptors were included in a listening test with recordings of the sound reproductions of seven prototype headphones. A comprehensive data quality analysis investigated both the performance of the listeners and the suitability of the descriptors for modelling. Additionally, two strategies were investigated for modelling metrics describing these descriptors, both relying on specific loudness estimations of the test stimuli. The stability of the initially found metrics was tested with a bootstrap procedure to quantify the potential of the metrics for future predictions within the perceptual space spanned by the headphones. The most promising results were metrics for Bass, Clean and Dark-Bright with correlation values of r^2 = 0.62, r^2 = 0.58, and r^2 = 0.90 respectively.
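The abstract mentions r^2 values and a bootstrap stability check without spelling out the procedure. As a rough illustration only, here is one common way to do that kind of check: compute r^2 between a metric and the listener ratings, then resample the headphone/rating pairs with replacement many times and see how much r^2 moves around. The function names, the percentile interval, and the handling of degenerate resamples are my assumptions, not the authors' method.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # No variation in one variable: correlation is undefined, so this
        # sketch reports 0.0 (a choice made here for simplicity).
        return 0.0
    return sxy * sxy / (sxx * syy)


def bootstrap_r_squared(x, y, n_resamples=1000, seed=0):
    """Return an approximate 95% percentile interval for r^2.

    Pairs (x[i], y[i]) are resampled with replacement. A wide interval
    suggests the metric is not stable across the headphones tested.
    """
    rng = random.Random(seed)
    n = len(x)
    values = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(r_squared([x[i] for i in idx], [y[i] for i in idx]))
    values.sort()
    lower = values[int(0.025 * (n_resamples - 1))]
    upper = values[int(0.975 * (n_resamples - 1))]
    return lower, upper
```

With only a handful of headphones in a test, an interval like this tends to be wide, which is presumably why the authors bothered to check stability at all.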

Here again we see a paper trying to narrow down the number of characteristics needed for subjective testing. It's filled with the complexities of having 18 subjects rate nine headphones, using four different pieces of music, on six sonic attributes: bass strength; midrange strength; treble strength; dark-bright; clean; and punch. Various methods are used to parse the strength of, and the correlation between, the various sonic attributes.

Punch turns out to be a rather pesky attribute, mainly because people seemed to have different perceptual definitions for it. Dark-bright appeared to correlate more strongly with treble strength than with bass strength. The midrange and treble strength metrics appeared to be unstable descriptive attributes. The descriptive attributes that appeared most stable and free from inter-correlation were: bass strength; clean; and dark-bright.

Your take-away thought: If you're a headphone enthusiast, one of the most important things you can do when you listen to headphones is remember what they sound like. It is widely known that our acoustic memory is frustratingly short. This paper suggests the simplest, most descriptive, and reasonably accurate way to describe and remember headphones would be to get a good grasp of their bass strength, how clean they sound, and whether you found them generally dark or bright.

Analysis of Subjective Evaluation of User Experience with Headphones
[Image: AESHeadphoneConference_SenoryProfiling_Photo_UX]

The aspects of what provides a good user experience with headphones is initially investigated by an exploratory study (experiment I). Using KJ-Technique, 5 workshop teams of 4-6 participants each provide a number of aspects influencing their experience with headphones. Analysing the aspects for uniqueness and relatedness provides 144 aspects of user experience with headphones, arranged in 12 categories. The 144 influencing aspects from experiment I are condensed, and 24 attributes regarding user experience with headphones are selected. These attributes are tested in regard to their correlation with and effects on overall evaluation of headphones in a second experiment, thus investigating which attributes are most influential for user experience. Using a within-subject design, eight different headphones are evaluated according to the attributes along with an overall evaluation. The attributes are listed in the following categories: sound quality, comfort, build quality, design and brand. A factor analysis shows that the categories fit the attributes. Furthermore, some attributes show high correlations with the overall evaluation, suggesting that these attributes are important for user experience with headphones. The highest rated attributes are shape, design, quality of contact surfaces, comfort, goodness of fit and build quality. An interpretation of which attributes are the most influential in relation to user experience with headphones is discussed.

Sigh...this one's a bit depressing.

The paper sets out to determine two things: which headphone characteristics people believe are most important when choosing headphones; and which characteristics actually matter most when they choose a headphone. No, sadly, they're not the same.

It seems to me from my reading of the headphone enthusiast community's postings that there is general agreement that sound quality is most important, with comfort coming in a close second. After that it gets a bit more difficult, but I'd guess build quality beats out styling for third. From the image above, the untrained consumers in these trials have similar tastes, though isolation ranks high. That's a characteristic that we hobbyists probably don't think about too much because we're willing to buy more than one headphone to cover our isolation requirements when needed.

The sad part is this study shows that when participants were actually given a number of headphones to experience and rate their satisfaction, the characteristics by which they made their choices didn't line up at all with their stated preferences.

[Image: AESHeadphoneConference_SenoryProfiling_Photo_UX2]

As you can see from the table above, sound quality came in a distant seventh! Look-and-feel attributes took the top three spots as influencers of consumer satisfaction.

Great. We've got our work cut out for us. Remember to freely give good advice to folks on sound quality: They think it's important, so they'll appreciate it. But they may forget when faced with cool-looking cans at the store, so it's good they've had a personal touch on sound quality. Let's keep reminding them about it.

A Comparison of Sensory Profiles of Headphones Using Real Devices and HATS Recordings
[Image: AESHeadphoneConference_SenoryProfiling_Photo_HATS]

The above plot shows significant subjective evaluation differences for the sonic attribute "clean" between HATS-simulated headphones (blue) and the real headphones (red).

This study compares two sets of sensory profiles of eight headphones, obtained in two experiments, with the intent of revealing the differences and/or limitations of these methods: The first experiment used a double blind approach with headphone auralizations and in the second experiment assessors listened to the actual headphones, as a non-blind experiment. The results of each experiment are analyzed and compared to reveal the differences, and causes for these differences, for each attribute.

I've experienced the demo of Harman's Headphone Virtualizer App and felt it pretty representative of the cans in the demo. But that was one guy on one day. Turns out it might be a difficult thing to do reliably.

The present paper, done by the folks at DELTA SenseLab (the folks that brought you the Sound Wheel), gave it a shot and found it quite difficult to get reliable results. The idea here was to evaluate (non-blind) eight real headphones, and compare that evaluation with synthesized headphones using head and torso simulator (HATS) recordings of the same headphones played back on an equalized reference headphone (HD 650).

The study showed this is a task fraught with problems. Likely causes of the problems were identified by sonic characteristic:

  • Bass Strength - Poor ear pad seals on the sealed headphones in the test (all but one), whether on the human heads or on the HATS, caused significant problems in obtaining reliable data.
  • Treble Strength - This area had two issues. Scale compression (where characteristics seem more extreme in one method of testing than the other) may have occurred because of the time it takes to switch real headphones: acoustic memory fades and differences seem smaller. Differences in bass strength (especially with the bass problems) also skewed the perceived treble differences.
  • Externalization - Apparently, subjects hear the actual real headphone as sounding much "cleaner" than one that was recorded through a dummy's ears and then run through many complex DSP filters. Hm, who'd have thunk it. :)
    • Scale compression also played a role due to the HD 650 chosen. I'm not really sure I understand it, but I'd bet it was the best headphone in the test. Bottom line: Faking headphones is hard.


COMMENTS

This list rating of consumer preferences matches rather well with consumer choices in Car Showrooms.

Funny how I have my Schiit stuff mounted well under my desktop, yet a Glass Sculpture Woo Amp sits prominently ( and is gorgeous looking ).

So, although I'd hate to admit it, looks trump SQ! And the quirky looks of the new Focal stuff are off-putting. I'll still own em though, I'm an Audiophile that admires transducer design.

It helps explain Schiit sort-of apologizing for their boxes and ranting against the beautiful chassis of the pricy stuff.

Of course, Audiophiles are supposed to be above choosing based on appearances but we aren't!

Lesson learned, I'll get that haircut my wife has been complaining about lately!

Tony in Beautiful Michigan