New paper: Quantifying subjective effort during listening

When we have to understand challenging speech - for example, speech in background noise - we “work harder” while listening. A recurring challenge in this area of research is how exactly to quantify this additional cognitive effort. One approach that has been used is self-report: simply asking people to rate on a scale how difficult a listening situation was.

Self-report measures can be challenging because they rely on meta-linguistic decisions: that is, I am asking you, as a listener, to have some insight into how difficult something was for you. It could be that people vary in how they map a given level of difficulty onto a number, and so even though two people’s brains might have been doing the same thing, the extra step of having to assign a number might produce different self-report ratings. Because of this and other challenges, other measures have also been used, including physiological measures like pupil dilation that do not rely on a listener’s decision-making ability.

At the same time, subjective effort is an important measure that likely provides information above and beyond what is captured by physiological measures. For example, personality traits might affect how much challenge a person is subjectively experiencing during listening, and a person’s subjective experience (however closely it ties to physiological measures) is probably what will determine their behavior. So, it would be useful to have a measure of subjective listening difficulty that did not rely on a listener’s overt judgments about listening.

Drew McLaughlin developed exactly such a task, building on elegant work in non-speech domains (McLaughlin et al., 2021). The approach uses a discounting paradigm borrowed from behavioral economics. Listeners are presented with speech at different levels of noise (some easy, some moderate, some difficult). Once they understand how difficult the various conditions are, on every trial they are given a choice: perform an easier trial for less money, or a more difficult trial for more money (for example: I’ll give you $1.50 to listen to this easy sentence or $2.00 to listen to this hard sentence). We can then use the difference in reward to quantify the additional “cost” of a difficult trial. If I am equally likely to do an easy trial for $1.75 or a hard trial for $2.00, then I am “discounting” the value of the hard trial by $0.25.
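The arithmetic above can be sketched in a few lines of Python. This is my own simplified illustration, not the paper's actual analysis code; the function names and the linear interpolation used to find the indifference point are hypothetical:

```python
def subjective_cost(indifference_reward_easy, reward_hard):
    """Discount: how much extra a listener must be paid to accept the hard trial.

    The indifference point is the easy-trial reward at which the listener
    chooses the easy and hard options equally often.
    """
    return reward_hard - indifference_reward_easy


def indifference_point(easy_offers, p_choose_easy):
    """Linearly interpolate the easy-trial offer at which P(choose easy) = 0.5.

    easy_offers: increasing list of easy-trial rewards offered
    p_choose_easy: observed proportion of easy choices at each offer
    """
    for (o1, p1), (o2, p2) in zip(zip(easy_offers, p_choose_easy),
                                  zip(easy_offers[1:], p_choose_easy[1:])):
        if p1 <= 0.5 <= p2:
            return o1 + (0.5 - p1) * (o2 - o1) / (p2 - p1)
    raise ValueError("0.5 is not bracketed by the observed choice proportions")


# Example from the text: indifferent between $1.75 (easy) and $2.00 (hard),
# so the hard trial is discounted by $0.25.
cost = subjective_cost(1.75, 2.00)
print(f"${cost:.2f}")  # prints $0.25
```

In practice the indifference point would be estimated from many choice trials per listener; the interpolation here just shows the logic of reducing those choices to a single dollar-valued "cost".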

[Figure: discounting.png]

We found that all listeners showed discounting at more difficult listening levels, with older adults showing more discounting than young adults. This is consistent with what we know about age-related changes in hearing and cognitive abilities important for speech that would lead us to expect greater effort (and thus more discounting) in older adults.

To complement these group analyses, we also performed some exploratory correlations with working memory and hearing scores. For the older adults, we found that listeners with better working memory showed less discounting (that is, found the noisy speech easier), whereas listeners with poorer hearing showed more discounting (found the task harder). We also looked at a hearing handicap index - a questionnaire assessing subjective hearing and communication function - which likewise correlated with discounting.

[Figure: individual_differences.png]

I’m really excited about this approach because it provides a way to quantify subjective effort without directly asking participants to rate their own effort. There is certainly no single bulletproof measure of cognitive effort during listening, but we hope this will be a useful tool that provides some unique information.

Reference

McLaughlin DJ, Braver TS, Peelle JE (2021) Measuring the subjective cost of listening effort using a discounting task. Journal of Speech, Language, and Hearing Research 64:337–347. doi:10.1044/2020_JSLHR-20-00086

New paper: Acoustic richness modulates networks involved in speech comprehension (Lee et al.)

Many functional imaging studies have investigated the brain networks responding to intelligible speech. Far fewer have looked at how the brain responds to speech that is acoustically degraded, but remains intelligible. This type of speech is particularly interesting, because as listeners we are frequently in the position of hearing unclear speech that we nevertheless understand—a situation even more common for people with hearing aids or cochlear implants. Does the brain care about acoustic clarity when speech is fully intelligible?

We address this question in our new paper now out in Hearing Research (Lee et al., 2016), in which we played listeners short sentences that varied in both syntactic complexity and acoustic clarity (normal speech vs. 24-channel vocoded speech). We used an ISSS fMRI sequence (Schwarzbauer et al., 2006) to collect data, allowing us to present the sentences with reduced acoustic noise while still obtaining relatively good temporal resolution (Peelle, 2014).

In response to syntactically complex sentences, listeners showed increased activity in large regions of left-lateralized frontoparietal cortex. This finding was expected given previous results from our group and others. In contrast, most of the differences related to acoustic clarity reflected greater activity for the acoustically detailed, normal speech. Although this was somewhat unexpected, as many studies show an increased response for degraded speech relative to clear speech, we have some ideas about what might explain our result:

  1. Studies finding degradation-related increases frequently also involve a loss of intelligibility;
  2. We did see some areas of increased activity for degraded speech; they were simply smaller in extent than the increases for normal speech;
  3. We used noise vocoding to manipulate the acoustic clarity of the speech signal, which reduced cues to the sex, age, emotion, and other characteristics of the speaker.

These results continue an interesting line of work (Obleser et al., 2011) looking at the role of acoustic detail apart from intelligibility. This ties in to prosody and other aspects of spoken communication that go beyond the identity of the words being spoken (McGettigan, 2015).

Overall, we think our finding that large portions of the brain show less activation when less information is available is not as surprising as it seems, and it is extremely relevant for patients with hearing loss or those using an assistive device.

Finally, I'm very happy that we've made the unthresholded statistical maps available on neurovault.org, which is a fantastic resource. Hopefully we'll see more brain imaging data deposited there (from our lab, and others!).

References:

Lee Y-S, Min NE, Wingfield A, Grossman M, Peelle JE (2016) Acoustic richness modulates the neural networks supporting intelligible speech processing. Hearing Research 333:108-117. doi:10.1016/j.heares.2015.12.008

McGettigan C (2015) The social life of voices: Studying the neural bases for the expression and perception of the self and others during spoken communication. Front Hum Neurosci 9:129. doi:10.3389/fnhum.2015.00129

Obleser J, Meyer L, Friederici AD (2011) Dynamic assignment of neural resources in auditory comprehension of complex sentences. NeuroImage 56:2310-2320. doi:10.1016/j.neuroimage.2011.03.035

Peelle JE (2014) Methodological challenges and solutions in auditory functional magnetic resonance imaging. Front Neurosci 8:253. doi: 10.3389/fnins.2014.00253

Schwarzbauer C, Davis MH, Rodd JM, Johnsrude I (2006) Interleaved silent steady state (ISSS) imaging: A new sparse imaging method applied to auditory fMRI. NeuroImage 29:774-782. doi:10.1016/j.neuroimage.2005.08.025

New paper: Methodological challenges and solutions in auditory fMRI

Fresh off the Frontiers press, my review paper on auditory fMRI methods. There are a number of other papers on this topic, but most are more than a decade old. My goal in this paper was to give a contemporary overview of the current state of auditory fMRI, and emphasize a few points that sometimes fall by the wayside. Scanner noise is often seen as a methodological issue (and a nuisance)—and understandably so—but it's one that can drastically impact our interpretation of results, particularly for auditory fMRI studies.

One key point is that acoustic scanner noise can affect neural activity through multiple pathways. Typically, most attention is paid to audibility (can subjects hear the stimuli?), followed by an acknowledgment that sensitivity in auditory regions of the brain may be reduced. However, acoustic noise can also change the cognitive processes required for tasks such as speech perception. Behaviorally, there is an extensive literature showing that speech perception in quiet differs from speech perception in noise; the same is true in the scanner environment. Although we may not be able to provide optimal acoustic conditions inside a scanner, at a minimum it is useful to consider the possible impact of the acoustic challenge on observed neural responses. To me this continues to be an important point when interpreting auditory fMRI studies. I'm not convinced by the argument that because acoustic noise is present equally in all conditions, we don't have to worry about it—there are good reasons to think that acoustic challenge interacts with the cognitive systems engaged.

Another point that has long been in the literature but is frequently downplayed in practice is that scanner noise appears to impact other cognitive tasks, too—so it's probably not just auditory neuroscientists who should be paying attention to acoustic noise in the scanner.

On the solution side, at this point sparse imaging (aka "clustered volume acquisition") is fairly well known. I also emphasize the benefits of ISSS (Schwarzbauer et al., 2006), a more recent approach to auditory fMRI. ISSS allows improved temporal resolution while still presenting stimuli in relative quiet, although because it produces a discontinuous timeseries of images, some care needs to be taken during analysis.
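To make the "discontinuous timeseries" point concrete, here is a minimal sketch of when volumes are actually acquired in an ISSS-style design. The timing values are made up for illustration; they are not the parameters from Schwarzbauer et al. or from any particular study:

```python
import numpy as np

# Hypothetical ISSS-style timing: each trial begins with a silent gap for
# stimulus presentation, followed by a burst of rapidly acquired volumes.
TR = 2.0            # seconds per acquired volume
silent_gap = 4.0    # seconds of silence at the start of each trial
vols_per_trial = 5  # volumes acquired after each silent gap
n_trials = 3

trial_duration = silent_gap + vols_per_trial * TR

# Onset time of every acquired volume. Note the jumps between trials: the
# timeseries is discontinuous, so an analysis should model events at these
# actual acquisition times rather than assuming one continuous TR grid.
acquisition_times = np.concatenate([
    trial * trial_duration + silent_gap + TR * np.arange(vols_per_trial)
    for trial in range(n_trials)
])

print(acquisition_times)
# Within a trial, volumes are TR (2 s) apart; between the last volume of one
# trial and the first of the next, the gap is silent_gap + TR (6 s).
```

Treating these volumes as if they were evenly spaced would misalign the hemodynamic model with the data, which is the kind of analysis care the paragraph above alludes to.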

It's clear that if we care about auditory processing, scanner noise will always be a challenge. However, I'm optimistic that with some increased attention to the issue and striving to understand the effects of scanner noise rather than ignore them, things will only get better. To quote the last line of the paper: "It is an exciting time for auditory neuroscience, and continuing technical and methodological advances suggest an even brighter (though hopefully quieter) future."

[As a side note I'm also happy to publish in the "Brain Imaging Methods" section of Frontiers. I wish it had its own title, but it's subsumed in the Frontiers in Neuroscience journal for citation purposes.]


References:

Peelle JE (2014) Methodological challenges and solutions in auditory functional magnetic resonance imaging. Frontiers in Neuroscience 8:253. doi:10.3389/fnins.2014.00253

Schwarzbauer C, Davis MH, Rodd JM, Johnsrude I (2006) Interleaved silent steady state (ISSS) imaging: A new sparse imaging method applied to auditory fMRI. NeuroImage 29:774-782. doi:10.1016/j.neuroimage.2005.08.025

New paper: Listening effort and accented speech

Out now in Frontiers: a short opinion piece on listening effort and accented speech, written in collaboration with Wash U colleague Kristin Van Engen. The crux of the article is that there is increasing agreement that listening to degraded speech requires listeners to engage additional cognitive processes, grouped under the generic label of "listening effort". Listening effort is typically discussed in terms of hearing impairment or background noise, both of which obscure acoustic features in the speech signal and make speech more difficult to understand. In this paper Kristin and I argue that accented speech is also difficult to understand, and should be thought of in a similar context.

We have tried to frame these issues in a general way that incorporates multiple kinds of acoustic challenge. That is, the degree to which the incoming speech signal does not match our stored representations determines the amount of cognitive support needed. This mismatch could come from background noise, or from systematic phonemic or suprasegmental deviations associated with accented speech. A related point is that comprehension accuracy depends both on the quality of the incoming acoustic signal, and the amount of additional cognitive support a listener allocates: Degraded or accented speech may be perfectly intelligible if sufficient cognitive resources are available (and engaged).

Figure 1. (A) Speech signals that match listeners' perceptual expectations are processed relatively automatically, but when acoustic match is reduced (due to, for example, noise or unfamiliar accents), additional cognitive resources are needed to compensate. (B) Executive resources are recruited in proportion to the degree of acoustic mismatch between incoming speech and listeners' representations. When acoustic match is high, good comprehension is possible without executive support. However, as the acoustic match becomes poorer, successful comprehension cannot be accomplished unless executive resources are engaged. Not shown is the extreme situation in which acoustic mismatch is so severe that comprehension is impossible.
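As a toy illustration of the framework in Figure 1 (my own simplification, not a model from the paper, and with entirely hypothetical parameters): comprehension succeeds when acoustic match plus engaged executive resources clears a threshold, except in the extreme case where the signal is too impoverished for any amount of effort to help:

```python
def comprehension_succeeds(acoustic_match, executive_resources,
                           threshold=1.0, floor=0.2):
    """Toy model of the Figure 1 framework; all parameters are made up.

    acoustic_match: 0 (no match to stored representations) to 1 (perfect match)
    executive_resources: amount of cognitive support engaged (>= 0)
    """
    if acoustic_match < floor:
        return False  # extreme mismatch: no amount of effort helps
    return acoustic_match + executive_resources >= threshold

# Clear speech: succeeds without executive support.
print(comprehension_succeeds(1.0, 0.0))   # True
# Moderately degraded speech: succeeds only if resources are engaged.
print(comprehension_succeeds(0.6, 0.0))   # False
print(comprehension_succeeds(0.6, 0.5))   # True
# Severe mismatch: fails regardless of effort.
print(comprehension_succeeds(0.1, 10.0))  # False
```

The point of the sketch is only the qualitative trade-off the figure describes: the poorer the acoustic match, the more executive support is needed, up to a floor below which comprehension fails outright.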

I like this article because it raises a number of interesting questions that can be experimentally tested. One of the big ones is the degree to which the type of acoustic mismatch matters: that is, are similar cognitive processes engaged when speech is degraded due to background noise as when an unfamiliar accent reduces intelligibility? My instinct says yes, but I wouldn't bet on it until more data are in.

Reference:

Van Engen KJ, Peelle JE (2014) Listening effort and accented speech. Front Hum Neurosci 8:577. doi:10.3389/fnhum.2014.00577