New paper: Methodological challenges and solutions in auditory fMRI

Fresh off the Frontiers press: my review paper on auditory fMRI methods. There are a number of other papers on this topic, but most are more than a decade old. My goal in this paper was to give a contemporary overview of auditory fMRI and to emphasize a few points that sometimes fall by the wayside. Scanner noise is often seen as a methodological issue (and a nuisance), understandably so, but it is one that can drastically affect how we interpret results, particularly in auditory fMRI studies.

One key point is that acoustic scanner noise can affect neural activity through multiple pathways. Typically, most of the focus is placed on audibility (can subjects hear the stimuli?), followed by acknowledging a possible reduction in sensitivity in auditory regions of the brain. However, acoustic noise can also change the cognitive processes required for tasks such as speech perception. Behaviorally, there is an extensive literature showing that speech perception in quiet differs from speech perception in noise; the same is true in the scanner environment. Although we may not be able to provide optimal acoustic conditions inside a scanner, at a minimum it is useful to consider the possible impact of the acoustic challenge on observed neural responses. To me this continues to be an important point when interpreting auditory fMRI studies. I'm not convinced by the argument that because acoustic noise is present equally in all conditions we don't have to worry about it: there are good reasons to think that acoustic challenge interacts with the cognitive systems engaged.

Another point that has long been in the literature but is frequently downplayed in practice is that scanner noise appears to affect other cognitive tasks, too. So it's probably not just auditory neuroscientists who should be paying attention to the issue of acoustic noise in the scanner.

On the solution side, sparse imaging (aka "clustered volume acquisition") is at this point fairly well known. I also emphasize the benefits of ISSS (Schwarzbauer et al., 2006), a more recent approach to auditory fMRI. ISSS allows improved temporal resolution while still presenting stimuli in relative quiet, although because it produces a discontinuous timeseries of images, some care needs to be taken during analysis.
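
To make the acquisition timing concrete, here is a minimal sketch (pure NumPy, with invented timing parameters rather than values from any particular protocol or from the paper) contrasting a sparse schedule, in which a single volume follows each silent period, with an ISSS-style schedule, in which a short cluster of volumes follows each silent period:

```python
import numpy as np

# Hypothetical timing parameters (illustrative only; adjust for your protocol).
TR = 2.0           # time to acquire one volume (s)
silent_gap = 8.0   # silent period used for stimulus presentation (s)
n_trials = 10      # number of stimulus trials

# Sparse imaging: one volume acquired after each silent period.
sparse_onsets = np.arange(n_trials) * (silent_gap + TR) + silent_gap

# ISSS: several volumes acquired in rapid succession after each silent period,
# giving better temporal sampling but a discontinuous timeseries.
vols_per_cluster = 4
trial_starts = np.arange(n_trials) * (silent_gap + vols_per_cluster * TR)
isss_onsets = (trial_starts[:, None] + silent_gap
               + np.arange(vols_per_cluster) * TR).ravel()

print("Sparse volume onsets (s):", sparse_onsets[:3])
print("ISSS volume onsets (s): ", isss_onsets[:8])
```

The gaps between clusters in the ISSS schedule are what make the resulting timeseries discontinuous, which is why standard continuous-sampling assumptions (for example, in temporal filtering or model setup) need to be handled with care.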

It's clear that if we care about auditory processing, scanner noise will always be a challenge. However, I'm optimistic that with some increased attention to the issue and striving to understand the effects of scanner noise rather than ignore them, things will only get better. To quote the last line of the paper: "It is an exciting time for auditory neuroscience, and continuing technical and methodological advances suggest an even brighter (though hopefully quieter) future."

[As a side note, I'm also happy to publish in the "Brain Imaging Methods" section of Frontiers. I wish it had its own title, but it's subsumed under the Frontiers in Neuroscience journal for citation purposes.]


References:

Peelle JE (2014) Methodological challenges and solutions in auditory functional magnetic resonance imaging. Frontiers in Neuroscience 8:253. http://journal.frontiersin.org/Journal/10.3389/fnins.2014.00253/abstract

Schwarzbauer C, Davis MH, Rodd JM, Johnsrude I (2006) Interleaved silent steady state (ISSS) imaging: A new sparse imaging method applied to auditory fMRI. NeuroImage 29:774-782. http://dx.doi.org/10.1016/j.neuroimage.2005.08.025

New paper: Listening effort and accented speech

Out now in Frontiers: a short opinion piece on listening effort and accented speech, written in collaboration with Wash U colleague Kristin Van Engen. The crux of the article is that there is increasing agreement that listening to degraded speech requires listeners to engage additional cognitive processes, grouped under the generic label of "listening effort". Listening effort is typically discussed in terms of hearing impairment or background noise, both of which obscure acoustic features in the speech signal and make it more difficult to understand. In this paper Kristin and I argue that accented speech is also difficult to understand, and should be thought of in a similar context.

We have tried to frame these issues in a general way that incorporates multiple kinds of acoustic challenge. That is, the degree to which the incoming speech signal does not match our stored representations determines the amount of cognitive support needed. This mismatch could come from background noise, or from the systematic phonemic or suprasegmental deviations associated with accented speech. A related point is that comprehension accuracy depends on both the quality of the incoming acoustic signal and the amount of additional cognitive support a listener allocates: degraded or accented speech may be perfectly intelligible if sufficient cognitive resources are available (and engaged).

Figure 1. (A) Speech signals that match listeners' perceptual expectations are processed relatively automatically, but when acoustic match is reduced (due to, for example, noise or unfamiliar accents), additional cognitive resources are needed to compensate. (B) Executive resources are recruited in proportion to the degree of acoustic mismatch between incoming speech and listeners' representations. When acoustic match is high, good comprehension is possible without executive support. However, as the acoustic match becomes poorer, successful comprehension cannot be accomplished unless executive resources are engaged. Not shown is the extreme situation in which the acoustic match is so poor that comprehension is impossible.
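
As a toy illustration of the relationship sketched in panel B (the numbers and functional form here are invented for illustration and are not taken from the paper), comprehension can be thought of as depending on both acoustic match and whether executive resources are engaged:

```python
import numpy as np

# Toy model (invented values): comprehension as a function of acoustic match,
# with and without engagement of executive resources.
acoustic_match = np.linspace(0, 1, 6)  # 0 = severe mismatch, 1 = perfect match

def comprehension(match, executive_support):
    """Logistic toy function: support shifts the curve so poorer signals remain intelligible."""
    boost = 0.3 * executive_support    # arbitrary effect size for illustration
    return 1 / (1 + np.exp(-12 * (match + boost - 0.5)))

print("match       :", np.round(acoustic_match, 2))
print("no support  :", np.round(comprehension(acoustic_match, 0.0), 2))
print("with support:", np.round(comprehension(acoustic_match, 1.0), 2))
```

In this sketch, high acoustic match yields good comprehension with or without support, intermediate levels of match are rescued by engaging executive resources, and extreme mismatch remains poor regardless.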

I like this article because it raises a number of interesting questions that can be experimentally tested. One of the big ones is the degree to which the type of acoustic mismatch matters: that is, are similar cognitive processes engaged when speech is degraded due to background noise as when an unfamiliar accent reduces intelligibility? My instinct says yes, but I wouldn't bet on it until more data are in.

Reference:

Van Engen KJ, Peelle JE (2014) Listening effort and accented speech. Front Hum Neurosci 8:577. http://journal.frontiersin.org/Journal/10.3389/fnhum.2014.00577/full


New paper: Relating brain anatomy and behavior (Cook et al.)

I'm happy to report that a collaborative project from Penn is now published, spearheaded by Phil Cook. In this paper we explored combining approaches in order to relate individual differences in gray and white matter to behavioral performance. Phil implemented two core steps in a group of participants that included frontotemporal dementia (FTD) patients and healthy older adults. For each participant we had T1- and diffusion-weighted structural images, providing cortical thickness and fractional anisotropy (FA) measurements. We also had a set of behavioral measures that included category fluency ("Name as many animals as you can in 30 seconds") and letter fluency ("Say as many words beginning with the letter F as you can in 30 seconds").

Regions of interest for cortical thickness (top row) and FA (bottom row) defined by eigenanatomy.

Phil first used eigenanatomy to define regions of interest (ROIs) for the gray matter images. Eigenanatomy is a dimensionality reduction scheme that identifies sets of voxels that covary across individuals; ROIs are chosen to maximally explain variance in the dataset.
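
The real eigenanatomy implementation has its own constraints (for example, sparse and anatomically sensible components) that the following does not reproduce, but the basic idea of decomposing a participants-by-voxels matrix into a small number of covarying voxel sets can be sketched with an off-the-shelf sparse decomposition. Everything below is simulated and only meant to convey the flavor of the approach:

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(0)
n_subjects, n_voxels = 40, 500

# Simulated cortical thickness data (participants x voxels), with two groups of
# voxels made to covary across participants so there is structure to recover.
thickness = rng.normal(loc=2.5, scale=0.3, size=(n_subjects, n_voxels))
shared = rng.normal(size=(n_subjects, 1))
thickness[:, :50] += 1.0 * shared       # "region" 1 covaries across participants
thickness[:, 100:150] -= 1.0 * shared   # "region" 2, anticorrelated with region 1

# Sparse decomposition: each component loads on a subset of covarying voxels,
# which is the general spirit of eigenanatomy's data-driven ROIs.
decomp = SparsePCA(n_components=5, alpha=0.5, random_state=0)
subject_scores = decomp.fit_transform(thickness)  # one value per participant per component
voxel_loadings = decomp.components_               # which voxels contribute to each component

print(subject_scores.shape)               # (40, 5)
print((voxel_loadings != 0).sum(axis=1))  # number of voxels with nonzero loadings per component
```

The per-subject component scores play the role of ROI summary measures that can then be carried forward into behavioral models.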

The second step is my favorite aspect of this work, and can be implemented regardless of how the ROIs are defined. Phil used a model selection procedure implemented in R to assess which combination of ROIs best predicted behavior, evaluating candidate models with cross-validation and AIC. The elegant thing about this approach is that it incorporates both gray matter and white matter predictors in the same framework; thus, the model selection procedure can tell you whether gray matter alone, white matter alone, or some combination of the two best explains the behavioral data.
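
The published analysis was done in R; purely as an illustration, here is a rough Python sketch of the same general idea, using simulated ROI values and hypothetical variable names, exhaustively comparing predictor subsets with cross-validation and AIC:

```python
import itertools
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 60  # number of participants (simulated)

# Simulated ROI summaries standing in for the real data: cortical thickness
# (gray matter) and FA (white matter) values, plus a toy fluency score.
predictors = {f"thickness_roi{i}": rng.normal(2.5, 0.3, n) for i in range(3)}
predictors.update({f"fa_roi{i}": rng.normal(0.45, 0.05, n) for i in range(3)})
fluency = (2.0 * predictors["thickness_roi0"] + 30.0 * predictors["fa_roi1"]
           + rng.normal(0.0, 1.0, n))

def aic(model, X, y):
    """AIC (up to an additive constant) for an ordinary least squares fit."""
    resid = y - model.predict(X)
    k = X.shape[1] + 1  # coefficients plus intercept
    return len(y) * np.log(np.mean(resid ** 2)) + 2 * k

results = []
for size in range(1, len(predictors) + 1):
    for subset in itertools.combinations(predictors, size):
        X = np.column_stack([predictors[name] for name in subset])
        cv_r2 = cross_val_score(LinearRegression(), X, fluency, cv=5).mean()
        fit = LinearRegression().fit(X, fluency)
        results.append((subset, cv_r2, aic(fit, X, fluency)))

best_subset, best_r2, best_aic = max(results, key=lambda r: r[1])
print(f"Best cross-validated model: {best_subset} (mean R^2 = {best_r2:.2f})")
```

Because the gray matter (thickness) and white matter (FA) ROIs enter as ordinary columns in the design matrix, the subset search naturally covers gray-only, white-only, and combined models within the same framework.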

Cross-validation model selection suggests that including both gray and white matter predictors (black circles) results in significantly better performance than any single modality, and that 4 regions provide the best predictions.

ROIs significantly associated with verbal fluency.

Perhaps not surprisingly, combining gray matter and white matter was consistently better than using either modality alone, as one might expect from a cortical system composed of multiple regions connected by white matter tracts. It is encouraging that the regions identified are sensible in the context of semantic storage and retrieval during category fluency.

More importantly, the approach that Phil put together tackles the larger problem of how to combine data from multiple modalities in a quantitative, model-driven approach. I hope that we see more studies that follow a similar approach.

Reference:

Cook PA, McMillan CT, Avants BB, Peelle JE, Gee JC, Grossman M (2014) Relating brain anatomy and cognitive ability using a multivariate multimodal framework. NeuroImage 99:477-486. http://dx.doi.org/10.1016/j.neuroimage.2014.05.008 (PDF)