Processing of speech and non-speech sounds in the supratemporal plane: auditory input preference does not predict sensitivity to statistical structure

The supratemporal plane contains several functionally heterogeneous subregions that respond strongly to speech. Much of the prior work on speech processing in the supratemporal plane has focused on neural responses to single speech vs. non-speech sounds rather than on the higher-level computations required to process more complex auditory sequences. Here we examined how information is integrated over time for speech and non-speech sounds by quantifying the BOLD fMRI response, during passive listening, to stochastic (non-deterministic) sequences of naturalistic speech and non-speech sounds that varied in their statistical structure (from random to highly structured). Behaviorally, participants segmented both speech and non-speech sequences accurately, though accuracy was higher for speech. Several supratemporal regions showed greater activation magnitude for speech sequences (a speech preference), but, importantly, this preference did not predict sensitivity to statistical structure: (i) several areas showing a speech preference were sensitive to statistical structure in both speech and non-speech sequences, and (ii) several regions that responded to both speech and non-speech sounds showed distinct responses to statistical structure in speech and non-speech sequences. While the behavioral findings highlight the tight relation between statistical structure and segmentation processes, the neuroimaging results suggest that the supratemporal plane mediates complex statistical processing for both speech and non-speech sequences and emphasize the importance of studying the neurocomputations associated with auditory sequence processing. These findings identify new partitions of functionally distinct areas in the supratemporal plane that cannot be revealed by single stimuli. They demonstrate the importance of going beyond input preference to examine the neural computations implemented in the supratemporal plane.