This chapter reviews both the development and current trends of research into human abilities to recognize other speakers by voice, considering convergent evidence from behavioural psychological experiments, clinical case studies, and studies using methods from the cognitive neurosciences. First, substantial evidence suggests that the recognition and identification of voices of well-known speakers and the discrimination of speaker identity for unfamiliar voices represent separate abilities. Second, unlike for other social signals such as vocal emotion, no unitary set of acoustic parameters is crucial to voice-identity recognition. Third, although much current research points to the possibility that voice identity is represented in a norm-based manner, we currently lack a detailed computational model of the representation of individual known voices. Rapid technological progress may soon promote a better understanding of the dimensions of statistical variation between familiar voices, and possibly of the nature of their mental representation. Fourth, there are remarkably large individual differences in voice-recognition abilities in the general population, and this chapter discusses a number of factors related to those differences. Fifth, research into voice learning has begun to elucidate the time course of brain mechanisms mediating the acquisition of speech-invariant representations of voice identity, and neuroimaging research has both established voice-sensitive areas in temporal cortex and identified a network of areas that subserve various aspects of voice-identity processing. Finally, the chapter discusses the role and possible mechanisms of audiovisual face–voice integration in the perception of speaker identity.

Keywords: voice recognition, voice learning, familiarity, norm-based coding, voice-morphing technology, phonagnosia, face–voice integration, individual differences, EEG, fMRI

