The Voice in Computer Music and Its Relationship to Place, Identity, and Community

Abstract and Keywords

In computer music, the voice has been used to explore and problematize concepts of place, identity, and community. While the exploration of subjective and social experience through verbal content is often not as intensive in computer music as it is in song-setting or poetry, the transformation and contextualization of the voice is much more radical and multidimensional. This article is concerned with the spoken voice rather than singing, and its emphasis is cultural, interpretative, and highly selective rather than technical, taxonomic, comprehensive, or historical. It considers place, identity, and community separately, but inevitably these are all interwoven and interdependent. Electroacoustic sounds often function as multiple metaphors: a piece can seem to be about nature, for example, but also be interpretable as a psychological landscape. In addition, many of the works function on a meta-level.

Keywords: computer music, poetry, spoken voice, electroacoustic sounds, psychological landscape

In computer music, the voice has been used to explore and problematize concepts of place, identity, and community. While the exploration of subjective and social experience through verbal content is often not as intensive in computer music as it is in song-setting or poetry, the transformation and contextualization of the voice is much more radical and multidimensional. In computer music, techniques such as sampling, granular synthesis, filtering, morphing, and panning across the spatial spectrum provide the means for extensive exploration of the relationship of the voice to the environment, the voice as a marker of identity, and the role of voices in forging or eroding community. Computer music also exploits the range of the voice from speaking to singing—including breathing, mouth, and throat sounds—and this makes it easier to introduce a wider range of identities and contexts than in singing alone.

The chapter concerns the spoken voice rather than singing, and its emphasis is cultural, interpretative, and highly selective rather than technical, taxonomic, comprehensive, or historical; an analysis of different verbal/vocal techniques in computer music has already been effectively carried out by Cathy Lane (2006). I consider place, identity, and community separately, but inevitably these are all interwoven and interdependent. Electroacoustic sounds often function as multiple metaphors: a piece can seem to be about nature, for example, but also be interpretable as a psychological landscape. In addition, many of the works function on a meta-level (a piece that challenges us with two simultaneous sonic environments may also be about the difficulties and attractions of trying to listen to two different sound sources at once).

1. The Voice and Its Relationship to the Environment

Postmodern geographers such as Steve Pile, David Harvey, and Doreen Massey have suggested place is not bounded or fixed, but fluid and dynamic—culturally, politically, and economically. Any place is traversed by shifting economic and social relationships and has links to other places (Smith 2000). In computer music, the fluid relationship of the voice to the environment is projected in a multilayered and changing way by means of a “voicescape” that challenges the idea of place as static, unidimensional, bounded, or natural. The voicescape consists of multidimensional and multidirectional projections of the voice into space (Smith and Dean 2003). In the voicescape, there is a plurality of voices, and at least some are digitally manipulated: identities are merged, multiplied and denaturalized through technological interventions. Voicescapes may create a strong connection between voice and place, but they also undo any such relationship: a voice may seem to emanate from several places at once, morph through different contexts, or reside in a space that is abstract.

The voicescape in computer music features the voice in a range of environments from urban to rural, everyday to surreal, domestic to public, Western to non-Western, and some pieces move between multiple environments and between specific and nonspecific spaces. These voices may be “found” voices that arise in “sound walks,” voices that emerge as part of quasi-documentary interviews, or voices and environments that are constructed in the studio. Sometimes, there may be a strong sense of a particular place; at other times, the space into which the voice is projected is more abstracted. The environments are not necessarily discrete—they are often superimposed, juxtaposed, or morph into each other. The voice may be foregrounded or backgrounded with regard to the environment, detached from or integrated into it, commensurate with or antithetical to it, creating various kinds of “landscape morphology” (Field 2000): the relationship between voice and space is also constantly changing and “becoming,” so that a voice may be foregrounded at one point of the piece and backgrounded at another. This range of techniques and approaches allows the composer to exploit the voice and its digital manipulation to evoke ideas not only about ecology, industrialization, gender, and ethnicity, but also about displacement, marginalization, and alienation.

In some pieces, voice and environment are molded together to produce what I have elsewhere called a “hyperscape” in which both body and environment are dismantled and then put together in continuously changing ways that challenge the unity and stability of each (Smith 2000). Trevor Wishart's explosive and dense Vox 5 (1990) is one of the most powerful examples of this. Wishart achieved these effects by employing computer programs he wrote to manipulate sound analysis data obtained from Mark Dolson's Phase Vocoder program. In his sleeve notes, he documents how the programs “permitted the spectra of vocal sounds to be stretched (making them bell-like) or interpolated with the spectra of natural events” (Wishart 1990, program notes).
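The two operations Wishart names, stretching a spectrum and interpolating between spectra, can be illustrated with a minimal Python sketch. This is not Wishart's software, and it performs no phase-vocoder analysis: it assumes a sound has already been reduced to a list of (frequency, amplitude) partials or to a list of magnitude values. The function names and the power-law stretching rule are my own illustrative assumptions.

```python
def stretch_partials(partials, exponent=1.1):
    """Stretch a harmonic spectrum so that its partials become inharmonic.

    `partials` is a list of (frequency_hz, amplitude) pairs whose first
    entry is the fundamental. A partial at ratio r to the fundamental is
    moved to ratio r ** exponent (an assumed rule, chosen for simplicity).
    With exponent > 1 the upper partials spread apart, which tends to give
    a more bell-like colour.
    """
    fundamental = partials[0][0]
    return [(fundamental * (freq / fundamental) ** exponent, amp)
            for freq, amp in partials]


def interpolate_spectra(mags_a, mags_b, mix):
    """Blend two magnitude spectra bin by bin.

    mix = 0.0 returns spectrum A, mix = 1.0 returns spectrum B, and values
    in between give a linear cross-blend of the two.
    """
    if len(mags_a) != len(mags_b):
        raise ValueError("spectra must have the same number of bins")
    return [(1.0 - mix) * a + mix * b for a, b in zip(mags_a, mags_b)]


# A hypothetical voiced sound with three harmonics of 100 Hz.
voice = [(100.0, 1.0), (200.0, 0.5), (300.0, 0.25)]
bell_like = stretch_partials(voice, exponent=2.0)

# Halfway between two hypothetical four-bin magnitude spectra.
halfway = interpolate_spectra([0.0, 1.0, 0.5, 0.0], [1.0, 0.0, 0.5, 0.0], 0.5)
```

Sweeping `mix` gradually from 0 to 1 over successive analysis frames is one simple way to think about a vocal spectrum turning into that of a natural event, although the morphing heard in the piece is far more elaborate than this.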

In Vox 5, chest and throat sounds are superimposed onto, juxtaposed with, and morphed into, environmental sounds. There is sometimes a strong sense of a particular environment—such as when seagull and sea sounds are coordinated near the beginning—but there is often considerable ambiguity about some of the sounds that are a mixture of the bodily and environmental (as well as those that are more strongly instrumental). At times, sounds cross generic boundaries, for example, the human sounds seem like animal sounds, and there is a continuum among human, animal, environmental, and climate sounds. Moreover, the landscape and impression of the climate is constantly changing and unstable: At one point (ca. 0 min 32 s), we hear bird and simulated sea sounds; at another point (ca. 2 min 23 s), the throat sounds morph into bee sounds, and toward the end of the piece (ca. 4 min 32 s) there is a thunderstorm.

The voice in Vox 5 emanates not from the head but from the chest and throat, suggesting the deeply grounded home of the voice in the entire body. The vocal aspect of the piece projects the effort of coming into being: it struggles to articulate itself, and at one point (ca. 3 min 44 s) we hear the edges of words, the beginnings of language. Wishart has said that the piece contains “poetic images of the creation and destruction of the world contained within one all-enveloping vocal utterance (the ‘Voice of Shiva’)” (1990, sleeve notes); the elemental power of the piece seems to make it computer music's answer to Stravinsky's Rite of Spring. But the unstable merging of the human and environmental means that Vox 5 is widely interpretable; it could also be understood as an evocation of birth and the acquisition of language or the birth of language itself. Alternatively, the piece could be interpreted psychoanalytically as playing out the primal desires of the unconscious or as a glimpse of the Lacanian “real”: the prelinguistic realm that precedes the child's entry into language (Lacan 1977).

Other computer music pieces negotiate the relationship between the voice and different kinds of cultural space. Paul Lansky's Stroll (1994), which the composer refers to as a short “reality play,” projects the impression of two spaces that are both conflicting and complementary: that of a concert or studio space where a chamber music ensemble plays tonal, bright, and somewhat jazzy music and that of a crowd in an urban shopping mall (although the instrumentalists might also be playing in the mall). The voices in the shopping mall are initially more distinct—we can hear the voices of children—and are accompanied by other noises, such as the clatter of shopping carts. However, they become progressively more filtered and less individuated (although individual voices occasionally obtrude, and the sounds oscillate between voicelike and noiselike).

The piece reflects the heterogeneity of urban life: sometimes the two environments seem in conflict, at other times they appear to be interwoven, reciprocal, and reflective of each other. This ambivalent relationship between the two spaces also gives us a new slant on musical concordance and discordance. As the piece progresses, the crowd voices become more integrated with the chamber music, while the chatter from the shopping mall becomes inflected by rhythm and pitch and is more teleologically structured. Important here are the demands on our attention: the shopping mall crowd sounds and the chamber music seem to require different modes of listening that can only be resolved when they are fused. The piece consequently achieves a meta-level that raises questions about the act of listening itself. It becomes about our attempts to resolve the two modes of listening recorded—sometimes concentrating on one sound source more than the other—with our absorption greatest when the two are meshed together.1

The effect is at times reminiscent of weddings where people play to the chatter of voices and the tinkle of teacups and where musical space and social space are subject to continuous interruption. But the piece also suggests that our lives have to adapt to a multiplicity of stimuli and social registers simultaneously, and that the urban is sonically heterogeneous. With one foot firmly rooted in the everyday and another in the making of music, the piece oscillates between integrating and separating the worlds of popular culture and high art.

Hildegard Westerkamp's A Walk Through the City (1996) is also an “urban environmental composition” (program notes) based on sound bites from Vancouver's skid row. However, it is concerned with interweaving real-world space and a studio-based, poetic space. It juxtaposes, but also intertwines, a soundscape of the city (including incidental voices and studio transformation of some of the sounds) with a performance of a poem that can be interpreted as commenting on, and relating to, the soundscape. The simultaneous separation and merging of spaces reflects the double face of the city, its dynamism but also its brutality; A Walk Through the City is the sonic equivalent of de Certeau's “long poem of walking” (1984, p. 101), which mobilizes resistant meanings beneath the city's surface. Walking is a way of unpicking what de Certeau calls “the city-concept,” the powerful, hierarchical, and sanitized city that “must repress all the physical, mental and political pollutions that would compromise it” (p. 63).

The cityscape is presented through synecdochal fragments (that is, parts substitute for the whole); it is also partially abstracted—so the opening helicopter sounds turn into sustained dronelike sounds with slowly sliding pitches that nevertheless retain something of their aerodynamic quality. This morphing between the concrete and the abstract creates “a continuous flux … between … real and imaginary soundscapes, between recognisable and transformed places, between reality and composition” (Westerkamp 1996, program notes). The more abstracted sounds are broken at about 3 min 29 s by the loud and intrusive screeching of car brakes, car horns, ambulance sirens, and the swish of traffic, although the drone sounds still fade in and out. The voice of the poet enters at about 4 min 54 s but is increasingly counterpointed with “unscripted” voices from the city: the looped, ethereal voices of children singing and speaking, human chesty sounds, and the words of street people and immigrants. We hear the voice of one drunk, who keeps asking why he has to be in this condition, “Tell me, just one thing, why do I have to get drunk, why, you tell me that?” (Westerkamp 1996). The voices not only are indices of the colorfulness of the city, but also convey its economic deprivations and underlying brutality in the form of noise pollution, the plight of the disadvantaged, the vulnerability of children. Similarly, nonverbal sounds continue throughout the piece, conveying both the negative and positive faces of the city—at about 8 min 0 s onward they become, for a short period, highly rhythmic and dynamic.

The poem is performed by its author, Norbert Ruebsaat, whose voice sometimes seems to be projected naturalistically and at other times—including the first time it appears—is treated in such a way as to seem “boxed in.” The words are delivered at varying dynamic levels and toward the end (after a short monologue by the vagrant) transform into multiple whispers. The poem is not read in an entirely linear fashion; some sections of it are repeated, and sometimes the author's voice is multitracked, with different versions of the voice competing with each other. The words strongly reinforce the soundscape and speak of the violence, loneliness, deprivation, and alienation experienced in the city, for example, in phrases such as “somewhere a man/is carving himself/to death/for food” and “day like an open wound.” The poem both stands apart from the soundscape and is integrated into it, with some illustrative effects, for example, after the words “cymbal crash” there is a literal crash.

Katharine Norman's In Her Own Time (1996) negotiates not only different spaces but also different time frames. Here, the telling of stories interweaves personal and cultural memory through shifts between a past and present sense of a particular place. As Fran Tonkiss suggests, sound may be the most dominant aspect of memory: “Hearing … involves a special relationship to remembering. It might, as Benjamin says, be the sense of memory. The past comes to us in its most unbidden, immediate and sensuous forms not in the artifice of the travel photograph, but in the accident of sounds half-remembered” (2003, p. 307).

Relevant to the piece is also the Freudian concept of Nachträglichkeit or afterwardsness. This has been adapted by contemporary commentators to explore the act of remembering not simply as the uncovering or restoration of a buried past but as a reenactment or reinvention of the past in the present (King 2000, p. 12). This reworking of Nachträglichkeit has often pivoted around the idea that the past is not a recording that can be played back word for word but is “an intricate and ever shifting net of firing neurons … the twistings and turnings of which rearrange themselves completely each time something is recalled” (Grant quoted in King 2000, p. 15). This is particularly relevant to Katharine Norman's attempt to catch her mother's act of remembering in real-time sound and by means of a quasi-documentary style that has an unedited and naturalistic feel to it, so that it incorporates voices in a way that Norman herself calls “unclean” (2004). The piece is less a voicescape than an interruptive monologue. Norman's mother relates stories about bombing in London during World War II, accompanied by musical sounds and punctuated by occasional prompts from the interviewer (who is Norman). The mother's voice appears throughout in stereo and is at times processed, if somewhat minimally. The accompanying drone sounds sometimes suggest overflying airplanes, but at other moments have a more abstract connection with the text.

In Her Own Time (Norman 1996) evokes a domestic environment (at one point, the phone rings). Within this, however, we are displaced into a previous era as Norman's mother tells us that the prospect of war seemed exciting when they were children. Now, however, her adult view of the severity of war is superimposed on this. The piece also “travels,” and at one point we are transported—seemingly in a car—to the location at which an uncle who was killed by a bomb used to live. There is a stark contrast between the domestic warmth of the present, which is reinforced by the mother's unpretentious, light-handed storytelling, and the grim history of war. It evokes the ever-present tension between the freedoms of everyday life and the sometimes-horrific memories, both personal and cultural, that human beings repress.

Most of the pieces we have considered so far involve Western environments, but computer music composers have also negotiated the voice in relation to non-Western spaces. Pamela Z's Gaijin (2001a) is about how Z, an African American, responds to living in Japan. Z is a composer, performer, and multimedia artist who uses electronic processing and sampled sounds and whose oeuvre is a mixture of singing, speaking, and multimedia elements. These sounds are often triggered via custom MIDI (musical instrument digital interface) controllers such as BodySynth or Light SensePod, both of which allow her to manipulate sound with physical gestures.

Her work Gaijin, which arose out of her residency in Japan in 1999, is a multimedia work: it involves both live performance and electronic processing and combines spoken text, music, and butoh performance (Z 2001a, 2001b). In Gaijin, a sense of place is intertwined with feelings of displacement: it explores the experience of being foreign in Japan. The piece probes “the idea of foreignness—whether that means visiting a country that you're not used to, or feeling like a foreigner in the place where you live, or all kinds of other ways that a person could feel foreign” (Z 2000) and the word gaijin means foreigner. Z has said, “One of the things I learned in Japan is that if you're not Japanese, if you don't look Japanese, if you don't speak Japanese, then you will always be a gaijin. It was a real lesson to me because I began to be aware of what people who live in my own country must feel like when they're never allowed to feel that they belong, because other people don't allow them to or because they just don't feel like they do” (Z 2000). As George Lewis points out, this experience is intensified “when the subject singing of her Japanese experience is an African American woman who … sits as a group near the bottom of the US social hierarchy” (2007, p. 67).

Although Z refers to her adoption of “character” in such works (Z 2000), the notion of character is somewhat different from that in more conventional narrative because of its propensity for transformation rather than development or consistency. The excerpts from Gaijin on Z's Web site include multiple layers of sonic, verbal, and visual material; readings from a book about how to fit into Japanese culture; a polyphonic rendering of the word other; a performance of immigration requirements by Z, who puts herself in the visa document as its official voice and yet is clearly an addressee of it; Z singing in Japanese enka-style crooning (Lewis 2007, p. 68); and visual and sonic renditions of the Japanese alphabet. There are video images of butoh movement, whose minimalism and slowness is often in strong contrast to the polyphonic “Pamela” soundtrack and live performance, and Z herself is sometimes dressed in traditional Japanese style.

Finally, my collaboration with Roger Dean, The Erotics of Gossip (Smith and Dean 2008), is based on a particularly flexible relationship between voice and place that is sometimes realist, sometimes not. The piece is one of a series of sound technodramas by us that combine words and sound (and technological manipulation of words and sound). In the piece, the environments slide between locatable sites of power such as the courtroom, pulpit, and classroom and ambiguous, imagined, and futuristic spaces (Smith and Dean 2003). For example, at one point a churchlike environment produced by organ effects accompanies a monologue by a pompous clergyman. However, this fit between voice and place is seriously disrupted by atonal sliding sounds. In other places, sound and voice combine to evoke an unreal or imagined space. For example, there is a passage that begins, “For many months I observed the orders not to talk,” which is about a futuristic authoritarian regime where talk is prohibited, but two women develop an intimate relationship with each other through “a dialogue made of hands” (Smith and Dean 2008). The accompanying sound, which circles around a few notes initially but becomes progressively denser, reinforces the threatening scenario evoked by the words, but together they create an allegorical and futuristic space rather than a realistic one.

The relationship between voice and place is also complex and transformative in another of our sound technodramas, Nuraghic Echoes (Smith and Dean 1996). Inspired by the Nuraghi, the stone towers that arose out of Nuraghic civilization in 1500 BC, the work consists of three textual/voice strands that relate to the past, present, and future and three sonic strands that are in both conjunctive and disjunctive relationship to the textual/voice strands. Through these shifting relationships the work both suggests a sense of place and yet dissolves it into “in-between and non places” (discussed in Bailes et al. 2007).

2. Human to Nonhuman Identities

So far, the focus has been primarily on the relationship of voice to place but less on the identities that the voices project. However, in the voicescape the voice is no longer a fixed marker of identity: identities are merged, multiplied, and denaturalized through technological interventions. These include overlaying and multiplication of voices, exploitation of the spatial spectrum so that voices are projected from different positions in audio space, the use of reverberation to change the quality of the voice, and other forms of digital manipulation such as filtering and granular synthesis. Sometimes, the voice has a cyborg quality to it as in Charles Dodge's early speech songs (1992), and even when the voice is used naturalistically—as in Norman's In Her Own Time (1996)—it may be subject to some processing.

Computer music composers seem to play more with social identity and less with the direct expression of subjective experience that is central to much traditional poetry (although usually deconstructed in experimental poetry). If there is an emphasis on subjective experience, composers project fluid and transformative subjectivities. On the other hand, as noted, some works could be interpreted psychoanalytically as projecting unconscious experience or what Lacan calls “the real”: the prelinguistic realm in which the child exists before he or she enters language and that the adult always wishes to recapture but cannot. Wishart's Vox 5 (1990) for example, in its emphasis on prelinguistic utterance—and its merging of the body and environment—could be interpreted as a journey into the unconscious. Music is intrinsically stronger than poetry in its capacity to project the real because it is relatively unencumbered by semantic meaning. Due to its greater degree of abstraction, music also has an even greater capacity than poetry to project metaphors that have multiple meanings.

Sometimes, computer music composers manipulate the voice for the purposes of political intervention. Wishart's Voiceprints (2000c), for example, processes well-known voices, resulting in a comment about the commodification of identity, and has a strong satirical aspect. The sleeve has a made-up dictionary definition on the cover: “1. n. by analogy with ‘fingerprints’—sound recordings of individual persons' voices used for the identification or assessment. 2. vb. to make a ‘voiceprint’” (Wishart 2000c, sleeve notes). However, the word voiceprints also suggests footprints, which are more ambiguous as markers of identity. Footprints show that someone has been present and suggest the gender of that person, but do not identify exactly who (they also tend to reveal the imprint of the shoe rather than the foot). “Two Women” (Wishart 2000b) is divided into four sections in which the voice is subjected to extreme variations of speed and pitch. There is considerable fragmentation, looping, and repetition of the words that are integrated into sweeping musical textures; these are alternately forceful and delicate, dense and sparse. “Siren” treats the voice of Margaret Thatcher quoting St. Francis of Assisi, “Where there is discord, may we bring harmony” (program notes). The word harmony is stretched and emphasized, although some of the other words are speeded up: the voice is accompanied by train and other cataclysmic electroacoustic sounds. “Facets” treats the voice of Princess Diana talking about press photographers, “There was a relationship which worked before, but now I can't tolerate it because it has become abusive, and it's harassment” (program notes). It begins in a hesitant, fragmented way that captures a whining whimsicality; it then becomes much more dense as the talk turns into fast babble, accompanied by sustained electronic sounds that unwind toward the end. “Stentor” treats the voice of Ian Paisley, who says in a kind of reverse-prayer, “Oh God, defeat all our enemies … we hand this woman, Margaret Thatcher, over to the devil, that she might learn not to blaspheme. And Oh God in wrath, take vengeance upon this wicked, treacherous, lying woman…. Take vengeance upon her O Lord!” (program notes). The extreme elongation on “Oh God” sounds like a brutal clarion call, and there are accompanying thunderclaps and other elemental sounds. “Angelus” again processes Princess Diana's voice commenting on the “fairy story” in which she was a principal actor and saying she wants to be “the queen of people's hearts”; it transposes recognizable words to babble and then to pure pitch, after which Diana's voice returns again in recognizable but fragmented form. Each voice conjures up iconic moments in recent British history: Paisley's is inseparable from the divisive history of Northern Ireland, Diana's from the perturbation to the monarchy. There is also plenty of irony: Thatcherism could hardly be said to bring harmony, at least not in the sense of social cohesion. Again, identity is interwoven with place and movement between places, and the pieces are framed by train station voices announcing departures and arrivals—perhaps interpretable as a comment on shifting identities.

In Voiceprints (Wishart 2000c), the emphasis is on mediation: all the comments have been communicated via the media and are now processed again by Wishart. We are used to visual artists satirizing commodification (e.g., Warhol in his silk screens of Jackie Onassis and Marilyn Monroe), but these pieces engage with a similar phenomenon in sonic terms. Like Warhol's images, which often seem to combine the projection of a mediatized image with a certain affection—even reverence—for their subjects, these portraits are also highly ambiguous. Wishart depoliticizes the extracts of Diana's voice by calling them personal portraits, but they can be read, nevertheless, as combining satire and pathos. These processed voices also raise the question of how far a voice retains its own identity when it is digitally manipulated: the voices range from recognizable Diana and Thatcher to moments when we would not know who was speaking. Perhaps the pieces suggest the fragility of political power and celebrity (and more broadly speaking of any concept of identity or personality). Of these pieces, “American Triptych” (Wishart 2000) seems to be particularly successful with its iconic blend of the voices of Martin Luther King, Neil Armstrong, and Elvis Presley—sometimes combined in ways that are highly polyphonic and rhythmic, but also employing environmental sounds such as crackly radio sonics from outer space.

Voiceprints (Wishart 2000c) plays with particular identities, but Lansky's pieces often call into question what constitutes an identity. Lansky's “smalltalk” and “Late August” on his CD Smalltalk (1990) take up the idea of identity in compelling ways; here, identity becomes the contours and rhythms of speech:

Sometimes when at the edge of consciousness, perhaps just falling asleep or day dreaming, sounds that are familiar will lose their usual ring and take on new meaning. Conversation in particular has this ability to change its nature when one no longer concentrates on the meanings of the words. I remember when I was a child falling asleep in the back of a car as my parents chatted up front. I no longer noticed what they were saying but rather heard only the intonations, rhythms and contours of their speech. The “music” of their talk was familiar and comforting, and as I drifted off it blended in with the noise of the road. (program notes)

This effect is achieved through a filtering process: “The music was created by processing a conversation between Lansky and Hannah MacKay (his wife) through a series of plucked-string filters tuned to various diatonic harmonies. The resulting sound is like a magic zither, where each string yields a fragment of speech” (Garton 1991, p. 116).
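The kind of filtering Garton describes can be suggested with a minimal Python sketch. This is not Lansky's software; it assumes that a “plucked-string filter” can be approximated by a feedback comb filter with a two-point average in its loop (the structure used in Karplus-Strong string synthesis), and that tuning to a diatonic harmony means choosing one filter per chord note. The function names, the feedback value, and the choice of a C major triad are illustrative assumptions.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to a frequency in Hz (note 69 = A at 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def string_filter(signal, freq_hz, sample_rate=44100, feedback=0.98):
    """Resonate `signal` at roughly `freq_hz`, like a string set ringing by speech.

    The delay length fixes the resonant pitch. Averaging two adjacent
    delayed output samples damps the upper partials, so each resonance
    decays somewhat like a plucked string.
    """
    delay = max(2, round(sample_rate / freq_hz))
    out = [0.0] * len(signal)
    for n, x in enumerate(signal):
        a = out[n - delay] if n >= delay else 0.0
        b = out[n - delay - 1] if n >= delay + 1 else 0.0
        out[n] = x + feedback * 0.5 * (a + b)
    return out


def filter_bank(signal, midi_notes, sample_rate=44100, feedback=0.98):
    """Pass `signal` through one string filter per note and average the results."""
    mixed = [0.0] * len(signal)
    for note in midi_notes:
        voice = string_filter(signal, midi_to_hz(note), sample_rate, feedback)
        for n, v in enumerate(voice):
            mixed[n] += v / len(midi_notes)
    return mixed


# C, E, and G above middle C as MIDI note numbers: a C major triad.
C_MAJOR_TRIAD = [60, 64, 67]

# A single click stands in for a fragment of recorded speech.
click = [1.0] + [0.0] * 999
ringing = filter_bank(click, C_MAJOR_TRIAD)
```

Fed with recorded speech rather than a click, each filter rings only when the voice has energy near its pitch, which is one way of understanding how the contours of talk can survive while the words dissolve into harmony.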

The conversation is recognizable as conversation (one can even recognize the genders of the speakers), but the words are not distinguishable. Lansky, who sometimes sees himself as the musical equivalent of a photographer, compares the process “to blowing up the pixels of a colour photograph so that familiar shapes become abstract squares.” Lansky then added “a soft, sustained chorus” (1990, program notes).

Although Lansky sees the chorus as “a place to let your ears rest when listening to the music of conversation or attempting to hear the words behind it” (1990, program notes), I agree with Brad Garton that it is the chorus that underpins the whole work:

To me … these subtle harmonies embody the meaning of the piece. The chords, outlined by the choral sounds, define different harmonic regions as the piece progresses. I can easily imagine these harmonic regions shadowing the flux of context and meaning in the real conversation, or that the flow of harmony represents some deeper continuity which underlies even the most innocuous conversations we have with our family and friends. In any case, it is by pushing my concentration through the words to the longer tones that I become fully enveloped in the music. Some of the most gorgeous moments occur when there is a temporary lull in the conversation, and this harmonic substrate is allowed to exist briefly by itself. (Garton 1991, p. 116)

According to Lansky, Smalltalk tries to capture the spirit, emotions, and music behind and within our conversation (1990, program notes). This turns into an experiment in cultural identity when Lansky applies the same process to Chinese speakers in another piece, “Late August.” Although the pieces sound quite similar, Garton points out that “Late August” is “much more animated, the harmonic rhythm of the piece moving along at a faster rate” (1991, p. 116). This may be because Chinese is a tonal language, and the Chinese use pitch to discriminate between words that look the same when written. Consequently, the piece is culturally resonant but in ways that are quite indirect.

Other works investigate identity from the point of view of ethnicity, although this may occur as part of a broader probing of identity. In Z's work, exploration of identity takes the form of an eclectic attitude toward different traditions—including opera, electroacoustic music, and sound poetry—which she moves between or superimposes. Bone Music (Z 2004a), for example, emphasizes a solo singing line (with rhythmic accompaniment) that seems to emanate from a mixture of Western and non-Western musical traditions—there could be a hint of eastern European and Indian music—but then explodes into a babble of electroacoustically generated voices that grow to considerable intensity. Sometimes Z's works consist of a solo singing line, sometimes she duets with herself—singing or speaking through the use of a delay system or multitracking. In this way, she often creates several layers of utterance; sometimes, this becomes a dense polyphony of voices.

Lewis points to the transformation, multiplication, and layering of the voice in Z's work. Referring to Z's music as “post-genre (and post-racial) experimentalism,” he suggested that such multiple identities are commensurate with fluid and polyphonic notions of African American identity (2007, p. 65). Certainly, Z's work is interpretable from a number of different points of view, opening up many questions about cross-cultural relationships, articulation, and language—questions that embrace racial identity but are not limited to it.

In Z's work, identity is about her relation to others. Her text-based works are often interrogative, addressing ambiguously either another person or the audience at large. “Pop Titles ‘You’” (Z 2004c) takes one of the most interesting devices in poetry (the use of the second person and the ambiguity it creates about the addressee) and makes rhythmic and musical effects out of it. The “you” could be a lover, a relation, the general public, Westerners, and so on. “Questions” (Z 2004d) combines a track in which numerous questions are asked with another track that features expressive singing.

Lewis draws attention to the way in which black musicians have often been stereotyped as nontechnological, or black culture as incompatible with technology, but nevertheless points to the fluidity of Z's thinking:

Certainly, artists such as Pamela Z reject the notion that electronics could be rigidly raced, and that any entry into the medium by African Americans necessarily constituted inferior imitation of white culture, economic opportunism, and/or general racial inauthenticity. Moreover, in Z's post-genre (and post-racial) experimentalism, we find a twinned practice of cultural mobility with self-determination that authorizes her to draw from any source, to deny any limitation whatsoever. This assertion of methodological and aesthetic mobility may be viewed as an integral aspect of the heterophonic (rather than simply hybrid) notion of identity found in her work. (2007, p. 65)

Lewis also discusses how Z draws on the operatic tradition without resorting to the stereotypical roles which women often take in opera: “Pamela Z takes on many dramatis personae, but two that practically never appear in her work are the hysterical victim and the Medea figure so beloved by European opera. In general, Z's stage personae represent women in full control and authority—never confrontational, but powerful and confident. For Z, opera (and its bel canto evocation and extension in her work) does not lead inevitably, following Catherine Clement, to the ‘undoing of women’” (2007, p. 69).

Gender identity is in fact an important issue in electroacoustic music because there is a predominance of male composers. Bosma argues that there is much stereotyping of the roles of men and women in this music, particularly with regard to singing: “A musical partnership of a male composer and a female vocalist is typical of electroacoustic music. This stereotype relates woman to body, performance, tradition, non-verbal sound and singing, and man to electronic music technology, innovation, language and authority. It resonates with the tendency in contemporary Western culture to associate singing with women, not men, … while technology is seen as a man's world…. More generally, it reflects the dualistic opposition of masculinity versus femininity and mind versus body that is so prevalent in our culture” (Bosma 2003, p. 12).

However, she argues that this stereotyping is somewhat deflected by recording: “The gender distribution of … pre-recorded voices is much more equal compared to the use of live voices, and the use of pre-recorded voices is much more varied than the use of live voices in this genre. Moreover, pre-recorded voices are often manipulated, sometimes dissolving or bending the gender of the voice” (2003, p. 12).

Donna Hewitt is an Australian composer-performer who has developed the eMic (Extended Microphone Interface Controller) and employs it to challenge male constructions of female identity:

The eMic is based on a sensor-enhanced microphone stand and I use this in combination with AudioMulch and Pure Data software. The interface, which is intended to capture standard popular music and microphone gestures, challenges the passive stereotype of the female singer (with their voice controlled by a male front-of-house mix engineer), in that it enables the performer to take charge of their own signal processing. The eMic project has helped me establish a unique creative space where I can comment on, and subvert, existing stereotypical male constructions of the female vocal performer. (2007, p. 186)

“Geekspeak” (Z 2004b), on the other hand, is a humorous exploration of masculinity. Masculinity was once stereotypically defined through heroism, virility, and rationality, but as David Buchbinder has pointed out (1994), the traditional models of masculinity have been dying since the end of World War II. It is arguable, however, that these defining markers of masculinity have been replaced, at least partly, by technological ability and know-how. “Geekspeak” satirizes the effect of technology on men and the degree to which they start to identify themselves through that technology; it is also about how we put labels on people, whether those labels have any validity, and whether they might also apply to ourselves (Z 2004). To make the piece, Z sampled the voices of a number of researchers and programmers whom she met during an artist-in-residency at Xerox PARC (the Palo Alto Research Center) in Palo Alto, California (program notes). The piece is a mixture of genres. It is similar to sound poetry in its loops, multiple voices, collage, and overlays (although arguably a parody of sound poetry in the kind of technical material it chooses to make into musical motives, such as a loop of the word backslash). It also has elements of the documentary or radio show, with much background noise: someone flicking the pages of the dictionary, the sound of a computer keyboard as one of the interviewees looks up the word geek on the Internet. The male voices struggle with definitions of what a nerd or geek might be while never admitting to being one.

Of course, whether the interviewees are nerds or geeks depends on what the definition of a nerd or geek is: different definitions vie with each other during the piece, some more pejorative than others, some eliding the notion of nerd and geek. Nobody raises the issue of whether a geek or nerd could be female, and all the voices are male. Certainly, the interviewees show characteristics that might be seen as geeky or nerdy (the final speaker says, “Why would I spend money on clothes when I could buy an external hard drive?”), and all of them are geeks in the sense of being computer geeks who rhapsodize about hardware/software. The phrase spoken by one of the men, “I find it difficult to articulate it” is also mischievously repeated (sometimes in truncated form), thereby hinting that the wordiness and self-confidence of some of the interviewees might just be top show. It is ironic that the piece is about technophilia but written by someone who is a technophile herself, and if it is partly satirical, it is also sympathetic. Is the implication that women who are technophiles actually manage to avoid being geeks, or does the composer actually implicate herself in geekiness?

Processing of the voice can also undermine gender stereotypes. For example, it can provoke ideas about the instability of gender and sexual ambiguity through “sonic cross-dressing,” or gender morphing, with the gender of the voice transmuted through pitch and timbre changes, or the continuum between male and female extensively activated (Smith 1999). (Such manipulation of the voice could have interesting applications with regard to ethnic identity, although seemingly this is a less-explored area.) Gender morphing is a prominent feature of my collaborations with Roger Dean. In The Space of History (Smith et al. 2006), for example, a putative female “terrorist” meditates on what the reactions of the audience would be if she locked them in the concert hall. However, later in the piece, after she has left the performance space, her voice is manipulated in ways that problematize her identity: her voice sounds sometimes like a child, sometimes like a man, and sometimes cyborglike and appears at many different points along the continuum between male and female. This challenges stereotypical images of terrorists in terms of gender, age, and behavior.

Such explorations can probe the norms of both gender and sexuality. Barry Truax (2003) has drawn attention to the lack of homoerotic content in electroacoustic music; his own work, however, has tried to overcome this by blurring gender distinctions, sometimes through manipulation of the voice:

My first work based on the granulation of sampled sound was The Wings of Nike (1987) whose sole source material for three movements was two phonemes, one male, the other female. They were used with high densities of grains, including transpositions up and down an octave, thereby sometimes blurring the distinction in gender. A more extensive use of gendered text was involved in Song of Songs (1992) where a decision has to be made how to use the original Song of Solomon text which includes lines that are conventionally ascribed to the characters of Solomon and Shulamith, his beloved. The simple solution was to have both the male and female readers of the text record it without changing any pronouns. When these versions are combined, the listener hears both the male and female voice extolling the lover's beauty. (p. 119)
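The granulation Truax describes can be illustrated with a minimal sketch. It is not his software: the sine tones standing in for the two phonemes, the grain length, the grain count, and every function name are assumptions made for illustration. Each grain is a short windowed excerpt read from one of the sources at half speed, normal speed, or double speed (an octave down, unchanged, or an octave up), and many grains are overlapped at random positions.

```python
import math
import random

def hann(n):
    """Hann window of length n, used to fade each grain in and out."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def make_grain(source, start, length, ratio):
    """Read `length` samples from `source`, stepping at `ratio`
    (2.0 = up an octave, 0.5 = down an octave), with linear
    interpolation and a Hann envelope."""
    window = hann(length)
    grain = []
    for i in range(length):
        pos = start + i * ratio
        j = int(pos)
        if j + 1 >= len(source):
            break  # ran off the end of the source
        frac = pos - j
        sample = (1 - frac) * source[j] + frac * source[j + 1]
        grain.append(sample * window[i])
    return grain

def granulate(sources, out_len, n_grains, grain_len,
              ratios=(0.5, 1.0, 2.0), seed=0):
    """Overlap-add many randomly placed, randomly transposed grains."""
    rng = random.Random(seed)
    out = [0.0] * out_len
    for _ in range(n_grains):
        source = rng.choice(sources)
        ratio = rng.choice(ratios)
        max_start = len(source) - int(grain_len * ratio) - 2
        start = rng.randint(0, max(0, max_start))
        onset = rng.randint(0, out_len - grain_len)
        for i, s in enumerate(make_grain(source, start, grain_len, ratio)):
            out[onset + i] += s
    return out

# Hypothetical stand-ins for the two phonemes: a lower and a higher tone.
SR = 8000
low_voice = [math.sin(2 * math.pi * 120 * n / SR) for n in range(2000)]
high_voice = [math.sin(2 * math.pi * 220 * n / SR) for n in range(2000)]
texture = granulate([low_voice, high_voice], out_len=SR, n_grains=200, grain_len=400)
```

In this toy version, an octave shift moves the lower tone to 240 Hz, above the higher tone's 220 Hz, and moves the higher tone to 110 Hz, below the lower tone's 120 Hz, so the registers of the two sources cross. This is one simple way to hear why dense, transposed grains can blur a gender distinction carried largely by pitch.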

In Truax's music-theatre piece/opera “Powers of Two: The Artist” (1995), he brings together two high-pitched male voices. He suggests that “the intertwining of two high-pitched male voices creates a potentially homoerotic sound that, it seems, most heterosexual composers have avoided” (Truax 2003). Truax points out that gender is inescapable in opera, and that while heterosexuality has been the predominant norm, there has also been a tradition of partially concealed homosexual themes that have been apparent to the homosexual community. It is this tradition to which he gives a new twist with his own electroacoustic operas.

3. The Voice, Computer Music, and Community

Voicescapes tend to embody ideas about social interactions and their role in community. Sometimes, the words are blurred or incomplete, conveying only the contours of speech but nevertheless suggesting the dynamics of social exchange, the ambience of talk and gossip, and the shifting balance between talking and listening. Different pieces engage with different kinds of community, from the family to the shopping mall, and often explore ideas around communication and the degree to which we are listening, or not listening, to each other.

Community does not necessarily mean knowing other people in depth. Rather, Iris Marion Young suggests that community can be defined as “a being together of strangers” and is the embrace of difference expressed in the overlapping and intermingling of different social groups. This brings about a form of public life in which “differences remain unassimilated, but each participating group acknowledges and is open to listening to the others. The public is heterogenous, plural, and playful, a place where people witness and appreciate diverse cultural expressions that they do not share and do not fully understand” (1990, p. 241).

Lansky's “Idle Chatter” could be interpreted as such a being together of strangers. A very animated and rhythmic piece made out of the sounds of talking (treated in such a way that they sound almost like bee sounds) and accompanied by sustained singing, it suggests a community of chatterers. Nevertheless, we cannot hear the content of the talk or identify individual speakers; the piece challenges the dividing line between speech and noise and is what Dean calls NoiseSpeech (Dean 2005, Dean and Bailes 2006): “NoiseSpeech … refers to a range of sounds close to the maximal complexity of noise but which seem to be derived from speech though lacking any detectable words or phonemes. NoiseSpeech can be made by digital manipulation of speech sounds, such that words and phonemes are no longer intelligible, or by superimposing the formant structure (spectral content) or prosodic pitch and dynamic features of speech onto other sounds, both noise, and environmental and instrumental sound” (Dean and Bailes 2006, p. 85).

Lansky describes his methods, intentions, and techniques in making the piece in the program notes to the CD: “‘Idle Chatter’ is an eloquent attempt to say nothing without taking a breath …. The sounds in ‘Idle Chatter’ were all created using linear prediction and a massive amount of sound mixing. Rhythmic templates and masks were used to scatter arbitrary vowel and consonant sounds in relatively even distributions throughout the piece. An underlying idea was to create an elusive illusion of regularity and coherence. The choral sounds were similarly created but without the use of rhythmic masks, and with much more intense distributions. A great deal of the piece was written with computer aided algorithms” (1987; see note 2).

Similarly, although with different intentions and effects, in “The Erotics of Gossip” (Smith and Dean 2008) an interactive Max patch is used to convey the impression of gossiping voices. The process is described as follows:

An interactive MAX patch was … used to create a varied multiplicity of overlapping spoken texts, notably in the passage about 2′ into the piece. Here the algorithm allows the random and controlled fragmentation of the individual recorded verbal phrases, and their overlapping in time and across the sonic space. One impact of this approach is to blur the identities of the individual speakers, and to merge their phrases into each others' as well as to disintegrate and reassemble them. At the moments where the greatest density of phrase overlap occurs, an impression is created of a speaking crowd, rather than a group of speaking individuals: this is another way in which the voicescape decentres voices. At other places in the piece, such a crowd voice is used as a lower dynamic sonic background; raising the question whether gossip can be music (or vice versa). In a notable passage after 9′20″, such a complex multiple and delocalised voicescape, seeming to include both real and unreal voices, crowds and individuals, transforms into a purely sonic (and apparently non-verbal) soundscape. (Smith and Dean 2003, p. 121)
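The process described in this passage (fragmenting recorded phrases, then overlapping the fragments in time and across the stereo space) can be sketched in a few lines. This is an illustration only, not the Max patch used in the piece: the speaker names, phrase durations, fragment lengths, and function names are all hypothetical. The sketch schedules fragments rather than processing audio, and then counts how many are sounding at once.

```python
import random
from dataclasses import dataclass

@dataclass
class Fragment:
    speaker: str
    onset: float     # when the fragment starts in the piece (seconds)
    offset: float    # where in the recorded phrase it is cut from (seconds)
    duration: float  # fragment length (seconds)
    pan: float       # 0.0 = hard left, 1.0 = hard right

def scatter(phrases, total_time, n_fragments,
            min_dur=0.2, max_dur=1.5, seed=0):
    """Cut random fragments from each speaker's phrase and place them at
    random times and pan positions. `phrases` maps a speaker name to the
    duration of that speaker's recorded phrase in seconds."""
    rng = random.Random(seed)
    speakers = sorted(phrases)
    events = []
    for _ in range(n_fragments):
        speaker = rng.choice(speakers)
        phrase_len = phrases[speaker]
        dur = min(rng.uniform(min_dur, max_dur), phrase_len)
        offset = rng.uniform(0.0, phrase_len - dur)
        onset = rng.uniform(0.0, total_time - dur)
        events.append(Fragment(speaker, onset, offset, dur, rng.random()))
    return sorted(events, key=lambda e: e.onset)

def peak_overlap(events):
    """Largest number of fragments sounding at the same moment."""
    points = []
    for e in events:
        points.append((e.onset, 1))
        points.append((e.onset + e.duration, -1))
    points.sort()  # at equal times, ends (-1) sort before starts (+1)
    active = peak = 0
    for _, step in points:
        active += step
        peak = max(peak, active)
    return peak

# Hypothetical material: two recorded phrases, scattered over ten seconds.
phrases = {"speaker_a": 2.5, "speaker_b": 3.0}
sparse = scatter(phrases, total_time=10.0, n_fragments=2)
dense = scatter(phrases, total_time=10.0, n_fragments=300)
```

With only a few fragments, individual speakers remain distinct; as the fragment count rises, `peak_overlap` grows and the schedule approaches what the passage calls a speaking crowd. The single `n_fragments` control is a deliberate simplification of the "random and controlled" fragmentation that the quotation attributes to the patch.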

Gossip is sometimes conceptualized as destructive and unethical, but it can also be seen to be subversive, creative, and a means of creating community. Through the use of Max patches in the piece, we were able to show these two different sides of gossip in passages that involved two women gossiping, two men gossiping, and a man and a woman gossiping.

Some pieces address the histories of particular types of community. Wende Bartley's “Rising Tides of Generations Lost” (1994) is a feminist piece about women's histories. Bartley says of the piece: “The ancient art of storytelling, always present among us, preserves what we cannot afford to lose, reminds us of what we have forgotten, highlights that which is collectively experienced, and points to new possibilities. Rising Tides of Generations Lost is one attempt to retell a small portion of the story of woman. Our collective foremothers issue forth a clear call to give voice to the rising energy centuries of common experience has created” (program notes).

In “Rising Tides of Generations Lost” (Bartley 1994), the sampled voices move from whispered vowels, consonants, and syllables to spoken syllables, words, and phrases accompanied by sustained electroacoustic sounds. The piece represents a coming into language, recapturing the historical and continuing need for women to find their own voices and language. The piece is at once a historical remembering and a call to action, and it captures the way that women have been vilified and repressed throughout history. In the middle of the piece (ca. 11 min 30 s), one voice, overlaid with others that echo and intersect it, intones: “I compel you to see and feel…. To have the courage and the conscience to speak and act for your own freedom though you may face the scorn and the contempt of the world for doing so” (Bartley 1994). The multiple voices do not represent different aspects of the self, but more the collective sufferings and courage of women. The piece ends with some everyday conversation at about 12 min 45 s. One voice hints at loneliness and betrayal, “I don't have anybody,” but then the voices split into numerous renderings of “I,” some flattened and truncated to sound like “a.” During the piece, the musical sounds move from drone to fire sounds and then into more tonal and instrumental music. The fire sounds recall the burning of women as witches, while the tonal and instrumental music might refer to musical histories that have both included and excluded women.

The piece alludes to the hidden aspects of women's experience, a theme that is also taken up in Lane's “Hidden Lives” (1999). In this piece, women speakers read from The Book of Hints and Wrinkles (1939); Lane characterizes this as “a small piece of social history from the 1930s which describes how women should manage both their houses and themselves” (2006, p. 7). In “Hidden Lives,” the whispers and stutters of women gradually grow into fragments of “advice” taken from the book; this is the kind of domestic dogma that the history of female emancipation has tried to rebut and that suggests a crippling timetable of wifely duty and servitude. Here again, the treatment of the voice is closely tied to notions of place and begins with “outside” city crowd sounds and a loud shutting sound at about 40 s (if this is the door of a house, it sounds fittingly like the closing of a prison gate). After a series of voice sounds that are at the edges of words, but not actual words, there is another door sound followed by breathy voice sounds. This gives an overall impression that is windlike, perhaps suggesting that women's voices can be blown to pieces. After that, the voices build up in restless stuttering and whispered patterns that evoke hidden secrets, repressed longings, underlying distress. Then the words become clearer but gradually build up into a babble of phrases to do with housework and child care in which one kind of duty and obligation competes with another.

Toward the end of the piece, it sounds as if the women are literally blown away, and “Hidden Lives” finishes with reentry into the outside world. Lane says that she has “attempted to reinforce the sense of the words, or rather the world that they are describing, by structuring them as if moving through a series of rooms in a house, and the spatial claustrophobia of the work serves to emphasize the meaning and context” (2006, p. 8). Norman suggests that the voices themselves evoke domestic routines, “flurries of vocal fragments … gradually build into swishing, repetitive surges before subsiding again. These repetitive rhythmic waves are too fast to be soothing, and have a sense of industry that is perhaps reminiscent of sweeping, scrubbing or polishing” (2004, p. 111). “Hidden Lives” is, however, open to a variety of interpretations and is another example of a work that can be interpreted psychoanalytically, as I suggested earlier. The prelinguistic babble at the beginning suggests the work of Julia Kristeva and her concept of the semiotic: a concept closely related to the Lacanian “real” but used by Kristeva and other feminist writers to delineate certain aspects of female experience and expression (Kristeva 1986).

4. Conclusion

As discussed, the voice in computer music illuminates and problematizes place, identity, and community, and these different concepts are interwoven in the voicescape, which is based on the idea of multiple projections of the voice into space. The voice presents or simulates urban and rural spaces, non-Western spaces, and historical spaces, but it also juxtaposes, breaks up, and morphs them. The voice creates and deconstructs political, gendered, and ethnic identities, while assemblages of voices sometimes evoke particular types of communities and their histories. However, the exploration of place, identity, and community with regard to voice in computer music is very different from that in poetry, in which the semantic import of the words may be greater but the range and manipulation of the voice less. While computer music is not so likely to use the voice as a conveyor of subjective experience, many of the pieces can be interpreted psychoanalytically. Given that writers and musicians tend to approach words and voice differently, it is in many ways surprising that it has not been more common for writers and musicians to work together, drawing on expertise from both areas. However, it is clear that computer music is loosely connected to multimedia endeavors that combine image, voice, and text, and that this might be one of the directions in which the computerized voice becomes more omnipresent in the future.


(1.) An important early precursor to this kind of meta-piece is Alvin Lucier's “I Am Sitting in a Room” (1969), made with a tape recorder rather than a computer. In this piece, the voice is recorded, played back, and rerecorded until the words become unintelligible and dissolve into the “sound” of the room. The voice narrates a text that is about this process.

(2.) Pieces such as “Idle Chatter” manipulate the formants of speech. Formants are “regions of the sound frequency spectrum in which energy is particularly concentrated, and are found also in instrumental sounds…. Whether a human voice is soprano, or baritone, there are generally and characteristically five formants, and this has been the basis for a large body of research on digital speech synthesis …. Linear Predictive Coding (LPC) is an early digital technique which analyses a speech utterance into a simplified data set that includes representation of the characteristic formants. This data set is then used to resynthesise a version of the utterance and, in the resynthesis step, modifications can be introduced. For example, speech can be transformed into song, and words can be lengthened or shortened in time, with or without pitch change” (Smith and Dean 2003, p. 114).
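The analysis and resynthesis steps that note 2 describes can be made concrete with a small sketch. It uses the textbook autocorrelation method with the Levinson-Durbin recursion, which is one standard way of computing linear prediction coefficients; it is not a reconstruction of Lansky's own software. The test signal, its single 700 Hz resonance (a real voice would need a higher prediction order to capture around five formants), the sample rate, and all function names are assumptions chosen for illustration.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] such that
    x[n] is approximated by sum_k a[k] * x[n - k].
    Returns (coefficients, residual error energy)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def resynthesise(coeffs, excitation):
    """Drive the all-pole filter defined by coeffs with an excitation."""
    y = []
    for n, e in enumerate(excitation):
        past = sum(c * y[n - k]
                   for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        y.append(e + past)
    return y

def resonance_hz(coeffs, sample_rate):
    """Centre frequency of the resonance in an order-2 predictor."""
    a1, a2 = coeffs
    radius = math.sqrt(-a2)
    return math.acos(a1 / (2.0 * radius)) * sample_rate / (2.0 * math.pi)

# Hypothetical "utterance": one decaying resonance at 700 Hz.
SR = 8000
theta = 2 * math.pi * 700 / SR
true_coeffs = [2 * 0.9 * math.cos(theta), -0.81]
utterance = resynthesise(true_coeffs, [1.0] + [0.0] * 399)

# Analysis: reduce the utterance to two coefficients.
est_coeffs, residual = levinson_durbin(autocorrelation(utterance, 2), 2)

# Resynthesis with a modification: a steady 200 Hz pulse train replaces
# the original excitation, so the pitch changes but the resonance stays.
pulses = [1.0 if n % 40 == 0 else 0.0 for n in range(800)]
sung = resynthesise(est_coeffs, pulses)
```

The point of the sketch is the separation the note describes: the coefficients hold the resonance (the formant), while pitch and duration belong to the excitation, so either can be altered at the resynthesis step without disturbing the other.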