Voice as Ecology: Voice Donation, Materiality, Identity

I first heard about voice donation while listening to “Being Siri,” an experimental audio piece about Erin Anderson donating her voice to the Boston-based voice donation company VocaliD. Like a digital blood bank of sorts, VocaliD provides a platform for donating one’s voice via digital audio recordings. These recordings are used to help technicians create a custom digital voice for a voiceless individual, providing an alternative to the predominantly white, male, mechanical-sounding assistive technologies used by people who cannot vocalize for themselves (think Stephen Hawking). VocaliD manufactures voices that better match a person’s race, gender, ethnicity, age, and unique personality. To me, VocaliD encapsulates the promise, complexity, and problematic nature of our current speech AI landscape and serves as an example of why we need to think critically about sound technologies, even when they appear to be wholly beneficial.

Given the extreme lack of sonic diversity in vocal assistive technologies, VocaliD provides a critically important service. But a closer look at both the rhetoric used by the organization and the material process involved in voice donation also amplifies the limits of overly simplistic, human-centric conceptions of voice. For instance, VocaliD rhetorically frames their service by persistently linking voice to humanity—to self, authenticity, individuality. Consider the following statements made by Rupal Patel, CEO and founder of VocaliD, in which she emphasizes the need for voice donation technology: 

“Here’s a way for us to acknowledge these individuals as unique human beings.” (Fast Company)

“I was talking to [a] girl we made a voice for. She told me that people are finally seeing her for who she really is.” (Medeiros)

These are just a few examples from a larger discourse that reinforces the connection between voice and humanity. VocaliD’s repeated claims that their unique vocal identities humanize individuals imply that one is not fully human unless one’s voice sounds human. This rhetoric positions voiceless individuals as less than human (at least until they pay for a customized human-sounding voice). 

VocaliD’s conflation of voice and humanity makes me wonder about the meaning of “human” in this context. For example, notions of humanity have been historically associated with Western whiteness—and deployed as a means of separating or distinguishing white people from Others—as Alexander Weheliye points out. Though VocaliD’s mission is to diversify manufactured voices, is a “human-sounding” voice still construed as a white voice? Does sounding human mean sounding white? Even if there is a bank of sonically diverse voices to choose from, does racial bias show up in the pacing, phrasing, or inflection caused by the vocal technology? 

Photocredit: iphonedigital @Flickr CC BY-SA

I am also disturbed by the rhetoric of humanity and individuality used by VocaliD because the company adopts the same rhetoric to describe the AI voices they sell to brands for media and smart products. Here’s an example of this rhetoric from the VocaliD AI website: “When you need a voice that resonates, evokes audience empathy, and sounds like you, rather than your competitors, VocaliD’s AI-powered vocal persona is the solution. Your voice — always on, where you need it when you need it.” Using similar rhetorical strategies to describe both voiceless people and products is dehumanizing. And yet, having a more diverse AI vocal mediascape, especially in terms of race, is crucially important since voice-activated machines and products are designed largely by white men who end up reinforcing the sonic color line.

Interestingly, the processes VocaliD uses to create a custom voice reveal that these voices are not, in fact, unique markers of humanity or individuality. It’s hard to find a detailed account of how VocaliD voices are made due to the company’s patents, but here are the basics: VocaliD does not transfer a donated voice directly to a voiceless person’s assistive technology. VocaliD technicians instead blend and digitally manipulate the donated voice with recordings of the noises a voiceless person can make (a laugh, a hum) to create a distinct new voice for the recipient. In other words, donated voices are skillful remixes that wouldn’t be possible without extracting vocal data and manipulating it with digital tools. Despite perpetuating narratives about voice, humanity, and authenticity, VocaliD’s creative blending of vocal material reveals that donated voices are the result of compositional processes that involve much more than people.

Further, considering VocaliD voices from a material rather than human-centric perspective amplifies something important about voices in general. All voices are composed of and grounded in an ecology. That is, voices emerge and are developed through a mixture of: (1) biological makeup (or technological makeup in the case of machines with voices); (2) specific environments and contexts (geography may determine the kind of accents humans have; AI voices have distinct sounds for their brands); (3) technologies (phones, computers, digital recorders and editors, software, and assistive technologies preserve, circulate, and amplify voices); and (4) others (humans often emulate the vocal patterns of the people they interact with most; many machine voices also sound like other machine voices). Put simply, all voices are intentionally and unintentionally composed over time—shaped by ever-changing bodily (and/or technological) states and engagements with the world. Voices are dynamic compositions by nature. Examining voice from a material standpoint shows that voices are not static markers of humanity; voices are responsive and malleable because they are the result of a complex ecology that involves much more than a “unique” human being. 

However, focusing solely on the material aspects of vocality leaves out people’s lived experiences of voice. And based on online videos of VocaliD recipients—like Delaney, a seventeen-year-old with cerebral palsy—VocaliD voices seem to live up to the company’s hype. Delaney appears delighted by her new voice, stating: “I was so excited to get my own voice. I used to have a computer voice and now I sound like a girl. I like that. And I talk more.” Delaney’s teachers also discuss how her new voice completely changed her demeanor. Whereas before Delaney was reluctant to use her assistive technology to speak, her new voice gives her confidence and a stronger sense of identity. As her teacher explains in the video, “she is really engaged in groups, she wants to share her answers, she’s excited to talk with friends. It’s been really nice to see.” For Delaney, a VocaliD voice represents a newfound sense of agency. 

It’s important to recognize this video is not necessarily representative of every VocaliD recipient’s experience, or even Delaney’s full experience. As Meryl Alper notes in Giving Voice, these types of news stories “portray technology as allowing individuals to ‘overcome’ their disability as an individual limitation, and are intended to be uplifting and inspirational for able-bodied audiences” (27). While we should be wary of the technological determinism in the video, observing Delaney use her VocaliD voice—and listening to the emotional responses of her mom and teachers—makes it difficult to deny that donated voices make a positive impact. For me, this video also gets at a larger truth about humans and voice: the ways we hear and understand our own voices, and the ways others interpret the sounds of our voices, matter a great deal. Voices are integral to our identities—to the ways we understand and think about ourselves and others—and the sounds of our voices have social and material consequences, as the SO! Gendered Voices Forum illustrates so clearly. 

An image VocaliD used to advertise themselves on Twitter. Image used for purposes of critique.

It’s worth repeating that VocaliD’s mission to diversify synthetic voices is incredibly important, especially given the restrictive vocal options available to voiceless individuals. It’s also necessary to acknowledge the company has limitations that end up reproducing the structural inequities it tries to address. As Alper observes, “In order to become a speech donor, one must have three to four hours of spare time to record their speech, access to a steady and strong Internet connection, and a quiet location in which to record” (162-63). With these obstacles to donating one’s voice in mind, it’s not surprising that all the VocaliD recipient videos I could find feature white people. Donating one’s voice is much easier for middle- to upper-class white people who have access to privacy, Internet, and leisure time.

This brief examination of VocaliD raises questions about what a more equitable future for vocal technologies might look/sound like. Though I don’t have the answer, I believe that to understand the fullness of voice, we can’t look at it from a single perspective. We need to account for the entire vocal ecology: the material (biological, technological, financial, etc.) conditions from which a voice emerges or is performed, and individual speakers’ understanding of their culture, race, ethnicity, gender, class, ability, sexuality, etc. An ecological approach to voice involves collaborating with people and their vocal needs and desires—something VocaliD models already. But it also involves accounting for material realities: How might we make the barriers preventing a more diverse voice ecosystem less difficult to navigate—especially for underrepresented groups? In short, we must treat voice holistically. Voices are more than people, more than technologies, more than contexts, more than sounds. Understanding voice means acknowledging the interconnectedness of these things and how that interconnectedness enables or precludes vocal possibilities. 

Featured image: 366-350 You can’t shut me up, Jennifer Moo, CC BY-ND

Steph Ceraso is an associate professor of digital writing and rhetoric at the University of Virginia. Her 2018 book, Sounding Composition: Multimodal Pedagogies for Embodied Listening, proposes an expansive approach to teaching with sound in the composition classroom. She also published a digital book in 2019 called Sound Never Tasted So Good: ‘Teaching’ Sensory Rhetorics—an exploration of writing, sound, rhetoric, and food. She is currently working on a book project that examines sonic forms of invention in various contexts.

REWIND! . . .If you liked this post, you may also dig:

What is a Voice?–Alexis Deighton MacIntyre

Mr. and Mrs. Talking Machine: The Euphonia, the Phonograph, and the Gendering of Nineteenth Century Mechanical Speech – J. Martin Vest

Only the Sound Itself?: Early Radio, Education, and Archives of “No-Sound”–Amanda Keeler

From Kitschy to Classy: Reviving the TR-808

Before Roland’s new TR-8 Rhythm Performer, a contemporary drum machine, was unveiled this year, the company released a series of promotional videos in which the machine’s designers sought out the original schematics and behavior of its predecessor, the TR-808, an iconic analog drum machine from the early 1980s. The TR-808 holds cultural cachet–most recently due to its use by Outkast, Baauer, and Kanye West–that Roland is interested in exploiting for the Rhythm Performer. The videos feature engineers closely examining the TR-808’s sound with an oscilloscope, trying to glean every last detail of the original’s personality.

"Roland TR-808" by Flickr user Ethan Hein, CC BY 2.0

Things were not always this way. Upon its initial release, the TR-808 was widely dismissed. Because it did not sound like “normal” acoustic drums, many established musicians questioned its utility and many ultimately disregarded it.  However, its “cheap” circuit-produced sounds became bargain-bin treasures for emerging artists. Since its sounds now play such a large part in the landscape of electronic music, this essay takes a historical perspective on the TR-808 Rhythm Composer’s use and circulation. By analyzing how Juan Atkins  and Marvin Gaye used the TR-808 in the early 1980s, I show how the TR-808 created a sonic space for drum machines in popular music.

Drum machines, though commonplace today, were once seen as kitschy tools for broke amateur musicians. As audio engineer Mitchell Sigman explains, the 808’s low, subsonic kick drum and “tick” snare marked a departure from the realistic, sampled drum sounds produced by high-end drum machines in the early 1980s. The 808 uses analog oscillators and white noise generators to make sounds resembling the components of a drum set (kick, snare, hi-hats, etc.). And, although these sounds are now commonplace, most contemporary artists use them precisely because they sound robotic, not because they sound like drums. Even though the 808 at first seemed a failed imitation of “real” drums, its comparatively low cost, originally around $1,195 at retail, attracted musicians who were unable to afford similar machines such as the LinnDrum, which retailed at more than twice that price. Roland advertised the machine as a “studio” for musicians on a budget, and even as the company began to disinvest from the 808–as testified by its decision to invest in marketing and research for other products–the 808’s so-called noises began their movement into mainstream American popular culture. In Detroit, electronic musician Juan Atkins, now known as one of the innovators of Detroit Techno, began experimenting with the machine’s sonic capabilities as early as 1981, while other artists such as Afrika Bambaataa were also using it in the Bronx by 1982.

"Industrial Records Studio 1980" by Flickr user Chris Carter, CC BY-NC-ND 2.0

A landmark year for the 808, 1982 saw the release of Juan Atkins’ “Clear” and Marvin Gaye’s “Sexual Healing,” tracks that illuminate the key features each musician realized in the 808.  For Atkins, the machine was something he felt could embody his early career; Atkins’ use of the 808 represented a pivotal moment in the American musical landscape, in which the futurism of the sound of synthesizers echoed other segments of the nation’s sonic imagination.  Gaye’s use of the 808 was a clear departure from his body of Motown work.  Although the instrument enabled different sorts of experimentation for the two, the new sorts of sounds the machine produced allowed them both to explore new possibilities for musical meaning.  Just as Trevor Pinch and Frank Trocco argue in Analog Days that analog synthesizers required validation by musicians such as Geoff Downes and Keith Emerson a decade before, the 808 broke into the mainstream through artistic experimentation.

Juan Atkins

In the early ‘80s, Juan Atkins was learning all he could about electronic music. As an able musician and the son of a concert promoter, Atkins was poised to couple his musical knowledge with a new breed of electronic musical instruments such as the 808. Together with a tightly knit group from Detroit, Atkins succeeded in promoting techno from a subculture to part of a global dance music scene. According to Atkins, the popularity of Detroit Techno came from its adoption in European urban centers like London and Berlin, which lent the music additional meaning stateside. In an interview with Dollop UK, Atkins emphasizes that the 808 was central to this musical development, as he calls the 808 (among other machines) “the foundation[s] of electronic dance music.”

"Cybotron-Clear" by Flickr user Alan Read, CC BY-NC-SA 2.0

Under the moniker of Cybotron, Atkins released the song “Clear” in 1982. The proto-techno soundscape of “Clear” pushes the 808 to the front of the mix, where it provides the track’s backbone. The solid, resonant kick, swishy open hi-hat, and piercing snare are decidedly machinic, departing from most rhythmic trends in popular music to date, since, as music scholar John Mowitt points out, a sense of “human feeling” comes hand-in-hand with drumming.

Atkins embraced these machine sounds and considered the 808 his “secret weapon.” Because it could be programmed, manipulated, and warped on the fly, it lent itself to a very particular kind of performance and music making that Atkins exploited. Rather than rely on the breaks that DJs could find on records, the 808 allowed Atkins to create beats to his own liking, placing kick, snare, and hi-hat hits where he found them to be most effective. Because of this flexibility, the kitschy sounds of the 808 set his music apart from other artists’ creations. The breaks Atkins produced on the 808, for example, were obviously impossible to find on vinyl.

"Juan Atkins" by Flickr user Rene Passet, CC BY-NC-ND 2.0

As Bleep43, an online EDM collective, notes, Atkins’ vision for electronic music would eventually pick up in London, where he relocated in the late eighties. Although Detroit Techno had achieved regional success in the US, record sales and performance dates in London signaled techno had found a larger audience abroad.  Although Atkins considers himself an eclectically “Detroit” artist,  he recognizes the impact of his work globally, and thinks of the modern Berlin flavor of minimal techno as a notably clever offshoot.

Marvin Gaye

Marvin Gaye’s struggles with depression, drug use, and relationship issues were the context for the subtle and understated 808 rhythmic backing he used in “Sexual Healing.” Gaye’s use of the 808 in “Sexual Healing” differs vastly from Atkins’ in “Clear,” operating as a tool of texture and punctuation, from the noticeable timbral changes to the clever placement of handclaps and clave in the composition. While Gaye recovered from his personal crises in Belgium, Columbia Records sent him an 808 because it was more portable than a studio drummer. It also offered sonic capabilities new and exciting to Gaye’s seasoned ears.

“Synths of Yesteryear 5/5” by Flickr user Jochen Wolters, CC BY-NC-ND 2.0

The drum machine’s prevalence in “Sexual Healing” shows how culturally marginal sounds move into mainstream musical culture. Gaye and his producers, already squarely in the center of popular American music, experimented with the sound of the 808 not in an attempt to break through, but rather to exercise musical flexibility. Since he was already an extremely successful pop artist, Gaye’s use of the 808 marks him as a sonic risk-taker and innovator, weaving the machine sounds of the 808 seamlessly but noticeably into R&B.

The machine’s normally powerful snare is invoked only at the quietest of velocities, often being replaced by the now iconic handclap. Unlike many contexts in which the 808 is heard, such as “Clear” and Afrika Bambaataa’s “Planet Rock,” “Sexual Healing” manages to keep everything low key. Matching the lyrics that espouse peace, harmony, and a sense of internal struggle (Whenever blue tear drops are falling/And my emotional stability is leaving me/Honey I know you’ll be there to relieve me/The love you give to me will free me), Gaye uses the 808 to evoke a surprisingly contemplative and serene atmosphere. It is this use that best shows the machine’s strange versatility, both as a harbinger of radically innovative musical genres and as a source of tranquil rhythmic textures for popular music.

Transformation

"Roland TR 909 Drum Machine Classic" by Flickr user Juliana Luz, CC BY-NC 2.0

Although Atkins’ and Gaye’s work exemplifies the TR-808’s early adoption, a long road toward mainstream popularity remained because of the more “realistic” sampled drum sounds that Roger Linn included in his high-end machines. The LM-1 and its successors (famous for hit singles like Billy Idol’s “White Wedding,” Hall and Oates’ “Maneater,” and Don Henley’s “Dirty Laundry”) made sampled drums the gold standard of computerized rhythmic backing. In fact, Roland’s next drum machine, the TR-909, implemented samples alongside synthesis. As a result, 808s couldn’t be given away until musical innovators gave their sounds gravitas (Sigman, 2011, 46).

The 808’s shift from sonically trashy and undesirable to ostensibly hip signifies a culturally important moment within the history of music technology. As shown in the examples above, subtle moments of economic, emotional, and geographic necessity seeded the popular music industry for today’s 808 boom. When techno eventually broke through to global popularity, the 808 was so fundamental to the canon of the genre that it has retained a central place in the sonic palette of musicians and producers.

 11:40, 6/11/14: This essay was re-edited for clarity, grammar, and flow by Jennifer Stoever.

Ian Dunham is a musician and music scholar originally from northeast Ohio. He earned a B.S. from Middle Tennessee State University in the Recording Industry within the College of Mass Communications, and then worked as a recording engineer in Nashville and Germany. Afterward, he earned an M.M. in Ethnomusicology from the University of Texas at Austin, where he also operated a home recording studio. He will start a PhD in Media Studies at Rutgers in the fall, where he will pursue research related to music and copyright.

Featured image: “1980 Roland TR-808” by Flickr user Joseph Holmes, CC BY-NC-ND 2.0

REWIND!…If you liked this post, you may also dig:

“Into the Woods: A Brief History of Wood Paneling on Synthesizers*”-Tara Rodgers

“The Blue Notes of Sampling”-Primus Luta

“Revising the Future of Music Technology”-Aaron Trammell
