A Brief History of Auto-Tune

This is the final article in Sounding Out!’s April Forum on “Sound and Technology.” Every Monday this month, you’ve heard new insights on this age-old pairing from the likes of Sounding Out! veteranos Aaron Trammell and Primus Luta along with new voices Andrew Salvati and Owen Marshall. These fast-forward folks have shared their thinking about everything from Auto-tune to techie manifestos. Today, Marshall helps us understand just why we want to shift pitch-time so darn bad. Wait, let me clean that up a little bit. . .so darn badly. . .no wait, run that back one more time. . .jjuuuuust a little bit more. . .so damn badly. Whew! There! Perfect! –JS, Editor-in-Chief

A recording engineer once told me a story about a time when he was tasked with “tuning” the lead vocals from a recording session (identifying details have been changed to protect the innocent). Polishing-up vocals is an increasingly common job in the recording business, with some dedicated vocal producers even making it their specialty. Being able to comp, tune, and repair the timing of a vocal take is now a standard skill set among engineers, but in this case things were not going smoothly. Whereas singers usually tend towards being either consistently sharp or flat (“men go flat, women go sharp” as another engineer explained), in this case the vocalist was all over the map, making it difficult to always know exactly what note they were even trying to hit. Complicating matters further was the fact that this band had a decidedly lo-fi, garage-y reputation, making your standard-issue, Glee-grade tuning job decidedly inappropriate.

Undaunted, our engineer pulled up the Auto-Tune plugin inside Pro-Tools and set to work tuning the vocal, to use his words, “artistically” – that is, not perfectly, but enough to keep it from being annoyingly off-key. When the band heard the result, however, they were incensed – “this sounds way too good! Do it again!” The engineer went back to work, this time tuning “even more artistically,” going so far as to pull the singer’s original performance out of tune here and there to compensate for necessary macro-level tuning changes elsewhere.

"Melodyne screencap" by Flickr user Ethan Hein, CC BY-NC-SA 2.0

The product of the tortuous process of tuning and re-tuning apparently satisfied the band, but the story left me puzzled… Why tune the track at all? If the band was so committed to not sounding overproduced, why go to such great lengths to make it sound like you didn’t mess with it? This, I was told, simply wasn’t an option. The engineer couldn’t in good conscience let the performance go un-tuned. Digital pitch correction, it seems, has become the rule, not the exception, so much so that the accepted solution for too much pitch correction is more pitch correction.

Since 1997, recording engineers have used Auto-Tune (or, more accurately, the growing pantheon of digital pitch correction plugins for which Auto-Tune, Kleenex-like, has become the household name) to fix pitchy vocal takes, lend T-Pain his signature vocal sound, and reveal the hidden vocal talents of political pundits. It’s the technology that can make the tone-deaf sing in key, make skilled singers perform more consistently, and make MLK sound like Akon. And at 17 years of age, “The Gerbil,” as some like to call Auto-Tune, is getting a little long in the tooth (certainly by meme standards). The next U.S. presidential election will include a contingent of voters who have never drawn air that wasn’t once rippled by Cher’s electronically warbling voice in the pre-chorus of “Believe.” A couple of years after that, the Auto-Tune patent will expire and its proprietary status will dissolve into the collective ownership of the public domain.

Growing pains aside, digital vocal tuning doesn’t seem to be leaving any time soon. Exact numbers are hard to come by, but it’s safe to say that the vast majority of commercial music produced in the last decade or so has most likely been digitally tuned. Future Music editor Daniel Griffiths has ballpark-estimated that, as early as 2010, pitch correction was used in about 99% of recorded music. Reports of its death are thus premature at best. If pitch correction seems banal it doesn’t mean it’s on the decline; rather, it’s a sign that we are increasingly accepting its underlying assumptions and internalizing the habits of thought and listening that go along with them.

Headlines in tech journalism are typically reserved for the newest, most groundbreaking gadgets. Often, though, the really interesting stuff only happens once a technology begins to lose its novelty, recede into the background, and quietly incorporate itself into fundamental ways we think about, perceive, and act in the world. Think, for example, about all the ways your embodied perceptual being has been shaped by and tuned-in to, say, the very computer or mobile device you’re reading this on. Setting value judgments aside for a moment, then, it’s worth thinking about where pitch correction technology came from, what assumptions underlie the way it works and how we work with it, and what it means that it feels like “old news.”

"Anti-Tune symbol"

As is often the case with new musical technologies, digital pitch correction has been the target of no small amount of controversy and even hate. The list of indictments typically includes the homogenization of music, the devaluation of “actual talent,” and the destruction of emotional authenticity. Suffice it to say, the technological possibility of ostensibly producing technically “pitch-perfect” performances has wreaked a fair amount of havoc on conventional ways of performing and evaluating music. As Primus Luta reminded us in his SO! piece on the powerful-yet-untranscribable “blue notes” that emerged from the idiosyncrasies of early hardware samplers, musical creativity is at least as much about digging-into and interrogating the apparent limits of a technology as it is about the successful removal of all obstacles to total control of the end result.

Paradoxically, it’s exactly in this spirit that others have come to the technology’s defense: Brian Eno, ever open to the unexpected creative agency of perplexing objects, credits the quantized sound of an overtaxed pitch corrector with renewing his interest in vocal performances. SO!’s own Osvaldo Oyola, channeling Walter Benjamin, has similarly offered a defense of Auto-Tune as a democratizing technology, one that both destabilizes conventional ideas about musical ability and allows everyone to sing in-tune, free from the “tyranny of talent and its proscriptive aesthetics.”

"Audiodatenkompression: Manowar, The Power of Thy Sword" by Wikimedia user Moehre1992, CC BY-SA 3.0

Jonathan Sterne, in his book MP3, offers an alternative to normative accounts of media technology (in this case, narratives either of the decline or rise of expressive technological potential) in the form of “compression histories” – accounts of how media technologies and practices directed towards increasing their efficiency, economy, and mobility can take on unintended cultural lives that reshape the very realities they were supposed to capture in the first place. The algorithms behind the MP3 format, for example, were based in part on psychoacoustic research into the nature of human hearing, framed primarily around the question of how many human voices the telephone company could fit into a limited bandwidth electrical cable while preserving signal intelligibility. The way compressed music files sound to us today, along with the way in which we typically acquire (illegally) and listen to them (distractedly), is deeply conditioned by the practical problems of early telephony. The model listener extracted from psychoacoustic research was created in an effort to learn about the way people listen. Over time, however, through our use of media technologies that have a simulated psychoacoustic subject built-in, we’ve actually learned collectively to listen like a psychoacoustic subject.

Pitch-time manipulation runs largely in parallel to Sterne’s bandwidth compression story. The ability to change a recorded sound’s pitch independently of its playback rate had its origins not in the realm of music technology, but in efforts to time-compress signals for faster communication. Instead of reducing a signal’s bandwidth, pitch manipulation technologies were pioneered to reduce the time required to push the message through the listener’s ears and into their brain. As early as the 1920s, the mechanism of the rotating playback head was being used to manipulate pitch and time interchangeably. By spinning a continuous playback head relative to the motion of the magnetic tape, researchers in electrical engineering, educational psychology, and pedagogy of the blind found that they could increase the playback rate of recorded voices without turning the speakers into chipmunks. Alternatively, they could rotate the head against a static piece of tape and allow a single moment of recorded sound to unfold continuously in time – a phenomenon that influenced the development of a quantum theory of information.

In the early days of recorded sound some people had found a metaphor for human thought in the path of a phonograph’s needle. When the needle became a head and that head began to spin, ideas about how we think, listen, and communicate followed suit: In 1954 Grant Fairbanks, the director of the University of Illinois’ Speech Research Laboratory, put forth an influential model of the speech-hearing mechanism as a system where the speaker’s conscious intention of what to say next is analogized to a tape recorder full of instructions, its drive “alternately started and stopped, and when the tape is stationary a given unit of instruction is reproduced by a moving scanning head” (136). Pitch-time changing was more a model for thinking than it was for singing, and its imagined applications were thus primarily non-musical.

Take for example the Eltro Information Rate Changer. The first commercially available dedicated pitch-time changer, the Eltro advertised its uses as including “pitch correction of helium speech as found in deep sea; Dictation speed testing for typing and steno; Transcribing of material directly to typewriter by adjusting speed of speech to typing ability; medical teaching of heart sounds, breathing sounds etc. by slow playback of these rapid occurrences.” (It was also, incidentally, used by Kubrick to produce the eerily deliberate vocal pacing of HAL 9000). In short, for the earliest “pitch-time correction” technologies, the pitch itself was largely a secondary concern, of interest primarily because it was desirable for the sake of intelligibility to pitch-change time-altered sounds into a more normal-sounding frequency range.

This coupling of time compression with pitch changing continued well into the era of digital processing. The Eventide Harmonizer, one of the first digital hardware pitch shifters, was initially used to pitch-correct episodes of “I Love Lucy” which had been time-compressed to free-up broadcast time for advertising. Similar broadcast time compression techniques have proliferated and become common in radio and television (see, for example, David Foster Wallace’s account of the “cashbox” compressor in his essay on an LA talk radio station). Speed listening technology initially developed for the visually impaired has similarly become a way of producing the audio “fine print” at the end of radio advertisements.

"H910 Harmonizer" by Wikimedia user Nalzatron, CC BY-SA 3.0

Though the popular conversation about Auto-Tune often leaves this part out, it’s hardly a secret that pitch-time correction is as much about saving time as it is about hitting the right note. As Auto-Tune inventor Andy Hildebrand put it,

[Auto-Tune’s] largest effect in the community is it’s changed the economics of sound studios…Before Auto-Tune, sound studios would spend a lot of time with singers, getting them on pitch and getting a good emotional performance. Now they just do the emotional performance, they don’t worry about the pitch, the singer goes home, and they fix it in the mix.

Whereas early pitch-shifters aimed to speed-up our consumption of recorded voices, the ones now used in recording are meant to reduce the actual time spent tracking musicians in studio. One of the implications of this framing is that emotion, pitch, and the performer take on a very particular relationship, one we can find sketched out in the Auto-Tune patent language:

Voices or instruments are out of tune when their pitch is not sufficiently close to standard pitches expected by the listener, given the harmonic fabric and genre of the ensemble. When voices or instruments are out of tune, the emotional qualities of the performance are lost. Correcting intonation, that is, measuring the actual pitch of a note and changing the measured pitch to a standard, solves this problem and restores the performance. (Emphasis mine. Similar passages can be found in Auto-Tune’s technical documentation.)

In the world according to Auto-Tune, the engineer is in the business of getting emotional signals from place to place. Emotion is the message, and pitch is the medium. Incorrect (i.e. unexpected) pitch therefore causes the emotion to be “lost.” While this formulation may strike some people as strange (for example, does it mean that we are unable to register the emotional qualities of a performance from singers who can’t hit notes reliably? Is there no emotionally expressive role for pitched performances that defy their genre’s expectations?), it makes perfect sense within the current division of labor and affective economy of the recording studio. It’s a framing that makes it possible, intelligible, and at least somewhat compulsory to have singers “express emotion” as a quality distinct from the notes they hit and have vocal producers fix up the actual pitches after the fact. Both this emotional model of the voice and the model of the psychoacoustic subject are useful frameworks for the particular purposes they serve. The trick is to pay attention to the ways we might find ourselves bending to fit them.

Owen Marshall is a PhD candidate in Science and Technology Studies at Cornell University. His dissertation research focuses on the articulation of embodied perceptual skills, technological systems, and economies of affect in the recording studio. He is particularly interested in the history and politics of pitch-time correction, cybernetics, and ideas and practices about sensory-technological attunement in general. 

Featured image: “Epic iPhone Auto-Tune App” by Flickr user Photo Giddy, CC BY-NC 2.0

REWIND!…If you liked this post, you may also dig:

“From the Archive #1: It is art?”-Jennifer Stoever

“Garageland! Authenticity and Musical Taste”-Aaron Trammell

“Evoking the Object: Physicality in the Digital Age of Music”-Primus Luta

DIY Histories: Podcasting the Past

This is article 3.0 in Sounding Out!’s April Forum on “Sound and Technology.” Every Monday this month, you’ll be hearing new insights on this age-old pairing from the likes of Sounding Out! veteranos Aaron Trammell and Primus Luta along with new voices Andrew Salvati and Owen Marshall. These fast-forward folks will share their thinking about everything from Auto-tune to techie manifestos. Today, Salvati asks if DIY podcasts are allowing ordinary people to remix the historical record. Let’s subscribe and press play. –JS, Editor-in-Chief

 

Was Alexander the Great as bad a person as Adolph Hitler? Will our modern civilization ever fall like civilizations from past eras?

According to Dan Carlin’s website, these are the kind of speculative “outside-the-box” perspectives one might expect from his long-running Hardcore History podcast. In Carlin’s hands, the podcast is a vehicle for presenting dramatic accounts of human history that are clearly meant to entertain, and are quite distinct from what we might recognize as academic history. Carlin, a radio commentator and former journalist, would likely agree with this assessment. As he frequently emphasizes, he is a “fan” of history and not a professional. But while there are particularities of training, perspective, and resources that may distinguish professional and popular historians, an oppositional binary between these kinds of historymakers risks overlooking the plurality of historical interpretation. Instead, we might notice how history podcasters like Carlin utilize this new sonic medium to continue a tradition of oral storytelling that in the West goes back to Herodotus, and has since been the primary means of marginalized and oppressed groups to preserve cultural memory. As a way for hobbyists and amateurs to create and share their own do-it-yourself (DIY) histories, I argue that audio podcasting suggests a democratization of historical inquiry that greatly expands the possibilities for everyone, as Carl Becker once said, to become his or her own historian.

"Modified Podcast Logo with My Headphones Photoshopped On" by Flickr user Colleen AF Venable, CC BY-SA 2.0

Frequently listed among iTunes’ top society and culture podcasts, and cited by several history podcasters as the inspiration for their own creations, Hardcore History owes its popularity to Carlin’s unconventional and dramatic recounting of notable (but sometimes obscure) historical topics, in which he will often elaborate historical-structural changes through contemporizing metaphors. Connecting the distant past to more immediate analogies of present life is the core of Carlin’s explanatory method. This form of explanation is quite distinct from the output of academic historians, who assiduously avoid this sort of “presentism.” But as the late Roy Rosenzweig (2000) has suggested, it is precisely this kind of conscious and practical engagement with the past – and not the litany of facts in dry-as-dust textbooks – that appeals to non-historians. Rosenzweig and David Thelen have found that most Americans perceive a close connection with the past, especially as it relates to the present, through their personal and family life. Using the medium of podcasting to talk about the past is a new way of making the past vital to the present needs and interests of most people. This is how podcasters make sense of history in their own terms. It is DIY insofar as it is distinct from professional discourse, and less encompassing (and expensive) than video methods.

Podcasts can present an alternative model for making sense of the past – one that underscores the historymaker’s interpretive imprints, and which cultivates a sense of liveness and interactivity. Admittedly, Dan Carlin’s own style can be rambling and melodramatic. But to the extent that he practices history as a kind of storytelling, and acknowledges his own interpretive interventions, Hardcore History, like other independently produced history podcasts (I am thinking about a few of my favorites – Revolutions, The History Chicks, and The British History Podcast), gives its listeners the sense that history is not necessarily something that is “out there,” or distant from us in the present, but part of a living conversation in the present. Podcasters construct a dialogue about history which, when combined with the interactivity offered by website forums, draws the listener into a participatory engagement. As Rosenzweig and Thelen explain, Americans interested in popular history are skeptical of “historical presentations that did not give them credit for their critical abilities – commercialized histories on television or textbook-driven high school classes.” Such analytic skills are precisely what we as historians and teachers aim to develop in our students. Podcasting, when it constructs a collaborative dialogue in which audience and producer explore history together, can be both a valuable supplement to traditional historiography and a way for people to connect with the past that overcomes the abstraction of textbooks and video.

"The histomap - four thousand years of world history" by Flickr user 图表汇, CC BY-SA 2.0

But is the podcast as intellectually freeing as it might seem? Jonathan Sterne (et al., 2008) notes that podcasting encompasses a range of technologies and practices that do not necessarily determine the liberation of content production from the dominance of established institutions and economies of scale. Indeed, there are many professional historians and media producers who have utilized audio (and sometimes video) podcasting to reach a wider audience. While the History Channel has not (yet) entered the field, one can surely imagine the implications of corporate-produced history content that homogenizes local and cultural particularities, or which presents globalized capitalism as a natural or inevitable historical trajectory.

The kinds of podcasts I am concerned with, however, are created by independent producers taking a DIY approach to content production and historical inquiry. While their resources and motivations may differ, podcasts produced on personal computers in the podcaster’s spare time have an intimate, handcrafted feel that I find to be more appealing than, say, a podcasted lecture. Ideally, what results is an intimate and episodic performance in which podcasters can, to use Andreas Duus Pape’s phrasing from an earlier Sounding Out! post, “whisper in the ears” of listeners. This intimacy is heightened by the means of access – when I download a particular podcast, transfer it to my iPhone, and listen on my commute, I am inviting the podcaster into my personal sonic space.

Complementing this sense of intimacy is a DIY approach to history practiced by podcasters who are neither professional historians nor professional media producers. Relatively cheap and easy to produce (assuming the necessary equipment and leisure time), podcasting presents a low barrier of entry for history fans inspired to use new media technologies to share their passion with other history fans and the general public. Though a few podcasters acknowledge that they have had some university training in history, they are usually proud of their amateur status. The History Chicks, for example, “don’t claim to know it all,” and say that any pretense toward a comprehensive history “would be kinda boring.” Podcasting and historical inquiry are hobbies, and their DIY history projects allow the relative freedom to have fun exploring and talking about their favorite subject matter – without having to conform to fussy disciplinary constraints. For Jamie Jeffers, creator of The British History Podcast, most people are alienated by the way history gets taught in school. However, “almost everyone loves stories,” he says, and podcasting “allows us to reconnect to that ancient tradition of oral histories.” Others justify the hobby more bluntly. For the History Chicks, women in history is “a perfect topic to sit down and chat about.” Talking about history, arguing about it, is something that history fans (and I include myself here) enjoy. Podcasting can broaden this conversation.

Despite my optimistic tone in this post, however, I do not want to suggest uncritically that the democratizing, DIY aspects that I have noted (among just a handful of podcasts) comprise the entire potential of the format. Nuancing a common opposition between the bottom-up potential of podcasting and the prevalent top-down (commercial) model of broadcasting, for example, Sterne and others have asserted that rather than constituting a disruptive technology – as Richard Berry has suggested – podcasting realizes “an alternate cultural model of broadcasting.” Referring to earlier models of broadcasting – such as those Susan Douglas (1992) described in her classic study of early amateur radio – Sterne and company assert that analyses of podcasting should focus not on the technology itself, but on practice; not on the challenge podcasting poses to corporate dominance in broadcasting, but rather on how it might offer a pluralistic model that permits both commercial/elite and DIY/amateur productions.

"Podcast in Retro" by Flickr user David Shortle, CC BY-NC 2.0

Adapting these recommendations, I argue that podcasting can help us conceptualize an alternate cultural model of history – one that invites reconsideration of what counts as historical knowledge and interpretation, and of who is empowered to construct and access historical discourse. Rather than privileging the empirical or objective histories of academic/professional historians, such an expanded model would recognize the cultural legitimacy of diverse forms of historiographical expression. In other words, it would recognize that history is never “just” history, or “just” facts, but is always a contingent and situated form of knowledge, and that, as Keith Jenkins writes, “interpretations at (say) the ‘centre’ of our culture are not there because they are true or methodologically correct … but because they are aligned to the dominant discursive practices: again power/knowledge” (1991/2003, p. 79). But to reiterate the caution of Sterne et al., such an alternative model would not necessarily determine a role-reversal between professional and DIY histories. Rather, through podcasting, we might discover alternative ways of performing history as a new oral tradition – of becoming each of us our own historian.

Andrew J. Salvati is a Media Studies Ph.D. candidate at Rutgers University. His interests include the history of television and media technologies, theory and philosophy of history, and representations of history in media contexts. Additional interests include play, authenticity, the sublime, and the absurd. Andrew has co-authored a book chapter with colleague Jonathan Bullinger titled “Selective Authenticity and the Playable Past” in the recent edited volume Playing With the Past (2013), and has written a recent blog post for Play the Past titled “The Play of History.”

Featured image: “Podcasts anywhere anytime” by Flickr user Francois, CC BY 2.0

REWIND!…If you liked this post, you may also dig:

“Music is not Bread: A Comment on the Economics of Podcasting”-Andreas Duus Pape

“Pushing Record: Labors of Love, and the iTunes Playlist”-Aaron Trammell

“Only the Sound Itself?: Early Radio, Education, and Archives of ‘No-Sound’”-Amanda Keeler

Sound Off! // Comment Klatsch #16: Sound and Pleasure

klatsch \KLAHCH\, noun: A casual gathering of people, esp. for refreshments and informal conversation [German Klatsch, from klatschen, to gossip, make a sharp noise, of imitative origin.] (Dictionary.com)

Dear Readers:  Team SO! thought that we would warm up the dance floor for our upcoming Summer Series on Sound and Pleasure (peep the Call for Posts here. . .pitches are due by 4/15/14).   –J. Stoever, Editor-in-Chief

What sounds give you pleasure and why? 

Comment Klatsch logo courtesy of The Infatuated on Flickr.

 

This Is How You Listen: Reading Critically Junot Díaz’s Audiobook

Last month, T.M. Luhrmann compared the experience of reading a written book with that of listening to one in the New York Times article “Audiobooks and the Return of Storytelling.” Luhrmann points out how audiobook sales jumped 20% in 2012, whereas total industry book sales went down 1%. From the looks of it, books have benefited from audiobook sales, but in literary studies, print remains the primary vehicle for analysis. Might listening to an audiobook actually change how we critically read a text?

As I listened to Junot Díaz narrate This Is How You Lose Her (2012), the first book Díaz has read as an audiobook and the first book of short stories the author has published since 1996’s Drown, I wondered how his reading influenced how I interpreted the text. Díaz’s reading sounds less like regular speech and more like a performance, with its own cadence and rhythm.

This post approaches the audiobook as a text in itself, coming from a sound studies perspective. I attempt to conceptualize the idea of “close listening” as a methodology akin to “close reading” in literary studies. I listen for how Díaz reads the text but more specifically how the reading itself becomes a way of authoring the text. Ultimately, I argue that Díaz’s reading becomes a re-authoring of the text: re-writing the text sonically. On a broader level, I hope to add to the conversation of what it means to read an audiobook, as Birgitte Stougaard Pedersen and Iben Have have brought up in “Conceptualising the Audiobook Experience.” Using This Is How You Lose Her, I show that reading an audiobook means engaging with the text from the angle of the ear, and that close listening can become an aural reading practice that relies not so much on the visual texts, but on aural cues from the narrator.

Not one but two (!) copies of This Is How You Lose Her

This Is How You Lose Her revolves around Yunior, a young Dominican immigrant who grows up in New Jersey and who ends up as a professor in Boston, and the many loves he has had or that he has encountered growing up. The stories trace his progress from a young, recently arrived Yunior, to a tenured, mature Yunior, showcasing certain relationships that influence how he relates to women—in sum, illustrating how he loses the women he loves. Throughout the short story collection, Díaz also calls attention to other relationships that may influence Yunior’s perspective, for example, his brother’s attachments with women, especially toward the end of his young life as he battled cancer, and his father’s relationship with his mistress, a Dominican woman who lived in New Jersey. At the end, Díaz illuminates how a mujeriego (womanizer) like Yunior comes to be; the short stories indicate that Yunior is as much a product of his environment as he is a seller of the merchandise.

Díaz is not a professional audiobook narrator. Although Díaz has done live readings, reading the full-length version of a book one has written is a different exercise. The Penguin Audio version of the collection is based on the actual short story collection (in other words, unabridged), so it does not contain additional stories or behind the scenes interviews. Technically, it is no different than the print version.

Listening to authors read their own work has value beyond the pleasure of hearing them read their text. Scholarly writing on audiobooks has emphasized the experience of listening to an audiobook for pleasure (like Deborah Phillips’ “Talking Books: The Encounter of Literature and Technology in the Audiobook” and James Shokoff’s “What Is An Audiobook?”), but it wasn’t until the 2011 edited collection Audiobooks, Literature, and Sound Studies that audiobooks were considered on their own instead of as extensions of the literature they were based on. The allure of doing this scholarly exercise with the audiobook version of This Is How You Lose Her is that Díaz’s delivery of the text is uncommon at the least.

"Junot Diaz at the Southern Festival of Books" by Flickr user Stacey Kizer, CC BY-NC 2.0

Talking about Junot Díaz’s readerly voice requires tuning into conversations about his writerly voice. In many reviews of Díaz’s books, writers discuss how deftly Díaz conveys a writer’s voice in his text, indicating that his success lies in giving his characters a very clear voice, or at least in giving one to Yunior. Michiko Kakutani, for example, points out how “Junot Díaz has one of the most distinctive and magnetic voices in contemporary fiction: limber, streetwise, caffeinated and wonderfully eclectic, capable of conjuring for the reader everything from the sorrows of Dominican history to the banalities of life in New Jersey.” Although this quotation is in reference to Díaz’s second book, The Brief Wondrous Life of Oscar Wao, it describes Díaz’s writing in terms of his voice instead of, for instance, in terms of his use of metaphors or choice of subject.

Richard Wolinsky, in his Guernica interview with Díaz, sees an overlap between Yunior and Díaz: “He’s [Yunior] got a very distinct voice, and it’s a voice that’s informed by [Diaz’s] own reading, particularly science fiction and fantasy.” Although Díaz has pointed out that Yunior is loosely based on events that have happened to him, Wolinsky “hears” Díaz in his main character. The tone and the language Yunior uses are read as a reflection of Díaz.

Conversations about the voice of the writer point to a sensibility about sound, but are often limited to a written text. Anna Barnet, in an interview with Junot Díaz, states “His two principal linguistic registers (‘this kind of crazy Caribbean language and music’ and ‘this sort of African-American-infused American vernacular’) grind against each other along with the many other voices he ventriloquizes in his writing.” Barnet reminds readers that Díaz’s writing style is based in spoken language—particularly Díaz’s spoken language. This language of “voice” to describe a writer’s style (or, specifically, a writer’s ability to convey a clear sense of who the character is and/or their views) is commonplace but gives the impression that there is a sonic aspect to an author’s work, when in reality it is but a metaphor for something that occurs at the level of text.

A critical reading of a text that includes the audiobook rendition allows critics to add substance to those references to “voice.” In Junot Díaz’s case, it is possible that readers encounter him first through written text, and so have an expectation of what Díaz (or Yunior) would sound like live. In my textual analysis of eight audiobook reviews (and one book review that included a mention of the narration in the audiobook), most listeners showed some sort of discomfort with Díaz’s narration. One reviewer, for example, took issue with the “smoothness” of Díaz’s narration: “At times the reading was a little shaky and uneven.” Another reviewer stated that “at times his cadence is choppy, with odd pauses and emphasis on strange words that detract from the overall experience.” Reviewers also took issue with Díaz’s pace, which is characterized by pauses in places that may not seem normal in casual American speech. These statements hint at a “weird” quality in Díaz’s speech, something that does not come through when Díaz has a casual conversation. (Listen to this podcast episode of NPR’s Alt. Latino guest-starring Díaz and compare with this video of him reading part of This Is How You Lose Her.) Although one blogger pointed out that Díaz sounded “professorial” in the reading, others used the words “native,” “authenticity,” “Dominican” and even “Jersey accent” to describe how Díaz sounded. It is unclear how these reviewers define “native” or “authentic.”

"Junot Diaz" by Flickr user ALA The American Library Association, CC BY-NC-SA 2.0

Connecting sound to authenticity implies that Dominicans can only sound a certain way, or that the audio narration is lacking when it does not represent a “typical” Dominican voice. To the extent that Díaz is Dominican, his voice is that of a Dominican male who has grown up in the Northeastern United States. His uneven audio narration creates a feeling of sonic unintelligibility in the listener, similar to the effect of including Spanish words in the written text. Díaz-as-narrator can make a listener uncomfortable, and by extension forces that listener to listen.

The sonic unintelligibility also relies on the text, on how Díaz plays with language by switching back and forth from English to Spanish. Díaz mentions in an interview with Marva Hinton that some readers are not happy with his choice of Spanglish in his writing: “There [are] folks who hear one Spanish word, and they’re convinced this is some sort of immigrant conspiracy.” Farther down in the same article, Díaz refers to his mix of Spanish and English (and a particular kind of Spanish and English at that, since he moves among Standard American English, African American Vernacular English, and Dominican Spanish) as “opaque language.” There’s a connection between the kind of “opaqueness” that Spanish gives and the unintelligible effect of Díaz reading his work.

An example of how sonic unintelligibility operates in the audiobook is the first story, “The Sun, The Moon, The Stars.” This opener, told in first person, revolves around one of Yunior’s break-ups; Yunior and his girlfriend Magdalena, on whom he cheated, go to the Dominican Republic on a trip they had planned before she found out about the affair. It frames the book as an in-depth analysis of loves lost, from the man who keeps losing them. It also sets the tone sonically for the audiobook reading: after the introduction of the book, a snippet of bachata music comes on, and then makes way for Díaz, who reads the title of the story. This is the pattern of the book: slices of bachata, followed by Díaz’s narration.

His voice is characterized by a slight sing-song cadence that is reminiscent of a Dominican Spanish accent. If this were in Spanish, it might be easier to lose track of the cadence, but in English it sounds like a disembodied accent. I showcase the swing in Díaz’s narration by alternating capital letters and lower-case letters: “Her FAther, who usually would treat me like his HIjo, CALLS me an ASShole on the PHONE, SOUNDS like he’s STRANgling himself with the cord.” The voice seems to float for a while until Díaz arrives at the end of a paragraph or a series of sentences, and then it sinks. Moreover, this pattern does not change when Díaz switches characters: it’s hard to tell Yunior apart from Magdalena unless the listener pays close attention to when the narrator is switching characters and/or when the narrator uses a pronoun. The same effect comes from the odd pauses in the author’s narration: “Oh God, she wailed. Oh. My God.”


The choppiness and the emphasis in the reading dislocate the listener, much as Spanish phrases or the lack of quotation marks in the text dislocate a reader who does not understand Spanish or who depends on quotation marks to make sense of the prose. Also, this story focuses on Magdalena withdrawing from Yunior and not communicating with him. The tone, cadence, and sound of Díaz’s voice can be read as mirroring the relationship between Yunior and Magdalena (and the other women in the text): the sonic unintelligibility is manifest at the level of plot through Yunior’s relationships.

Although many audiobook reviewers may consider the plot in their reviews, part of what makes an audiobook stand out is the performance of the text. I take my cues from audiobook reviewers and critically consider my listening experience of This Is How You Lose Her and how it can become the basis for a critical interpretation of the text. My analysis underscores that having an author read a text can provide a different way into analyzing the text and prompts readers to pay attention to sound. If, as Shokoff asserts, most audiobook readers listen to an audiobook while doing something else, Díaz shows that listening closely to the audio text can be as rewarding as reading a book.

Featured Image: “Junot Diaz” by WBUR Boston’s NPR News Station, Attribution-NonCommercial-NoDerivs License

Liana Silva-Ford is co-founder and Managing Editor of Sounding Out!.

REWIND! . . . If you liked this post, you may also dig:

“‘Or Does It Explode?’ Sounding Out the U.S. Metropolis in Hansberry’s A Raisin in the Sun”-Liana Silva-Ford

“‘HOW YOU SOUND??’: The Poet’s Voice, Aura, and the Challenge of Listening to Poetry”-John Hyland

“Fade to Black, Old Sport: How Hip-Hop Amplifies Baz Luhrmann’s The Great Gatsby”-Regina Bradley
