Tag Archive | MP3

Tape Hiss, Compression, and the Stubborn Materiality of Sonic Diaspora

In an article for Pitchfork, music critic Adam Ward reminisces about digital music files that sound as if they’re “being played through a payphone,” and calls the extreme compression of the low-quality MP3 “this generation’s vinyl crackle or skipping CD.” The crackles, hisses, and compression that characterize such sound files are what I term “encoded materiality.”  Focusing on the encoded materiality of the digital helps us to reconfigure our approach to sonic media, understanding how the compression of early MP3s and tape hiss remind us not only of lost fidelity, but also of the richness of exchange. These warm and stubborn sonic impurities, having been encoded in our digital listening formats and thus achieving repeatability and variability, act as persistent reminders that we can think diaspora beyond melancholy and authenticity, sidestepping the questions of purity and loss that so often characterize dialogues in the field of diaspora studies.

In Mechanisms, his work on electronic textuality, Matthew Kirschenbaum proposes a “material matrix governing writing and inscription in all forms” composed of four elements: “erasure, variability, repeatability, and survivability” (xiii). The defects of sonic technology that become encoded in digital files are one such type of inscription. Tape hiss and other recording accidents–such as Casey Kasem ruining your attempt to tape record the first Western song you fell in love with after leaving Hong Kong by fading the outro and butting in with his banter–achieve repetition and survival during the digital encoding process, becoming a welcome reminder of time and place. Such materiality helps us to better understand the politics of diaspora. It clues us in to how the elements of textual encoding (erasure, variability, repeatability, and survivability) become embedded within diaspora’s complex logic.

Image by DraconianRain @Flickr CC BY-NC.

To think through these complex moments of exchange, let me offer a story about my experience with tape hiss. I grew up listening to music touched by this particular sonic grain: a ground level of noise upon which my sonic experiences were built. After I received my first iPod in 2005, I connected a tape player to the input of my computer, recorded a stack of tapes, and then manually split them into MP3s—pseudo-piracy committed in earnest. A few weeks ago, I dug up these same files and put them on my phone, once again returning the buried albums to their former glory on a constant rotation playlist. I keep returning to these particular files, rather than finding the now easily available digital versions, because I admire the survivability of their materiality. The materiality of these tracks allowed me to trace the complexity of my own history—the tape hiss is just as much a part of this history as the songs themselves.

After first moving to Canada from Hong Kong, my family and I established ourselves by unswervingly performing the same routine each weekend. We would have late lunch at our favorite dim sum restaurant, drive around for a bit, and then relax at home; there wasn’t much to do in the ex-urbs of Toronto. On those drives, we listened to selections from a stack of cassette tapes in the glovebox of our old Pontiac Bonneville. Sally Yeh’s 1987 album Blessing was on constant rotation and received its fair share of wear. This was one of the tapes I recorded to my computer, destined for digitization.

Because I hit the record button a few seconds early, my MP3 of Sally Yeh’s Blessing begins with a few seconds of silence. It’s enough to trick me into thinking that the song isn’t playing. In a quiet enough spot, I can hear that it’s actually tape hiss. No matter where I am, on the road or in the shower, my mind fills in the blank with the thick ker-chunk of the cassette entering that Pontiac stereo right before that familiar tape hiss would fill the car, always giving us a few sometimes-needed, sometimes-awkward moments of silence before the music started. The sonic texture of that tape stems from its material nature as plastic and metal. The hiss itself is due to the size of the magnetized particles on the plastic. Because of these sounds, the song tells its own story. It recalls our shared sonic and material experience as I migrate it from device to device.

Before Blessing made its way into our car, it was one of the few cassette tapes that my parents carefully packed into a dozen cardboard boxes and shipped by sea to Canada in the late 1980s. This was in the midst of the countrywide protests in China that led to the events at Tiananmen Square. That insistent ker-chunk of plastic on metal that my brain inserts every time I play the MP3s keeps my experience of the music grounded in this earlier history, too. Strange that a fluffy pop song would remind me of the serious political strife taking place on the doorstep of a Hong Kong nervously awaiting its “handover.” This sonic anchor’s ability to recall to me these snippets of history, at once personal, national, and transpacific, has been crucial in the development of my own diasporic identity. Listening to this particular recording of Blessing helps me to keep track of my self and my history.


The act of withdrawal that many of us perform in order to interface with our sonic technologies, as Alexander Weheliye shows in his reading of Ralph Ellison’s Invisible Man in Phonographies, can play a powerful role in understanding one’s own racial subjectivity. Weheliye focuses on the scene in which the titular narrator-protagonist retreats to a subterranean cave-like space to listen to Louis Armstrong’s recorded, disembodied voice in complete solitude. He asserts that the narrator builds his own subjectivity through a recognition of the self by projecting that self onto Louis Armstrong’s “vocal apparatus,” that is, his voice coming through a phonograph (143). “The phonograph’s ability to disconnect the singing voice from its face, or rather to replace it with a technological visage, further heightens its materiality, which impels the protagonist to imbue Armstrong’s voice with a surplus of signification” (Weheliye 145).

More than a black and white photo or a stern historical lecture from the elders, the “heightened materiality” of the digital format, a type of “technological visage,” cathects my own diasporic history most forcefully to the sonic anchor of tape hiss because it acts as a “voice without a face” in the same way as the phonographic Armstrong. But despite the privacy of the phonographic listening act in this scenario, Weheliye suggests that

the phonographic listening modality also bears the traces of sociality… since the listening subject is drawn out of him/herself by encountering the technologically mediated sounds of other subjects—we might even go so far as to suggest that the phonograph itself functions as a subject, especially in its interfacings with various humans. (165)

So it is with similar sonic technologies that can encourage the “eschewing [of] the social” such as iPhones, CDs, and, yes, cassette tapes. Like Ellison’s narrator interfacing with the mechanical apparatus that conveys Armstrong’s voice, the insistent “defects” kept on the digital file keep the mechanism of its delivery at the fore, allowing me first to understand that diasporic feeling of dis-ease—and to imagine beyond it.

Sally Yeh's "Blessing." Image used with permission by the author.

What I gain from the digital yet still stubbornly material tape of Blessing is not any overt lyrical or thematic gesture to a diasporic subjectivity on the artist’s part, but rather an induction into what Giorgio Agamben calls “the idea of an inessential commonality, a solidarity that in no way concerns an essence” (18), or perhaps a community based on “belonging itself” (84). Likewise, Weheliye’s “diasporic citizenship coarticulate[s] the national and transnational instead of playing a zero-sum game with political identification” (369). If diaspora is defined by the perpetual desire to seek an imagined originary point of true identity that inevitably leads to melancholy, as psychoanalysis maintains, tape hiss and other encoded materialities turn the gaze away from the mists of origin, validating instead the development of diasporic identity in the aftermath of emigration. Of course, loss and melancholy are legitimate psychic aspects of the diasporic experience, as persuasively demonstrated by scholars such as David Eng, Shinhee Han, and Anne Anlin Cheng, but they neither define the whole experience nor are they mutually exclusive to it. It is in this way that we can think of diaspora as a community of belonging by becoming.

A consideration of the stubborn ways that materiality is encoded in the digital helps us to think of diaspora as more than psychic fait accompli—it is also a ‘coming community’ characterized by the process of belonging. Kirschenbaum’s matrix provides the right foundation for a study which considers how material inscriptions are related to our diasporic lives. The inscription that defined my diasporic becoming came from the cassette tape that travelled across the ocean in a boat for five weeks, escaped erasure, survived repeated playings, became digital, and lives on now as a hissing reminder of our history of emigration. What else may we find about our own becoming and belonging if we attune our ears to the encoded materialities of sonic diaspora?

Featured image “Decayed Cassette” by darkday @Flickr CC BY.

Chris Chien is a Ph.D. candidate in the Department of English at the University of Southern California working variously in the areas of sound, diaspora and transpacific studies, all with a distinctly queer bent. He completed his M.A. in English Literature at Loyola Marymount University and his Honors B.A. in English Literature and Latin at the University of Toronto. Chris has presented papers on angelic gender fluidity in John Milton’s Paradise Lost and post-colonial affect in the work of Herman Melville and Amitav Ghosh at the Rocky Mountain MLA and South Atlantic MLA conferences respectively. He is currently developing a paper that examines the performativity of diaspora, masculinity, and the capitalist ethos in Eddie Huang’s memoir Fresh Off the Boat and its adaptation as an ABC sitcom.

REWIND! . . . If you liked this post, you may also dig:

Pushing Play: What Makes the Portable Tape Recorder Interesting? — Gus Stadler

SANDRA BLAND: #SayHerName Loud or Not at All — Regina N. Bradley

Listening (Loudly) to Spanish-language Radio — Dolores Inés Casillas

Brasil Ao Vivo!: The Sonic Pleasures of Liveness in Brazilian Popular Culture — Kariann Goldschmitt


A Brief History of Auto-Tune

This is the final article in Sounding Out!’s April Forum on “Sound and Technology.” Every Monday this month, you’ve heard new insights on this age-old pairing from the likes of Sounding Out! veteranos Aaron Trammell and Primus Luta along with new voices Andrew Salvati and Owen Marshall. These fast-forward folks have shared their thinking about everything from Auto-tune to techie manifestos. Today, Marshall helps us understand just why we want to shift pitch-time so darn bad. Wait, let me clean that up a little bit. . .so darn badly. . .no wait, run that back one more time. . .jjuuuuust a little bit more. . .so damn badly. Whew! There! Perfect!–JS, Editor-in-Chief

A recording engineer once told me a story about a time when he was tasked with “tuning” the lead vocals from a recording session (identifying details have been changed to protect the innocent). Polishing-up vocals is an increasingly common job in the recording business, with some dedicated vocal producers even making it their specialty. Being able to comp, tune, and repair the timing of a vocal take is now a standard skill set among engineers, but in this case things were not going smoothly. Whereas singers usually tend towards being either consistently sharp or flat (“men go flat, women go sharp” as another engineer explained), in this case the vocalist was all over the map, making it difficult to always know exactly what note they were even trying to hit. Complicating matters further was the fact that this band had a decidedly lo-fi, garage-y reputation, making your standard-issue, Glee-grade tuning job decidedly inappropriate.

Undaunted, our engineer pulled up the Auto-Tune plugin inside Pro Tools and set to work tuning the vocal, to use his words, “artistically” – that is, not perfectly, but enough to keep it from being annoyingly off-key. When the band heard the result, however, they were incensed – “this sounds way too good! Do it again!” The engineer went back to work, this time tuning “even more artistically,” going so far as to pull the singer’s original performance out of tune here and there to compensate for necessary macro-level tuning changes elsewhere.

"Melodyne screencap" by Flickr user Ethan Hein, CC BY-NC-SA 2.0

The product of the tortuous process of tuning and re-tuning apparently satisfied the band, but the story left me puzzled… Why tune the track at all? If the band was so committed to not sounding overproduced, why go to such great lengths to make it sound like you didn’t mess with it? This, I was told, simply wasn’t an option. The engineer couldn’t in good conscience let the performance go un-tuned. Digital pitch correction, it seems, has become the rule, not the exception, so much so that the accepted solution for too much pitch correction is more pitch correction.

Since 1997, recording engineers have used Auto-Tune (or, more accurately, the growing pantheon of digital pitch correction plugins for which Auto-Tune, Kleenex-like, has become the household name) to fix pitchy vocal takes, lend T-Pain his signature vocal sound, and reveal the hidden vocal talents of political pundits. It’s the technology that can make the tone-deaf sing in key, make skilled singers perform more consistently, and make MLK sound like Akon. And at 17 years of age, “The Gerbil,” as some like to call Auto-Tune, is getting a little long in the tooth (certainly by meme standards). The next U.S. presidential election will include a contingent of voters who have never drawn air that wasn’t once rippled by Cher’s electronically warbling voice in the pre-chorus of “Believe.” A couple of years after that, the Auto-Tune patent will expire and its proprietary status will dissolve into the collective ownership of the public domain.


Growing pains aside, digital vocal tuning doesn’t seem to be leaving any time soon. Exact numbers are hard to come by, but it’s safe to say that the vast majority of commercial music produced in the last decade or so has most likely been digitally tuned. Future Music editor Daniel Griffiths has ballpark-estimated that, as early as 2010, pitch correction was used in about 99% of recorded music. Reports of its death are thus premature at best. If pitch correction seems banal it doesn’t mean it’s on the decline; rather, it’s a sign that we are increasingly accepting its underlying assumptions and internalizing the habits of thought and listening that go along with them.

Headlines in tech journalism are typically reserved for the newest, most groundbreaking gadgets. Often, though, the really interesting stuff only happens once a technology begins to lose its novelty, recede into the background, and quietly incorporate itself into fundamental ways we think about, perceive, and act in the world. Think, for example, about all the ways your embodied perceptual being has been shaped by and tuned-in to, say, the very computer or mobile device you’re reading this on. Setting value judgments aside for a moment, then, it’s worth thinking about where pitch correction technology came from, what assumptions underlie the way it works and how we work with it, and what it means that it feels like “old news.”

"Anti-Tune symbol"

As is often the case with new musical technologies, digital pitch correction has been the target of no small amount of controversy and even hate. The list of indictments typically includes the homogenization of music, the devaluation of “actual talent,” and the destruction of emotional authenticity. Suffice it to say, the technological possibility of ostensibly producing technically “pitch-perfect” performances has wreaked a fair amount of havoc on conventional ways of performing and evaluating music. As Primus Luta reminded us in his SO! piece on the powerful-yet-untranscribable “blue notes” that emerged from the idiosyncrasies of early hardware samplers, musical creativity is at least as much about digging-into and interrogating the apparent limits of a technology as it is about the successful removal of all obstacles to total control of the end result.

Paradoxically, it’s exactly in this spirit that others have come to the technology’s defense: Brian Eno, ever open to the unexpected creative agency of perplexing objects, credits the quantized sound of an overtaxed pitch corrector with renewing his interest in vocal performances. SO!’s own Osvaldo Oyola, channeling Walter Benjamin, has similarly offered a defense of Auto-Tune as a democratizing technology, one that both destabilizes conventional ideas about musical ability and allows everyone to sing in-tune, free from the “tyranny of talent and its proscriptive aesthetics.”

"Audiodatenkompression: Manowar, The Power of Thy Sword" by Wikimedia user Moehre1992, CC BY-SA 3.0

Jonathan Sterne, in his book MP3, offers an alternative to normative accounts of media technology (in this case, narratives either of the decline or rise of expressive technological potential) in the form of “compression histories” – accounts of how media technologies and practices directed towards increasing their efficiency, economy, and mobility can take on unintended cultural lives that reshape the very realities they were supposed to capture in the first place. The algorithms behind the MP3 format, for example, were based in part on psychoacoustic research into the nature of human hearing, framed primarily around the question of how many human voices the telephone company could fit into a limited bandwidth electrical cable while preserving signal intelligibility. The way compressed music files sound to us today, along with the way in which we typically acquire (illegally) and listen to them (distractedly), is deeply conditioned by the practical problems of early telephony. The model listener extracted from psychoacoustic research was created in an effort to learn about the way people listen. Over time, however, through our use of media technologies that have a simulated psychoacoustic subject built-in, we’ve actually learned collectively to listen like a psychoacoustic subject.

Pitch-time manipulation runs largely in parallel to Sterne’s bandwidth compression story. The ability to change a recorded sound’s pitch independently of its playback rate had its origins not in the realm of music technology, but in efforts to time-compress signals for faster communication. Instead of reducing a signal’s bandwidth, pitch manipulation technologies were pioneered to reduce the time required to push the message through the listener’s ears and into their brain. As early as the 1920s, the mechanism of the rotating playback head was being used to manipulate pitch and time interchangeably. By spinning a continuous playback head relative to the motion of the magnetic tape, researchers in electrical engineering, educational psychology, and pedagogy of the blind found that they could increase playback rate of recorded voices without turning the speakers into chipmunks. Alternatively, they could rotate the head against a static piece of tape and allow a single moment of recorded sound to unfold continuously in time – a phenomenon that influenced the development of a quantum theory of information.

In the early days of recorded sound some people had found a metaphor for human thought in the path of a phonograph’s needle. When the needle became a head and that head began to spin, ideas about how we think, listen, and communicate followed suit: In 1954 Grant Fairbanks, the director of the University of Illinois’ Speech Research Laboratory, put forth an influential model of the speech-hearing mechanism as a system where the speaker’s conscious intention of what to say next is analogized to a tape recorder full of instructions, its drive “alternately started and stopped, and when the tape is stationary a given unit of instruction is reproduced by a moving scanning head” (136). Pitch-time changing was more a model for thinking than it was for singing, and its imagined applications were thus primarily non-musical.

Take for example the Eltro Information Rate Changer. The first commercially available dedicated pitch-time changer, the Eltro advertised its uses as including “pitch correction of helium speech as found in deep sea; Dictation speed testing for typing and steno; Transcribing of material directly to typewriter by adjusting speed of speech to typing ability; medical teaching of heart sounds, breathing sounds etc. by slow playback of these rapid occurrences.” (It was also, incidentally, used by Kubrick to produce the eerily deliberate vocal pacing of HAL 9000). In short, for the earliest “pitch-time correction” technologies, the pitch itself was largely a secondary concern, of interest primarily because it was desirable for the sake of intelligibility to pitch-change time-altered sounds into a more normal-sounding frequency range.
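To make the coupling of time and pitch concrete, here is a minimal arithmetic sketch, not a description of how the Eltro or any other device actually worked. It assumes plain speed change (playing a recording faster or slower), where duration and pitch move together, and computes the pitch shift that would be needed to bring time-altered speech back to its original register. The function names are hypothetical and chosen only for illustration.

```python
import math


def varispeed(duration_s, rate):
    """Plain speed change: return (new_duration_s, pitch_shift_semitones).

    Playing a recording `rate` times faster divides its duration by `rate`
    and raises every frequency by the same factor, i.e. by
    12 * log2(rate) equal-tempered semitones.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return duration_s / rate, 12 * math.log2(rate)


def compensating_shift(rate):
    """Semitones of pitch change needed to undo the pitch side effect
    of a speed change, leaving only the change in duration."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return -12 * math.log2(rate)


# Doubling the speed of a 60-second recording halves its length and
# raises it a full octave (the "chipmunk" effect).
new_duration, shift = varispeed(60.0, 2.0)
print(new_duration, shift)          # 30.0 12.0
print(compensating_shift(2.0))      # -12.0
```

Read this way, a pitch-time changer is a device that applies the second function after the first: the time saving is kept and the pitch side effect is cancelled.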


This coupling of time compression with pitch changing continued well into the era of digital processing. The Eventide Harmonizer, one of the first digital hardware pitch shifters, was initially used to pitch-correct episodes of “I Love Lucy” which had been time-compressed to free-up broadcast time for advertising. Similar broadcast time compression techniques have proliferated and become common in radio and television (see, for example, David Foster Wallace’s account of the “cashbox” compressor in his essay on an LA talk radio station). Speed listening technology initially developed for the visually impaired has similarly become a way of producing the audio “fine print” at the end of radio advertisements.

"H910 Harmonizer" by Wikimedia user Nalzatron, CC BY-SA 3.0

Though the popular conversation about Auto-Tune often leaves this part out, it’s hardly a secret that pitch-time correction is as much about saving time as it is about hitting the right note. As Auto-Tune inventor Andy Hildebrand put it,

[Auto-Tune’s] largest effect in the community is it’s changed the economics of sound studios…Before Auto-Tune, sound studios would spend a lot of time with singers, getting them on pitch and getting a good emotional performance. Now they just do the emotional performance, they don’t worry about the pitch, the singer goes home, and they fix it in the mix.

Whereas early pitch-shifters aimed to speed-up our consumption of recorded voices, the ones now used in recording are meant to reduce the actual time spent tracking musicians in studio. One of the implications of this framing is that emotion, pitch, and the performer take on a very particular relationship, one we can find sketched out in the Auto-Tune patent language:

Voices or instruments are out of tune when their pitch is not sufficiently close to standard pitches expected by the listener, given the harmonic fabric and genre of the ensemble. When voices or instruments are out of tune, the emotional qualities of the performance are lost. Correcting intonation, that is, measuring the actual pitch of a note and changing the measured pitch to a standard, solves this problem and restores the performance. (Emphasis mine. Similar passages can be found in Auto-Tune’s technical documentation.)
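The patent's phrase "measuring the actual pitch of a note and changing the measured pitch to a standard" can be illustrated with a rough sketch. This is not Auto-Tune's algorithm: it assumes that the "standard" is the twelve-tone equal-tempered scale with A4 at 440 Hz (the quoted text specifies neither), and it takes the measured frequency as given rather than detecting it from audio. The function name is hypothetical.

```python
import math

A4_HZ = 440.0  # assumed reference pitch; the quoted patent text names none


def nearest_standard_pitch(measured_hz, reference_hz=A4_HZ):
    """Return (corrected_hz, cents_off) for a measured frequency.

    corrected_hz is the nearest equal-tempered semitone, and cents_off is
    how far the measured pitch sat from it (positive = sharp, negative =
    flat), in hundredths of a semitone.
    """
    if measured_hz <= 0:
        raise ValueError("frequency must be positive")
    # Distance from the reference in (fractional) semitones.
    semitones = 12 * math.log2(measured_hz / reference_hz)
    nearest = round(semitones)
    corrected_hz = reference_hz * 2 ** (nearest / 12)
    cents_off = 100 * (semitones - nearest)
    return corrected_hz, cents_off


# A note sung at 450 Hz is somewhat sharp of A4 and would be pulled
# down to 440 Hz under these assumptions.
corrected, cents = nearest_standard_pitch(450.0)
print(round(corrected, 2), round(cents, 1))
```

Even in this toy form, the "standard" has to be chosen in advance. The sketch hard-codes one tuning system and one reference pitch, which is exactly the kind of built-in expectation the patent attributes to "the listener."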

In the world according to Auto-Tune, the engineer is in the business of getting emotional signals from place to place. Emotion is the message, and pitch is the medium. Incorrect (i.e. unexpected) pitch therefore causes the emotion to be “lost.” While this formulation may strike some people as strange (for example, does it mean that we are unable to register the emotional qualities of a performance from singers who can’t hit notes reliably? Is there no emotionally expressive role for pitched performances that defy their genre’s expectations?), it makes perfect sense within the current division of labor and affective economy of the recording studio. It’s a framing that makes it possible, intelligible, and at least somewhat compulsory to have singers “express emotion” as a quality distinct from the notes they hit and have vocal producers fix up the actual pitches after the fact. Both this emotional model of the voice and the model of the psychoacoustic subject are useful frameworks for the particular purposes they serve. The trick is to pay attention to the ways we might find ourselves bending to fit them.


Owen Marshall is a PhD candidate in Science and Technology Studies at Cornell University. His dissertation research focuses on the articulation of embodied perceptual skills, technological systems, and economies of affect in the recording studio. He is particularly interested in the history and politics of pitch-time correction, cybernetics, and ideas and practices about sensory-technological attunement in general. 

Featured image: “Epic iPhone Auto-Tune App” by Flickr user Photo Giddy, CC BY-NC 2.0

REWIND!…If you liked this post, you may also dig:

“From the Archive #1: It is art?”-Jennifer Stoever

“Garageland! Authenticity and Musical Taste”-Aaron Trammell

“Evoking the Object: Physicality in the Digital Age of Music”-Primus Luta

Sounding Out! Podcast #27: Interview with Jonathan Sterne




This podcast provokes Jonathan Sterne to jam on the history of Sound Studies, critique the soundscape, and talk about MP3s. That said, it was really just a way to talk about his super-cool music projects (really, check them out!). Aaron Trammell interviews Jonathan Sterne, and digs deep into the questions at the core of our discipline.

Jonathan Sterne teaches in the Department of Art History and Communication Studies and the History and Philosophy of Science Program at McGill University. He is author of The Audible Past: Cultural Origins of Sound Reproduction (Duke, 2003), MP3: The Meaning of a Format (Duke, 2012), and numerous articles on media, technologies and the politics of culture. He is also editor of The Sound Studies Reader (Routledge, 2012). Visit his website at http://sterneworks.org.

REWIND! . . . If you liked this post, you may also dig:

À qui la rue? : On Mégaphone and Montreal’s Noisy Public Sphere — Lilian Radovac

SO! Reads: Jonathan Sterne’s MP3: The Meaning of a Format — Aaron Trammell

Quebec’s #casseroles: on participation, percussion, and protest — Jonathan Sterne

SO! Reads: Jonathan Sterne’s MP3: The Meaning of a Format

The promiscuity of the mp3. Borrowed from NYCArthur on Flickr.

The point that had lingered with me after first reading Jonathan Sterne’s essay “The mp3 as Cultural Artifact,” was the idea that the mp3 was a promiscuous technology. “In a media-saturated environment,” Sterne writes, “portability and ease of acquisition trumps monomaniacal attention . . . at the psychoacoustic level as well as the industrial level, the mp3 is designed for promiscuity. This has been a long-term goal in the design of sound reproduction technologies” (836). A technology, promiscuous? I did not have to look far to find support. Like germs, I could find copies of mp3s that I had downloaded from Napster in 2000 scattered across generations of my old hard drives. Often they were redundant, too – iTunes having archived a copy separate from my original download.

But, for Sterne, mp3s are also socially promiscuous. They accumulate in the hard drives of the working class and are shared, almost anywhere, through the branching left/right wires of iPod earbuds. Since the popularization of the mp3, there have been new opportunities to share how we listen with others. This is the promise of the mp3, and the reason it forms such a key point of scholarly meditation.

MP3: The Meaning of a Format (Duke University Press, 2012) finds Sterne revisiting many of these key themes, with a larger focus on the genealogical beginnings of the mp3 technology. While many of the book’s chapters are extrapolations of prior work Sterne has done regarding the genealogy of listening practices, this work concerns itself less with the 19th century, and more with the 20th century. Perhaps this is related to some of the methodological decisions Sterne has made in planning the book – in seeking out the genealogical origins of the mp3, Sterne worked from archives and manuals described in interviews by engineers who were fundamental to the technology’s production. As such it finds much in common with Trevor Pinch and Frank Trocco’s Analog Days and Dave Tompkins’ How to Wreck a Nice Beach but incorporates the genealogical methods regarding sonic technology present in Sterne’s earlier work The Audible Past and Emily Thompson’s The Soundscape of Modernity. In MP3 Sterne positions himself as a critical cultural studies scholar working between the humanities and sciences, focusing specifically on the mp3 due to its social and technological relevance today. The critical is key here as MP3 is truly a work devised to underscore the economic connections between the construction of our selves as “hearing subjects” and the media industries.

Certainly, the mp3 can still be considered a promiscuous technology, but it is corporate capitalism that has failed to recognize the extent to which it relies on technological promiscuity to support its infrastructure. This focus, ironically, displaces the mp3 as the main object of Sterne’s analysis. It highlights instead the pathological logic of corporate capitalism, and the ways that this rationality has mutated, now, in the wake of mass replicable, malleable, and iterative digital culture. In other words, the mp3 is endemic to a much larger plot, wherein the culture industries adapt to their own deus ex machina. The naive development of the mp3 by the motion picture industry is a large part of the story here, but it is only a small bit of a much larger whole. The real story involves understanding how a handful of vested corporate interests have shaped the ways that we interpret and understand what listening is. In MP3 Sterne addresses one of the great questions of sound studies: What are the politics of listening? Or, which individuals and institutions have a vested economic interest in questions of how we hear?

Sterne recalls this drama in three parts, each unfolding in a somewhat autonomous fashion, but unified in so far as they explore the economic interests behind the scientific construction of “hearing subjects.” In the first part, Sterne is at his best exploring AT&T’s (and the affiliated Bell Laboratories’) role in funding psychological, physiological, and cybernetic research on hearing. In the second, Sterne explains how this early research has been applied to the visual and technical abstraction of sound in the 1970s. And, in the third part of this genealogy, he explains how these analogs were made digital, specifically the corporate politics which went into the construction of the mp3 standard. Throughout this surprising and detailed trajectory, Sterne makes the invisibility of corporate interests apparent and explicit.

AT&T and Infrastructure. Borrowed from djbones on Flickr.

Sterne also hints toward several powerful economic rationalities that have guided the construction of the mp3. Key among these insights is the monetization of cybernetic discourse, or the incorporation of the human body within a scientific understanding of technical systems. In order to engineer an efficient technical system, the capacities and limits of how we interact with (or serve as parts of) these systems must be taken into account. Sterne refers to this mode of engineering as “perceptual technics,” and he goes to great lengths to explain it.

Basically, at the turn of the 20th century, AT&T had taken a keen interest in the science of how people listen because they wanted to maximize the number of simultaneous conversations broadcast through a single telephone wire. More conversations meant the purchase of fewer wires, and therefore greater profits. Eventually, drawing on the research of the oft-cited Claude Shannon and Warren Weaver (within SO!: What Mixtapes Can Teach Us About Noise and Pushing Record; and soundBox: Mapping Noise), AT&T recognized an economic problem of technical efficiency within their wires: there was too much ambient noise. Because of this, AT&T sought to limit the audible signal transmitted from one phone to another. This would allow for more signals (and therefore conversations) to be transmitted through the same wire. Physiological research provided clues that some frequencies were more audible than others, so engineers worked to compress audio signals to reflect this scientific abstraction of hearing.

The scientific capture of listening. Borrowed from img.informer.com .

The reduction of listening–as an embodied practice–to the quantification and control of the audible spectrum is, in other words, the history of compression, which, according to Sterne, should be understood as the true meaning of the mp3. While the mp3 format, like the CD or cassette, may become obsolete, technologies of compression will not. Sterne argues convincingly that most advances in compression technologies have been guided by the invisible logic of corporate capitalism. It is this exact tendency of compression–to make things smaller and more efficient–that threatened to undo the entire project of corporate and branded music distribution in the year 2000, via platforms like Napster. Sterne is well aware of this irony throughout MP3, and uses the final chapter to discuss, briefly, the moment of cultural transformation that is defined by file-sharing and mass distribution.

Bringing things full circle with a somewhat stoic conclusion about the democratic potentials of this moment, he remarks: “The end of the artificial scarcity of recording is a moment of great potential. Its political outcome is still very much in question, but its political meaning should not be” (224). Sterne points to the globalization and ubiquity of mediated listening as a sign that things may not have changed much even though mass networked society at one point promised freedom from a commodity form which privileged things like “liberal notions of property, alienated labor, and ownership” (224). He argues that even the music industries shall persevere, mostly because people have a sublime attraction to listening and music. In other words: Meet the new boss, same as the old boss. There are few moments of liberation to be found within MP3; it is instead a drama of the status quo where the conspirators of corporate capitalism succeed in spite of themselves.

The ubiquity of listening. Borrowed from κεηι on Flickr.

The disparity in Sterne’s tone when juxtaposing the nefarious and efficient dispositifs of capitalism with an untroubled and authentic construction of music is striking, to say the least. And although Sterne is careful to explain that he locates his scholarship as work on a container technology (the mp3) and not its content (the music), this is a somewhat unsatisfying distinction, as an embodied practice such as listening must take both into account. And while I agree that the mp3 reflects the promiscuity of corporate capitalism, is this challenged by the plethora of ideological nuance coded into song lyrics and arrangements? Do the corporate ideologies of the music industries flow beyond the container of the mp3 into the music itself? Is there any crosstalk or overlap between these historical constructions? In other words, what are the limits to theorizing a container technology, and how much does the discursive path of the mp3 sculpt the content of what we listen to?

Despite, or perhaps because of, the rather dystopic scene that Sterne alludes to at the end of MP3, the book falls nicely in the space between Sound Studies and Critical Information Studies. It bridges humanistic scholarship on embodied listening practices with a critique of the economic interests that have funded much of the scientific research relating to the phenomenology of sound. To that end, MP3 reveals much about the social construction of hearing and the ways that the familiar mythology of audio fidelity has been produced, discussed, and exploited by several communication industries. Even though the mp3 may have been eclipsed by industry as the main object of inquiry in the eponymously titled MP3, Sterne succeeds admirably in detailing the promiscuity of corporate capitalism in the listening practices of our everyday lives.

Aaron Trammell is co-founder and Multimedia Editor of Sounding Out! He is also a Media Studies PhD candidate at Rutgers University.
