Acousmatic Surveillance and Big Data


Sound and Surveillance

It’s an all too familiar movie trope. A bug hidden in a flower jar. A figure in shadows crouched listening at a door. The tape recording that no one knew existed, revealed at the most decisive of moments. Even the abrupt disconnection of a phone call manages to arouse the suspicion that we are never as alone as we may think. And although surveillance derives its meaning from the Latin “vigilare” (to watch) and the French “sur-” (over), its deep connotations of listening have all but obliterated that distinction.

We now move on from cybernetic games to modes of surveillance that work through composition and patterns. Here, Robin James challenges us to consider the unfamiliar resonances produced by our IP addresses, search histories, credit trails, and Facebook posts. How does the NSA transform our data footprints into the sweet, sweet music of surveillance? Shhhhhhhh! Let’s listen in. . . -AT

Kate Crawford has argued that there’s a “big metaphor gap in how we describe algorithmic filtering.” Specifically, its “emergent qualities” are particularly difficult to capture. This process, algorithmic dataveillance, finds and tracks dynamic patterns of relationships amongst otherwise unrelated material. I think that acoustics can fill the metaphor gap Crawford identifies. Because of its focus on identifying emergent patterns within a structure of data, rather than its cause or source, algorithmic dataveillance isn’t panoptic, but acousmatic. Algorithmic dataveillance is acousmatic because it does not observe identifiable subjects, but ambient data environments, and it “listens” for harmonics to emerge as variously-combined data points fall into and out of phase/statistical correlation.

Dataveillance defines the form of surveillance that saturates our consumer information society. As this promotional Intel video explains, big data transcends the limits of human perception and cognition – it sees connections we cannot. And, as is the case with all superpowers, this is both a blessing and a curse. Although I appreciate emails from my local supermarket that remind me when my favorite bottle of wine is on sale, data profiling can have much more drastic and far-reaching effects. As Frank Pasquale has argued, big data can determine access to important resources like jobs and housing, often in ways that reinforce and deepen social inequities. Dataveillance is an increasingly prominent and powerful tool that determines many of our social relationships.

The term dataveillance was coined in 1988 by Roger Clarke, and refers to “the systematic use of personal data systems in the investigation or monitoring of the actions or communications of one or more persons.” In this context, the person is the object of surveillance and data is the medium through which that surveillance occurs. Writing 20 years later, Michael Zimmer identifies a phase-shift in dataveillance that coincides with the increased popularity and dominance of “user-generated and user-driven Web technologies” (2008). These technologies, found today in big social media, “represent a new and powerful ‘infrastructure of dataveillance,’ which brings about a new kind of panoptic gaze of both users’ online and even their offline activities” (Zimmer 2007). Metadataveillance and algorithmic filtering, however, are not variations on panopticism, but practices modeled—both historically/technologically and metaphorically—on acoustics.

In 2013, Edward Snowden’s infamous leaks revealed the nuts and bolts of the National Security Agency’s massive dataveillance program. The agency was collecting data records that, according to the Washington Post, included “e-mails, attachments, address books, calendars, files stored in the cloud, text or audio or video chats and ‘metadata’ that identify the locations, devices used and other information about a target.” The most enduringly controversial aspect of NSA dataveillance programs has been the bulk collection of Americans’ data and metadata: in other words, the “big data”-veillance programs.


Borrowed from thierry ehrmann @Flickr CC BY.

Instead of intercepting only the communications of known suspects, this big dataveillance collects everything from everyone and mines that data for patterns of suspicious behavior; patterns that are consistent with what algorithms have identified as, say, “terrorism.” As Cory Doctorow writes in BoingBoing, “Since the start of the Snowden story in 2013, the NSA has stressed that while it may intercept nearly every Internet user’s communications, it only ‘targets’ a small fraction of those, whose traffic patterns reveal some basis for suspicion.” “Suspicion,” here, is an emergent property of the dataset, a pattern or signal that becomes legible when you filter communication (meta)data through algorithms designed to hear that signal amidst all the noise.

Hearing a signal from amidst the noise, however, is not sufficient to consider surveillance acousmatic. “Panoptic” modes of listening and hearing, though epitomized by the universal and internalized gaze of the guards in the tower, might also be understood as the universal and internalized ear of the confessor. This is the ear that, for example, listens for conformity between bodily and vocal gender presentation. It is also the ear of audio scrobbling, which, as Calum Marsh has argued, is a confessional, panoptic music listening practice.

Therefore, when President Obama argued that “nobody is listening to your telephone calls,” he was correct, but only insofar as nobody (human or AI) is “listening” in the panoptic sense. The NSA does not listen for the “confessions” of already-identified subjects. For example, this court order to Verizon doesn’t demand recordings of the audio content of the calls, just the metadata. Again, the Washington Post explains:

The data doesn’t include the speech in a phone call or words in an email, but includes almost everything else, including the model of the phone and the “to” and “from” lines in emails. By tracing metadata, investigators can pinpoint a suspect’s location to specific floors of buildings. They can electronically map a person’s contacts, and their contacts’ contacts.

NSA dataveillance listens acousmatically because it hears the patterns of relationships that emerge from various combinations of data—e.g., which people talk and/or meet where and with what regularity. Instead of listening to identifiable subjects, the NSA identifies and tracks emergent properties that are statistically similar to already-identified patterns of “suspicious” behavior. Legally, the NSA is not required to identify a specific subject to surveil; instead they listen for patterns in the ambience. This type of observation is “acousmatic” in the sound studies sense because the sounds/patterns don’t come from one identifiable cause; they are the emergent properties of an aggregate.

Borrowed from david @Flickr CC BY-NC.

Acousmatic listening is a particularly appropriate metaphor for NSA-style dataveillance because the emergent properties (or patterns) of metadata are comparable to harmonics or partials of sound, the resonant frequencies that emerge from a specific combination of primary tones and overtones. If data is like a sound’s primary tone, metadata is its overtones. When two or more tones sound simultaneously, harmonics emerge when overtones vibrate with and against one another. In Western music theory, something sounds dissonant and/or out of tune when the harmonics don’t vibrate synchronously or proportionally. Conversely, tones that are perfectly in tune sometimes create a consonant harmonic. The NSA is listening for harmonics. They seek metadata that statistically correlates to a pattern (such as “terrorism”), or is suspiciously out of correlation with a pattern (such as US “citizenship”). Instead of listening to identifiable sources of data, the NSA listens for correlations among data.

Both panopticism and acousmaticism are technologies that incite behavior and compel people to act in certain ways. However, they use different methods, which, in turn, incite different behavioral outcomes. Panopticism maximizes efficiency and productivity by compelling conformity to a standard or norm. According to Michel Foucault, the outcome of panoptic surveillance is a society where everyone synchs to an “obligatory rhythm imposed from the outside” (151-2), such as the rhythmic divisions of the clock (150). In other words, panopticism transforms people into interchangeable cogs in an industrial machine. Methodologically, panopticism demands self-monitoring. Foucault emphasizes that panopticism functions most efficiently when the gaze is internalized, when one “assumes responsibility for the constraints of power” and “makes them play…upon himself” (202). Panopticism requires individuals to synchronize themselves with established compulsory patterns.

Acousmaticism, on the other hand, aims for dynamic attunement between subjects and institutions, an attunement that is monitored and maintained by a third party (in this case, the algorithm). For example, Facebook’s News Feed algorithm facilitates the mutual adaptation of norms to subjects and subjects to norms. Facebook doesn’t care what you like; instead it seeks to transform your online behavior into a form of efficient digital labor. In order to do this, Facebook must adjust, in part, to you. Methodologically, this dynamic attunement is not a practice of internalization: unlike Foucault’s panopticon, big dataveillance leverages outsourcing and distribution. There is so much data that no one individual, indeed no one computer, can process it efficiently and intelligibly. The work of dataveillance is distributed across populations, networks, and institutions, and the surveilled “subject” emerges from that work (for example, Rob Horning’s concept of the “data self”). Acousmaticism tunes into the rhythmic patterns that synch up with and amplify its cycles of social, political, and economic reproduction.

Sonic Boom! Borrowed from NASA’s Goddard Space Flight Center @Flickr CC BY.

Unlike panopticism, which uses disciplinary techniques to eliminate noise, acousmaticism uses biopolitical techniques to allow profitable signals to emerge as clearly and frictionlessly as possible amid all the noise (for more on the relation between sound and biopolitics, see my previous SO! essay). Acousmaticism and panopticism are analytically discrete, yet applied in concert. For example, certain tiers of the North Carolina state employees’ health plan require so-called “obese” and tobacco-using members to commit to weight-loss and smoking-cessation programs. If these members are to remain eligible for their selected level of coverage, they must track and report their program-related activities (such as exercise). People who exhibit patterns of behavior that are statistically risky and unprofitable for the insurance company are subject to extra layers of surveillance and discipline. Here, acousmatic techniques regulate the distribution and intensity of panoptic surveillance. To use Nathan Jurgenson’s turn of phrase, acousmaticism determines “for whom” the panoptic gaze matters. To be clear, acousmaticism does not replace panopticism; my claim is more modest. Acousmaticism is an accurate and productive metaphor for theorizing both the aims and methods of big dataveillance, which is, itself, one instrument in today’s broader surveillance ensemble.


Featured image “Big Brother 13/365” by Dennis Skley CC BY-ND.


Robin James is Associate Professor of Philosophy at UNC Charlotte. She is author of two books: Resilience & Melancholy: pop music, feminism, and neoliberalism will be published by Zer0 books this fall, and The Conjectural Body: gender, race and the philosophy of music was published by Lexington Books in 2010. Her work on feminism, race, contemporary continental philosophy, pop music, and sound studies has appeared in The New Inquiry, Hypatia, differences, Contemporary Aesthetics, and the Journal of Popular Music Studies. She is also a digital sound artist and musician. She blogs at and is a regular contributor to Cyborgology.

REWIND!…If you liked this post, check out:

“Cremation of the senses in friendly fire”: on sound and biopolitics (via KMFDM & World War Z)–Robin James

The Dark Side of Game Audio: The Sounds of Mimetic Control and Affective Conditioning–Aaron Trammell

Listening to Whisperers: Performance, ASMR Community, and Fetish on YouTube–Joshua Hudelson

SO! Amplifies: Carleton Gholz and the Detroit Sound Conservancy

SO! Amplifies. . .a highly-curated, rolling mini-post series by which we editors hip you to cultural makers and organizations doing work we really really dig. You’re welcome!

I founded the Detroit Sound Conservancy in 2012 in order to preserve what music producer Don Was has called “the indigenous music of Detroit.” I also did it to preserve my own archive of Detroit sound related artifacts – oral history interviews, recordings, vinyl records, cassette tapes, 8-tracks, posters, t-shirts, buttons, articles, clippings, books, magazines, zines, photos, digital photos, notes, jottings, and other miscellaneous ephemera — knowing that if I could not help preserve the materials of an older generation of musicians, producers, DJs, writers, collectors, and fans, my personal archive and passions would not weather the storms (literal and figural) of the early 21st century – PhD or not. After a year or so of organizing virtually from Boston where I had found academic work teaching media & rhetoric, the DSC had its first major success with an oral history project for Detroit music, funded through Kickstarter.  Donations allowed us to throw a great party in Boston, form the non-profit, and push me home to work on the DSC full time.


The results of the move have already manifested themselves. This summer we had a successful conference at the Detroit Public Library, the first of its kind, dedicated to Detroit sound. We will hold another next year on May 22 dedicated to the key role of Michigan in general, and Detroit in particular, in the emergence of the modern soundscape. We plan to have the call for papers out this fall.


October 14, 2014, interview with historian / musician Larry Gabriel and myself at #RecordDET, image courtesy of the author

In addition, we currently have an organizing/promotional night called #RecordDET at a downtown coffee shop called Urban Bean so that we can continue to both record interviews and play back the sounds / stories we are learning from. So far we’ve interviewed a retired disco / house DJ, a record retail and radio veteran, and two blues historians and musicians.


The long-term goal is to use the stories and sounds to propel us into a more sustainable future for Detroit’s sonic heritage. Recent local floods have reminded Metro Detroiters just how vulnerable we are and continue to be. We must preserve or our sonic dreams will perish.

I imagine the DSC as the sonic dream weaver. As one of our inspirations, the Black Madonna in Chicago, says: We still believe.

Carleton Gholz (PhD, Communication Studies, University of Pittsburgh, 2011) is the Founder and Executive Director of the Detroit Sound Conservancy, a lecturer in Communication at Oakland University in Rochester, Michigan, and President of the Friends of the E. Azalia Hackley Collection at the Detroit Public Library.

REWIND!…If you liked this post, you may also dig:

SO! Amplifies: Mendi+Keith Obadike and Sounding Race in America

SO! Amplifies: Regina Bradley’s Outkasted Conversations

SO! Amplifies: Eric Leonardson and World Listening Day 18 July 2014

The Dark Side of Game Audio: The Sounds of Mimetic Control and Affective Conditioning



This month, SO! Multimedia Editor Aaron Trammell curates a forum on Sound and Surveillance, featuring the work of Robin James and Kathleen Battles. And so it begins, with Aaron asking. . . “Want to Play a Game?” –JS

It’s eleven o’clock on a Sunday night and I’m in the back room of a comic book store in Scotch Plains, NJ. Game night is wrapping up. Just as I’m about to leave, someone suggests that we play Pit, a classic game about trading stocks in the early 20th century. Because the game is short, I decide to give it a go and pull a chair up to the table. In Pit, players are given a hand of nine cards of various farm-related suits and frantically trade cards with other players until their entire hand matches the same suit. As play proceeds, players hold up a set of similar cards they are willing to trade and shout, “one, one, one!,” “two, two, two!,” “three, three, three!,” until another player is willing to trade them an equivalent number of cards in a different suit. The game only gets louder as the shouting escalates and builds to a cacophony.

As I drove home that night, I came to the uncomfortable realization that maybe the game was playing me. I and the rest of the players had adopted similar dispositions over the course of the play. As we fervently shouted to one another trying to trade between sets of indistinguishable commodities, we took on similar, intense, and excited mannerisms. Players who would not scream, who would not participate in the reproduction of the game’s sonic environment, simply lost the game, faded out. As for the rest of us, we became like one another, cookie-cutter reproductions of enthusiastic, stressed, and aggravated stock traders, getting louder as we cornered the market on various goods.

We were caught in a cybernetic loop, one that encouraged us to take on the characteristics of stock traders. And, for that brief period of time, we succumbed to systems of control with far-reaching implications. As I’ve argued before, games are cybernetic mechanisms that facilitate particular modes of feedback between players and the game state. Sound is one of the channels through which this feedback is processed. In a game like Pit, players both listen to other players for cues regarding their best move and shout numbers to the table representing potential trades. In other games, such as Monopoly, players must announce when they wish to buy properties. Although it is no secret that understanding sound is essential to good game design, it is less clear how sound defines the contours of power relationships in these games. This essay offers two games, Mafia and Escape: The Curse of the Temple, as case studies for the ways in which sound is used in the most basic of games, board games. By fostering environments that encourage both mimetic control and affective conditioning, game sound draws players into the devious logic of cybernetic systems.

Understanding the various ways that sound is implemented in games is essential to understanding the ways that game sound operates as both a form of mimetic control and affective conditioning. Mimetic control is, at its most simple, the power of imitation. It is the degree to which we become alike when we play games. Mostly, it happens because the rules invoke a variety of protocols which encourage players to interact according to a particular standard of communication. The mood set by game sound is the power of affective conditioning. Because we decide what we interact with on account of our moods, moments of affective conditioning prime players to feel things (such as pleasure), which can encourage players to interact in compulsive, excited, subdued, or frenetic ways with game systems.

A game where sound plays a central and important role is Mafia (which has a number of other variants like Werewolf and The Resistance). In Mafia, some players take the secret role of mafia members who choose players to “kill” at night, while the eyes of the others are closed. Because mafia-team players shuffle around during the game and point to others in order to indicate which players to eliminate while the eyes of the other players are closed, the rules of the game suggest that players tap on things, whistle, chirp, and make other ambient noises while everyone’s eyes are closed. This allows for the mafia-team players to conduct their business secretly, as their motions are well below the din created by the other players. Once players open their eyes, they must work together to deduce which players are part of the mafia, and then vote on who to eliminate from the game. Here players are, in a sense, controlled by the game to provide a soundtrack. What’s more, the eeriness of the sounds produced by the players only accentuates the paranoia players feel when taking part in what’s essentially a lynch mob.

The ambient sounds produced by players of Mafia have overtones of mimetic control. Protocols governing the use of game audio as a form of communication between bodies and other bodies, or bodies and machines, require that we communicate in particular ways at set intervals. Different than the brutal and martial forms of discipline that drove disciplinary apparatuses like Bentham’s panopticon, the form of control exerted through interactive game audio relies on precisely the opposite premise. What is often termed “The Magic Circle of Play” is suspect here as it promises players a space that is safe and fundamentally separate from events in the outside world. Within this space somewhat hypnotic behavior-patterns take place under the auspices of being just fun, or mere play. Players who refuse to play by the rules are often exiled from this space, as they refuse to enter into this contract of soft social norms with others.

Not all panopticons are in prisons. “Singing Ringing Tree at Sunset,” Dave Leeming CC BY.

Escape: The Curse of the Temple relies on sound to set a game mood that governs the ways that players interact with each other. In Escape, players have ten minutes (of real time) in which they must work together to navigate a maze of cardboard tiles. Over the course of the game there are two moments when players must return to the tile that they started the game on, and these are announced by a CD playing in the background of the room. When this occurs, a gong rings on the CD and rhythms of percussion mount in intensity until players hear a door slam. At this point, if players haven’t returned to their starting tile, they are limited in the actions they can take for the rest of the game. In the moments of calm before players make a mad dash for the entrance, the soundtrack waxes ambient. It offers the sounds of howling winds, rattling chimes, and yawning corridors.

The game is spooky, overall. The combination of haunting ambient sounds and moments where gameplay is rushed and timed makes for an adrenaline-fueled experience contained and produced by the game’s ambient soundtrack. The game’s most interesting moments come from points where one player is trapped and players must decide whether they should help their friend or help themselves. The tense, haunting soundtrack evokes feelings of high-stakes immersion. The game is fun because it produces a tight, stressful, and highly interactive experience. It conditions its players through the clever use of its soundtrack to feel the game in an embodied and visceral way. Just as horror movies have used ambient sounds to great effect in producing tension in audiences (pp.26-27), Escape: The Curse of the Temple encourages players to immerse themselves in the game world by playing upon the tried and true affective techniques that films have used for years. Immersed players feel an increased sense of engagement with the game, and because of this they are willingly primed to engage in the mimetic interactive behaviors that draw them into the game’s cybernetic logic.

These two forms of power, mimetic control and affective conditioning, often overlap and coalesce in games. Sometimes, they meet in the middle during games that offer a more or less adaptive form of sound, like Mafia. Players work together and mimic each other when reproducing the ambient forms of quiet that constitute the atmosphere of terror that permeates the game space. Even the roar of bids which occurs in Pit constitutes a form of affective conditioning that encourages players to buy, buy, buy as fast as possible, effectively simulating the pressure of the stock exchange.

Although there is now a growing discipline around the production of game audio, there is relatively little discourse that attempts to understand how the implementation of sound in games functions as a mode of social control. By looking at the ways that sound is implemented in board and card games, we can gain insight into the ways in which it is implemented in larger technical systems (such as computer games), larger aesthetic systems (such as performance art), economic systems (like casinos and the stock market), and even social systems (like parties). Furthermore, it is easy to describe more clearly the ways in which game audio functions as a form of soft power through techniques of mimetic control and affective conditioning. It is only by understanding how these techniques affect our bodies that we can begin to recognize our interactions with large-scale cybernetic systems that have effects reaching beyond the game itself.


Aaron Trammell is co-founder and Multimedia Editor of Sounding Out! He is also a Media Studies PhD candidate at Rutgers University. His dissertation explores the fanzines and politics of underground wargame communities in Cold War America. You can learn more about his work at

Featured image “Psychedelic Icon,” by Gwendal Uguen CC BY-NC-SA.

REWIND!…If you liked this post, you may also dig:

Papa Sangre and the Construction of Immersion in Audio Games- Enongo Lumumba-Kasongo 

Sounding Out! Podcast #31: Game Audio Notes III: The Nature of Sound in Vessel- Leonard J. Paul

Experiments in Aural Resistance: Nordic Role-Playing, Community, and Sound- Aaron Trammell

Sounds of Science: The Mystique of Sonification


Welcome to the final installment of Hearing the UnHeard, Sounding Out!’s series on what we don’t hear and how this unheard world affects us. The series started out with my post on hearing, large and small, and continued with a piece by China Blue on the sounds of catastrophic impacts and Milton Garcés’s piece on the infrasonic world of volcanoes. To cap it all off, we introduce The Sounds of Science by professor, cellist and interactive media expert, Margaret Schedel.

Dr. Schedel is an Associate Professor of Composition and Computer Music at Stony Brook University. Through her work, she explores the relatively new field of Data Sonification, generating new ways to perceive and interact with information through the use of sound. While everyone is familiar with informatics, graphs and images used to convey complex information, her work explores how we can expand our understanding of even complex scientific information by using our fastest and most emotionally compelling sense, hearing.

– Guest Editor Seth Horowitz

With the invention of digital sound, the number of scientific experiments using sound has skyrocketed in the 21st century, and as Sounding Out! readers know, sonification has started to enter the public consciousness as a new and refreshing alternative modality for exploring and understanding many kinds of datasets emerging from research into everything from deep space to the underground. We seem to be in a moment in which “science that sounds” has a special magic, a mystique that relies to some extent on misunderstandings in popular awareness about the processes and potentials of that alternative modality.

For one thing, using sound to understand scientific phenomena is not actually new. Diarist Samuel Pepys wrote about meeting scientist Robert Hooke in 1666 that “he is able to tell how many strokes a fly makes with her wings (those flies that hum in their flying) by the note that it answers to in musique during their flying.” Unfortunately Hooke never published his findings, leading researchers to speculate on his methods. One popular theory is that he tied strings of varying lengths between a fly and an ear trumpet, recognizing that sympathetic resonance would cause the correct length string to vibrate, thus allowing him to calculate the frequency. Even Galileo used sound, showing the constant acceleration of a ball due to gravity by using an inclined plane with thin moveable frets. By moving the placement of the frets until the clicks created an even tempo he was able to come up with a mathematical equation to describe how time and distance relate when an object falls.
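The relationship that Galileo's evenly spaced clicks point to can be sketched in a few lines of Python. This is only an illustration of the textbook result for constant acceleration (distance grows with the square of elapsed time), not a reconstruction of his apparatus; the function name `fret_positions` and the numeric values are assumptions chosen for the example.

```python
# Sketch: where frets must sit on a ramp so a ball under constant
# acceleration clicks at an even tempo. Uses d = a * t**2 / 2.

def fret_positions(acceleration, click_interval, n_frets):
    """Distance of each fret from the start, in meters, so that the ball
    reaches fret k exactly k * click_interval seconds after release."""
    return [0.5 * acceleration * (k * click_interval) ** 2
            for k in range(1, n_frets + 1)]

# Hypothetical values: 0.5 m/s^2 along the ramp, one click every 0.5 s.
positions = fret_positions(acceleration=0.5, click_interval=0.5, n_frets=4)

# Express each distance as a multiple of the first one.
ratios = [p / positions[0] for p in positions]
print(ratios)  # [1.0, 4.0, 9.0, 16.0]
```

Even clicks therefore require frets at distances in the ratio 1 : 4 : 9 : 16, which is why tuning the frets by ear amounts to measuring how distance depends on time.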

Illustration from Robert Hooke’s Micrographia (1665)

There have also been other scientific advances using sound in the more recent past. The stethoscope was invented in 1816 for auscultation, listening to the sounds of the body. It was later applied to machines, to listen for the operation of the technological gear. Underwater sonar was patented in 1913 and is still used to navigate and communicate using hydroacoustic phenomena. The Geiger Counter was developed in 1928 using principles discovered in 1908; it is unclear exactly when the distinctive sound was added. These are all examples of auditory display [AD]; sonification (generating or manipulating sound by using data) is a subset of AD. As the foreword to The Sonification Handbook states, “[Since 1992] Technologies that support AD have matured. AD has been integrated into significant (read “funded” and “respectable”) research initiatives. Some forward thinking universities and research centers have established ongoing AD programs. And the great need to involve the entire human perceptual system in understanding complex data, monitoring processes, and providing effective interfaces has persisted and increased” (Thomas Hermann, Andy Hunt, John G. Neuhoff, Sonification Handbook, iii).

Sonification clearly enables scientists, musicians and the public to interact with data in a very different way, particularly compared to the more numerous techniques involving vision. Indeed, because hearing functions quite differently than vision, sonification offers an alternative kind of understanding of data (sometimes more accurate), which would not be possible using eyes alone. Hearing is multi-directional—our ears don’t have to be pointing at a sound source in order to sense it. Furthermore, the frequency response of our hearing is thousands of times more accurate than our vision. In order to reproduce a moving image the sampling rate (called frame-rate) for film is 24 frames per second, while audio has to be sampled at 44,100 frames per second in order to accurately reproduce sound. In addition, aural perception works on simultaneous time scales—we can take in multiple streams of audio data at once at many different dynamics, while our pupils dilate and contract, limiting how much visual data we can absorb at a single time. Our ears are also amazing at detecting regular patterns over time in data; we hear these patterns as frequency, harmonic relationships, and timbre.

Image credit: Dr. Kevin Yager, data measured at X9 beamline, Brookhaven National Lab.

Image credit: Dr. Kevin Yager, Brookhaven National Lab.

But hearing isn’t simple, either. In the current fascination with sonification, the fact that aesthetic decisions must be made in order to translate data into the auditory domain can be obscured. Headlines such as “Here’s What the Higgs Boson Sounds Like” are much sexier than headlines such as “Here is What One Possible Mapping of Some of the Data We Have Collected from a Scientific Measuring Instrument (which itself has inaccuracies) Into Sound.” To illustrate the complexity of these aesthetic decisions, which are always interior to the sonification process, I focus here on how my collaborators and I have been using sound to understand many kinds of scientific data.

My husband, Kevin Yager, a staff scientist at Brookhaven National Laboratory, works at the Center for Functional Nanomaterials using X-ray scattering data to probe the structure of matter. One night I asked him how exactly the science of X-ray scattering works. He explained that X-rays “scatter” off of all the atoms/particles in the sample and the intensity is measured by a detector. He can then calculate the structure of the material, using the Fast Fourier Transform (FFT) algorithm. He started to explain FFT to me, but I interrupted him because I use FFT all the time in computer music. The same algorithm he uses to determine the structure of matter, musicians use to separate frequency content from time. When I was researching this post, I found a site for computer music which actually discusses X-ray scattering as a precursor for FFT used in sonic applications.
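
For readers who have not met the Fourier transform, the sketch below shows what "separating frequency content from time" means on a toy signal. It uses a naive discrete Fourier transform written with the standard library only; an FFT computes the same values much faster. The test tone, the sample rate, and the function name `dft_magnitudes` are all invented for illustration.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive discrete Fourier transform: magnitude of each frequency bin.

    Only the first half of the bins is returned, since the second half
    mirrors the first for a real-valued signal.
    """
    n = len(signal)
    mags = []
    for k in range(n // 2):
        total = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total) / n)
    return mags

SAMPLE_RATE = 64  # toy value: with one second of signal, bin k means k Hz

# One second of a test tone mixing a 5 Hz and a quieter 12 Hz sine wave.
signal = [math.sin(2 * math.pi * 5 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 12 * t / SAMPLE_RATE)
          for t in range(SAMPLE_RATE)]

mags = dft_magnitudes(signal)
strongest = sorted(range(len(mags)), key=lambda k: mags[k], reverse=True)[:2]
print(sorted(strongest))  # prints [5, 12]
```

The time-domain list `signal` gives no obvious hint of its ingredients, while the transformed `mags` has two clear peaks, one per component frequency.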

To date, most sonifications have used data which changes over time: a fly’s wings flapping, a heartbeat, a radiation signature. Except in special cases, Kevin’s data does not exist in time; it is a single snapshot. But because data from X-ray scattering is a Fourier Transform of the real-space density distribution, we could use additive synthesis, using multiple simultaneous sine waves, to represent different spatial modes. Using this method, we swept through his data radially, like a clock hand, making timbre-based sonifications by synthesizing sine waves with loudness based on the intensity of the scattering data and frequency based on the position.
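
The radial sweep can be sketched in a few lines of Python. This is a minimal illustration of the idea described above, not the code behind the tracks in this post: the toy ring-shaped "scattering image", the frequency range, the segment length, and every function name are assumptions made up for the example.

```python
import math

def sample_along_hand(image, angle, n_bins):
    """Read n_bins intensities along a 'clock hand' from the centre outward.

    Assumes a square image stored as a list of rows.
    """
    centre = (len(image) - 1) / 2
    values = []
    for b in range(n_bins):
        radius = centre * (b + 1) / n_bins
        x = int(round(centre + radius * math.cos(angle)))
        y = int(round(centre + radius * math.sin(angle)))
        values.append(image[y][x])
    return values

def synthesize(intensities, duration, sample_rate, f_low, f_high):
    """Additive synthesis: one sine per bin, position -> frequency, intensity -> loudness."""
    n = len(intensities)
    freqs = [f_low + (f_high - f_low) * b / max(n - 1, 1) for b in range(n)]
    total = sum(intensities) or 1.0  # normalise so samples stay within [-1, 1]
    samples = []
    for i in range(round(duration * sample_rate)):
        t = i / sample_rate
        s = sum(a * math.sin(2 * math.pi * f * t)
                for a, f in zip(intensities, freqs))
        samples.append(s / total)
    return samples

def sonify(image, n_angles=12, n_bins=8, segment=0.05, sample_rate=8000):
    """Sweep the hand once around the image, one short sound segment per angle."""
    audio = []
    for a in range(n_angles):
        angle = 2 * math.pi * a / n_angles
        hand = sample_along_hand(image, angle, n_bins)
        audio.extend(synthesize(hand, segment, sample_rate, 200.0, 2000.0))
    return audio

# Toy stand-in for a scattering image: a bright ring of radius 8 pixels.
SIZE = 33
image = [[math.exp(-((math.hypot(x - 16, y - 16) - 8) ** 2) / 4)
          for x in range(SIZE)] for y in range(SIZE)]

hand = sample_along_hand(image, 0.0, 8)
audio = sonify(image)
print(len(audio), "samples;", "brightest bin:", hand.index(max(hand)))
```

Because the ring is the same at every angle, this toy image produces a nearly constant timbre as the hand sweeps; an asymmetric pattern would make the timbre change over the course of the sweep. A real implementation would also need to smooth the joins between segments to avoid clicks.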

We played a lot with the settings of the additive synthesis, including the length of the sound, the highest frequency, and even the number of frequency bins (going back to the clock metaphor: pretend the clock hand is a ruler, and the number of frequency bins would be the number of demarcations on the ruler), arriving eventually at a set of optimized variables.
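
To see why the number of bins matters, consider this hedged sketch of the "ruler" idea: the same intensity profile is averaged into a chosen number of equal-width bins. The function `rebin` and the toy profile with a single sharp peak are invented for illustration and are not taken from the actual sonification code.

```python
def rebin(profile, n_bins):
    """Average a 1-D intensity profile into n_bins equal-width bins.

    If n_bins exceeds the number of measurements, values are repeated.
    """
    n = len(profile)
    bins = []
    for b in range(n_bins):
        start = b * n // n_bins
        stop = max((b + 1) * n // n_bins, start + 1)
        chunk = profile[start:stop]
        bins.append(sum(chunk) / len(chunk))
    return bins

# Toy profile: 100 measurements along the hand with one sharp peak.
profile = [0.0] * 100
profile[42] = 1.0

coarse = rebin(profile, 10)    # few demarcations: the peak is averaged away
medium = rebin(profile, 50)    # more demarcations: the peak stands out more
dense = rebin(profile, 2000)   # more bins than measurements: values repeat

print(max(coarse), max(medium), len(dense))  # prints 0.1 0.5 2000
```

With too few bins a narrow feature is smeared into its neighbours and becomes quiet; with far more bins than measurements, the extra sine waves add computation without adding information.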

Here is one version of the track we created using 10 frequency bins:


Here is one we created using 2000:


And here is one we created using 50 frequency bins, which we settled on:


On a software synthesizer this would be like the default setting. In the future we hope to have an interactive graphical user interface where sliders control these variables, just as a musician tweaks the sound of a synth, so scientists can bring out or mask aspects of the data.

To hear what that would be like, here are a few tracks that vary length:




Finally, here is a track we created using different mappings of frequency and intensity:


Having these sliders would reinforce to the scientists that we are not creating “the sound of a metallic alloy,” we are creating one sonic representation of the data from the metallic alloy.

It is interesting that such a representation can be vital to scientists. At first, my husband went along with this sonification project more as a thought experiment than as something he thought would actually be useful in the lab, until he heard something distinct about one of those sounds, suggesting that there was a misaligned sample. Once Kevin heard that glitched sound (you can hear it in the video above), he was convinced that sonification was a useful tool for his lab. He and his colleagues are dealing with measurements 1/25,000th the width of a human hair, aiming an X-ray through twenty pieces of equipment to get the beam focused just right. If any piece of equipment is out of kilter, the data can’t be collected. This is where our ears’ non-directionality is useful. The scientist can be working on his/her computer and, using ambient sound, know when a sample is misaligned.


It remains to be seen/heard if the sonifications will be useful to actually understand the material structures. We are currently running an experiment using Mechanical Turk to determine whether this kind of multi-modal display (using vision and audio) is actually helpful. Basically, we are training one group of people on just the images of the scattering data and testing how well they do, and training another group on the images plus the sonification and testing how well they do.

I’m also working with collaborators at Stony Brook University on sonification of data. In one experiment we are using ambisonic (3-dimensional) sound to create a sonic map of the brain to understand drug addiction. Standing in the middle of the ambisonic cube, we hope to find relationships between voxels, cubes of brain tissue analogous to pixels. When neurons fire in areas of the brain simultaneously, there is most likely a causal relationship which can help scientists decode the brain activity of addiction. Computer vision researchers have been searching for these relationships unsuccessfully; we hope that our sonification will allow us to hear associations in distinct parts of the brain which are not easily recognized with sight. We are hoping to leverage the temporal pattern recognition of our auditory system, but we have been running into problems doing the sonification; each slice of data from the fMRI has about 300,000 data points. We have it working with 3,000 data points, but either our programming needs to get more efficient, or we have to get a much more powerful computer in order to work with all of the data.
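
As a rough picture of what "firing simultaneously" looks like in numbers, the sketch below scores how closely pairs of voxel time series rise and fall together, using Pearson correlation. This is the kind of relationship the ear would be asked to pick out, not the project's actual pipeline: the voxel names, the tiny on/off time series, and the 0.9 threshold are all made up for the example.

```python
import math
from itertools import combinations

def pearson(a, b):
    """Pearson correlation: +1 when two series rise and fall together."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical activity of four voxels over eight time steps (1 = active).
voxels = {
    "A": [0, 1, 0, 1, 1, 0, 0, 1],
    "B": [0, 1, 0, 1, 1, 0, 0, 1],  # active at the same moments as A
    "C": [1, 0, 1, 0, 0, 1, 1, 0],  # active exactly when A is not
    "D": [1, 1, 0, 0, 1, 1, 0, 0],  # unrelated rhythm
}

THRESHOLD = 0.9  # assumed cutoff for calling two voxels "co-active"
co_active = [(p, q) for p, q in combinations(voxels, 2)
             if pearson(voxels[p], voxels[q]) >= THRESHOLD]
print(co_active)  # prints [('A', 'B')]
```

Comparing every pair this way grows with the square of the number of voxels, which gives a sense of why moving from 3,000 to 300,000 data points is such a jump in computation.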

On another project we are hoping to sonify gait data using smartphones. I’m working with some of my music students and a professor of Physical Therapy, Lisa Muratori, who works on understanding the underlying mechanisms of mobility problems in Parkinson’s Disease (PD). The physical therapy lab has a digital motion-capture system and a split-belt treadmill for asymmetric stepping; the patients are supported by a harness so they don’t fall. PD is a progressive nervous system disorder characterized by slow movement, rigidity, tremor, and postural instability. Because of degeneration of specific areas of the brain, individuals with PD have difficulty using internally driven cues to initiate and drive movement. However, many studies have demonstrated an almost normal movement pattern when persons with PD are provided external cues, including significant improvements in gait with rhythmic auditory cueing. So far the research with PD and sound has been unidirectional: the patients listen to sound and try to match their gait to the external rhythms from the auditory cues. In our system we will use bio-feedback to sonify data from sensors the patients will wear and feed error messages back to the patient through music. Eventually we hope that patients will be able to adjust their gait by listening to self-generated musical distortions on a smartphone.
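
One way such a feedback loop could work, purely as a hypothetical sketch: measure the time between steps, compare it with a target rhythm, and turn the size of the error into an amount of musical distortion. The function names, the tolerance, the error ceiling, and the sample heel-strike times below are all assumptions for illustration and do not describe the system under development.

```python
def step_intervals(heel_strikes):
    """Seconds between consecutive heel strikes (timestamps in seconds)."""
    return [later - earlier
            for earlier, later in zip(heel_strikes, heel_strikes[1:])]

def distortion_amount(interval, target, tolerance=0.05, max_error=0.5):
    """Map timing error to a distortion level between 0.0 (clean) and 1.0.

    Errors within `tolerance` (as a fraction of the target interval) leave
    the music untouched; errors at or beyond `max_error` distort it fully.
    """
    error = abs(interval - target) / target
    if error <= tolerance:
        return 0.0
    return min((error - tolerance) / (max_error - tolerance), 1.0)

# Hypothetical sensor data: the third step arrives a quarter second late.
heel_strikes = [0.0, 0.5, 1.0, 1.75, 2.25]
TARGET_INTERVAL = 0.5  # seconds per step, i.e. 120 steps per minute

amounts = [distortion_amount(i, TARGET_INTERVAL)
           for i in step_intervals(heel_strikes)]
print(amounts)  # prints [0.0, 0.0, 1.0, 0.0]
```

The patient would hear clean music while walking in rhythm and a distorted passage on the late step, so the correction comes from their own movement rather than from an external metronome.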

As sonification becomes more prevalent, it is important to understand that aesthetic decisions are inevitable and even essential in every kind of data representation. We are so accustomed to looking at visual representations of information—from maps to pie charts—that we may forget that these are also arbitrary transcodings. Even a photograph is not an unambiguous record of reality; the mechanics of the camera and artistic choices of the photographer control the representation. So too, in sonification, do we have considerable latitude. Rather than view these ambiguities as a nuisance, we should embrace them as a freedom that allows us to highlight salient features, or uncover previously invisible patterns.


Margaret Anne Schedel is a composer and cellist specializing in the creation and performance of ferociously interactive media. She holds a certificate in Deep Listening with Pauline Oliveros and has studied composition with Mara Helmuth, Cort Lippe and McGregor Boyle. She sits on the boards of 60×60 Dance, the BEAM Foundation, Devotion Gallery, the International Computer Music Association, and Organised Sound. She contributed a chapter to the Cambridge Companion to Electronic Music, and is a joint author of Electronic Music published by Cambridge University Press. She recently edited an issue of Organised Sound on sonification. Her research focuses on gesture in music, and the sustainability of technology in art. She ran SUNY’s first Coursera Massive Open Online Course (MOOC) in 2013. As an Associate Professor of Music at Stony Brook University, she serves as Co-Director of Computer Music and is a core faculty member of cDACT, the consortium for digital art, culture and technology.

Featured Image: Dr. Kevin Yager, data measured at X9 beamline, Brookhaven National Lab.

Research carried out at the Center for Functional Nanomaterials, Brookhaven National Laboratory, is supported by the U.S. Department of Energy, Office of Basic Energy Sciences, under Contract No. DE-AC02-98CH10886.

REWIND! … If you liked this post, you might also like:

The Noises of Finance–Nicholas Knouf

Revising the Future of Music Technology–Aaron Trammell

A Brief History of Auto-Tune–Owen Marshall
