Technological Interventions, or Between AUMI and Afrocuban Timba

Editors’ note: As an interdisciplinary field, sound studies is unique in its scope—under its purview we find the science of acoustics, cultural representation through the auditory, and, to perhaps mis-paraphrase Donna Haraway, emergent ontologies. Not only are we able to see how sound impacts the physical world, but how that impact plays out in bodies and cultural tropes. Most importantly, we are able to imagine new ways of describing, adapting, and revising the aural into aspirant, liberatory ontologies. The essays in this series all aim to push what we know a bit, to question our own knowledges and see where we might be headed. In this series, co-edited by Airek Beauchamp and Jennifer Stoever, you will find new takes on sound and embodiment, cultural expression, and what it means to hear. –AB

In November 2016, my colleague Imani Wadud and I were invited by professor Sherrie Tucker to judge a battle of the bands at the Lawrence Public Library in Kansas. The battle revolved around manipulation of one specific musical technology: the Adaptive Use Musical Instruments (AUMI). Developed by Pauline Oliveros in collaboration with Leaf Miller and released in 2007, the AUMI is camera-based software that enables various forms of instrumentation. It was first created in work with (and through the labor of) children with physical disabilities at the Abilities First School (Poughkeepsie, New York) and designed with the intention of researching its potential as a model for social change.

AUMI Program Logo, University of Kansas

Our local AUMI initiative KU-AUMI InterArts forms part of the international research network known as the AUMI Consortium. KU-AUMI InterArts has been tasked by the Consortium to focus specifically on interdisciplinary arts and improvisation, which led to the organization’s commitment to community-building “across abilities through creativity.” As KU-AUMI InterArts member and KU professor Nicole Hodges Persley expressed in conversation:

KU-AUMI InterArts seeks to decentralize hierarchies of ability by facilitating events that reveal the limitations of able-bodiedness as a concept altogether. An approach that does not challenge the able-bodied/disabled binary could dangerously contribute to the infantilizing and marginalization of certain bodies over others. Therefore, we must remain invested in understanding that there are scales of mobility that transcend our binary renditions of embodiment and we must continue to question how it is that we account for equality across abilities in our Lawrence community.

Local and international attempts to interpret the AUMI as a technology for the development of radical, improvisational methods are by no means a departure from its creators’ motivations. In line with KU-AUMI InterArts and the AUMI Consortium, my work here is that of naming how communal, mixed-ability interactions in Lawrence have come to disrupt the otherwise ableist communication methods that dominate musical production and performance.

The AUMI is designed to be accessed by those with profound physical disabilities. The AUMI software works using a visual tracking system, represented on-screen with a tiny red dot that begins at the very center. Performers can move the dot’s placement to determine which part of their body and its movement the AUMI should translate into sound. As one moves, so does the dot, and, in effect, the selected sound is produced through the performer’s movement.
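The tracking-to-sound mapping described above can be sketched in outline. The following is only an illustrative toy, not the AUMI's actual implementation; the class, the movement threshold, and the note names are all hypothetical stand-ins for whatever the real software does internally:

```python
# Toy sketch (NOT AUMI's actual code) of how a camera-based tracker
# might translate the movement of an on-screen dot into sound triggers.
from dataclasses import dataclass

@dataclass
class Tracker:
    x: float = 0.5          # tracked dot starts at the screen's center
    y: float = 0.5
    threshold: float = 0.05 # minimum movement that counts as a gesture

    def update(self, new_x, new_y, sound_bank):
        """Return the sound to trigger for this frame, or None."""
        dx, dy = new_x - self.x, new_y - self.y
        self.x, self.y = new_x, new_y
        if abs(dx) < self.threshold and abs(dy) < self.threshold:
            return None  # movement too small to register as a gesture
        # map the dot's horizontal position onto the selected sound bank
        index = min(int(new_x * len(sound_bank)), len(sound_bank) - 1)
        return sound_bank[index]

bank = ["C", "D", "E", "F", "G"]
t = Tracker()
print(t.update(0.51, 0.5, bank))  # tiny motion -> None
print(t.update(0.9, 0.5, bank))   # large motion -> "G"
```

A sketch like this also suggests why the instrument feels the way the essay later describes: a sensitivity threshold decides which movements register at all, and any per-frame processing introduces latency between gesture and sound.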


Could this curious technology help build radical new coalitions between researchers and disabled populations? Mara Mills’s research examines how the history of communication technology in the United States has advanced through experimentation with disabled populations that are often positioned as an exemplary pretext for funding but are then unable to access the final product, and are sometimes even erased entirely from the history of a product’s development in the name of universal communication and capitalist accumulation. Therefore, the AUMI’s usage beyond the disabled populations first involved in its invention always stands on dubious historical, political, and philosophical ground. Yet, there is no doubt that the AUMI’s challenge to ableist musical production and performance has unexpectedly affected and reshaped communication for performers of different abilities in the Lawrence jam sessions, which speaks to its impressive coalitional potential. Institutional (especially academic) research invested in the AUMI’s potential then ought to, as its perpetual point of departure, loop back its energies in the service of disabled populations marginalized by ableist musical production and communication.

Facilitators of the library jam sessions, including myself, deliberately avoid exoticizing the AUMI and separating its initial developers and users from its present incarnations. To market the AUMI primarily as a peculiar or fringe musical experience would unnecessarily “Other” both the technology and its users. Instead, we have emphasized the communal practices that, for us, have made the AUMI work as a radically accessible, inclusionary, and democratic social technology. We are mainly invested in how the AUMI invites us to reframe the improvisational aspects of human communication upon a technology that always disorients and reorients what is being shared, how it is being shared, and the relationships between everyone performing. Disorientations reorient when it comes to our Lawrence AUMI community, because a tradition is being co-created around the transformative potential of the AUMI’s response-rate latency and its sporadic visual mode of recognition.

In his work on the AUMI, KU alumnus and sound studies scholar Pete Williams explains how the wide range of mobility typically encouraged in what he calls “standard practice” across theatre, music, and dance is challenged by the AUMI’s tendency to inspire “smaller” movements from performers. While he sees in this affective/physical shift the opportunity for able-bodied performers to encounter “…an embodied understanding of the experience of someone with limited mobility,” my work here focuses less on the software’s potential for able-bodied performers to empathize with “limited” mobility and more on the atypical forms of social interaction and communication the AUMI seems to evoke in mixed-ability settings. An attempt to frame this technology as a disability simulator not only demarcates a troubling departure from its original, intended use by children with severe physical disabilities, but also constitutes a prioritization of able-bodied curiosity that contradicts what I’ve witnessed during mixed-ability AUMI jam sessions in Lawrence.

Sure, some able-bodied performers may come to describe such an experience of simulated “limited” mobility as meaningful, but how we integrate this dynamic into our analyses of the AUMI matters, through and through. What I aim to imply in my read of this technology is that there is no “limited” mobility to experientially empathize with in the first place. If we hold the AUMI’s early history close, then the AUMI is, first and foremost, designed to facilitate musical access for performers with severe physical disabilities. Its structural schematic and even its response-rate latency and sporadic visual mode of recognition ought to be treated as enabling functions rather than limiting ones. From this position, nothing about the AUMI exists for the recreation of disability for able-bodied performers. It is only from this specific position that the collectively disorienting/reorienting modes of communication enabled by the AUMI among mixed-ability groups may be read as resisting the violent history of labor exploitation, erasure, and appropriation Mills warns us about: that is, when AUMI initiatives, no matter how benevolently universal in their reach, act fundamentally as a strategy for the efficacious and responsible unsettling of ableist binaries.

The way the AUMI latches on to unexpected parts of a performer’s body and the “discrepancies” of its body-to-sound response rate are at the core of what sets this technology apart from many other instruments, but it is not the mechanical features alone that accomplish this. Sure, we can find similar dynamics in electronics of all sorts that are “failing,” in one way or another, to respond with the accuracy intended during regular use, or we can emulate similar latencies within most recording software available today. But what I contend sets the AUMI apart goes beyond its clever camera-based visual tracking system and the sheer presence of said “incoherencies” in visual recognition and response rate.

Image by Ray Mizumura-Pence at The Commons, Spooner Hall, KU, at rehearsals for “(Un)Rolling the Boulder: Improvising New Communities” performance in October 2013.

What makes the AUMI a unique improvisational instrument is the tradition currently being co-created around its mechanisms in the Lawrence area, and the way these practices disrupt the borders between able-bodied and disabled musical production, participation, and communication. The most important component of our Lawrence-area AUMI culture is how facilitators engage the instrument’s “discrepancies” as regular functions of the technology and as mechanical dynamics worthy of celebration. At every AUMI library jam session I have participated in, not once have I heard Tucker or other facilitators make announcements about a future “fix” for these functions. Rather, I have witnessed an embrace of these features as intentionally integrated aspects of the AUMI. It comes as no surprise, then, that a “Battle of the Bands” event was organized as a way of leaning even further into what makes the AUMI more than a radically accessible musical instrument––that is, its relationship to orientation.

Perhaps it was the competitive framing of the event––we offered small prizes to every participating band––or the diversity among that day’s participants, or even the numerous times some of the performers had previously used this technology, but our event saw a deliberate and collaborative improvisational method unfold in preparation for the performances. An ensemble mentality began to congeal even before performers entered the studio space, when Tucker first encouraged performers to choose their own fellow band members and come up with a working band name. The two newly-formed bands––Jayhawk Band and The Human Pianos––took turns laying down collaboratively premeditated improvisations with composition (and perhaps even prizes) in mind. iPad AUMIs were installed in a circle on stands, with studio monitor headphones available for each performer.

Jayhawk Band’s eponymous improvisation “Jayhawks,” which brings together stylized steel drums, synthesizers, an 80’s-sounding floor tom, and a plucked woodblock sound, exemplifies this collaborative sensory ethos, unique in the seemingly discontinuous melding of its various sections and the play between its mercurial tessellations and amalgamations:

In “Jayhawks,” the floor tom riffs are set along a rhythmic trajectory defiant of any recognizable time signature, and the player switches suddenly to a wood block/plucking instrument mid-song (00:49). The composition’s lower-pitched instrument, sounding a bit like an electronic bass clarinet, opens the piece and, starting at 00:11, repeats a melodically ascending progression also uninhibited by the temporal strictures of time signature. In fact, all the melodic layers in “Jayhawks” demonstrate a kind of temporally “unhinged” ensemble dynamic present in most of the library jam sessions that I’ve witnessed. Yet unexpected moves and elements ultimately cohere for jam session performers, such as Jayhawk Band’s members, because certain general directions were agreed upon prior to hitting “record,” whether this entails sound bank selections or compositional structure. All that is to say that collective formalities are certainly at play here, despite the song’s fluid temporal/melodic nuances suggesting otherwise.

Five months after the battle of the bands, The Human Pianos and Jayhawk Band reunited at the library for a jam session. This time, performers were given the opportunity to prepare their individual iPad setup prior to entering the studio space. These customized setup selections were then transferred to the iPads inside the studio, where the new supergroup recorded their notoriously polyrhythmic, interspecies, sax-riddled composition “Animal Parade”:

As heard throughout the fascinating and unexpected moments of “Animal Parade,” the AUMI’s sensitivity can be adjusted for even the most minimal physical exertion, and its sound bank spans orchestral instruments, animal sounds, and synthesizers, as well as various percussive instruments, dynamic adjustments, and even prefabricated loops. Yet, no matter how familiar a traditionally trained (and often able-bodied) musician may be with their sound selection, the concepts of rhythmic precision and musical proficiency––as they are understood within dominant understandings of time and consistency––are thoroughly scrambled by the visual tracking system’s sporadic mode of recognition and its inherent latency. As described above, it is structurally guaranteed that the AUMI’s red dot will not remain in its original place during a performance, but will instead latch onto unexpected parts of the body.

Simultaneously, the dot-to-movement response rate is not immediate. My own involvement with “the unexpected” in communal musical production and performance molds my interpretation of what is socially (and politically) at work in both “Jayhawks” and “Animal Parade.” While participating in AUMI jam sessions, I could not help but reminisce about similar experiences with the collective management of orientations/disorientations that, while depending on quite different technological structures, produced similar effects regarding performer communication.

Being a researcher steeped in the L.A.-area Salsa, Latin Jazz, and Black Gospel scenes meant that I was immediately drawn to the AUMI’s most disorienting-yet-reorienting qualities. In Timba, the form of contemporary Afrocuban music that I most closely studied back in Los Angeles, disorientations and reorientations are the most prized structural moments in any composition. For example, the 1997 performance of “No Me Mires a Los Ojos” (“Don’t Look Me in the Eyes”) by Issac Delgado’s ensemble––featuring now-legendary performances by Ivan “Melon” Lewis (keyboard), Alain Pérez (bass), and Andrés Cuayo (timbales)––sonically reveals the tradition’s call to disorient and reorient performers and dancers alike through collaborative improvisations:

Video Filmed by Michael Croy.

“No Me Mires a los Ojos” is riddled with moments of improvisational coalition formed rather immediately and then resolved in a return to the song’s basic structure. For listeners disciplined by Western musical training, the piece may seem to traverse several time signatures, even though it is written entirely in 4/4. Timba accomplishes an intense, percussively demanding, melodically multifaceted set of improvisations that happen all at once, with the end goal of making people dance, nodding at the principal tradition it draws its elements from: Afrocuban Rumba. Every performer who is not a horn player or a vocalist articulates patterns specific to their instrument, played in the form of basic rhythms expected at certain sections. These patterns and their variations evolved from similar Rumba drum and bell formats, and the improvisational contributions each musician is expected to integrate into their basic pattern also come from Rumba’s long-standing tradition of formalized improvisation. The formal and the improvisational function as a single communicative practice in Timba. Performers recall format from their embodied knowledge of Rumba and other pertinent influences while disrupting, animating, and transforming pre-written compositions with constant layers of improvisation.

What ultimately interests me the most about the formal registers within the improvisational tradition that is Timba is that these seem to function, on at least one level, as premeditated terms for communal engagement. This kind of communication enables a social set of interactions that, like Jazz, grants every performer the opportunity to improvise at will, insofar as the terms of engagement are seriously considered. As with the AUMI library jam sessions, Timba’s disorientations, too, seem to reorient. What is different, though, is how the AUMI’s sound bank acts in tandem with a performer’s own embodied musical knowledge as an extension of the archive available for improvisation. In Timba, the sound bank and knowledge of form are both entirely embodied, with synthesizers being the only exception.

Timba ensembles and their interpretations of traditional and non-Cuban forms, like the AUMI and its sound bank, use reliable and predictable knowledge bases to break with dominant notions of time and its coherence, only to wrangle performers back to whatever terms of communal engagement were previously decided upon. In this sense, I read the AUMI not as a solitary instrument but as a partial orchestration of sorts, with functions that enable not only an accessible musical experience but also social arrangements that rely deeply on a more responsible management of the unexpected. While the Timba ensemble is required to collaboratively instantiate the potential for disorientations, the AUMI provides an effective and generative incorporation of said potential as a default mechanism of instrumentation itself.

Image from “How do you AUMI?” at the Lawrence Public Library

As the AUMI continues on its early trajectory as free, downloadable software designed to be accessed by performers of mixed abilities, it behooves us to listen deeply to the lessons learned by orchestral traditions older than our own. Timba does not come without its own problems of social inequity––it is often a “boys’ club,” for one––but there is much to learn about how the traditions built around its instruments have managed to centralize the value of unexpected, multilayered, and even complexly simultaneous patterns of communication. There is also something to be said about the necessity of studying the improvisational communication patterns of musical traditions that have not yet been institutionalized or misappropriated within “first world” societies. Timba teaches us that the conga alone will not speak without the support of a community that celebrates difference, the nuances of its organization, and the call to return to difference. It teaches us, in other words, to see the constant need for difference and its reorganization as a singular practice.

The work begun with the AUMI’s earliest users in Poughkeepsie, New York, and the work involving mixed-ability ensembles in Lawrence, Kansas, today are connected through the AUMI Consortium’s commitment to a kind of research aimed at listening closely and deeply to the AUMI’s improvisational potential, interdisciplinarily and undisciplinarily, across various sites. A tech innovation alone will not sustain the work of disrupting the longstanding, rooted forms of ableism ever-present in dominant musical production, performance, and communication, but mixed-ability performer coalitions organized around a radical interrogation of coherence and expectation may have a fighting chance. I hope the technology team never succeeds at working out all of the “discrepancies,” as these are helping us to build traditions that frame the AUMI’s mechanical propensity towards disorientation as the raw core of its democratic potential.

Featured Image: by Ray Mizumura-Pence at The Commons, Spooner Hall, KU, at rehearsals for “(Un)Rolling the Boulder: Improvising New Communities” performance in October 2013.

Caleb Lázaro Moreno is a doctoral student in the Department of American Studies at the University of Kansas. He was born in Trujillo, La Libertad (Perú) and grew up in Southern California. Lázaro Moreno is currently writing about several soundscapes present during one of the Los Angeles anti-xenophobia mega marches, which took place on March 25, 2006. He is also a multi-instrumentalist and composer; check out his Bandcamp page.

REWIND! . . .If you liked this post, you may also dig:

Introduction to Sound, Ability, and Emergence Forum –Airek Beauchamp

Unlearning Black Sound in Black Artistry: Examining the Quiet in Solange’s A Seat At the Table — Kimberly Williams

Experiments in Agent-based Sonic Composition — Andreas Duus Pape

On Whiteness and Sound Studies


This is the first post in Sounding Out!’s 4th annual July forum on listening in observation of World Listening Day on July 18th, 2015. World Listening Day is a time to think about the impacts we have on our auditory environments and, in turn, their effects on us. For Sounding Out! World Listening Day necessitates discussions of the politics of listening and listening as a political act, beginning this year with Gustavus Stadler’s timely provocation. –Editor-in-Chief JS

Many amusing incidents attend the exhibition of the Edison phonograph and graphophone, especially in the South, where a negro can be frightened to death almost by a ‘talking machine.’ Western Electrician May 11, 1889, (255).

What does an ever-nearer, ever-louder police siren sound like in an urban neighborhood, depending on the listener’s racial identity? Rescue or invasion? Impending succor or potential violence? These dichotomies are perhaps overly neat, divorced as they are from context. Nonetheless, contemplating them offers one charged example of how race shapes listening—and hence, some would say, sound itself—in American cities and all over the world. Indeed, in the past year, what Jennifer Stoever calls the “sonic color line” has become newly audible to many white Americans with the attention the #blacklivesmatter movement has drawn to police violence perpetrated routinely against people of color.

“Sheet music ‘Coon Coon Coon’ from 1901” via Wikimedia, public domain

Racialized differences in listening have a history, of course. Consider the early decades of the phonograph, which coincided with the collapse of Reconstruction and the consolidation of Jim Crow laws (with the Supreme Court’s stamp of approval). At first, these historical phenomena might seem wholly discrete. But in fact, white supremacy provided the fuel for many early commercial phonographic recordings, including not only ethnic humor and “coon songs” but a form of “descriptive specialty”—the period name for spoken-word recordings about news events and slices of life—that reenacted the lynchings of black men. These lynching recordings, as I argued in “Never Heard Such a Thing,” an essay published in Social Text five years ago, appear to have been part of the same overall entertainment market as the ones lampooning foreign accents and “negro dialect”; that is, they were all meant to exhibit the wonders of the new sound reproduction to Americans on street corners, at country fairs, and in other public venues.

Thus, experiencing modernity as wondrous, by means of such world-rattling phenomena as the disembodiment of the voice, was an implicitly white experience. In early encounters with the phonograph, black listeners were frequently reminded that the marvels of modernity were not designed for them, and in certain cases were expressly designed to announce this exclusion, as the epigraph to this post makes brutally evident. For those who heard the lynching recordings, this new technology became another site at which they were reminded of the potential price of challenging the racist presumptions that underwrote this modernity. Of course, not all black (or white) listeners heard the same sounds or heard them the same way. But the overarching context coloring these early encounters with the mechanical reproduction of sound was that of deeply entrenched, aggressive, white supremacist racism.

“66 West 12th Street, New School entrance” by Wikimedia user Beyond My Ken, CC BY-SA 4.0

The recent Sonic Shadows symposium at The New School offered me an opportunity to come back to “Never Heard Such a Thing” at a time when the field of sound studies has grown more prominent and coherent—arguably, more of an institutionally recognizable “field” than ever before. In the past three years, at least three major reference/textbook-style publications have appeared containing both “classic” essays and newer writing from the recent flowering of work on sound, all of them formidable and erudite, all of great benefit for those of us who teach classes about sound: The Oxford Handbook of Sound Studies (2012), edited by Karen Bijsterveld and Trevor Pinch; The Sound Studies Reader (2013), edited by Jonathan Sterne; and Keywords in Sound (2015), edited by David Novak and Matt Sakakeeny. From a variety of disciplinary perspectives, these collections bring new heft to the analysis of sound and sound culture.

I’m struck, however, by the relative absence of a certain strain of work in these volumes—an approach that is difficult to characterize but that is probably best approximated by the term “American Studies.” Over the past two decades, this field has emerged as an especially vibrant site for the sustained, nuanced exploration of forms of social difference, race in particular. Some of the most exciting sound-focused work that I know of arising from this general direction includes: Stoever’s trailblazing account of sound’s role in racial formation in the U.S.; Fred Moten’s enormously influential remix of radical black aesthetics, largely focused on music but including broader sonic phenomena like the scream of Frederick Douglass’s Aunt Hester; Bryan Wagner’s work on the role of racial violence in the “coon songs” written and recorded by George W. Johnson, widely considered the first black phonographic artist; Dolores Inés Casillas’s explication of Spanish-language radio’s tactical sonic coding at the Mexican border; Derek Vaillant’s work on racial formation and Chicago radio in the 1920s and 30s. I was surprised to see none of these authors included in any of the new reference works; indeed, with the exception of one reference in The Sound Studies Reader to Moten’s work (in an essay not concerned with race), none is cited. The new(ish) American Studies provided the bedrock of two sound-focused special issues of journals: American Quarterly’s “Sound Clash: Listening to American Studies,” edited by Kara Keeling and Josh Kun, and Social Text’s “The Politics of Recorded Sound,” edited by me. Many of the authors of the essays in these special issues hold expertise in the history and politics of difference, and scholarship on those issues drives their work on sound. None of them, other than Mara Mills, is among the contributors to the new reference works. 
Aside from Mills’s contributions and a couple of bibliographic nods in the introduction, these journal issues play no role in the analytical work collected in the volumes.

“Blank pages intentionally, end of book” by Wikimedia user Brian 0918, CC BY-SA 3.0

The three new collections address the relationship between sound, listening, and specific forms of social difference to varying degrees. All three of the books contain excerpts from Mara Mills’s excellent work on the centrality of deafness to the development of sound technology. The Sound Studies Reader, in particular, contains a small array of pieces that focus on disability, gender, and race; in attending to race, specifically, Sterne shrewdly includes an excerpt from Frantz Fanon’s A Dying Colonialism, as well as essays on black music by authors likely unfamiliar to many American readers. The Oxford Handbook’s sole piece addressing race is a contribution on racial authenticity in hip-hop. It’s a strong essay in itself. But appearing in this time and space of field-articulation, its strength is undermined by its isolation, and its distance from any deeper analysis of race’s role in sound than what seems to be, across all three volumes, at best, a liberal politics of representation or “inclusion.” Encountering the three books at once, I found it hard not to hear the implicit message that no sound-related topics other than black music have anything to do with race. At the same time, the mere inclusion of work on black music in these books, without any larger theory of race and sound or wider critical framing, risks reproducing the dubious politics of white Euro-Americans’ long historical fascination with black voices.

What I would like to hear more audibly in our field—what I want all of us to work to make more prominent and more possible—is scholarship that explicitly confronts, and broadcasts, the underlying whiteness of the field, and of the generic terms that provide so much currency in it: terms like “the listener,” “the body,” “the ear,” and so on. This work does exist. I believe it should be aggressively encouraged and pursued by the most influential figures in sound studies, regardless of their disciplinary background. Yes, work in these volumes is useful for this project; Novak and Sakakeeny seem to be making this point in their Keywords introduction when they write:

While many keyword entries productively reference sonic identities linked to socially constructed categories of gender, race, ethnicity, religion, disability, citizenship, and personhood, our project does not explicitly foreground those modalities of social difference. Rather, in curating a conceptual lexicon for a particular field, we have kept sound at the center of analysis, arriving at other points from the terminologies of sound, and not the reverse. (8)

I would agree there are important ways of exploring sound and listening that need to be sharpened in ways that extended discussion of race, gender, class, or sexuality will not help with. But this doesn’t mean that work that doesn’t consider such categories is somehow really about sound in a way that the work that does take them up isn’t, any more than a white middle-class person who hears a police siren can really hear what it sounds like while a black person’s perception of the sound is inaccurate because it is burdened (read: biased) by the weight of history and politics.

“Pointy Rays of Justice” by Flickr user Christopher Sebela, CC BY-NC 2.0

In a recent Twitter conversation with me, the philosopher Robin James made the canny point that whiteness, masquerading as lack of bias, can operate to guarantee the coherence and legibility of a field in formation. James’s trenchant insight reminds me of cultural theorist Kandice Chuh’s recent work on “aboutness” and knowledge formation in the contemporary academy in “It’s Not About Anything,” from Social Text (Winter 2014). Focus on what the object of analysis in a field is, on what work in a field is about, Chuh argues, is “often conducted as a way of avoiding engagement with ‘difference,’ and especially with racialized difference.”

I would like us to explore alternatives to the assumption that we have to figure out how to talk about sound before we can talk about how race is indelibly shaping how we think about sound; I want more avenues opened, by the most powerful voices in the field, for work acknowledging that our understanding of sound is always conducted, and has always been conducted, from within history, as lived through categories like race.

The cultivation of such openings also requires that we acknowledge the overwhelming whiteness of scholars in the field, especially outside of work on music. If you’re concerned by this situation, and have the opportunity to do editorial work, one way to work to change it is by making a broader range of work in the field more inviting to people who make the stakes of racial politics critical to their scholarship and careers. As I’ve noted, there are people out there doing such work; indeed, Sounding Out! has continually cultivated and hosted it, with far more editorial care and advisement than one generally encounters in blogs (at least in my experience), over the course of its five years. But if the field remains fixated on sound as a category that exists in itself, outside of its perception by specifically marked subjects and bodies within history, no such change is likely to occur. Perhaps we will simply resign ourselves to having two (or more) isolated tracks of sound studies, or perhaps some of us will have to reevaluate whether we’re able to teach what we think is important to teach while working under its rubric.

Thanks to Robin James, Julie Beth Napolin, Jennifer Stoever, and David Suisman for their ideas and feedback.

Gustavus Stadler teaches English and American Studies at Haverford College. He is the author of Troubling Minds: The Cultural Politics of Genius in the U.S., 1840–1890 (U of Minn Press, 2006). His 2010 edited special issue of Social Text on “The Politics of Recorded Sound” was named a finalist for a prize in the category of “General History” by the Association of Recorded Sound Collections. He is the recipient of the 10th Annual Woody Guthrie Fellowship, which will support research for his book-in-progress, Woody Guthrie and the Intimate Life of the Left.


REWIND! . . . If you liked this post, you may also dig:

Reading the Politics of Recorded Sound — Jennifer Stoever

Hearing the Tenor of the Vendler/Dove Conversation: Race, Listening, and the “Noise” of Texts — Christina Sharpe

Listening to the Border: “‘2487’: Giving Voice in Diaspora” and the Sound Art of Luz María Sánchez — Dolores Inés Casillas

Optophones and Musical Print

The word type, as scanned by the optophone.

From E.E. Fournier d’Albe, The Moon Element (New York: D. Appleton & Company, 1924), 141.


In 1912, British physicist Edmund Fournier d’Albe built a device that he called the optophone, which converted light into tones. The first model—“the exploring optophone”—was meant to be a travel aid; it converted light into a sound of analogous intensity. A subsequent model, “the reading optophone,” scanned print using lamp-light separated into beams by a perforated disk. The pattern of light reflected back from a given character triggered a corresponding set of tones in a telephone receiver. Fournier d’Albe initially worked with 8 beams, producing 8 tones based on a diatonic scale. He settled on 5 notes: lower G, and then middle C, D, E and G. (Sol, do, re, mi, sol.) The optophone became known as a “musical print” machine. It was popularized by Mary Jameson, a blind student who achieved reading speeds of 60 words per minute.
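
The beam-to-tone mapping described above can be sketched in a few lines of code. This is an illustrative model only, not a reconstruction of the device: the note names come from the passage above, the frequencies are modern equal-temperament values, and whether a given beam sounded for ink or for blank paper varied across optophone designs.

```python
# Illustrative sketch of the reading optophone's tonal code.
# Assumption: a scanned character is treated as successive vertical
# columns, each column a 5-element pattern of blocked/unblocked beams.

BEAM_PITCHES = [
    ("lower G", 196.00),   # sol
    ("middle C", 261.63),  # do
    ("D", 293.66),         # re
    ("E", 329.63),         # mi
    ("G", 392.00),         # sol
]

def column_chord(column):
    """Given a 5-element column of booleans (True = beam interrupted
    by ink), return the names and frequencies of the tones sounded."""
    return [(name, hz)
            for (name, hz), blocked in zip(BEAM_PITCHES, column)
            if blocked]

# Each column yields a chord, so a printed character becomes a short
# tonal phrase as the scanner passes over it.
print(column_chord([True, False, True, True, False]))
```

Because several beams can sound at once, each letter produces a distinctive sequence of chords; this is the redundancy that later “compressed” designs tried to reduce.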

Photograph of the optophone, an early scanner with a rounded glass bookrest.

Reading Optophone, held at Blind Veterans UK (formerly St. Dunstan’s). Photographed by the author. With thanks to Robert Baker for helping me search through the storeroom to locate this item, which was previously uncatalogued and “lost” for many years.


Scientific illustration of the optophone, showing a book on the bookrest and a pair of headphones for listening to the tonal output.

Schematic of optophone from Vetenskapen och livet (1922)

In the field of media studies, the optophone has become renowned through its imaginary repurposings by a number of modernist artists. For one thing, the optophone finds brief mention in Finnegans Wake. In turn, Marshall McLuhan credited James Joyce’s novel with being a new medium, turning text into sound. In “New Media as Political Forms,” McLuhan says that Joyce’s own “optophone principle” releases us from “the metallic and rectilinear embrace of the printed page.” More familiar within media studies today, the Dada artist Raoul Hausmann patented (London, 1935), but never successfully built, an optophone presumably inspired by d’Albe’s model, which he hoped would be employed in audiovisual performances. Hausmann’s optophone was meant to convert sound into light as well as the reverse, and it was part of a broader contemporary impulse to produce color music and synaesthetic art. Hausmann also wrote optophonetic poetry, based on the sounds and rhythms of “pure phonemes” and non-linguistic noises. In response, Francis Picabia painted two optophone portraits, in 1921 and 1922. Optophone I, below, is composed of lines that might be sound waves, with a pattern that disorders vision.

Francis Picabia's Optophone I, a series of concentric black circles with a female figure at the center.

Francis Picabia, Optophone I (1922)

Theorists have repeatedly located Hausmann’s device at the origin of new media. Authors in the Audiovisuology, Media Archaeology, and Beyond Art: A Third Culture anthologies credit Hausmann’s optophone with bringing-into-being cybernetics, digitization, the CD-ROM, audiovisual experiments in video art, and “primitive computers.” It seems to have escaped notice that d’Albe also used the optophone to create electrical music. In his book, The Moon Element, he writes:

Needless to say, any succession or combination of musical notes can be picked out by properly arranged transparencies, and I have succeeded in transcribing a number of musical compositions in this manner, which are, of course, only audible in the telephone. These notes, in the absence of all other sounding mechanism, are particularly pure and free from overtones. Indeed, a musical optophone worked by this intermittent light, has been arranged by means of a simple keyboard, and some very pleasant effects may thus be obtained, more especially as the loudness and duration of the different notes is under very complete and separate control.

E.E. Fournier d’Albe, The Moon Element (New York: D. Appleton & Company, 1924), 107.

d’Albe’s device is typically portrayed as a historical cul-de-sac, with few users and no real technical influence. Yet optophones continued to be designed for blind people throughout the twentieth century; at least one model has users even today. Musical print machines, or “direct translators,” co-existed with more complex OCR devices—optical character recognizers that converted printed words into synthetic speech. Both types of reading machine contributed to today’s procedures for scanning and document digitization. Arguably, reading optophones intervened more profoundly into the order of print than did Hausmann’s synaesthetic machine: they not only translated between the senses, they introduced a new symbolic system by which to read. Like braille, later vibrating models proposed that the skin could also read.

In December 1922, the Optophone was brought to the United States from the United Kingdom for a demonstration before a number of educators who worked with blind children; only two schools ordered the device. Reading machine development accelerated in the U.S. around World War II. In his position as chair of the National Defense Research Committee, Vannevar Bush established a Committee on Sensory Devices in 1944, largely for the purpose of rehabilitating blind soldiers. The other options for reading—braille and Talking Books—were relatively scarce and had a high cost of production. Reading machines promised to give blind readers access to magazines and ephemeral print (recipes, signs, mail), which was arguably more important than access to books.

Piechowski, wearing a suit, scans the pen of the A-2 reader over a document.

Joe Piechowski with the A-2 reader. Courtesy of Rob Flory.

At RCA (Radio Corporation of America), the television innovator Vladimir Zworykin became involved with this project. Zworykin had visited Fournier d’Albe in London in the 1910s and seen a demonstration of the optophone. Working with Les Flory and Winthrop Pike, Zworykin built an initial machine known as the A-2 that operated on the same principles, but used a different mechanism for scanning—an electric stylus, which was publicized as “the first pen that reads.” Following the trail of citations for RCA’s “Reading Aid for the Blind” patent (US 2420716A, filed 1944), it is clear that the “pen” became an aid in domains far afield from blindness. It was repurposed as an optical probe for measuring the oxygen content of blood (1958); an “optical system for facsimile scanners” (1972); and, in a patent awarded to Burroughs Corporation in 1964, a light gun. This gun, in turn, found its way into the handheld controls for the first home video game system, produced by Sanders Associates.

The A-2 optophone was tested on three blind research subjects, including ham radio enthusiast Joe Piechowski, who was more of a technical collaborator. According to the reports RCA submitted to the CSD, these readers were able to correlate the “chirping” or “tweeting” sounds of the machine with letters “at random with about eighty percent accuracy” after 60 hours of practice. Close spacing on a printed page made it difficult to differentiate between letters; readers also had difficulty moving the stylus at a steady pace and in a straight line. Piechowski achieved reading speeds of 20 words per minute, which RCA deemed too slow.

Attempts were made to incorporate “human factors” and create a more efficient tonal code, to reduce reading time as well as learning time and confusion between letters. One alternate auditory display was known as the compressed optophone. Rather than generate multiple tones or chords for a single printed letter, which was highly redundant and confusing to the ear, the compressed version identified only certain features of a printed letter, such as the presence of an ascender or descender. Below is a comparison between the tones of the original optophone and the compressed version, recorded by physicist Patrick Nye in 1965. The following eight lowercase letters make up the source material: f, i, k, j, p, q, r, z.
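
The feature-based logic of the compressed optophone can be suggested with a small sketch. The ascender and descender sets below are ordinary typographic groupings, offered purely for illustration; they are not Nye’s actual encoding, which assigned particular tones to particular features.

```python
# Hypothetical sketch of the "compressed" principle: rather than a
# chord per scanned column, report only a few distinctive features of
# the letter as a whole. Feature sets here are illustrative.

ASCENDERS = set("bdfhklt")   # letters that rise above the x-height
DESCENDERS = set("gjpqy")    # letters that drop below the baseline

def compressed_features(letter):
    """Return the coarse shape features a compressed display might
    encode for a single lowercase letter."""
    feats = []
    if letter in ASCENDERS:
        feats.append("ascender")
    if letter in DESCENDERS:
        feats.append("descender")
    if not feats:
        feats.append("x-height only")
    return feats

# The eight letters from Nye's 1965 recording:
for ch in "fikjpqrz":
    print(ch, compressed_features(ch))
```

The trade-off is clear even from the sketch: the compressed code is far less redundant, but many letters collapse onto the same feature set, so the listener must rely more heavily on context.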

Original record in the author’s possession. With thanks to Elaine Nye, who generously tracked down two of her personal copies at the author’s request. The second copy is now held at Haskins Laboratories.

An image of the letter r as scanned by the optophone and compressed optophone.

From Patrick Nye, “An Investigation of Audio Outputs for a Reading Machine,” AFB Research Bulletin (July 1965): 30.


Because of the seeming limitations of tonal reading, RCA engineers re-directed their research to add character recognition to the scanning process. This redirection was controversial: direct translators like the optophone were perceived as too difficult because they required blind people to do something akin to learning to read print, namely, learning a symbolic tonal or tactile code. At an earlier moment, braille had been critiqued on similar grounds; many in the blind community have argued that mainstream anxieties about braille sprang from its symbolic difference. Speed, moreover, is relative. Reading machine users protested that direct translators like the optophone were inexpensive to build and already available—why wait for the refinement of OCR and synthetic speech? Nevertheless, between November 1946 and May 1947, Zworykin, Flory, and Pike worked on a prototype “letter reading machine,” today widely considered to be the first successful example of optical character recognition (OCR). Before reliable synthetic speech, this device spelled out words letter by letter using tape recordings. The Letter-Reader was too massive and expensive for personal use, however. It also had an operating speed of 20 words per minute—thus it was hardly an improvement over the A-2 translator.

Haskins Laboratories, another affiliate of the Committee on Sensory Devices, began working on the reading machine problem around the same time, ultimately completing an enormous amount of research into synthetic speech and—as argued by Donald Shankweiler and Carol Fowler—the “speech code” itself. In the 1940s, before workable text-to-speech, researchers at Haskins wanted to determine whether tones or artificial phonemes (“speech-like speech”) were easier to read by ear. They developed a “machine dialect of English,” named wuhzi: “a transliteration of written English which preserved the phonetic patterns of the words.” An example can be played below. The eight source words are: With, Will, Were, From, Been, Have, This, That.

Original record in the author’s possession. From Patrick Nye, “An Investigation of Audio Outputs for a Reading Machine” (1965). With thanks to Elaine Nye.

Based on the results of tests with several human subjects, the Haskins researchers concluded that aural reading via speech-like sounds was necessarily faster than reading musical tones. Like the RCA engineers, they felt that a requirement of these machines should be a fast rate of reading. Minimally, they felt that reading speed should keep pace with rapid speech, at about 200 words per minute.

Funded by the Veterans Administration, members of Mauch Laboratories in Ohio worked on both musical optophones and spelled-speech recognition machines from the 1950s into the 1970s. One of their many devices, the Visotactor, was a direct-translator with vibro-tactile output for four fingers. Another, the Visotoner, was a portable nine-channel optophone. All of the Mauch machines were tested by Harvey Lauer, a technology transfer specialist for the Veterans Administration for over thirty years, himself blind. Below is an excerpt from a Visotoner demonstration, recorded by Lauer in 1971.

Visotoner demonstration. Original 7” open reel tape in author’s possession. With thanks to Harvey Lauer for sharing items from his impressive collection and for collaborating with the author over many years.

Lauer's fingers are pictured in the finger-rests of the Visotactor, scanning a document.

Harvey Lauer reading with the Visotactor, a text-to-tactile translator, 1977.

Later on the same tape, Lauer discusses using the Visotoner to read mail, identify currency, check over his own typing, and read printed charts or graphics. He achieved reading speeds of 40 words per minute with the device. Lauer has also told me that he prefers the sound of the Visotoner to that of other optophone models—he compares its sound to Debussy, or the music for dream sequences in films.

Mauch also developed a spelled speech OCR machine called the Cognodictor, which was similar to the RCA model but made use of synthetic speech. In the recording below, Lauer demonstrates this device by reading a print-out about IBM fonts. He simultaneously reads the document with the Visotoner, which reveals glitches in the Cognodictor’s spelling.

Original 7” open reel tape in the author’s possession. With thanks to Harvey Lauer.

A hand uses the metal probe of the Cognodictor to scan a typed document.

The Cognodictor. Glendon Smith and Hans Mauch, “Research and Development in the Field of Reading Machines for the Blind,” Bulletin of Prosthetics Research (Spring 1977): 65.

In 1972, at the request of Lauer and other blind reading machine users, Mauch assembled a stereo-optophone with ten channels, called the Stereotoner. This device was distributed through the VA but never marketed, and most of the documentation exists in audio format, specifically in sets of training tapes that were made for blinded veterans who were the test subjects. Some promotional materials, such as the short video below, were recorded for sighted audiences—presumably teachers, rehabilitation specialists, or funding agencies.

Mauch Stereo Toner from Sounding Out! on Vimeo.

Video courtesy of Harvey Lauer.

Mary Jameson corresponded with Lauer about the Stereotoner, via tape and braille, in the 1970s. In the braille letter pictured below she comments, “I think that stereotoner signals are the clearest I have heard.”

Scan of a braille letter from Jameson to Lauer.

Letter courtesy of Harvey Lauer. Transcribed by Shafeka Hashash.

In 1973, with the marketing of the Kurzweil Reader, funding for direct translation optophones ceased. The Kurzweil Reader was advertised as the first machine capable of multi-font OCR; it was made up of a digital computer and flatbed scanner and it could recognize a relatively large number of typefaces. Kurzweil recalls in his book The Age of Spiritual Machines that this technology quickly transferred to Lexis-Nexis as a way to retrieve information from scanned documents. As Lauer explained to me, the abandonment of optophones was a serious problem for people with print disabilities: the Kurzweil Readers were expensive ($10,000-$50,000 each); early models were not portable and were mostly purchased by libraries. Despite being advertised as omnifont readers, they could not in fact recognize most printed material. The very fact of captchas speaks to the continued failures of perfect character recognition by machines. And, as the “familiarization tapes” distributed to blind readers indicate, the early synthetic speech interface was not transparent—training was required to use the Kurzweil machines.

Original cassette in the author’s possession. 

A young Kurzweil stands by his reading machine, demonstrated by Jernigan, who is seated.

Raymond Kurzweil and Kenneth Jernigan with the Kurzweil Reading Machine (NFB, 1977). Courtesy National Federation of the Blind.

Lauer always felt that the ideal reading machine should have both talking OCR and direct-translation capabilities, the latter being used to get a sense of the non-text items on a printed page, or to “preview material and read unusual and degraded print.” Yet the long history of the optophone demonstrates that certain styles of decoding have been more easily naturalized than others—and symbols have increasingly been favored if they bear a close relation to conventional print or speech. Finally, as computers became widely available, the focus for blind readers shifted, as Lauer puts it, “from reading print to gaining access to computers.” Today, many electronic documents continue to be produced without OCR, and thus cannot be translated by screen readers; graphical displays and videos are largely inaccessible; and portable scanners are far from universal, leaving most “ephemeral” print still unreadable.

Mara Mills is an Assistant Professor of Media, Culture, and Communication at New York University, working at the intersection of disability studies and media studies. She is currently completing a book titled On the Phone: Deafness and Communication Engineering. Articles from this project can be found in Social Text, differences, the IEEE Annals of the History of Computing, and The Oxford Handbook of Sound Studies. Her second book project, Print Disability and New Reading Formats, examines the reformatting of print over the course of the past century by blind and other print disabled readers, with a focus on Talking Books and electronic reading machines. This research is supported by NSF Award #1354297.
