Optophones and Musical Print
In 1912, British physicist Edmund Fournier d’Albe built a device that he called the optophone, which converted light into tones. The first model—“the exploring optophone”—was meant to be a travel aid; it converted light into a sound of analogous intensity. A subsequent model, “the reading optophone,” scanned print using lamp-light separated into beams by a perforated disk. The pattern of light reflected back from a given character triggered a corresponding set of tones in a telephone receiver. d’Albe initially worked with 8 beams, producing 8 tones based on a diatonic scale. He settled on 5 notes: lower G, and then middle C, D, E and G. (Sol, do, re, mi, sol.) The optophone became known as a “musical print” machine. It was popularized by Mary Jameson, a blind student who achieved reading speeds of 60 words per minute.
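As a rough illustration of the scanning principle described above, the sketch below maps each column of a hand-drawn letter to a chord. Only the five note names (G, C, D, E, G) come from the account above; the frequencies (equal temperament, A4 = 440 Hz), the tiny bitmap, the top-to-bottom ordering of notes, and the rule that ink turns a tone on are all assumptions made for the example, not a reconstruction of Fournier d’Albe’s circuit.

```python
# Hypothetical sketch of "musical print": one tone per beam, one chord per column.
# Note names follow the essay; frequencies are assumed (equal temperament).
NOTES = [  # listed from the top beam to the bottom beam (an assumption)
    ("G4", 392.00),
    ("E4", 329.63),
    ("D4", 293.66),
    ("C4", 261.63),
    ("G3", 196.00),
]


def column_to_chord(column):
    """Return the note names sounded by one five-row scan column.

    `column` holds five 0/1 values, top to bottom; 1 means the beam at
    that height met ink. Assumes ink switches a tone on.
    """
    if len(column) != len(NOTES):
        raise ValueError("expected one value per beam")
    return [name for (name, _hz), ink in zip(NOTES, column) if ink]


def scan_glyph(rows):
    """Scan a glyph bitmap left to right, producing one chord per column."""
    width = len(rows[0])
    return [
        column_to_chord([int(row[x] == "#") for row in rows])
        for x in range(width)
    ]


# A crude 5x3 letter "L", drawn by hand purely for illustration.
LETTER_L = [
    "#..",
    "#..",
    "#..",
    "#..",
    "###",
]

chords = scan_glyph(LETTER_L)
for position, chord in enumerate(chords):
    print(position, chord or ["(silence)"])
```

In this toy version the vertical stroke of the “L” sounds all five notes at once and the foot sounds only the lowest G, which gives a sense of why each letter had its own recognizable tonal shape.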
In the field of media studies, the optophone has become renowned through its imaginary repurposings by a number of modernist artists. For one thing, the optophone finds brief mention in Finnegans Wake. In turn, Marshall McLuhan credited James Joyce’s novel with being a new medium, turning text into sound. In “New Media as Political Forms,” McLuhan says that Joyce’s own “optophone principle” releases us from “the metallic and rectilinear embrace of the printed page.” More familiar within media studies today, Dada artist Raoul Hausmann patented (London 1935), but did not successfully build, an optophone presumably inspired by d’Albe’s model, which he hoped would be employed in audiovisual performances. This optophone was meant to convert sound into light as well as the reverse. It was part of a broader contemporary impulse to produce color music and synaesthetic art. Hausmann also wrote optophonetic poetry, based on the sounds and rhythms of “pure phonemes” and non-linguistic noises. In response, Francis Picabia painted two optophone portraits in 1921 and 1922. Optophone I, below, is composed of lines that might be sound waves, with a pattern that disorders vision.
Theorists have repeatedly located Hausmann’s device at the origin of new media. Authors in the Audiovisuology, Media Archaeology, and Beyond Art: A Third Culture anthologies credit Hausmann’s optophone with bringing-into-being cybernetics, digitization, the CD-ROM, audiovisual experiments in video art, and “primitive computers.” It seems to have escaped notice that d’Albe also used the optophone to create electrical music. In his book, The Moon Element, he writes:
d’Albe’s device is typically portrayed as a historical cul-de-sac, with few users and no real technical influence. Yet optophones continued to be designed for blind people throughout the twentieth century; at least one model has users even today. Musical print machines, or “direct translators,” co-existed with more complex OCR-devices—optical character recognizers that converted printed words into synthetic speech. Both types of reading machine contributed to today’s procedures for scanning and document digitization. Arguably, reading optophones intervened more profoundly into the order of print than did Hausmann’s synaesthetic machine: they not only translated between the senses, they introduced a new symbolic system by which to read. Like braille, later vibrating models proposed that the skin could also read.
In December 1922, the Optophone was brought to the United States from the United Kingdom for a demonstration before a number of educators who worked with blind children; only two schools ordered the device. Reading machine development accelerated in the U.S. around World War II. In his position as chair of the National Defense Research Committee, Vannevar Bush established a Committee on Sensory Devices in 1944, largely for the purpose of rehabilitating blind soldiers. The other options for reading—braille and Talking Books—were relatively scarce and had a high cost of production. Reading machines promised to give blind readers access to magazines and ephemeral print (recipes, signs, mail), which was arguably more important than access to books.
At RCA (Radio Corporation of America), the television innovator Vladimir Zworykin became involved with this project. Zworykin had visited Fournier d’Albe in London in the 1910s and seen a demonstration of the optophone. Working with Les Flory and Winthrop Pike, Zworykin built an initial machine known as the A-2 that operated on the same principles, but used a different mechanism for scanning: an electric stylus, which was publicized as “the first pen that reads.” Following the trail of citations for RCA’s “Reading Aid for the Blind” patent (US 2420716A, filed 1944), it is clear that the “pen” became an aid in domains far afield from blindness. It was repurposed as an optical probe for measuring the oxygen content of blood (1958); an “optical system for facsimile scanners” (1972); and, in a patent awarded to Burroughs Corporation in 1964, a light gun. This gun, in turn, found its way into the handheld controls for the first home video game system, produced by Sanders Associates.
The A-2 optophone was tested on three blind research subjects, including ham radio enthusiast Joe Piechowski, who was more of a technical collaborator. According to the reports RCA submitted to the CSD, these readers were able to correlate the “chirping” or “tweeting” sounds of the machine with letters “at random with about eighty percent accuracy” after 60 hours of practice. Close spacing on a printed page made it difficult to differentiate between letters; readers also had difficulty moving the stylus at a steady pace and in a straight line. Piechowski achieved reading speeds of 20 words per minute, which RCA deemed too slow.
Attempts were made to incorporate “human factors” and create a more efficient tonal code, to reduce reading time as well as learning time and confusion between letters. One alternate auditory display was known as the compressed optophone. Rather than generate multiple tones or chords for a single printed letter, which was highly redundant and confusing to the ear, the compressed version identified only certain features of a printed letter: such as the presence of an ascender or descender. Below is a comparison between the tones of the original optophone and the compressed version, recorded by physicist Patrick Nye in 1965. The following eight lower case letters make up the source material: f, i, k, j, p, q, r, z.
Original record in the author’s possession. With thanks to Elaine Nye, who generously tracked down two of her personal copies at the author’s request. The second copy is now held at Haskins Laboratories.
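The feature-based idea behind the compressed optophone can be sketched in a few lines. The version below is a toy, not the historical code: it assumes only two features (ascender and descender), and the letter classes are ordinary typographic conventions rather than anything documented for the device. The name `compressed_signal` is hypothetical.

```python
# Hypothetical sketch: reduce each lowercase letter to a coarse shape feature.
# The letter classes are assumed typographic conventions, not the device's code.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def compressed_signal(letter):
    """Return the single feature this toy code reports for a letter."""
    if letter in ASCENDERS:
        return "ascender"
    if letter in DESCENDERS:
        return "descender"
    return "neither"


# The eight source letters named in the comparison recording.
SOURCE_LETTERS = ["f", "i", "k", "j", "p", "q", "r", "z"]
signals = {letter: compressed_signal(letter) for letter in SOURCE_LETTERS}

for letter, signal in signals.items():
    print(letter, signal)
```

With only two features, several of the eight letters collapse onto the same signal, so a workable code would presumably need further features to keep letters distinct while still producing less sound per letter than a full chord sequence.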
Because of the seeming limitations of tonal reading, RCA engineers re-directed their research to add character recognition to the scanning process. This move was controversial. Direct translators like the optophone were perceived as too difficult because they required blind people to do something akin to learning to read print: learning a symbolic tonal or tactile code. At an earlier moment, braille had been critiqued on similar grounds; many in the blind community have argued that mainstream anxieties about braille sprang from its symbolic difference. Speed, moreover, is relative. Reading machine users protested that direct translators like the optophone were inexpensive to build and already available, so why wait for the refinement of OCR and synthetic speech? Nevertheless, between November 1946 and May 1947, Zworykin, Flory, and Pike worked on a prototype “letter reading machine,” today widely considered to be the first successful example of optical character recognition (OCR). Before reliable synthetic speech, this device spelled out words letter by letter using tape recordings. The Letter-Reader was too massive and expensive for personal use, however. It also had an operating speed of 20 words per minute; thus it was hardly an improvement over the A-2 translator.
Haskins Laboratories, another affiliate of the Committee on Sensory Devices, began working on the reading machine problem around the same time, ultimately completing an enormous amount of research into synthetic speech and—as argued by Donald Shankweiler and Carol Fowler—the “speech code” itself. In the 1940s, before workable text-to-speech, researchers at Haskins wanted to determine whether tones or artificial phonemes (“speech-like speech”) were easier to read by ear. They developed a “machine dialect of English,” named wuhzi: “a transliteration of written English which preserved the phonetic patterns of the words.” An example can be played below. The eight source words are: With, Will, Were, From, Been, Have, This, That.
Original record in the author’s possession. From Patrick Nye, “An Investigation of Audio Outputs for a Reading Machine” (1965). With thanks to Elaine Nye.
Based on the results of tests with several human subjects, the Haskins researchers concluded that aural reading via speech-like sounds was necessarily faster than reading musical tones. Like the RCA engineers, they felt that a requirement of these machines should be a fast rate of reading. Minimally, they felt that reading speed should keep pace with rapid speech, at about 200 words per minute.
Funded by the Veterans Administration, members of Mauch Laboratories in Ohio worked on both musical optophones and spelled-speech recognition machines from the 1950s into the 1970s. One of their many devices, the Visotactor, was a direct-translator with vibro-tactile output for four fingers. Another, the Visotoner, was a portable nine-channel optophone. All of the Mauch machines were tested by Harvey Lauer, a technology transfer specialist for the Veterans Administration for over thirty years, himself blind. Below is an excerpt from a Visotoner demonstration, recorded by Lauer in 1971.
Visotoner demonstration. Original 7” open reel tape in author’s possession. With thanks to Harvey Lauer for sharing items from his impressive collection and for collaborating with the author over many years.
Later on the same tape, Lauer discusses using the Visotoner to read mail, identify currency, check over his own typing, and read printed charts or graphics. He achieved reading speeds of 40 words per minute with the device. Lauer has also told me that he prefers the sound of the Visotoner to that of other optophone models—he compares its sound to Debussy, or the music for dream sequences in films.
Mauch also developed a spelled speech OCR machine called the Cognodictor, which was similar to the RCA model but made use of synthetic speech. In the recording below, Lauer demonstrates this device by reading a print-out about IBM fonts. He simultaneously reads the document with the Visotoner, which reveals glitches in the Cognodictor’s spelling.
Original 7” open reel tape in the author’s possession. With thanks to Harvey Lauer.
In 1972, at the request of Lauer and other blind reading machine users, Mauch assembled a stereo-optophone with ten channels, called the Stereotoner. This device was distributed through the VA but never marketed, and most of the documentation exists in audio format, specifically in sets of training tapes that were made for blinded veterans who were the test subjects. Some promotional materials, such as the short video below, were recorded for sighted audiences—presumably teachers, rehabilitation specialists, or funding agencies.
Video: Mauch Stereo Toner, from Sounding Out! on Vimeo.
Video courtesy of Harvey Lauer.
Mary Jameson corresponded with Lauer about the Stereotoner, via tape and braille, in the 1970s. In the braille letter pictured below she comments, “I think that stereotoner signals are the clearest I have heard.”
In 1973, with the marketing of the Kurzweil Reader, funding for direct translation optophones ceased. The Kurzweil Reader was advertised as the first machine capable of multi-font OCR; it was made up of a digital computer and flatbed scanner and it could recognize a relatively large number of typefaces. Kurzweil recalls in his book The Age of Spiritual Machines that this technology quickly transferred to Lexis-Nexis as a way to retrieve information from scanned documents. As Lauer explained to me, the abandonment of optophones was a serious problem for people with print disabilities: the Kurzweil Readers were expensive ($10,000-$50,000 each); early models were not portable and were mostly purchased by libraries. Despite being advertised as omnifont readers, they could not in fact recognize most printed material. The very fact of captchas speaks to the continued failures of perfect character recognition by machines. And, as the “familiarization tapes” distributed to blind readers indicate, the early synthetic speech interface was not transparent—training was required to use the Kurzweil machines.
Original cassette in the author’s possession.
Lauer always felt that the ideal reading machine should have both talking OCR and direct-translation capabilities, the latter being used to get a sense of the non-text items on a printed page, or to “preview material and read unusual and degraded print.” Yet the long history of the optophone demonstrates that certain styles of decoding have been more easily naturalized than others—and symbols have increasingly been favored if they bear a close relation to conventional print or speech. Finally, as computers became widely available, the focus for blind readers shifted, as Lauer puts it, “from reading print to gaining access to computers.” Today, many electronic documents continue to be produced without OCR, and thus cannot be translated by screen readers; graphical displays and videos are largely inaccessible; and portable scanners are far from universal, leaving most “ephemeral” print still unreadable.
Mara Mills is an Assistant Professor of Media, Culture, and Communication at New York University, working at the intersection of disability studies and media studies. She is currently completing a book titled On the Phone: Deafness and Communication Engineering. Articles from this project can be found in Social Text, differences, the IEEE Annals of the History of Computing, and The Oxford Handbook of Sound Studies. Her second book project, Print Disability and New Reading Formats, examines the reformatting of print over the course of the past century by blind and other print disabled readers, with a focus on Talking Books and electronic reading machines. This research is supported by NSF Award #1354297.
Sound as Art as Anti-environment
When I performed at the 2012 Computers and Writing Conference in Raleigh, North Carolina, I looked around during my fairly abstract 10-minute-long improvisation featuring feedback loops, glitches, silences, and circuit-bent instruments, and I noticed the audience’s sometimes visible restlessness, discomfort, and even anxiety. This is a fairly common occurrence when I perform experimental sound art, particularly in contexts in which audiences expect “music” (you can hear my work at 38:30 in the video below). However, for an experimental sound artist to take offense at such reactions is, in my estimation, missing the point of the exercise. That sound art disrupts, agitates, and even offends is a powerfully reaffirming reminder that sound art transcends music and sound; it is a method of revelation, an act that surpasses logical communication, instead challenging the very nature of sound and perception.
As an artist, scholar, and fan, I am drawn toward sound and music that lures me into a new world, an unfamiliar way of being and knowing. Like Lewis Carroll’s Alice, I learn that the rules of my world no longer apply. This happened when I heard J Dilla’s Donuts album, when I heard Madlib’s Medicine Show #3: Beat Konducta in Africa, and when I heard Miles Davis’ Bitches Brew. An artist who continually draws me down the rabbit hole is Walter Gross, an experimental sound/beat artist out of Los Angeles. His work changes the way I usually interact with sonic art, both in terms of his sound and in his approach to physical collage and handcrafted cassette packaging. Gross departs from the comfortable and familiar listening imparted by polished hi-fi 3-minute tracks with definitive beginnings and ends and discernible melodies. Gross instead propels listeners into very unusual (and pleasantly discomforting) soundscapes that demand attention. Almost counter-intuitively, Gross’s visual representations of his work intensify that experience. Consider his 2010 work, Dopamine:
Dopamine is likely a challenging piece for audiences, at least in terms of violating the dominant structures of music. The piece opens with disorienting use of panning, deliberately obscuring degraded audio, largely indiscernible movements and patterns, and so on. His video work likewise presents a fitting yet relatively unusual juxtaposition of youth and destruction, celebration and danger. In terms of both sound and sight, Gross’s work disrupts dominant musical sensibilities, challenging the very patterns and structures within which we can express ideas. He violates tradition, shakes off the canonical baggage carried by prevailing paradigms of Art and Music, and plunges audiences into unfamiliar sensory experiences that require metacognition, reflection, and examination of what sonic art is, and more importantly, what sonic art can be. Gross, in other words, seems to transcend the musician moniker and reach something else entirely. In what follows, I’d like to explore a (very brief) history of such artists, and begin to think about how to frame sonic art as immersion in what Marshall McLuhan called anti-environments: the unconscious environment as raised to conscious attention.
Sound as Art
There exists a strong tradition of experimental noise and sound art, particularly in 20th-century Western avant-garde movements. Futurists were arguably the first to consider noise as music in the European tradition, and were certainly influential in asking artists and audiences to become more aware of the changing social and sonic surroundings. In his 1913 manifesto-of-sorts titled “The Art of Noises,” Italian Futurist Luigi Russolo proposed an orchestral configuration that more aptly represented the range of sounds available to contemporary listeners, namely those sounds that accompanied industrialization and urbanization. The sounds of the Futurist orchestra would include “rumbles, roars, explosions, and crashes.” Russolo built devices called intonarumori to mechanically achieve and manipulate these sounds. His brother, Antonio Russolo, also enacted this new philosophy of modern found sound and composed Corale and Serenata.
Any inquiry into art as anti-environment would be incomplete without a discussion of the great anti-art movement, Dada. Like the Futurists before them, Dadaists used found sound and technology-as-art to violently disrupt conventions of art, beauty, and authorship within the white avant-garde community. Marcel Duchamp’s famous work, “Fountain,” is likely the most familiar Dadaist artifact to contemporary readers, yet the sound poetry of Kurt Schwitters and other Dadaist and Dada-inspired sound pieces such as Erwin Schulhoff’s 1922 work In Futurum (the middle movement of which contains only a rest and the notation “with feeling,” an undoubtable precursor to John Cage’s 4’33”, written 30 years later) created sonic spaces of innovation and strangeness that changed the way audiences listened to both voices and silences. The Russian Cubo-Futurists, especially zaumniks such as Alexei Kruchenykh, made similar ventures into anti-environments. Kruchenykh developed the sound art zaum, which he understood as a transrational language that undercut existing language systems in which the “word [had] been shackled…by its subordination to rational thought” (70). Zaum was a sort of linguistic anti-environment, one rooted in the notion that meaning resided first and foremost in the sound of a word rather than the denotative symbol system that emerged alongside the proliferation of print/visual culture. Nor should one underemphasize the work of John Cage, from his prepared piano to his work with organic instruments.
The list of artists, genres, and movements engaged to some extent in the enterprise of anti-environment architecture could go on and be debated indefinitely: Free Jazz, Turntablism/Nu Jazz, Experimental Hip-Hop, Fluxus, Circuit Bending, Prepared Guitar, ProtoPunk, Punk, Post-Punk, New Wave, No Wave. . . in all of these diverse movements, the sonic artists share the tendency to create strange new worlds via sound; worlds that reveal social and technological environments that most people seem unaware of in the moment. This is why media theorist Marshall McLuhan called the artist “indispensable,” because the artist can tell us something about ourselves that we cannot know via ordinary means of perception. Sonic artists expose audiences to auditory phenomena, structures, juxtapositions, etc. that are to various extents hidden, obscured, or ignored as “noise.” The sonic artist is more than just a clever selector and (re)arranger of sound; s/he is a revelatory agent, exposing what is inaudible.
Art as Anti-environment
Anti-environments, however we might define and classify them, are vital not only to artistic communities themselves but also to a society of fish in water. In his 1968 text, War and Peace in the Global Village, McLuhan asserts (among other things) that humans remain largely unaware of their new environments, likening them to fish in water: “one thing about which fish know exactly nothing is water, since they have no anti-environment which would enable them to perceive the element they live in” (175). In other words, humans seldom possess or practice a sense of awareness regarding their surroundings because there’s nothing against which surroundings may be contrasted. The “water” to McLuhan represented the various environments (physical, psychological, cultural) shaped by technological innovation, but we can, and should, extend the water metaphor to a range of hegemonic frameworks: constructions of gender, race, ability, and so on.
This essay is certainly not an attempt to generate some sort of evaluative rubric by which to judge artistic or sonic expression objectively. Rather, we might use the concept of anti-environments as a way to frame our subjective experiences and encounters with all sound, and begin listening to unfamiliar sounds as psychedelic (from Greek psyche- “mind” + deloun “reveal”) keys to illuminate the patterns and structures in which listeners exist. We must work to understand our environments and our place in them; if we are to engage critically with our culture, we must first understand existing (yet invisible) patterns and structures that surround us. And we are aided in this effort, in great part, by humanity’s great seekers of pattern recognition, the sonic-psychonautical messengers: the sonic artists.
To return to the sound that inspired this meditation, Walter Gross (among others) is in many ways participating in and propelling the discourse of Leary and McLuhan, Schwitters and Schulhoff, Kruchenykh and Cage, Davis and Sun Ra, Madlib and J Dilla. Gross performs the sonic anti-environment, enacts the revelation of obscured sonic paradigms. For me, Gross can act as a sort of lens through which ordinary sonic patterns and structures become visible. I hear Flying Lotus, Bob Dylan, and The Minutemen differently after Gross. I hear my office, my home, my family’s voices differently after Gross. I hear patterns that weren’t audible before. After Gross, I become aware of how I am continuously trained to expect certain things from the sonic world: compartmentalized units of meaning, clearly stated origins of utterances, linear narratives, repeated/repeatable melodies, and so on.
Likewise, my own sonic art/scholarship approaches the use of sound to reveal the inaudible assumptions present in Western frameworks surrounding sonic production. I will conclude with an illustration of my own work and why sonic anti-environments are so central to my philosophy and method. One of my sonic works, “Toward an Object-Oriented Sonic Phenomenology,” was recently part of an exhibition titled Not For Human Consumption, curated by Julian Weaver of CRISAP in London. I recorded the sounds of a high mast lighting pole using contact microphones. Contact microphones do not “hear” like humans typically hear. Typical (dominant) notions of human hearing (and therefore of sound itself) involve the reception and interpretation of vibrations present in air. Contact microphones instead only interpret the vibrations in solid objects.
By listening through an object (through alien “ears,” so to speak), we can begin to critique the ways that we privilege listening via air, a listening that places humans at the center of the universe. We can consider the ways that sound has very real effects on humans with atypical hearing abilities and nonhuman objects. It is difficult to have such conversations if we never explore sonic anti-environments, if we never break through dominant epistemological models, if we never expose the limits of our own environments.
Featured Image: Beatrix*JAR in Dayton, Ohio, September 9, 2009, by Flickr User Vista Vision
Steven Hammer is a Ph.D. candidate in Rhetoric, Writing, and Culture at North Dakota State University in Fargo, ND, USA. His research deals with various aspects of sonic art, from exploring glitch and proto-glitch practices and theories (e.g., circuit bending), to understanding and producing sound from an object-oriented ontology (e.g., contact microphones). He also researches and facilitates trans-Atlantic translation collaborations between American, European, and African universities. He has multimedia publications with Enculturation and Sensory Studies, as well as forthcoming book chapters with Wiley/IEEE Press and IGI Global Publishing, and has performed creative and academic work at several conferences across North America, including the national Computers and Writing Conference and the Council for Programs in Technical and Scientific Communication. He performs experimental circuit-bent and sampler-based music under the moniker “patchbaydoor,” and has constructed and documented a number of hardware modification projects for his own artistic projects and for other artists in the upper Midwest United States. You can read/hear more at stevenrhammer.com