One Scream is All it Takes: Voice Activated Personal Safety, Audio Surveillance, and Gender Violence
Just a few days ago, London Metropolitan Police Officer Wayne Couzens pled guilty to the rape and murder of Sarah Everard, a 33-year-old woman he abducted while she walked home from a friend’s house. Since the news broke of her disappearance in March 2021, the UK has been going through a moment of national “soul-searching.” The national reckoning has included a range of discussions (about casual and spectacular misogynistic violence, about a victim-blaming criminal justice system that fails to address said violence) and responses, including a vigil in south London that was met with aggressive policing, which has itself entered into and furthered the UK’s soul-searching. There has also been a surge in the installation of personal safety apps on mobile phones; One Scream (OS), “voice activated personal safety,” is one of them.
Available for Android and iOS devices, OS claims to detect and be triggered by a woman’s (true) “panic scream,” and, after 20 seconds and unless the alarm is cancelled, it will send both a text message to the user’s chosen contacts and an automated call with the location to a nominated contact. The app is meant to help women in situations where dialing 999 (assumed to be the natural and preferred response to danger) is not viable for the user and, in the ideal embodiment, this nominated contact, “the helper,” is the police. OS did automatically contact police (and required a paid subscription) in 2016, but it did not work out well and, by 2018, that feature was declared a work in progress: “What we really want is for the app to dial 999 when it detects a panic scream, but first, we need to prove how accurate it is. That’s where you come in. . .” OS is currently in beta and free (while in beta). It is unclear whether the developers have given up on that utmost expression of OS.
OS is based on the premise that men fight and women scream (“It is an innate response for females in danger to scream for help”), and its correct functioning requires its users to be ready to do so, even if such an innate and instinctive response doesn’t come naturally to them: “If you do not scream, the app will not be able to detect you.” However, the app’s scream analysis discriminates in two ways, both in how it listens for and to screams and in how it fails to detect or respond to them. The first has to do with who can use the app (i.e., whose panicked screams are able to trigger it) in the first place. This is presented in terms of gender and age: for the moment, OS can listen to “girls aged 14+ and women under 60,” where cisgender, as in anything OS, is taken for granted. It is, however, a matter of acoustic parameters set by the developers (notably, of reaching a certain high pitch and loudness threshold), which is why the app includes a “screamometer” for potential users to scream, hard, and find out whether they can reach “the intensity that is needed to set it off” (confetti means they do). The second discriminates true panicked screams from other types of screams (e.g., happiness, untrue panic). As presented by the developers, both discriminations are problematic and misleading, and so is “the science behind screaming” that One Scream’s website boasts of.
The app does not quite distinguish true from fake screams, nor joy from panic for that matter. Instead, One Scream listens for “roughness,” which a team of scream researchers (it truly is a “tiny science lesson”) has identified as scream’s “privileged acoustic niche” for communicating alarm. According to this 2015 study in Current Biology, “roughness” is the distinctive quality of effective, compelling human screams (and of artificial alarms) in terms of their ability to trigger listeners and in terms of perceived urgency. Abrupt increases in loudness and pitch are not unique to screams. The rougher the scream, then, the greater its perceived “alarmess” and its alarming effect. That’s why developers say OS “hears real distress,” essentially “just as your own ear.” However, other studies suggest your own ears might not be so great at distinguishing happiness from fear, and scream research, particularly the specific “bit” OS builds on, by and large assumes, relies on, and furthers the irrelevance of “real” on the scream vocalizer end.
In OS’s pledge to its users, the app’s fine-tuning to its scream niche (i.e., to rough temporal modulations between 30 and 150 Hz) is as important as the developers’ (flawed) insistence on the irredeemable uniqueness of true panic’s scream vocalizations, which they posit are instinctive and can’t be plotted or counterfeited: “Experience has shown that it is difficult for women to fake their scream.” Yet current scream analysis and research rely primarily on screams delivered by human research subjects (often university students, ideally drama students) in response to prompts for the purposes of studying them and, especially, on screams extracted from commercial movies and sound effect libraries. The same applies to the other types of vocalizations (e.g., neutral and valenced speech, screamed sentences, laughter, etc.) produced or retrieved for the purposes of figuring out what it is that makes a scream a scream, and how to translate that into a set of quantifiable parameters to capitalize on that knowledge, regardless of the agenda.
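To make the “30 to 150 Hz” roughness band less abstract, here is a minimal sketch of how such a parameter could be quantified. It is an illustration only, not One Scream’s algorithm or the method of the Current Biology study: the function names, the moving-average envelope, the 5 Hz probe spacing, and the 5 to 25 Hz comparison band are all assumptions made for this example. The sketch measures how much of a sound’s loudness fluctuation falls in the rough band, and nothing in it can tell whether the sound came from panic, joy, an actor, or a sound effects library.

```python
import math


def amplitude_envelope(samples, sample_rate, window_ms=2.0):
    """Rectify the waveform and smooth it with a short moving average."""
    window = max(1, int(sample_rate * window_ms / 1000))
    rectified = [abs(s) for s in samples]
    envelope, running = [], 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]
        envelope.append(running / min(i + 1, window))
    return envelope


def band_energy(signal, sample_rate, low_hz, high_hz, step_hz=5):
    """Sum squared DFT magnitudes at probe frequencies spaced step_hz apart."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    energy = 0.0
    for freq in range(low_hz, high_hz + 1, step_hz):
        omega = 2 * math.pi * freq / sample_rate
        re = sum(v * math.cos(omega * i) for i, v in enumerate(centered))
        im = sum(v * math.sin(omega * i) for i, v in enumerate(centered))
        energy += (re * re + im * im) / (n * n)
    return energy


def roughness_index(samples, sample_rate):
    """Share of envelope modulation energy in the 30-150 Hz band (0 to 1)."""
    envelope = amplitude_envelope(samples, sample_rate)
    rough = band_energy(envelope, sample_rate, 30, 150)
    slow = band_energy(envelope, sample_rate, 5, 25)  # assumed comparison band
    total = rough + slow
    return rough / total if total > 0 else 0.0


def modulated_tone(mod_hz, sample_rate=8000, seconds=0.25, carrier_hz=1000):
    """Synthetic test signal: a pure tone whose loudness wobbles at mod_hz."""
    count = int(sample_rate * seconds)
    return [
        (1 + 0.8 * math.sin(2 * math.pi * mod_hz * i / sample_rate))
        * math.sin(2 * math.pi * carrier_hz * i / sample_rate)
        for i in range(count)
    ]


if __name__ == "__main__":
    for mod_hz in (5, 70):
        score = roughness_index(modulated_tone(mod_hz), 8000)
        print(f"{mod_hz} Hz modulation -> roughness index {score:.2f}")
```

A synthetic tone whose loudness wobbles 70 times per second scores high on this index, while the same tone wobbling 5 times per second scores low. A machine can be tuned to that difference without any access to the vocalizer’s state, which is precisely why “real” drops out of the picture.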
Because of their interest for audio surveillance applications, screams are currently a contested object and a hot commodity. Much as is the case with other scream distinction/detection enterprises, the initial training of OS most likely involved that vast and available bank of crafted scream renditions—by professional actors, machines, combinations of those, by and for an industry otherwise partial to female non-speech sounds—conveniently the exact type of “thick with body” female voicings OS is also invested in. For some readers, myself included, this might come across as creepy and, science-wise, flimsy.
Scream research often relies on how human listeners recruited for the cause respond to audio samples. Apparently, whether the scream is “real,” acted, or post-produced is neither something study subjects necessarily distinguish nor a determining factor in how they rate and react. In terms of machines learning to scream-mine audio data, it is what it is: “natural corpora with extreme emotional manifestation and atypical sounds events for surveillance applications” are scarce, unreliable, and largely unavailable because of their private character. That is no longer the case for OS, which has been accruing, and machine-learning from, its beta users’ screams as well as how users themselves monitor/rate their screams and the app’s sensitivity. OS users’ screams might not be exactly ad lib, as users/vocalizers first practice with the “screamometer” to learn to scream for and as a means to interface with OS, but it’s as natural a corpus as it gets, and it’s free for the users of the screams. OS not only echoes “voice stress analysis” technologies invested in distinguishing true from fake or in ranking urgency, but, as part and parcel of a larger scream surveillance enterprise, also public surveillance technologies such as ShotSpotter, all of which Lawrence Abu Hamdan has brilliantly dissected in his essay on the recording of the police gunshots that killed Michael Brown in Ferguson, Missouri in 2014.
Chilla is a strikingly similar app developed and available in India—although there’s a nuanced difference in the developer’s rationale for Chilla, which in its pursuance of scream-activated personal safety also aims to compensate for the fact that many girls and women don’t call “parents or police” for help when harassed or in danger. As presented, Chilla responds both to assaults and to women’s ambivalence towards their guardians. The latter is, too, a manifestation of the breadth of gender-based violence as a socio-cultural problem, one that Chilla is trained to fail to listen to and one that, because of OS’s particular niche user market, is simply out of the purview of its UK counterpart.
That problem, and that failure, is exclusive neither to India nor to scream-activated personal safety apps. Calling 999 in the UK, 911 in the US, or 091 in Spain, where I am writing, doesn’t come naturally to many targets of sexual and gender-based violence because they don’t conceive of police as help or because, directly, they see it as a risk, to themselves and/or to others. As Andrea Ritchie has copiously documented in Invisible No More: Police Violence Against Black Women and Women of Color, women of color and Black women in particular are at extremely high risk for rape and sexual abuse by police officers, as high as 1 in 5 women in New York City alone.
OS, then, is framed as a pragmatic, partial answer to a problem it doesn’t solve: “We should never have to dress in a certain way…but we do.” The specifics of how OS would actually “save” or even has saved its users in particular scenarios go unexplained, because OS is meant to help with feeling safe; getting into the details, and the what ifs, compromises that service. This sense of safety has two components and is based on two promises: one, that OS will listen to your (panic) scream, and, two, that, as of now via the intermediacy of your contacts, the police will go save you. The second component and its assumed self-evidence speak to the app’s whiteness and to its target market of white, securitized, cisgender female subjects.
Over and above its acoustic profiling, the app is simply not designed with every woman in mind. OS’s branding is about a certain lifestyle, one of going for early runs and dates with cis-men, of taking time for yourself because you’re super busy at your white-collar job and going for night runs, of taking inspiration from “world” women and skipping if running isn’t for you. This lifestyle is also sold: sold as always under the threat of rape (despite its “rightfulness”), sold in a way that animates the feelings of insecurity and disempowerment that One Scream advertises itself as capable of reversing. Safety, then, is sold as retrievable with OS.
Wearable or otherwise portable technologies to keep women “safe,” specifically from sexual assaults, are not new and are varied. These have been vigorously protested, particularly from feminist standpoints other than the white, securitized, capitalist brand OS professes, because, in (partly) delegating safety to technologies that women then become personally responsible for, these technologies further “blame” women. For authorities and the patriarchy, this shift in blame is a relief. In discussing the racialized securitization of US university campuses, Kwame Holmes notes how despite “reactionary attacks” on campus feminism (e.g., so-called “snowflakes” complaining about bad sex) and authorities’ effective reluctance to acknowledge and challenge rape culture, anti-sexual assault technologies tend to be welcomed and accepted. As Holmes also notes, there’s no paradox in that. Those technologies flatten the discussion, deactivate more radical feminist critiques and potential strategies, and protect the status quo, not so much women and not those who, whenever an alarm sounds and especially when security forces respond, readily become insecure.
It is not a stretch to think that OS could potentially amplify the insecurities of Black and brown people subject to white panic (screams) and to its violence, something other audio surveillance technologies are already contributing to; at least, it is no greater a stretch than entertaining situations in which police would show up and save an OS user before it’s too late. Even if it’s never triggered, as developers seem to assume will be the case for the majority of installed units (“Many people have never faced a situation where they have had to panic scream”), it’s trapped in a securitization logic that ultimately relies on masculine authority, one that calls for the expansion of CCTV cameras, wherein women are never quite secure (see Sarah Everard’s vigil).
One Scream’s FAQs cover selected worries that users have or OS anticipates they might have. Among these, there are privacy concerns (i.e., does it listen to your conversations?) and the fear the alarm will activate “when it shouldn’t.” In the Apple App Store user reviews, there’s a more popular type of concern: OS not responding to users’ screams. In other words, there’s simultaneously a worry about OS listening and detecting too much and about OS failing to listen “when it matters.” These anxieties around OS’s listening excesses and insufficiencies touch on (audio) surveillance’s paradoxical workings: does OS encroach on the everyday life of those within users’ cell phones’ earshot while not necessarily delivering on an otherwise modest promise of safety in highly specific scenarios? There’s a unified developer response to these concerns: OS “is trained to detect panic screams only.”
Featured Image: By Flickr user Dirk Haun. Image appears to be a woman screaming on a street corner, but is actually an advertisement on the window of a T-Mobile cell phone shop (CC BY 2.0)
María Edurne Zuazu works in music, sound, and media studies, and researches the intersections of material culture and sonic practices in relation to questions of cultural memory, social and environmental justice, and the production of knowledge (and of ignorance) in the West during the 20th and 21st centuries. María has presented on topics ranging from sound and multimedia art and obsolete musical instruments, to aircraft sound and popular music, and published articles on telenovela, weaponized uses of sound, music and historical memory, and music videos. She received her PhD in Music from The CUNY Graduate Center, and has been the recipient of Fulbright and Fundación La Caixa fellowships. She is a 2021-2022 Fellow at Cornell’s Society for the Humanities.
REWIND! . . .If you liked this post, you may also dig:
Flâneuse>La caminanta–Amanda Gutierrez
Sounding Out! Podcast #63: The Sonic Landscapes of Unwelcome: Women of Color, Sonic Harassment, and Public Space
Echo and the Chorus of Female Machines—AO Roberts
Vocal Gender and the Gendered Soundscape: At the Intersection of Gender Studies and Sound Studies–Christine Ehrick
“Recorder of Dublin”: Ulysses’ FX in 1982
For many, the audiobook is a source of pleasure and distraction, a way to get through the To Read Pile while washing dishes or commuting. Audiobooks have a stealthy way of rendering invisible the labor of creating this aural experience: the writer, the narrator, the producer, the technology…here at Sounding Out! we want to render that labor visible and, moreover, think of the sound as a focus of analysis in itself.
Over the next few weeks, we will host several authors who will make all of us think differently about the audiobook selections on our phones, in our cars, and on our radios. Today we start things off with a close listen of the 1982 audiobook edition of James Joyce’s Ulysses. Watch out for the hoooooooooooooonk of the SO! train pulling into the station!
—Managing Editor Liana Silva
To think about James Joyce’s Ulysses is to think about the first instant when it truly seized your ears. Accordingly, my Ulysses begins in its final episode, “Penelope”: Molly Bloom is lying down or sitting up next to a passed-out Leopold Bloom when she hears the “frseeeeeeeefronnnng train somewhere whistling.” Her train does not go chug, choo, or chuff, but it rhymes with her “Loves old sweeeetsonnnng” (1669) with an infectious insouciance for the codes of language. Let us call this the Ulysses of 1922 (though the definitive edition of James Joyce’s book whose page numbers are cited here was produced in 1984 by Hans Walter Gabler).
The Ulysses of 1922 is what Jacques Derrida called gramophonic. It plays back to us something recorded without filtering out the noise and is to be heard more than it is to be read. We listen to the book, but we are second in line. The first listener is the book itself, which listens to Dublin and records everything with an odd sonic democracy, discriminating little amid its recording of all sounds vivid or vapid, giving equal importance to cats, carts, bells, machines, laughter, coughs, and language. The book saunters about the city, listening and recording, and we listen to the book like we would to a scratchy, static-filled recording of a concert the morning after. It is a reminder of something Michel Serres once said in The Five Senses: “Meaning trails this long comet tail behind it. A certain kind of æsthetics… take as their object this brilliant trail” (120). Ulysses’ elusive modern city glows in this comet tail of noise and background static more than it pivots around conventionally meaningful language content. Eventually, industrial and technological modernity catches up with artistic modernism, and in 1924 Joyce reads and records parts of the “Aeolus” episode of Ulysses, and later in 1929 he records a section of Finnegans Wake. Many years after, in 1982 – the centenary year of Joyce’s birth – Ulysses comes home to Dublin and is recorded in full by Irish national radio.
The 1982 Ulysses Broadcast was an uninterrupted twenty-nine-and-a-half-hour reading of the entire unabridged text on Ireland’s RTÉ Radio on 16th June – Bloomsday – produced by Micheál Ó hAodha. Alongside the two film versions, one from 1967 and the other from 2003, and other recordings such as the ones by LibriVox volunteers and a more recent one by BBC Radio 4, the 1982 Ulysses Broadcast stands out as the first complete recording of the text. Director William Styles called upon voice actors from the Radio Éireann Players to dramatize and act Ulysses out.
My Ulysses of 1982 seizes me differently from the book. From the first seconds of the 1982 Broadcast, I reacted to Buck Mulligan stepping down the stairs inside the Martello Tower with surprise, because the reading is somewhat copiously accompanied; the sounds of loud waves outside of the walls of the seaside tower were part of the soundscape I was thrown into.
Immersion was of the essence. Not that the Ulysses of 1922 is by any means a silent text, but this accompaniment was a simultaneous roar. Sounds in the written text take up space, and as these sounds are being “played” in the book, there is a length of text where nothing else is happening. Think, for instance, of the machinery in the “Aeolus” episode: “Almost human the way it sllt to call attention” (251). As the “sllt” is recorded by the book, it is not over or behind any other sound or voice. It takes up its own space, unlike in the Broadcast.
The layering of Buck Mulligan’s voice over the sounds of the sea becomes possible in the move from the spatial-visual of the page to the temporal-aural of a recording. However, listening to the Broadcast prompts me to ask: Is the sonic democracy of recording the soundscape still there?
Most critical work on the audiobook focuses on readerly reception and pleasure, almost indicating that we can hear the Ulysses of 1922 but we must read the Broadcast of 1982; the book provides for more direct sensory engagement while with the Broadcast, we must focus on analyzing the mechanics of our reception. We also get terms like Reinhart Meyer-Kalkus’ “hear-reading” (179) or Matthew Rubery’s “ear contact” (72) which are concerned with the link between the playback of the recorded text and the reading ear. We hear-read when we listen to the voice in our heads recite aloud to us what we are reading, and we establish ear contact, much like eye contact, when we find our ears bound to voices instead of people. Both these concepts are concerned with reception. If we steer clear of our listening of the Broadcast and turn the focus to the Broadcast’s listening of Ulysses, what we find is a rich sonic world, but it is one which takes us away from the linguistic play of the text.
For instance, the book gives cues for the ambient sounds of Dublin clamor surrounding any voice which might be speaking at that moment. “Stream of life” (327) signals in the Broadcast the coming alive of the city soundscape. What is described as a “sudden screech of laughter” (255) in the book is layered upon loud laughter in the Broadcast, as is “a loud cough” (281) upon a loud cough, and a telephone which “whirred” (283) upon the sound of an actual ringing telephone. Later, in the “Circe” episode, a mention of whistling (1169) is also whistled out.
Trams, the clatter of plates and glasses, desks being rapped, coins and bells ringing and jingling, cannon-firing, all these sounds are played as accompaniments again and again as their descriptions are being voiced in the Broadcast. Like in bedtime storytelling, says Brigitte Ouvry-Vial, sound effects as uncomplicated accompaniments are never in conflict with the voiced text. Think of pictures and illustrations alongside words in children’s literature (185). The background sound effects of the Broadcast add nothing to the sonic democracy of the book even if they do not detract from it.
The Ulysses of 1922 is also rife with non-lexical, unpronounceable sounds, like the ones Bloom’s cat makes. The many different cat sounds, for example “Mkgnao!” and “Mrkgnao!” and “Mrkrgnao!” (107-8), are not voiced at all in the Broadcast, and are instead replaced by the mimicked sounds of a cat meowing, almost exactly the same each time.
“Miaow!” (133) and “Prr” (107), which are Bloom’s responses to his cat, are voiced by him. When the “door of Ruttledge’s office whispered: ee: cree” (243), there is no voicing – only the sound of a creaking door. Yet, when we are in Bloom’s thoughts, like when he remembers a glorious gust of wind which blew up Molly’s skirt, he voices the gust of wind in the Broadcast going “Brrfoo!” (329), pronouncing the non-lexical word with a close approximation. Would not the non-lexical sounds in his head suggest that he is thinking in sound rather than in language, much like many of us who can hear sounds in our heads? Often but not always, environmental sounds are retained as actual sounds while the sounds in Bloom’s head are sublimated into pronounceable, phonetic language. But mostly there is an insistence on adding sound effects wherever possible.
Whether the book describes the sound or sounds it with a non-lexical string of words, the Broadcast attaches its effects. If we look at the book as a recorder, its movements are staggeringly complex as it moves in and out of multiple spaces. When it is in Bloom’s head, the environment is muted, and when it is inside a carriage, unless it is poked out an open window, it does not record the street. Save for a few instances, the Broadcast’s insistence on effects attests to its rich production, but not to its vitality. It therefore stands as an accompaniment to the book, not as a text in its own right given its compositional inconsistencies. So, the several variations on Bloom’s flatulence with “Rrrrrr” (625), “Fff. Oo. Rrpr,” and “Pprrpffrrppfff” are all erased and instead fart sounds are recorded.
On the same page, when Bloom tries to mask his own sounds of bodily release under the din of the passing tram, the “Krandlkrakran” (629) is both voiced by Bloom and recorded as the sound of a noisily ringing tram in the background. But only an actual train whistles in “Penelope,” with no voice in the Broadcast attempting to say “frseeeeeeeefronnnng” (1669).
For Charles Bernstein, the sound of a work of literature, much like the shape of poetry on the page, might be an element which is “extralexical but… not extrasemantic” (5). It is different from the written word but it is not a meaningless ornament. For the Broadcast, however, it might as well be the case that sound is made irrelevant to meaning. Or, we can argue that the meaning being made is in the realm of performance studies and not literature. The pure temporality of the Broadcast helps. We can stop reading the book to look, but we cannot stop the Broadcast and still listen. Moreover, when the Broadcast records, it is listening to the book’s listening of Dublin, removed by another degree from the soundscape of Dublin.
The Broadcast is not, however, without value. Bernstein echoes Serres when he aggrandizes the “sheer noise of language” (22) which must take precedence over the impulse to decode everything. The Broadcast answers this need not to immediately rationalize and sublimate in analysis everything that is heard, but rather to hear without listening. Cue the poet Robert Carleton Brown who once said that writing since the very beginning has been “bottled up” inside of books (23). And in 1982, the stopper on Joyce’s spuming prose was popped.
Featured Image: “telemachus: the tower, 8 a.m., theology, white/gold, heir, narrative (young)” by Flickr user brad lindert, CC-BY-2.0
Shantam Goyal studies English Literature at the State University of New York at Buffalo for his PhD. He completed his M.Phil in 2018 from the University of Delhi with a dissertation titled “Listen Ulysses: Joyce and Sound.” He hopes to continue this thread for his doctoral research on Finnegans Wake and mishearing. Besides Joyce Studies and Sound Studies, he works on Poetics and Jazz Studies, and is also attempting to translate parts of Ulysses into Hindi as a personal project. His reviews, articles, and creative work have appeared in The Print, The Hindu Business Line, Vayavya, ColdNoon, Daath Voyage, and Café Dissensus among other publications. He prefers that any appellations for him such as academic, poet, or person be prefaced with “Delhi-based.”
REWIND! . . .If you liked this post, you may also dig:
This Is How You Listen: Reading Critically Junot Diaz’s Audiobook-Liana Silva
“‘HOW YOU SOUND??’: The Poet’s Voice, Aura, and the Challenge of Listening to Poetry”-John Hyland
“This Is Not A Sound”: The Treachery Of Sound in Comic Books-Osvaldo Oyola