
Your Voice is (Not) Your Passport

In summer 2021, sound artist, engineer, musician, and educator Johann Diedrick convened a panel at the intersection of racial bias, listening, and AI technology at Pioneerworks in Brooklyn, NY. Diedrick, 2021 Mozilla Creative Media award recipient and creator of such works as Dark Matters, is currently working on identifying the origins of racial bias in voice interface systems. Dark Matters, according to Squeaky Wheel, “exposes the absence of Black speech in the datasets used to train voice interface systems in consumer artificial intelligence products such as Alexa and Siri. Utilizing 3D modeling, sound, and storytelling, the project challenges our communities to grapple with racism and inequity through speech and the spoken word, and how AI systems underserve Black communities.” And now, he’s working with SO! as guest editor for this series (along with ed-in-chief JS!). It kicked off with Amina Abbas-Nazari’s post, helping us to understand how Speech AI systems operate from a very limiting set of assumptions about the human voice. Last week, Golden Owens took a deep historical dive into the racialized sound of servitude in America and how this impacts Intelligent Virtual Assistants. Today, Michelle Pfeifer explores how some nations are attempting to draw sonic borders, despite the fact that voices are not passports.–JS

In the 1992 Hollywood film Sneakers, depicting a group of hackers led by Robert Redford performing a heist, one of the central security architectures the group needs to get around is a voice verification system. A computer screen asks for verification by voice and Robert Redford uses a “faked” tape recording that says “Hi, my name is Werner Brandes. My voice is my passport. Verify me.” The hack is successful and Redford can pass through the securely locked door to continue the heist. Looking back at the scene today it is a striking early representation of the phenomenon we now call a “deep fake” but also, to get directly at the topic of this post, the utter ubiquity of voice ID for security purposes in this 30-year-old imagined future.

In 2018, The Intercept reported that Amazon filed a patent to analyze and recognize users’ accents to determine their ethnic origin, raising suspicion that this data could be accessed and used by police and immigration enforcement. While Amazon seemed most interested in using voice data for targeting users for discriminatory advertising, the jump to increasing surveillance seemed frighteningly close, especially because people’s affective and emotional states are already being used for the development of voice profiling and voice prints that expand surveillance and discrimination. For example, voice prints of incarcerated people are collected and extracted to build databases of calls that include the voices of people on the other end of the line.


“Collect Calls From Prison” by Flickr User Cobalt123 (CC BY-NC-SA 2.0)

What strikes me most about these vocal identification and recognition technologies is that their appeal seems to lie, for advertisers, surveillers, and policers alike, in the idea that voice is an attractive method to access someone’s identity. Supposedly there are fewer possibilities to evade or obfuscate identification when it is performed via the voice. It “is seen as a solution that makes it nearly impossible for people to hide their feelings or evade their identities.” The voice here works as an identification document, as a passport. While passports can be lost or forged, accent supposedly gives access to the identity of a person that is innate, unchanging, and tied to the body. But passports are not only identification documents. They are also media of mobility, globally unequally distributed, that allow or inhibit movement across borders. States want to know who crosses their borders, who enters and leaves their territory, increasingly so in the name of security.

What, then, when the voice becomes a passport? Voice recognition systems used in asylum administration in the Global North show what is at stake when the voice, and more specifically language and dialect, come to stand in for a person’s official national identity. Several states including Denmark, the Netherlands, the United Kingdom, Switzerland, Sweden, as well as Australia and Canada have been experimenting with establishing the voice, or more precisely language and dialect, to take on the passport’s role of identifying and excluding people.

“Passport Brochure” by Craig James (CC BY-NC 2.0)

In the 1990s, not too far from the time of Sneakers’ release, these states started to use a crude form of linguistic analysis, later termed Language Analysis for the Determination of Origin (LADO), as part of the administration of claims to asylum. In cases where people could not provide a form of identity documentation or when those documents would be considered fraudulent or inauthentic, caseworkers would look for this national identity in the languages and dialects of people. LADO analyzes acoustic and phonetic features of recorded speech samples in relation to phonetics, morphology, syntax, and lexicon, as well as intonation and pronunciation.

The problems and assumptions of this linguistic analysis are multiple, as linguists have pointed out and critiqued. 1) It falsely ties language to territorial and geopolitical boundaries and assumes that language is intimately tied to a place of origin according to a language ideology that maps linguistic boundaries onto geographical boundaries. Nation-state borders on the African continent and in the Middle East were drawn by colonial powers without consideration of linguistic communities. 2) LADO treats language and dialect as static, monoglossic, and a stable index of identity. These assumptions produce the idea of a linguistic passport in which language is supposed to function as a form of official state identification that distributes possibilities and impossibilities of movement and mobility. As a result, the voice becomes a passport and it simultaneously functions as a border, by inscribing language into territoriality. As Lawrence Abu Hamdan has written and shown through his sound art work The Freedom of Speech Itself, LADO functions to control territory, produce national space, and attempts to establish a correlation between voice and citizenship.

Language Analysis is the Second Step in Claiming Asylum in the UK (Home Office Science: Migration Border Analysis, 2012 p.37), see also K. Wilson’s LADO: An Investigative Study

I’ll add that the very idea of a passport has a history rooted in forms of colonial governance and population control and the modern nation-state and territorial borders. The body is intimately tied to the history of passports and biometrics. For example, German colonial administrators in South-West Africa, present-day Namibia and a German overseas colony from 1884 to 1919, instituted a pass badge system to control the mobility of Indigenous people, create an exploitable labor force, and institute and reinforce white supremacy and colonial exploitation. Media and Black Studies scholar Simone Browne describes biometrics as “digital epidermalization,” a term that captures how surveillance becomes inscribed and encoded on the skin. Now, it’s coming for the voice too.

In 2016 the German government took LADO a step further and started to use what it calls voice biometric software that supposedly identifies the place of origin of people who are seeking asylum. Someone’s spoken dialect is supposedly recognized and verified on the basis of speech recordings with an average length of 25.7 seconds by software employed by the German Federal Office for Migration and Refugees (in German abbreviated as BAMF). The dialect recognition software currently used by German asylum administrators distinguishes between five large Arabic dialect groups: Levantine, Maghreb, Iraqi, Egyptian, and Gulf. Just recently this was expanded with language models for Farsi, Dari, and Pashto. There are plans to expand this software usage to other European countries, evidenced by BAMF traveling to other countries to demonstrate its software.

“voice vectors” Universal (CC0 1.0)

This “branding” of BAMF’s software stands in stark contradiction to its functionality. The software’s error rate is 20 percent. It is based on a speech sample as short as 26 seconds. People are asked to describe pictures while their speech is recorded; the software then indicates a percentage of probability of the spoken dialect and produces a score sheet that could indicate the following: 74% Egyptian, 13% Levantine, 8% Gulf Arabic, 5% Other. The interpretation of results is left to the caseworkers without clear instructions on how to weigh those percentages against each other. The discretion left to caseworkers makes it more difficult to appeal asylum decisions. According to BAMF, the results are supposed to give indications and clues about someone’s origin and are not a decision-making tool. However, as I have argued elsewhere, algorithmic or so-called “intelligent” bordering practices assume neutrality and objectivity and thereby conceal forms of discrimination embedded in technologies. In the case of dialect recognition, the score sheet’s indicated probabilities produce a seeming objectivity that might sway caseworkers in one direction or another. Moreover, the software encodes distinctions between who is deserving of protection and who is not, a feature of asylum and refugee protection regimes critiqued by many working in the field.

The functionality and operations of the software are also intentionally obscured. Researcher and sound artist Pedro Oliveira addresses the many black-boxed assumptions entering the dialect recognition technology. For instance, in his work Das hätte nicht passieren dürfen he engages with the labor involved in producing sound archives and speech corpora and challenges “the idea that it might be feasible, for the purposes of biometric assessment, to divorce a sound’s materiality from its constitution as a cultural phenomenon.” Oliveira’s work counters the lack of transparency and accountability of the BAMF software. Information about its functionality is scarce. Freedom of information requests and parliamentary inquiries about the technical and algorithmic properties and training data of the software were denied as the information was classified because “the information can be used to prepare conscious acts of deception in the asylum proceeding and misuse language recognition for manipulation,” the German government argued. While it is not necessarily deepfakes like the faked Brandes recording used to bypass a security system that the German authorities are worried about, the specter of manipulation of the software looms large.

The software’s poor functionality can have drastic consequences for asylum decisions. Vice reported in 2018 the story of Hajar, whose name was changed to protect his identity. Hajar’s asylum application in Germany was denied on the basis of a dialect recognition software that supposedly indicated that he was a Turkish speaker and, thus, could not be from the Autonomous Region Kurdistan as he claimed. Hajar, who speaks the Kurdish dialect Sorani, had been instructed by BAMF to speak into a telephone receiver and describe an image in his first language. The software’s results indicated a 63% probability that Hajar speaks Turkish, and the caseworker concluded that Hajar had lied in his asylum hearings about his origin and his reasons to seek asylum in Germany. Hajar continued to appeal the asylum decision. The software is not equipped to verify Sorani and should not have been used on Hajar in the first place.

Biometric Island, Gdansk University of Technology 2021, Image by Dawid Weber  (CC BY 3.0)

Why the voice? It seems that bureaucrats and caseworkers saw it as a way to identify people with ease and scale language analysis more easily. It is also important to consider the context in which this so-called voice biometry is used. Many people who seek asylum in Germany cannot provide identity documents like passports, birth certificates, or identification cards. This is because people cannot take the documents with them as they flee, the documents are lost or stolen on people’s journeys, or they are confiscated by traffickers. Many forms of documentation are also not accepted as legitimate by state authorities. Generally, language analysis is used in a hostile political context in which claims to asylum are increasingly treated with suspicion.

The voice as a part of the body was supposed to provide an answer to this administrative problem of states. In response to the long summer of migration in 2015, Germany hired McKinsey to overhaul its administrative processes, save money, accelerate asylum procedures, and make them more “efficient.” In July 2017, the head of the Department for Infrastructure and Information Technology of the German Federal Office for Migration and Refugees hailed the office’s new voice and dialect recognition software as “unrivaled world-wide” in its capacity to determine the region of origin of asylum seekers and to “detect inconsistencies” in narratives about their need for protection. More than identification documents, personal narratives, or other features of the body, the voice, the BAMF expert suggests, is the medium that allows for the indisputable verification of migrants’ claims to asylum, ostensibly pinpointing their place of origin.

Voice and dialect recognition technologies are established by policy makers and security industries as particularly successful tools to produce authentic evidence about the origin of asylum seekers. Asylum seekers have to sound as though they are from a region that warrants their claims to asylum, which requires the translation of voices into geographical locations. As a result, automated dialect recognition becomes more valuable than someone’s testimony. In other words, the voice, abstracted into a percentage, becomes the testimony. Here, the software, similarly to other biometric security systems, is framed as a more objective, neutral, and efficient way of identifying people’s country of origin than human decision-makers. As the German Migration agency argued in 2017: “The IT supported, automated voice biometric analysis provides an independent, objective and large-scale method for the verification of the indicated origin.”

“Soundwave and Spectrogram of ‘CIRCLE’” by Lena Zipp, University of Zurich (CC BY-NC-ND 2.0)

The use of dialect recognition puts forth an understanding of the voice and language that pinpoints someone’s origin to a certain place, without a doubt and without considering someone’s movement or history. In this sense, the software inscribes a vision of a sedentary, ahistorical, static, fixed, and abstracted human into its operations. As a result, geographical borders become reinforced and policed as fixed boundaries of territorial sovereignty. This vision of the voice ignores multiple mobilities and (post)colonial histories and reinscribes the borders of nation-states that reproduce racial violence globally. Dialect recognition reproduces precarity for people seeking asylum. As I have shown elsewhere, in the absence of other forms of identification and the presence of generalized suspicion of asylum claims, accent accumulates value while the content of testimony becomes devalued. Asylum applicants are placed in a double bind, simultaneously being incited to speak during asylum procedures and having their testimony scrutinized and placed under general suspicion.

Similar to conventional passports, the linguistic passport also represents a structurally unequal and discriminatory regime that needs to be abolished. The software was framed as providing a technical solution to a political problem that intensifies the violence of borders. We need to shift to pose other questions as well. What do we want to listen to? How could we listen differently? How could we build a world in which nation-states and passports are abolished and the voice is not a passport but can be appreciated in its multiplicity, heteroglossia, and malleability? How do we want to live together on a planet increasingly becoming uninhabitable?

Featured Image: Voice Print Sample–Image from US NIST

Michelle Pfeifer is a postdoctoral fellow in Artificial Intelligence, Emerging Technologies, and Social Change at Technische Universität Dresden in the Chair of Digital Cultures and Societal Change. Their research is located at the intersections of (digital) media technology, migration and border studies, and gender and sexuality studies and explores the role of media technology in the production of legal and political knowledge amidst struggles over mobility and movement(s) in postcolonial Europe. Michelle is writing a book titled Data on the Move: Voice, Algorithms, and Asylum in Digital Borderlands that analyses how state classifications of race, origin, and population are reformulated through the digital policing of constant global displacement.


REWIND! . . .If you liked this post, you may also dig:

“Hey Google, Talk Like Issa”: Black Voiced Digital Assistants and the Reshaping of Racial Labor–Golden Owens

Beyond the Every Day: Vocal Potential in AI Mediated Communication –Amina Abbas-Nazari 

Voice as Ecology: Voice Donation, Materiality, Identity–Steph Ceraso

The Sound of What Becomes Possible: Language Politics and Jesse Chun’s 술래 SULLAE (2020)–Casey Mecija

The Sonic Roots of Surveillance Society: Intimacy, Mobility, and Radio–Kathleen Battles

Acousmatic Surveillance and Big Data–Robin James

SO! Amplifies: Immigrants Wake America Podcast and the Work of Engaged Digital Humanities

SO! Amplifies. . .a highly-curated, rolling mini-post series by which we editors hip you to cultural makers and organizations doing work we really really dig.  You’re welcome!

Conceptualized at a time of rampant increase in anti-immigrant violence, Immigrants Wake America is a creative response to the growing bias and violence against immigrant women in the U.S., as seen in the Atlanta shootings, the rise in hate crimes since the onset of Covid-19, and the US-Mexico border crisis. We believe that storytelling allows us to find similarities and differences between ourselves and others, offering a humanizing counterpart to harmful media narratives. The podcast creates a living archive of stories not yet heard, serving as an audio intervention into how immigrant women’s (hi)stories are narrated and passed on.

Tenement Museum in New York’s Lower East Side. Image by Flickr User Cliff Dix (CC BY-NC-ND 2.0)

Immigrants Wake America is a public humanities, community-engaged project of digital storytelling through podcasts, in partnership with the Tenement Museum in New York. It features storytellers who share their family stories about migration and the centrality of women in their life histories. These storytellers have submitted stories to the Tenement Museum’s digital archive Your Story, Our Story (YSOS).

Founded in 1988, the Tenement Museum focuses on immigration and immigrants to “foster a society that embraces and values the role of immigration in the evolving American identity.” YSOS, cofounded by Annie Polland and Kathryn Lloyd, is a digital archive that houses stories associated with immigration, migration, and cultural identity. Some of the storytellers are first generation immigrants, while others are descendants of immigrants, born and raised in the US; their great-grandparents or grandparents migrated to the US ages ago. Through YSOS, the Tenement Museum invites people across the country to share their stories in the online digital storytelling exhibit. Each story reveals one individual’s experience. Together, the stories help us see how the unique histories shape the nation, and the patterns that bind us together.

screencap of Your Story Our Story homepage

Through exploring and curating stories from Your Story Our Story, we facilitate conversations that supplement and expand it. This makes possible the conception of an archive that is both dynamic and collaborative. Such an archive resists the colonization and appropriation of lives and narratives of our storytellers. We navigate through the ethical conundrums that one might structurally and personally face in this collaborative endeavor. In our engagement with the archives at the Tenement Museum, we believe that our podcasting project really opens up the possibilities for an expansion of the archive.

We released our first episode, the Introductory Episode, on January 15th, 2022, and have since been consistently releasing one episode per month.

While our podcast does not claim to retrieve or lay out these microhistories in their entirety, at an early stage of its development, we came to realize the potential that the form of the podcast itself offers for a different kind of storytelling. In our podcast, we treat stories as primary documents instead of marginalia. Michelle Caswell (2014) uses the term “symbolic annihilation” to describe the absence or misrepresentation of marginalized communities in archives. She advocates for community archives as a powerful force in countering “symbolic annihilation.” In thinking about archives in The Archaeology of Knowledge, Michel Foucault is concerned with “the density of discursive practices” wherein he observes “systems that establish statements as events and things” (145). This system of statements (as events or things) is what contributes to the law of what can be said. Processes of digital communal archiving such as those done by the South Asian American Digital Archive (SAADA) or the Tenement Museum attempt to extend or expand the systematic possibility of events and things. Caswell and her colleagues have demonstrated the importance and success of the SAADA project. They have also pointed to the impossibility of representation in a traditional archive which is built on violence committed on colonized and enslaved bodies, also eloquently pointed out by Saidiya Hartman’s scholarship.

Through our experience we’ve learnt that podcasts can serve as a transgressive-dynamic expansion of digital archiving, given their unique ability to cut across racial and gendered lines of preconceived sonic notions and their potential to expand the current techniques and media of digital archiving. We map this formal potential of the podcast in the way it intersects with digital archiving in the following ways:

First, narratorial voice.

We wanted our project to act as an intervention in the way in which immigrant women’s (hi)stories are consumed and passed on. We wanted to provide counter narratives. It was essential that the storytellers share their stories in their own voices, literally! The audio medium allows us to produce a space for listening to voices that are otherwise marginalized and/or demonized. 

–Le Li and Shruti Jain

Among the several unique and inspiring stories of resilience that the Tenement Museum houses, one such is a story by an immigrant case manager at the American Civic Association in Binghamton, Goretti Mugambwa. The museum and our podcast make it possible for her story to be narrated by herself in her voice. With her experience of working with the refugee and immigrant community she also does not just remain an individual voice, but acts to further a collective assertion.

Next, sonic variations.

Our storytellers’ voices are not just “characteristics” of the story but are an essential part of the story itself. We believe that each immigrant and their descendant brings to the story their unique tonal texture. This diversity destabilizes what immigrants and their descendants are expected to sound like. The sounds we add in the editing process are minimal. We try not to impose emotional cues and responses upon our listeners.

–Shruti Jain and Le Li

The multiplicity of voices in our podcast–and therefore in the archive–is not just a “characteristic” of the aural storytelling or listening process, but is as much an essential part of the story itself. In line with what The Sonic Color Line reminds us, our work also finds that “sound frequently appears to be visuality’s doppelgänger in U.S. racial history” (Stoever 4). This leads to the coding of race as not just visual but aural too. We want to clarify that the white constructed ideas of how people of color must sound flatten out the complexities in how people within and across communities do sound. At the same time, these notions of white sonic normativity also create a strong sense of what one must or must not sound like in order to succeed in the racial capitalist world order. The storytellers of our podcast and we ourselves are of diverse backgrounds. This, for us, is a way to demonstrate the “complex range of sounds actually produced by people of color” (Stoever 43). As Nancy Morales argues in “Óyeme Voz: U.S. Latin@ & Immigrant Communities Re-Sound Citizenship and Belonging,” “the sound of ‘everyday voices’ mobilized against—and remarking on—the nation-state’s attempts to mark immigrant communities as vulnerable exerts an impactful and profoundly material agency.” With its conversational and collaborative format, our podcast serves as a dynamic medium to represent (hi)stories that complicate generic conventions in critical ways.

Then, collaboration.

We have also been personally deeply impacted by the process of working on this podcast. We have made lasting bonds with our colleagues and storytellers alike. The storytellers of our podcast act not just as guests, but as collaborators and stakeholders. Instead of interpreting the stories in our own way and retelling the stories, we collaborate with the storytellers, and facilitate the unfolding of hidden stories by the storytellers. Dr. Lisa Yun, Professor of English at Binghamton University, and Kathryn Lloyd, Senior Director of Programs, Tenement Museum, have been advisors and the executive producers of the podcast. Together with Lloyd and Yun, we built a project on the ethos of collaboration.

The editing process of IWA, too, is different. Rather than making individual editorial decisions, we engage the storytellers directly in post-production. After finishing a first edit of an episode collaboratively between ourselves, we then send it to the storytellers for their feedback and approval before releasing it. Sometimes, the storytellers do suggest changes. Based on their feedback, we re-edit the episode and eventually release it after the storytellers’ approval. We have also innovated methods of community editing, where we edit in groups as large as 15 people.

Finally, accessibility.

The podcast medium makes Immigrants Wake America an ideal project for the public humanities. As opposed to lengthier podcasts, each episode of our podcast is edited down to 15-20 minutes. These can be used by educators as an in-class resource to generate discussion and activities. Community listeners could tune in during lunch breaks, get-togethers, cooking, driving or doing chores. Our episodes can also serve as conversation starters and help facilitate affective bonds among immigrants and non-immigrants alike.

The final episode of our first season, “Finding Our Grandmother in the Records,” aired just last week, and a second season is in the works.

As a way to expand this project, our second season will feature storytellers from our local community in addition to Your Story, Our Story. We plan to have units within our project dedicated to translation, recording and editing, and creating teaching resources. We aim for meaningful and engaged conversations and try to blur the supposed boundaries between the university and the community. Join us!

The first season of Immigrants Wake America was sponsored through the Institute for Advanced Studies in the Humanities at Binghamton University and a Public Humanities Grant from Humanities New York. Dr. Lisa Yun, Professor of English at Binghamton University, and Kathryn Lloyd, Senior Director of Programs, Tenement Museum, have been our advisors and the executive producers of the podcast. IWA is available on major streaming platforms such as Spotify, Google Podcasts, Apple Podcasts, Amazon Music, Soundcloud, and Audible.

Le Li and Shruti Jain are pursuing their PhDs at Binghamton University in the Translation Research and Instruction Program and the English Department respectively. They were Humanities New York Public Humanities fellows (2021-22) and graduate fellows of the Institute for Advanced Studies in the Humanities (IASH) at Binghamton University (2021-22). Through their podcast project and their work with digital community archives, Le and Shruti are currently working on exploring the intersection between podcasts and digital archiving. They try to capitalize on the unique ability that the form of the podcast offers to cut across racial and gendered lines of preconceived sonic notions, which makes possible the conception of an archive that can be both dynamic and collaborative. Le’s research interests include translation studies, cultural studies, diaspora studies, and public humanities. Shruti’s PhD focuses on the Enlightenment, British Empire and the relationalities between race and caste formations.

REWIND!…If you liked this post, you may also dig all this good stuff about sound studies pedagogy! Good luck with Fall semester, folks!:

“Heavy Airplay, All Day with No Chorus”: Classroom Sonic Consciousness in the Playlist Project–Todd Craig

SO! Podcast #79: Behind the Podcast: deconstructing scenes from AFRI0550, African American Health Activism – Nic John Ramos and Laura Garbes

The Sounds of Anti-Anti-Essentialism: Listening to Black Consciousness in the Classroom- Carter Mathes

Making His Story Their Story: Teaching Hamilton at a Minority-serving Institution–Erika Gisela Abad

Teaching Soundwalks in a Course on Gentrification, Black Music, and Corporate America–Rami Toubia Stucky

Deejaying her Listening: Learning through Life Stories of Human Rights Violations– Emmanuelle Sonntag and Bronwen Low

Audio Culture Studies: Scaffolding a Sequence of Assignments– Jentery Sayers

Deep Listening as Philogynoir: Playlists, Black Girl Idiom, and Love–Shakira Holt

“Toward A Civically Engaged Sound Studies, or ReSounding Binghamton”–Jennifer Lynn Stoever

Listening to #Occupy in the Classroom–D. Travers Scott

SO! Podcast #71: Everyday Sounds of Resilience and Being: Black Joy at School–Walter Gershon

Sounding Out! Podcast #13: Sounding Shakespeare in S(e)oul– Brooke Carlson

A Listening Mind: Sound Learning in a Literature Classroom–Nicole Brittingham Furlonge

My Voice, or On Not Staying Quiet–Kaitlyn Liu

(Re)Locating Soundscapes of Schooling: Learning to Listen to Children’s Lifeworlds–Cassie J. Brownell

If You Can Hear My Voice: A Beginner’s Guide to Teaching–Caroline Pinkston

Mukbang Cooks, Chews, and Heals – David Lee

SO! Podcast #80: Refugee Realities Miniseries–Steph Ceraso