Mimicked Voices and Nonhuman Listening: AI Deepfakes, Speech, and Sonic Manipulation in the Digital War on Ukraine

The essays collected in this series (see the Introduction) trace how nonhuman listening operates through sound, speech, and platformed media across distinct but interconnected domains. Across these accounts, listening no longer secures meaning or relation; it becomes a site of contestation, where sound is mobilized, processed, and weaponized within systems that privilege circulation, recognition, and response over truth. In this contribution, Olga Zaitseva-Herz examines how nonhuman listening operates under conditions of war, where AI-generated voices and deepfakes destabilize the very grounds of auditory trust. Through the case of Ukraine, she shows how platforms and political actors alike exploit algorithmic listening systems to amplify affect, circulate disinformation, and transform voice into a tool of psychological warfare. Listening, in this context, becomes not a means of understanding but a terrain of uncertainty. –Guest Editor Kathryn Huether

Russia’s full-scale invasion of Ukraine has unfolded as the most digitally mediated war to date, shaped not only by what circulates online but by how content is heard, interpreted, and amplified. Here, listening is not limited to human hearing: it also includes algorithmic systems that detect, rank, and amplify content, as well as political actors and online publics who interpret and recirculate it. Social media platforms (Telegram, Instagram, TikTok, Facebook) have become sites of psychological warfare where AI-generated audio, video, text, and image-based content is crafted to manipulate perception and provoke rapid emotional responses, often through algorithmic systems attuned to virality and affect. Ukrainian political authorities regularly caution users that everything one reads, hears, or sees could be a psychological weapon. This is not rhetorical. Content is often designed to produce outrage, shock, and despair, emotions that travel quickly across platforms and influence public mood.

AI is used to create fake news videos, synthetic voices, and deepfake conversations, complicating how authenticity is heard and assessed. Some recordings simulate “leaked” phone calls revealing political dissent or strategic plans and are then shared on platforms such as Telegram, Instagram, and Facebook. At the same time, because people’s voices can now be generated with AI, a speaker can also claim that a genuine recording of their voice is AI-generated. A widely circulated case involved Russian music producer Iosif Prigozhin, whose alleged call criticizing the Kremlin provoked significant backlash. Soon after, he claimed the recording was an AI forgery, a statement whose truth remains unclear but which strategically exploits growing public awareness of deepfakes as a means of discrediting or distancing oneself from damaging material. Deepfakes thus do not merely deceive; they also destabilize the conditions of listening and trust, turning listening itself into a site of strategic uncertainty. This uncertainty exploits a growing crisis of trust in listening itself, where voices can always be disavowed as synthetic. Against this backdrop, music and voice emerge as especially powerful media for manipulation, parody, retaliation, and symbolic struggle.

Image: Graafika. Kuulaja. by Avo Keerend, 1980. Providing institution: Pärnu Muuseum, Estonia. Aggregator: E-Varamu. CC0 1.0.

AI Songs as a Tool of Revenge

AI generative tools are also used for irony or parody, as in the viral remake “Samotni Moskali” [Lonely Muscovites], which mocks the Ukrainian pop star Ani Lorak, who moved to Russia. On November 13, 2023, the Telegram channel of Ukrainian journalist and politician Anton Gerashchenko posted a video remake of Ani Lorak’s old song “Poludneva Speka” [Midday Heat], renamed “Samotni Moskali.” The video quickly went viral on social media. Her big hit from the ’00s was remade into strongly pro-Ukrainian content, featuring clips from the current front lines to illustrate new lyrics delivered by an AI voice engineered to closely mimic Lorak’s vocal timbre and affect. The parody relies on listeners’ recognition of her voice and affective style, while the imitation introduces a sharp shift in content between the original and synthetic lyrics.

This social media burst was a response to Ani Lorak’s claimed political neutrality in the context of Russia’s full-scale war against Ukraine, despite clear signs that she supported Russia. Her Ukrainian audience felt betrayed, and these actions seemed aimed both at revenge and at a public break between Lorak and her Ukrainian fan base, showing the impact of her choices. The backlash led to many satirical memes, including AI-generated songs related to her stage persona, appearing on social media. Knowing that, under current Russian politics, she could get into trouble there if the government took the promoted “support” for the Ukrainian army seriously, the revenge group went even further by creating a homepage called “Ani Lorak Foundation,” completely dedicated to fundraisers for the Ukrainian army and presented as Lorak’s own project in which she showcases her support of Ukrainian battalions. Some military drones deployed by the Ukrainian side even ended up bearing stickers with the name of the “Ani Lorak Foundation.” This case demonstrates how AI tools became instruments of public satire, sabotage, and protest in the context of the current full-scale war.

AI Songs as a Weapon

During the full-scale invasion, Russia has been using AI-generated music as a weapon for propaganda and disinformation. In 2023, multiple songs in Ukrainian were created to disrupt Ukraine’s military mobilization efforts and went viral. One of these, “Mamo, Ia Ukhyliant” [Mother, I am a Draft Dodger], became particularly popular in a multitude of variations. The circulation of these songs shows how platforms “listen” to wartime content through metrics of repetition, provocation, and affective intensity, amplifying messages not because they are true, but because they are likely to generate reaction and spread. The songs were algorithmically promoted on TikTok and sparked a viral challenge aimed at undermining Ukraine’s mobilization in 2024 by encouraging Ukrainian men to evade the draft, flee, and party abroad instead. In response, Ukrainian intelligence released an official statement that these songs are products of the Russian disinformation campaign.

This example shows how AI-generated songs are actively used as powerful tools of war, spreading political messages and influencing people’s political choices. The fact that all these songs about draft evasion were released in Ukrainian also highlights the goal of targeting Ukrainian men specifically, since Russian men usually don’t speak Ukrainian and therefore wouldn’t be affected by the content. Furthermore, the simultaneous presence of a large number of these “draft dodger” songs created the impression of widespread societal acceptance through repetition and algorithmic amplification. In this way, repetition itself became a signal of apparent legitimacy: the more frequently such content circulated, the more easily platforms and audiences could register it as evidence of broader consensus around draft evasion among Ukrainians.

Photo by Jon Tyson on Unsplash

AI Pictures on Facebook Mimicking Sound and Sonic Affect

Visual disinformation follows similar viral patterns. There has been a surge of AI-generated images with war-related content that often mimic sound to intensify emotional impact and prompt affective listening. Such images show, for example, a screaming child amid the rubble or a crying soldier in a Ukrainian uniform, paired with a patriotic, pro-Ukrainian message that encourages interaction, such as a like or comment. Even without actual sound, such images solicit a kind of affective listening in which suffering is not literally heard but imagined, projected, and emotionally registered through visual cues. Although this truth-blurring pattern attracted significant attention among many Ukrainians, ironic counter-memes also emerged, mocking its primitive approach.

According to warnings from the Ukrainian online security agency, the accounts that post these images aim to interact with pro-Ukrainian users, ultimately adding them as friends or followers. Then, once they have built a large enough audience, they shift the content they share to pro-Russian material. The strategy relies on gathering an audience that is specifically pro-Ukrainian, as these users interact with images of crying soldiers or the suffering of the Ukrainian people at the front. In this sense, the filtering process functions as a form of nonhuman listening at the level of audience formation: platforms and account managers learn which publics respond to particular emotional cues, cultivate those publics through repeated engagement, and later redirect them toward different ideological content. This creates a filtering mechanism through which an initially pro-Ukrainian audience is gathered, profiled, and later ideologically redirected, alienating loyal followers while pulling political opinion in a more pro-Russian direction.

Pro-Russian AI Songs in Germany to Weaken Support for Ukraine

In Germany, AI-generated songs are being used as propaganda tools to promote pro-Russian sentiment and anti-Ukrainian views. The right-wing party AfD has embraced AI songs as a potent tool in this regard. Multiple mostly anonymous YouTube accounts have emerged spreading right-wing ideas, with songs that not only address German political issues but also openly support Russia. For instance, one song titled “Meine Stimme Habt ihr nicht” [You don’t get my vote] features an AI-created avatar of a tall, strong woman holding German and Russian flags. A version of the same song was also released in Russian. The lyrics criticize Germany’s political course, including military aid to Ukraine, and express a desire to be friends with Russia. The song’s circulation in both German and Russian suggests that listening is being calibrated for different national and linguistic publics, allowing similar political messages to be heard through distinct affective and ideological frames shaped by language, audience, and context.

Contemporary propaganda is increasingly shaped not just by human intent but by rapidly developing nonhuman listening systems—both in production and amplification. Algorithmic listening and perception are exploited to privilege what provokes, not what is true, complicating efforts to regulate digital hate, emotion, and influence. In this context, listening becomes not only a human practice of interpretation, but also a technical system of detection, ranking, and amplification—and, crucially, a site of failure where truth, trust, and perception can no longer be reliably aligned.

Featured Image: Photo by Stanislav Vlasov on Unsplash.

Olga Zaitseva-Herz is an ethnomusicologist working at the intersection of Ukrainian music, war, displacement, and digital culture. She is currently a postdoctoral researcher at the Kule Centre for Ukrainian and Canadian Folklore at the University of Alberta and a guest scholar at Think Space Ukraine at the University of Regensburg. Her research examines how song operates as a medium of political mediation, cultural diplomacy, and historical memory, with a particular focus on popular music and AI-generated sound during Russia’s full-scale invasion of Ukraine. Combining perspectives from ethnomusicology, sound studies, and media analysis, her work investigates how music shapes narratives of resistance, belonging, and global visibility, and how sonic practices illuminate the broader entanglements of culture, technology, and power.

REWIND! . . .If you liked this post, you may also dig:

Hate & Non-Human Listening, an Introduction–Kathryn Huether

Your Voice is (Not) Your Passport–Michelle Pfeifer

Mapping the Music in Ukraine’s Resistance to the 2022 Russian Invasion–Merje Laiapea

SO! Amplifies: An Interactive Map of Music as Ukrainian Resistance to the 2022 Russian Invasion–Merje Laiapea





What Do We Hear in Depp v. Heard?

As you probably know—whether you want to or not—the jury reached a verdict earlier this summer in the trial between Amber Heard and Johnny Depp. The trial, in the Fairfax County Circuit Court in Virginia, involved defamation and counter-defamation claims by the two actors. Heard published a 2018 op-ed in The Washington Post in which she claimed to be “a public figure representing domestic abuse.” Depp sued her for defamation, she counter-sued, and a seven-week spectacle of celebrity, misogyny, and power followed, in which Depp substantially prevailed.

What does a close listening to Depp v. Heard tell us about this particular trial, as well as about sex and power in the courtroom more generally? 

Depp v. Heard did not just randomly become a media circus. As Joanne Sweeny noted in Slate, the judge made two procedural rulings that led to the ensuing frenzy—and greatly tipped the scales toward the plaintiff. Firstly, the judge allowed cameras in the courtroom to broadcast the proceedings. The Code of Virginia leaves this decision largely up to the court’s discretion, but also stipulates that coverage of “proceedings concerning sexual offenses” is prohibited. Despite the content and high-profile nature of this case, Judge Penney Azcarate decided to proceed with the broadcast. 

Untitled Image by Flickr user SethTippie

Azcarate’s decision is strikingly at odds with the court’s emphasis on silence and decorum. Court order CL-2019-2911 stated, for example, that “Quiet and order shall be maintained at all times. Audible comments of any kind during the court proceedings … will not be tolerated.” In fact, Azcarate interrupted proceedings during trial to tell courtroom spectators to keep their mouths shut. During trial, extraneous noise is heard not just as uncivil but as a threat to impartiality and fairness. However, according to the judge’s logic, this threat is only perceived within the courtroom.

This brings us to the second procedural ruling of consequence here. Despite the frenzy enveloping the case, Azcarate decided not to sequester the jury. Jury sequestration involves isolating the members of the jury from the public and press during a trial, in order to avoid accidental or deliberate exposure to outside influence or information. Video from the courtroom flooded the internet and, as commentators have argued, likely unduly influenced the jury, who were neither isolated nor prevented from accessing TV or social media. As Depp’s legions of supporters raged online, social media effectively became part of his legal team. This work was done in great part through sound.

Online commentators forensically dissected Heard’s oral testimony, noting changes in her breathing patterns or her speech cadence. Often they would home in on the fact that she “exhale[d] erratically” or “can talk so fast,” as seen in this Entertainment Tonight compilation:

The online jury took all these vocal elements as proof that she was lying. One internet article described her in audiotape evidence as “cackl[ing] like a witch” and alternating between “laugh[ing] hysterically” and using a “baby voice.” Heard’s detractors took her voice as proof that she was emotionless, robotic, calculated, too well-rehearsed, but also that she was chaotic, nervous, crazy.

In contrast, commentators described Depp’s voice as “calm,” “calming,” and “soothing,” with TikTok users hashtagging ASMR to audio of him. One fan even posted a ninety-minute ASMR video of his testimony. Multiple Twitter users claimed that “you can hear the pain” in his voice, from an audiotape admitted during trial. At other times, he was applauded for “giggling” and laughing during the trial, with fans hearing it as evidence of his authenticity and “kind soul.” One YouTube commentator, Grandma WHOa, wrote that they wish he would record an audiobook so they could “listen to his calming, sexy soothing man voice.”

So far, so predictable. These are well-established, recognizable patterns in how we hear men’s versus women’s voices in public life, such as critiques of Hillary Clinton’s voice as shrill and whiny. But listening in to the trial also reveals that this isn’t just a case of online fan culture on overdrive. Instead, it shows how broader social dynamics around gender and power don’t just create outside noise, but are built into formal legal practice within the courtroom.

Much of the conflict here follows a common pattern in defamation cases involving sexual violence claims, with questions around who gets to be a victim (see my forthcoming piece in HAU: Journal of Ethnographic Theory titled “The Tone of Justice: Voicing the Perpetrator-as-Victim in Sexual Assault Cases”). Depp claimed to have suffered through the defamatory statement and through a longer history of abuse by Heard. His fans framed him as a hero and a victim, using the social media hashtag #HeardIsAnAbuser. On the other hand, they refused to believe Heard’s claims of having suffered abuse. This determination was based at least in part on Heard’s vocal performance and courtroom testimony, with detractors hearing duplicity in her exhalations, her rapid pace, and the occasional firmness and confidence of her tone. As one Depp supporter commented on a video of Heard’s testimony, “There’s no way a victim sounds like this.”

Yet in a key strategic move, Depp’s lawyers chose to make Heard sound precisely as sexual assault victims often do during trial. Seeking to dismantle her credibility, they looked to the toolkit of how to deal with a victim in court, mobilizing a well-worn set of legal techniques used to interrogate survivors of sexual violence. In one cross-examination, for example, the plaintiff’s counsel declares that Heard’s “lies have been exposed to the world multiple times.” This claim is then manifested through a series of vocal disciplinary tactics to undermine Heard’s testimony and depict her as a false witness.

For instance, the lawyer, Camille Vasquez, repeatedly employs a common interrogation technique of speaking over and cutting off Heard as she is replying to a question. As legal scholars and sociologists have shown, such techniques are often used in sexual assault cases to intimidate and shape perceptions of the complainant. In a pioneering study on courtroom talk during rape trials, Reproducing Rape: Domination through Talk in the Courtroom (1993), Gregory Matoesian describes how lawyers reproduce patriarchal relations of dominance and subordination by “usurping” the witness’ ability to respond (186). As he notes, questions, wielded like weapons of attack by skillful lawyers, are more powerful than answers.

Vocal technique and dynamics are key here. In Vasquez’s cross-examinations, she repeatedly raises her voice to interrupt Heard, disciplining her before the jury and spectators. She laughs at her testimony and infantilizes Heard, at times speaking to her in calm tones before quickly shifting to a harsher timbre. At one point, Vasquez snaps her notes shut and walks back to her seat while Heard is still answering her question. Heard is forced into abrupt silence. Unable to respond to the question she was asked, she audibly loses control of the narrative being spun. Vasquez also frequently speaks over her and directly to the judge, objecting that Heard is being non-responsive. The lawyer performs for the judge and jury her refusal to listen to Heard. 

At other moments, Vasquez’s voice and affect telegraph exasperation, as she audibly sighs while Heard attempts to answer a question. As Heard and Vasquez go back and forth over a line of questioning, Vasquez’s voice bristles with irritation as she speaks in clipped tones, with sharp inflection at the end of each line: “Yes?” “Right?” “Yes or no?” These interjections add an aural layer of interpretation to Heard’s testimony in real time, guiding the jury to hear the witness as evasive and therefore unreliable. Vasquez’s expressions are all part of a careful vocal strategy, implicitly saying to the jury, “Can you believe this woman?”

Screenshot from NBC Today video, “Amber Heard Breaks Silence: I Don’t Blame The Jury”

Of course, the answer is no. Jessica Winter, writing in The New Yorker, points out that Heard lost in part because of her “tearless crying,” which made her appear insincere. Winter acknowledges that successful testimony is about “affect and presentation,” a reality that is no secret. In fact, jury instructions in Depp v. Heard clearly state that determinations of witness credibility are based in part on witnesses’ “appearance and manner.” Jurors must use their “common sense” to “determine which witnesses are more believable.”

But how is “common sense” established? Listening closely to this trial reminds us that such understandings are constructed and regulated through sound as well as through determinants of “appearance and manner,” both in and out of the courtroom. Vasquez’s performance, Heard’s subordinated testimony, and the commentary of millions of avid consumers underline that Heard and Depp sound to many people exactly as common sense and conventional norms would dictate. 

A woman claiming abuse and assault at the hands of a more powerful man is always subject to patriarchal ways of listening, even if she is rich, famous, straight, and white. These ways of listening are contradictory. Research shows that “masculine” voices are heard as more authoritative and dominant, while women are often heard as weak, uncertain, lacking confidence. The public ear hears other racialized and gendered voices through similar power inequities, including queer, nonbinary, and LGBT voices or voices of people of color. In the context of sexual assault adjudication, however, Heather Hlavka and Sameena Mulla show in their Law & Society Review article “‘That’s How She Talks’: Animating Text Message Evidence in the Sexual Assault Trial” “that a confident voice and calm performance can work against a victim-witness in court, by suggesting that she is not passive or meek enough to be a ‘real’ victim.” On the other hand, they note that a victim-witness who cries on the stand may give the impression of performing or acting. Lawyers audibly manipulate these perceptions, as the examples here show, and men (particularly heteronormative, white men in positions of power) reap huge benefits from them.

Many observers of Depp v. Heard have noted the toxic social media sludge around the case, as well as the danger that the verdict poses to survivors of domestic abuse and sexual assault. But listening closely to the proceedings shows us that these outcomes aren’t random and aren’t just part of informal processes like trial by TikTok.

Instead, formal court proceedings manipulate and mobilize social scripts around survivors of sexual assault and domestic violence, and around women and marginalized others, to reach their outcomes. We can hear how this strategy plays out through sound and voice, from sighing and interrupting to laughter and silence. The jury instructions in Depp v. Heard state that “Our system of law does not permit jurors to be governed by sympathy, prejudice, or public opinion.” But despite claims that the legal system is based on objectivity and impartiality, we can hear that the law never exists in a bubble – and lawyers often and successfully rely on this very fact. 

Featured image: “Courtroom” by Flickr user Karen Neoh, CC BY 2.0

Nomi Dave is a former lawyer, interdisciplinary researcher, and co-director of the Sound Justice Lab at the University of Virginia, where she is Associate Professor of Music. She is currently co-writing and co-directing a documentary film, Big Mouth, on a defamation lawsuit connected to a sexual violence case in Guinea.

REWIND!…If you liked this post, you may also dig:

“People’s lives are at stake”: A conversation about Law, Listening, and Sound between James Parker and Lawrence English—Lawrence English and James Parker

Vocal Gender and the Gendered Soundscape: At the Intersection of Gender Studies and Sound Studies—Christine Ehrick

Or Does it Explode?: Sounding Out the U.S. Metropolis in Hansberry’s A Raisin in the Sun—Liana Silva