Experts in detecting audio and video fakes say there is overwhelming evidence that the recording of a Baltimore County principal making racist and antisemitic comments is AI-generated.
The two experts — the director of a university media forensics lab and the CEO of an artificial intelligence detection firm that has worked with companies like Visa and Microsoft — say the audio has the hallmarks of a fake.
The audio circulated on social media in January, purporting to be of Pikesville High School principal Eric Eiswert making derogatory comments about students and staff. In the clip, the speaker refers to “ungrateful Black kids who can’t test their way out of a paper bag.” Outrage swirled and prompted a Baltimore County Public Schools investigation that has dragged on for nearly two months with no news on its outcome.
Eiswert has always maintained the audio was fake. Known as deepfakes, audio and video created using artificial intelligence have been used to spread misinformation about public figures like President Biden, but the use of the technology to harm the reputation of a local figure, like a school principal, is unusual.
Eiswert, speaking out for the first time since the audio was released, told The Banner the audio was created to damage his career.
“I did not make this statement, and these thoughts are not what I believe in as both an educator and a person,” he said of the offensive audio, in a written statement.
A spokesperson said Baltimore County Public Schools will not comment on the investigation since it is still ongoing. The school system will notify the public once the investigation is complete.
Audio has ‘hallmarks’ of AI
Siwei Lyu, director of a media forensics lab at the University at Buffalo, said the audio is not particularly sophisticated. Lyu has developed technologies at the State University of New York for spotting audio and images created using artificial intelligence.
This audio is “not a challenging case for the algorithms. I believe someone just made this using an AI voice generator,” Lyu said, adding that he doesn’t believe the person who made it put a lot of effort into the task. Online voice generator tools, like one from Eleven Labs, are available to anyone and advertise their ability to instantly create audio that’s indistinguishable from human speech.
There is, however, clear evidence the audio was manipulated, Lyu said.
“There is some signs of editing, like putting different pieces together,” he said. “This has the sound features of AI generation. The tone is a little flat.”
AI-generated voices tend to have unusually clean background sounds, or a lack of consistent breathing sounds or pauses, Lyu said.
In recent months, universities and companies have been using artificial intelligence to create methods of detecting deepfakes in ways the human ear can’t. Their methods have been getting better with time. Lyu has created the DeepFake-o-meter platform, for example.
Lyu, who has researched digital media forensics, computer vision and machine learning, said his team put the audio through five recent deepfake audio detection methods: three that their lab created and two that others created. In four cases, the audio was deemed AI-generated with 99% certainty, and in the other case with 74% certainty.
The less-certain detection method, Lyu said, identifies “vocoder artifacts,” or evidence of a step that converts a synthesized voice into audio. He said it’s a less reliable way to detect deepfakes than others.
Reality Defender has created its own detection methods, which Ben Colman, the CEO and co-founder, said work with 99% accuracy. The company has worked with governments and companies, including Visa, Microsoft and NBC, to detect image, text or audio deepfakes.
Colman’s team also deemed the Eiswert audio almost certainly AI-generated.
“Our platform not only found it was likely manipulated, but our team looked into it and found it has the hallmarks of AI-generated audio that was then recorded from speaker to another device, likely to mask its generative nature,” said Colman in a statement.
“Separately from the results on it being AI, the audio clearly has several moments where there’s an absence of sound,” Colman said. “There are clear, sudden and incredibly short stops between bits of dialogue that indicate the absence of sound, which itself indicates some level of file manipulation.”
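To illustrate the kind of signal Colman describes, here is a minimal sketch of how abrupt stretches of total silence might be located in a list of audio samples. It is not Reality Defender's method, which the company has not described in detail. The function name `find_silent_gaps`, its `threshold` and `min_len` parameters, and the toy signal are all hypothetical, chosen only for illustration.

```python
def find_silent_gaps(samples, threshold=0.0, min_len=3):
    """Return (start, end) index pairs for runs of silent samples.

    Hypothetical helper for illustration only. A sample counts as
    silent when its absolute value is at or below `threshold`. A run
    is reported only if it spans at least `min_len` samples. `end` is
    exclusive.
    """
    gaps = []
    run_start = None
    for i, sample in enumerate(samples):
        if abs(sample) <= threshold:
            # Start a new run if we are not already inside one.
            if run_start is None:
                run_start = i
        else:
            # A non-silent sample closes any open run.
            if run_start is not None and i - run_start >= min_len:
                gaps.append((run_start, i))
            run_start = None
    # Close a run that reaches the end of the signal.
    if run_start is not None and len(samples) - run_start >= min_len:
        gaps.append((run_start, len(samples)))
    return gaps


# Toy signal (made up): two runs of exact zeros long enough to report,
# plus one isolated zero that is too short to count.
signal = [0.2, -0.1, 0.0, 0.0, 0.0, 0.0, 0.3, 0.0, -0.2, 0.0, 0.0, 0.0]
print(find_silent_gaps(signal))
```

Natural recordings rarely contain stretches of exact zeros, because a room and a microphone always add some background noise, so runs like these can hint that pieces of a file were spliced together. A gap on its own would not show that AI was involved.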
Lyu and Colman said detection techniques do not allow them to be 100% certain whether AI is involved. Lyu said the tools are not as good as DNA technology or fingerprinting.
Eiswert’s life upended
Billy Burke, the head of the administrators union representing Eiswert, said the audio “was manufactured to damage Principal Eiswert, but it has also hurt the students, staff and Pikesville community and the lack of information has added to the damage.”
Many were quick to form their own conclusions about the recording’s authenticity. Former colleagues of Eiswert told The Banner that he’d never say such things and that the comments didn’t align with him as a person. Others have said the opposite. Former students of his declared on social media that it was him on the recording. Even Pikesville High School students told The Banner they believed those were his words.
Burke has maintained Eiswert’s innocence and has said that others should’ve done the same. He said in January that it was discouraging that people assumed Eiswert’s guilt before the investigation was complete, and it’s led to harassment and threats toward both of them. At a Jan. 23 county school board meeting, Burke said, the school system arranged for a police presence at Eiswert’s house.
In his statement, Eiswert called his over 25-year record with the school system “unblemished,” said he believes all students can succeed and noted that he’s created programs for students and educators to “celebrate diversity and excellence in and out of the classroom.”
“This is what makes this crime so insidious,” he wrote. “The belief in the potential of all people has been at the center of all I have done in my career.”
Exploding technology, few regulations
The world has started to see how deepfakes can be problematic. The Nieman Journalism Lab cites multiple examples of AI audio influencing elections in Slovakia, Pakistan and Bangladesh. In January, an AI-simulated voice of President Joe Biden was used to discourage New Hampshire voters from going to the polls in the primaries. While Nieman warns readers of AI influence in elections, a less common conversation is how the technology can affect people who aren’t in the public eye.
According to NBC News, Biden signed a “wide-ranging” executive order in October that introduces regulations for AI companies. For example, it calls on the Commerce Department to create guidance on watermarking AI content to make clear that it was not created by humans.
The technology available to create a deepfake has exploded in just the last year, Colman said, with thousands of tools that are readily available to create AI-generated audio or video of people. Both Lyu and Colman said there aren’t enough federal regulations in place to prevent deepfakes, and few people are ever prosecuted.
“What makes it even more scary, and an emergency, is that you don’t need to become a computer science, or cybersecurity security or an AI expert to use these tools,” Colman said. “The tools are ubiquitous, the protections are not existent.”
A single minute’s recording of someone’s voice can be enough to simulate it with a $5-a-month AI tool, Nieman reported. Getting a sample of the principal’s voice wouldn’t be difficult; there’s a three-minute video of Eiswert online from 2018.
Melba Pearson, vice chair of the Criminal Justice Section of the American Bar Association, said she couldn’t think of a single criminal charge a prosecutor could bring against someone if they faked the audio purporting to be Eiswert. Nothing was stolen; no computer was hacked. Maybe some obscure federal charge, since the audio was widely broadcast. Otherwise, its creator could get away with it.
“I think that we’re very much in uncharted territory because of the fact that artificial intelligence has really come to prominence in the last two, three years,” said Pearson, the director of Prosecution Projects at the Jack D. Gordon Institute for Public Policy at Florida International University.
Eiswert could sue, she said, but that’s only financial compensation without criminal accountability. Pearson wants to see legislation that would prevent people from using AI to destroy lives.
Lyu said he is not aware of a case where a deepfake’s creator was prosecuted.
“The court may not take this as evidence. This analysis is not going to be conclusive. I think it has to be accompanied by other evidence,” Lyu said.
The Eiswert audio, Lyu said, shows the degree to which artificial intelligence could be used to harm individuals. When deepfakes are used on celebrities or well-known political figures, they are easier to detect because there’s an abundance of video and audio of them, so the public is more likely to spot a fake. “If they are focusing on less prominent people ... the damage they are causing is bigger,” he said.
AI experts: Racist audio of Baltimore County principal's voice is fake - The Baltimore Banner