You may have seen photographs that suggest otherwise, but former president Donald Trump wasn’t arrested last week, and the pope didn’t wear a stylish, brilliant white puffer coat. These recent viral hits were the fruits of artificial intelligence systems that process a user’s textual prompt to create images. They demonstrate how these programs have become very good very quickly—and are now convincing enough to fool an unwitting observer.
So how can skeptical viewers spot images that may have been generated by an artificial intelligence system such as DALL-E, Midjourney or Stable Diffusion? Each AI image generator—and each image from any given generator—varies in how convincing it may be and in what telltale signs might give its algorithm away. For instance, AI systems have historically struggled to mimic human hands and have produced mangled appendages with too many digits. As the technology improves, however, systems such as Midjourney V5 seem to have cracked the problem—at least in some examples. Across the board, experts say that the best images from the best generators are difficult, if not impossible, to distinguish from real images.
“It’s pretty amazing, in terms of what AI image generators are able to do,” says S. Shyam Sundar, a researcher at Pennsylvania State University who studies the psychological impacts of media technologies. “There’s been a giant leap in the last year or so in terms of image-generation abilities.”
Some of the factors behind this leap in ability include the ever-increasing number of images available to train such AI systems, as well as advances in data processing infrastructure and interfaces that make the technology accessible to regular Internet users, Sundar notes. The result is that artificially generated images are everywhere and can be “next to impossible to detect,” he says.
One recent experiment highlighted how well AI is able to deceive. Sophie Nightingale, a psychologist at Lancaster University in England who focuses on digital technology, co-authored a study that tested whether online volunteers could distinguish between passportlike headshots created by an AI system called StyleGAN2 and real images. The results were disheartening, even back in late 2021, when the researchers ran the experiment. “On average, people were pretty much at chance performance,” Nightingale says. “Basically, we’re at the point where it’s so realistic that people can’t reliably perceive the difference between those synthetic faces and actual, real faces—faces of actual people who really exist.” Although humans provided some help to the AI (researchers sorted through the images generated by StyleGAN2 to select only the most realistic ones), Nightingale says that someone looking to use such a program for nefarious purposes would likely do the same.
In a second test, the researchers tried to help the test subjects improve their AI-detecting abilities. They marked each answer right or wrong after participants responded, and they also prepared participants in advance by having them read through advice for detecting artificially generated images. That advice highlighted areas where AI algorithms often stumble, creating mismatched earrings, for example, or blurring a person’s teeth together. Nightingale also notes that algorithms often struggle to create anything more sophisticated than a plain background. But even with these additions, participants’ accuracy only increased by about 10 percent, she says—and the AI system that generated the images used in the trial has since been upgraded to a new and improved version.
Ironically, as image-generating technology continues to improve, humans’ best defense against being fooled by an AI system may be yet another AI system: one trained to detect artificial images. Experts say that as AI image generation progresses, algorithms are better equipped than humans to detect some of the tiny, pixel-scale fingerprints of robotic creation.
Creating these AI detective programs works the same way as any other machine learning task, says Yong Jae Lee, a computer scientist at the University of Wisconsin–Madison. “You collect a data set of real images, and you also collect a data set of AI-generated images,” Lee says. “Then you can train a machine-learning model to distinguish the two.”
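The recipe Lee describes can be sketched in miniature. The toy example below is a hedged illustration, not a real detector: the “images” here are synthetic noise arrays whose pixel statistics differ slightly between the two classes, the single hand-crafted feature stands in for what a real model would learn on its own, and a simple logistic-regression fit replaces a deep network.

```python
# Minimal sketch of the detector-training recipe: gather labeled
# real vs. generated images, extract features, fit a binary
# classifier. Everything here is synthetic and for illustration.
import numpy as np

rng = np.random.default_rng(0)

def make_images(n, smooth, size=32):
    """Toy grayscale 'images'; `smooth` blurs neighboring pixels
    together, a stand-in for a generator's subtle statistical
    fingerprint."""
    imgs = rng.normal(size=(n, size, size))
    if smooth:
        # Averaging each pixel with its left neighbor mimics a
        # generator's locally correlated output.
        imgs = (imgs + np.roll(imgs, 1, axis=2)) / 2.0
    return imgs

def features(imgs):
    """One feature per image: variance of horizontal pixel
    differences (a crude measure of high-frequency energy)."""
    diffs = np.diff(imgs, axis=2)
    return diffs.var(axis=(1, 2)).reshape(-1, 1)

# Labeled training data: 1 = "real", 0 = "generated" (toy labels).
real = make_images(200, smooth=False)
fake = make_images(200, smooth=True)
X = np.vstack([features(real), features(fake)])
y = np.concatenate([np.ones(200), np.zeros(200)])

# Logistic regression fit by plain gradient descent.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] * w + b)))
    w -= 0.5 * np.mean((p - y) * X[:, 0])
    b -= 0.5 * np.mean(p - y)

preds = (1.0 / (1.0 + np.exp(-(X[:, 0] * w + b)))) > 0.5
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```

In this contrived setup the two classes separate cleanly on one statistic; real detectors instead learn thousands of such cues directly from pixels, which is also why they tend to overfit to the particular generator they were trained against.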
Still, these systems have significant shortcomings, Lee and other experts say. Most such algorithms are trained on images from a specific AI generator and are unable to identify fakes produced by different algorithms. (Lee says he and a research team are working on a way to avoid that problem by training the detector to instead recognize what makes an image real.) Most detectors also lack the user-friendly interfaces that have tempted so many people to try generative AI systems.
Moreover, AI detectors will always be scrambling to keep up with AI image generators, some of which incorporate similar detection algorithms but use them as a way to learn how to make their fake output less detectable. “The battle between AI systems that generate images and AI systems that detect the AI-generated images is going to be an arms race,” says Wael AbdAlmageed, a research associate professor of computer science at the University of Southern California. “I don’t see any side winning anytime soon.”
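The adversarial dynamic AbdAlmageed describes can be caricatured in a few lines. In this deliberately simplified sketch (all numbers are invented), a generator’s output is reduced to a single summary statistic, and each round it uses the detector’s decision threshold as feedback to move closer to the statistics of real images:

```python
# Toy caricature of the detector/generator "arms race."
# Real images are assumed to have summary statistic 1.0; each
# round the detector places a threshold between the two
# distributions, and the generator uses that feedback to shift
# its output toward the real side.
real_stat = 1.0   # statistic of genuine images (assumed)
gen_stat = 0.2    # generator starts out easy to detect

for round_ in range(10):
    # Detector: split the difference between real and generated.
    threshold = (real_stat + gen_stat) / 2.0
    # Generator: close half the remaining gap, guided by the
    # detector's decision boundary.
    gen_stat += 0.5 * (real_stat - gen_stat)

# After a few rounds the generator's statistic is nearly identical
# to the real one, and no threshold separates them reliably.
print(f"final generator statistic: {gen_stat:.4f} (real = {real_stat})")
```

The point of the caricature is the feedback loop itself: any fixed rule the detector settles on becomes a training signal for the generator, which is why neither side stays ahead for long.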
AbdAlmageed says no approach will ever be able to catch every single artificially produced image—but that doesn’t mean we should give up. He suggests that social media platforms need to begin confronting AI-generated content on their sites because these companies are better poised to implement detection algorithms than individual users are.
And users need to more skeptically evaluate visual information by asking whether it’s false, AI-generated or harmful before sharing. “We as human species sort of grow up thinking that seeing is believing,” AbdAlmageed says. “That’s not true anymore. Seeing is not believing anymore.”