
AI Can Re-create What You See from a Brain Scan

2023-03-27

Functional magnetic resonance imaging, or fMRI, is one of the most advanced tools for understanding how we think. As a person in an fMRI scanner completes various mental tasks, the machine produces mesmerizing and colorful images of their brain in action.

Looking at someone’s brain activity this way can tell neuroscientists which brain areas a person is using but not what that individual is thinking, seeing or feeling. Researchers have been trying to crack that code for decades—and now, using artificial intelligence to crunch the numbers, they’ve been making serious progress. Two scientists in Japan recently combined fMRI data with advanced image-generating AI to translate study participants’ brain activity back into pictures that uncannily resembled the ones they viewed during the scans. The original and re-created images can be seen on the researchers’ website.

“We can use these kinds of techniques to build potential brain-machine interfaces,” says Yu Takagi, a neuroscientist at Osaka University in Japan and one of the study’s authors. Such future interfaces could one day help people who currently cannot communicate, such as individuals who outwardly appear unresponsive but may still be conscious. The study was recently accepted to be presented at the 2023 Conference on Computer Vision and Pattern Recognition.

The study has made waves online since it was posted as a preprint (meaning it has not yet been peer-reviewed or published) in December 2022. Online commentators have even compared the technology to “mind reading.” But that description overstates what this technology is capable of, experts say.

“I don’t think we’re mind reading,” says Shailee Jain, a computational neuroscientist at the University of Texas at Austin, who was not involved in the new study. “I don’t think the technology is anywhere near to actually being useful for patients—or to being used for bad things—at the moment. But we are getting better, day by day.”

The new study is far from the first that has used AI on brain activity to reconstruct images viewed by people. In a 2019 experiment, researchers in Kyoto, Japan, used a type of machine learning called a deep neural network to reconstruct images from fMRI scans. The results looked more like abstract paintings than photographs, but human judges could still accurately match the AI-made images to the original pictures.

Neuroscientists have since continued this work with newer and better AI image generators. In the recent study, the researchers used Stable Diffusion, a so-called diffusion model from London-based start-up Stability AI. Diffusion models—a category that also includes image generators such as DALL-E 2—are “the main character of the AI explosion,” Takagi says. These models learn by adding noise to their training images. Like TV static, the noise distorts the images—but in predictable ways that the model begins to learn. Eventually the model can build images from the “static” alone.
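
The noise-adding step described above can be sketched in a few lines. This is a minimal illustration that assumes a standard DDPM-style formulation with a linear noise schedule; the function names, the toy four-pixel "image" and the schedule values are assumptions made for illustration, not details from the study or from Stable Diffusion's actual configuration. It covers only the forward (noising) half. The trained network that learns to reverse the noise is omitted.

```python
import math
import random

def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal that survives after each noising step.

    Assumes a linear beta schedule (an illustrative choice) and num_steps > 1.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Blend clean pixel values with Gaussian 'static' at one noise level."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]

rng = random.Random(0)
image = [0.0, 0.5, 1.0, 0.5]  # toy stand-in for a real image
alpha_bars = alpha_bar_schedule(num_steps=1000)

slightly_noisy = add_noise(image, alpha_bars[10], rng)   # still mostly image
mostly_static = add_noise(image, alpha_bars[999], rng)   # almost pure noise
print(slightly_noisy)
print(mostly_static)
```

Because the amount of noise at each step is known in advance, a model can be trained to predict and remove it, which is what lets it eventually build an image starting from static alone.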

Released to the public in August 2022, Stable Diffusion has been trained on billions of photographs and their captions. It has learned to recognize patterns in pictures, so it can mix and match visual features on command to generate entirely new images. “You just tell it, right, ‘A dog on a skateboard,’ and then it’ll generate a dog on a skateboard,” says Iris Groen, a neuroscientist at the University of Amsterdam, who was not involved in the new study. The researchers “just took that model, and then they said, ‘Okay, can we now link it up in a smart way to the brain scans?’”

The brain scans used in the new study come from a research database containing the results of an earlier study in which eight participants agreed to regularly lie in an fMRI scanner and view 10,000 images over the course of a year. The result was a huge repository of fMRI data that shows how the vision centers of the human brain (or at least the brains of these eight human participants) respond to seeing each of the images. In the recent study, the researchers used data from four of the original participants.

To generate the reconstructed images, the AI model needs to work with two different types of information: the lower-level visual properties of the image and its higher-level meaning. For example, it’s not just an angular, elongated object against a blue background—it’s an airplane in the sky. The brain also works with these two kinds of information and processes them in different regions. To link the brain scans and the AI together, the researchers used linear models to pair up the parts of each that deal with lower-level visual information. They also did the same with the parts that handle high-level conceptual information.

“By basically mapping those to each other, they were able to generate these images,” Groen says. The AI model could then learn which subtle patterns in a person’s brain activation correspond to which features of the images. Once the model was able to recognize these patterns, the researchers fed it fMRI data that it had never seen before and tasked it with generating the image to go along with it. Finally, the researchers could compare the generated image to the original to see how well the model performed.
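
The fit-then-test procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic stand-ins rather than real fMRI, the names `fit_linear_map`, `predict`, `simulate` and `TRUE_W` are hypothetical, and a lightly regularized linear regression fit by gradient descent stands in for whatever linear models the researchers actually used. In the study, one such mapping would be fit for the lower-level visual features and another for the higher-level conceptual features.

```python
import random

def predict(W, x):
    """Apply the linear map: one voxel pattern in, one feature vector out."""
    return [sum(x[v] * W[v][f] for v in range(len(x))) for f in range(len(W[0]))]

def fit_linear_map(X, Y, ridge=0.01, lr=0.1, epochs=500):
    """Fit W so that X @ W approximates Y (gradient descent with an L2 penalty)."""
    n, n_vox, n_feat = len(X), len(X[0]), len(Y[0])
    W = [[0.0] * n_feat for _ in range(n_vox)]
    for _ in range(epochs):
        # Start from the gradient of the penalty term, then add the data term.
        grad = [[ridge * W[v][f] for f in range(n_feat)] for v in range(n_vox)]
        for x, y in zip(X, Y):
            pred = predict(W, x)
            for f in range(n_feat):
                err = pred[f] - y[f]
                for v in range(n_vox):
                    grad[v][f] += err * x[v] / n
        for v in range(n_vox):
            for f in range(n_feat):
                W[v][f] -= lr * grad[v][f]
    return W

def mean_squared_error(W, X, Y):
    """Average squared difference between predicted and true features."""
    total = 0.0
    for x, y in zip(X, Y):
        total += sum((p - t) ** 2 for p, t in zip(predict(W, x), y))
    return total / (len(X) * len(Y[0]))

# Synthetic data: 5 "voxels" mapped to 2 "image features" (assumed sizes).
rng = random.Random(0)
TRUE_W = [[1.0, -0.5], [0.5, 2.0], [-1.5, 0.0], [0.0, 1.0], [2.0, 0.5]]

def simulate(n):
    X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(n)]
    Y = [[f + rng.gauss(0.0, 0.1) for f in predict(TRUE_W, x)] for x in X]
    return X, Y

X_train, Y_train = simulate(80)
X_test, Y_test = simulate(20)  # "scans" the model never sees during fitting

W = fit_linear_map(X_train, Y_train)
test_mse = mean_squared_error(W, X_test, Y_test)
baseline_mse = mean_squared_error([[0.0, 0.0] for _ in range(5)], X_test, Y_test)
print(f"held-out MSE: {test_mse:.4f} (predict-zero baseline: {baseline_mse:.4f})")
```

Scoring the predictions on held-out scans, rather than on the scans used for fitting, is what indicates whether the mapping has captured a real relationship instead of memorizing its training data. In the actual pipeline the predicted features would then be passed to the image generator, a step this sketch leaves out.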

Many of the image pairs the authors showcase in the study look strikingly similar. “What I find exciting about it is that it works,” says Ambuj Singh, a computer scientist at the University of California, Santa Barbara, who was not involved in the study. Still, that doesn’t mean scientists have figured out exactly how the brain processes the visual world, Singh says. The Stable Diffusion model doesn’t necessarily process images in the same way the brain does, even if it’s capable of generating similar results. The authors hope that comparing these models and the brain can shed light on the inner workings of both complex systems.

As fantastical as this technology may sound, it has plenty of limitations. Each model has to be trained on, and use, the data of just one person. “Everybody’s brain is really different,” says Lynn Le, a computational neuroscientist at Radboud University in the Netherlands, who was not involved in the research. If you wanted to have AI reconstruct images from your brain scans, you would have to train a custom model, and for that, scientists would need troves of high-quality fMRI data from your brain. Unless you consent to lying perfectly still and concentrating on thousands of images inside a clanging, claustrophobic MRI tube, no existing AI model would have enough data to start decoding your brain activity.

Even with those data, AI models are only good at tasks for which they’ve been explicitly trained, Jain explains. A model trained on how you perceive images won’t work for trying to decode what concepts you’re thinking about—though some research teams, including Jain’s, are building other models for that.

It’s still unclear if this technology would work to reconstruct images that participants have only imagined, not viewed with their eyes. That ability would be necessary for many applications of the technology, such as using brain-computer interfaces to help those who cannot speak or gesture to communicate with the world.

“There’s a lot to be gained, neuroscientifically, from building decoding technology,” Jain says. But the potential benefits come with potential ethical quandaries, and addressing them will become still more important as these techniques improve. The technology’s current limitations are “not a good enough excuse to take potential harms of decoding lightly,” she says. “I think the time to think about privacy and negative uses of this technology is now, even though we may not be at the stage where that could happen.”
