Soon after Alan Turing initiated the study of computer science in 1936, he began wondering whether humanity could one day build machines with intelligence comparable to that of humans. Artificial intelligence, the modern field concerned with this question, has come a long way since then. But truly intelligent machines that can independently accomplish many different tasks have yet to be invented. And though science fiction has long imagined AI one day taking malevolent forms such as amoral androids or murderous Terminators, today’s AI researchers are often more worried about the everyday AI algorithms already enmeshed in our lives—and the problems that have come to be associated with them.
Even though today’s AI is only capable of automating certain specific tasks, it is already raising significant concerns. In the past decade, engineers, scholars, whistleblowers and journalists have repeatedly documented cases in which AI systems, composed of software and algorithms, have caused or contributed to serious harms to humans. Algorithms used in the criminal justice system can unfairly recommend denying parole. Social media feeds can steer toxic content toward vulnerable teenagers. AI-guided military drones can kill without any moral reasoning. Additionally, an AI algorithm tends to be more like an inscrutable black box than a clockwork mechanism. Researchers often cannot understand how these algorithms, which are based on opaque equations that involve billions of calculations, achieve their outcomes.
Problems with AI have not gone unnoticed, and academic researchers are trying to make these systems safer and more ethical. Companies that build AI-centered products are working to eliminate harms, although they tend to offer little transparency on their efforts. “They have not been very forthcoming,” says Jonathan Stray, an AI researcher at the University of California, Berkeley. AI’s known dangers, as well as its potential future risks, have become broad drivers of new AI research. Even scientists who focus on more abstract problems such as the efficiency of AI algorithms can no longer ignore their field’s societal implications. “The more that AI has become powerful, the more that people demand that it has to be safe and robust,” says Pascale Fung, an AI researcher at the Hong Kong University of Science and Technology. “For the most part, for the past three decades that I was in AI, people didn’t really care.”
Concerns have grown as AI has become widely used. For example, in the mid-2010s, some Web search and social media companies started inserting AI algorithms into their products. They found they could create algorithms to predict which users were more likely to click on which ads and thereby increase their profits. Advances in computing had made all this possible through dramatic improvements in “training” these algorithms—making them learn from examples to achieve high performance. But as AI crept steadily into search engines and other applications, observers began to notice problems and raise questions. In 2016 investigative journalists reported that certain algorithms used in parole assessments appeared to be racially biased.
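In rough terms, the click-prediction “training” described above boils down to fitting a statistical model to past examples. The sketch below, with invented features and synthetic data, shows the general shape of the idea; it is an illustration, not any company’s actual system.

```python
# Minimal sketch of training a click-prediction model from examples.
# The features, data and model choice here are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical training data: each row describes one (user, ad) pair with a
# few numeric features (say, the user's past click rate or the ad's category).
X_train = rng.random((1000, 3))
# Label for each example: 1 if the user clicked the ad, 0 otherwise (synthetic).
y_train = (X_train @ np.array([1.5, -0.5, 2.0])
           + rng.normal(0, 0.5, 1000) > 1.5).astype(int)

# "Training" adjusts the model's parameters so its predictions match the examples.
model = LogisticRegression().fit(X_train, y_train)

# A deployed system would then rank candidate ads for a user by predicted
# click probability and show the highest-scoring ones.
new_pairs = rng.random((5, 3))
print(model.predict_proba(new_pairs)[:, 1])  # estimated click probabilities
```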
That report’s conclusions have been challenged, but designing AI that is fair and unbiased is now considered a central problem by AI researchers. Concerns arise whenever AI is deployed to make predictions about people from different demographics. Fairness has now become even more of a focus as AI is embedded in ever more decision-making processes, such as screening resumes for a job or evaluating tenant applications for an apartment.
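The notion of fairness at issue here can be made concrete. One common diagnostic researchers use is to compare a model’s error rates across demographic groups; the toy example below, with invented labels and predictions, computes how often a risk model wrongly flags people in each group.

```python
# Toy fairness check with invented numbers: compare how often a risk model
# wrongly flags people ("false positives") in two demographic groups.
import numpy as np

# y_true: 1 = the person actually reoffended, 0 = did not (synthetic labels).
# y_pred: 1 = the model flagged the person as high risk, 0 = low risk.
# group:  a demographic group label for each person.
y_true = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1])
group = np.array(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])

def false_positive_rate(y_true, y_pred):
    negatives = y_true == 0
    return float((y_pred[negatives] == 1).mean())

for g in ("A", "B"):
    mask = group == g
    print(g, round(false_positive_rate(y_true[mask], y_pred[mask]), 2))
# A large gap between the groups means the model errs against one group more
# often than the other, the kind of disparity at the heart of the 2016 report.
```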
In the past few years, the use of AI in social media apps has become another concern. Many of these apps use AI algorithms called recommendation engines, which work in a similar way to ad-serving algorithms, to decide what content to show to users. Hundreds of families are currently suing social media companies over allegations that algorithmically driven apps are directing toxic content to children and causing mental health problems. Seattle Public Schools recently filed a lawsuit alleging that social media products are addictive and exploitative. But untangling an algorithm’s true impact is no easy matter. Social media platforms release little of the user-activity data that independent researchers need to make such assessments. “One of the complicated things about all technologies is that there’s always costs and benefits,” says Stray, whose research focuses on recommender systems. “We’re now in a situation where it’s hard to know what the actual bad effects are.”
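At their core, the recommendation engines described above rank candidate posts by a predicted-engagement score, much as ad systems rank ads by predicted clicks. The sketch below, with invented posts and scores, shows that basic ranking step and why engagement-driven ranking can surface provocative content.

```python
# Minimal sketch of a recommendation engine's ranking step: score each
# candidate post by predicted engagement and show the top items first.
# The posts and scores are invented for illustration.
from dataclasses import dataclass

@dataclass
class Post:
    post_id: str
    predicted_engagement: float  # e.g., estimated chance of a click or reply

def rank_feed(candidates: list[Post], k: int = 3) -> list[Post]:
    """Return the k posts with the highest predicted engagement."""
    return sorted(candidates, key=lambda p: p.predicted_engagement, reverse=True)[:k]

candidates = [
    Post("cooking_video", 0.12),
    Post("friend_update", 0.08),
    Post("outrage_bait", 0.31),  # provocative content often scores high on engagement
    Post("news_article", 0.15),
]

for post in rank_feed(candidates):
    print(post.post_id, post.predicted_engagement)
# Because the objective is engagement, content that provokes strong reactions
# can rise to the top even when it is harmful, which is the dynamic alleged in
# the lawsuits described above.
```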
The nature of the problems with AI is also changing. The past two years have seen the release of multiple “generative AI” products that can produce text and images of remarkable quality. A growing number of AI researchers now believe that powerful future AI systems could build on these achievements and one day pose global, catastrophic dangers that could make current problems pale in comparison.
What form might such future threats take? In a paper posted on the preprint repository arXiv.org in October, researchers at DeepMind (a subsidiary of Google’s parent company Alphabet) describe one catastrophic scenario. They imagine engineers developing a code-generating AI based on existing scientific principles and tasked with getting human coders to adopt its submissions to their coding projects. The idea is that as the AI makes more and more submissions, and some are rejected, human feedback will help it learn to code better. But the researchers suggest that this AI, with its sole directive of getting its code adopted, might develop a tragically unsound strategy, such as achieving world domination and forcing its code to be adopted—at the cost of upending human civilization.
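The crux of the scenario is how narrowly the system’s objective is specified. The sketch below, with invented function names, shows roughly what such an objective could look like in code: the agent is rewarded only for getting its submissions adopted, and nothing in the objective itself says how.

```python
# Illustration, with invented names, of a narrowly specified training objective
# like the one in the scenario above: reward adoption of code submissions,
# and nothing else.
def submission_reward(submission_adopted: bool) -> float:
    """Proxy reward: 1.0 if human maintainers adopt the submission, else 0.0."""
    return 1.0 if submission_adopted else 0.0

def training_objective(adoption_outcomes: list[bool]) -> float:
    """The quantity the agent is trained to maximize: total adoptions."""
    return sum(submission_reward(outcome) for outcome in adoption_outcomes)

print(training_objective([True, False, True]))  # -> 2.0

# The worry is that a sufficiently capable optimizer of this number is not
# constrained to earn adoptions by writing better code; any strategy that
# increases adoptions, however undesirable, scores just as well.
```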
Some scientists argue that research on existing problems, which are already concrete and numerous, should be prioritized over work involving hypothetical future disasters. “I think we have much worse problems going on today,” says Cynthia Rudin, a computer scientist and AI researcher at Duke University. Strengthening that case is the fact that AI has yet to directly cause any large-scale catastrophes—although there have been a few contested instances in which the technology proved dangerous well short of futuristic capability levels. For example, the nonprofit human rights organization Amnesty International alleged in a report published last September that algorithms developed by Facebook’s parent company Meta “substantially contributed to adverse human rights impacts” on the Rohingya people, a minority Muslim group, in Myanmar by amplifying content that incited violence. Meta responded to Scientific American’s request for comment by pointing to a previous statement to Time magazine from Meta’s Asia-Pacific director of public policy, Rafael Frankel, who acknowledged that Myanmar’s military committed crimes against the Rohingya and stated that Meta is currently participating in intergovernmental investigative efforts led by the United Nations and other organizations.
Other researchers say preventing a powerful future AI system from causing a global catastrophe is already a major concern. “For me, that’s the primary problem we need to solve,” says Jan Leike, an AI researcher at the company OpenAI. Although these hazards are so far entirely conjectural, they are undoubtedly driving a growing community of researchers to study various harm-reduction tactics.
In one approach called value alignment, pioneered by AI scientist Stuart Russell at the University of California, Berkeley, researchers seek ways to train an AI system to learn human values and act in accordance with them. One of the advantages of this approach is that it could be developed now and applied to future systems before they present catastrophic hazards. Critics say value alignment focuses too narrowly on human values when there are many other requirements for making AI safe. For example, just as with humans, a foundation of verified, factual knowledge is essential for AI systems to make good decisions. “The issue is not that AI’s got the wrong values,” says Oren Etzioni, a researcher at the Allen Institute for AI. “The truth is that our actual choices are functions of both our values and our knowledge.” With these criticisms in mind, other researchers are working to develop a more general theory of AI alignment that aims to ensure the safety of future systems without focusing as narrowly on human values.
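One concrete technique often associated with alignment research is to learn a model of human preferences from comparisons between pairs of AI outputs and then use that model as a training signal. The sketch below is a simplified illustration with synthetic data; real systems compare generated text rather than small feature vectors.

```python
# Simplified sketch of learning a "reward model" of human preferences from
# pairwise comparisons. All data here are synthetic; real alignment work
# compares model-generated text judged by people.
import numpy as np

rng = np.random.default_rng(0)

def train_reward_model(pairs, labels, dim, lr=0.1, steps=200):
    """Bradley-Terry-style fit: P(a preferred to b) = sigmoid(w @ (a - b))."""
    w = np.zeros(dim)
    for _ in range(steps):
        for (a, b), label in zip(pairs, labels):
            diff = a - b
            p = 1.0 / (1.0 + np.exp(-w @ diff))
            w += lr * (label - p) * diff  # gradient ascent on the log-likelihood
    return w

# Synthetic "human" judgments generated from a hidden preference vector:
# label 1 means output a was preferred to output b.
true_w = np.array([2.0, -1.0, 0.5])
pairs = [(rng.random(3), rng.random(3)) for _ in range(200)]
labels = [1.0 if true_w @ a > true_w @ b else 0.0 for a, b in pairs]

w = train_reward_model(pairs, labels, dim=3)
print(np.round(w, 2))  # learned weights roughly track the hidden preferences
# The learned model could then score new outputs, rewarding ones people would
# tend to prefer, which is the spirit of training AI to follow human values.
```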
Some scientists are taking approaches to AI alignment that they see as more practical and connected with the present. Consider recent advances in text-generating technology: the leading examples, such as DeepMind’s Chinchilla, Google Research’s PaLM, Meta AI’s OPT and OpenAI’s ChatGPT, can all produce content that is racially biased, illicit or deceptive—a challenge that each of these companies acknowledges. Some of these companies, including OpenAI and DeepMind, consider such problems to be ones of inadequate alignment. They are now working to improve alignment in text-generating AI and hope this will offer insights into aligning future systems.
Researchers acknowledge that a general theory of AI alignment remains absent. “We don’t really have an answer for how we align systems that are much smarter than humans,” Leike says. But whether the worst problems of AI are in the past, present or future, at least the biggest roadblock to solving them is no longer a lack of trying.