
New Exascale Supercomputer Can Do a Quintillion Calculations a Second

2023-02-14

“Exascale” sounds like a science-fiction term, but it has a simple and very nonfictional definition: while a human brain can perform about one simple mathematical operation per second, an exascale computer can do at least one quintillion calculations in the time it takes to say, “One Mississippi.”
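That comparison becomes concrete with a little arithmetic: one quintillion is 10^18. The short Python sketch below is an illustration rather than anything from the article; it works out how long a person doing one operation per second would need to match a single second of exascale work.

```python
# Back-of-the-envelope arithmetic behind the "exascale" label:
# an exascale machine performs at least 1e18 operations per second.
EXA = 10**18                      # one quintillion operations
human_rate = 1.0                  # roughly one simple operation per second by hand
seconds_per_year = 365.25 * 24 * 3600

years_needed = EXA / human_rate / seconds_per_year
print(f"A person working nonstop would need about {years_needed:.1e} years")
# ~3.2e+10 years, a couple of times the current age of the universe,
# to match what the machine does in one "Mississippi."
```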

In 2022 the world’s first declared exascale computer, Frontier, came online at Oak Ridge National Laboratory—and it’s 2.5 times faster than the second-fastest-ranked computer in the world. It will soon have better competition (or peers), though, from incoming exascale machines such as El Capitan, housed at Lawrence Livermore National Laboratory, and Aurora, which will reside at Argonne National Laboratory.

It’s no coincidence that all of these machines find themselves at facilities whose names end with the words “national laboratory.” The new computers are projects of the Department of Energy and its National Nuclear Security Administration (NNSA). The DOE oversees these labs and a network of others across the country. NNSA is tasked with keeping watch over the nuclear weapons stockpile, and some of exascale computing’s raison d’être is to run calculations that help maintain that arsenal. But the supercomputers also exist to solve intractable problems in pure science.

When scientists are finished commissioning Frontier, which will be dedicated to such fundamental research, they hope to illuminate core truths in various fields—such as learning about how energy is produced, how elements are made and how the dark parts of the universe spur its evolution—all through almost-true-to-life simulations in ways that wouldn’t have been possible even with the nothing-to-sniff-at supercomputers of a few years ago.

“In principle, the community could have developed and deployed an exascale supercomputer much sooner, but it would not have been usable, useful and affordable by our standards,” says Douglas Kothe, associate laboratory director of computing and computational sciences at Oak Ridge. Obstacles such as huge-scale parallel processing, enormous energy consumption, reliability, memory and storage—along with a lack of software ready to run on such supercomputers—stood in the way of those standards. Years of focused work with the high-performance computing industry lowered those barriers to finally satisfy scientists.

Frontier can process seven times faster and hold four times more information in memory than its predecessors. It is made up of nearly 10,000 CPUs, or central processing units—which perform instructions for the computer and are generally made of integrated circuits—and almost 38,000 GPUs, or graphics processing units. GPUs were created to quickly and smoothly display visual content in gaming. But they have been reappropriated for scientific computing, in part because they’re good at processing information in parallel.

Inside Frontier, the two kinds of processors are linked. The GPUs do repetitive algebraic math in parallel. “That frees the CPUs to direct tasks faster and more efficiently,” Kothe says. “You could say it’s a match made in supercomputing heaven.” By breaking scientific problems into a billion or more tiny pieces, Frontier allows its processors to each eat their own small bite of the problem. Then, Kothe says, “it reassembles the results into the final answer. You could compare each CPU to a crew chief in a factory and the GPUs to workers on the front line.”
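A toy sketch can illustrate that divide-and-reassemble pattern. It is only an analogy, not Frontier's software: ordinary CPU worker processes stand in for the GPUs, and the "problem" is nothing more than summing squares in chunks.

```python
from multiprocessing import Pool

import numpy as np

# Toy analogy for the crew-chief/worker split: the "chief" (this script)
# slices one big job into pieces, a pool of "workers" (stand-ins for GPUs)
# each chews through its own bite in parallel, and the chief then
# reassembles the partial results into the final answer.

def worker(chunk: np.ndarray) -> float:
    # repetitive algebraic math on one small piece of the problem
    return float(np.sum(chunk * chunk))

if __name__ == "__main__":
    data = np.arange(1_000_000, dtype=np.float64)
    pieces = np.array_split(data, 8)            # break the problem into pieces
    with Pool(processes=8) as pool:
        partials = pool.map(worker, pieces)     # workers run in parallel
    total = sum(partials)                       # reassemble the final answer
    print(total)
```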

The 9,472 different nodes in the supercomputer—each essentially its own not-so-super computer—are also all connected in such a way that they can pass information quickly from one place to another. Importantly, though, Frontier doesn’t just run faster than machines of yore: it also has more memory and so can run bigger simulations and hold tons of information in the same place it’s processing those data. That’s like keeping all the acrylics with you while you’re trying to do a paint-by-numbers project rather than having to go retrieve each color as needed from the other side of the table.
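A quick bit of bookkeeping with the figures quoted above shows roughly how those parts are spread across the machine (round numbers, for illustration only):

```python
# Approximate figures from the text, just to see the per-node layout they imply.
nodes = 9_472        # nodes in Frontier
cpus = 10_000        # "nearly 10,000 CPUs"
gpus = 38_000        # "almost 38,000 GPUs"

print(round(cpus / nodes))   # -> 1: roughly one CPU per node
print(round(gpus / nodes))   # -> 4: roughly four GPUs per node
```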

With that kind of power, Frontier—and the beasts that will follow—can teach humans things about the world that might have remained opaque before. In meteorology, it could make hurricane forecasts less fuzzy and frustrating. In chemistry, it could experiment with different molecular configurations to see which might make great superconductors or pharmaceutical compounds. And in medicine, it has already analyzed all of the genetic mutations of SARS-CoV-2, the virus that causes COVID—cutting the time that calculation takes from a week to a day—to understand how those tweaks affect the virus’s contagiousness. That saved time allows scientists to perform ultrafast iterations, altering their ideas and conducting new digital experiments in quick succession.

With this level of computing power, scientists don’t have to make the same approximations they did before, Kothe says. With older computers, he would often have to say, “I’m going to assume this term is inconsequential, that term is inconsequential. Maybe I don’t need that equation.” In physics terms, that’s called making a “spherical cow”: taking a complex phenomenon, like a bovine, and turning it into something highly simplified, like a ball. With exascale computers, scientists hope to avoid cutting those kinds of corners and simulate a cow as, well, essentially a cow: something that more closely approaches a representation of reality.
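Here is a miniature example of that kind of corner-cutting, using a falling object with a made-up linear air-drag coefficient (a toy model, not one from the article). Keeping the drag term and declaring it inconsequential give very different answers.

```python
# Integrate dv/dt = g - k*v (full model) and dv/dt = g (drag term dropped)
# for ten seconds of free fall, using a simple Euler step.
g, k = 9.81, 0.25            # gravity (m/s^2) and a toy drag coefficient (1/s)
dt, steps = 0.01, 1000       # 10 seconds total

v_full = 0.0
for _ in range(steps):
    v_full += (g - k * v_full) * dt   # keep the "inconsequential" drag term

v_cow = g * dt * steps                # spherical-cow version: no drag at all

print(round(v_full, 1))   # ~36 m/s, approaching the terminal speed g/k = 39.2 m/s
print(round(v_cow, 1))    # 98.1 m/s: the simplified model badly overshoots
```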

Frontier’s upgraded hardware is the main factor behind that improvement. But hardware alone doesn’t do scientists that much good if they don’t have software that can harness the machine’s new oomph. That’s why an initiative called the Exascale Computing Project (ECP)—which brings together the Department of Energy and its National Nuclear Security Administration, along with industry partners—has sponsored 24 initial science-coding projects alongside the supercomputers’ development.

Those software initiatives can’t just take old code—meant to simulate, say, the emergence of sudden severe weather—plop it onto Frontier and say, “It made an okay forecast at lightning speed instead of almost lightning speed!” To get a more accurate result, they need an amped-up and optimized set of codes. “We're not going to cheat here and get the same not-so-great answers faster,” says Kothe, who is also ECP’s director.

But getting greater answers isn’t easy, says Salman Habib, who’s in charge of an early science project called ExaSky. “Supercomputers are essentially brute-force tools,” he says. “So you have to use them in intelligent ways. And that's where the fun comes in, where you scratch your head and say, ‘How can I actually use this possibly blunt instrument to do what I really want to do?’” Habib, director of the computational science division at Argonne, wants to probe the mysterious makeup of the universe and the formation and evolution of its structures. The simulations model dark matter and dark energy’s effects and include initial conditions that investigate how the universe expanded right after the big bang.
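The production codes behind those simulations are enormously more sophisticated, but the bare idea (particles evolved forward under gravity from chosen initial conditions) fits in a short sketch. Everything below is a toy stand-in with invented units and particle counts, not ExaSky's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                               # toy particle count; real runs use vastly more
G, soft, dt, steps = 1.0, 0.05, 1e-3, 500

pos = rng.uniform(-1.0, 1.0, (n, 2))  # initial conditions: random positions
vel = np.zeros((n, 2))                # start at rest
mass = np.full(n, 1.0 / n)

def accel(p):
    # pairwise softened gravitational acceleration on every particle
    d = p[None, :, :] - p[:, None, :]            # displacement vectors
    inv_r3 = ((d ** 2).sum(-1) + soft ** 2) ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                # no self-interaction
    return G * (d * inv_r3[:, :, None] * mass[None, :, None]).sum(axis=1)

# leapfrog (kick-drift-kick) time stepping
a = accel(pos)
for _ in range(steps):
    vel += 0.5 * dt * a
    pos += dt * vel
    a = accel(pos)
    vel += 0.5 * dt * a

print(pos[:5])   # positions after a short stretch of evolution
```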

Large-scale astronomical surveys—for instance, the Dark Energy Spectroscopic Instrument in Arizona—have helped illuminate those shady corners of the cosmos, showing how galaxies formed and shaped and spread themselves as the universe expands. But data from these telescopes can’t, on its own, explain the why of what they see.

Theory and modeling approaches like ExaSky might be able to do so, though. If a theorist suspects that dark energy exhibits a certain behavior or that our conception of gravity is off, they can tweak the simulation to include those concepts. It will then spit out a digital cosmos, and astronomers can see the ways it matches, or doesn’t match, what their telescopes’ sensors pick up. “The role of a computer is to be a virtual universe for theorists and modelers,” Habib says.

ExaSky extends algorithms and software written for lesser supercomputers, but simulations haven’t yet led to giant breakthroughs about the nature of the universe’s dark components. The work scientists have done so far offers “an interesting combination of being able to model it but not really understand it,” Habib says. With exascale computers, though, astronomers like Habib can simulate a larger volume of space, using more cowlike physics, in higher definition. Understanding, perhaps, is on the way.

Another early Frontier project called ExaStar, led by Daniel Kasen of Lawrence Berkeley National Laboratory, will investigate a different cosmic mystery. This endeavor will simulate supernovae—the end-of-life explosions of massive stars that, in their extremity, produce heavy elements. Scientists have a rough idea of how supernovae play out, but no one actually knows the whole-cow version of these explosions or how heavy elements get made within them.

In the past, most supernova simulations simplified the situation by assuming stars were spherically symmetric or by using simplified physics. With exascale computers, scientists can make more detailed three-dimensional models. And rather than just running the code for one explosion, they can do whole suites, including different kinds of stars and different physics ideas, exploring which parameters produce what astronomers actually see in the sky.
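As a sketch of what running "whole suites" means in practice, the snippet below sweeps a small grid of made-up inputs and records one deliberately toy observable, a characteristic ejecta speed from E = (1/2)Mv^2; the real ExaStar runs vary far richer physics than this.

```python
import itertools
import math

M_SUN = 1.989e30                          # kg
energies = [0.5e44, 1.0e44, 2.0e44]       # toy explosion energies in joules
ejecta_masses = [2.0, 5.0, 10.0]          # toy ejecta masses in solar masses

# One "run" per parameter combination, each reporting a simple observable.
for E, M in itertools.product(energies, ejecta_masses):
    v = math.sqrt(2.0 * E / (M * M_SUN))  # characteristic ejecta speed, m/s
    print(f"E = {E:.1e} J, M = {M:>4} Msun  ->  v = {v / 1e3:.0f} km/s")
```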

“Supernovae and stellar explosions are fascinating events in their own right,” Kasen says. “But they’re also key players in the story of the universe.” They provided the elements that make up Earth and us—and the telescopes that look beyond us. Although their extreme reactions can’t quite be replicated in physical experiments, digital trials are both possible and less destructive.

A third early project is examining phenomena that are closer to home: nuclear reactors and their reactions. The ExaSMR project will use exascale computing to figure out what’s going on beneath the shielding of “small modular reactors,” a type of facility that nuclear-power proponents hope will become more common. In earlier days supercomputers could only model one component of a reactor at a time. Later they could model the whole machine but only at one point in time—getting, say, an accurate picture of when it first turns on. “Now we're modeling the evolution of a reactor from the time that it starts up over the course of an entire fuel cycle,” says Steven Hamilton of Oak Ridge, who’s co-leading the effort.

Hamilton’s team will investigate how neutrons move around and affect the chain reaction of nuclear fission, as well as how heat from fission moves through the system. Figuring out how the heat flows with both spatial and chronological detail wouldn’t have been possible at all before because the computer didn’t have enough memory to do the math for the whole simulation at once. “The next focus for us is looking at a wider class of reactor designs” to improve their efficiency and safety, Hamilton says.
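A minimal sketch of resolving heat flow in both space and time, assuming nothing more than a one-dimensional rod with fixed-temperature ends (the geometry, material constant and temperatures are invented for illustration; the real ExaSMR models couple thermal flow to neutron transport in full three-dimensional reactor geometry):

```python
import numpy as np

nx = 50                                   # grid points along a toy 1-D rod
alpha, dx, dt, steps = 1e-4, 0.01, 0.1, 2000
assert alpha * dt / dx ** 2 <= 0.5        # stability limit for this explicit scheme

T = np.full(nx, 300.0)                    # rod starts at a uniform 300 K
T[0], T[-1] = 600.0, 300.0                # hot end and cool end stay fixed

for _ in range(steps):                    # march the temperature field through time
    T[1:-1] += alpha * dt / dx ** 2 * (T[2:] - 2.0 * T[1:-1] + T[:-2])

print(T.round(1))                         # temperature at every point along the rod
```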

Of course, nuclear power has always been the flip side of that other use of nuclear reactions: weapons. At Lawrence Livermore, Teresa Bailey leads a team of 150 people, many of whom are busy preparing the codes that simulate weapons to run on El Capitan. Bailey is associate program director for computational physics at Lawrence Livermore, and she oversees parts of the Advanced Simulation and Computing project—the national security side of things. Teams from the NNSA labs—supported by ECP and the Advanced Technology Development and Mitigation program, a more weapons-oriented effort—worked on R&D that helps with modernizing the weapons codes.

Ask any scientist whether computers like Frontier, El Capitan and Aurora are finally good enough, and you’ll never get a yes. Researchers would always take more and better analytical power. And there’s extrinsic pressure to keep pushing computing forward: not just for bragging rights, although those are cool, but because better simulations could lead to new drug discoveries, new advanced materials or new Nobel Prizes that keep the country on top.

All those factors have scientists already talking about the “post-exascale” future—what comes after they can do one quintillion math problems in one second. That future might involve quantum computers or augmenting exascale systems with more artificial intelligence. Or maybe it’s something else entirely. Maybe, in fact, someone should run a simulation to predict the most likely outcome or the most efficient path forward.
