
Which companies are working on LLMs and ChatGPT alternatives?

2023-03-23

Right now, major tech firms are clamouring to replicate the runaway success of ChatGPT, the generative AI chatbot developed by OpenAI using its GPT-3 large language model. Much like potential game-changers of the past, such as cloud-based Software as a Service (SaaS) platforms or blockchain technology (emphasis on potential), established companies and start-ups alike are going public with LLMs and ChatGPT alternatives in fear of being left behind.

While many of these will succeed – some in the mass market and others with niche applications – many more will likely fail as the market consolidates.

Who, then, are the companies best placed to challenge OpenAI?

Meta has introduced its own Large Language Model, named LLaMA. (Photo by Shutterstock/Koshiro K)

Table of contents

  • Companies with Large Language Model projects
    • Google – LaMDA
    • AI21 – Jurassic-2
    • Anthropic – Claude
    • Baidu – ERNIE 3.0
    • Nvidia – DGX AI
    • DeepMind – Chinchilla
    • Meta – LLaMA

Companies with Large Language Model projects

Google – LaMDA

Of all the LLMs outside of GPT-3, Google’s LaMDA has attracted the most attention from mainstream observers – though not for quite the same reasons.

Months before ChatGPT exploded into national headlines in late 2022, LaMDA was proving controversial after Google engineer Blake Lemoine was suspended for claiming – falsely, as became evident – that it had developed sentience.

In reality, the LaMDA LLM operates similarly to its main competitor, except that it has fewer parameters at 137 billion compared to 175 billion for GPT-3.5, which was used to train ChatGPT.

LaMDA is also the bedrock of Google’s chatbot competitor, named Bard, which the search giant is currently testing for search with select users. Bard had an inauspicious start, however, as it presented a factual error during a launch event.

AI21 – Jurassic-2

Israel-based start-up AI21, while less well-known than its rival OpenAI, is a serious challenger in the market. The company created the Jurassic-1 large language model in 2021 with a similar number of parameters to GPT-3 – 178 billion compared to 175 billion – and customisation capabilities.


March 2023 then saw the release of Jurassic-2, which focuses on optimised performance rather than sheer size. According to AI21, the smallest version of Jurassic-2 outperforms even the largest version of its predecessor. It also includes a grammatical correction API and text segmentation capabilities.


Users of AI21 studio can train their own versions of the LLM with as few as 50-100 training examples, which then become available for exclusive use.

AI21 also deployed Jurassic-1, and now Jurassic-2, to underpin its WordTune Spices chatbot, which distinguishes itself as a ChatGPT alternative through live data retrieval and the citation of sources in its responses. Given the risks of factual error and plagiarism associated with LLM chatbots, this is a significant advantage in an increasingly competitive field.
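The retrieve-then-cite pattern behind such chatbots can be sketched in a few lines. This is a toy illustration only: the retriever here is a naive keyword matcher, and none of the names below correspond to AI21's actual API.

```python
# Toy sketch of a retrieve-then-cite chatbot, assuming a simple list of
# documents with "source" and "text" fields. Real systems use a proper
# search index and an LLM to compose the answer.

def retrieve(query, documents):
    """Naive keyword retriever: rank documents by word overlap with the query."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc["text"].lower().split())), doc)
              for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score > 0]

def answer_with_citations(query, documents):
    """Compose an answer that cites the sources it drew on."""
    hits = retrieve(query, documents)
    if not hits:
        return "No supporting sources found."
    body = " ".join(doc["text"] for doc in hits[:2])
    cites = "; ".join(doc["source"] for doc in hits[:2])
    return f"{body} [Sources: {cites}]"

docs = [
    {"source": "example.com/a", "text": "LLaMA is a large language model from Meta."},
    {"source": "example.com/b", "text": "Claude is a chatbot built by Anthropic."},
]
print(answer_with_citations("Who built Claude?", docs))
```

Grounding each answer in retrieved documents, and surfacing their sources, is what lets a user check the chatbot's claims – the property that sets this design apart from a model answering purely from its training data.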

Anthropic – Claude

Founded by former OpenAI employees, Anthropic is fast making waves as a rival to its quasi-progenitor.

The generative AI company has launched its own large language model, Claude, a ChatGPT alternative built around what it calls “constitutional AI”. In effect, the model is designed to act according to a set of programmed principles (i.e. its ‘constitution’), in contrast to ChatGPT, which is simply prohibited from answering certain controversial or dangerous queries.

Much like Microsoft’s investment in OpenAI, Google has invested $300m into Anthropic for a 10% stake in the company.

Baidu – ERNIE 3.0

Baidu – China’s answer to Google – is looking to combat its long-term struggles in the face of rival Tencent with its heavy investment in AI.

The team at Baidu has expanded its ERNIE 3.0 large language model into a new version called ERNIE 3.0 Titan. While its predecessor had just 10 billion parameters, Titan, built on Baidu’s PaddlePaddle deep learning platform, operates with 260 billion.

Titan’s creators claim that it is the “largest dense pre-trained model so far” and that it outperforms state-of-the-art models on natural language processing (NLP) tasks.

Nvidia – DGX AI

Hardware and software supplier Nvidia is currently core to the operation of ChatGPT, with an estimated 10,000 of the company’s GPUs used to train the chatbot and a predicted 30,000 to be used in future.

This dynamic could be upended, however, as Nvidia CEO Jensen Huang announced in February 2023 that the firm plans to make its DGX AI supercomputer available via the cloud.

Already accessible through Oracle Cloud Infrastructure and Microsoft Azure, the AI supercomputer will have the capacity to allow customers to train their own large language models.

Nvidia has seen a financial boost as companies such as Google and Microsoft look to it for the GPUs necessary for training.

DeepMind – Chinchilla

British AI company and Alphabet subsidiary DeepMind, famous for its AlphaGo program, is investing heavily in large language model research and development. DeepMind has iterated on multiple LLMs, including Gopher, Chinchilla and the RETRO system, which combines an LLM with an external database.

This experimentation is leading the way towards more targeted and energy-efficient types of LLM. Chinchilla has just 70 billion parameters – rivals have double, triple or even more than that – yet it outperforms the larger Gopher at certain tasks. Likewise for the 7.5 billion-parameter RETRO, whose external database allows it to outperform vastly larger models.
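The intuition behind Chinchilla's result can be shown with some back-of-the-envelope arithmetic. The sketch below uses the published rule of thumb from DeepMind's scaling work (roughly 20 training tokens per parameter, and ~6 FLOPs per parameter per token); the exact figures are approximations, not the authors' full analysis.

```python
# Rough sketch of the Chinchilla compute-optimal heuristic: for a fixed
# training budget, a smaller model trained on far more data can beat a
# larger, under-trained one. Figures below are approximations.

def optimal_tokens(n_params):
    """Approximate compute-optimal training tokens: ~20 tokens per parameter."""
    return 20 * n_params

def training_flops(n_params, n_tokens):
    """Standard rough estimate: ~6 FLOPs per parameter per training token."""
    return 6 * n_params * n_tokens

chinchilla = 70e9   # 70 billion parameters
gopher = 280e9      # 280 billion parameters

# Reuse Gopher's approximate training budget (~300B tokens) for a
# Chinchilla-sized model: the smaller model can see 4x more data.
budget = training_flops(gopher, 300e9)
tokens_for_chinchilla = budget / (6 * chinchilla)
print(f"Chinchilla-sized model at Gopher's budget: "
      f"{tokens_for_chinchilla / 1e9:.0f}B tokens")
```

The arithmetic makes the trade-off concrete: at an equal compute budget, quartering the parameter count quadruples the data the model can be trained on, which is why Chinchilla can beat Gopher despite being a quarter of its size.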

Meta – LLaMA

Not content to invest in the metaverse, Meta has also entered the LLM space with its LLaMA model. Mark Zuckerberg’s company does not yet have a publicly available ChatGPT alternative but it is in development.

Unlike many rivals, the 65-billion-parameter LLM has been made open source (available on request, crucially) with the intention of knowledge sharing and crowdsourcing bug fixes.

But just a week after it launched, a torrent of the model’s weights found its way onto the wider internet via a 4chan leak, prompting fears that such unfettered access could be exploited for phishing and other cybercrime.

Read more: This is how GPT-4 will be regulated
