翻译技术教育与研究 2024年12月13日 08:01 陕西
Cohere Releases “Aya Expanse” Multilingual AI Models
Cohere发布“Aya Expanse”多语言AI模型
Cohere for AI, the research arm of the language artificial intelligence (AI) company, has introduced two new large language models (LLMs) — Aya Expanse 8B and 32B — as part of its ongoing project aimed at closing language divides in foundational AI datasets and models. The Aya Expanse models provide researchers access to advanced AI capabilities across 23 languages, including Arabic, Chinese, French, and Hindi.
语言人工智能公司Cohere的研究部门Cohere for AI推出了两个新的大型语言模型(LLM)——Aya Expanse 8B和32B,这是其一项持续推进的项目的一部分,该项目旨在消除基础AI数据集和模型中的语言鸿沟。Aya Expanse模型让研究人员能够在23种语言(包括阿拉伯语、中文、法语和印地语)中使用先进的AI能力。
“Building on more than two years of open science research, Aya Expanse offers significant performance advances, setting a new state-of-the-art for multilingual LLMs,” the Cohere website states. “This includes a series of breakthroughs in data arbitrage, preference training for performance and safety, and model merging.”
Cohere网站表示:“在两年多开放科学研究的基础上,Aya Expanse实现了显著的性能提升,为多语言大语言模型树立了新标杆。这其中包括数据套利、面向性能与安全的偏好训练以及模型合并等方面的一系列突破。”
According to a SiliconANGLE article, the two Aya Expanse models were launched with open weights on hosting sites Hugging Face and Kaggle, and they used “several new core research innovations” to achieve high performance, including “synthetic data and human feedback in late-term training.”
据SiliconANGLE的一篇文章报道,这两个Aya Expanse模型以开放权重的形式在托管网站Hugging Face和Kaggle上发布,并借助“若干新的核心研究创新”实现了高性能,其中包括“在训练后期使用的合成数据和人类反馈”。
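Because the weights are open on Hugging Face, the models can be loaded with standard tooling. Below is a minimal sketch using the `transformers` library; the repo name `CohereForAI/aya-expanse-8b` is the launch-time identifier (an assumption — check the Hub for the current name), and actually running `generate` requires downloading the ~8B-parameter weights and substantial GPU memory.

```python
# Hedged sketch: single-turn chat with an open-weight Aya Expanse model
# via Hugging Face transformers. Repo name below is the launch-time ID
# (assumption; verify on the Hub before use).
MODEL_ID = "CohereForAI/aya-expanse-8b"

def generate(prompt: str, max_new_tokens: int = 100) -> str:
    """Run one chat turn through the model and return the reply text."""
    # Imported inside the function so the sketch can be read (and the
    # constants inspected) without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # The model ships a chat template; apply it to a single user message.
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)

    # Decode only the tokens generated after the prompt.
    new_tokens = output_ids[0][input_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The same pattern applies to the 32B variant by swapping the repo name; multilingual prompts (e.g. Arabic, Chinese, French, Hindi) go through the identical chat-template path.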
In a blog post, Cohere claims that Aya Expanse 32B outperforms models like Google’s Gemma 2 27B and Meta’s Llama 3.1 70B. For lower-parameter options, Aya Expanse 8B also demonstrated advantages over other similar-sized models like Gemma 2 9B and Llama 3.1 8B. “The improvements in Aya Expanse are the result of a sustained focus on expanding how AI serves languages around the world by rethinking the core building blocks of machine learning breakthroughs,” the blog post states.
在一篇博客文章中,Cohere声称Aya Expanse 32B的性能优于Google的Gemma 2 27B和Meta的Llama 3.1 70B等模型。在较低参数规模上,Aya Expanse 8B也展现出相对于Gemma 2 9B和Llama 3.1 8B等同等规模模型的优势。博客文章指出:“Aya Expanse的改进,源于我们通过重新思考机器学习突破的核心构建模块,持续专注于拓展AI服务世界各地语言的方式。”
According to a VentureBeat article by Emilia David, the Aya initiative attempts to solve the problem of research being done on LLMs that don’t perform well in languages other than English. “Many LLMs eventually become available in other languages, especially for widely spoken languages, but there is difficulty in finding data to train models with the different languages,” David writes. “It can also be difficult to accurately benchmark the performance of models in different languages because of the quality of translations.”
根据VentureBeat记者Emilia David的文章,Aya计划试图解决这样一个问题:现有大语言模型研究所针对的模型在英语以外的语言中表现不佳。David写道:“许多大语言模型最终会支持其他语言,尤其是使用广泛的语言,但很难找到用于训练不同语言模型的数据。”“由于翻译质量的原因,也很难准确地对模型在不同语言中的性能进行基准测试。”
Aya, derived from the Twi language term for “fern”, has grown into one of the world’s largest open-source multilingual projects, featuring over 513 million data points curated across 101 languages and 250 language ambassadors worldwide. This collaborative approach allows Aya’s datasets to expand research opportunities in regions where non-English AI resources remain limited.
Aya一词源自契维语(Twi)中意为“蕨类植物”的词,如今已发展成为世界上最大的开源多语言项目之一,拥有在101种语言中精心整理的超过5.13亿个数据点,以及遍布全球的250名语言大使。这种协作方式使Aya的数据集能够在非英语AI资源仍然有限的地区拓展研究机会。