Low-Resource Language Users Overlooked? New Technique Helps Close the Gap
Translating Input in Prompts Improves LLM Performance for Low-Resource Languages
01
In a September 6, 2024, paper, Tejas Deshpande and Nidhi Kowtal from the Pune Institute of Computer Technology, along with Raviraj Joshi from the Indian Institute of Technology Madras, introduced Chain-of-Translation Prompting (CoTR), a new prompting technique designed to improve the performance of large language models (LLMs) for low-resource languages.
The researchers explained that multilingual LLMs struggle to process input sentences (i.e., the actual text the LLM has to work on) in low-resource languages due to the limited data available for training or fine-tuning. As a result, “speakers of low-resource languages are frequently excluded from the benefits of advanced NLP technologies,” said the researchers, emphasizing the need for new techniques to close this gap.
To address this challenge, they explored new prompting strategies that leverage the multilingual translation abilities of LLMs and introduced CoTR.
02
CoTR restructures the traditional prompt by first translating the input sentence from a low-resource language into a higher-resource language, such as English, where LLMs typically perform better. The LLM then executes the NLP task — such as sentiment analysis or text generation — on the translated text, followed by an optional retranslation of the output back into the original language. “All these steps are specified in a single prompt,” the researchers emphasized.
CoTR can be applied to various tasks, including sentiment analysis, hate speech classification, subject classification, and text generation.
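As a rough illustration of that single-prompt structure, the sketch below composes one CoTR prompt in Python. The `build_cotr_prompt` helper, the prompt wording, and the Marathi example sentence are assumptions for illustration only, not taken from the paper.

```python
def build_cotr_prompt(input_sentence: str, task_instruction: str,
                      source_lang: str = "Marathi",
                      pivot_lang: str = "English",
                      translate_back: bool = False) -> str:
    """Compose a single Chain-of-Translation (CoTR) prompt:
    translate, perform the task, and optionally translate the
    answer back -- all as numbered steps inside one prompt."""
    steps = [
        f"1. Translate the following {source_lang} sentence into {pivot_lang}.",
        f"2. {task_instruction} on the translated {pivot_lang} text.",
    ]
    if translate_back:
        steps.append(f"3. Translate your answer back into {source_lang}.")
    return "\n".join(steps) + f"\n\nSentence: {input_sentence}"

prompt = build_cotr_prompt(
    "हा चित्रपट खूप छान होता!",  # Marathi: "This movie was very good!"
    "Classify the sentiment as positive, negative, or neutral",
)
print(prompt)
```

The same helper would cover the other tasks mentioned above by swapping in a different `task_instruction`, with `translate_back=True` for generation tasks whose output should be in the original language.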
The researchers tested CoTR on Marathi, an Indic language with a significant speaker base but insufficient digital and linguistic resources, making it a challenge for NLP models to handle.
To validate CoTR’s effectiveness, they compared it against standard prompting methods across various tasks — including sentiment analysis, hate speech detection, news categorization, and news headline generation — using various models such as GPT-4o, GPT-4o Mini, Llama 3.1 405B, and Gemma-9B.
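A head-to-head comparison of this kind could be organized roughly as in the sketch below. The `evaluate` loop, both prompt templates, the toy model, and the single Marathi example are illustrative assumptions; `toy_model` merely stands in for a call to one of the real LLM APIs.

```python
def evaluate(prompt_builder, query_model, dataset):
    """Return accuracy of a prompting strategy on (sentence, gold_label) pairs."""
    hits = sum(
        query_model(prompt_builder(sentence)).strip().lower() == gold
        for sentence, gold in dataset
    )
    return hits / len(dataset)

def standard_prompt(s):
    # Standard prompting: ask the task directly on the Marathi text.
    return ("Classify the sentiment of this Marathi sentence as "
            f"positive, negative, or neutral: {s}")

def cotr_prompt(s):
    # CoTR prompting: translate first, then classify -- all in one prompt.
    return ("1. Translate the following Marathi sentence into English.\n"
            "2. Classify the sentiment of the translation as positive, "
            f"negative, or neutral.\n\nSentence: {s}")

def toy_model(prompt):
    # Stand-in for a real LLM call so the sketch runs end to end.
    return "positive" if "छान" in prompt else "negative"

data = [("हा चित्रपट खूप छान होता!", "positive")]
print(evaluate(standard_prompt, toy_model, data),
      evaluate(cotr_prompt, toy_model, data))
```

In the actual study, `query_model` would dispatch the same dataset to each of the models listed above under both prompting strategies, and the resulting accuracies would be compared per task.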
03
They found that translating the Marathi input sentence into English and then performing the task using a single prompt yielded superior results compared to directly processing Marathi text with a standard prompt. CoTR consistently outperformed standard prompting strategies across a variety of models and datasets.
“The results underscore the potential of translation-based prompting strategies to significantly improve multilingual LLM performance in low-resource languages,” the researchers said.
They also noted that the most significant performance gains using CoTR were observed with smaller models, such as Llama3-8B.
The researchers highlighted that their work “significantly contributes to multilingual NLP by demonstrating the potential of translation-based prompting strategies, particularly with a single prompt, to enhance NLP performance in low-resource languages.”
Looking ahead, they plan to combine CoTR with Chain-of-Thought prompting to further improve NLP accuracy for low-resource languages. “Together, these strategies should create a robust framework that improves model performance and reliability in Marathi NLP tasks,” they said.
Note: This article is taken from the Slator website and is intended for learning and exchange purposes only; in case of infringement, please contact the editor to have it removed.
– END –