Meet Phoenix: A New Multilingual LLM That Achieves Competitive Performance Among Open-Source English And Chinese Models
Large Language Models (LLMs) have taken the world by storm with their human-like capabilities and features. The latest addition to the long list of LLMs, the GPT-4 model, has exponentially increased the utility of ChatGPT due to its multimodal nature. This latest version takes input in the form of text and images and is already being used for creating high-quality websites and chatbots. Recently, a new model has been introduced to democratize ChatGPT, i.e., to make it more accessible and available to a wider audience, regardless of language or geographic constraints.
This latest model, called Phoenix, aims to achieve competitive performance not only in English and Chinese but also in languages with limited resources, covering both Latin and non-Latin scripts. Phoenix, a multilingual LLM that achieves strong performance among open-source English and Chinese models, has been released to make ChatGPT-like capabilities available in places where access is restricted by OpenAI or by local governments.
The authors have described the significance of Phoenix as follows –
- Phoenix has been presented as the first open-source, multilingual, and democratized ChatGPT model. This has been achieved by using rich multilingual data in the pre-training and instruction-finetuning stages.
- The team has conducted instruction-following adaptation in multiple languages, with a focus on non-Latin languages, using both instruction data and conversational data to train the model. This approach allows Phoenix to benefit from both data types, enabling it to generate contextually relevant and coherent responses across different language settings (see the sketch after this list).
- Phoenix is a first-tier Chinese large language model that achieves performance close to ChatGPT’s, and its Latin-script counterpart, Chimera, is competitive in English.
- The authors have claimed that Phoenix is the SOTA open-source large language model for many languages beyond Chinese and English.
- Phoenix is among the first efforts to systematically evaluate a broad range of LLMs, using both automatic and human evaluations that cover multiple aspects of language generation.
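The paper itself does not ship training code, but a minimal sketch of how single-turn instruction data and multi-turn conversational data might be blended into one supervised fine-tuning corpus could look like the following. The field names ("instruction", "output", "conversations"), the prompt template, and the simple shuffle are illustrative assumptions, not the authors' actual recipe.

```python
# Hypothetical sketch: blending multilingual instruction data and
# multi-turn conversation data into one supervised fine-tuning corpus.
import random

def format_instruction(example):
    # Single-turn instruction-following sample -> one prompt/response pair.
    return f"Human: {example['instruction']}\nAssistant: {example['output']}"

def format_conversation(example):
    # Multi-turn conversation -> concatenated dialogue turns.
    turns = []
    for turn in example["conversations"]:
        role = "Human" if turn["from"] == "user" else "Assistant"
        turns.append(f"{role}: {turn['text']}")
    return "\n".join(turns)

def build_sft_corpus(instruction_data, conversation_data, seed=42):
    """Interleave both data types so the model sees both styles during training."""
    samples = [format_instruction(x) for x in instruction_data]
    samples += [format_conversation(x) for x in conversation_data]
    random.Random(seed).shuffle(samples)
    return samples
```

In practice, the resulting text samples would then be tokenized and fed to a standard causal language-modeling fine-tuning loop; the key point is simply that both data styles end up in the same training mix.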
Phoenix has demonstrated superior performance compared to existing open-source Chinese LLMs such as BELLE and Chinese-LLaMA-Alpaca. In other non-Latin languages like Arabic, Japanese, and Korean, Phoenix also largely outperforms existing models. However, Phoenix did not surpass Vicuna, an open-source chatbot with 13B parameters trained by fine-tuning LLaMA on user-shared conversations.
This is because Phoenix pays a ‘multilingual tax’ when dealing with non-Latin or non-Cyrillic languages: the performance degradation a multilingual model can suffer when generating text in languages other than its primary one. The team considers this tax worth paying for the sake of democratization, since it caters to minority groups who speak relatively low-resource languages. To mitigate the tax in Latin and Cyrillic languages, the team also proposes a tax-free variant of Phoenix, called Chimera, which replaces Phoenix’s backbone with LLaMA. In English, Chimera achieved 96.6% of ChatGPT’s quality in GPT-4-based evaluation.
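The 96.6% figure comes from GPT-4-as-judge automatic evaluation, where GPT-4 scores paired responses and the candidate model's total score is reported as a percentage of ChatGPT's. A rough sketch of that computation follows; the judging callback, its 1-to-10 scoring scale, and the function names are assumptions for illustration, not the exact protocol from the paper.

```python
# Hypothetical sketch of GPT-4-as-judge relative quality scoring.
# `judge` stands in for a call that asks GPT-4 to score a candidate answer
# and a reference (ChatGPT) answer to the same question, e.g. on a 1-10 scale.
from typing import Callable, Iterable, Tuple

def relative_quality(
    questions: Iterable[str],
    candidate_answers: Iterable[str],
    reference_answers: Iterable[str],
    judge: Callable[[str, str, str], Tuple[float, float]],
) -> float:
    """Return the candidate's total score as a percentage of the reference's."""
    candidate_total, reference_total = 0.0, 0.0
    for q, cand, ref in zip(questions, candidate_answers, reference_answers):
        cand_score, ref_score = judge(q, cand, ref)
        candidate_total += cand_score
        reference_total += ref_score
    return 100.0 * candidate_total / reference_total

# Usage (hypothetical): relative_quality(qs, chimera_outputs, chatgpt_outputs, gpt4_judge)
# would report a number like 96.6 for Chimera on English prompts.
```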
Phoenix seems promising due to its multilingual potential and its ability to enable people from diverse linguistic backgrounds to utilize the power of language models for their specific needs.
Check out the Paper and Github.
Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.