List of Groundbreaking and Open-Source Conversational AI Models in the Language Domain

On May 1, 2023

Conversational AI refers to technology, such as virtual agents and chatbots, that uses large amounts of data and natural language processing to mimic human interaction and to recognize speech and text. In recent years, the conversational AI landscape has changed drastically, especially since the launch of ChatGPT. Here are some open-source large language models (LLMs) that are reshaping conversational AI.

LLaMA

Release date: February 24, 2023

LLaMA is a foundational LLM developed by Meta AI. It is designed to be more versatile and responsible than other models. By releasing LLaMA, Meta aims to democratize access to large language models for the research community and to promote responsible AI practices.

LLaMA is available in several sizes, with parameter counts ranging from 7B to 65B. Access to the model is granted on a case-by-case basis to industry research laboratories, academic researchers, and similar groups.
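
To give a feel for what these sizes mean in practice, here is a rough back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes the four LLaMA variants are 7B, 13B, 33B, and 65B (the text above only gives the range) and uses the usual byte widths for each numeric format. The function name `weight_memory_gb` is made up for this illustration, and the estimate ignores activations, caches, and optimizer state.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: the LLaMA variants are 7B, 13B, 33B and 65B parameters.
# Activations, the KV cache and optimizer state are deliberately ignored.

LLAMA_SIZES_BILLIONS = [7, 13, 33, 65]
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(params_billions: float, dtype: str = "fp16") -> float:
    """Approximate size of the weights in gigabytes (1 GB = 10^9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


if __name__ == "__main__":
    for size in LLAMA_SIZES_BILLIONS:
        row = ", ".join(
            f"{dtype}: {weight_memory_gb(size, dtype):.0f} GB"
            for dtype in BYTES_PER_PARAM
        )
        print(f"LLaMA-{size}B -> {row}")
```

By this estimate, the smallest model needs about 14 GB for its weights in half precision, while the largest needs about 130 GB, which is one reason the smaller variants became the usual starting point for fine-tuning projects.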

Open Assistant

Release date: March 8, 2023

Open Assistant is a project developed by LAION-AI to give everyone access to a capable chat-based large language model. Trained on large amounts of text and code, it can perform a range of tasks, including answering questions, generating text, translating between languages, and producing creative content.

Although Open Assistant is still under development, it can already interact with external systems such as Google Search to gather information. It is also an open-source initiative, so anyone can contribute to its progress.

Dolly

Release date: March 8, 2023

Dolly is an instruction-following LLM developed by Databricks. It was trained on the Databricks machine-learning platform and is licensed for commercial use. Dolly is based on the Pythia 12B model and was fine-tuned on approximately 15,000 instruction/response records. Although not state of the art, Dolly follows instructions with surprisingly high quality.

Alpaca

Release date: March 13, 2023

Alpaca is a small instruction-following model developed at Stanford University. It is fine-tuned from Meta’s LLaMA model with 7B parameters and is designed to perform well on many instruction-following tasks while remaining easy and cheap to reproduce.

Although it behaves similarly to OpenAI’s text-davinci-003 model, it is significantly cheaper to produce (under $600). The model is open-source and was trained on a dataset of 52,000 instruction-following demonstrations.
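
To make "instruction-following demonstrations" more concrete, the sketch below shows how one such record might be turned into training text. The template wording follows the prompt format published with Stanford Alpaca as best recalled here, so treat the exact phrasing as an assumption. The helper names `build_prompt` and `build_training_text` and the sample record are hypothetical.

```python
# Sketch: turning one instruction-following record into training text.
# Assumption: the template wording mirrors the format published with
# Stanford Alpaca; the helpers and the sample record are hypothetical.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Use the longer template only when the record has a non-empty input."""
    if record.get("input", "").strip():
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


def build_training_text(record: dict) -> str:
    """Prompt followed by the demonstrated answer the model should learn."""
    return build_prompt(record) + record["output"]


if __name__ == "__main__":
    sample = {
        "instruction": "Translate the sentence to French.",
        "input": "Good morning.",
        "output": "Bonjour.",
    }
    print(build_training_text(sample))
```

During fine-tuning, the model sees many such texts and learns to continue the prompt with an answer in the style of the demonstrations.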

Vicuna

Vicuna was developed by a team from UC Berkeley, CMU, Stanford, and UC San Diego. It is a chatbot trained by fine-tuning the LLaMA model on user-shared conversations collected from ShareGPT.

Vicuna is an auto-regressive language model based on the transformer architecture, and it offers natural and engaging conversation. With 13B parameters, it produces more detailed and better-structured answers than Alpaca, and its developers report quality comparable to that of ChatGPT.
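
For readers unfamiliar with the term, "auto-regressive" means the model produces one token at a time, each conditioned on what came before. The toy sketch below illustrates only that loop. It is not Vicuna's implementation: a real model conditions on the whole prefix through a transformer, whereas this example looks only at the last token and uses a made-up probability table (`NEXT_TOKEN_PROBS`) and a hypothetical helper (`greedy_generate`).

```python
# Toy illustration of auto-regressive generation with greedy decoding.
# Assumption: the bigram table below is invented for illustration; a real
# model would compute next-token probabilities from the entire prefix.

NEXT_TOKEN_PROBS = {
    "<s>": {"hello": 0.9, "there": 0.1},
    "hello": {"there": 0.7, "</s>": 0.3},
    "there": {"</s>": 0.8, "hello": 0.2},
}


def greedy_generate(start: str = "<s>", max_new_tokens: int = 10) -> list:
    """Repeatedly append the most probable next token until end-of-sequence."""
    tokens = [start]
    for _ in range(max_new_tokens):
        dist = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(dist, key=dist.get)  # greedy: pick the top token
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


if __name__ == "__main__":
    print(" ".join(greedy_generate()))
```

Chat models typically replace the greedy choice with sampling, which makes replies less repetitive, but the token-by-token loop is the same.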

Koala

Release date: April 3, 2023

The Berkeley Artificial Intelligence Research Lab (BAIR) developed Koala, a dialogue model based on the LLaMA 13B model. It is intended to be safer and more easily interpretable than other LLMs. Koala was fine-tuned on freely available interaction data, with a focus on data that includes interactions with highly capable closed-source models.

Koala is useful for studying language model safety and bias and for understanding the inner workings of dialogue language models. It is also an open-source alternative to ChatGPT, and its release includes EasyLM, a framework for training and fine-tuning LLMs.

Pythia

EleutherAI has created Pythia, a suite of autoregressive language models designed to support scientific research. The suite consists of 16 models ranging from 70M to 12B parameters. All models are trained on the same data with the same architecture, which allows direct comparisons and makes it possible to explore how model behavior changes with scale.

OpenChatKit

Release date: April 5, 2023

Together has developed OpenChatKit, an open-source chatbot development framework that aims to simplify building conversational AI applications. Its chat model is designed for conversation and instruction following and excels at summarization, table generation, classification, and dialogue.

With OpenChatKit, developers get a robust, open-source foundation for building both specialized and general-purpose chatbots. The released chat models are fine-tuned from open EleutherAI base models, including GPT-NeoXT-Chat-Base-20B (from GPT-NeoX-20B) and Pythia-Chat-Base-7B, so developers can choose a size that fits their computational resources and application requirements.

RedPajama

Release date: April 13, 2023

RedPajama is a project created by a team from Together, Ontocord.ai, ETH DS3Lab, Stanford CRFM, Hazy Research, and MILA Québec AI Institute. Their goal is to develop leading open-source models, beginning with a reproduction of the LLaMA training dataset, which contains more than 1.2 trillion tokens.

The project aims to create a completely open, reproducible, and cutting-edge language model from three key components: pre-training data, base models, and instruction-tuning data and models. The dataset is currently available through Hugging Face, and users can reproduce it with scripts that are available on GitHub under the Apache 2.0 license.

StableLM

Release date: April 19, 2023

StableLM is an open-source language model developed by Stability AI. The model is trained on an experimental dataset three times larger than The Pile and performs well on conversational and coding tasks despite its small size. It is available with 3B and 7B parameters, with larger models to follow.

StableLM can generate both text and code, making it suitable for various downstream applications. Stability AI is also releasing a set of instruction-fine-tuned research models, trained on a combination of five recent open-source datasets designed for conversational agents. These fine-tuned models are intended for research only and are released under the non-commercial CC BY-NC-SA 4.0 license.


