Another Large Language Model! Meet IGEL: An Instruction-Tuned German LLM Family

IGEL is the Instruction-tuned German large Language Model for Text. IGEL version 001 (instruct-igel-001) is an early proof of concept intended to determine whether a German instruction-tuned model can be built from a combination of existing open-source models and a German-translated instruction dataset.

The first version of IGEL is based on BigScience BLOOM, which Malte Ostendorff adapted to German. IGEL is designed to handle a range of natural language understanding tasks, including sentiment analysis, language translation, and question answering, accurately and reliably.

The team wanted to test how well LLMs handle instruction-following tasks in German. To do so, they took a pre-trained BLOOM model adapted to German (6.4B parameters) and fine-tuned it on a dataset of translated instructions. The dataset was built by automatically translating English instructions into German. Although this approach increases the risk of translation errors, the goal was to determine whether the model could still learn to produce instruction-following responses.
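As a rough illustration of the dataset step, here is a minimal Python sketch of translating instruction records field by field. The article does not describe the team's actual pipeline, so everything here is an assumption: the record fields (instruction, input, output), the translate_records helper, and the toy_translate lookup table that stands in for a real machine-translation system.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, str]


def translate_records(
    records: List[Record],
    translate: Callable[[str], str],
    fields: Sequence[str] = ("instruction", "input", "output"),
) -> List[Record]:
    """Return copies of the records with the given text fields translated.

    The field names are hypothetical; adjust them to the dataset at hand.
    """
    translated = []
    for record in records:
        new_record = dict(record)  # copy so the English source stays intact
        for field in fields:
            text = record.get(field, "")
            if text:  # leave empty or missing fields untouched
                new_record[field] = translate(text)
        translated.append(new_record)
    return translated


def toy_translate(text: str) -> str:
    """Stand-in for a real machine-translation call (a tiny lookup table)."""
    table = {
        "Name three colors.": "Nenne drei Farben.",
        "Red, green, blue.": "Rot, Grün, Blau.",
    }
    return table.get(text, text)  # unknown text passes through unchanged


english = [
    {"instruction": "Name three colors.", "input": "", "output": "Red, green, blue."}
]
german = translate_records(english, toy_translate)
print(german[0])
```

Because the translator is passed in as a plain function, a cleaning or filtering step could later be wrapped around it without changing the loop. A naive pipeline like this one applies no such checks, which is where translation errors can slip into the training data.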

Instruct-igel-001 is a LoRA-tuned BLOOM-CLP German model (6.4B parameters) with merged weights, ready for use with Hugging Face Transformers. It was trained on naively translated instruction datasets, with little attention paid to data cleaning, filtering, or post-processing.
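For readers unfamiliar with what "merged weights" means, the following toy Python sketch shows the standard LoRA formulation, in which a low-rank update B·A, scaled by alpha / r, is added to a frozen weight matrix so that the result can be loaded like an ordinary model. The article does not say how the IGEL weights were actually merged. The helper names, the 2x2 matrix, and the alpha value are illustrative assumptions only.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x r) and b (r x n)."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def merge_lora(weight: Matrix, lora_b: Matrix, lora_a: Matrix, alpha: float) -> Matrix:
    """Fold a LoRA update into the base weight: W + (alpha / r) * B @ A."""
    rank = len(lora_a)            # r = number of rows of A
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2 x 2 base weight with a rank-1 adapter (values are made up).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]    # shape 2 x 1
lora_a = [[0.5, 0.0]]      # shape 1 x 2
merged = merge_lora(weight, lora_b, lora_a, alpha=2.0)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint can be used without any LoRA-specific tooling.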

The team noted that instruct-igel-001 exhibits hallucination, toxicity, and stereotyping, among other problems common to language models. As next steps, they plan to finish developing a chat model that offers a conversational interface going beyond the simple request-and-response pattern, and to improve the data quality.




Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.

