We Know That LLMs Can Use Tools, But Did You Know They Can Also Make New Tools? Meet LLMs As Tool Makers (LATM): A Closed-Loop System Allowing LLMs To Make Their Own Reusable Tools
Large language models (LLMs) have excelled at a wide range of NLP tasks and have shown encouraging signs of achieving some aspects of artificial general intelligence. Recent research has also shown that LLMs can be augmented with external tools, considerably increasing their problem-solving ability and efficiency, much as tool use did in the evolution of human intelligence. However, how widely these tool-using methods can be applied depends largely on whether appropriate tools are available. The lesson from human evolution is that the ability of people to create their own tools to solve new problems was a significant turning point in human development.
In this study, researchers from Google DeepMind, Princeton University, and Stanford University, motivated by the significance of tool making for humans, apply this evolutionary notion to LLMs. The framework they propose, dubbed LLMs As Tool Makers (LATM), enables LLMs to create their own reusable tools to tackle new tasks. Their approach consists of two key phases: 1) tool making: an LLM, called the tool maker, creates tools (implemented as Python functions) for a specific task; 2) tool using: a second LLM, called the tool user, which may be the same model as the tool maker, applies the tools to handle new requests. Because of this two-stage design, LATM can assign each phase to the most suitable LLM.
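The two phases can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `fake_tool_maker`, `load_tool`, and `fake_tool_user` are hypothetical names, and both "LLMs" are replaced by hard-coded stubs so the example runs without any model API.

```python
from typing import Callable, Dict, List


def fake_tool_maker(task_description: str) -> str:
    """Hypothetical stand-in for the tool maker LLM.

    A real tool maker would be prompted with the task description and a few
    examples; here the "generated" Python source is simply hard-coded.
    """
    return (
        "def tool(numbers):\n"
        "    # Return the numbers sorted in ascending order.\n"
        "    return sorted(numbers)\n"
    )


def load_tool(source: str, name: str = "tool") -> Callable:
    """Run generated source in its own namespace and return the function."""
    namespace: Dict[str, object] = {}
    # Assumption: the source is trusted. Real generated code should be
    # sandboxed and checked before it is executed.
    exec(source, namespace)
    return namespace[name]


# Phase 1 (tool making): done once per task type.
tool = load_tool(fake_tool_maker("sort a list of numbers"))


def fake_tool_user(request: List[int]) -> List[int]:
    """Hypothetical stand-in for the tool user LLM: maps a request to a tool call."""
    return tool(request)


# Phase 2 (tool using): the same tool serves every new request.
results = [fake_tool_user(r) for r in ([3, 1, 2], [10, -5, 0])]
print(results)
```

The point of the split is visible in the structure: the expensive step produces source code once, and every later request only needs a cheap function call.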
In particular, the tool-making process, which demands strong capabilities, can be assigned to a powerful but resource-intensive model (such as GPT-4). The tool-using process, which is significantly simpler, can be assigned to a lightweight and affordable model (such as GPT-3.5 Turbo). This approach greatly lowers the average computing cost of handling many tasks while improving LLMs' problem-solving skills. For a given capability, tool making only has to be carried out once, so the resulting tools can be reused across many task instances.
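The cost argument is simple amortization: a one-time tool-making cost is spread over every instance that reuses the tool. The sketch below works through that arithmetic. The function names and all cost figures are made up for illustration and are not taken from the paper.

```python
import math


def average_cost_per_instance(make_cost: float, use_cost: float, n_instances: int) -> float:
    """Average cost when the tool is made once and then used n_instances times."""
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    return (make_cost + use_cost * n_instances) / n_instances


def break_even_instances(make_cost: float, use_cost: float, strong_cost: float) -> int:
    """Smallest n for which make-once-then-use is strictly cheaper than
    calling the strong model directly on all n instances."""
    if strong_cost <= use_cost:
        raise ValueError("the tool user must be cheaper per instance than the strong model")
    # make_cost + n * use_cost < n * strong_cost  <=>  n > make_cost / (strong_cost - use_cost)
    return math.floor(make_cost / (strong_cost - use_cost)) + 1


# Illustrative, invented units: making the tool costs 50, each lightweight
# call costs 1, and each direct call to the strong model costs 20.
print(average_cost_per_instance(50.0, 1.0, 1))    # 51.0
print(average_cost_per_instance(50.0, 1.0, 100))  # 1.5
print(break_even_instances(50.0, 1.0, 20.0))      # 3
```

With these invented numbers, the one-time cost dominates for a single instance, but the average quickly approaches the lightweight model's per-call cost as the tool is reused.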
This method provides a scalable and economical way to deal with challenging problems. Consider a scenario where a user asks the LLM to arrange a meeting time that works for everyone (for instance, based on email exchanges). Lightweight models like GPT-3.5 Turbo frequently struggle with tasks like this, which involve complex arithmetic reasoning. Stronger models like GPT-4 can get the right answers, but at significantly higher inference cost. LATM overcomes this trade-off by using a powerful but expensive model as the tool maker and handing the resulting tool to a cost-effective model as the tool user. Once the tool has been forged, the tool user can apply it to do the work quickly and effectively.
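To make the scheduling scenario concrete, here is a rough sketch of the kind of Python function a tool maker might forge for it. The function name, the input format (a set of free hours per person), and the sample data are all assumptions for illustration; the article does not specify what the generated tool looks like.

```python
from typing import Dict, Optional, Set


def find_common_slot(availability: Dict[str, Set[int]]) -> Optional[int]:
    """Return the earliest hour at which every participant is free.

    `availability` maps each participant to the set of hours (0-23) they are
    free. Returns None if there are no participants or no shared hour.
    """
    if not availability:
        return None
    # Hours that appear in every participant's set.
    common = set.intersection(*availability.values())
    return min(common) if common else None


# Hypothetical request, already extracted from the email thread by the tool user.
slot = find_common_slot({
    "alice": {9, 10, 14},
    "bob": {10, 11, 14},
    "carol": {10, 14, 16},
})
print(slot)  # 10
```

In this division of labor, the lightweight tool user only has to turn the emails into the `availability` dictionary; the reasoning about overlaps lives in the reusable function.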
This paradigm can also be applied to well-known games such as the 24 game and Sudoku, and to repetitive jobs in other workflows, such as parsing and analyzing web documents into specific data formats or creating routing plans that satisfy various custom requirements. The researchers also introduce the dispatcher, an additional lightweight LLM that decides whether an incoming problem can be solved with existing tools or whether a new tool has to be made. This makes the architecture more dynamic and allows tools to be created and used in real time. Their experiments demonstrate the efficacy of this approach on a variety of challenging BIG-Bench tasks and on complex reasoning tasks in general.
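The dispatcher's role can be pictured as a cache lookup with a fallback to tool making. The following is a minimal sketch under loud assumptions: `classify` and `make_tool` are hypothetical stubs standing in for the dispatcher LLM and the tool maker, and the keyword-based task typing is a placeholder for whatever decision the real dispatcher makes.

```python
from typing import Callable, Dict, List

tool_cache: Dict[str, Callable[[str], str]] = {}
make_calls: List[str] = []  # records which task types triggered tool making


def classify(request: str) -> str:
    """Hypothetical stand-in for the dispatcher LLM: crude keyword-based task typing."""
    return "scheduling" if "meeting" in request.lower() else "other"


def make_tool(task_type: str) -> Callable[[str], str]:
    """Hypothetical stand-in for the tool maker; returns a dummy tool."""
    make_calls.append(task_type)
    return lambda request: f"[{task_type} tool] handled: {request}"


def dispatch(request: str) -> str:
    """Reuse an existing tool when one matches; otherwise make one first."""
    task_type = classify(request)
    if task_type not in tool_cache:
        tool_cache[task_type] = make_tool(task_type)
    return tool_cache[task_type](request)


first = dispatch("Arrange a meeting for Monday")
second = dispatch("Find a meeting slot on Friday")
print(make_calls)  # the tool maker ran only once for both requests
```

Both requests map to the same task type, so the second one is served from the cache and the expensive tool-making step is skipped.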
The results show that LATM can perform on par with more resource-intensive models while being more cost-effective. By mimicking the evolutionary leap humans made in creating and using tools, this approach opens up exciting possibilities for a growing society that uses LLM-generated tools.
Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.