The Emergence of Stacking: How is the Self-Referential Nature of Stacking in Large Language Models Transforming the Artificial Intelligence (AI) Industry?

On Apr 13, 2023

The AI industry is evolving quickly, with new research and models appearing almost daily. In healthcare, education, retail, marketing, and business, Artificial Intelligence and Machine Learning are beginning to change how industries operate. Organizations are adopting AI to bring its capabilities into people’s everyday lives, and as systems get better at learning, reasoning, and making decisions, the field continues to advance rapidly.

Large Language Models (LLMs), which have recently become very popular, are the most visible example of this shift. ChatGPT, which uses the GPT transformer architecture to generate content, is currently the talk of the town and the go-to chatbot for many of those who know of it. Recently, a Twitter user, Jay Hack, discussed an intriguing trend in AI, the stacking of AI models, in a tweet thread. Referring to the concept as “models all the way down,” Jay described how AI models use other models to perform tasks and make decisions.

Stacking means having an AI model invoke other models to solve a complex task, which, according to the thread, can result in emergent intelligence. The main idea is for AI models to use other models as tools to accomplish one or more subtasks. Examples quoted in the thread include GPT spawning copies of itself to solve subtasks and GPT using a vision model to draw portraits.
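To make the idea concrete, here is a minimal sketch of one model delegating subtasks to other models. Everything in it is an assumption made for illustration: `vision_model`, `text_model`, `plan`, and `controller` are hypothetical stand-ins that return canned strings, not real model APIs, and the planning step is hard-coded where a real system would ask an LLM to split the task.

```python
from typing import Callable, Dict, List, Tuple

# In this sketch a "model" is any callable from a text prompt to a text result.
Model = Callable[[str], str]


def vision_model(prompt: str) -> str:
    # Hypothetical stand-in for an image-generation model.
    return f"<image: {prompt}>"


def text_model(prompt: str) -> str:
    # Hypothetical stand-in for a language model handling a subtask.
    return f"<text: {prompt}>"


def plan(task: str) -> List[Tuple[str, str]]:
    # Stand-in for the controller's planning step. A real system would ask
    # an LLM to decompose the task; here the decomposition is hard-coded.
    return [
        ("text", f"write a caption for {task}"),
        ("vision", f"draw {task}"),
    ]


def controller(task: str, tools: Dict[str, Model]) -> List[str]:
    # The top-level model invokes another model for each subtask.
    return [tools[name](subtask) for name, subtask in plan(task)]


TOOLS: Dict[str, Model] = {"text": text_model, "vision": vision_model}
results = controller("a portrait of a cat", TOOLS)
print(results)
```

Because each tool is just a callable, a tool could itself be another `controller`, which is where the “models all the way down” framing comes from.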


Jay discussed the self-referential nature of stacking, which he suggests could help in developing systems with Artificial General Intelligence (AGI). When multiple AI models are stacked on top of each other, each model can make use of the capabilities of the models below it, resulting in a system with greater overall capability. This approach is seen as a frontier in building systems that can perform tasks previously considered out of reach for AI.

Two recent examples that apply this concept are BabyAGI and AutoGPT. Both are applications built on top of LLMs, rather than LLMs themselves, and both recursively call themselves. BabyAGI uses an LLM to create, prioritize, and execute a running list of tasks toward a given objective. AutoGPT uses GPT-4 and GPT-3.5 via API to create full projects by iterating on its own prompts. AutoGPT reportedly even created a website using React and Tailwind CSS in under three minutes.
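An agent that keeps calling itself can be sketched as a loop over a task queue. The code below is only an illustration of that pattern under stated assumptions, not the actual implementation of BabyAGI or AutoGPT: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling an API, and the `EXECUTE:` / `EXPAND:` prompt prefixes are invented for this example.

```python
from collections import deque
from typing import Callable, Deque, List


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for an LLM API call; returns canned text.
    if prompt.startswith("EXECUTE:"):
        return "done: " + prompt[len("EXECUTE:"):].strip()
    if prompt.startswith("EXPAND:"):
        task = prompt[len("EXPAND:"):].strip()
        # Only the top-level objective is split into follow-up tasks.
        if task == "build a website":
            return "design the layout\nwrite the code"
    return ""


def run_agent(objective: str, llm: Callable[[str], str],
              max_steps: int = 10) -> List[str]:
    tasks: Deque[str] = deque([objective])
    log: List[str] = []
    steps = 0
    # max_steps bounds the loop so a model that keeps proposing new
    # tasks cannot make the agent run forever.
    while tasks and steps < max_steps:
        task = tasks.popleft()
        log.append(llm(f"EXECUTE: {task}"))
        # Ask the model for follow-up tasks and feed them back into the queue.
        follow_ups = [t for t in llm(f"EXPAND: {task}").splitlines() if t]
        tasks.extend(follow_ups)
        steps += 1
    return log


log = run_agent("build a website", fake_llm)
print(log)
```

The step limit is a deliberate design choice in this sketch: since the model's own output decides what runs next, some external bound is needed to guarantee termination.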

Stacking is also gaining ground in other domains. ViperGPT gives GPT access to a Python REPL (Read-Eval-Print Loop) and a high-level API for manipulating Computer Vision models. In robotics, SayCan uses an LLM as the backbone for robotic reasoning. Another recent project, ‘toolkit.club’, uses LLMs to build and deploy tools for other AIs. It runs a loop in which an agent asks for a tool, an LLM builds and deploys the tool, and the agent then uses it.
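The ask, build, deploy, use loop described above can be sketched roughly as follows. This is not how toolkit.club is implemented; it is a toy illustration in which `fake_toolmaker_llm` and `ToolRegistry` are hypothetical names, the "generated" source code is canned, and "deployment" simply means registering the function in memory.

```python
from typing import Callable, Dict


def fake_toolmaker_llm(request: str) -> str:
    # Hypothetical stand-in for an LLM that writes tool source code.
    # A real system would generate this text; here it is canned.
    if request == "word_count":
        return "def word_count(text):\n    return len(text.split())\n"
    raise ValueError(f"cannot build tool: {request}")


class ToolRegistry:
    def __init__(self, toolmaker: Callable[[str], str]) -> None:
        self.toolmaker = toolmaker
        self.tools: Dict[str, Callable] = {}

    def get(self, name: str) -> Callable:
        # Build and "deploy" the tool the first time an agent asks for it.
        if name not in self.tools:
            source = self.toolmaker(name)
            namespace: dict = {}
            # exec on model-written code is unsafe outside a sandbox;
            # it is used here only to keep the sketch short.
            exec(source, namespace)
            self.tools[name] = namespace[name]
        return self.tools[name]


registry = ToolRegistry(fake_toolmaker_llm)
tool = registry.get("word_count")       # the agent asks for a tool
count = tool("models all the way down") # the agent uses the tool
print(count)
```

Running model-written code directly, as `exec` does here, is exactly the kind of step where the safety concerns around stacking become practical; a real deployment would need isolation and review.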

Stacking is thus advancing rapidly and opening doors to new capabilities. It can solve complex tasks that a single LLM query might not be able to solve. Used correctly, and with the limitations around AI safety addressed, this approach could enable significant further developments.

This article is based on this tweet thread that discusses the above topic. All credit for this research goes to the researchers on this project.


An intriguing trend in AI 🤖:

“Models all the way down” (aka “stacking”)

Have models invoke other models, then watch as emergent intelligence develops ✨

Here’s a discussion of what, how, and why this is important to watch 👇 pic.twitter.com/hN3JND32CK

— Jay Hack (@mathemagic1an) April 9, 2023

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.

