Superhuman Performance on the Atari 100K Benchmark: The Power of BBF – A New Value-Based RL Agent from Google DeepMind, Mila, and Universite de Montreal

On Jun 12, 2023

Deep reinforcement learning (RL) has emerged as a powerful approach to complex decision-making tasks, but reaching human-level sample efficiency during training remains a challenge. To address it, a team of researchers from Google DeepMind, Mila, and Universite de Montreal has introduced a value-based RL agent called “Bigger, Better, Faster” (BBF). In their recent paper, “Bigger, Better, Faster: Human-level Atari with human-level efficiency,” the team shows that BBF reaches super-human performance on the Atari 100K benchmark while training on a single GPU.

Addressing the Scaling Issue

The research team’s primary focus was the scaling of neural networks in deep RL when samples are limited. BBF builds on the SR-SPR agent developed by D’Oro et al. (2023), which periodically applies a shrink-and-perturb reset to its network. BBF makes this reset harder: it moves the parameters of the convolutional layers 50 percent of the way toward a random target, whereas SR-SPR moves them only 20 percent. This modification improves the performance of the BBF agent.
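Shrink-and-perturb is commonly described as an interpolation between the current weights and a freshly initialized set. The sketch below assumes that reading, with the 50 percent and 20 percent figures acting as interpolation strengths. The function name `shrink_and_perturb`, the `alpha` parameter, and the toy parameter dictionary are illustrative and are not taken from the BBF code.

```python
import numpy as np


def shrink_and_perturb(params, fresh_params, alpha):
    """Move every parameter array a fraction `alpha` toward a fresh random copy.

    alpha = 0.0 leaves the weights untouched; alpha = 1.0 is a full reset.
    """
    return {
        name: (1.0 - alpha) * params[name] + alpha * fresh_params[name]
        for name in params
    }


# Toy stand-in for one convolutional kernel (hypothetical shapes and values).
rng = np.random.default_rng(0)
trained = {"conv1": np.ones((3, 3))}
fresh = {"conv1": rng.normal(scale=0.1, size=(3, 3))}

bbf_style = shrink_and_perturb(trained, fresh, alpha=0.5)     # harder reset
sr_spr_style = shrink_and_perturb(trained, fresh, alpha=0.2)  # softer reset
```

With the larger `alpha`, more of the learned weights are discarded at each reset, which is the sense in which the BBF setting is the more aggressive of the two.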

Scaling Network Capacity

To scale network capacity, the researchers use the Impala-CNN network and widen each of its layers by a factor of four. BBF's performance keeps improving as the network gets wider, whereas SR-SPR peaks at 1-2 times the original width.
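To get a feel for what a fourfold width increase means, the sketch below scales the channel counts of a small convolutional stack and counts its parameters. It is a simplification under stated assumptions: the base widths `(16, 32, 32)` are those usually quoted for Impala-CNN, each stage is reduced to a single 3x3 convolution (the real network uses residual blocks), and the four input channels stand for stacked frames. The helper names are hypothetical.

```python
def scaled_channels(base_channels, width_scale):
    """Multiply every stage's channel count by an integer width factor."""
    return tuple(c * width_scale for c in base_channels)


def conv_param_count(in_channels, channels, kernel=3):
    """Weights plus biases for a plain chain of kernel x kernel convolutions."""
    total = 0
    for out_channels in channels:
        total += kernel * kernel * in_channels * out_channels + out_channels
        in_channels = out_channels
    return total


BASE = (16, 32, 32)              # assumed Impala-CNN stage widths
wide = scaled_channels(BASE, 4)  # four times wider, as described above

base_params = conv_param_count(4, BASE)
wide_params = conv_param_count(4, wide)
print(wide, base_params, wide_params)
```

Because both the input and output channels of most layers grow together, the parameter count grows much faster than the width factor itself, which is why scaling width is a meaningful test of how well an agent copes with a larger network.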

Enhancements for Better Performance

BBF anneals its update horizon (the number of steps used in its n-step returns), decreasing it exponentially from 10 to 3. Surprisingly, this yields a stronger agent than using a fixed value, as Rainbow and SR-SPR do. Additionally, the researchers apply weight decay and gradually increase the discount factor during learning to mitigate statistical overfitting.
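An exponential schedule of this kind can be sketched as a geometric interpolation between a start and an end value. The endpoints 10 and 3 come from the text above. Everything else is an assumption made for illustration: the 10,000-step annealing window, the discount endpoints 0.97 and 0.997, applying the same interpolation directly to the discount factor, and the function names.

```python
def exponential_anneal(step, total_steps, start, end):
    """Geometrically interpolate from `start` to `end` over `total_steps`."""
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return start * (end / start) ** frac


def update_horizon(step, total_steps=10_000, start=10, end=3):
    """n-step horizon: decays from 10 to 3, rounded to an integer."""
    return int(round(exponential_anneal(step, total_steps, start, end)))


def discount(step, total_steps=10_000, start=0.97, end=0.997):
    """Discount factor: rises over the same window (assumed endpoints)."""
    return exponential_anneal(step, total_steps, start, end)


for step in (0, 2_500, 5_000, 10_000):
    print(step, update_horizon(step), round(discount(step), 4))
```

Early in the schedule the agent uses long, low-discount returns that propagate reward quickly; later it settles on shorter returns with a higher discount, which are less biased by off-policy data.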

Empirical Study and Results

In their empirical study, the research team compares BBF against several baseline RL agents on the Atari 100K benchmark, including SR-SPR, SPR, DrQ(ε), and IRIS. BBF surpasses all of these competitors in both performance and computational cost. Specifically, BBF achieves a 2x improvement in performance over SR-SPR while using nearly the same computational resources. BBF also performs comparably to the model-based EfficientZero approach with more than a 4x reduction in runtime.

Future Implications and Availability

The introduction of the BBF agent represents a significant advancement in achieving super-human performance in deep RL, particularly on the Atari 100K benchmark. The research team hopes their work will inspire future endeavors to push the boundaries of sample efficiency in deep RL. The code and data associated with the BBF agent are publicly available on the project’s GitHub repository, enabling researchers to explore and build upon their findings.

With BBF, Google DeepMind and its collaborators have shown that a value-based agent can reach super-human performance on the Atari 100K benchmark by combining a larger network with training changes aimed at sample efficiency. This work opens up new possibilities for improving the efficiency and effectiveness of RL algorithms.

Niharika is a Technical consulting intern at Marktechpost. She is a third year undergraduate, currently pursuing her B.Tech from Indian Institute of Technology(IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in Machine learning, Data science and AI and an avid reader of the latest developments in these fields.
