Microsoft Research Proposes PRISE: A Machine Learning Method for Learning Multi-Task Temporal Action Abstractions via a Novel Connection to NLP Methodology
Since its inception, robotics has made significant strides, and robots are now widely used across numerous industries, such as home monitoring, electronics, nanotechnology, and aerospace. These robots process complex, high-dimensional data and must decide on the best actions to take. They do so by constructing abstractions, i.e., compact summaries of what they perceive and what actions they can take, which help them generalize across tasks. Researchers have increasingly concentrated on learning these abstractions from data rather than hand-crafting them.
In this work from Microsoft, the research team focuses on temporal action abstractions, i.e., decomposing complex policies into sequences of primitive, multi-step behaviors such as picking up objects or walking. They argue that this direction holds great potential for action representation learning, and they introduce Primitive Sequence Encoding (PRISE), a new method for teaching robots multi-step skills. The results show that PRISE allows a robot to learn faster and perform better than if it were trained directly on the full stream of low-level action codes.
PRISE draws its inspiration from NLP. The method converts the robot’s continuous actions into a set of discrete codes by pretraining a vector quantization module, then applies Byte Pair Encoding (BPE), the compression-based tokenization technique used in NLP, to identify frequently recurring subsequences within the action codes. For downstream learning, PRISE trains policies via behavior cloning over these induced skill tokens rather than the full sequence of primitive actions, making the process faster and more efficient than other methods.
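The two-stage pipeline described above can be sketched in a few lines of code. This is only an illustrative toy, not PRISE's actual implementation: `quantize` stands in for the pretrained vector-quantization module (nearest-codebook assignment), and `learn_skill_tokens` runs plain BPE merges over the resulting discrete action codes; all function and variable names here are hypothetical.

```python
# Toy sketch of the PRISE-style pipeline: continuous actions -> discrete
# codes (vector quantization) -> BPE merges that form "skill tokens".
from collections import Counter

import numpy as np


def quantize(actions, codebook):
    """Assign each continuous action vector to its nearest codebook entry."""
    # Pairwise distances between actions (N, D) and codebook entries (K, D).
    dists = np.linalg.norm(actions[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=1).tolist()


def most_frequent_pair(sequences):
    """Count adjacent code pairs across all trajectories."""
    counts = Counter()
    for seq in sequences:
        for pair in zip(seq, seq[1:]):
            counts[pair] += 1
    return counts.most_common(1)[0][0] if counts else None


def merge_pair(seq, pair, new_token):
    """Replace every occurrence of `pair` in `seq` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_skill_tokens(sequences, num_merges):
    """Run BPE over discrete action-code trajectories; merged pairs
    become new higher-level tokens (toy analog of skill tokens)."""
    vocab = {tok for seq in sequences for tok in seq}
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        new_token = max(vocab) + 1  # fresh token id for the merged pair
        vocab.add(new_token)
        merges.append((pair, new_token))
        sequences = [merge_pair(seq, pair, new_token) for seq in sequences]
    return sequences, merges
```

After a merge, a frequently recurring two-code routine is represented by a single token, so a behavior-cloning policy over the merged sequences predicts fewer, higher-level decisions per trajectory; repeated merges yield progressively longer multi-step skills.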
The researchers also assessed the effectiveness of PRISE’s skill tokens. They first pretrained PRISE on large-scale, multitask offline datasets and then evaluated it in two offline imitation learning (IL) scenarios: learning a multitask generalist policy and few-shot adaptation to unseen tasks.
For the first scenario, the researchers measured the average success rate across the 90 tasks of the LIBERO-90 benchmark and observed that PRISE’s skill tokens yield a significant performance improvement over existing algorithms. They also evaluated the 5-shot IL performance of PRISE on five unseen MetaWorld tasks and found that PRISE surpasses all other baselines by a large margin, highlighting its effectiveness in adapting to unseen downstream tasks.
In conclusion, this research paper presents PRISE, a new method that allows a robot to learn faster and perform better than if it were trained directly on the raw low-level action codes. It leverages the Byte Pair Encoding (BPE) tokenization algorithm from NLP to enable efficient multitask policy learning and few-shot adaptation to unseen tasks. The experiments demonstrate the method’s superiority over other algorithms, and this work has the potential to further improve the performance of robots across different tasks.
Check out the Paper and Project. All credit for this research goes to the researchers of this project.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence media platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.