AI2 Researchers Introduce Objaverse: A Massive Dataset with 800K+ Annotated 3D Objects

On Mar 23, 2023

In machine learning (ML) and artificial intelligence (AI), a high-quality dataset with enough data points is the foundation of any real-world application. ML models need large amounts of training data to reach high accuracy, and datasets also serve as benchmarks against which such models can be compared. Over the past few years, data corpora such as Wikipedia, Conceptual Captions, WebImageText, WebText, and many others have laid the groundwork for major advances in fields of AI such as computer vision and natural language processing.

Although many datasets are available for research and for building applications across a wide range of disciplines, the world of 3D data still lacks datasets that are both high in quality and large in scale. Despite strong interest in 3D vision, researchers have had to work with medium-sized datasets that offer little diversity in object categories. ShapeNet, for example, is considered a large-scale repository of 3D shapes, yet it contains only around 50,000 objects. In response to this problem, PRIOR, a computer vision research team at the Allen Institute for AI (AI2), introduced Objaverse 1.0, a large-scale dataset of more than 800K 3D objects with thorough annotations, including captions, tags, and animations. The dataset seeks to surpass other large-scale 3D datasets on several measures, including size, number of categories, and visual diversity of instances within a given category. Objaverse is publicly accessible and available for download on Hugging Face.

An order of magnitude larger than its predecessors, Objaverse spans a wide variety of content, including animals, cartoon characters, vehicles, and food. It does not end there: the dataset also includes interiors and exteriors of large spaces, which can come in handy for Embodied AI tasks such as training robotic agents to navigate open spaces. Objaverse contains more than 44K diverse animated 3D objects, and each object comes with detailed textual annotations covering its name, description, tags, and other supplementary metadata. One of the dataset's most intriguing features is that its objects were created by more than 150K artists, which helps make it both large and immensely diverse.
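To make the annotation structure described above more concrete, here is a minimal Python sketch of how such per-object records might be filtered by tag or by whether they are animated. The record layout (a dictionary keyed by object ID, with `name`, `description`, `tags`, and `animationCount` fields) and the sample entries are assumptions made for illustration only; the actual field names in the Objaverse release may differ and should be checked against the dataset itself.

```python
from typing import Dict, List

# Hypothetical annotation records, keyed by a made-up object ID.
# The field names below are assumptions for illustration, not the
# confirmed Objaverse schema.
sample_annotations: Dict[str, dict] = {
    "uid-a": {
        "name": "Wooden chair",
        "description": "A simple dining chair.",
        "tags": [{"name": "furniture"}, {"name": "chair"}],
        "animationCount": 0,
    },
    "uid-b": {
        "name": "Running fox",
        "description": "A low-poly fox with a run cycle.",
        "tags": [{"name": "animal"}, {"name": "Fox"}],
        "animationCount": 2,
    },
    "uid-c": {
        "name": "Food truck",
        "description": "",
        "tags": [{"name": "vehicle"}],
        "animationCount": 0,
    },
}


def select_uids_by_tag(annotations: Dict[str, dict], wanted_tag: str) -> List[str]:
    """Return the IDs of records that carry the given tag (case-insensitive)."""
    wanted = wanted_tag.lower()
    selected = []
    for uid, record in annotations.items():
        tag_names = {tag.get("name", "").lower() for tag in record.get("tags", [])}
        if wanted in tag_names:
            selected.append(uid)
    return selected


def select_animated_uids(annotations: Dict[str, dict]) -> List[str]:
    """Return the IDs of records that report at least one animation."""
    return [
        uid
        for uid, record in annotations.items()
        if record.get("animationCount", 0) > 0
    ]


if __name__ == "__main__":
    print(select_uids_by_tag(sample_annotations, "fox"))
    print(select_animated_uids(sample_annotations))
```

With real annotations loaded in place of the sample records, the same two helpers could be used to pull out, for example, only the animated objects or only those tagged as vehicles.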

To demonstrate the potential of this large-scale 3D dataset, the PRIOR research team conducted a variety of experiments across different domains. Examples include creating 3D representations of objects suitable for video games and improving long-tail object recognition on the LVIS benchmark. Other applications include developing a new benchmark to assess the robustness of the CLIP model and training embodied AI navigation models that allow robots to perform object detection based on natural language. Objaverse is also already in use by Meta for textured mesh generation and by researchers at Columbia University for single-view 3D reconstruction.

With Objaverse, the researchers hope to revolutionize 3D vision research by giving the AI community access to a large, diverse dataset that can be used across various AI disciplines. They are eager to learn about all the ways the research community will use Objaverse.

Check out the Paper and Project. All credit for this research goes to the researchers on this project.

Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about machine learning, natural language processing, and web development, and enjoys learning more about the technical field by participating in challenges.
