Meet Segment AnyRGBD: A Toolbox To Segment Rendered Depth Images Based On SAM

Researchers at NTU recently introduced Segment Any RGBD (SAD), a toolbox that uses SAM to segment rendered depth images. SAD can segment any 3D object from RGBD inputs, or from rendered depth images alone.

The approach is motivated by the observation that people can readily recognize objects from a visualization of the depth map. The depth map ([H, W]) is first mapped to RGB space ([H, W, 3]) through a colormap function, and the rendered depth image is then fed into SAM. Compared to the RGB image, the rendered depth image pays less attention to texture and more attention to geometry. Other SAM-based projects such as SSA, Anything-3D, and SAM 3D all take RGB images as input. The authors describe their work as the first to use SAM to extract geometric information directly.
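The depth-to-RGB mapping described above can be sketched roughly as follows. This is a minimal illustration, not the project's own code: the function name `render_depth` and the simple blue-green-red ramp are assumptions, and the actual toolbox may use a different colormap.

```python
import numpy as np

def render_depth(depth: np.ndarray) -> np.ndarray:
    """Map a depth map of shape [H, W] to a uint8 RGB image of shape [H, W, 3]."""
    depth = depth.astype(np.float64)
    valid = np.isfinite(depth)
    if not valid.any():
        return np.zeros(depth.shape + (3,), dtype=np.uint8)

    d_min = depth[valid].min()
    d_max = depth[valid].max()
    span = d_max - d_min

    # Normalize valid depths to [0, 1]; a constant depth map maps to 0 everywhere.
    norm = np.zeros_like(depth)
    if span > 0:
        norm[valid] = (depth[valid] - d_min) / span

    # Hypothetical colormap: nearest = blue, middle = green, farthest = red.
    r = np.clip(2.0 * norm - 1.0, 0.0, 1.0)
    b = np.clip(1.0 - 2.0 * norm, 0.0, 1.0)
    g = 1.0 - r - b
    rgb = np.stack([r, g, b], axis=-1)

    # Pixels with missing depth (NaN or inf) are painted black.
    rgb[~valid] = 0.0
    return (rgb * 255).round().astype(np.uint8)
```

The resulting [H, W, 3] array has the same layout as an ordinary RGB image, which is what allows it to be passed to a segmentation model that expects RGB input.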

The project uses OVSeg for zero-shot semantic segmentation. Users can choose either raw RGB images or rendered depth images as the input to SAM. In either case, the user gets the semantic masks (where each color represents a different class) and the SAM masks associated with each class.
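The two outputs described above (a color-coded semantic map and the SAM masks grouped by class) can be illustrated with a small sketch. It assumes each SAM mask already carries a class label from the zero-shot classifier; how OVSeg assigns those labels is not shown here, and the function name `paint_semantic_map`, the palette, and the label values are all hypothetical.

```python
import numpy as np

def paint_semantic_map(masks, labels, palette, shape):
    """Build a color-coded semantic map and group masks by class.

    masks:   list of boolean arrays of shape [H, W] (e.g. SAM masks)
    labels:  one integer class ID per mask (assumed to be given)
    palette: uint8 array of shape [num_classes, 3], one color per class
    shape:   (H, W) of the output image
    """
    semantic = np.zeros(tuple(shape) + (3,), dtype=np.uint8)
    by_class = {}
    for mask, label in zip(masks, labels):
        # Every pixel inside the mask gets the color of the mask's class.
        semantic[mask] = palette[label]
        by_class.setdefault(label, []).append(mask)
    return semantic, by_class
```

With this layout, two masks that share a class are painted the same color in the semantic map but remain separate entries in the per-class list.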

Outcomes

RGB images mainly carry texture information, while depth images carry geometry information, so the RGB images are more colorful than their rendered depth counterparts. As the accompanying figure shows, SAM produces many more masks for RGB inputs than for depth inputs.

The rendered depth image reduces SAM's over-segmentation. In the accompanying illustration, for instance, the table is split into four segments on the RGB image, and one of those segments is labeled as a chair in the semantic results. On the depth image, by contrast, the table is treated as a whole and classified correctly. The blue circles in the figure mark part of a person's head that is misclassified as wall on the RGB image but classified correctly on the depth image.

Depth input has its own failure cases. In the red-circled region of the depth image, two chairs that sit close together are treated as a single chair. In such cases, the texture information in the RGB image is essential for telling the objects apart.

Repo and Tool

Visit https://huggingface.co/spaces/jcenaa/Segment-Any-RGBD to see the repository. 

The repository is open source and builds on OVSeg, which is distributed under the Creative Commons Attribution-NonCommercial 4.0 International License. Some parts of the project are covered by separate licenses: CLIP and ZSSEG are both under the MIT license.

The tool can be tried at https://huggingface.co/spaces/jcenaa/Segment-Any-RGBD.

The demo requires a GPU. To skip the queue, users can duplicate the Space and upgrade its settings to use a GPU. Each stage takes a while: initializing the model, running SAM, running zero-shot semantic segmentation, and generating the 3D results. The final results are available in about 2–5 minutes.




Dhanshree Shenwai is a computer science engineer with experience at FinTech companies in the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier.

