Google AI Open-Sources its Attention Center Model that Uses Machine Learning to Attempt to Identify Which Parts of an Image will Attract a Human’s Attention First 

Have you ever wondered, when seeing an image, are there any specific parts of the image you see first? What are these parts, and do they have some particular features that draw the focus to those parts? Now imagine a machine that can focus on these parts. Knowing these parts is a very helpful idea to make the process of image compression and decompression faster.

To decompress the sections which catch human attention first, the researchers at Google Research recently open-sourced an attention center model that employs machine learning-trained models to try to identify which parts of a picture will catch a human’s attention first.

This model is in Tensorflow lite format and takes an RGB image as input and gives the output image with a green dot at the center of attention.

Meet Hailo-8™: An AI Processor That Uses Computer Vision For Multi-Camera Multi-Person Re-Identification (Sponsored)
Source: https://opensource.googleblog.com/2022/12/open-sourcing-attention-center-model.html

The attention center model is a deep neural network that uses a pre-trained classification network, such as ResNet, MobileNet, etc., as its foundation and accepts an image as input. The attention center prediction module takes its input from several intermediate layers that the backbone network produces. As an example, lower layers frequently contain low-level information like intensity, color, and texture, whereas deeper levels typically contain higher-level and more meaningful information like shape and object.

A low-resolution version of the entire image is displayed at first. By the time your visual brain determines where to direct your pupils, that portion of the image has already begun to get sharper. The program then predicts where your eyes will go next as they move around the image and adds extra detail to those areas. The relatively dull regions are filled in last after those relatively sharp portions.

This model can be made really useful as this will help in faster loading of images as the important parts will load faster. It will also be useful while implementing machine learning and image processing since the more impactful parts are being searched. Thus, the implementations of such a model are extensive and much helpful. 


Check out the Github and Reference Article. All Credit For This Research Goes To Researchers on This Project. Also, don’t forget to join our Reddit page and discord channel, where we share the latest AI research news, cool AI projects, and more.


Rishabh Jain, is a consulting intern at MarktechPost. He is currently pursuing B.tech in computer sciences from IIIT, Hyderabad. He is a Machine Learning enthusiast and has keen interest in Statistical Methods in artificial intelligence and Data analytics. He is passionate about developing better algorithms for AI.


Credit: Source link

Comments are closed.