MIT Researchers Developed an Image Dataset that Allows Them to Simulate Peripheral Vision in Machine Learning Models

MIT researchers adapted the Texture Tiling Model (TTM) to address the challenge of modeling human visual perception in deep neural networks (DNNs), with a focus on peripheral vision. Peripheral vision, which represents the world with decreasing fidelity at greater eccentricities (that is, farther from the point of fixation), plays a crucial role in human visual processing but is often overlooked in computer vision systems. The paper aims to bridge the gap between human and machine perception by comparing how DNNs and humans perform on tasks constrained by peripheral vision.

Current approaches to modeling peripheral vision in DNNs are fragmented and often rely on specialized architectures, loss-of-resolution models, or style transfer techniques. These approaches fail to capture key properties of peripheral vision, such as crowding effects and sensitivity to clutter. The proposed method instead builds on the Texture Tiling Model (TTM), a well-tested model of peripheral vision in humans. The researchers modify TTM to make it more flexible for use with DNNs, creating the Uniform Texture Tiling Model (uniformTTM). uniformTTM generates images transformed to retain only the information available in human peripheral vision, and these images are then used to train and evaluate DNNs.
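To make the idea of a uniform, eccentricity-dependent transform concrete, here is a minimal Python sketch. It is not TTM or uniformTTM: TTM pools rich summary statistics over local regions, whereas this stand-in keeps only the mean of each region, which makes it closer to the loss-of-resolution baselines mentioned above. The function names, the `pixels_per_degree` viewing setup, and the `scale=0.5` growth rate (borrowed from the Bouma rule of thumb that crowding zones span roughly half the eccentricity) are all assumptions made for illustration.

```python
import numpy as np


def pooling_size_px(eccentricity_deg, pixels_per_degree=8.0, scale=0.5):
    """Side length, in pixels, of a pooling region at a given eccentricity.

    Assumptions (not from the paper): regions grow linearly with
    eccentricity at rate `scale`, and the display maps one degree of
    visual angle to `pixels_per_degree` pixels.
    """
    return max(1, int(round(scale * eccentricity_deg * pixels_per_degree)))


def uniform_pool(image, eccentricity_deg, pixels_per_degree=8.0):
    """Replace every pooling region with its mean value.

    One region size is used across the whole image, as if every pixel
    sat at the same eccentricity. This is a crude stand-in that pools
    only the mean, not the statistics a texture model would keep.
    """
    size = pooling_size_px(eccentricity_deg, pixels_per_degree)
    height, width = image.shape[:2]
    out = np.empty_like(image, dtype=np.float64)
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = image[top:top + size, left:left + size]
            # keepdims lets the block mean broadcast back over the block
            out[top:top + size, left:left + size] = block.mean(
                axis=(0, 1), keepdims=True
            )
    return out


# Illustrative usage on random noise; the eccentricities are examples only.
rng = np.random.default_rng(0)
image = rng.random((256, 256, 3))
for ecc in (5, 10, 20):
    pooled = uniform_pool(image, ecc)
    print(ecc, pooling_size_px(ecc), pooled.shape)
```

Larger eccentricities give larger pooling regions, so more local detail is averaged away. A dataset built this way would hold one transformed copy of each image per simulated eccentricity.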

The researchers apply uniformTTM to the COCO dataset to create COCO-Periph, a large dataset of images transformed to simulate peripheral vision at various eccentricities. Psychophysics experiments then measure both human and DNN performance on peripheral object detection. The results show that DNNs trained on COCO-Periph improve over pre-trained models but still underperform humans, particularly in their sensitivity to clutter. Training on COCO-Periph also leads to small increases in corruption robustness, suggesting a potential link between peripheral vision and robustness.
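A comparison of this kind comes down to tallying detection accuracy per eccentricity for each observer and looking at the difference. The sketch below shows one way to do that tally in Python. The record format, `(eccentricity_deg, correct)`, and the function names are hypothetical, and the trial records are made-up placeholders, not results from the paper.

```python
from collections import defaultdict


def accuracy_by_eccentricity(trials):
    """Map each eccentricity to the fraction of correct trials.

    `trials` is an iterable of (eccentricity_deg, correct) pairs; this
    record format is an assumption made for illustration.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for eccentricity, correct in trials:
        counts[eccentricity] += 1
        hits[eccentricity] += int(bool(correct))
    return {ecc: hits[ecc] / counts[ecc] for ecc in sorted(counts)}


def accuracy_gap(human_trials, model_trials):
    """Human accuracy minus model accuracy at each shared eccentricity."""
    human = accuracy_by_eccentricity(human_trials)
    model = accuracy_by_eccentricity(model_trials)
    return {ecc: human[ecc] - model[ecc] for ecc in human if ecc in model}


# Placeholder trial records, invented purely to exercise the functions.
human_trials = [(5, True), (5, True), (10, True), (10, False)]
model_trials = [(5, True), (5, False), (10, False), (10, False)]
print(accuracy_gap(human_trials, model_trials))
```

A positive gap at a given eccentricity means the human observers outperformed the model there.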

In conclusion, the paper highlights the importance of accurately modeling peripheral vision in DNNs so that they can mimic, and benefit from, the properties of human visual processing. The combination of uniformTTM and the COCO-Periph dataset is a significant step in this direction, but a performance gap between humans and DNNs remains. The experiments indicate a need to optimize DNNs for generalization across tasks and to better understand the relationship between peripheral vision and robustness. Overall, this work lays a foundation for advances in areas such as driver safety, content memorability, UI/UX design, foveated rendering, and compression, where modeling human-like visual perception is crucial for improving machine performance.


Check out the Paper and MIT Blog. All credit for this research goes to the researchers of this project.

Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast with a keen interest in software and data science applications, and she follows developments across different fields of AI and ML.

