A camera that knows exactly where it is

On Jun 9, 2021

Overview of the on-sensor mapping. The system moves around and, as it does, builds a visual catalogue of what it observes. This catalogue is the map later used to determine whether the system has been to a place before.
*Image credit: University of Bristol*

Knowing where you are on a map is one of the most useful pieces of information when navigating. It allows you to plan where to go next and to keep track of where you have been. This is essential for smart devices, from robot vacuum cleaners to delivery drones to wearable sensors keeping an eye on our health.

But one important obstacle is that systems that need to build or use maps are very complex and commonly rely on external signals like GPS that do not work indoors, or require a great deal of energy due to the large number of components involved.

Walterio Mayol-Cuevas, Professor in Robotics, Computer Vision and Mobile Systems at the University of Bristol’s Department of Computer Science, led the team that has been developing this new technology.

He said: “We often take for granted things like our impressive spatial abilities. Take bees or ants as an example. They have been shown to be able to use visual information to move around and achieve highly complex navigation, all without GPS or much energy consumption.

“In great part this is because their visual systems are extremely efficient and well-tuned to making and using maps, and robots can’t compete there yet.”

However, a new breed of sensor-processor devices, which the team calls Pixel Processor Arrays (PPAs), allows processing on-sensor. This means that as images are sensed, the device can decide what information to keep and what to discard, using only what it needs for the task at hand.

An example of such a PPA device is the SCAMP architecture, developed by the team’s colleagues at the University of Manchester, led by Piotr Dudek, Professor of Circuits and Systems. This PPA has one small processor for every pixel, which allows for massively parallel computation on the sensor itself.

The team at the University of Bristol has previously demonstrated how these new systems can recognise objects at thousands of frames per second, but the new research shows how a sensor-processor device can make maps and use them, all at the time of image capture.

This work was part of the MSc dissertation of Hector Castillo-Elizalde, who did his MSc in Robotics at the University of Bristol. He was co-supervised by Yanan Liu, who is doing his PhD on the same topic, and Dr Laurie Bose.

Hector Castillo-Elizalde and the team developed a mapping algorithm that runs entirely on board the sensor-processor device.

The algorithm is deceptively simple: when a new image arrives, the algorithm decides if it is sufficiently different to what it has seen before. If it is, it will store some of its data; if not, it will discard it.
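The store-or-discard decision can be illustrated with a minimal Python sketch. This is not the team's on-sensor implementation: the descriptor format, the mean-absolute-difference measure, the `threshold` parameter, the choice to compare against every stored entry, and the names `descriptor_distance` and `update_catalogue` are all assumptions made for illustration.

```python
from typing import List, Sequence

Descriptor = Sequence[float]  # hypothetical compact summary of one image


def descriptor_distance(a: Descriptor, b: Descriptor) -> float:
    """Mean absolute difference between two equal-length descriptors (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def update_catalogue(
    catalogue: List[Descriptor], descriptor: Descriptor, threshold: float
) -> bool:
    """Store the descriptor only if it differs enough from everything stored so far.

    Returns True if the descriptor was added, False if it was discarded.
    An empty catalogue always accepts the first descriptor.
    """
    is_novel = all(
        descriptor_distance(descriptor, stored) > threshold for stored in catalogue
    )
    if is_novel:
        catalogue.append(descriptor)
    return is_novel
```

In this sketch a larger `threshold` yields a sparser catalogue, since more incoming views are treated as already seen.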

Right: the system moves around the world. Left: a new image is seen and a decision is made whether to add it to the visual catalogue (top left). This catalogue is the pictorial map that can then be used to localise the system later. *Image credit: University of Bristol*

As the PPA device is moved around, for example by a person or a robot, it collects a visual catalogue of views. When the device is in localisation mode, any new image can then be matched against this catalogue.

Importantly, no images go out of the PPA, only the key data that indicates where it is with respect to the visual catalogue. This makes the system more energy efficient and also helps with privacy.

During localisation, the incoming image is compared to the visual catalogue (Descriptor database) and, if a match is found, the system reports where it is (Predicted node, small white rectangle at the top) relative to the catalogue. Note how the system is able to match images even if there are changes in illumination or moving objects such as people.
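The matching step during localisation can be sketched in the same spirit. This is a hedged illustration under stated assumptions, not the method that runs on the PPA: the nearest-neighbour search, the mean-absolute-difference measure, the `max_distance` cut-off, and the names `match_distance` and `localise` are all hypothetical.

```python
from typing import List, Optional, Sequence

Descriptor = Sequence[float]  # hypothetical compact summary of one image


def match_distance(a: Descriptor, b: Descriptor) -> float:
    """Mean absolute difference between two equal-length descriptors (assumed measure)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def localise(
    catalogue: List[Descriptor], descriptor: Descriptor, max_distance: float
) -> Optional[int]:
    """Return the index of the closest catalogue entry, or None if nothing is close enough."""
    best_index: Optional[int] = None
    best_distance = float("inf")
    for index, stored in enumerate(catalogue):
        d = match_distance(descriptor, stored)
        if d < best_distance:
            best_index, best_distance = index, d
    # Report a node only when the best match is within the assumed cut-off.
    if best_index is not None and best_distance <= max_distance:
        return best_index
    return None
```

Only the returned index leaves the function, which mirrors the point above that the output is a position relative to the catalogue rather than an image.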

The team believes that this type of artificial visual system, developed for visual processing and not necessarily to record images, is a first step towards making more efficient smart systems that can use visual information to understand and move in the world. Tiny, energy-efficient robots or smart glasses doing useful things for the planet and for people will need spatial understanding, which will come from being able to make and use maps.

The research has been partially funded by the Engineering and Physical Sciences Research Council (EPSRC), by a CONACYT scholarship to Hector Castillo-Elizalde, and by a CSC scholarship to Yanan Liu.