AI advance could make robots better at exploring

A computer program that can solve 1980s platform games could help improve robot intelligence.

Feb. 25, 2021

Scientists have come up with a computer program that can master a variety of 1980s exploration games, paving the way for more self-sufficient robots.

They created a family of algorithms (software-based instructions for solving a problem) able to complete classic Atari games, such as Pitfall.

Until now, these scrolling platform games have been challenging to solve using artificial intelligence (AI).

The algorithms could help robots better navigate real-world environments.

These might include disaster zones, where robots could be sent out to search for survivors, or even just the average home.

Navigating such environments remains a core challenge in the fields of robotics and AI.

A number of the games used in the study require the player to explore mazes containing rewards, obstacles and hazards. The family of algorithms, known collectively as Go-Explore, produced substantial improvements on previous attempts to solve games such as the wittily titled Montezuma's Revenge (1984), Freeway (1981) and the aforementioned Pitfall (1982).

The work falls into an area of AI research known as reinforcement learning.

"Our method is indeed pretty simple and straightforward, although that is often the case with scientific breakthroughs," researchers Adrien Ecoffet, Joost Huizinga, Jeff Clune said in response to questions sent over email.

"The reason our approach hadn't been considered before is that it differs strongly from the dominant approach that has historically been used for addressing these problems in the reinforcement learning community, called 'intrinsic motivation'. In intrinsic motivation, instead of dividing exploration into returning and exploring like we do, the agent is simply rewarded for discovering new areas."

But there's a problem with the intrinsic motivation approach, the scientists say. While searching for a solution, the algorithm can "forget" about promising areas that still need to be explored. This is known as detachment.
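
To make the intrinsic motivation idea concrete, here is a minimal sketch in Python. It assumes a count-based novelty bonus, which is one common way of rewarding an agent for reaching unfamiliar places and is not necessarily the formulation the researchers compared against. The function name `novelty_bonus` and the grid coordinates are hypothetical, chosen only for illustration.

```python
from collections import Counter
import math

def novelty_bonus(visit_counts, state, scale=1.0):
    """Hypothetical count-based bonus: larger for rarely visited states."""
    visit_counts[state] += 1
    return scale / math.sqrt(visit_counts[state])

counts = Counter()
first = novelty_bonus(counts, (0, 0))   # first visit to a cell
second = novelty_bonus(counts, (0, 0))  # repeat visit to the same cell
other = novelty_bonus(counts, (0, 1))   # first visit to a different cell
```

The bonus shrinks each time a cell is revisited, so nothing in this scheme actively steers the agent back to a half-explored area once it has wandered off. That is roughly the gap the scientists describe as detachment.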

To overcome this issue, the algorithms developed by Adrien Ecoffet and colleagues build up archives of areas they have visited to help them remember where they have been. This ensures the algorithm can return to a promising intermediate stage of the game as a point from which to explore further.

But there's another problem with previous approaches. "They rely on random actions that may be taken at any point in time, including while the agent is still going towards the area that actually needs to be explored," the scientists told BBC News.

"If you have an environment where your actions have to be accurate and precise, such as a game with many hazards that can instantly kill you, such random actions can prevent you from reaching the area you actually want to explore."

The technical term for this is "derailment".
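
A rough back-of-the-envelope calculation shows why derailment matters. The sketch below assumes, purely for illustration, that each step on the way back has an independent chance of being replaced by a random action and that a single random action is enough to spoil the trip. Both the function name and the numbers are hypothetical and do not come from the study.

```python
def chance_of_clean_return(steps_needed, random_action_prob):
    """Probability that no step on the way back is replaced by a random action."""
    return (1.0 - random_action_prob) ** steps_needed

# Illustrative numbers only: 100 precise moves, 5% chance of a random move per step.
chance = chance_of_clean_return(100, 0.05)
print(f"{chance:.4f}")
```

Under these simplified assumptions the agent makes it back undisturbed well under 1% of the time, even though each individual step is rarely disrupted.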

The new method, described in the prestigious journal Nature, resolves the derailment problem by separating the concept of returning to previously visited areas from the process of exploring new ones.
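
The archive and the split between returning and exploring can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the researchers' implementation: the world is a hypothetical one-dimensional corridor, the environment is deterministic so that replaying stored actions always leads back to the same cell, and the rule for choosing which cell to return to (favouring cells picked less often) is a simple stand-in.

```python
import random

# Hypothetical toy world: a corridor of cells 0..GOAL; actions move -1 or +1.
GOAL = 30
ACTIONS = (-1, +1)

def step(position, action):
    """Deterministic transition, clamped to the ends of the corridor."""
    return max(0, min(GOAL, position + action))

def replay(actions, start=0):
    """Return phase: replay stored actions exactly, with no random moves."""
    position = start
    for action in actions:
        position = step(position, action)
    return position

def go_explore_sketch(iterations=2000, explore_steps=5, seed=0):
    rng = random.Random(seed)
    archive = {0: []}       # cell -> shortest known list of actions reaching it
    times_chosen = {0: 0}   # how often each cell was picked to return to
    for _ in range(iterations):
        # Select: prefer archived cells that have been chosen less often.
        cells = list(archive)
        weights = [1.0 / (1 + times_chosen[c]) for c in cells]
        cell = rng.choices(cells, weights=weights)[0]
        times_chosen[cell] += 1
        # Return: go back to the chosen cell without exploring on the way.
        trajectory = list(archive[cell])
        position = replay(trajectory)
        # Explore: only now take random actions, archiving anything new
        # or anything reached by a shorter route than before.
        for _ in range(explore_steps):
            action = rng.choice(ACTIONS)
            position = step(position, action)
            trajectory.append(action)
            if position not in archive or len(trajectory) < len(archive[position]):
                archive[position] = list(trajectory)
                times_chosen.setdefault(position, 0)
        if GOAL in archive:
            break
    return archive

archive = go_explore_sketch()
```

Because random actions are taken only after the return phase has finished, the trip back to a remembered cell cannot be derailed, and because every visited cell stays in the archive, a promising area is not forgotten.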

The team members, who carried out their work while employed by Uber AI Labs in California, said the work lends itself to algorithms used for guiding robots in the home or in industrial settings.

They say that Go-Explore is designed to tackle longstanding problems in reinforcement learning. "Think about asking a robot to get you a coffee: there is virtually no chance it will happen to operate the coffee machine by just acting randomly."

The scientists added: "In addition to robotics, Go-Explore has already seen some experimental research in language learning, where an agent learns the meaning of words by exploring a text-based game, and for discovering potential failures in the behaviour of a self-driving car."

