
Paper addressing multi-agent AI problems wins top accolade


A research team from the Department of Computer Science and the Department of Engineering has won the outstanding student paper award at one of the premier venues for publishing AI research: AAAI-18, the Association for the Advancement of Artificial Intelligence’s conference.

The winning paper 'Counterfactual Multi-Agent Policy Gradients' (COMA) presents a method which could soon make it possible to deploy learning multi-agent systems in the real world.

COMA differs from a lot of artificial intelligence research by focussing on multi-agent problems, rather than single-agent settings and two-player games. There are many challenging multi-agent problems to tackle, ranging from self-driving cars to drones and even social interactions. In many of these applications, a number of independent entities need to be able to take independent actions based on local observations in order to achieve a common goal.

For example, in a fleet of search-and-rescue drones, each drone typically needs to be able to decide on its best course of action using only local information. This is commonly referred to as 'decentralised execution'. However, the policies themselves can often be designed in a centralised fashion, for example when they are trained in a simulator which has access to the observations and actions of all agents. The research team believes that this domain of centralised training and decentralised execution is one of the key avenues for successfully developing and deploying multi-agent systems in the real world.

One of the great challenges when training multi-agent policies is the credit assignment problem. Just as in a football team, the reward achieved depends on the actions of all of the different agents. Given that all agents are constantly improving their policies, it is difficult for any given agent to evaluate the impact of its own actions on the overall performance of the team. To address this issue, the research team (Computer Science’s Jakob N. Foerster, Gregory Farquhar and Professor Shimon Whiteson, with Engineering’s Triantafyllos Afouras and Nantas Nardelli) developed the COMA method. In the paper, the researchers model StarCraft unit management as a challenging cooperative multi-agent problem. The team’s training method outperforms existing methods and achieves high win rates against the StarCraft bot.
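
To illustrate the credit assignment idea, the sketch below computes a counterfactual advantage for one agent: the value of the joint action actually taken, minus the expected value had that agent alone acted differently while its teammates' actions stayed fixed. This is a minimal sketch under stated assumptions, not the team's implementation. The lookup table `Q_TABLE`, the function name `counterfactual_advantage` and the two-agent toy values are all hypothetical, and a real system would use a learned centralised critic in place of the table.

```python
# Hypothetical toy values: a centralised value estimate for every joint action
# of two agents with two actions each. Only agent 0's action changes the value.
Q_TABLE = {
    (0, 0): 0.0,
    (0, 1): 0.0,
    (1, 0): 1.0,
    (1, 1): 1.0,
}


def counterfactual_advantage(q_table, policy, joint_action, agent):
    """Estimate how much `agent`'s own action contributed to the team value.

    q_table      -- maps joint-action tuples to a centralised value estimate
    policy       -- probabilities over `agent`'s own actions (sums to 1)
    joint_action -- tuple of the actions actually taken by all agents
    agent        -- index of the agent whose contribution is being assessed
    """
    # Baseline: average value over this agent's alternative actions,
    # weighted by its policy, with every other agent's action held fixed.
    baseline = 0.0
    for alt_action, prob in enumerate(policy):
        alt_joint = list(joint_action)
        alt_joint[agent] = alt_action
        baseline += prob * q_table[tuple(alt_joint)]

    # Advantage: value of what actually happened minus the baseline.
    return q_table[tuple(joint_action)] - baseline


uniform = [0.5, 0.5]
adv_agent0 = counterfactual_advantage(Q_TABLE, uniform, (1, 1), agent=0)
adv_agent1 = counterfactual_advantage(Q_TABLE, uniform, (1, 1), agent=1)
```

In this toy table the team value depends only on agent 0, so agent 0 receives a positive advantage (0.5) for choosing action 1, while agent 1 receives zero because switching its action would not have changed the outcome.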

The team’s certificate will be presented at AAAI-18 on 6 February.

The full paper is at: