Non-Markov Reinforcement Learning

Supervisors

Suitable for

MSc in Advanced Computer Science
Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C

Abstract

There are several projects available on the theme “Non-Markov Reinforcement Learning”. Please get in touch for the most up-to-date information.

Non-Markov Reinforcement Learning studies how an agent can learn to behave optimally in environments where the history of past observations is relevant for predicting how the environment will respond to the agent’s actions. This is a more general setting than that of Markov Decision Processes (MDPs), where the current observation suffices. The challenge is to develop effective methods for learning and decision-making based on the sequence of past observations. Please see the references below for an introduction to the problem.
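As a concrete illustration, consider the following toy sketch (an illustrative example, not taken from the references): an environment whose reward depends on the parity of all observations seen so far, a property of the whole history that no single observation reveals. As in Regular Decision Processes, the agent can recover Markov behaviour by tracking the state of a small finite automaton over the history rather than conditioning on the last observation alone.

    # Minimal sketch of a non-Markov environment (toy example; names and
    # dynamics are illustrative assumptions, not from the cited papers).
    # Reward depends on the parity of all past observations, so an agent
    # must track a two-state automaton over the history to act optimally.

    import random

    class ParityEnv:
        """Reward for a guess depends on the parity of observations so far."""

        def __init__(self):
            self.parity = 0  # hidden automaton state: parity of past observations

        def reset(self):
            self.parity = 0
            return self._emit()

        def _emit(self):
            obs = random.randint(0, 1)
            self.parity ^= obs
            return obs

        def step(self, action):
            # The action is a guess of the current parity; reward 1 iff correct.
            reward = 1.0 if action == self.parity else 0.0
            return self._emit(), reward

    # A history-aware agent: it simulates the automaton on the observation
    # sequence, turning the non-Markov problem into a Markov one over the
    # automaton state.
    env = ParityEnv()
    tracked_parity = env.reset()  # parity of the first emitted observation
    total = 0.0
    for _ in range(1000):
        obs, reward = env.step(tracked_parity)
        tracked_parity ^= obs
        total += reward
    print(f"average reward: {total / 1000:.2f}")  # ~1.0; a memoryless guesser gets ~0.5

A memoryless agent, which sees only the latest observation, cannot do better than chance here; the automaton state is exactly the summary of the history that makes the problem Markov again.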

There are projects focusing on:

(i) performance bounds, e.g., PAC-style bounds and regret bounds,

(ii) efficient algorithms,

(iii) novel approaches to overcome the limitations of existing algorithms,

(iv) implementation and experimental evaluation.

The specific topic will be identified together with the candidate, based on the candidate’s interests and background.

References:

Alessandro Ronca, Giuseppe De Giacomo: Efficient PAC Reinforcement Learning in Regular Decision Processes. IJCAI 2021.

Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi: Provably Efficient Offline Reinforcement Learning in Regular Decision Processes. NeurIPS 2023.

Pre-requisites: Some familiarity with Reinforcement Learning, Automata, and Computational Learning Theory.