Bayesian Reinforcement Learning: Robustness and Safe Training

Supervisor

Suitable for

Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C

Abstract

In this project we shall build on recent work on "Safe Learning" [2], which adapts classical RL algorithms to synthesise policies that satisfy complex tasks or objectives whilst training safely, that is, without violating given safety requirements. Tasks and objectives for RL-based synthesis can be goals expressed as logical formulae, and are thus richer than standard reward-based goals. We plan to frame recent work by OXCAV [2] in the context of Bayesian RL, as well as to leverage modern robustness results, as in [3]. We shall pursue both model-based and model-free approaches.

References

[2] M. Hasanbeig, A. Abate and D. Kroening, "Cautious Reinforcement Learning with Logical Constraints," Proceedings of AAMAS 2020, pp. 483-491, 2020.

[3] B. Recht, "A Tour of Reinforcement Learning: The View from Continuous Control," Annual Review of Control, Robotics, and Autonomous Systems, Vol. 2, 2019.
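
Purely as an illustration of the logically-constrained RL idea described in the abstract, the minimal Python sketch below runs tabular Q-learning over the product of an environment state and a small task automaton, with cautious ("shielded") exploration. Everything here is invented for illustration: the grid world, the automaton, the reward scheme and all hyperparameters are hypothetical, and this is a toy sketch of the general idea, not the algorithm of [2].

import random
from collections import defaultdict

# Hypothetical 4x4 grid world (invented for illustration): start at (0, 0),
# goal at (3, 3), and unsafe cells that must never be entered.
SIZE, START, GOAL = 4, (0, 0), (3, 3)
UNSAFE = {(1, 1), (2, 2)}
ACTIONS = ((0, 1), (0, -1), (1, 0), (-1, 0))

# A three-state automaton tracking the task "always avoid UNSAFE, eventually
# reach GOAL": 0 = in progress, 1 = accepted, 2 = violated (1, 2 absorbing).
def automaton_step(q, s):
    if q != 0:
        return q
    if s in UNSAFE:
        return 2
    if s == GOAL:
        return 1
    return 0

def env_step(s, a):
    x, y = s[0] + a[0], s[1] + a[1]
    return (min(max(x, 0), SIZE - 1), min(max(y, 0), SIZE - 1))

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.2

for _ in range(5000):
    s, q = START, 0
    for _ in range(50):                       # cap episode length
        if q != 0:
            break                             # task accepted or violated
        # Cautious exploration ("shield"): only consider actions whose
        # immediate successor keeps the automaton out of the violating state.
        safe = [a for a in ACTIONS
                if automaton_step(q, env_step(s, a)) != 2] or list(ACTIONS)
        a = (random.choice(safe) if random.random() < eps
             else max(safe, key=lambda a: Q[(s, q, a)]))
        s2 = env_step(s, a)
        q2 = automaton_step(q, s2)
        reward = {1: 1.0, 2: -1.0}.get(q2, 0.0)   # reward induced by automaton
        target = reward if q2 != 0 else reward + gamma * max(
            Q[(s2, q2, b)] for b in ACTIONS)
        Q[(s, q, a)] += alpha * (target - Q[(s, q, a)])
        s, q = s2, q2

path, s = [START], START
for _ in range(20):                           # roll out the greedy policy
    if s == GOAL:
        break
    s = env_step(s, max(ACTIONS, key=lambda a: Q[(s, 0, a)]))
    path.append(s)
print("greedy path:", path)

The design point this toy mirrors is that both the reward signal and the notion of safe exploration are derived from the automaton tracking the logical task, rather than being hand-crafted per environment.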