Equivalence for Markov Decision Processes


Nathanaël Fijalkow


Research Fellow at the Alan Turing Institute

Queen Mary University of London,
Thursday 6th, 2017

Why?


Probabilistic Bisimulation

(due to Larsen, Skou 91)

Bisimulation for Non-Deterministic Systems


Labelled Transition Systems: $\Delta : Q \times A \to \mathcal{P}(Q)$

Bisimulation game: Duplicator claims that $p \equiv q$ We say that $p$ and $q$ are bisimilar if Duplicator can play forever


Coinductive definition: If $p \equiv q$, then for all $a$ in $A$, for all $p' \in \Delta(p,a)$, there exists $q' \in \Delta(q,a)$ such that $p' \equiv q'$

Bisimulation for Labelled Markov Chains


Labelled Markov Chains: $\Delta : Q \times A \to \mathcal{D}(Q)$

Coinductive definition: If $p \equiv q$, then for all $a$ in $A$, for all $C$ equivalence class of $\equiv$, $$\Delta(p,a,C) = \Delta(q,a,C)$$


Equivalent game: Duplicator claims that $p \equiv q$
  • Spoiler: pick $a$ in $A$ and $C \subseteq Q$ such that $\Delta(p,a,C) \neq \Delta(q,a,C)$
  • Duplicator: $p'$ in $C$, $q'$ not in $C$ and claim that $p' \equiv q'$
Duplicator claims they are bisimilar Duplicator claims they are bisimilar Spoiler picks a set of green states Duplicator claims they are bisimilar Spoiler picks a set of green states Duplicator picks a green state and a non-green state Duplicator claims they are bisimilar Spoiler picks a set of green states Duplicator picks a green state and a non-green state Spoiler's move was stupid! Duplicator claims they are bisimilar Spoiler picks a set of green states Duplicator claims they are bisimilar Spoiler picks a set of green states Duplicator picks a green state and a non-green state

25 years of research on Probabilistic Bisimulation


How to prove that two systems are not bisimilar?


Find a formula satisfying one but not the other!

Logical Characterisation

\[ \phi\ =\ \top\ \mid\ \phi \wedge \phi\ \mid\ \langle a \rangle_{> p} \phi\ \] Theorem: (Desharnais, Edalat, Panangaden 98)
Assumptions: the state space is a Polish space and the set of labels is countable

Two states are bisimiliar if, and only if, they satisfy the same formulas of this logic

Why countability of the set of labels?



Logical Characterisation, take 2

\[ \phi\ =\ \top\ \mid\ \phi \wedge \phi\ \mid\ \langle a \rangle_{> p} \phi\ \] Theorem: (F., Klin, Panangaden)
Assumptions: the state space is a Polish space and the transition function is continuous

Two states are bisimiliar if, and only if, they satisfy the same formulas of this logic

Reasoning on Traces


Key Notion: Bisimulation on Distributions


(due to Hermanns, Krčál, Křetı́nský 2014)

Defined for MDP: $\Delta : Q \times A \to \text{Conv}(\mathcal{D}(Q))$

If $\delta \equiv \delta'$, then for all $a$ in $A$:

From Traces to Bisimulation on Distributions

Theorem (folklore): For Markov Chains, bisimilarity on distributions is the same as trace equivalence.

Theorem (F., Kiefer, Shirmohammadi, FoSSaCS'16):

Logical Characterisations

\[ \phi\ =\ \top\ \mid\ \phi \wedge \phi\ \mid\ \langle a \rangle \phi\ \mid\ \Sigma_{\ge p} \text{ for } p \in \mathbb{Q} \] Theorem: (F., Klin @ Simons) Two distributions are bisimiliar if, and only if, they satisfy the same formulas of this logic

\[ \phi\ =\ \Sigma_{\ge p} \wedge \bigwedge_{a \in S} \langle a \rangle \phi \text{ for } p \in \mathbb{Q} \text{ and } S \subseteq A \text{ finite.} \] Theorem: (F., Klin @ Simons) Two distributions trace refine each other if, and only if, they satisfy the same formulas of this fragment