
Explainable Neural Networks

Supervisor

Suitable for

MSc in Advanced Computer Science

Abstract

Despite the increasing success of deep neural models, their general lack of interpretability remains a major drawback, with far-reaching consequences in safety-critical applications such as healthcare and the legal domain. Several directions for explaining neural models have recently been introduced, such as feature-based explanations and natural language explanations. However, several major open questions remain, such as:

Do explanations faithfully describe the decision-making processes of the models they aim to explain?

Can explanations of the ground-truth label, provided during training, increase a model's robustness and generalization capabilities?

Can we do few-shot learning of natural language explanations?

What are the advantages and disadvantages of each of the different types of explanations (e.g., feature-based, example-based, natural language, surrogate models)?

Students will be able to pick one of these open questions or propose their own. The projects will also be co-supervised by Oana-Maria Camburu, a postdoctoral researcher with a strong background and contributions in this area.

Prerequisites

Strong coding skills (preferably in a deep learning framework such as PyTorch or TensorFlow) and deep learning knowledge.
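
To make the explanation types mentioned above more concrete, the following is a minimal, illustrative sketch of one feature-based explanation method, input-gradient (saliency) attribution, in PyTorch. The toy model, input dimensions, and data are hypothetical placeholders chosen for illustration only and are not part of the project description.

# Minimal sketch of a feature-based explanation (input-gradient saliency).
# The model and input below are hypothetical stand-ins for a trained network.
import torch
import torch.nn as nn

# Toy classifier used only for illustration.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

x = torch.randn(1, 10, requires_grad=True)  # one input example
logits = model(x)
pred = logits.argmax(dim=-1).item()         # predicted class

# Gradient of the predicted-class score with respect to the input:
# features with a larger absolute gradient are, by this heuristic, more influential.
logits[0, pred].backward()
saliency = x.grad.abs().squeeze(0)

print(f"predicted class: {pred}")
print(f"feature saliency: {saliency}")

This heuristic is only one member of the explanation families listed above; whether such attributions faithfully reflect the model's decision process is exactly the kind of open question the project targets.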
