Explainable Neural Networks
Supervisor
Suitable for
Abstract
Despite the increasing success of deep neural models, their general lack of interpretability remains a major drawback, with far-reaching consequences in safety-critical domains such as healthcare and law. Several approaches to explaining neural models have recently been introduced, such as feature-based explanations and natural language explanations. However, several major questions remain open, such as:
Do explanations faithfully describe the decision-making processes of the models that they aim to explain?
Can explanations for the ground-truth label that are provided during training increase model robustness and generalization capabilities?
Can we do few-shot learning of natural language explanations?
What are the advantages and disadvantages of each type of explanation (e.g., feature-based, example-based, natural language, surrogate models)?
Students may pick one of these open questions or propose their own. The projects will also be co-supervised by Oana-Maria Camburu, a postdoctoral researcher with a strong background and contributions in this area.
Prerequisites: strong coding skills (preferably in a deep learning framework such as PyTorch or TensorFlow) and deep learning knowledge.
References:
[1] https://papers.nips.cc/paper/2018/file/4c7a167bb329bd92580a99ce422d6fa6-Paper.pdf
[2] https://www.aclweb.org/anthology/2020.acl-main.771/
[3] https://arxiv.org/abs/2004.14546
[4] https://arxiv.org/abs/1910.02065
[5] https://dl.acm.org/doi/abs/10.1145/3313831.3376219