Safety Assurance for Deep Neural Networks

Supervisor

Suitable for

MSc in Computer Science
Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C

Abstract

Professor Marta Kwiatkowska is happy to supervise projects in the area of safety assurance and automated verification for deep learning. This is a new research topic, initiated with the paper at http://qav.comlab.ox.ac.uk/bibitem.php?key=HKW+17; see also the video at https://www.youtube.com/watch?v=XHdVnGxQBfQ.

Below are some concrete project proposals:

  • Safety Testing of Deep Neural Networks. Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. A recent paper (https://arxiv.org/abs/1710.07859) proposed a method for searching for adversarial examples that uses the SIFT feature extraction algorithm. The method is based on a two-player turn-based stochastic game, in which the first player's objective is to find an adversarial example by manipulating the features, and the search proceeds through Monte Carlo tree search. It was evaluated on several networks, including YOLO object recognition from camera images. This project aims to adapt these techniques to object detection in lidar point clouds, for example with the Vote3D detector (http://ori.ox.ac.uk/efficient-object-detection-from-3d-point-clouds/), utilising the Oxford Robotics Institute dataset (http://ori.ox.ac.uk/datasets/). A simplified sketch of the feature-guided search idea is given after this list.
  • Safety Testing of End-to-end Neural Network Controllers. NVIDIA has created a deep learning system for end-to-end driving called PilotNet (https://devblogs.nvidia.com/parallelforall/explaining-deep-learning-self-driving-car/). It takes camera images as input and produces a steering angle. The network is trained on data from cars driven by human drivers, but it can also be trained using the Udacity simulator. Safety concerns have been raised for neural network controllers because of their vulnerability to adversarial examples: an incorrect steering angle may force the car off the road. A recent paper (https://arxiv.org/abs/1710.07859) proposed a method for searching for adversarial examples that uses the SIFT feature extraction algorithm. The method is based on a two-player turn-based stochastic game, in which the first player's objective is to find an adversarial example by manipulating the features, and the search proceeds through Monte Carlo tree search. This project aims to use these techniques to evaluate the robustness of PilotNet to adversarial examples. A sketch of a PilotNet-style architecture is given after this list.
  • Universal L0 Perturbations for Deep Neural Networks (with Wenjie Ruan). As deep neural networks (DNNs) are deployed in autonomous driving systems, ensuring their safety, security and robustness is essential. Unfortunately, DNNs are vulnerable to adversarial examples: slightly perturbing an image may cause a misclassification, see CleverHans (https://github.com/tensorflow/cleverhans). Such perturbations can be generated by adversarial attackers. Most current adversarial perturbations are designed based on the L1, L2 or L-infinity norms. Recent work demonstrated the advantages of perturbations based on the L0 norm (http://nicholas.carlini.com/papers/2017_sp_nnrobustattacks.pdf). Given a well-trained deep neural network, this project aims to demonstrate the existence of a universal (image-agnostic) L0-norm perturbation that causes most images to be misclassified. The work will involve designing a systematic algorithm for computing universal perturbations and empirically analysing these to show whether they generalize well across neural networks. A simple greedy baseline is sketched after this list. This project is suited to students familiar with neural networks and Python programming.
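
The feature-guided method referred to in the first two proposals frames adversarial example search as a two-player stochastic game solved with Monte Carlo tree search. The following is a heavily simplified sketch of the underlying idea only: it uses OpenCV's SIFT keypoints to pick salient pixels and a plain random search in place of the game/MCTS formulation, and it assumes a hypothetical classifier interface model.predict(image) -> label (a stand-in for YOLO or Vote3D, which are object detectors rather than classifiers).

    # Simplified sketch of feature-guided adversarial search (not the full
    # game-based MCTS method of https://arxiv.org/abs/1710.07859).
    # Assumes `model` exposes a hypothetical predict(image) -> label interface.
    import cv2
    import numpy as np

    def sift_keypoints(image, max_kp=50):
        """Detect SIFT keypoints; their locations guide which pixels to perturb."""
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        sift = cv2.SIFT_create()
        kps = sorted(sift.detect(gray, None), key=lambda k: -k.response)[:max_kp]
        return [(int(k.pt[0]), int(k.pt[1])) for k in kps]

    def feature_guided_attack(model, image, budget=1000, delta=32, rng=None):
        """Randomly perturb pixels around SIFT keypoints until the label changes."""
        rng = rng or np.random.default_rng(0)
        original_label = model.predict(image)
        keypoints = sift_keypoints(image)
        if not keypoints:
            return None
        perturbed = image.astype(np.int16)
        for _ in range(budget):
            x, y = keypoints[rng.integers(len(keypoints))]
            c = rng.integers(image.shape[2])
            perturbed[y, x, c] = np.clip(
                perturbed[y, x, c] + rng.choice([-delta, delta]), 0, 255)
            candidate = perturbed.astype(np.uint8)
            if model.predict(candidate) != original_label:
                return candidate       # adversarial example found
        return None                    # none found within the budget

The actual projects would replace the random pixel changes with the game-based MCTS search from the paper, and would target an object detector operating on camera or lidar data rather than a plain image classifier.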
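
For the PilotNet project, NVIDIA's publications describe a convolutional network that maps a camera image to a single steering angle. The sketch below follows the reported layer structure (normalisation, five convolutional layers, three fully connected layers); the input preprocessing and training settings are assumptions and would need to match the network actually under test.

    # Sketch of a PilotNet-style steering-angle regressor. Layer sizes follow
    # NVIDIA's published description; preprocessing details are assumptions.
    import tensorflow as tf
    from tensorflow.keras import layers

    def build_pilotnet(input_shape=(66, 200, 3)):
        model = tf.keras.Sequential([
            layers.Input(shape=input_shape),
            layers.Lambda(lambda x: x / 127.5 - 1.0),   # scale pixels to [-1, 1]
            layers.Conv2D(24, 5, strides=2, activation='relu'),
            layers.Conv2D(36, 5, strides=2, activation='relu'),
            layers.Conv2D(48, 5, strides=2, activation='relu'),
            layers.Conv2D(64, 3, activation='relu'),
            layers.Conv2D(64, 3, activation='relu'),
            layers.Flatten(),
            layers.Dense(100, activation='relu'),
            layers.Dense(50, activation='relu'),
            layers.Dense(10, activation='relu'),
            layers.Dense(1)                             # steering angle (regression)
        ])
        model.compile(optimizer='adam', loss='mse')
        return model

Robustness evaluation would then measure how much the predicted steering angle changes under feature-guided perturbations of input frames, for example frames collected from the Udacity simulator.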
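
For the universal L0 project, one simplistic baseline is to greedily accumulate a small set of (pixel, value) edits, applied identically to every image, that flips the classification of as many training images as possible. The loop below illustrates this baseline only; it is not the algorithm to be designed in the project, and the model.predict(image) -> label interface is again a hypothetical assumption.

    # Greedy baseline for a universal L0 perturbation: a fixed set of pixel edits
    # applied to every image. Assumes uint8 images of identical shape and a
    # hypothetical model.predict(image) -> label interface.
    import numpy as np

    def apply_perturbation(image, edits):
        """Apply a sparse list of (row, col, channel, value) edits to an image."""
        out = image.copy()
        for r, c, ch, v in edits:
            out[r, c, ch] = v
        return out

    def fooling_rate(model, images, labels, edits):
        """Fraction of images whose predicted label changes under the edits."""
        flipped = sum(model.predict(apply_perturbation(img, edits)) != lab
                      for img, lab in zip(images, labels))
        return flipped / len(images)

    def greedy_universal_l0(model, images, labels, max_pixels=20, trials=200, rng=None):
        """Greedily add the pixel edit that most increases the fooling rate."""
        rng = rng or np.random.default_rng(0)
        h, w, ch = images[0].shape
        edits, best_rate = [], 0.0
        for _ in range(max_pixels):
            best_edit = None
            for _ in range(trials):
                cand = (rng.integers(h), rng.integers(w), rng.integers(ch),
                        rng.integers(256))
                rate = fooling_rate(model, images, labels, edits + [cand])
                if rate > best_rate:
                    best_rate, best_edit = rate, cand
            if best_edit is None:
                break                   # no candidate improved the fooling rate
            edits.append(best_edit)
        return edits, best_rate

Generalisation across networks could then be assessed by measuring the fooling rate of the same edit set on a second, independently trained model.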