Security and Privacy of ML
Summary
This course considers the new threat classes that arise when machine learning is incorporated into a system. It explores potential vulnerabilities, modes of attack, and opportunities for defence, detection, and reaction, and it provides an approach to evaluating the robustness of particular ML approaches. Adversarial machine learning is a central theme of the course, including attack vectors such as evasion, poisoning, and model extraction/inversion.
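As a flavour of the evasion attacks just mentioned, below is a minimal sketch of the fast gradient sign method (FGSM) in PyTorch. It is an illustration only: the model, inputs, and epsilon value are assumptions, not course material.

```python
# Minimal FGSM evasion sketch; `model`, `x`, `y`, and `epsilon` are
# illustrative assumptions (any differentiable PyTorch classifier works).
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Perturb x by one signed-gradient step to push the model towards error."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Move each input component epsilon in the direction that increases the loss.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # stay in the valid input range
```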
Objectives
The objectives of this course are to:
- explore the vulnerabilities of deep learning architectures;
- learn about defence mechanisms such as adversarial training, input sanitisation, and certified defences, as well as prompt injection attacks and strategies for making AI systems robust against manipulation via prompt engineering (a worked adversarial-training step is sketched after this list);
- study applications and analyse the implications of these threats for privacy and safety, to inform wider business discussions of risk.
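Building on the FGSM sketch above, here is a hedged sketch of one adversarial-training step, the defence named in the objectives: train on perturbed inputs so the model learns to resist them. The model, optimizer, and batch are assumed placeholders.

```python
# One adversarial-training step, reusing the fgsm_attack sketch above;
# `model`, `optimizer`, and the batch (x, y) are assumed, not prescribed.
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    # Craft adversarial examples against the current parameters, then
    # minimise the loss on the perturbed batch instead of the clean one.
    x_adv = fgsm_attack(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```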
Contents
- Privacy attacks on ML models: Threat modelling in ML deployment contexts; Attacks on training data, including membership inference attacks (inferring whether a given instance is in the training data) and model inversion attacks (reconstructing the training data); Model stealing attacks; White-box vs. black-box attack settings (see the membership inference sketch after this list).
- Privacy-preserving ML: Differential Privacy for ML Training; Federated Learning with Differential Privacy; Homomorphic Encryption and Secure Multi-Party Computation (for private training and private inference). A simplified differentially private training step is also sketched below.
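As an illustration of the membership inference topic above, here is a minimal loss-threshold attack sketch: models typically fit their training points more tightly, so an unusually low loss hints at membership. The threshold value is an assumption; in practice it is calibrated, for example with shadow models trained on similar data.

```python
# Loss-threshold membership inference sketch; `model`, the batch (x, y), and
# `threshold` are illustrative assumptions.
import torch
import torch.nn.functional as F

@torch.no_grad()
def guess_membership(model, x, y, threshold=0.5):
    """Predict 'training member' when the per-example loss is unusually low."""
    losses = F.cross_entropy(model(x), y, reduction="none")
    return losses < threshold  # True = guessed to be in the training set
```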
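And as an illustration of Differential Privacy for ML Training, below is a simplified, assumption-laden version of one DP-SGD step: clip each example's gradient, then add calibrated Gaussian noise. A real implementation would use a maintained library such as Opacus and track the cumulative privacy budget across steps.

```python
# Simplified DP-SGD step; `clip_norm` and `noise_mult` are illustrative, and
# `xs`, `ys` are assumed to hold per-example inputs and scalar labels.
import torch
import torch.nn.functional as F

def dp_sgd_step(model, optimizer, xs, ys, clip_norm=1.0, noise_mult=1.0):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):
        model.zero_grad()
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()  # gradient for this single example
        # Clip each example's gradient so no single record dominates the update.
        total = torch.sqrt(sum(p.grad.norm() ** 2 for p in model.parameters()))
        scale = (clip_norm / (total + 1e-6)).clamp(max=1.0)
        for s, p in zip(summed, model.parameters()):
            s += p.grad * scale
    for s, p in zip(summed, model.parameters()):
        # Gaussian noise calibrated to the clipping norm yields (eps, delta)-DP.
        p.grad = (s + torch.randn_like(s) * noise_mult * clip_norm) / len(xs)
    optimizer.step()
```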
Requirements
Required background knowledge includes the fundamentals of machine learning, security, and privacy.