Augmenting Biometric Datasets with Autoencoders

Supervisor

Suitable for

MSc in Advanced Computer Science
Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C
Computer Science, Part B

Abstract

Co-supervised by Systems Security Lab

User authentication is of great importance in the age of digital services, but passwords don't work well at scale. Biometrics are increasingly seen as the solution, especially behavioural biometrics, which extract user traits implicitly while the user performs other tasks; examples include gait, keystroke dynamics, and responsive eye movements. A major hindrance to behavioural biometric authentication systems is the initial construction of the user template, called the enrolment phase, in which the user must laboriously perform the task n times so that their identifying features can be extracted. To facilitate the real-world adoption of these systems, n must be minimised.

In recent work, we conducted a user study in which users made mobile payments using both a smartwatch and a smart ring, and we collected the inertial sensor data (accelerometer, gyroscope, etc.) from these devices. Using this data, we showed that payment gestures can be used to authenticate the user, both per-device (watch data can authenticate watch payments, and ring data can authenticate ring payments) and cross-device (watch data can authenticate ring payments and vice versa). We also showed that, while it was possible to train a reasonable classifier on a small number of samples, classifiers trained on more samples performed proportionally better.

The plan for this project is to use autoencoders to augment a sparse enrolment dataset and to show that a model trained on the augmented dataset can differentiate between users as well as a model trained on a larger dataset. The project would compare different loss functions and autoencoder variants to find the best-performing combinations. The dataset supports several possible use-cases: the project could focus on the per-device or the cross-device setting for either device, or a system could be developed on one device and tested on the other.
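To make the augmentation idea concrete, the following is a minimal sketch and not the project's method. It assumes each payment gesture has already been reduced to a fixed-length, standardised feature vector, and it uses a small one-hidden-layer autoencoder written in NumPy with a mean squared error loss. New samples are generated by adding Gaussian noise to the latent codes of real samples and decoding them, which is only one of several possible augmentation strategies. The function names, hyperparameters and synthetic data are all hypothetical.

```python
import numpy as np

def train_autoencoder(X, latent_dim=2, lr=0.01, epochs=1000, seed=0):
    """Fit a tanh-encoder / linear-decoder autoencoder by full-batch gradient descent.

    X is assumed to be an (n_samples, n_features) array of standardised features.
    Returns the learned parameters and the per-epoch reconstruction loss.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, size=(d, latent_dim))
    b_enc = np.zeros(latent_dim)
    W_dec = rng.normal(0.0, 0.1, size=(latent_dim, d))
    b_dec = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode to the latent space, then decode.
        Z = np.tanh(X @ W_enc + b_enc)
        X_hat = Z @ W_dec + b_dec
        diff = X_hat - X
        losses.append(0.5 * np.sum(diff ** 2) / n)
        # Backward pass for the loss 0.5 * mean squared reconstruction error.
        g_out = diff / n
        g_W_dec = Z.T @ g_out
        g_b_dec = g_out.sum(axis=0)
        g_pre = (g_out @ W_dec.T) * (1.0 - Z ** 2)  # derivative of tanh
        g_W_enc = X.T @ g_pre
        g_b_enc = g_pre.sum(axis=0)
        # Gradient descent update.
        W_enc -= lr * g_W_enc
        b_enc -= lr * g_b_enc
        W_dec -= lr * g_W_dec
        b_dec -= lr * g_b_dec
    return (W_enc, b_enc, W_dec, b_dec), losses

def augment(X, params, n_new, noise_scale=0.1, seed=0):
    """Create n_new synthetic samples by perturbing latent codes of real samples."""
    W_enc, b_enc, W_dec, b_dec = params
    rng = np.random.default_rng(seed)
    Z = np.tanh(X @ W_enc + b_enc)
    idx = rng.integers(0, len(X), size=n_new)  # pick real samples to perturb
    Z_new = Z[idx] + rng.normal(0.0, noise_scale, size=(n_new, Z.shape[1]))
    return Z_new @ W_dec + b_dec

# Synthetic stand-in for one user's sparse enrolment set (10 gestures, 6 features).
rng = np.random.default_rng(1)
raw = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(10, 6))
X_small = (raw - raw.mean(axis=0)) / raw.std(axis=0)

params, losses = train_autoencoder(X_small, latent_dim=2)
X_aug = augment(X_small, params, n_new=40)
X_enrol = np.vstack([X_small, X_aug])  # augmented enrolment set for classifier training
```

In a real experiment, the classifier trained on `X_enrol` would be compared against one trained on a genuinely larger enrolment set, and the autoencoder would more likely be built in a deep learning framework so that different variants and loss functions can be swapped in.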

Pre-requisites: Knowledge of data analysis.