Student Projects
Undergraduate students in the third year of the Final Honour School of Computer Science may undertake a project. Fourth-year students in the Final Honour School of Computer Science, and students on the MSc in Computer Science, are required to undertake a project. Mathematics & Computer Science undergraduates are required to undertake either a Computer Science project or a Mathematics dissertation in their fourth year. Computer Science & Philosophy undergraduates may choose to undertake either a Computer Science project or a Philosophy thesis in their fourth year.
This is your chance to work on your own project, on something that interests and inspires you. We have put together a brief guidance document on how to go about it; please click on the link below. On this site you will also find a sortable and searchable list of projects and potential supervisors. Undergraduate students are welcome to choose one of the projects on the list, or to approach a potential supervisor and negotiate their own topic. MSc students should discuss their proposed topic with potential supervisors and obtain their agreement.
You will be expected to make arrangements with potential supervisors between Weeks 5 and 7 of Hilary Term, for project work in the following year. Please don't contact potential supervisors outside this window.
Limited equipment is available for use in projects, namely LEAP equipment. If you have any requests, please contact the Head of Academic Administration.
Important deadlines:
- All students: your online project registration survey must be completed by Monday of Week 7, Hilary Term. You will be sent a link.
- MSc students: project proposals must be submitted by Monday of Week 1, Trinity Term.
- 3rd/4th-year students: projects must be submitted by 12pm on Monday of Week 4, Trinity Term.
- MSc students: projects must be submitted by Tuesday of Week -5 (minus 5), Michaelmas Term.
Project writing handbook
Sample projects: 3rd year, 4th year
List of projects
Each project entry below lists the project title, supervisors, research themes, suitability (B = 3rd year, C = 4th year, MSc), and a description.

Aggregation of Photovoltaic Panels
Supervisors: Alessandro Abate
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C, MSc

The increased relevance of renewable energy sources has modified the behaviour of the electrical grid. Some renewable energy sources affect the network in a distributed manner: whilst each unit has little influence, a large population can have a significant impact on the global network, particularly in the case of synchronised behaviour. This work investigates the behaviour of a large, heterogeneous population of photovoltaic panels connected to the grid. We employ Markov models to represent the aggregated behaviour of the population, while the rest of the network (and its associated consumption) is modelled as a single equivalent generator, accounting for both inertia and frequency regulation. Analysis and simulations of the model show that it is a realistic abstraction, and quantitatively indicate that heterogeneity is necessary to enable the overall network to function in safe conditions and to avoid load shedding. This project will provide extensions of this recent research, in collaboration with an industrial partner.

Prerequisites: Computer-Aided Formal Verification, Probabilistic Model Checking

Analysis and verification of stochastic hybrid systems
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C, MSc

Stochastic Hybrid Systems (SHS) are dynamical models that are employed to characterize the probabilistic evolution of systems with interleaved and interacting continuous and discrete components. Formal analysis, verification, and optimal control of SHS models represent relevant goals because of their theoretical generality and their applicability to a wealth of studies in the sciences and in engineering. In a number of practical instances, the presence of a discrete number of continuously operating modes (e.g., in fault-tolerant industrial systems), the effect of uncertainty (e.g., in safety-critical air-traffic systems), or both occurrences (e.g., in models of biological entities) advocate the use of a mathematical framework, such as that of SHS, which is structurally predisposed to model such heterogeneous systems. In this project, we plan to investigate and develop new analysis and verification techniques (e.g., based on abstractions) that are directly applicable to general SHS models, while being computationally scalable.

Courses: Computer-Aided Formal Verification, Probabilistic Model Checking, Probability and Computing, Automata, Logic and Games
Prerequisites: Familiarity with stochastic processes and formal verification

Automated verification of complex systems in the energy sector
Supervisors: Alessandro Abate
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C, MSc

Smart microgrids are small-scale versions of centralised electricity systems, which locally generate, distribute, and regulate the flow of electricity to consumers. Among other advantages, microgrids have shown positive effects on the reliability of distribution networks. These systems present heterogeneity and complexity coming from (1) local and volatile renewables generation, and (2) the presence of nonlinear dynamics over both continuous and discrete variables. These factors call for the development of proper quantitative models. This framework provides the opportunity to employ formal methods to verify properties of the microgrid. The goal of the project is in particular to focus on energy production via renewables, such as photovoltaic panels. The project can benefit from a paid visit/internship with industrial partners.

Courses: Computer-Aided Formal Verification, Probabilistic Model Checking, Probability and Computing, Automata, Logic and Games
Prerequisites: Familiarity with stochastic processes and formal verification; no specific knowledge of smart grids is needed.

Bayesian Reinforcement Learning: Robustness and Safe Training
Supervisors: Alessandro Abate
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C

In this project we shall build on recent work on "Safe Learning" [2], which frames classical RL algorithms to synthesise policies that abide by complex tasks or objectives whilst training safely (that is, without violating given safety requirements). Tasks/objectives for RL-based synthesis can be goals expressed as logical formulae, and thus be richer than standard reward-based goals. We plan to frame recent work by OXCAV [2] in the context of Bayesian RL, as well as to leverage modern robustness results, as in [3]. We shall pursue both model-based and model-free approaches.

[2] M. Hasanbeig, A. Abate and D. Kroening, "Cautious Reinforcement Learning with Logical Constraints," AAMAS 2020, pp. 483-491, 2020.
[3] B. Recht, "A Tour of Reinforcement Learning: The View from Continuous Control," Annual Reviews in Control, Vol. 2, 2019.

Development of software for the verification of MPL models
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C, MSc

This project aims to enhance the software toolbox VeriSiMPL ("very simple"), which has been developed to enable the abstraction of Max-Plus-Linear (MPL) models. MPL models are specified in MATLAB and abstracted to Labeled Transition Systems (LTS); a toy illustration of MPL dynamics follows this entry. The LTS abstraction is formally put in relationship with its MPL counterpart via a (bi)simulation relation. The abstraction procedure runs in MATLAB and leverages sparse representations, fast manipulations based on vector calculus, and optimised data structures such as Difference-Bound Matrices. The LTS can be pictorially represented via the Graphviz tool and exported to the PROMELA language. This enables the verification of MPL models against temporal specifications within the SPIN model checker.

Courses: Computer-Aided Formal Verification, Numerical Solution of Differential Equations
Prerequisites: Some familiarity with dynamical systems, working knowledge of MATLAB and C
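
For readers unfamiliar with max-plus algebra, here is a minimal Python sketch of an MPL system (a hypothetical two-dimensional example for illustration; VeriSiMPL itself is MATLAB-based):

```python
import numpy as np

def mpl_step(A, x):
    """One step of a Max-Plus-Linear system: x'[i] = max_j (A[i, j] + x[j])."""
    return np.max(A + x[np.newaxis, :], axis=1)

# Illustrative 2x2 MPL model, e.g. cycle times of two interacting machines.
A = np.array([[3.0, 7.0],
              [2.0, 4.0]])
x = np.array([0.0, 0.0])
for k in range(5):
    x = mpl_step(A, x)
    print(f"x({k + 1}) = {x}")
```

In max-plus algebra, addition of reals plays the role of multiplication and the maximum plays the role of addition, so the step above is the max-plus analogue of a linear map x' = Ax.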

Formal verification of a software tool for physical and digital components
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: MSc

We are interested in working with existing commercial simulation software that is targeted at the modelling and analysis of physical, multi-domain systems. It further encompasses the integration with related software tools, as well as the interfacing with devices and the generation of code. We are interested in enriching the software with formal verification features, envisioning the extension of the tool towards capabilities that might enable the user to raise formal assertions or guarantees on model properties, or to synthesise correct-by-design architectures. Within this long-term plan, this project shall target the formal generation of fault warnings, namely of messages to the user that are related to "bad (dynamical) behaviours" or to unwanted "modelling errors". The student will be engaged in developing algorithmic solutions towards this goal, while reframing them within a formal and general approach. The project is inter-disciplinary, in dealing with hybrid models involving digital and physical quantities, and in connecting the use of formal verification techniques from computer science with more classical analytical tools from control engineering.

Courses: Computer-Aided Formal Verification, Software Verification.
Prerequisites: Knowledge of basic formal verification.

Innovative Sensing and Actuation for Smart Buildings
Supervisors: Alessandro Abate
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C, MSc

Sensorisation and actuation in smart buildings, and the development of smart HVAC (heating, ventilation, and air-conditioning) control strategies for energy management, allow for optimised energy usage, leading to reduced power consumption or to optimised demand/response strategies that are key in a rather volatile market. This can further lead to optimised maintenance of building devices. Of course, the sensorisation of buildings places heavy requirements on the overall infrastructure: we are interested in devising new approaches based on the concept of using "humans as sensors". Further, we plan to investigate approaches to perform meta-sensing, namely to extrapolate the knowledge from physical sensors towards that of virtual elements (as an example, to infer the current building occupancy from correlated measurements of temperature and humidity dynamics). On the actuation side, we are likewise interested in engineering non-invasive, minimalistic solutions which are robust to uncertainty and performance-certified. The plan for this project is to make the first steps in this direction, based on recent results in the literature. The project can benefit from a visit to Honeywell Labs (Prague).

Courses: Computer-Aided Formal Verification.
Prerequisites: Some familiarity with dynamical systems.

Model learning and verification
Supervisors: Alessandro Abate
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C, MSc

This project will explore connections between techniques from machine learning and successful approaches from formal verification. The project has two sides, a theoretical one and a more practical one; it will be up to the student to emphasise either side, depending on their background and interests. The theoretical part will develop existing research, for instance in one of the following two inter-disciplinary domain pairs: learning & repair, or reachability analysis & Bayesian inference. On the other hand, a more practical project will apply the above theoretical connections to a simple modelling setup in the area of robotics and autonomy.

Courses: Computer-Aided Formal Verification, Probabilistic Model Checking, Machine Learning

Precise simulations and analysis of aggregated probabilistic models
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C, MSc

This project shall investigate a rich research line, recently pursued by several researchers within the Department of Computer Science, looking at the development of quantitative abstractions of Markovian models. Abstractions come in the form of lumped, aggregated models, which are beneficial in being easier to simulate or to analyse. Key to the novelty of this work, the proposed abstractions are quantitative, in that precise error bounds with respect to the original model can be established. As such, whatever can be shown over the abstract model can also be formally established over the original one. This project, grounded in existing literature, will pursue (depending on the student's interests) extensions of this recent work, or its implementation as a software tool.

Courses: Computer-Aided Formal Verification, Probabilistic Model Checking, Machine Learning

Reinforcement Learning for Space Operations
Supervisors: Alessandro Abate, Licio Romao
Themes: Artificial Intelligence and Machine Learning, Automated Verification
Suitable for: C

OXCAV has an ongoing collaboration with the European Space Agency (ESA) that involves applying Reinforcement Learning (RL) algorithms to a satellite called OPS-SAT, which was launched in 2019 and is a flying laboratory that allows ESA's partners to test and validate new techniques (more information can be found at https://www.esa.int/Enabling_Support/Operations/OPS-SAT). This project aims at designing controllers that will be used to perform nadir pointing and sun-tracking of OPS-SAT, while meeting given specifications (e.g., admissible nadir pointing errors). The focus will be on data-driven methods that leverage available sensor (gyroscopes, GPS, fine sun sensor, magnetometer) and actuator data, using an RL architecture, to come up with a safe policy that yields adequate performance. The main tasks of the project will consist of (1) exploring an ESA platform called MUST to collect all the necessary data, and (2) implementing an RL scheme that will later be deployed on the satellite. Throughout the project you will have the opportunity to work with state-of-the-art data-driven techniques that have been developed at OXCAV, under the supervision of Prof. Alessandro Abate and Dr. Licio Romao.

Safe Reinforcement Learning
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C, MSc

Reinforcement Learning (RL) is a known architecture for synthesising policies for Markov Decision Processes (MDPs). We work on extending this paradigm to the synthesis of "safe policies", or more generally of policies ensuring that a linear-time property is satisfied. We convert the property into an automaton, then construct a product MDP between the automaton and the original MDP (a minimal sketch of this construction follows this entry). A reward function is then assigned to the states of the product automaton, according to the accepting conditions of the automaton. With this reward function, RL synthesises a policy that satisfies the property: as such, the policy synthesis procedure is "constrained" by the given specification. Additionally, we show that the RL procedure sets up an online value iteration method to calculate the maximum probability of satisfying the given property, at any given state of the MDP. We evaluate the performance of the algorithm on numerous numerical examples. This project will provide extensions of these novel and recent results.

Prerequisites: Computer-Aided Formal Verification, Probabilistic Model Checking, Machine Learning
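
A minimal Python sketch of the product construction described above; all names (mdp_step, label, aut_delta, accepting) are hypothetical placeholders, not the group's actual implementation:

```python
def make_product_step(mdp_step, label, aut_delta, accepting):
    """Return a step function over the product MDP that rewards visits to
    accepting automaton states, so RL maximises satisfaction probability."""
    def step(state, action):
        s, q = state                          # (MDP state, automaton state)
        s_next = mdp_step(s, action)          # sample an MDP successor
        q_next = aut_delta(q, label(s_next))  # automaton reads the new label
        reward = 1.0 if q_next in accepting else 0.0
        return (s_next, q_next), reward
    return step
```

Any off-the-shelf RL algorithm can then be run against the returned step function, since from the learner's point of view the product is just another MDP.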

Safety verification for space dynamics via neural-based control barrier functions
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C

Barrier functions are Lyapunov-like functions that serve as certificates for the safety verification of dynamical and control models (a standard formulation of the certificate conditions is recalled after this entry). The OXCAV group has recently worked on the automated and sound synthesis of barrier functions structured as neural nets, with an approach that uses satisfiability modulo theories. In this project, we shall pursue two objectives:

1. Apply recent results [1] on the sound and automated synthesis of barrier certificates to models that are pertinent to the space domain, such as models for attitude dynamics. Airbus will support this part.
2. Develop new results that extend the theory and algorithms to models encompassing uncertainty, such as probabilistic models or models that are adaptive to sensed data. Airbus will support this effort by providing environments and models for experiments and testing.

[1] A. Abate, D. Ahmed and A. Peruffo, "Automated Formal Synthesis of Neural Barrier Certificates for Dynamical Models," TACAS, to appear, 2021.
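
One common formulation of the certificate conditions (the exact conditions used in [1] may differ): for dynamics $\dot{x} = f(x)$, initial set $X_0$ and unsafe set $X_u$, a barrier function $B$ satisfies

```latex
B(x) \le 0 \;\; \forall x \in X_0, \qquad
B(x) > 0 \;\; \forall x \in X_u, \qquad
\nabla B(x) \cdot f(x) \le 0 \;\; \text{whenever } B(x) = 0,
```

so that no trajectory starting in $X_0$ can cross the zero level set of $B$ and reach $X_u$.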

Software development for abstractions of stochastic hybrid systems
Supervisors: Alessandro Abate
Themes: Automated Verification
Suitable for: C, MSc

Stochastic hybrid systems (SHS) are dynamical models for the interaction of continuous and discrete states. The probabilistic evolution of the continuous and discrete parts of the system are coupled, which makes analysis and verification of such systems compelling. Among the specifications of SHS, probabilistic invariance and reach-avoid have received quite some attention recently. Numerical methods have been developed to compute these two specifications. These methods are mainly based on state-space partitioning and abstraction of the SHS by Markov chains, which are optimal in the sense of reducing the abstraction error with a minimum number of Markov states. The goal of the project is to combine the codes that have been developed for these methods. The student should also design a user-friendly interface (for the choice of dynamical equations, parameters, methods, etc.).

Courses: Probabilistic Model Checking, Probability and Computing, Numerical Solution of Differential Equations
Prerequisites: Some familiarity with stochastic processes, working knowledge of MATLAB and C

Complex Object Querying and Data Science
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: MSc

We will look at query languages for transforming nested collections (collections that might contain collections). Such languages can be useful for preparing large-scale feature data for machine learning algorithms. We have a basic implementation of such a language, built on top of the big-data framework Spark. The goal of the project is to extend the language with iteration. One goal will be to look at how to adapt processing techniques for nested data to support iteration. Another, closer to applications, is to utilize iteration to support additional steps of a data science pipeline, such as sampling.

Decision procedures for arithmetic with powers
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: C, MSc

The project will look at implication and satisfaction problems involving formulas that combine additive arithmetic, inequalities, and some form of exponentiation with a constant base: for example, systems of inequalities over constant multiples of variables and constant multiples of expressions like 2^x (an example formula is shown after this entry). We will also look at first-order logic built up from these inequalities. In the absence of exponential terms like 2^x, the theory is known as Presburger Arithmetic, and has been heavily investigated in both theory and practice. In the presence of exponential terms with a fixed base, the theory is known to be decidable, from work of Semenov in the late 1970s and 1980s. Recent work has identified fragments where the complexity is not too terrible, and in the process some new algorithms have been proposed. The goal of the project is to implement decision procedures for expressions of this form. A theoretically-oriented student could also work on complexity analysis.

Prerequisites: A good background in logic: Logic and Proof, and preferably one of CAFV or KRR.
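
An illustrative (hypothetical) instance of the kind of formula the project targets, mixing linear constraints with powers of a fixed base:

```latex
\exists x \, \exists y \; \bigl( 3x + 2y \le 2^{x} \;\wedge\; 2^{x} < 2^{y} \;\wedge\; x \ge 4 \bigr)
```

A decision procedure must determine whether such a sentence holds over the integers (here it does, e.g. with x = 6, y = 7).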

Genomic analysis using machine learning and large scale data management techniques
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: MSc

We will investigate novel risk analysis -- likelihood

Graph Machine Learning with Neo4j
Supervisors: Michael Benedikt, Ismail Ilkan Ceylan
Themes: Artificial Intelligence and Machine Learning, Data and Knowledge
Suitable for: MSc

Co-supervised by Neo4j (industrial partner). We have discussed a set of projects concerning data science on graphs with Brian Shi (an Oxford alumnus) of the Neo4j data science team, including (but not limited to):
- understanding and evaluating different initialization regimes in node embeddings,
- explainability and interpretability for graph ML,
- design of online graph ML algorithms/models where nodes and edges are added/modified in near real-time.

A student would be working with Ismail Ceylan, Michael Benedikt, and the Neo4j team: the project will be research-focused and any software produced would be in the public domain. The balance of experiment and theory can be tuned to the student's interests. The main prerequisite is a very good knowledge of graph ML, at the level of Oxford's GRL course.

Interpolation
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: C, MSc

Let F1 and F2 be sentences (in first-order logic, say) such that F1 entails F2: that is, any model of F1 is also a model of F2. An interpolant is a sentence G such that F1 entails G, and G entails F2, but G only uses relations and functions that appear in *both* F1 and F2 (a toy example follows this entry). The goal in this project is to explore and implement procedures for constructing interpolants, particularly for certain decidable fragments of first-order logic. It turns out that finding interpolants like this has applications in some database query rewriting problems.

Prerequisites: Logic and Proof (or equivalent)
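
A toy propositional instance of the definition above: take $F_1 = p \wedge q$ and $F_2 = p \vee r$, so that $F_1$ entails $F_2$. Then

```latex
G = p, \qquad F_1 \models G, \qquad G \models F_2,
```

and $G$ mentions only $p$, the sole proposition occurring in both $F_1$ and $F_2$, so $G$ is an interpolant.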

Modelling Delayed Labels in Online Continual Learning
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: MSc

Co-supervised by Phillip Torr. Online continual learning is the problem of predicting every sample in a stream while simultaneously learning from it. That is to say, the stream first presents data to be predicted and then reveals the labels for the model to train on. However, in most real-world scenarios, labelling is an expensive, laborious, and time-consuming procedure. We therefore seek to study the sensitivity of existing continual learning algorithms when the labels of images at step t are only provided at step t + k. This setting poses two challenges: (1) learning from unlabelled data, and (2) modelling the delayed labels. To that end, we are interested in proposing new algorithms that, at each time step t, perform self-supervision continually while jointly training on the labelled data revealed from step t - k.

Optimization of Web Query Plans
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: MSc

This project will look at how to find the best plan for a query, given a collection of data sources with access restrictions. We will look at logic-based methods for analyzing query plans, taking into account integrity constraints that may exist on the data.

Optimized reasoning with guarded logics
Supervisors: Michael Benedikt
Themes: Data and Knowledge
Suitable for: C, MSc

Inference in first-order logic is undecidable, but a number of logics have appeared in the last decade that achieve decidability. Many of them are guarded logics, which include the guarded fragment of first-order logic. Although decidability has been known for some decades, no serious implementation has emerged. Recently we have developed new algorithms for deciding some guarded logics, based on resolution, which are more promising from the perspective of implementation. The project will pursue this both in theory and experimentally.

Prerequisites: A knowledge of first-order logic, e.g. from the Foundations of CS or Knowledge Representation and Reasoning courses, would be important.

Applications of Complex-Order Fractional Diffusion to Physical and Biological Processes
Supervisors: Alfonso Bueno-Orovio
Themes: Computational Biology and Health Informatics
Suitable for: B, C

Fractional differential equations (where standard derivatives are generalised to derivatives of non-integer, real order) have become a fundamental modelling approach for understanding and simulating the many aspects of spatial heterogeneity of complex materials and systems [1,2]. Very recently [3], we have been able to further extend such ideas to derivatives of complex order. This project aims to further explore the capabilities of these novel complex-order fractional operators in modulating the spatiotemporal dynamics of different systems of interest (a one-dimensional numerical illustration of the underlying spectral methods follows this entry).

Aim: To investigate how complex-order fractional operators can advance the description of multiscale transport phenomena in physical and biological processes highly influenced by the heterogeneity of complex media.

Students will investigate, by means of scientific computing and modelling and simulation, how complex-order fractional operators modulate the response of different systems of physical and biological interest. Different research options will be available, including the analysis of: (i) novel Turing patterns in physical and biological processes of morphogenesis; (ii) periodic precipitation patterns in the manufacturing of micro- and nano-structures; or (iii) models of population dynamics and epidemiology.

References:
1. Fractional diffusion models of cardiac electrical propagation: role of structural heterogeneity in dispersion of repolarization. https://doi.org/10.1098/rsif.2014.0352
2. Fourier spectral methods for fractional-in-space reaction-diffusion equations. https://doi.org/10.1007/s10543-014-0484-2
3. The Complex-Order Fractional Laplacian: Stable Operators, Spectral Methods, and Applications to Heterogeneous Media. https://doi.org/10.20944/preprints202107.0569.v1
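
A minimal one-dimensional sketch in Python of the spectral idea behind reference 2, assuming a periodic domain, where the fractional Laplacian acts in Fourier space as multiplication by $-|k|^\alpha$ (grid size and $\alpha$ are illustrative):

```python
import numpy as np

def fractional_laplacian_1d(u, alpha, L=2 * np.pi):
    """Apply -(-Laplacian)^(alpha/2) to samples u on a periodic domain of length L."""
    n = u.size
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)  # angular wavenumbers
    u_hat = np.fft.fft(u)
    return np.real(np.fft.ifft(-np.abs(k) ** alpha * u_hat))

x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
u = np.sin(x)
# At alpha = 2 the classical Laplacian is recovered: d^2/dx^2 sin(x) = -sin(x).
print(np.allclose(fractional_laplacian_1d(u, 2.0), -np.sin(x), atol=1e-6))
```

Non-integer values of alpha interpolate between these classical cases, which is what gives fractional operators their flexibility in modelling heterogeneous media.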

Applications of Fractional Calculus to Steganography
Supervisors: Alfonso Bueno-Orovio, Andrew Ker
Themes: Computational Biology and Health Informatics, Security
Suitable for: MSc

Steganography is the practice of representing information within another message or physical object, in such a manner that the presence of the information is not evident to human inspection. When applied to image files [1], a particularly powerful approach is to encode the hidden data along the edges of the objects within the image. The fractional Laplacian, as an extension of second-order derivatives from integer to real orders, is known to yield wider but sharper gradients than its standard counterpart, and therefore holds potential to outperform standard steganography techniques for hiding information along image edges.

Aim: To investigate the performance of steganography techniques based on the fractional Laplacian for information hiding in digital images.

References:
(1) Image Steganography: A Review of the Recent Advances. https://doi.org/10.1109/ACCESS.2021.3053998
(2) Fourier spectral methods for fractional-in-space reaction-diffusion equations. https://doi.org/10.1007/s10543-014-0484-2
(3) USAD: undetectable steganographic approach in DCT domain. https://doi.org/10.1080/13682199.2019.1620525

High Throughput, High Resolution, and High Frame-Rate Analysis of Cellular Heart Function
Supervisors: Alfonso Bueno-Orovio
Themes: Computational Biology and Health Informatics
Suitable for: B, C, MSc

Co-Supervisor: Christopher Toepfer (Cardiovascular Medicine, Oxford), christopher.toepfer@cardiov.ox.ac.uk

Cardiomyocytes are the cells responsible for generating contractile force in the heart. They are highly adaptable and alter their function in response to many stimuli (electrical stimuli, calcium, contraction, organisation of cellular structures, metabolism, and transcriptional signatures). These processes are all interrelated and dynamic in live cardiomyocytes, but we lack the ability to visualise them simultaneously and correlate them in real-time. We have previously developed specialised software [1, 2] to automate high-throughput analysis of cardiomyocyte function. However, these tools are not optimised to work with large files, and they cause data fragmentation as they do not work in tandem.

Aim: To develop novel software solutions for the analysis of cellular heart function, supporting three-channel, high-throughput, high-resolution, and high-frame-rate imaging and analysis of beating cardiomyocytes. Imaging live cardiomyocytes during contraction and relaxation necessitates high frame rates (100s of FPS), which must be twinned with high resolutions (<70 nm per pixel), while also allowing the simultaneous analysis of multiple well plates for high throughput.

In collaboration with the Oxford Department of Cardiovascular Medicine, and building on extensive datasets of high-resolution, high-rate fluorescent imaging of live cardiomyocytes, students taking this project will develop novel open-source software solutions integral to discovery and translational cardiovascular research. Different research options are available, such as: (i) optimisation and/or development of novel algorithms for the analysis of sarcomere contractility and relaxation (currently 4 hours per 5-second movie; single node calling 4 cores); (ii) development of an analysis pipeline for masking, segmenting, and fitting action potentials from a variety of cellular systems; or (iii) development of an analysis pipeline for the automated analysis of mitochondrial shape, abundance, and cellular metabolites.

Interpretable Cardiac Anatomy and Electrophysiology Modelling in Human Heart Disease Using Variational Mesh Autoencoders
Supervisors: Alfonso Bueno-Orovio
Themes: Computational Biology and Health Informatics
Suitable for: C, MSc

Co-supervised by Vicente Grau (Biomedical Engineering, Oxford), vicente.grau@eng.ox.ac.uk

Cardiac anatomy and function vary considerably across the human population, with important implications for clinical diagnosis, treatment planning, and prognosis of disease. Consequently, many computer-based approaches have been developed to capture this variability for a wide range of applications, including explainable cardiac disease detection and prediction, dimensionality reduction, cardiac shape analysis, and the generation of virtual heart populations. Here, we will leverage these technologies to further investigate connections between cardiac anatomy and electrocardiographic (ECG) signals in patients with Hypertrophic Cardiomyopathy (HCM), the most common hereditary heart disease and leading cause of sudden cardiac death in the young and in competitive athletes.

Aims: To investigate connections between patterns of cardiac hypertrophy and their manifestation in the ECG of HCM patients, and to generate virtual populations of HCM hearts, capturing the variability in cardiac shape and ECG signals in this group of high-risk patients.

We have recently proposed variational mesh autoencoders (mesh VAEs) as a novel geometric deep learning approach to model population-wide variations in cardiac anatomy [1], enabling direct and efficient processing of surface mesh representations of the cardiac anatomy. These methods can also be extended to simultaneously learn the ECG signal of a given patient [2]. Exploiting cardiac Magnetic Resonance Imaging (MRI) and ECG resources from the UK Biobank, this project will explore the quality and interpretability of the mesh VAE's latent space for the reconstruction and synthesis of multi-domain cardiac signals in patients with HCM. It will also investigate the method's ability to generate realistic virtual populations of cardiac anatomies and ECG signals in terms of multiple clinical metrics in this group of patients.

References:
(1) Interpretable Cardiac Anatomy Modeling Using Variational Mesh Autoencoders. https://doi.org/10.3389/fcvm.2022.983868
(2) Multi-Domain Variational Autoencoders for Combined Modeling of MRI-Based Biventricular Anatomy and ECG-Based Cardiac Electrophysiology. https://doi.org/10.3389/fphys.2022.886723

Modelling and Analysing Large Scale Emergency Evacuations
Supervisors: Ani Calinescu, Imran Hashmi, Nicholas Bishop
Themes: Artificial Intelligence and Machine Learning
Suitable for: MSc

Large-scale mass assemblies such as the FIFA World Cup bear a potential risk of disaster, either natural or man-made, resulting in crisis situations necessitating emergency evacuation. Such emergency situations can be complex, requiring large crowds to be navigated quickly and safely through a limited number of narrow passages. Therefore, for large infrastructures to be evacuated effectively, a reliable evacuation strategy must be developed, evaluated, and implemented. Agent-based models, which simulate complex systems in a bottom-up fashion by explicitly modelling individuals and their decision-making processes, provide environments in which to evaluate the practical effectiveness of evacuation strategies (a toy illustration of the paradigm is given after this entry). This project focuses on the further development and specialisation of agent-based models for emergency evacuation. In short, the main goals of this project are:

A. Formalising crowd modelling complexities using agent-based approaches, including:
  a. Realistically modelling a synthetic population of individuals.
  b. Hierarchical crowd structures and crowd population dynamics.
  c. Behaviour modelling: pedestrian movements, mobility patterns, stampede physics, interactions, interdependence, and interruptions; co-operation and co-ordination.
B. Developing a crowd simulation and analysis framework to:
  a. Model programmable, spatially explicit environments and building structures at real scale.
  b. Conceptualize mass-gathering configurations (events, routines, mobility patterns).
  c. Specify and evaluate emergency evacuation scenarios.
  d. Propose and test new evacuation strategies/algorithms.
  e. Use optimization algorithms to compare different strategies and build evacuation plans.
C. Exploring techniques to enhance simulation speed and scale using modern techniques:
  a. Adopting a network-centric design based on message-passing in order to exploit highly parallelised hardware and sparse tensor calculus for fast simulation.
  b. Integrating agent-based models with deep learning architectures for fast calibration of model parameters via gradient-based methods.

References & Recommended Reading:
[1] Imran Mahmood, Muhammad Haris, and Hessam Sarjoughian. 2017. Analyzing Emergency Evacuation Strategies for Mass Gatherings using Crowd Simulation And Analysis framework: Hajj Scenario. In Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (SIGSIM-PADS '17). ACM, New York, NY, USA, 231-240. https://doi.org/10.1145/3064911.3064924
[2] I. Mahmood, T. Nadeem, F. Bibi and X. Hu, "Analyzing Emergency Evacuation Strategies For Large Buildings Using Crowd Simulation Framework," 2019 Winter Simulation Conference (WSC), 2019, pp. 3116-3127. https://doi.org/10.1109/WSC40007.2019.9004906
[3] S. Long, D. Zhang, S. Li, S. Yang and B. Zhang, "Simulation-Based Model of Emergency Evacuation Guidance in the Metro Stations of China," IEEE Access, vol. 8, pp. 62670-62688, 2020. https://doi.org/10.1109/ACCESS.2020.2983441
[4] Pidd, M., F. N. De Silva, and R. W. Eglese. "A simulation model for emergency evacuation." European Journal of Operational Research 90, no. 3 (1996): 413-419.
[5] Tan, Lu, Mingyuan Hu, and Hui Lin. "Agent-based simulation of building evacuation: Combining human behavior with predictable spatial accessibility in a fire emergency." Information Sciences 295 (2015): 53-66.
[6] Li, Yang, Maoyin Chen, Zhan Dou, Xiaoping Zheng, Yuan Cheng, and Ahmed Mebarki. "A review of cellular automata models for crowd evacuation." Physica A: Statistical Mechanics and its Applications 526 (2019): 120752.
[7] Graph Representation Learning 2022-2023, Department of Computer Science. Lecturer: Ismail Ilkan Ceylan. https://www.cs.ox.ac.uk/teaching/courses/2022-2023/grl/
[8] Hamilton, William L. "Graph Representation Learning." Synthesis Lectures on Artificial Intelligence and Machine Learning 14.3 (2020): 1-159. https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book.pdf
[9] Macal, Charles M. "Everything you need to know about agent-based modelling and simulation." Journal of Simulation 10, no. 2 (2016): 144-156.
[10] Abar, Sameera, Georgios K. Theodoropoulos, Pierre Lemarinier, and Gregory MP O'Hare. "Agent Based Modelling and Simulation tools: A review of the state-of-art software." Computer Science Review 24 (2017): 13-33.

Required Prerequisites: Familiarity with Python, Artificial Intelligence.
Desirable Prerequisites: Familiarity with PyTorch, Graph Representation Learning, Machine Learning, Bayesian Statistical Probabilistic Programming, Combinatorial Optimisation, Computational Game Theory.
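
To illustrate the bottom-up agent-based paradigm, here is a toy Python sketch (illustrative only, not the proposed framework): agents on a one-dimensional corridor move one cell towards a single exit per step, but each cell holds at most one agent, so congestion emerges from local rules.

```python
import random

def evacuate(n_agents=50, length=100, seed=0):
    """Return the number of steps until all agents leave via the exit at cell 0."""
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(length), n_agents))
    t = 0
    while positions:
        occupied = set(positions)
        nxt = []
        for p in positions:
            if p == 0:
                continue                     # agent reaches the exit and leaves
            step = p - 1
            nxt.append(step if step not in occupied else p)  # blocked by agent ahead
        positions = nxt
        t += 1
    return t

print(evacuate())
```

Even this minimal model exhibits the key ABM property the description relies on: total evacuation time is an emergent, macroscopic quantity not encoded in any single agent's rule.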

Specification and Verification of Agent-based Models using Computer Aided Formal Verification
Supervisors: Ani Calinescu, Imran Hashmi, Nicholas Bishop
Themes: Artificial Intelligence and Machine Learning
Suitable for: MSc

Agent-based Modelling (ABM) is a unique paradigm for studying complex systems wherein individual agents are explicitly modelled and interact with each other as part of a synthetic population. Within an agent-based model, each agent individually assesses its situation and makes potentially randomised decisions according to a set of internal rules. Note that agents may be heterogeneous, possessing different properties which may cause them to behave differently. By employing an agent-based model, one may hope to understand how global, or macroscopic, properties of a complex system emerge from the collective behaviour of individuals. In particular, agent-based models allow practitioners to model real-world systems at extremely high fidelity, by taking into account many small details. However, this individual-based view on a global system naturally leads to a great computational overhead. As a result, it becomes challenging to carry out verification, validation, and analysis of agent-based models, and therefore to reason about their correctness. In this project, we will focus on addressing this shortcoming. More specifically, this project will focus on finding computationally succinct representations of agent-based models which can be combined with techniques from formal verification and probabilistic model checking in order to verify and reason about the properties of agent-based models. In short, the main goals of this project include:

A. To formally describe agent-based models using a mathematical framework (e.g., finite dynamical systems, graph theory, and Markov chains).
B. To develop techniques to transform a formal ABM into probabilistic timed automata (PTAs) or discrete-time Markov chains (DTMCs) for formal verification using the PRISM model checker.
C. To explore and implement state-space reduction techniques and demonstrate applications for industrial-scale agent-based models.

Required Prerequisites: Familiarity with PRISM, Computer-Aided Formal Verification, Probabilistic Model Checking.
Desirable Prerequisites: Machine Learning.

References:
[1] Laubenbacher, R., Jarrah, A.S., Mortveit, H.S., Ravi, S. (2012). Agent Based Modeling, Mathematical Formalism for. In: Meyers, R. (ed.) Computational Complexity. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1800-9_6
[2] Banisch, Sven. "The probabilistic structure of discrete agent-based models." Discontinuity, Nonlinearity, and Complexity 3(3) (2014): 281-292. https://doi.org/10.5890/DNC.2014.09.005
[3] Banisch, Sven. Markov Chain Aggregation for Agent-Based Models. Springer, 2015. https://link.springer.com/content/pdf/10.1007/978-3-319-24877-6.pdf
[4] Clarke, Edmund M., et al., eds. Handbook of Model Checking. Vol. 10. Cham: Springer, 2018.
[5] Computer-Aided Formal Verification 2022-2023, Department of Computer Science. Lecturer: David Parker. https://www.cs.ox.ac.uk/teaching/courses/2022-2023/computeraidedverification/
[6] Macal, Charles M. "Everything you need to know about agent-based modelling and simulation." Journal of Simulation 10, no. 2 (2016): 144-156.
[7] Abar, Sameera, Georgios K. Theodoropoulos, Pierre Lemarinier, and Gregory MP O'Hare. "Agent Based Modelling and Simulation tools: A review of the state-of-art software." Computer Science Review 24 (2017): 13-33.

Frontiers in Graph Representation Learning
Supervisors: Ismail Ilkan Ceylan
Themes: Artificial Intelligence and Machine Learning
Suitable for: C, MSc

There are various projects available in the theme of graph representation learning (e.g., graph neural networks); please get in touch for the most up-to-date information. Multiple projects are available, aiming at (i) designing new graph neural network models to overcome well-known limitations of graph neural networks (e.g., oversmoothing, oversquashing, inexpressiveness), (ii) improving our overall theoretical understanding of graph representation learning models (foundational aspects related to probability theory, graph theory, and logic, as well as algorithmic aspects), (iii) improving the inductive bias of graph neural networks for machine learning over different types of graphs (e.g., directed, relational graphs), (iv) exploring tasks beyond the widely studied ones, such as graph generation, from a foundational aspect, and (v) novel applications of graph representation learning. It is important to identify a project which matches the student's background, so the precise details are not fixed in this description. There are various industrial and academic collaborators, depending on the specific direction. Most of the necessary background is covered in the graph representation learning course offered in the Department in Michaelmas Term.

Diffusion Models for 3D Novel View Synthesis
Supervisors: Ronald Clark
Themes: Artificial Intelligence and Machine Learning
Suitable for: C, MSc

Diffusion-based generative models (such as stable diffusion) have shown that it is possible to generate photorealistic images from text input. For many VR/AR and robotics applications it is desirable to generate multiple images of the same subject or scene from different viewpoints. In this project you will investigate how to get diffusion models to generate multiple views of the same subject from different viewpoints. The consistency of the views will be evaluated by running them through a structure-from-motion system to obtain a full 3D model of the scene.

Prerequisites: Suitable for those who have done the Machine Learning course. Having also done Computer Graphics would be useful.

Generative Plant Disease Dataset
Supervisors: Ronald Clark
Themes: Systems
Suitable for: MSc

Co-supervised by Julian Godding (j.godding@gardin.co.uk)

This project will investigate the use of generative models to create a novel dataset for disease classification on images from a bespoke camera, using publicly available data. Accuracy scores of the final disease classification model can be used to assess the relative improvements of different approaches.

Implicit Regularisation for Neural Radiance Fields (NeRF)
Supervisors: Ronald Clark
Themes: Artificial Intelligence and Machine Learning
Suitable for: C, MSc

Neural radiance fields have emerged as a promising means of reconstructing 3D scenes from multiple images, allowing photorealistic novel views of the scene to be rendered. However, because of the under-constrained nature of the reconstruction problem, they suffer from artifacts such as "floaters", where the neural field incorrectly places high density values in free space. The goal of this project is to develop a method to regularise the neural field and eliminate artifacts such as floaters. The key idea will be to use the inductive biases inherent to convolutional networks (see Deep Image Prior) to bias the reconstruction towards a noise-free field. This should allow for improved quality with no additional training data.

Prerequisites: Suitable for those who have done the Machine Learning course. Having also done Computer Graphics would be useful.

Real-time Computer Vision
Supervisors: Ronald Clark
Themes: Systems
Suitable for: MSc

Co-supervised by Julian Godding (j.godding@gardin.co.uk)

This project will discover novel methods to perform high-accuracy computer vision tasks on streamed camera images (>10 fps) using a Raspberry Pi CPU. This presents challenges due to the accuracy/compute trade-off. The student would need to test several different architectures, design novel solutions for online training, and assess performance on a bespoke dataset.

Unsupervised Deep Image Stacking
Supervisors: Ronald Clark
Themes: Artificial Intelligence and Machine Learning
Suitable for: C, MSc

Image stacking is a commonly used technique in astrophotography and other areas to improve the signal-to-noise ratio of images. The process works by first aligning a large number of short-exposure images and then averaging them, which reduces the noise variance of individual pixels (a minimal numerical illustration follows this entry). In this project you will investigate the use of neural networks for performing the stacking process in an "unsupervised" manner. This can be accomplished by predicting a distortion field for each image and using a consistency objective that tries to maximise the coherence between the undistorted images in the stack and the final output. During the project you will evaluate this approach and compare it to traditional image stacking methods.

Prerequisites: Suitable for those who have done the Machine Learning course. Having also done Computer Graphics would be useful.
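
A minimal numpy illustration of why stacking helps (alignment omitted; image size and noise level are illustrative): averaging n exposures reduces the noise standard deviation by a factor of sqrt(n).

```python
import numpy as np

rng = np.random.default_rng(0)
scene = rng.uniform(0.0, 1.0, size=(64, 64))                   # "ground truth" image
exposures = scene + rng.normal(0.0, 0.2, size=(100, 64, 64))   # 100 noisy frames

stacked = exposures.mean(axis=0)
print(np.std(exposures[0] - scene))  # per-frame noise, ~0.2
print(np.std(stacked - scene))       # stacked noise, ~0.2 / sqrt(100) = 0.02
```

The learned, unsupervised variant described above replaces the (here omitted) alignment step with a predicted per-image distortion field.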

Unsupervised Domain Adaptation with Adversarial Networks
Supervisors: Ronald Clark
Themes: Systems
Suitable for: MSc

Co-supervised by Julian Godding (j.godding@gardin.co.uk) at www.gardin.co.uk

Object counting and segmentation are necessary tasks in image-based plant phenotyping, notably for counting plant organs and segmenting different plant matter. One challenge for computer vision tasks in plant phenotyping is that, unlike large image datasets of general objects, plant image datasets usually include highly self-similar images with a small amount of variation among images within the dataset. An individual plant image dataset is often acquired under similar conditions (single crop type, same field), and therefore directly applying a CNN trained on a single dataset to new images from a different crop, field, or growing season will fail. This is because a model trained to count objects on one dataset (the source dataset) will not perform well on a different dataset (the target dataset) when these datasets have different prior distributions. This challenge is called domain shift. An extreme case of domain shift in plant phenotyping is using a source dataset of plant images collected in an indoor controlled environment, which can be more easily annotated because plants are separated with controlled lighting on a blank background, and attempting to apply the model to a target dataset of outdoor field images with multiple, overlapping plants, variable lighting and backgrounds, and blur due to wind motion. Giuffrida et al. proposed a method to adapt a model trained to count leaves in a particular dataset to an unseen dataset in an unsupervised manner. Their method uses the representations of images from the source domain (from a pretrained network) to adapt to the target domain. They employed the Adversarial Discriminative Domain Adaptation approach to solve the domain-shift problem in the leaf counting task across different datasets. Their method aligns the predicted leaf count distribution with the source domain's prior distribution, which limits the adapted model to learning a leaf count distribution similar to the source domain. Ayalew et al. propose a custom Domain-Adversarial Neural Network architecture, trained using the unsupervised domain adaptation technique for density-estimation-based organ counting. These approaches have delivered strong results in reducing the impact of domain shift. In this project, the student will perform further research into the use of adversarial networks for domain adaptation and apply them to our unique plant image dataset.

Virtual Crop Manager using Large Language Models
Supervisors: Ronald Clark
Themes: Systems
Suitable for: MSc

Co-supervised by Julian Godding (j.godding@gardin.co.uk)

Large language models have made extraordinary progress over the past year but fail to deliver scientifically accurate information in a reliable way. This project will look at fine-tuning existing approaches to NLP Q&A ranking using academic literature to create a virtual plant science assistant. A proof-of-concept reinforcement learning architecture should be incorporated to improve the model's accuracy and reliability over time as it is used by plant scientists for improved crop management.

Computer-aided analysis of the k-server conjecture
Supervisors: Christian Coester
Themes: Algorithms and Complexity Theory
Suitable for: B, C

The k-server problem is a fundamental problem in the field of online algorithms: there are k mobile "servers" located in a metric space. One by one, requests appear at points of the space, and each request must be served immediately by moving a server to the requested point, without knowledge of future requests. The goal is to minimize the distance moved by servers. The k-server conjecture, arguably the most famous open problem in online algorithms, asserts that there exists an algorithm whose distance traveled is at most k times the optimum in hindsight, on any instance (the formal statement is recalled after this entry). An algorithm conjectured to achieve this has been known for over 30 years, but no one has been able to prove it. On specific small metric spaces, it is possible to test the validity of the conjecture with an exhaustive computer search. Similar computer searches recently refuted some stronger conjectures that would have implied the k-server conjecture. In light of these results, and the very recent refutation of the sibling "randomized k-server conjecture", some specific metric spaces suggest themselves as potential counter-examples. In this project, the student will implement and run a search for counter-examples on some relevant metric spaces, and determine whether they satisfy the conjecture or not.
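
In standard terminology, the conjecture asserts that some deterministic algorithm ALG is k-competitive: for every request sequence $\sigma$,

```latex
\mathrm{cost}_{\mathrm{ALG}}(\sigma) \;\le\; k \cdot \mathrm{cost}_{\mathrm{OPT}}(\sigma) + c
```

for some constant $c$ independent of $\sigma$, where OPT denotes the optimal offline solution computed in hindsight.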

Topics in Online Algorithms and Learning-Augmented Algorithms
Supervisors: Christian Coester
Themes: Algorithms and Complexity Theory
Suitable for: C, MSc

An online algorithm is an algorithm that receives its input over time, e.g. as a sequence of requests that need to be served. The algorithm must act on each request upon its arrival, without knowledge of future requests, often with the goal of minimising some cost. (For example, assigning taxis to ride requests with the goal of minimising the distance traveled by taxis.) Due to the lack of information about future requests, the algorithm cannot always make optimal decisions, but for many problems there exist algorithms that provably find solutions whose cost is within a bounded factor of the optimum, regardless of the input. An emerging related field of research is learning-augmented algorithms, where the algorithm receives some (perhaps erroneous) predictions about the future as additional input. In this project, the student will work on a selected problem in the field of online algorithms or learning-augmented algorithms, with the goal of designing algorithms and theoretically analysing their performance. The project is suitable for mathematically oriented students with an interest in proving theorems about the performance of algorithms.

General game playing using inductive logic programming
Supervisors: Andrew Cropper
Themes: Artificial Intelligence and Machine Learning
Suitable for: B, C, MSc

In the general game playing competition, an agent is given the rules of a game (described as a logic program) and then starts playing the game (i.e. the agent generates traces of behaviour). This project will invert the task so that an agent is given traces of behaviour and then has to learn a set of rules that could produce the behaviour. The project will use techniques from inductive logic programming, a form of machine learning which learns computer programs from input/output examples. This work is mainly implementation and experimentation.

Prerequisites: familiarity with logic programming (Prolog)

Learning to program by debugging
Supervisors: Andrew Cropper
Themes: Artificial Intelligence and Machine Learning
Suitable for: MSc

The goal of program synthesis/induction is to develop machines that can write programs for us. For instance, given examples of unsorted/sorted lists, the goal is to induce a sorting algorithm. This project focuses on inductive logic programming (ILP) [1,2], a form of machine learning based on mathematical logic. Given examples and background knowledge (BK), the goal of ILP is to induce a logic program (a set of logical rules) that with the BK generalises the examples. The goal of this project is to develop techniques to learn programs by debugging faulty programs. The project is based on a generate/test/constrain loop approach to ILP [3], where the constrain stage derives constraints from faulty programs to guide future generate stages. This project will modify recent work [4] to apply to a Datalog setting, where we can extract more information from faulty programs, such as identifying unsatisfiable pairs of literals. This project is a mix of theory, implementation, and experimentation.

Prerequisites: First-order logic, and preferably logic programming concepts (Prolog and/or Answer Set Programming), or at least an interest in learning about them.

[1] Andrew Cropper, Sebastijan Dumancic: Inductive logic programming at 30: a new introduction. CoRR abs/2008.07912 (2020)
[2] Andrew Cropper, Sebastijan Dumancic, Stephen H. Muggleton: Turning 30: New Ideas in Inductive Logic Programming. IJCAI 2020: 4833-4839
[3] Andrew Cropper, Rolf Morel: Learning programs by learning from failures. Mach. Learn. 110(4): 801-856 (2021)
[4] Rolf Morel, Andrew Cropper: Learning Logic Programs by Explaining Failures. CoRR abs/2102.12551 (2021)

Machine learning efficient time-complexity programs
Supervisors: Andrew Cropper
Themes: Artificial Intelligence and Machine Learning
Suitable for: C, MSc

Metaopt [1,2] is an inductive logic programming (ILP) system that learns efficient logic programs from examples. For instance, given input/output examples of unsorted/sorted lists, Metaopt learns quicksort (rather than bubblesort). However, Metaopt does not identify the complexity class of learned programs. The goals of this project are to (1) develop methods to identify the complexity class of a program during learning, and (2) see whether the complexity information can improve the proof search. This work is a mix of theory, implementation, and experimentation.

[1] A. Cropper and S.H. Muggleton. Learning efficient logic programs. Machine Learning (2018). https://doi.org/10.1007/s10994-018-5712-6
[2] A. Cropper and S.H. Muggleton. Learning efficient logical robot strategies involving composable objects. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI 2015), pages 3423-3429. IJCAI, 2015.

Prerequisites: familiarity with general machine learning (and preferably computational learning theory) and ideally logic programming

Projects on logic-based machine learning (inductive logic programming)
Supervisors: Andrew Cropper
Themes: Artificial Intelligence and Machine Learning
Suitable for: B, C, MSc

Inductive logic programming (ILP) [1,2] is a form of machine learning based on mathematical logic. Given examples and background knowledge (BK), the goal of ILP is to induce a logic program (a set of logical rules) that with the BK generalises the examples. For instance, given examples of unsorted/sorted lists, the goal of ILP is to induce a sorting algorithm (a toy ILP instance follows this entry). I am happy to supervise projects on ILP. These projects will particularly suit students interested in constraint satisfaction, symbolic machine learning, and knowledge representation.

Prerequisites: ideally you will have taken the courses Logic and Proof, Intelligent Systems, and Knowledge Representation & Reasoning
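
A textbook-style toy instance of the ILP setting (illustrative; not tied to any particular system): from the positive examples and background predicates below, an ILP system might induce the recursive program $H$ defining the last element of a list.

```latex
\begin{aligned}
E^{+} &= \{\, \mathit{last}([a,b],b),\; \mathit{last}([c],c) \,\}, \qquad
  B \supseteq \{\, \mathit{head}/2,\; \mathit{tail}/2,\; \mathit{empty}/1 \,\},\\
H:\;\; & \mathit{last}(A,B) \leftarrow \mathit{tail}(A,C),\, \mathit{empty}(C),\, \mathit{head}(A,B).\\
       & \mathit{last}(A,B) \leftarrow \mathit{tail}(A,C),\, \mathit{last}(C,B).
\end{aligned}
```

Note that $H$ together with $B$ entails both examples and generalises to lists of any length, which is the sense in which the induced program "generalises the examples".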

Relevancy of background knowledge in inductive logic programming
Supervisors: Andrew Cropper
Themes: Artificial Intelligence and Machine Learning
Suitable for: B, C, MSc

Inductive logic programming (ILP) is a form of machine learning which learns computer programs from input/output examples of a target program. To improve learning efficiency, ILP systems use background knowledge (i.e. auxiliary functions such as partition and append). However, most ILP systems cannot handle large amounts of background knowledge, and overcoming this limitation is a key challenge in ILP. The goal of this project is to explore techniques to identify relevant background knowledge. There is much freedom with this project: one could focus on logical aspects, such as finding logically redundant background knowledge, or instead on statistical aspects, such as finding the background knowledge most likely to be relevant for a given task. This work is a mix of theory, implementation, and experiments.

Prerequisites: familiarity with statistics, statistical machine learning, and ideally logic programming
![]() |
Reasoning about Temporal Knowledge | Bernardo Cuenca Grau, Przemysław Wałęga | Artificial Intelligence and Machine Learning, Data and Knowledge | MSc | Our world nowadays produces huge amounts of time-stamped data, say measurements from meteorological stations, recordings of online payments, GPS locations of your mobile phone, etc. To reason on top of such massive temporal datasets effectively, we need to provide a well-structured formalisation of temporal knowledge and to devise algorithms with good computational properties. This, however, is highly non-trivial; in particular, logical formalisms for temporal reasoning often have high computational complexity. This project provides an opportunity to join the Knowledge Representation and Reasoning group and participate in exciting EPSRC-funded research on temporal reasoning, temporal knowledge graphs, and reasoning over streaming data. There are opportunities to engage both in theoretically-oriented and practically-oriented research in this area. For example, in recent months we have been investigating the properties and applications of DatalogMTL---a temporal extension of the well-known Datalog rule language which is well-suited for reasoning over large and frequently-changing temporal datasets; the project could focus on analysing the theoretical properties of DatalogMTL and its fragments, such as their complexity and expressiveness, or alternatively on more practical aspects, such as optimisation techniques for existing reasoning algorithms. There are many avenues for research in this area and we would be more than happy to discuss possible concrete alternatives with the student(s). The theoretical part of the project requires a good understanding of logics (completing the Knowledge Representation and Reasoning course could be beneficial), whereas the practical part is suited for those who have programming skills. | |||||||||||
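For flavour, a deliberately simplified integer-time sketch of the kind of metric-temporal rule DatalogMTL supports (real DatalogMTL has interval-based semantics over the rationals; the facts and window below are invented): derive alert(x)@t whenever overload(x) held at some point in the window [t-3, t].

```python
# Toy evaluation of a "diamond-minus" style temporal rule over timestamped
# facts: alert(x)@t <- overload(x) held at some time in [t-3, t].
facts = {("overload", "server1", 2), ("overload", "server2", 7)}

def derive_alerts(facts, horizon=10, window=3):
    derived = set()
    for t in range(horizon + 1):
        for pred, entity, s in facts:
            if pred == "overload" and t - window <= s <= t:
                derived.add(("alert", entity, t))
    return derived

for fact in sorted(derive_alerts(facts)):
    print(fact)   # alert(server1)@2..5, alert(server2)@7..10
```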
![]() |
Automated Synthesis of Norms in Multi-Agent Systems | Giuseppe De Giacomo | Data and Knowledge | C | MSc | Norms have been widely proposed to coordinate and regulate multi-agent system (MAS) behaviour. In this project we want to consider the problem of synthesising and revising the set of norms in a normative MAS to satisfy a design objective expressed in logic. We focus on dynamic norms, that is, norms that allow and disallow agent actions depending on the history so far. In other words, the norm places different constraints on the agents' behaviour depending on its own state and the state of the underlying MAS. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions, devise and implement norm synthesis techniques against given logical specifications. Method: Reactive Synthesis, Game-Theoretic Techniques, Temporal Logics, ATL, ATL*, Strategy Logics, Reasoning about Actions, Planning, Model Checking of Multi-Agent Systems. Existing work and references: Natasha Alechina, Giuseppe De Giacomo, Brian Logan, Giuseppe Perelli: Automatic Synthesis of Dynamic Norms for Multi-Agent Systems. KR 2022 (and references therein) | ||||||||||
![]() |
Nondeterministic Situation Calculus | Giuseppe De Giacomo | Data and Knowledge | C | MSc | Background and focus: The standard situation calculus assumes that atomic actions are deterministic. But many domains involve nondeterministic actions. The key point when dealing with nondeterministic actions is that we need to clearly distinguish between choices that can be made by the agent and choices that are made by the environment, i.e., angelic vs. devilish nondeterminism. “Nondeterministic Situation Calculus” is an extension of the standard situation calculus that accommodates nondeterministic actions and preserves Reiter’s solution to the frame problem. It allows for answering projection queries through regression. But it also provides the means to formalize, e.g., (first-order) planning in nondeterministic domains and to define the execution of ConGolog high-level programs in nondeterministic domains. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions (e.g., propositional or finite-object cases), devise and implement reasoning-about-action and strategic-reasoning techniques for the nondeterministic situation calculus based on formal methods methodologies. Method: Reasoning about Actions, Planning, Model Checking, Reactive Synthesis. Existing work and references: Giuseppe De Giacomo, Yves Lespérance: The Nondeterministic Situation Calculus. KR 2021: 216-226 Ray Reiter: Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems, MIT Press 2001 Giuseppe De Giacomo, Yves Lespérance, Hector J. Levesque: ConGolog, a concurrent programming language based on the situation calculus. Artif. Intell. 121(1-2): 109-169 (2000) Giuseppe De Giacomo, Yves Lespérance, Fabio Patrizi: Bounded situation calculus action theories. Artif. Intell. 237: 172-203 (2016) | ||||||||||
![]() |
Planning for Temporally Extended Goals in Linear Time Logics of Finite Traces | Giuseppe De Giacomo | Data and Knowledge | C | MSc | Planning for temporally extended goals expressed in linear time logics on finite traces requires synthesizing a strategy to fulfil the temporal (process) specification expressed by the goal. To do so, one can take advantage of techniques developed in formal methods, based on reductions to two-player games and forms of model checking. It is crucial to keep in mind that planning is always performed in a domain, which can be thought of as a specification of the possible environment reactions. Typically, such a domain specification is Markovian, i.e., the possible reactions of the environment depend on the current state and the agent action only. However, forms of non-Markovian domains are also possible and, in fact, of great interest. Note that the domain can be fully observable, partially observable, or not observable at all. Each of these settings leads to different forms of planning and different techniques for solving it. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions, devise and implement techniques for synthesizing strategies that fulfil temporally extended goals in a planning domain, based on both AI and formal methods methodologies. Method: Reasoning about Actions, Planning, Model Checking, Reactive Synthesis. Existing work and references: Giuseppe De Giacomo, Sasha Rubin: Automata-Theoretic Foundations of FOND Planning for LTLf and LDLf Goals. IJCAI 2018: 4729-4735 Giuseppe De Giacomo, Moshe Y. Vardi: LTLf and LDLf Synthesis under Partial Observability. IJCAI 2016: 1044-1050 Giuseppe De Giacomo, Moshe Y. Vardi: Synthesis for LTL and LDL on Finite Traces. IJCAI 2015: 1558-1564 Giuseppe De Giacomo, Moshe Y. Vardi: Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. IJCAI 2013: 854-860 Giuseppe De Giacomo, Moshe Y. Vardi: Automata-Theoretic Approach to Planning for Temporally Extended Goals. ECP 1999: 226-238 | ||||||||||
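The game-theoretic core mentioned above can be illustrated with the classical attractor construction for a two-player reachability game; the tiny arena below is invented, and real synthesis tools work on arenas compiled from LTLf automata.

```python
# Minimal attractor computation for a two-player reachability game: the set
# of nodes from which the agent can force reaching the target set.
def attractor(nodes, edges, agent_nodes, targets):
    attr = set(targets)
    changed = True
    while changed:
        changed = False
        for v in nodes:
            if v in attr:
                continue
            succs = edges.get(v, set())
            if v in agent_nodes:
                ok = bool(succs & attr)              # agent picks a good move
            else:
                ok = bool(succs) and succs <= attr   # environment cannot escape
            if ok:
                attr.add(v)
                changed = True
    return attr

nodes = {"s0", "s1", "s2", "goal"}
edges = {"s0": {"s1", "s2"}, "s1": {"goal"}, "s2": {"s0"}, "goal": set()}
agent_nodes = {"s0"}   # the agent moves in s0; the environment moves elsewhere
print(attractor(nodes, edges, agent_nodes, {"goal"}))   # winning region
```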
![]() |
Reactive Program Synthesis Under Environment Specifications in Linear Time Logics on Finite and Infinite Traces | Giuseppe De Giacomo | Data and Knowledge | C | MSc | “Reactive synthesis” is the automated synthesis of programs for interactive/reactive ongoing computations (protocols, operating systems, controllers, robots, etc.). This kind of synthesis is deeply related to planning in nondeterministic environments. In this project we are interested in reactive synthesis under environment specifications, where the agent can take advantage of knowledge about the environment's behaviour in achieving its tasks. We consider how to obtain the game arena out of the LTL and LTLf specifications of both the agent task and the environment behaviours, and how to solve safety games, reachability games, games for LTLf objectives, and games for objectives expressed in fragments of LTL. We also want to understand how winning regions of such arenas are related to the notion of “being able” to achieve desired properties (without necessarily committing to a specific strategy for doing so). We focus on agent tasks that eventually terminate and hence are specified in LTLf, while for the environment we focus on safety specifications and limited forms of guarantee, reactivity, fairness, and stability. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions, devise and implement reactive synthesis techniques for fulfilling agent tasks in environments with specified behaviour. Method: Reactive Synthesis, Temporal Logics, 2-Player Games, Reasoning about Actions, Planning, Model Checking. Existing work and references: Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, Sasha Rubin: Planning under LTL Environment Specifications. ICAPS 2019: 31-39 Giuseppe De Giacomo, Antonio Di Stasio, Moshe Y. Vardi, Shufang Zhu: Two-Stage Technique for LTLf Synthesis Under LTL Assumptions. KR 2020: 304-314 Giuseppe De Giacomo, Moshe Y. Vardi: Synthesis for LTL and LDL on Finite Traces. IJCAI 2015: 1558-1564 Giuseppe De Giacomo, Moshe Y. Vardi: Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. IJCAI 2013: 854-860 Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin: Best-Effort Synthesis: Doing Your Best Is Not Harder Than Giving Up. IJCAI 2021: 1766-1772 Shufang Zhu, Giuseppe De Giacomo: Synthesis of Maximally Permissive Strategies for LTLf Specifications. IJCAI 2022: 2783-2789 Shufang Zhu, Giuseppe De Giacomo: Act for Your Duties but Maintain Your Rights. KR 2022 Giuseppe De Giacomo, Antonio Di Stasio, Francesco Fuggitti, Sasha Rubin: Pure-Past Linear Temporal and Dynamic Logic on Finite Traces. IJCAI 2020: 4959-4965 | ||||||||||
![]() |
Reactive Program Synthesis and Planning under Multiple Environments | Giuseppe De Giacomo | Data and Knowledge | C | MSc | In this project we consider an agent that operates with multiple models of the environment, e.g., two of them: one that captures expected behaviours and one that captures additional exceptional behaviours. We want to study the problem of synthesizing agent strategies that enforce a goal against environments operating as expected, while also making a best effort against exceptional environment behaviours. We want to formalize these concepts in the context of linear temporal logic (especially on finite traces) and give algorithms for solving this problem. More generally, we want to formalize and solve synthesis in the case of multiple, even contradicting, assumptions about the environment. One solution concept is based on "best-effort strategies", which are agent plans that, for each of the environment specifications individually, achieve the agent goal against a maximal set of environments satisfying that specification. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions, devise and implement reactive synthesis techniques that are effective under multiple environments. Method: Reactive Synthesis, Temporal Logics, 2-Player Games, Reasoning about Actions, Planning, Model Checking. Existing work and references: Benjamin Aminof, Giuseppe De Giacomo, Alessio Lomuscio, Aniello Murano, Sasha Rubin: Synthesizing Best-effort Strategies under Multiple Environment Specifications. KR 2021: 42-51 Benjamin Aminof, Giuseppe De Giacomo, Alessio Lomuscio, Aniello Murano, Sasha Rubin: Synthesizing strategies under expected and exceptional environment behaviors. IJCAI 2020: 1674-1680 Daniel Alfredo Ciolek, Nicolás D'Ippolito, Alberto Pozanco, Sebastian Sardiña: Multi-Tier Automated Planning for Adaptive Behavior. ICAPS 2020: 66-74 Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin: Best-Effort Synthesis: Doing Your Best Is Not Harder Than Giving Up. IJCAI 2021: 1766-1772 Benjamin Aminof, Giuseppe De Giacomo, Aniello Murano, Sasha Rubin: Planning under LTL Environment Specifications. ICAPS 2019: 31-39 Paolo Felli, Giuseppe De Giacomo, Alessio Lomuscio: Synthesizing Agent Protocols from LTL Specifications Against Multiple Partially-Observable Environments. KR 2012 Giuseppe De Giacomo, Moshe Y. Vardi: Synthesis for LTL and LDL on Finite Traces. IJCAI 2015: 1558-1564 Giuseppe De Giacomo, Moshe Y. Vardi: Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. IJCAI 2013: 854-860 | ||||||||||
![]() |
Reinforcement Learning under Non-Markovian Safety Specifications and Rewards Expressed in Linear Temporal Logics on Finite Traces | Giuseppe De Giacomo | Data and Knowledge | C | MSc | In some cases, the agent has a simulator of the environment instead of a formal specification, so it needs to learn its strategies to achieve its task in the environment. Sometimes even the task is only implicitly specified through rewards. The key issue is that the types of properties we are often interested in are non-Markovian (e.g., specified in LTL or LTLf), and hence we need to introduce non-Markovian characteristics in decision processes and reinforcement learning. A particularly promising direction is when such non-Markovian characteristics can be expressed in pure-past linear temporal logics. Having taken the Foundation of Self-Programming and Computer-Aided Formal Verification courses is not required but will be of help. Research topic/question and expected contribution: under suitable assumptions, devise and implement reinforcement learning techniques that remain safe with respect to a safety specification and achieve rewards specified in linear temporal logics on finite traces. Method: Reinforcement Learning, MDPs, Non-Markovian Rewards, Non-Markovian Decision Processes, Linear Temporal Logics. Existing work and references: Ronen I. Brafman, Giuseppe De Giacomo, Fabio Patrizi: LTLf/LDLf Non-Markovian Rewards. AAAI 2018: 1771-1778 Giuseppe De Giacomo, Moshe Y. Vardi: Synthesis for LTL and LDL on Finite Traces. IJCAI 2015: 1558-1564 Giuseppe De Giacomo, Luca Iocchi, Marco Favorito, Fabio Patrizi: Foundations for Restraining Bolts: Reinforcement Learning with LTLf/LDLf Restraining Specifications. ICAPS 2019: 128-136 Mohammed Alshiekh, Roderick Bloem, Rüdiger Ehlers, Bettina Könighofer, Scott Niekum, Ufuk Topcu: Safe Reinforcement Learning via Shielding. AAAI 2018: 2669-2678 Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi: Imitation Learning over Heterogeneous Agents with Restraining Bolts. ICAPS 2020: 517-521 Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi, Alessandro Ronca: Temporal Logic Monitoring Rewards via Transducers. KR 2020: 860-870 Giuseppe De Giacomo, Antonio Di Stasio, Francesco Fuggitti, Sasha Rubin: Pure-Past Linear Temporal and Dynamic Logic on Finite Traces. IJCAI 2020: 4959-4965 Giuseppe De Giacomo, Moshe Y. Vardi: Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. IJCAI 2013: 854-860 | ||||||||||
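A minimal sketch of the standard trick for making a non-Markovian reward Markovian (the idea behind LTLf reward specifications and restraining bolts): pair the environment state with the state of an automaton tracking the temporal property. The DFA, observations, and reward below are all invented.

```python
# Sketch: a DFA tracks the (invented) temporal property "a then b"; reward is
# given when the DFA first reaches its accepting state. An RL agent would
# learn over pairs (env_state, dfa_state), restoring the Markov property.
DFA = {  # (dfa_state, observation) -> next dfa_state; state 2 accepts
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
}

def dfa_step(state, obs):
    next_state = DFA.get((state, obs), state)
    reward = 1.0 if next_state == 2 and state != 2 else 0.0  # reward on acceptance
    return next_state, reward

# Here we simply replay a fixed observation trace instead of learning.
q, total = 0, 0.0
for obs in ["b", "a", "a", "b"]:
    q, r = dfa_step(q, obs)
    total += r
print(q, total)   # 2 1.0
```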
![]() |
Analysis of Schelling segregation models | Edith Elkind | Artificial Intelligence and Machine Learning | MSc | In Schelling's model of strategic segregation, agents are placed on a highly regular graph (such as a line or a grid), and each agent belongs to one of k types. Agents have a preference towards being surrounded by agents who belong to their own type, and may change locations if they are not happy at their current location (by moving to an empty location or swapping with another discontent agent). Many variants of this basic model have been considered over the years. The goal of this project is to investigate, theoretically and empirically, the degree of diversity of stable outcomes in Schelling's model, as well as to explore novel variants of the model where agents' preferences may evolve over time. | |||||||||||
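A minimal simulation of one Schelling variant (all parameters arbitrary): unhappy agents on a line jump to a random empty cell, and segregation is measured as the fraction of adjacent occupied pairs sharing a type.

```python
# Minimal Schelling-style simulation on a line: an agent is happy if it has
# no occupied neighbours or at least one neighbour of its own type.
import random

random.seed(0)
cells = ["A"] * 24 + ["B"] * 24 + [None] * 12
random.shuffle(cells)

def happy(i):
    neigh = [cells[j] for j in (i - 1, i + 1) if 0 <= j < len(cells) and cells[j]]
    return not neigh or any(t == cells[i] for t in neigh)

for _ in range(2000):
    i = random.choice([k for k, c in enumerate(cells) if c])
    if not happy(i):
        j = random.choice([k for k, c in enumerate(cells) if c is None])
        cells[j], cells[i] = cells[i], None       # move to an empty cell

pairs = [(a, b) for a, b in zip(cells, cells[1:]) if a and b]
print(sum(a == b for a, b in pairs) / len(pairs))  # typically well above 0.5
```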
![]() |
Stable roommates problem with structured preferences | Edith Elkind | Artificial Intelligence and Machine Learning | MSc | In the stable roommates problem, there are k rooms of varying sizes and n agents, who need to be allocated places in these rooms; it is sometimes assumed that the total number of places is exactly n. Agents may have preferences over rooms as well as potential roommates and may move between rooms so as to improve their assignment. The goal of the project is to understand the complexity of finding stable outcomes in such settings, assuming that agents' preferences over assignments have a simple structure (e.g., each agent wants to maximize the number of friends in their room), and to compare the best social welfare achievable by a centralized assignment with that achievable in a stable assignment. | |||||||||||
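A toy version of the friends-based setting, as a sketch with invented data: agents greedily move to the room containing most of their friends, capacities permitting. Each strictly improving move increases the total number of co-located friend pairs, so this dynamic terminates in a stable assignment.

```python
# Best-response dynamics for friends-maximising room assignment (toy data).
friends = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3}, 5: set()}
capacity = {"r1": 3, "r2": 3}
room_of = {1: "r1", 2: "r2", 3: "r2", 4: "r1", 5: "r2"}

def utility(agent, room):
    return sum(1 for f in friends[agent] if room_of.get(f) == room)

changed = True
while changed:
    changed = False
    for a in friends:
        here = room_of[a]
        for r in capacity:
            free = capacity[r] - sum(1 for b in room_of if room_of[b] == r)
            if r != here and free > 0 and utility(a, r) > utility(a, here):
                room_of[a] = r          # a strictly improving unilateral move
                changed = True
print(room_of)   # stable: no agent wants to move to a room with spare places
```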
![]() |
"Beyond MCMC -- scalable and approximate Bayesian inference for computational statistics in global health" | Seth Flaxman | Artificial Intelligence and Machine Learning | B | C | MSc | Co-supervisor: Dr Swapnil Mishra (https://s-mishra.github.io/) In applied work, especially disease modeling, we have reached the limits of what standard MCMC samplers can solve in a reasonable amount of computational time. We will investigate a variety of recently proposed inference schemes, from variational Bayes to deep learning ensemble models to parallel implementations of Sequential Monte Carlo, applying them to popular models in biostatistics, especially multilevel / hierarchical models. The goal will not be just to figure out "what works" but to understand the shortcomings of existing tools, and come up with guidance for practitioners. Co-supervisor |
|||||||||
![]() |
"Why are there so many methods for time series prediction and forecasting?" | Seth Flaxman | Artificial Intelligence and Machine Learning | B | C | MSc | Co-supervised by Dr Swapnil Mishra (https://s-mishra.github.io/)
|
|||||||||
![]() |
Topics in Randomised Algorithms and Computational Complexity | Andreas Galanis | Algorithms and Complexity Theory | C | MSc | Description: Andreas Galanis is willing to supervise projects in the areas of randomised algorithms and computational complexity. Problems of interest include (i) the analysis of average case instances of hard combinatorial problems (example: can we satisfy a random Boolean formula?), and (ii) the computational complexity of counting problems (example: can we count approximately the number of proper colourings of a graph G?). The projects would suit mathematically oriented students, especially those with an interest in applying probabilistic methods to computer science. | ||||||||||
![]() |
Dynamic information design | Jiarui Gan | Artificial Intelligence and Machine Learning | MSc | In information design, a more-informed player (sender) influences a less-informed decision-maker (receiver) by signalling information about the state of the world. The problem for the sender is to compute an optimal signalling strategy, which leads to the receiver taking actions that benefit the sender. Dynamic information design, as a new frontier of information design, generalises the one-shot framework to dynamic settings that are modelled based on Markov decision processes. The goal of the project is to study several variants of the dynamic information design problem; it can be approached from both theoretical and empirical perspectives. Theoretically, the focus is on determining the computational complexity of optimal information design in different dynamic settings and developing algorithms. A background in computational complexity and algorithm design is beneficial. Practically, the objective is to apply existing algorithms to novel applications, such as traffic management or board games, and to develop algorithms that work effectively in real-world scenarios using state-of-the-art methods. Knowledge of Markov decision processes, stochastic/sequential games, and reinforcement learning is preferred. Related work: J. Gan, R. Majumdar, G. Radanovic, A. Singla: Bayesian persuasion in sequential decision-making. AAAI 2022 E. Kamenica, M. Gentzkow: Bayesian persuasion. American Economic Review, 2011 S. Dughmi: Algorithmic information structure design: a survey. ACM SIGecom Exchanges 15.2 (2017): 2-24 J. Wu, Z. Zhang, Z. Feng, Z. Wang, Z. Yang, M. I. Jordan, H. Xu: Sequential Information Design: Markov Persuasion Process and Its Efficient Reinforcement Learning. arXiv preprint arXiv:2202.10678 (2022) | |||||||||||
![]() |
Data Science and Machine Learning Techniques for Model Parameterisation in Biomedical and Environmental Applications | David Gavaghan, Martin Robinson, Michael Clerx | Computational Biology and Health Informatics | B | C | MSc | Time series data arise as the output of a wide range of scientific experiments and clinical monitoring techniques. Typically the system under study will either be undergoing time-varying changes which can be recorded, or the system will have a time-varying signal as input and the response signal will be recorded. Familiar everyday examples of the former include ECG and EEG measurements (which record the electrical activity in the heart or brain as a function of time), whilst examples of the latter occur across scientific research from cardiac cell modelling to battery testing. Such recordings contain valuable information about the underlying system under study, and gaining insight into the behaviour of that system typically involves building a mathematical or computational model of that system, which will have embedded within it key parameters governing system behaviour. The problem that we are interested in is inferring the values of these key parameters through applications of techniques from machine learning and data science. Currently used methods include Bayesian inference (Markov chain Monte Carlo (MCMC), Approximate Bayesian Computation (ABC)) and non-linear optimisation techniques, although we are also exploring the use of other techniques such as probabilistic programming and Bayesian deep learning. We are also interested in developing techniques that will speed up these algorithms, including parallelisation and the use of Gaussian process emulators of the underlying models. Application domains of current interest include modelling of the cardiac cell (for assessing the toxicity of new drugs), understanding how biological enzymes work (for application in developing novel fuel cells), as well as a range of basic science problems. Prerequisites: some knowledge of Python | |||||||||
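As a small, self-contained illustration of one of the methods listed (rejection ABC), here is a sketch in which the model, data, and tolerance are all synthetic: we infer a decay rate k for y(t) = exp(-k t) + noise by accepting prior draws whose simulations land close to the observed series.

```python
# Minimal rejection ABC for one parameter of a toy time-series model.
import math
import random

random.seed(1)
ts = [0.1 * i for i in range(50)]
true_k = 1.3
data = [math.exp(-true_k * t) + random.gauss(0, 0.02) for t in ts]

def simulate(k):
    return [math.exp(-k * t) + random.gauss(0, 0.02) for t in ts]

def distance(a, b):   # root-mean-square distance between series
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

posterior = []
while len(posterior) < 200:
    k = random.uniform(0.0, 3.0)             # draw from a flat prior
    if distance(simulate(k), data) < 0.035:  # accept if simulation is close
        posterior.append(k)

print(sum(posterior) / len(posterior))       # posterior mean, near 1.3
```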
![]() |
Topics in Continuous Maths, Discrete Maths, Linear Algebra, Object-oriented Programming, Machine Learning and Probability and Computing | David Gavaghan | Computational Biology and Health Informatics | B | C | Prof Dave Gavaghan is willing to supervise in the area of Topics in Continuous Maths, Discrete Maths, Linear Algebra, Object-oriented Programming, Machine Learning and Probability and Computing. | ||||||||||
![]() |
Cake-cutting with low envy | Paul Goldberg | Algorithms and Complexity Theory | C | Cake-cutting refers to the design of protocols to share a divisible good amongst a collection of agents. A standard starting-point for cake-cutting protocols is the classical "I cut, you choose" rule. This rule is said to be "envy-free" since each player can ensure that they value their own share at least as much as they value the other player's share. Well-known further work has extended this idea to more than 2 players. In the paper at the URL below, we identify various classes of protocols, and show that one can convert protocols from one kind to another so as to maintain the worst-case level of envy that results. https://arxiv.org/abs/2108.03641 The project is to look at the computational complexity of evaluating the worst-case level of envy that can result from using a given protocol, and related questions about protocols belonging to classes of interest. The project is mainly based on mathematical analysis as opposed to computational experiment, but there is scope for some computational experiment, for example in searching for valuation functions that result in high envy. | |||||||||||
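A sketch of the classical "I cut, you choose" rule with valuation densities over [0, 1], which could serve as a starting point for the kind of computational experiment mentioned (the valuation functions below are invented):

```python
# "I cut, you choose": the cutter splits the cake into halves of equal value
# to them; the chooser takes the piece they value more. Neither player envies
# the other: the cutter is indifferent, the chooser picked their better piece.
def value(density, a, b, step=1e-4):
    # integrate a valuation density over [a, b] with a midpoint Riemann sum
    n = max(1, int((b - a) / step))
    return sum(density(a + (i + 0.5) * (b - a) / n) for i in range(n)) * (b - a) / n

def cut_point(density, lo=0.0, hi=1.0):
    # binary search for x with value([0, x]) = half the cutter's total value
    total = value(density, 0.0, 1.0)
    for _ in range(40):
        mid = (lo + hi) / 2
        if value(density, 0.0, mid) < total / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

cutter = lambda x: 1.0        # values the cake uniformly
chooser = lambda x: 2.0 * x   # prefers the right-hand end

x = cut_point(cutter)
left, right = value(chooser, 0, x), value(chooser, x, 1)
print(round(x, 3), "chooser takes", "left" if left >= right else "right")
```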
![]() |
Learning probabilistic automata | Paul Goldberg | Algorithms and Complexity Theory | C | Probabilistic automata are a means of producing randomly-generated strings that are accepted/recognised by a finite automaton. (They are conceptually similar to hidden Markov models.) The general topic of the project is to find algorithms to reconstruct an automaton based on observations of strings generated by that automaton. It is a topic that has led to a steady stream of papers in the research literature, and the project would focus on one of the many variants of the problem, for example "labelled automata", in which partial information is provided about the current state of the automaton in the form of random labels associated with states. Another general issue is how to efficiently extract the extra information present in long strings generated by the unknown automaton. Within this project topic, there is scope for focusing either on experiments, or on mathematical analysis of algorithms. In the latter case, it would be helpful to know the basic concepts of computational learning theory. | |||||||||||
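For concreteness, a sketch of the generative side of the problem with an invented two-state machine: each state has a distribution over (symbol, next state) moves plus a fixed stopping probability, and a learner would only observe the emitted strings.

```python
# Sampling strings from a small probabilistic automaton (all weights made up).
import random

random.seed(4)
STOP = 0.2
MOVES = {  # state -> list of ((symbol, next_state), probability)
    0: [(("a", 0), 0.6), (("b", 1), 0.4)],
    1: [(("b", 1), 0.7), (("a", 0), 0.3)],
}

def sample(max_len=30):
    state, out = 0, []
    while len(out) < max_len and random.random() > STOP:
        moves, weights = zip(*MOVES[state])
        sym, state = random.choices(moves, weights=weights, k=1)[0]
        out.append(sym)
    return "".join(out)

corpus = [sample() for _ in range(5)]
print(corpus)
# A learner would try to recover states and transition weights from such a
# corpus, e.g. starting from symbol bigram frequencies.
```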
![]() |
Medical information extraction with deep neural networks | Paul Goldberg, Alejo Nevado-Holgado | Algorithms and Complexity Theory | MSc | Background. Electronic Health Records (EHRs) are the databases used by hospitals and general practitioners to log daily all the information they record from patients (i.e. disorders, taken medications, symptoms, medical tests…). Most of the information held in EHRs is in the form of natural language text (written by the physician during each session with each patient), making it inaccessible for research. Unlocking all this information would bring a very significant advancement to biomedical research, multiplying the quantity and variety of scientifically usable data, which is the reason why major efforts have relatively recently been initiated towards this aim (e.g. I2B2 challenges https://www.i2b2.org/NLP/ or the UK-CRIS network of EHRs https://crisnetwork.co/uk-cris-programme). Project. Recent Deep Neural Network (DNN) architectures have shown remarkable results in traditionally unsolved NLP problems, including some Information Extraction tasks such as Slot Filling [Mesnil et al] and Relation Classification [dos Santos et al]. When transferring this success to EHRs, DNNs offer the advantage of not requiring well-formatted text, while the problem remains that labelled data is scarce (ranging in the hundreds for EHRs, rather than the tens of thousands used in typical DNN studies). However, ongoing work in our lab has shown that certain extensions of recent NLP-DNN architectures can reproduce the typical remarkable success of DNNs in situations with limited labelled data (paper in preparation). Namely, incorporating interaction terms into feed-forward DNN architectures [Denil et al] can raise the performance of relation classification on I2B2 datasets from 0.65 F1 score to 0.90, while the highest performance previously reported with the same dataset was 0.74. We therefore propose to apply DNNs to the problem of information extraction in EHRs, using I2B2 and UK-CRIS data as a testbed. More specifically, the DNNs designed and implemented by the student should be able to extract medically relevant information, such as prescribed drugs or diagnoses given to patients. This corresponds to some of the challenges proposed by I2B2 during recent years (https://www.i2b2.org/NLP/Medication/), and these are objectives of high interest in UK-CRIS which have sometimes been addressed with older techniques such as rules [Iqbal et al, Jackson et al]. The student is free to use the extension of the feed-forward DNN developed in our lab, or to explore other feed-forward or recurrent (e.g. RNN, LSTM or GRU) alternatives. The DNN should be implemented in Python’s Keras, Theano, Tensorflow, or PyTorch. Bibliography G. Mesnil et al., Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding, IEEE/ACM Trans. Audio Speech Lang. Process. 23 (2015) 530–539. doi:10.1109/TASLP.2014.2383614. C.N. dos Santos, B. Xiang, B. Zhou, Classifying Relations by Ranking with Convolutional Neural Networks, CoRR abs/1504.06580 (2015). http://arxiv.org/abs/1504.06580. M. Denil, A. Demiraj, N. Kalchbrenner, P. Blunsom, N. de Freitas, Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network, CoRR abs/1406.3830 (2014). http://arxiv.org/abs/1406.3830. E. Iqbal, R. Mallah, R.G. Jackson, M. Ball, Z.M. Ibrahim, M. Broadbent, O. Dzahini, R. Stewart, C. Johnston, R.J.B. Dobson, Identification of Adverse Drug Events from Free Text Electronic Patient Records and Information in a Large Mental Health Case Register, PLOS ONE. 10 (2015) e0134208. doi:10.1371/journal.pone.0134208. R.G. Jackson MSc, M. Ball, R. Patel, R.D. Hayes, R.J. Dobson, R. Stewart, TextHunter – A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research, AMIA Ann. Symp. Proc. 2014 (2014) 729–738. | |||||||||||
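A minimal Keras skeleton of the kind of classifier involved, hedged heavily: the token ids and relation labels below are random stand-ins, not I2B2 or UK-CRIS data, and a real model would add pretrained embeddings, entity markers, and the interaction terms discussed above.

```python
# Skeleton sentence-level relation classifier (stand-in data only).
import numpy as np
import tensorflow as tf

VOCAB, SEQ_LEN, N_CLASSES = 5000, 40, 3
x = np.random.randint(1, VOCAB, size=(256, SEQ_LEN))   # fake token ids
y = np.random.randint(0, N_CLASSES, size=(256,))       # fake relation labels

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),  # recurrent encoder
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=32, verbose=0)
print(model.predict(x[:1]).shape)   # (1, 3): class probabilities per relation
```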
![]() |
Neural network genomics | Paul Goldberg, Alejo Nevado-Holgado | Algorithms and Complexity Theory | MSc | Background. For more than 10 years, GWAS studies have represented a revolution in the study of human disorders and human phenotypes. By measuring how your risk of suffering any given disease changes according to SNP mutations, GWAS studies can measure how relevant each gene is to the disease under study. However, this SNP-disease association is calculated “one SNP at a time”, and ignores all gene-gene interactions that are often crucially important to the causation of the disorder. If an interaction between a number of genes (e.g. insulin and its receptor in type 2 diabetes; or APP and alpha-, beta- and gamma-secretase in Alzheimer’s…) is what produces a given disorder, this interaction will not be detected in a GWAS analysis. This shortcoming may not be a problem in monogenic hereditary disorders, such as Huntington’s disease, where mutations in a single gene by themselves are enough to cause the disease. However, GWAS studies will likely not uncover the mechanisms of complex disorders, where the disease emerges from the interaction of a number of genes. This is likely the source of the so-called “missing heritability” problem observed in most complex traits or diseases, where all the SNP variations taken together cannot account for most of the heritability of a given trait or disease [1]. In addition, it has been demonstrated that complex traits such as height and BMI are clearly and strongly heritable [2], but GWAS studies simply cannot detect most of this heritability. In summary, GWAS analyses detect simple individual genetic factors, but not interactions between genetic factors. Project. In this project, we propose to identify the interacting genetic factors behind Alzheimer’s disease and other neurodegenerations with neural networks, which are known for exploiting the interactions present in the to-be-analysed data. While the linear models used in GWAS studies are able to identify only linear and monovariate contributions of each gene to a disorder, neural networks can analyse how genes interact with each other to explain the studied disorder. This versatility is what has allowed neural networks to find such widespread use in industry in the last decade, where they are revolutionizing image, sound and language analysis [3–5]. In our laboratories we already have the hardware and the expertise required to build neural networks, and we have used them in other research areas relevant to Alzheimer’s disease. In addition, we have access to and experience using UK Biobank, which is the ideal dataset to implement this project. UK Biobank has measured wide-array SNPs, all disorders and demographics in ~500,000 participants between the ages of 37 and 73, and more than 5000 of them suffer from Alzheimer’s disease, Parkinson’s disease or other neurodegenerations. We propose that the MSc student build a neural network to predict either diagnoses or disease-related endophenotypes (i.e. brain volumes of affected areas, cognitive scores…) of each one of these participants, using only the information present in the wide-array SNPs and relevant demographics. The student is free to use the extension of the feed-forward DNN developed in our lab, or to explore other feed-forward or recurrent (e.g. RNN, LSTM or GRU) alternatives. The DNN should be implemented in Python’s Keras (https://keras.io/), Theano (http://deeplearning.net/software/theano/), Tensorflow (https://www.tensorflow.org/), or PyTorch (http://pytorch.org/). Bibliography [1] T.A. Manolio, F.S. Collins, N.J. Cox, D.B. Goldstein, L.A. Hindorff, D.J. Hunter, M.I. McCarthy, E.M. Ramos, L.R. Cardon, A. Chakravarti, J.H. Cho, A.E. Guttmacher, A. Kong, L. Kruglyak, E. Mardis, C.N. Rotimi, M. Slatkin, D. Valle, A.S. Whittemore, M. Boehnke, A.G. Clark, E.E. Eichler, G. Gibson, J.L. Haines, T.F.C. Mackay, S.A. McCarroll, P.M. Visscher, Finding the missing heritability of complex diseases, Nature. 461 (2009) 747–753. doi:10.1038/nature08494. [2] D. Cesarini, P.M. Visscher, Genetics and educational attainment, Npj Sci. Learn. 2 (2017). doi:10.1038/s41539-017-0005-6. [3] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature. 521 (2015) 436–444. doi:10.1038/nature14539. [4] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau, S. Thrun, Dermatologist-level classification of skin cancer with deep neural networks, Nature. 542 (2017) 115–118. doi:10.1038/nature21056. [5] V. Mnih, K. Kavukcuoglu, D. Silver, A.A. Rusu, J. Veness, M.G. Bellemare, A. Graves, M. Riedmiller, A.K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis, Human-level control through deep reinforcement learning, Nature. 518 (2015) 529–533. doi:10.1038/nature14236. | |||||||||||
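A stand-in sketch of the proposed genotype-to-phenotype model (simulated data only): an MLP over 0/1/2 SNP dosage codes, with the label driven by a two-SNP interaction of exactly the kind a one-SNP-at-a-time analysis would miss.

```python
# Toy SNP-array classifier; the "phenotype" depends on an interaction of two
# SNPs, which a nonlinear model should be able to pick up.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
n, n_snps = 2000, 100
X = rng.integers(0, 3, size=(n, n_snps)).astype("float32")  # 0/1/2 dosages
y = ((X[:, 3] > 0) & (X[:, 7] > 0)).astype("float32")       # interaction label

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(n_snps,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),          # case/control
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))   # loss and accuracy on the toy data
```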
![]() |
A Conceptual Model for Assessing Privacy Risk | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | Privacy is not a binary concept; the level of privacy enjoyed by an individual or organisation will depend upon the context within which it is being considered: the more data an attacker has access to, the more potential there may be for privacy compromise. We lack a model which considers the different contexts that exist in current systems, which would underpin a measurement system for determining the level of privacy risk that might be faced. This project would seek to develop a prototype model – based on a survey of known privacy breaches and common practices in data sharing. The objective is to propose a method by which privacy risk might be considered, taking into consideration the variety of (threat and data-sharing) contexts that any particular person or organisation might be subjected to. It is likely that a consideration of the differences and similarities of the individual or organisational points of view will need to be made, since the nature of contexts faced could be quite diverse. | |||||||||
![]() |
Computer Vision for Physical Security | Michael Goldsmith, Sadie Creese | Security | B | C | MSc | Computer Vision allows machines to recognise objects in real-world footage. In principle, this allows machines to flag potential threats in an automated fashion based on historical and present images in video footage of real-world environments. Automated threat detection mechanisms to help security guards identify threats would be of tremendous help to them, especially if they have to temporarily leave their post, or have to guard a significant number of areas. In this project, students are asked to implement a system that is able to observe a real environment over time and attempt to identify potential threats, independent of a number of factors, e.g. lighting conditions or wind conditions. The student is encouraged to approach this challenge as they see fit, but would be expected to design, implement and assess any methods they develop. One approach might be to implement a camera system using e.g. a web camera or a Microsoft Kinect to conduct anomaly detection on real-world environments, and flag any issues related to potential threats. Requirements: Programming skills required | |||||||||
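One very simple baseline a student might start from, sketched with OpenCV (thresholds arbitrary; a real system would need robustness to lighting, weather, and camera noise): flag frames where many pixels change between consecutive greyscale frames.

```python
# Frame-differencing motion flag: alert when >2% of pixels change noticeably.
import cv2

cap = cv2.VideoCapture(0)            # default webcam
ok, prev = cap.read()
if not ok:
    raise SystemExit("no camera found")
prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray, prev)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    if cv2.countNonZero(mask) > 0.02 * mask.size:
        print("possible activity detected")
    prev = gray
    cv2.imshow("changed pixels", mask)
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```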
![]() |
Considering the performance limiters for cybersecurity controls | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | The behaviour, and so effectiveness, of security controls is dependent upon how they are used. One obvious example: the protection afforded by a firewall depends upon the maintenance of the rules that determine what the firewall stops and what it does not. The benefit (to the organisation or user seeking to manage risk) of various technical controls in an operational context lacks good evidence and data, so there is scope to consider the performance of controls in a lab environment. This mini-project would select one or more controls from the CIS Top 20 Critical Security Controls (CSC) (version 6.1) and seek to develop laboratory experiments (and implement them) to gather data on how the effectiveness of the control is impacted by its deployment context (including, for example, configuration, dependence on other controls, nature of the threat faced). Requirements: Students will need an ability to develop a test-suite and deploy the selected controls. | |||||||||
![]() |
Cybersecurity visualization | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | Cybersecurity visualization helps analysts and risk owners alike to make better decisions about what to do when the network is attacked. In this project the student will develop novel cybersecurity visualizations. The student is free to approach the challenge as they see fit, but would be expected to design, implement and assess the visualizations they develop. These projects tend to have a focus on network traffic visualization, but the student is encouraged to visualize datasets they would be most interested in. Other interesting topics might for instance include: host-based activities (e.g. CPU/RAM/network usage statistics), network scans (incl. vulnerabilities scanning). Past students have visualized network traffic patterns and android permission violation patterns. Other projects on visualizations are possible (not just cybersecurity), depending on interest and inspiration. Requirements: Programming skills required. | |||||||||
![]() |
Detecting disguised processes using Application-Behaviour Profiling | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | MSc | In order to avoid detection, malware can disguise itself as a legitimate program or hijack system processes to reach its goals. Commonly used signature-based Intrusion Detection Systems (IDS) struggle to distinguish between these processes and are thus only of limited use to detect these kinds of attacks. They also have the shortcoming that they need to be updated frequently to possess the latest malware definitions. This makes them inherently prone to missing novel attacks. Anomaly-based IDSs, however, overcome this problem by maintaining a ground truth of normal application behaviour and reporting deviations as anomalies. In this project, students will be tasked to investigate how a process’ behaviour can be profiled in the attempt to identify whether it is behaving anomalously and whether it can be correctly identified as malware. This project will build on existing research in this area. Requirements: Programming skills required. | |||||||||||
![]() |
Experimenting with anomaly detection features for detecting insider attacks | Michael Goldsmith, Ioannis Agrafiotis, Sadie Creese, Arnau Erola | Security | B | C | MSc | This project will use an anomaly detection platform being developed by the Cyber Security Analytics Group to consider relative detection performance using different feature sets, and different anomalies of interest, in the face of varying attacks. This research would be experimental in nature and conducted with a view to exploring the minimal sets that would result in detection of a particular threat. Further reflection would then be given to how generalisable this result might be and what we might determine about the critical datasets required for this kind of control to be effective. | |||||||||
![]() |
Eyetracker data analysis | Michael Goldsmith, Ivan Martinovic | Security | MSc | Eyetracking allows researchers to identify where observers look on a monitor. A challenge in analysing eyetracker output, however, is identifying patterns of viewing behaviour from the raw eyetracker logs over time and what they mean in a research context, particularly as no semantic information about the pixel being viewed is considered. In practice, this means that we know where someone is looking, but nothing about what it means to look at that pixel or the order of pixels. From a research perspective, being able to tag areas of interest, and what those areas mean, or to conduct other statistical analysis on viewing patterns would be of significant use to analysing results, particularly if these results could be automated. The purpose of this project is to conduct an eyetracker experiment, and design and implement useful methods to analyse eyetracker data. We will be able to provide training for the student, so they are able to use the eyetracking tools themselves. Other eyetracking projects are also possible, depending on interest and inspiration. | |||||||||||
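As an example of the kind of analysis method the project could produce, here is a sketch of dispersion-threshold fixation detection (I-DT) over raw gaze samples; the gaze coordinates and thresholds below are invented.

```python
# I-DT style fixation detection: a run of gaze samples counts as a fixation
# if it lasts at least min_samples and stays inside a small bounding box.
def dispersion(window):
    xs, ys = zip(*window)
    return (max(xs) - min(xs)) + (max(ys) - min(ys))

def fixations(samples, max_dispersion=30, min_samples=5):
    found, start = [], 0
    while start + min_samples <= len(samples):
        end = start + min_samples
        if dispersion(samples[start:end]) <= max_dispersion:
            while end < len(samples) and dispersion(samples[start:end + 1]) <= max_dispersion:
                end += 1                              # grow the fixation
            xs, ys = zip(*samples[start:end])
            found.append((sum(xs) / len(xs), sum(ys) / len(ys), end - start))
            start = end
        else:
            start += 1
    return found   # (centroid_x, centroid_y, duration_in_samples) triples

gaze = [(100, 100), (102, 99), (101, 101), (103, 100), (102, 102),
        (400, 300), (401, 299), (399, 301), (400, 300), (402, 300)]
print(fixations(gaze))   # two fixations, one per cluster
```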
![]() |
Formal threat and vulnerability analysis of a distributed ledger model | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | This project would utilise the process algebra CSP and associated model checker FDR to explore various types of threat and how they might successfully compromise a distributed ledger. This type of modelling would indicate possible attacks on a distributed ledger, and could guide subsequent review of actual designs and testing strategies for implementations. The modelling approach would be based on the crypto-protocol analysis techniques already developed for this modelling and analysis environment, and would seek to replicate the approach for a distributed ledger system. Novel work would include developing the model of the distributed ledger, considering which components are important, formulating various attacker models and also formulating the security requirements / properties to be assessed using this model-checking based approach. Requirements: In order to succeed in this project students would need to have a working knowledge of the machine readable semantics of CSP and also the model checker FDR. An appreciation of threat models and capability will need to be developed. | |||||||||
![]() |
High Dynamic Range Imaging for Documentation Applications | Michael Goldsmith, Sadie Creese | Security | B | C | MSc | High dynamic range imaging (HDRI) allows more accurate information about light to be captured, stored, processed and displayed to observers. In principle, this allows viewers to obtain more accurate representations of real-world environments and objects. Naturally, HDRI would be of interest to museum curators to document their objects, particularly non-opaque objects or those whose appearance alters significantly depending on the amount of lighting in the environment. Currently, few tools exist that aid curators, archaeologists and art historians to study objects under user-defined parameters in meaningful ways. In this project the student is free to approach the challenge as they see fit, but would be expected to design, implement and assess any tools and techniques they develop. The student will develop techniques to study these objects under user-specified conditions to enable curators and researchers to study the surfaces of these objects in novel ways. These methods may involve tone mapping or other modifications of light exponents to view objects under non-natural viewing conditions, so that surface details stand out in ways that are meaningful to curators. | |||||||||
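For illustration, a sketch of a global Reinhard-style tone-mapping operator of the sort such a tool might expose as one user-controlled rendering mode (the HDR values below are synthetic):

```python
# Global Reinhard-style tone mapping: scale by the log-average luminance,
# then compress highlights so all values land in [0, 1).
import numpy as np

rng = np.random.default_rng(0)
hdr = rng.exponential(scale=2.0, size=(4, 4))      # fake HDR radiances

def reinhard(img, exposure=1.0, eps=1e-6):
    lum = img * exposure
    log_avg = np.exp(np.mean(np.log(lum + eps)))   # log-average luminance
    scaled = lum / log_avg
    return scaled / (1.0 + scaled)                 # compresses highlights

print(reinhard(hdr, exposure=0.5).round(2))        # displayable values
```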
![]() |
Intelligent user activity timelines | Michael Goldsmith | Security | B | C | MSc | Operating systems store temporal data in multiple locations. Digital investigators are often tasked with reconstructing timelines of user activities. Timeline generation tools such as log2timeline can aid in extracting temporal data; similarly, 'professional' tools such as EnCase and Autopsy build and visualise low-level timelines. Collectively, these tools: (1) produce large volumes of low-level data, and (2) are not able to apply any form of reasoning. This project involves the extraction of temporal data and the application of reasoning algorithms to develop reliable event sequences of interest to an investigator. Useful references: Olsson, J. and Boldt, M., 2009. Computer forensic timeline visualization tool. Digital Investigation, 6, pp.S78-S87. Buchholz, F.P. and Falk, C., 2005, August. Design and Implementation of Zeitline: a Forensic Timeline Editor. In DFRWS. | |||||||||
![]() |
International Cybersecurity Capacity Building Initiatives | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | There is a large investment being made by the international community aimed at helping nations and regions to develop their capacity in cybersecurity. The work of the Global Cyber Security Capacity Centre (based at the Oxford Martin School) studies and documents this: https://www.sbs.ox.ac.uk/cybersecurity-capacity/content/front. There is scope to study in more detail the global trends in capacity building in cybersecurity, the nature of the work and the partnerships that exist to support it. An interesting analysis might be to identify what is missing (through comparison with the Cybersecurity Capacity Maturity Model, a key output of the Centre), and also to consider how strategic, or not, such activities appear to be. An extension of this project, or indeed a second parallel project, might seek to perform a comparison of the existing efforts with the economic and technology metrics that exist for countries around the world, exploring if the data shows any relationships exist between those metrics and the capacity building activities underway. This analysis would involve regression techniques. | |||||||||
![]() |
Modelling of security-related interactions, in CSP or more evocative notations such as Milner's bigraphs | Michael Goldsmith | Security | B | C | MSc | (Joint with Sadie Creese) Modelling of security-related interactions, in CSP or more evocative notations such as Milner's bigraphs (http://www.springerlink.com/content/axra2lvj4ph81bdm/; http://www.amazon.co.uk/The-Space-Motion-Communicating-Agents/dp/0521738334/ref=sr_1_2?ie=UTF8&qid=1334845525&sr=8-2). Concurrency and Computer Security are a distinct advantage. Appropriate for good 3rd or 4th year undergraduates or MSc. Open to suggestions for other interesting topics in cybersecurity, if anyone has a particular interest they would like to pursue. | |||||||||
![]() |
Organisational Resilience and Self-* re-planning | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | MSc | Key Performance Indicators (KPIs) are a type of performance measurement in organisations. They evaluate the success of activities in which an organisation or system engages. Often success is measured in terms of whether an activity has been able to complete and to repeat (e.g. zero defects, customer satisfaction or similar); other times it is measured in terms of making progress towards a strategic goal. Well-chosen KPIs allow us to reflect upon the performance of an organisation, and possibly identify potential future degradation issues in systems. In this project, students are tasked to investigate how KPIs can be used to measure and improve cyber-resilience in an organisation. In particular, we are interested in investigating how to measure the performance of an organisation with respect to both security and mission performance. After identifying which aspects of an organisation or the mission are prone to error, it may be beneficial to propose solutions for how these issues can be addressed. With "self-* re-planning", we believe it would be possible for a system to suggest and automate certain aspects of how the mission, infrastructure or organisation can be repurposed so as not to fail. The student is encouraged to approach this challenge as they see fit, but would be expected to design, implement and assess any methods they develop. For instance, the project may be network-security focused, mission-processes focused or otherwise. It may also investigate more formal approaches to self-* re-planning methods. Requirements: Programming and/or Social Science Research Methods. | |||||||||||
![]() |
Penetration testing for harm | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | Current penetration testing is typically utilised for discovering how organisations might be vulnerable to external hacks, and testing methods are driven by using techniques determined to be similar to approaches used by hackers. The result is a report highlighting various exploitable weak-points and how they might result in unauthorised access should a malign entity attempt to gain access to a system. Recent research within the cybersecurity analytics group has been studying the relationship between these kinds of attack surfaces and the kinds of harm that an organisation might be exposed to. An interesting question is whether an orientation around intent, or harm, might result in a different test strategy: would a different focus be given to the kinds of attack vectors explored in a test if a particular harm were aimed at? This mini-project would aim to explore these and other questions by designing penetration test strategies based on a set of particular harms, and would then seek to consider potential differences from current penetration practices in consultation with the professional community. Requirements: Students will need to have a working understanding of penetration testing techniques. | |||||||||
![]() |
Photogrammetry | Michael Goldsmith | Security | B | C | MSc | Photogrammetry is a set of techniques that allows for 3D measurements from 2D photographs, especially those measurements pertaining to geometry or surface colours. The purpose of this project is to implement one or more photogrammetry techniques from a series of 2D photographs. The student is free to approach the challenge as they see fit, but would be expected to design, implement and assess the tool they develop. | |||||||||
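One core photogrammetric building block, sketched with OpenCV (the camera intrinsics and 3D point below are invented): triangulating a 3D point from a matched pixel pair in two calibrated views.

```python
# Two-view triangulation: project a known 3D point into two cameras, then
# recover it from the pixel pair with cv2.triangulatePoints.
import numpy as np
import cv2

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])    # intrinsics
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])              # camera 1 at origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0], [0]])])  # 0.2 m baseline

X = np.array([[0.3], [0.1], [2.0], [1.0]])                     # true 3D point
x1 = (P1 @ X)[:2] / (P1 @ X)[2]                                # pixel in view 1
x2 = (P2 @ X)[:2] / (P2 @ X)[2]                                # pixel in view 2

Xh = cv2.triangulatePoints(P1, P2, x1, x2)                     # homogeneous 4x1
print((Xh[:3] / Xh[3]).ravel())                                # ~ [0.3 0.1 2.0]
```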
![]() |
Predicting exposure to risk for active tasks | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | Prior research has been considering how we might better understand and predict the consequences of cyber-attacks based on knowledge of the business processes, people and tasks and how they utilise the information infrastructure / digital assets that might be exposed to specific attack vectors. However, this can clearly be refined by moving to an understanding of those tasks live or active at the time of an attack propagating across a system. If this can be calculated, then an accurate model of where risk may manifest and the harm that may result can be constructed. This project would explore the potential for such a model through practical experimentation and development of software monitors to be placed on a network, aimed at inferring the tasks and users that are active based on network traffic. If time allows, then host-based sensors might also be explored (such as on an application server) to further refine the understanding of which users are live on which applications, etc. Requirements: Students must be able to construct software prototypes and have a working knowledge of network architectures and computer systems. | |||||||||
![]() |
Procedural Methods in Computer Graphics | Michael Goldsmith | Security | B | C | MSc | Procedural methods in computer graphics help us develop content for virtual environments (geometry and materials) using formal grammars. Common approaches include fractals and L-systems. Examples of content may include the creation of cities, planets or buildings. In this project the student will develop an application to create content procedurally. The student is free to approach the challenge as they see fit, but would be expected to design, implement and assess the methods they develop. These projects tend to have a strong focus on designing and implementing existing procedural methods, but also include a portion of creativity. The project can be based on reality – e.g. developing content that has some kind of basis in how the real-world equivalent objects were created (physically-based approaches) – or the project may be entirely creative in how it creates content. Past students have, for instance, built tools to generate cities based on real-world examples and non-existent city landscapes; other examples include procedural planets, including asteroids and earth-like planetary bodies. | |||||||||
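As a minimal example of the formal-grammar approach mentioned, here is an L-system expander with a classic plant-like rule set; the output string would normally drive a turtle-graphics renderer (F = draw forward, + and - = turn, [ and ] = push/pop state).

```python
# Minimal L-system: rewrite rules applied in parallel for a few generations.
RULES = {"X": "F+[[X]-X]-F[-FX]+X", "F": "FF"}

def expand(axiom, generations):
    s = axiom
    for _ in range(generations):
        s = "".join(RULES.get(ch, ch) for ch in s)  # parallel rewriting
    return s

print(expand("X", 1))                    # one generation of the grammar
print(len(expand("X", 4)), "symbols")    # grows fast; feed to a renderer
```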
![]() |
Resilience – metrics and tools for understanding organisational resilience | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | MSc | Resilience in the face of cyber-attack is considered a key requirement for organisations. Prior work within the cybersecurity analytics group has been developing a resilience model for organisations, but there is not yet a commonly agreed set of metrics that organisations can use to determine just how resilient they are (at a given point in time and continuously). This mini-project would seek to propose a set of metrics, and if time allows tools for applying them (although the latter may well better suit a following DPhil). The approach will be to consider the development of metrics to measure the various characteristics or organisational capabilities / behaviours that are determined to be necessary in order to be resilient. It will be necessary to consider how the metrics might vary according to, or take account of, different threat environments. It may also be interesting to consider if there are refinements of the resilience requirements that apply to different types of organisations. Data analytics approaches will need to be considered, in order to apply the metrics, and students might consider the use of visualisation to help with this. The resulting metrics will be validated in consultation with security professionals and organisations possessing resilience-related experience. In the context of the mini-project this is most likely achievable via a small number of interviews, with a more detailed validation and iterative design approach being supported by a longitudinal study working with 2 or 3 organisations which might adopt and test the metrics. | |||||||||||
![]() |
Smartphone security | Michael Goldsmith | Security | B | C | MSc | (Joint with Sadie Creese) Smartphone security: one concrete idea is the development of a policy language to allow the authors of apps to describe their behaviour, designed to be precise about the expected access to peripherals and networks and the purpose thereof (data required and usage); this uses skills in formal specification and understanding of app behaviour (by studying open-source apps), possibly leading to prototyping a software tool to perform run-time checking that the claimed restrictions are adhered to. Suitable for good 3rd or 4th year undergraduates, or MSc. Other projects within this domain are possible, according to interest and inspiration. Prerequisites: Concurrency, Concurrent Programming, Computer Security all possibly an advantage. | |||||||||
![]() |
Technology-layer social networks | Michael Goldsmith | Security | C | MSc | (Joint with Sadie Creese) Technology-layer social networks: investigate the potential to identify relationships between people via technology metadata - which machines their machines are "friendly" with. Research will involve identification of all metadata available from the network layer, app layers and the data layer, development of appropriate relationship models, and practical experimentation / forensic-style work exploring how to extract relationships between technologies and identities. Appropriate for 4th year undergraduates or MSc. | ||||||||||
![]() |
Trip-wires | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | At Oxford we have developed a framework for understanding the components of an attack, and documenting known attack patterns can be instrumental in developing trip-wires aimed at detecting the presence of insiders in a system. This project will seek to develop a library of such trip-wires based on a survey of openly documented and commented-upon attacks, using the Oxford framework. There will be an opportunity to deploy the library into a live-trial context, which should afford an opportunity to study the relative utility of the trip-wires within large commercial enterprises. The mini-project would also need to include experimenting with the trip-wires in a laboratory environment, and this would involve the design of appropriate test methods. | |||||||||
Understanding Enterprise Infrastructure Dependencies | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | There are many tools available for detecting and monitoring cyber-attacks based on network traffic, and these are accompanied by a wide variety of tools designed to make alerts tangible to security analysts. By comparison, the impact of these attacks at an organisational level has received little attention. An aspect that could be enhanced further is the addition of a tool facilitating the management and updating of our understanding of business processes, and of how those processes depend on the network infrastructure. This tool could facilitate the mapping between company strategies and the activities needed to accomplish company goals, and map these down to the network and people assets. At the top of the hierarchy lies the board, responsible for strategic decisions. These decisions are interpreted at the managerial level and could be captured and analysed with business objective diagrams. These diagrams could in turn be refined further to derive business processes and organisational charts, ensuring that decisions made at the top level are enforced at the lower levels. The combination of business processes and organisational charts could eventually provide the network infrastructure. For this project we suggest a student could develop novel algorithms for mapping business processes to network infrastructures in an automated way (given the updated business process files). That said, the student is encouraged to approach this challenge as they see fit, but would be expected to design, implement and assess any methods they develop. Other projects on business process modelling are also possible, depending on interest and inspiration. |
Vulnerability analysis and threat models for Distributed Ledgers | Michael Goldsmith, Sadie Creese, Ioannis Agrafiotis, Arnau Erola | Security | B | C | MSc | This project would seek to study the general form of distributed ledgers to identify weak points that might make implementations open to compromise and attack, and to develop corresponding threat models. A number of different approaches might be considered, for example: formal modelling of protocols used to establish agreement or resolve conflicts; design of test suites using attack-graphs and then validation in a laboratory setting. |
Bioinformatics Projects | Jotun Hein | Computational Biology and Health Informatics | B | C | MSc | Jotun Hein has been in Oxford since 2001 and has supervised many students from Computer Science. His main interest is computational biology and combinatorial optimization problems in this area, especially from phylogenetics, biosequence analysis, population genetics, grammars and the origin of life. Some ideas for projects can be obtained by browsing these pages: https://heingroupoxford.com/learning-resources/topics-in-computational-biology/ https://heingroupoxford.com/previous-projects/ A student is also welcome to propose his or her own project. |
Distributional prior in Inductive Logic Programming | Celine Hocquette | MSc | Inductive Logic Programming (ILP) is a form of Machine Learning based on Computational Logic. Given examples and some background knowledge, the goal of an ILP learner is to search for a hypothesis that generalises the examples. However, the search space in ILP is a function of the size of the background knowledge. All predicates are treated as equally likely, and current ILP systems cannot make use of distributional assumptions to improve the search. This project focuses on making use of an assumed given prior probability over the background predicates to learn more efficiently. The idea is to order subsets of the background predicates by their probability of appearance. This project is a mix of theory, implementation, and experimentation. Prerequisites: logic programming, probability theory, or an interest in learning about them |
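To make the ordering idea concrete, here is a minimal sketch (my own illustration, not part of the project description): under an independence assumption, a subset of background predicates can be scored by the product of its members' priors, and subsets enumerated most-probable-first. The predicate names and prior values are placeholders.

```python
# Sketch: enumerate subsets of background predicates ordered by prior
# probability, assuming predicate priors are independent. All names and
# numbers below are illustrative, not taken from the project.
from itertools import combinations

priors = {"add": 0.4, "mult": 0.3, "succ": 0.2, "zero": 0.1}

def subset_prob(subset):
    # Under the independence assumption, a subset's prior is the
    # product of the priors of the predicates it contains.
    p = 1.0
    for pred in subset:
        p *= priors[pred]
    return p

def subsets_by_prior(preds):
    # Yield all non-empty subsets, most probable first, so the ILP search
    # can try likely sets of background predicates before unlikely ones;
    # which ordering best guides the search is part of the project.
    all_subsets = [c for r in range(1, len(preds) + 1)
                     for c in combinations(preds, r)]
    return sorted(all_subsets, key=subset_prob, reverse=True)

for s in subsets_by_prior(list(priors)):
    print(s, round(subset_prob(s), 4))
```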
Identifying relevant background knowledge in Inductive Logic Programming | Celine Hocquette | MSc | Inductive Logic Programming (ILP) is a form of Machine Learning based on Computational Logic. Given examples and some background knowledge, the goal of an ILP learner is to search for a hypothesis that generalises the examples. The search space in ILP is a function of the size of the background knowledge. To improve search efficiency, we want to identify relevant areas of the search space in the form of relevant background knowledge predicates. We propose to evaluate and compare several relevance identification methods, such as compression of the examples or statistical approaches. This project is a mix of theory, implementation, and experimentation. Prerequisites: logic programming, statistical learning, or an interest in learning about them |
Compilation of a CSP-like language | Geraint Jones | Programming Languages | B | C | MSc | This is a compiler project, also requiring familiarity with concurrency. The parallel programming language occam is essentially an implementable sublanguage of CSP. The aim of this project is to produce a small portable implementation of a subset of occam; the proposed technique is to implement a virtual machine based on the Inmos transputer, and a compiler which targets that virtual machine. One of the aims of the project is to implement an extension to occam which permits recursion; more ambitious projects might implement a distributed implementation with several communicating copies of the virtual machine. Other possibilities are to produce separate virtual machines, optimised for displaying a simulation, or for efficiency of implementation, or translating the virtual machine code into native code for a real machine. |
Logic circuit workbench | Geraint Jones | Programming Languages | B | C | MSc | This is an interactive programming project with some logic simulation behind it. The idea is to produce a simulator for a traditional logic breadboard. The user should be able to construct a logic circuit by drawing components from a library of standard gates and latches and so on, and connecting them together in allowable ways. It should then be possible to simulate the behaviour of the circuit, to control its inputs in various ways, and to display its outputs and the values of internal signals in various ways. The simulator should be able to enforce design rules (such as those about not connecting standard outputs together, or limiting fan-out) but should also cope with partially completed circuits; it might be able to implement circuits described in terms of replicated sub-circuits; it should also be able to import and export some sort of standard netlist. It might be that this would make a project for two undergraduates: one implementing the interface, the other implementing a simulator that runs the logic. |
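For a feel of the simulation core behind such a workbench, here is a minimal sketch (my own, with illustrative names): gates are functions, wires are dictionary entries, and a combinational circuit is re-evaluated until the signal values stabilise. A real workbench would add latches, design-rule checks and an interactive front end on top of something like this.

```python
# Minimal fixed-point simulation of a combinational netlist (illustrative).
GATES = {
    "AND":  lambda a, b: a and b,
    "OR":   lambda a, b: a or b,
    "NAND": lambda a, b: not (a and b),
    "XOR":  lambda a, b: a != b,
}

def simulate(netlist, inputs, max_iters=100):
    # netlist: list of (gate_type, (in1, in2), out) triples.
    signals = dict(inputs)
    for _ in range(max_iters):
        changed = False
        for gate, (a, b), out in netlist:
            if a in signals and b in signals:
                v = GATES[gate](signals[a], signals[b])
                if signals.get(out) != v:
                    signals[out] = v
                    changed = True
        if not changed:   # fixed point: the combinational circuit has settled
            return signals
    raise RuntimeError("circuit did not stabilise (possible combinational loop)")

# Half adder: sum = a XOR b, carry = a AND b
half_adder = [("XOR", ("a", "b"), "sum"), ("AND", ("a", "b"), "carry")]
print(simulate(half_adder, {"a": True, "b": True}))
```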
Modelling of arithmetic circuits | Geraint Jones | Programming Languages | B | C | MSc | This is a project in the specification of hardware, which I expect to make use of functional programming. There is a great deal of knowledge about ways (that are good by various measures) of implementing standard arithmetic operations in hardware. However, most presentations of these circuits are at a very low level, involving examples, diagrams, and many subscripts, for example these descriptions. The aim of this project is to describe circuits like this in a higher-level way by using the higher order functions of functional programming to represent the structure of the circuit. It should certainly be possible to execute instances of the descriptions as simulations of circuits (by plugging in simulations of the component gates), but the same descriptions might well be used to generate circuit netlists for particular instances of the circuit, and even to produce the diagrams. |
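The higher-order style the project has in mind can be sketched as follows (transcribed into Python for brevity; a functional language would be the natural choice, and all names here are illustrative). A generic `row` combinator threads a carry through a row of identical cells; instantiating it with a full-adder cell yields a ripple-carry adder.

```python
# Describing a circuit's structure with a higher-order combinator.
def row(cell):
    def circuit(carry_in, pairs):
        outs = []
        c = carry_in
        for a, b in pairs:
            s, c = cell(a, b, c)   # each cell consumes a carry and produces one
            outs.append(s)
        return outs, c
    return circuit

def full_adder(a, b, c):
    s = a ^ b ^ c
    c_out = (a & b) | (c & (a ^ b))
    return s, c_out

ripple_adder = row(full_adder)

# 6 + 3 with little-endian bit lists: 6 = [0,1,1], 3 = [1,1,0]
bits, carry = ripple_adder(0, list(zip([0, 1, 1], [1, 1, 0])))
print(bits, carry)   # [1, 0, 0] with carry 1, i.e. binary 1001 = 9
```

The same description, with the cell swapped for a netlist-emitting or diagram-drawing "gate", could generate netlists or pictures rather than simulate, which is the point of the higher-level style.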
Toys for Animating Mathematics | Geraint Jones | Programming Languages | B | C | The aim is to take some mathematics that would be within the grasp of a mathematically-inclined sixth-former and turn it into some attention-grabbing web pages. Ideally this reveals a connection with computing science. I imagine that this project necessarily involves some sort of animation, and I have visions of Open University television maths lectures. The programming need not be the most important part of this project, though, because some of the work is in choosing a topic and designing problems and puzzles and the like around it. There's a lot of this sort of thing about, though, so it would be necessary to be a bit original. Think of GeomLab (and then perhaps think of something a little less ambitious). It might involve logic and proof, it might be about sequences and series, it might be about graphs, it might be about the mathematics of cryptography... you might have something else in mind. |
Time-Bandits | Varun Kanade, Thomas Orton | Artificial Intelligence and Machine Learning | C | Consider the following problem: at each time step, you receive a dataset D_t you would like to fit (using e.g. a neural network or statistical model). You need to choose a hyper-parameter p_t for the model, and then you can run an optimization procedure (e.g. stochastic gradient descent) for time T_{p_t} with this hyper-parameter. Afterwards, you can measure the validation loss l_t of the fitted model on the dataset. If you are happy with the validation loss, you can move to the next time step. Otherwise, you can try to choose a new hyper-parameter (spending more compute time) and re-fit the model until you have an acceptable validation loss. The field of Multi-Armed Bandits (MAB) is a rich research area which broadly answers the question of how to make decisions while minimizing regret. This has many applications in Machine Learning, Reinforcement Learning, Portfolio Optimization and Economics. However, there is an obstacle in applying MAB solutions to the above problem: MAB is typically formalized by considering a loss function over actions, and considering how to pick a single action at each time step to minimize regret. In contrast, in our problem you may pick as many actions as you like (provided you pay for them with time), and you receive the minimum loss over the actions which you took at each time step. The idea of modelling time trade-offs is also a fundamental issue in practical machine learning which has only more recently been receiving attention in a theoretical context. The project is ideal for a mathematically inclined 4th year student, with the following objectives: 1. Develop some practically motivated formalisms for modelling time resources in the MAB setting. 2. Develop a good understanding of existing MAB literature and how it relates to the time-resource MAB variants considered. 3. Provide algorithmic solutions and theoretical lower bounds for the time-resource MAB variants considered. |
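As a crude empirical baseline for the kind of algorithm the project would analyse, here is a sketch (my own framing, not the project's required formalism) of epsilon-greedy bandits where each arm is a hyper-parameter with a fixed time cost; it ignores the min-over-several-pulls structure described above, which is exactly where the theoretical work would begin. Arm losses and time costs are placeholders.

```python
# Epsilon-greedy over arms with time costs: spend a time budget, tracking
# mean loss per arm; exploit the arm with the best (mean loss x time) score.
import random

def run(losses, times, budget, eps=0.1):
    # losses[a]() samples the validation loss of hyper-parameter a;
    # times[a] is the compute time that pulling arm a costs.
    n = len(times)
    mean, count = [0.0] * n, [0] * n
    t, total_loss = 0.0, 0.0
    while t < budget:
        if random.random() < eps:
            a = random.randrange(n)                  # explore
        else:
            a = min(range(n),                        # exploit: crude loss-x-time score
                    key=lambda i: (mean[i] if count[i] else 0.0) * times[i])
        l = losses[a]()
        count[a] += 1
        mean[a] += (l - mean[a]) / count[a]          # running mean update
        t += times[a]
        total_loss += l
    return total_loss, mean

losses = [lambda: random.gauss(0.5, 0.1), lambda: random.gauss(0.3, 0.3)]
print(run(losses, times=[1.0, 4.0], budget=100.0))
```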
A fast numerical solver for multiphase flow | David Kay | Computational Biology and Health Informatics | B | C | MSc | The Allen-Cahn equation is a differential equation used to model the phase separation of two or more alloys. This model may also be used to model cell motility, including chemotaxis and cell division. Its numerical approximation, via a finite difference scheme, ultimately leads to a large system of linear equations. In this project, using numerical linear algebra techniques, we will develop a computational solver for these linear systems. We will then investigate the robustness of the proposed solver. |
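To show the kind of linear system involved, here is a minimal sketch (my own, simplified to 1D): a semi-implicit finite-difference step for the Allen-Cahn equation u_t = eps^2 u_xx + u - u^3, where each time step requires solving (I - dt eps^2 L) u_new = u + dt (u - u^3) with L the discrete Laplacian. The dense `numpy.linalg.solve` call stands in for the bespoke iterative solver the project would develop; boundary handling and parameters are illustrative.

```python
import numpy as np

n, eps, dt = 100, 0.05, 0.01
x = np.linspace(0, 1, n)
u = np.sign(np.sin(4 * np.pi * x))          # initial two-phase profile
h = x[1] - x[0]

# Discrete 1D Laplacian (simple end rows; boundary treatment is illustrative).
L = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
A = np.eye(n) - dt * eps**2 * L             # system matrix, fixed across steps

for _ in range(200):
    rhs = u + dt * (u - u**3)               # nonlinearity treated explicitly
    u = np.linalg.solve(A, rhs)             # the large linear solve of interest

print(u.min(), u.max())                     # phases relax towards -1 and +1
```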
Efficient solution of one dimensional airflow models | David Kay | Computational Biology and Health Informatics | B | C | MSc | In this project we will investigate the use of numerical and computational methods to efficiently solve the linear system of equations arising from a one dimensional airflow model within a network of tree-like branches. This work will build upon the methods presented within the first year Linear Algebra and Continuous Maths courses. All software to be developed can be in the student's preferred programming language. General area: design and computational implementation of fast and reliable numerical models arising from biological phenomena. |
"Extensions of the square root law of steganography" | Andrew Ker | Security | MSc | This is a theoretical project that aims to prove probabilistic results relating to the capacity of different types of cover to conceal a hidden message. It will use tools of probability, information theory, and mathematics. An advanced square root law can be found at http://www.cs.ox.ac.uk/andrew.ker/docs/ADK71B.pdf, and this project either aim for elementary proofs of special cases of that result, or elementary extensions, or to find counterexamples to illustrate that the hypotheses are necessary. Suitable for a student with a background in mathematics who has taken the Probability & Computing option. | |||||||||||
General topics in the area of Steganography and Cryptography | Andrew Ker | Security | B | C | Supervision of general topics in the area of steganography and information theory. |
Bots for a Board Game | Stefan Kiefer | Automated Verification | B | C | MSc | The goal of this project is to develop bots for the board game Hex. In a previous project, an interface was created to allow future students to pit their game-playing engines against each other. In this project the goal is to program a strong Hex engine. The student may choose the algorithms that underlie the engine, such as alpha-beta search, Monte-Carlo tree search, or neural networks. The available interface allows a comparison between different engines. It is hoped that these comparisons will show that the students' engines become stronger over the years. The project involves reading about game-playing algorithms, selecting promising algorithms and data structures, and design and development of software (in Java or Scala). |
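One of the candidate algorithms, alpha-beta search, can be sketched game-agnostically as below (in Python for brevity; the project itself would use Java or Scala). The Hex-specific parts — move generation, win detection, a good evaluation function — are exactly where the engine-building work would go; a toy Nim game stands in for Hex so the skeleton runs.

```python
# Negamax with alpha-beta pruning over an abstract game interface.
def negamax(state, depth, alpha, beta, game):
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state), None   # score from the side to move
    best_move = None
    for move in game.moves(state):
        child = game.apply(state, move)
        score = -negamax(child, depth - 1, -beta, -alpha, game)[0]
        if score > alpha:
            alpha, best_move = score, move
        if alpha >= beta:                   # beta cut-off: opponent avoids this line
            break
    return alpha, best_move

class Nim:
    # Toy stand-in for Hex: take 1 or 2 stones; taking the last stone wins.
    def moves(self, s): return [1, 2] if s >= 2 else [1]
    def apply(self, s, m): return s - m
    def is_terminal(self, s): return s == 0
    def evaluate(self, s): return -1 if s == 0 else 0  # terminal: mover has lost

print(negamax(5, 10, float("-inf"), float("inf"), Nim()))  # (1, 2): take 2 to win
```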
Small matrices for irrational nonnegative matrix factorisation | Stefan Kiefer | Automated Verification | B | C | MSc | Recently it was shown (https://arxiv.org/abs/1605.06848) that there is a 6x11 matrix M with rational nonnegative entries so that for any factorisation M = W H (where W is a 6x5 matrix with nonnegative entries, and H is a 5x11 matrix with nonnegative entries) some entries of W and H need to be irrational. The goal of this project is to explore whether the number of columns of M can be chosen below 11, perhaps by dropping some columns of M. This will require some geometric reasoning. The project will likely involve the use of tools (such as SMT solvers) that check systems of nonlinear inequalities for solvability. Prerequisites: The project is mathematical in nature, and a good recollection of linear algebra is needed. Openness towards programming and the use of tools is helpful. |
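For numerical experimentation before reaching for exact tools, a rank-5 nonnegative factorisation can be attempted with Lee-Seung multiplicative updates, as in this sketch (my own; a random matrix stands in for the paper's M). A small residual only suggests a factorisation exists for the chosen column subset — the project's exact irrationality analysis needs symbolic tools such as SMT solvers, not this heuristic.

```python
import numpy as np

def nmf(M, rank, iters=5000, seed=0):
    # Multiplicative updates keep all entries of W and H nonnegative.
    rng = np.random.default_rng(seed)
    W = rng.random((M.shape[0], rank))
    H = rng.random((rank, M.shape[1]))
    for _ in range(iters):
        H *= (W.T @ M) / (W.T @ W @ H + 1e-12)
        W *= (M @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H, np.linalg.norm(M - W @ H)

M = np.random.default_rng(1).random((6, 11))   # placeholder for the paper's matrix
for cols in range(11, 8, -1):                  # drop trailing columns and retry
    W, H, err = nmf(M[:, :cols], rank=5)
    print(cols, "columns -> residual", round(err, 6))
```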
Using Virtual Reality to predict how we use memory in natural behaviour: collaborative interdisciplinary projects. | Stefan Kiefer | Automated Verification | B | C | MSc | Name and Email Address of Research Project Supervisor: Dr Dejan Draschkow, dejan.draschkow@psy.ox.ac.uk; Project Description: Although human knowledge and memories represent the past, we form and use them to support future behavior (Nobre & Stokes, 2019). Understanding which factors contribute to learning about the world and successfully finding the learned information in mind is of critical importance for developing methods for supporting this behavior in healthy individuals, but also in individuals with a range of neurocognitive and psychiatric conditions, such as stroke, Alzheimer's, and Parkinson's disease. Our novel virtual reality task (Draschkow, Kallmayer, & Nobre, 2021) combines the ecological validity, experimental control, and sensitive measures required to investigate the naturalistic interplay between memory and perception and opens the doors to investigating and supporting complex cognitive functions (https://www.youtube.com/watch?v=GT2kLkCJQbY). In the proposed interdisciplinary projects, computer science and experimental psychology students will be paired to develop and validate sophisticated virtual reality protocols for measuring and supporting complex cognitive mechanisms. Specific projects will focus on selected sub-topics and vary in scope, depending on students' interests and what kind of project it is (3rd, 4th, or MSc). These include:
• Programming and refining game-like cognitive VR tasks in C# (Unity)
• Developing protocols for online-based assessments of cognitive functions in C#/JavaScript
• Developing algorithms for detecting markers of neurocognitive symptoms (such as tremor for Parkinson's disease) in multivariate VR data (R/Python)
• Developing proof-of-concept multimodal (voice, visual, and touch) protocols for supporting learning and memory in VR (with implications for supporting dementia patients) (C#/JavaScript/R/Python)
The projects are suitable for students who feel comfortable with highly interdisciplinary work/teams and have experience with (or are open to learning) scientific programming in C#/JavaScript/R/Python. Students will be fully integrated in a successful and collaborative research group and get hands-on experience with an interactive product-development cycle, including multiple stakeholders. Further related readings are: (Ballard et al., 1997; Hayhoe, 2017; Hayhoe & Ballard, 2014) Relevant readings from cognitive science:
Ballard, D. H., Hayhoe, M. M., Pook, P. K., & Rao, R. P. N. (1997). Deictic codes for the embodiment of cognition. Behavioral and Brain Sciences, 20(4), 723–767. https://doi.org/10.1017/S0140525X97001611
Draschkow, D., Kallmayer, M., & Nobre, A. C. (2021). When Natural Behavior Engages Working Memory. Current Biology, 31(4), 869-874.e5. https://doi.org/10.1016/j.cub.2020.11.013
Hayhoe, M. M. (2017). Vision and Action. Annual Review of Vision Science, 3(1), 389–413. https://doi.org/10.1146/annurev-vision-102016-061437
Hayhoe, M. M., & Ballard, D. (2014). Modeling task control of eye movements. Current Biology, 24(13), R622-8. https://doi.org/10.1016/j.cub.2014.05.020
Nobre, A. C. (Kia), & Stokes, M. G. (2019). Premembering Experience: A Hierarchy of Time-Scales for Proactive Attention. Neuron, 104(1), 132–146. https://doi.org/10.1016/j.neuron.2019.08.030 |
Computation Theory with Atoms | Bartek Klin | Programming Languages | B | C | MSc | Sets with atoms, also known as nominal sets, are an abstract foundational approach to computing over data structures that are infinite but highly symmetrical, so much so that they are finitely presentable and amenable to algorithmic manipulation. Many basic results of classical mathematics and computation theory become more subtle, or even fail, in sets with atoms. For example, in sets with atoms not every vector space has a basis, and a Turing machine may not determinize. A variety of specific topics are available, aimed at mathematically oriented students. Prerequisites: A strong mathematical background is essential. Students who enjoy courses such as "Models of Computation", "Categories, Proofs and Processes" or "Automata, Logic and Games" will find this subject suitable. Some relevant literature: - M. Bojanczyk: Slightly Infinite Sets. Draft available from https://www.mimuw.edu.pl/~bojan/upload/main-6.pdf - A. Pitts: Nominal Sets: Names and Symmetry in Computer Science. Cambridge University Press, 2013. |
Truthful scheduling for graphs | Elias Koutsoupias | Algorithms and Complexity Theory | B | C | MSc | The aim of the project is to advance our understanding of the limitations of mechanism design for the scheduling problem, the "holy grail" of algorithmic mechanism design. Specifically, we will consider the graphical scheduling problem, in which every task can be only allocated to two machines, and study the approximation ratio of mechanisms for this setting. The aim is to prove both lower and upper bounds. Both directions are hard problems, and we plan to try to gain some understanding by experimentally searching for lower bounds or trying to verify potentially useful statements about the upper bound. Of particular importance is the case of trees and their special case of stars, i.e., when every task can be given either to the root or to a particular leaf. We plan to study not only the standard objective of the makespan, but the more general class of objectives in which the mechanism minimizes the L^p norm of the times of the machines. The case L^infinity is to minimize the makespan, L^1 is to minimize the welfare, and the case L^0 corresponds to the Nash Social Welfare problem, all of which are interesting problems. Further possible directions include considering fractional scheduling and mechanisms without money. Bibliography: George Christodoulou, Elias Koutsoupias, Annamária Kovács: Truthful Allocation in Graphs and Hypergraphs. ICALP 2021: 56:1-56:20 (https://arxiv.org/abs/2106.03724) | |||||||||
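For reference, the family of objectives mentioned above can be written out explicitly; this is the standard power-mean formulation (my own summary, not taken from the cited paper), where t_i denotes the finishing time of machine i under an allocation:

```latex
\min \; \lVert t \rVert_p = \Big( \sum_{i=1}^{n} t_i^{\,p} \Big)^{1/p},
\qquad
\lVert t \rVert_\infty = \max_i t_i \ \ \text{(makespan)},
\qquad
\lVert t \rVert_1 = \sum_{i=1}^{n} t_i \ \ \text{(welfare)},
```
```latex
% The p -> 0 limit of the normalised power mean is the geometric mean,
% which is the objective behind Nash Social Welfare:
\lim_{p \to 0} \Big( \tfrac{1}{n} \sum_{i=1}^{n} t_i^{\,p} \Big)^{1/p}
  = \Big( \prod_{i=1}^{n} t_i \Big)^{1/n}.
```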
Probabilistic Model Checking | Marta Kwiatkowska | Automated Verification | B | C | MSc | Professor Marta Kwiatkowska is happy to supervise projects involving probabilistic modelling, verification and strategy synthesis. This is of interest to students taking the Probabilistic Model Checking course and/or those familiar with probabilistic programming. Concrete project proposals can be discussed with the supervisor, and students' own suggestions will also be considered. |
Safety Assurance for Deep Neural Networks | Marta Kwiatkowska | Automated Verification | B | C | MSc | Professor Marta Kwiatkowska is happy to supervise projects in the area of safety assurance and automated verification for deep learning, including Bayesian neural networks. For recent papers on this topic see http://qav.comlab.ox.ac.uk/bibitem.php?key=WWRHK+19, http://qav.comlab.ox.ac.uk/bibitem.php?key=RHK18 and http://qav.comlab.ox.ac.uk/bibitem.php?key=CKLPPW+19, and also https://www.youtube.com/watch?v=XHdVnGxQBfQ. Concrete project proposals can be discussed with the supervisor, and students' own suggestions will also be considered. |
Dashcam analysis | Harjinder Lallie | B | C | There exist no current guidelines and very few tools to aid the investigation of a dashcam device, particularly for the purpose of extracting and mapping geospatial data contained therein. This project aims to extract, map and chart geospatial data from a dashcam device and provide insights into the routes and speeds taken by a passenger. You will be provided with a number of dashcam forensic images (.E01 files which can easily be extracted back to MOV/MP4 files). You will be expected to develop the solution using Python, and if possible, integrate the solution with Autopsy, an open-source digital forensic tool. You might consider using an open-source tool such as Exiftool to process the EXIF data. You could choose to take this project to a somewhat different angle by choosing to extract watermark data from the dashcam footage and map that instead. To aid in understanding the background to this project, please see the reference below and the video here: https://warwick.ac.uk/fac/sci/wmg/mediacentre/events/dfls/previousevents/dashcamforensics/ Lallie, H.S., 2020. Dashcam forensics: A preliminary analysis of 7 dashcam devices. Forensic Science International: Digital Investigation, 33, p.200910. Prerequisites: You may want to use tools such as Exiftool, an open-source EXIF data extraction tool. Although Exiftool might serve your needs, you will most probably want to develop an excellent knowledge of the EXIF standard. Additional support can be provided by providing you with access to specific elements of my digital forensics course at the University of Warwick in the form of recorded lectures. That will comprise around 10 hours of learning. I will also provide you with sufficient background in the topic prior to you beginning the study. You will need good programming skills, preferably in Python. |
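A hedged starting-point sketch for the extraction step: shell out to the open-source Exiftool to pull GPS tags from the extracted MOV/MP4 files as JSON, then collect a route. This assumes Exiftool is installed and that the footage carries GPS metadata; tag names vary between dashcam models, and the file names are placeholders.

```python
import json
import subprocess

def gps_track(paths):
    # -json emits one JSON object per file; -n gives numeric lat/lon values.
    out = subprocess.run(
        ["exiftool", "-json", "-n",
         "-GPSLatitude", "-GPSLongitude", "-GPSSpeed", "-CreateDate", *paths],
        capture_output=True, text=True, check=True)
    track = []
    for meta in json.loads(out.stdout):
        if "GPSLatitude" in meta and "GPSLongitude" in meta:
            track.append((meta.get("CreateDate"),
                          meta["GPSLatitude"], meta["GPSLongitude"],
                          meta.get("GPSSpeed")))
    return track

for point in gps_track(["clip0001.mp4", "clip0002.mp4"]):
    print(point)   # feed these into a mapping library to chart the route
```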
Keyword searching audio/video files | Harjinder Lallie | B | C | Digital forensic investigators often search for the existence of keywords on hard disks or other storage medium. Keywords are easily searchable in PDF/word/ text/other popular formats, however, current digital forensic tools do not allow for keyword searching through movies/audio. This is essential in cases which involve dashcam footage, recorded conversations etc. The aim of this project is to produce a tool which auto-transcribes audio data and then performs a keyword search on the transcribed file – pinpointing the point(s) in the file where the keyword(s) appear. You will be expected to develop the solution using Python, and if possible, integrate the solution with Autopsy, an open-source digital forensic tool. Prerequisite. Additional support can be provided by providing you with access to specific elements of my digital forensics course at the University of Warwick in the form of recorded lectures. That will comprise around 10 hours of learning. You are likely to use the Python SpeechRecognition and possibly the PyAudio libraries. For convenience and demonstration of practicality, you may want to integrate the solution with the open-source forensics tool – Autopsy – and hence will need to develop a good understanding of this tool particularly the keyword search facility. | |||||||||||
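A minimal proof-of-concept of the pipeline described above, using the Python SpeechRecognition library mentioned in the prerequisites (file name and keywords are placeholders). It transcribes a WAV file in 30-second chunks so that keyword hits can be pinned to an approximate offset; recognition accuracy and timestamp granularity are obvious areas for the project to improve.

```python
import speech_recognition as sr

def keyword_hits(wav_path, keywords, chunk=30):
    r = sr.Recognizer()
    hits = []
    with sr.AudioFile(wav_path) as source:
        offset = 0
        while True:
            audio = r.record(source, duration=chunk)   # next chunk of audio
            if not audio.frame_data:                   # no audio left
                break
            try:
                text = r.recognize_google(audio).lower()
            except sr.UnknownValueError:               # unintelligible chunk
                text = ""
            for kw in keywords:
                if kw.lower() in text:
                    hits.append((kw, offset))          # keyword heard in this window
            offset += chunk
    return hits

print(keyword_hits("dashcam_audio.wav", ["registration", "collision"]))
```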
Reasoning with Causality in Digital Forensic Cases. | Harjinder Lallie | B | C | Unlike traditional 'wet' forensics, digital forensics is heavily reliant on circumstantial evidence. Quite often a digital investigator is tasked with outlining the sequence of events that took place on a computer system to establish causality and a chain of events. These event sequences are reasonably straightforward to map conceptually, however, there is no solution which 'auto-links' these events. This project applies probabilistic techniques using an attack tree (or alternative) methodology to determining probable event sequences from sequence fragments available on a computer system. You will be supplied with a digital forensic image of a used hard disk, or you may choose to develop your own. You will most likely organize event sequences into categories (e.g. software installation, execution, file download, browsing etc) and then test your proposals on further hard disk images and possibly supplement your findings through qualitative feedback from law enforcement experts (I can connect you). Prerequisite. Knowledge of Autopsy, an open-source forensic tool. Additional support will be given, if required, by providing access to my online wiki self-guided tutorials and by providing access to specific elements of my digital forensics course at the University of Warwick in the form of recorded lectures. That will comprise around 10 hours of learning. I will also provide you with sufficient background in the topic prior to you beginning the study. You will need good programming skills, preferably in Python. | |||||||||||
EV Charging Testbed | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab Right now, there is a format war for methods of charging electric vehicles (EVs). One of the most developed systems is the Combined Charging System (CCS). It is underpinned by an international standard called ISO 15118, which describes a complex web of communication that allows vehicle charging to take place. The car talks to the charger it is connected to, in order to report status and place requests for current. The charger, in turn, talks to the backend of the company that runs the charging site (the so-called charging provider), in order to authorise charging sessions and report any errors. It may also talk to a site controller at the same physical location, which is responsible for coordinating multiple chargers and ensuring capacity is shared appropriately between them. The charging provider's backend may federate to other charging providers to permit interoperability between networks. In the near future, it is also expected that the charging provider will be able to talk to energy companies for automated billing, or even to allow electricity to flow the other way -- from vehicle-to-grid (V2G). Exploring the operations and security of this interconnected system is crucially important, while it is still being developed, to avoid building insecure infrastructure that we have to live with for many years to come. A number of problems have already been discovered (see links below) and many more are still to come. Having access to realistic testbeds will accelerate this work substantially, by allowing researchers worldwide to evaluate ideas on a small scale, without having to gain access to costly real-world infrastructure. Our lab has already built a physical testbed for the car-to-charger communication, and successfully attacked it. Other researchers have built a system that can simulate multiple such links, but only virtually. Certainly, neither of these existing testbeds implements the full system end-to-end, particularly the backend or federation aspects, so it is hard to understand what all the effects of an attack would be. In this project, a student would work towards extending the existing testbed implementations to integrate backend functionality and therefore simulate the whole charging ecosystem. When complete, a user could create a virtual charging provider and register some customers, spawn charging sites with chargers, spawn vehicles -- and then simulate full charging sessions. In the ideal case, the implementation would be containerised so that it can be deployed on real hardware, in virtual machines, or a combination of both. It is important to say that there is a lot of scope in this project, and it is not expected that all of it would be implemented in any one year. In fact, there are many parts to suit different interests, across system development and scientific investigation.
Meaningful contributions might include:
* Build a working, low-power CCS charger for a toy car
* Containerise and integrate several open projects into a connected system
* Implement functionality for a charging provider as an open-source reference
* Model the PKI operations described in ISO 15118
* Investigate the capability of attackers at various stages of the charging process
* Make something crazy, like a car-to-car charger with offline crypto payments
Some research links:
* Our first paper on CCS (testbed in appendix): https://www.usenix.org/conference/usenixsecurity19/presentation/baker
* The virtual testbed from Padua: https://spritz.math.unipd.it/projects/MiniV2G/
* An overview of CCS: https://en.wikipedia.org/wiki/Combined_Charging_System
Some open implementations of parts of the ecosystem:
* OpenV2G: http://openv2g.org/
* RISE-V2G: https://github.com/SwitchEV/RISE-V2G
* OCPP: https://www.openchargealliance.org/ |
Improved Intent Recognition | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab Using Markov chains to improve gesture recognition on the current dataset. In recent work, we conducted a user study in which users made mobile payments using a smartwatch and we collected the inertial sensor (accelerometer, gyroscope, etc.) data generated on the smartwatch. Using this data, we showed that the tap gesture performed by the user while making such payments can be used to authenticate the user. We also found that the tap gesture is sufficiently distinct from other hand movements that, by recognising it, we can infer the user's intention to pay (as opposed to it being an accidental payment or skimming attack). We achieved our results by using random forest classifiers on tap gestures represented by sliding windows consisting of between 0.5 and 4 seconds of inertial sensor data. It may be possible to strengthen our intent recognition model by treating the tap gesture as a 3-tuple of three separate parts, namely: (i) reaching towards the payment terminal, (ii) aligning the watch face with the terminal to establish an NFC connection, and (iii) withdrawing afterwards. The plan would be to attempt to recognise each constituent part separately and use Markov chains to stitch together 3-tuples that form a complete gesture. Our dataset will be provided. The research question is whether this approach is more or less effective at recognising tap gestures than our unseparated approach. |
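The stitching idea can be sketched as follows (my own simplification: a deterministic finite-state acceptor standing in for the probabilistic Markov-chain stitching, with a per-window classifier assumed to label windows as reach / align / withdraw / other).

```python
# Accept label sequences that traverse reach -> align -> withdraw in order,
# which we take as a complete tap gesture; 'other' interrupts the gesture.
ALLOWED = {
    ("start", "reach"), ("reach", "reach"), ("reach", "align"),
    ("align", "align"), ("align", "withdraw"),
}

def is_tap_gesture(labels):
    state = "start"
    for lab in labels:
        if lab == "other":
            state = "start"                           # gesture interrupted: reset
        elif (state, lab) in ALLOWED:
            state = lab
        else:
            state = "reach" if lab == "reach" else "start"  # maybe a new gesture starts
        if state == "withdraw":
            return True                               # full reach->align->withdraw seen
    return False

print(is_tap_gesture(["other", "reach", "reach", "align", "withdraw"]))  # True
print(is_tap_gesture(["reach", "withdraw", "align"]))                    # False
```

A probabilistic version would replace the hard transitions with transition probabilities learned from the provided dataset and score candidate 3-tuples by likelihood.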
Physically realizable adversarial machine learning models against on-screen classifiers | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab Taking into account the security-critical contexts in which computer vision models may be used in mixed reality / the metaverse, it is of great interest to us to explore ways of crafting defences against malicious actors using adversarial attacks. Your objective will be to test and improve upon the robustness of existing computer vision models by crafting state-of-the-art adversarial examples, in the form of phishing websites, that remain effective in photographs of computer screens captured by a smartphone or XR headset camera. |
Secure Networking in Small Satellite Operating Systems | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab CCSDS standards describe the Space Data Link Security (SDLS) protocol for securing communications between spacecraft and ground stations for Telemetry, Tracking, and Command (TT&C). The extended procedures for this protocol detail systems for key management and security association management. This protocol has been implemented for use in NASA's Core Flight System (cFS), an open-source satellite operating system designed to run on a variety of hardware platforms. Recent years have seen increased interest in new types of space system, including satellite mega-constellations containing large numbers of satellites, multi-tenant spacecraft shared between multiple parties, and federated satellite constellations composed of independently operated satellites. Each of these requires new security considerations, particularly in the areas of secure communication, key management, and key distribution. In this project a student would implement key management or key distribution systems to better facilitate communication in the context of one of the above use cases. This will take the form of an extension to NASA's "CryptoLib" implementation of SDLS in cFS, or extensions to a previous student's work to port the QUIC protocol to cFS. NASA Core Flight System: https://github.com/nasa/cFS NASA CryptoLib: https://github.com/nasa/CryptoLib CCSDS standard for SDLS: https://public.ccsds.org/Pubs/355x0b1.pdf CCSDS standard for SDLS-EP: https://public.ccsds.org/Lists/CCSDS%203551R1/355x1r1.pdf Prerequisites: Students undertaking this project should be familiar with network security concepts and be comfortable writing in C. |
Security Monitoring and Control for the cFS Satellite Operating System | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab Core Flight System (cFS) is an open-source satellite operating system provided by NASA, providing a standard interface and set of applications which can run on a variety of hardware platforms thanks to its abstraction of the underlying operating systems. Applications and devices within cFS communicate using a publish-subscribe model on a shared bus interface. These messages are not logged or authenticated, so a malicious component could affect parts of the satellite it should not have access to. In this project a student would design and implement a security layer over the cFS software bus to provide security monitoring and per-application access control. Such a system would ideally be backwards-compatible with existing cFS applications. Students undertaking this project should be comfortable writing in C. Useful links: - NASA Core Flight System: https://github.com/nasa/cFS |
Signal Injection Attacks Against Modern Sensors | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab In recent years, the boundaries between the physical and the digital world have become increasingly blurry. Nowadays, many digital systems interact in some way with the physical world. Large and complex cyber-physical systems, such as autonomous and electric vehicles, combine the physical and the digital world and enable the interaction between those two domains. Usually, such systems are equipped with numerous sensors to measure physical quantities, such as temperature, pressure, light, and sound. These physical quantities are vital inputs for the computations and can influence the decision-making process of the system. However, the nature of an analog sensor makes it not easily possible to authenticate the physical quantity that triggered a stimulus [1]. For instance, a temperature sensor cannot detect if the stimulus was caused by a legitimate temperature increase or by an adversary using a hairdryer. This is a major concern because the integrity of sensor measurements is critical to ensuring that a system behaves as intended, and a violation of this principle can have serious security, safety and reliability consequences. In our research, we have shown that different sensors are vulnerable to signal injection attacks on the physical layer [2, 3]. In this project, a student would analyse the vulnerability of sensors as they are used in modern systems, such as cars, the smart grid and IoT devices. The project will enable the student to research signal injection attacks using different modalities, such as light, acoustic and electromagnetic waves. Moreover, the student will be able to assess the impact of a successful attack against a system as a whole and work on novel countermeasures that can help to improve the security of the next generation of systems. https://github.com/ssloxford/ccd-signal-injection-attacks https://github.com/ssloxford/they-see-me-rollin Prerequisites: Some familiarity with digital signal processing and with Python. |
Transport Layer Security for Satellite Networks | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab Long-range satellite communications networks suffer from high network latencies due to the long distances to satellites. These high latencies have a detrimental effect on the performance of common protocols used for internet traffic, such as the Transmission Control Protocol (TCP). Currently widely-used tools to optimise TCP performance are incompatible with the encrypted traffic of current VPNs. This has led to many operators resorting to providing their communication services unencrypted, leaving customers exposed to eavesdropping attacks. In our research group, we have developed QPEP [1], a novel combination of a VPN and a satellite performance-enhancing proxy, which enables the use of encrypted traffic over satellite links without the usual performance drawbacks. This project would improve on this work in one or more of the following ways:
* Low Earth Orbit evaluation
* Packet loss resilience
* Scalability & multi-user environments
[1] Pavur, J. C., et al. "QPEP: An actionable approach to secure and performant broadband from geostationary orbit." (2021). https://github.com/ssloxford/qpep https://www.ndss-symposium.org/wp-content/uploads/2021-074-paper.pdf |
Using state-of-the-art computer vision models to build novel applications on recent XR platforms | Ivan Martinovic | Security | B | C | MSc | Co-supervised by Systems Security Lab In this project, you will be using state-of-the-art AI computer vision models that understand their environments (e.g. detect computer screens) and deploying them in novel scenarios on the latest XR (AR/VR) technologies: from NReal glasses, to Microsoft HoloLens, to Oculus Quest 2. Your task will be to test and improve upon the robustness of computer vision models by deploying them in novel environments, evaluating their robustness, and building interesting applications around them. For example, this would allow you to build a "physical ad blocker" that would block out any digital advertisement, or a privacy-preserving remote support application that allows sharing. |
Applied Formal Verification | Tom Melham | Automated Verification | B | C | MSc | I am happy to supervise projects at all levels in applied formal verification. These typically involve hands-on investigation and innovation to develop new, practical methods of correctness analysis and verification for various kinds of computer systems – either hardware, software, or a combination. They are ideally suited to students who have an interest in computer systems, logic, and hands-on practical work that has a solid theoretical basis. You don’t have to have taken the course in Computer-Aided Formal Verification, if you are willing to learn the theories and technologies required. All my projects are designed to have a strong element of research and sufficient challenge to allow a motivated student to make an excellent contribution. Projects are usually therefore quite ambitious, but they are also designed realistically to fit into the time available. And they always have some fall-back options that are less challenging but can still result in an excellent achievement. Rather than offer readymade project ideas, I encourage students with an interest in this area to meet me and together discuss what might align best with their background and interests. I always have several project ideas that link to my current research or research being done by my group. Often projects will have a connection to real-world verification problems in industry, and many of my students will have contact through their projects with leading verification researchers and engineers in industry. If interested to discuss, please contact me. |
Developing computational tools to aid the design of CRISPR/Cas9 gene editing experiments | Peter Minary | Computational Biology and Health Informatics | B | C | MSc | At present, the most versatile and widely used gene-editing tool is the CRISPR/Cas9 system, which is composed of a Cas9 nuclease and a short oligonucleotide guide RNA (or guide) that guides the Cas9 nuclease to the targeted DNA sequence (on-target) through complementary binding. There are a large number of computational tools to design highly specific and efficient CRISPR-Cas9 guides, but there is great variation in performance and a lack of consensus among the tools. We aim to use ensemble learning to combine the benefits of a selected set of guide design tools to reach superior performance compared to any single method in predicting the efficiency of guides (for which experimental data on their efficiency is available) correctly. Recommended for students who have done the Machine Learning and the Probability and Computing courses. |
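A hedged sketch of the ensemble idea: treat each existing guide-design tool's efficiency score as a feature and fit a combiner on guides with measured efficiencies. The tool scores and data below are synthetic placeholders; a simple least-squares blend is shown, though stacking with a richer learner would be an equally reasonable starting point.

```python
import numpy as np

rng = np.random.default_rng(0)
n_guides, n_tools = 200, 4
scores = rng.random((n_guides, n_tools))          # per-tool predicted efficiency
true_eff = scores @ np.array([0.5, 0.3, 0.1, 0.1]) + rng.normal(0, 0.05, n_guides)

X = np.hstack([scores, np.ones((n_guides, 1))])   # add an intercept column
w, *_ = np.linalg.lstsq(X, true_eff, rcond=None)  # least-squares blending weights
pred = X @ w
r = np.corrcoef(pred, true_eff)[0, 1]
print("blend weights:", np.round(w, 3), "correlation:", round(r, 3))
```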
Developing machine learning models for off-target prediction in CRISPR/Cas9 gene editing | Peter Minary | Computational Biology and Health Informatics | B | C | MSc | The CRISPR/Cas9 gene editing system is composed of a Cas9 nuclease and a short oligonucleotide guide RNA (or guide) that guides the Cas9 nuclease to the targeted DNA sequence (on-target) through complementary binding. However, the Cas9 nuclease may also cleave off-target genomic DNA sequences, which contain mismatches relative to the gRNA, so undesired cleavage can occur. The obvious factors influencing off-target cleavage activity of the CRISPR/Cas9 gene editing system are the sequence identities of the guide RNA and the off-target DNA. Various 'basic features' derived from said sequences have been fuelling the development of procedural and machine learning models for off-target cleavage activity prediction, but there are numerous 'non-basic features' (such as the sequence context around the off-target DNA) that may also influence off-target cleavage activity. The project will aim for the development of novel off-target cleavage activity prediction models using approaches that include, but are not limited to, combining 'basic features' and 'non-basic features' to increase the accuracy of model predictions of experimental off-target cleavage activities. Prerequisites: Recommended for students who have done a Machine Learning course and have an interest in molecular biology. |
Implementing a Datalog Reasoner | Boris Motik | Data and Knowledge | B | The objective of this project would be for a student to implement a simple Datalog reasoner. Depending on the student's ambitions, the reasoner could run in main memory (easier version) or on disk (more challenging). The student would be expected to design the data structures needed to store the data (in RAM or on disk, as agreed with me). On top of this, the student would implement the seminaive algorithm and evaluate the system on a couple of medium-sized datasets. We have in our group ready-made datasets that could be used for evaluation. A variation of this project would be for the student to reuse an existing relational database to store the data. Then, the seminaive algorithm would be implemented either as an extension of the database (assuming the database is open source), or on top of the database (by running SQL queries). This last variant would arguably be much easier, as it would involve less design and more reuse of existing technologies. More advanced students could extend their system to implement various incremental reasoning algorithms. For example, I would give them one of the many papers I've written on this topic, and they would have to (a) understand the formal material and (b) implement and evaluate the algorithms. Hence, this project would also give students the opportunity to go as far as they can. Having attended the Databases or Database System Implementation courses, and perhaps to a lesser extent the KRR course, would be a prerequisite for doing this project. |
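The key idea of the seminaive algorithm can be shown compactly for a single transitive-closure program, path(X,Y) :- edge(X,Y). path(X,Y) :- path(X,Z), edge(Z,Y). (my own illustration; the project would generalise this to arbitrary Datalog and serious data structures). Only facts derived in the previous round — the "delta" — are joined with edge, avoiding re-deriving everything each round.

```python
def transitive_closure(edges):
    path = set(edges)           # the first rule seeds the relation
    delta = set(edges)
    while delta:
        # Join only the newly derived facts against edge.
        new = {(x, w) for (x, y) in delta for (z, w) in edges if y == z}
        delta = new - path      # keep only genuinely new facts
        path |= delta
    return path

edges = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(edges)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```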
Implementing a Tableaux Reasoner for Description Logics | Boris Motik | Data and Knowledge | B | The objective of this project would be for a student to implement a simple tableau-based reasoner for description logics. The project would involve the student designing the core data structures, implementing the basic algorithm, realising backtracking, and implementing various optimisations typically used in practical reasoners. The student could use a preexisting API (e.g., the OWL API) to load a description logic ontology, so they could focus on solving the core problem. The student would be expected to evaluate their implementation on the ontologies from the repository that we maintain in the KRR group (http://www.cs.ox.ac.uk/isg/ontologies/). The project seems to me to provide a good opportunity for the student to demonstrate how well they have absorbed the CS material (e.g., algorithms, data structures, computational complexity analysis, etc.) taught in our degrees. Also, the project is sufficiently open-ended that the student can go quite a long way before running out of options of things to do. Having attended the KRR course would be a prerequisite for doing this project. |
Topics in Automata Theory, Program Verification and Programming Languages (including lambda calculus and categorical semantics) | Andrzej Murawski | Programming Languages | C | MSc | Prof Murawski is willing to supervise in the area of automata theory, program verification and programming languages (broadly construed, including lambda calculus and categorical semantics). For a taste of potential projects, follow this link. |
Pruning Deep Learning Architectures for Medical Applications | Ana Namburete | Artificial Intelligence and Machine Learning | C | Much of the unprecedented success of deep learning models can be attributed to the construction of increasingly large convolutional neural networks (CNNs), containing several parameters that can capture the complexity of growing datasets and diverse tasks. The trend has been that the larger the network, the better the performance. However, large CNNs are computationally demanding as they require more expensive and specialised GPU hardware, take longer to run, and require vast memory stores. These limitations make them difficult to redistribute, and thus impractical for many real-world applications. One such scenario is in transferring validated algorithms to operate on standard medical workstations or to mobile devices that interface with portable, point-of-care ultrasound probes. The goal of this project is to devise strategies to compress the models in such a way that model size (parameter count) is reduced, while minimising the loss in accuracy. As part of this project, the student will explore network pruning strategies for intelligent model compression, thus enabling their application across medical imaging tasks. We will investigate the success of channel pruning, which involves removing entire filters from the network, thus explicitly making the model smaller during the training process. The student will investigate: (1) pruning criteria, to determine the functions that will best reduce tensor redundancy; (2) extent of pruning; and (3) identifying effective pruning locations. This project is suitable for a student interested in deep learning, algorithm development, and image analysis. Programming experience is desirable (ideally Python). References: To date, the OMNI group has explored model compression using separable convolution kernels (to reduce filter sizes) and knowledge distillation: a “teacher-student” network setup in which only the task-relevant weights learned by the larger teacher network are transferred to a much smaller student network. This preliminary work led to a reduction in network size, with negligible compromise in accuracy for processing ultrasound images on Android-based mobile devices. • Low-Memory CNNs enabling real-time ultrasound segmentation towards mobile deployment Vaze, S, Xie, W, Namburete, AIL. IEEE JBHI 2020 | |||||||||||
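One classic pruning criterion the project could start from is L1-norm filter ranking (as in Li et al., "Pruning Filters for Efficient ConvNets"): score each filter of a convolution by the L1 norm of its weights and keep the strongest ones. The sketch below (PyTorch, my own illustration) omits the follow-on step of rewiring the next layer's input channels, which a full implementation would need.

```python
import torch
import torch.nn as nn

def l1_keep_indices(conv: nn.Conv2d, keep_ratio: float):
    # conv.weight has shape (out_channels, in_channels, kH, kW);
    # the sum of absolute weights per output filter is its saliency score.
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    return torch.topk(scores, n_keep).indices.sort().values

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
keep = l1_keep_indices(conv, keep_ratio=0.5)
pruned = nn.Conv2d(3, len(keep), kernel_size=3)
pruned.weight.data = conv.weight.data[keep].clone()   # copy surviving filters
pruned.bias.data = conv.bias.data[keep].clone()
print(conv.weight.shape, "->", pruned.weight.shape)   # (16,3,3,3) -> (8,3,3,3)
```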
Unbiased segmentation of fetal brain structures from multi-site image data | Ana Namburete | Artificial Intelligence and Machine Learning | C | Increasing digitisation of the medical domain has led to the creation of large amounts of data, which will continue to rise in the near future. As these datasets become increasingly available, there is an opportunity to combine data from multiple sites to generate unified "big data" stores. This has the advantage of improving the anatomical diversity of training datasets, thereby providing a better overall understanding of disease mechanisms, and increasing statistical power. However, multi-site medical data tends to contain hidden biases that are imperceptible to a human observer, but result in a drop in the predictive power of deep learning models. Heterogeneous data from multiple sites may also contain undesirable non-biological variance, even when attempts are made to reduce it by following standardised acquisition protocols or using identical medical equipment. The goal of this project is to design deep learning algorithms that are applicable to heterogeneous ultrasound image datasets. Specifically, the student will be tasked with implementing convolutional neural network (CNN) architectures to automatically delineate subcortical structures (e.g. cerebellum, lateral ventricles, choroid plexus) in 3D ultrasound images of the fetal brain. The volumetric measurements will then be used to construct population growth charts that can serve as a reference for the global population. Domain adaptation techniques will also be explored to solve the "harmonisation problem" by removing scanner-specific (and/or site-specific) biases in the model's predictions. This project is suitable for a student interested in deep learning, algorithm development, and image analysis. While programming experience is desirable (ideally Python), no prior knowledge of brain anatomy is necessary. References: The first two papers are on deep learning-based approaches to the harmonisation problem, and the last two papers demonstrate the feasibility of segmenting subcortical structures in 3D ultrasound images. Please visit our website and the project pages for further information. Model harmonisation [Project page]
• Deep learning-based unlearning of dataset bias for MRI harmonisation and confound removal. Dinsdale, NK, Jenkinson, M, Namburete, AIL. NeuroImage 2021
• Unlearning scanner bias for MRI harmonisation. Dinsdale, NK, Jenkinson, M, Namburete, AIL. MICCAI 2020
Structural segmentation [Project page]
• Subcortical Segmentation of the Fetal Brain in 3D Ultrasound using Deep Learning. Hesse, LS, Aliasi, M, Moser, M, the INTERGROWTH-21st Consortium, Jenkinson, M, Haak, M, Namburete, AIL. NeuroImage 2021
• Cortical plate segmentation using CNNs in 3D fetal ultrasound. Wyburd, MK, Jenkinson, M, Namburete, AIL. MIUA 2020 |
Concurrent Algorithms and Data Structures | Hanno Nickau | Quantum | C | MSc | Projects in the area of Concurrent Algorithms and Data Structures | ||||||||||
Concurrent Programming | Hanno Nickau | Quantum | B | C | MSc | Projects in the area of Concurrent Programming | |||||||||
Model checking of POMDPs | David Parker | Automated Verification | B | C | MSc | Formal verification techniques have recently been developed for probabilistic models with partial observability, notably partially observable Markov decision processes (POMDPs), and implemented in software tools such as PRISM. This project will investigate extensions of these techniques, including for example the applicability of sampling based solution methods, adaptation to more expressive temporal logics or extension to partially observable stochastic games. | |||||||||
Model checking of stochastic games | David Parker | Automated Verification | B | C | MSc | This project will develop formal verification techniques for stochastic games, in particular by considering techniques based on game-theoretic notions of equilibria. A range of such techniques are already implemented in the PRISM-games model checker. This project will consider extensions of these approaches to tackling new types of equilibria, such as Stackelberg equilibria, with potential directions including designing extended temporal logics and solution methods, or modelling new applications, for example from multi-robot coordination. | |||||||||
Probabilistic Model Checking under Uncertainty | David Parker | Automated Verification | B | C | MSc | Formal methods for analysing models such as Markov chains and Markov decision processes can be extended to explicitly reason about model uncertainty, for example by building and analysing interval Markov decision processes. This project will investigate alternative approaches to tackling this problem, which could include alternative models of transition probability uncertainty, factoring in dependencies between different sources of uncertainty, or using Bayesian inference to learn model parameters. |
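As a flavour of the analysis involved, here is a sketch (my own, with illustrative numbers) of robust, pessimistic value iteration over a tiny interval MDP: within the given probability intervals, the environment adversarially picks the distribution that minimises the agent's value, which the inner routine does greedily by pushing slack probability mass onto the lowest-valued successors. It assumes each interval set satisfies sum(lows) <= 1 <= sum(highs).

```python
def worst_case_expectation(values, lows, highs):
    # Start every successor at its lower bound, then assign the remaining
    # probability mass to the lowest-valued successors first.
    order = sorted(range(len(values)), key=lambda i: values[i])
    probs = list(lows)
    slack = 1.0 - sum(lows)
    for i in order:
        add = min(highs[i] - probs[i], slack)
        probs[i] += add
        slack -= add
    return sum(p * v for p, v in zip(probs, values))

# actions[state] -> list of (reward, lows, highs) over successor states {0, 1}
actions = {
    0: [(0.0, [0.6, 0.1], [0.9, 0.4]),    # risky action with uncertain transitions
        (0.1, [0.0, 1.0], [0.0, 1.0])],   # safe move straight to state 1
    1: [(1.0, [0.0, 1.0], [0.0, 1.0])],   # rewarding self-loop via state 1
}

V = {0: 0.0, 1: 0.0}
for _ in range(50):                        # discounted robust value iteration
    V = {s: max(r + 0.9 * worst_case_expectation([V[0], V[1]], lo, hi)
                for (r, lo, hi) in acts)
         for s, acts in actions.items()}
print(V)
```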
![]() |
Building databases of mathematical objects in Sagemath (Python) | Dmitrii Pasechnik | Algorithms and Complexity Theory | B | C | MSc |
Prerequisites: Interest in open source software, some knowledge of Python, some maths background. |
|||||||||
![]() |
Computing with semi-algebraic sets in Sage (math) | Dmitrii Pasechnik | Algorithms and Complexity Theory | B | C | MSc | Semi-algebraic sets are subsets of R^n specified by polynomial inequalities. The project will extend the capabilities of Sage (http://www.sagemath.org) to deal with them, such as CAD (cylindrical algebraic decomposition) computations or sums-of-squares based (i.e. semi-definite programming based) methods. There might be a possibility for participation in Google Summer of Code (GSoC) or Google Semester of Code with Sage as a GSoC organisation. Prerequisites: Interest in open source software, some knowledge of Python, appropriate maths background. |
|||||||||
![]() |
Implementing and experimenting with variants of Weisfeiler-Leman graph stabilisation | Dmitrii Pasechnik | Algorithms and Complexity Theory | B | C | MSc | Weisfeiler-Leman graph stabilisation is a procedure that plays an important role in modern graph isomorphism algorithms (L. Babai 2015) and has recently enjoyed attention in the machine learning community (cf. "Weisfeiler-Lehman graph kernels"). While it is relatively easy to implement, efficient portable implementations seem to be hard to find. In this project we will work on producing such an open-source implementation, either by reworking our old code from http://arxiv.org/abs/1002.1921 or by producing a new one; we will also work on providing Python/Cython interfaces for possible inclusion in the SageMath (http://sagemath.org) library, and/or elsewhere. | |||||||||
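To make the procedure concrete, here is a minimal Python sketch of the basic 1-dimensional colour refinement step, without any of the engineering an efficient portable implementation would need:

```python
def wl_colours(adj, rounds=None):
    """Basic 1-dimensional Weisfeiler-Leman (colour refinement).
    adj: dict mapping each vertex to an iterable of its neighbours."""
    colour = dict.fromkeys(adj, 0)
    for _ in range(rounds or len(adj)):
        # signature = own colour plus the multiset of neighbour colours
        sig = {v: (colour[v], tuple(sorted(colour[u] for u in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: palette[sig[v]] for v in adj}
        if refined == colour:          # stable partition reached
            break
        colour = refined
    return colour

# Two graphs with different stable colour histograms cannot be isomorphic:
# path = {0: [1], 1: [0, 2], 2: [1]}; sorted(wl_colours(path).values())
```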
![]() |
Automatic translation to GPGPU | Joe Pitt-Francis | Computational Biology and Health Informatics | B | C | This project involves running cardiac cell models on a high-end GPU card. Each model simulates the electrophysiology of a single heart cell and can be subjected to a series of computational experiments (such as being paced at particular heart rates). For more information about the science, and to see it in action on the CPU, see the "Cardiac Electrophysiology Web Lab" at https://travis.cs.ox.ac.uk/FunctionalCuration/ An existing compiler (implemented in Python) is able to translate from a domain-specific XML language (http://models.cellml.org) into a C++ implementation. The goal of the project is to add functionality to the compiler in order to generate OpenCL or CUDA implementations of the same cell models and thus to increase the efficiency of the "Web Lab". |
||||||||||
![]() |
General GPGPU and high performance computing projects | Joe Pitt-Francis | Computational Biology and Health Informatics | B | C | I am willing to supervise projects which fit into the general areas of General Purpose Graphics Processing Unit (GPGPU) programming and High-Performance Computing (HPC). Specific technologies used are likely to be based around
All application areas are considered, although geometric algorithms are favourites of mine. |
||||||||||
![]() |
General graphics projects | Joe Pitt-Francis | Computational Biology and Health Informatics | B | C | I am interested in supervising general projects in the area of computer graphics. If you have a particular area of graphics-related research that you are keen to explore then we can tailor a bespoke project for you. Specific projects I have supervised in the past include
(Such a project is generally not suitable for MSc students. They should note that, in order for this option to work as a potential MSc project, it should be combined with a taught-course topic such as machine learning, concurrent programming, linguistics etc.) |
||||||||||
![]() |
Graphics pipeline animator | Joe Pitt-Francis | Computational Biology and Health Informatics | B | Prerequisites: Computer graphics, Object-oriented programming The idea behind this project is to build an educational tool which enables the stages of the graphics pipeline to be visualised. One might imagine the pipeline being represented by a sequence of windows; the user is able to manipulate a model in the first window and watch the progress of her modifications in the subsequent windows. Alternatively, the pipeline might be represented by an annotated slider widget; the user inputs a model and then she moves the slider down the pipeline, watching an animation of the process. |
|||||||||||
![]() |
Intuitive exploration through novel visualisation | Joe Pitt-Francis | Computational Biology and Health Informatics | B | C | I am interested in novel visualisation as a way to represent things in a more appealing and intuitive way. For example, the Gnome disk usage analyzer (Baobab) uses either a "ring chart" or a "treemap chart" representation to show us which sub-folders are using the most disk. In the early 1990s the IRIX file system navigator used a 3D skyscraper representation to show us similar information. There are plenty more ways of representing disk usage: from DAGs to centralised Voronoi diagrams. What kind of representation is most intuitive for finding a file which is hogging disk space, and which is most intuitive for helping us to remember where something is located in the file-system tree? The aim is to explore other places where visualisation can provide intuition: for example, to visualise the output of a profiler to find bottlenecks in software, to visualise the output of a code coverage tool in order to check that test suites are testing the appropriate functionality, or even to visualise the prevalence of diabetes and heart disease in various regions of the country. | ||||||||||
![]() |
mmWave technology | Kasper Rasmussen | Security | MSc | One of Stofl’s primary goals is to help facilitate the expansion of the Internet of Things out into the nooks and crannies of the world. The context of this project is our vision for wildlife technology and anti-poaching security. Having secured a partner in the Moholoholo Rehabilitation Centre in South Africa (home to Stoffel the Honey Badger, inspiration for our company, Stofl), we are providing them with a LoRaWAN network to deploy these technologies. In discussions with the Director of the facility and his son, a well-known anti-poacher in the area, covering a large number of possible LoRa-based solutions for the centre, from soil moisture monitoring to feed monitoring, it became clear that the most pressing issue they face is poachers breaking in and viciously killing the animals for their skin or tusks. This project expands on work started by a student at Imperial College London examining the use of low-power mmWave sensors to detect people, movement and concealed weapons, and to distinguish between them using AI algorithms. The purpose of this project is to design a lightweight, portable, battery-operated sensor to assist in anti-poaching exercises in South Africa. While the initial solution concerns the differentiation of various objects, it would need to consider future requirements such as the detection of, and differentiation between, weapons made of metal and other materials. Objectives: The study should carry forward the research completed in the previous project, with the goal of developing a lightweight, low-power device capable of detecting the metal casing of a bullet from a minimum range of 2 metres. Once a baseline is set for what is currently available on the market, the project should focus on optimising the solution by increasing detection range, capabilities and power efficiency. Our overall objective is to provide a framework for potential solutions to be trialled at the Moholoholo Rehabilitation Centre to prevent poachers from killing the animals. Initial objectives involve researching and categorising what is currently available on the market and comparing the options against their inherent limitations. As the project develops, further investigation into potential avoidance methods and security threats of the solutions should be considered. Support: Stofl’s ethos is openness and transparency, and our schedule is available for anyone to book a meeting with us should a formal time be required. We invite any interested persons to reach out to us and engage in a discussion; we are available through any communications platform, whether by email, WhatsApp, Zoom, Google Meet, FaceTime or face to face. We understand that each individual is unique and we are ready to engage with anyone who we feel is a good fit for us. We will provide regularly scheduled meetings; availability through WhatsApp, Discord, email, etc.; and access to our extensive network of partners, companies and experts. Project slides can be viewed here
|
|||||||||||
![]() |
Fast Emulation of Cardiac Mechanics with Machine Learning towards Enabling Real-Time Digital Twinning of the Heart | Blanca Rodriguez, Julia Camps, Zhinuo (Jenny) Wang | Computational Biology and Health Informatics | MSc | Cardiac electromechanical simulations of the human heart are computationally expensive, with a single beat taking hours to simulate on a supercomputer. This has prevented realising the vision of cardiac digital twins for real-time therapy testing in clinical practice. Graph Neural Networks have recently demonstrated great promise in replacing finite-element methods for solving mesh-based PDEs, and several works have demonstrated the emulation of complex dynamics in physical systems to a high level of fidelity. The student will generate training data through simulations of cardiac mechanics, design and train the machine learning model, and test the machine learning models under different conditions. Research question: How accurately can machine learning emulate patient-specific cardiac mechanics? The expected contribution is to train a machine learning algorithm on simulation data to provide a fast alternative to existing simulation frameworks. References: Paper on Graph Neural Networks for emulating passive left ventricle mechanics (source code available): https://www.sciencedirect.com/science/article/pii/S0045782522006004 ; this project would consider a biventricular geometry of the heart. Paper on modelling and simulation of cardiac electromechanics for investigating mechanisms of disease post-myocardial infarction: https://www.biorxiv.org/content/10.1101/2022.02.15.480392v1.abstract |
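For orientation, the core operation in such graph networks is a message-passing step over the mesh. A minimal NumPy sketch of one such layer follows; this is a generic mean-aggregation graph convolution of our own choosing, not the architecture of the papers above:

```python
import numpy as np

def message_passing_layer(H, A, W):
    """One mean-aggregation graph convolution over a mesh graph.
    H: node features, shape (n, d); A: adjacency matrix, shape (n, n);
    W: learned weights, shape (d, d_out)."""
    A_hat = A + np.eye(A.shape[0])           # include each node's own state
    A_norm = A_hat / A_hat.sum(axis=1, keepdims=True)
    return np.maximum(0.0, A_norm @ H @ W)   # ReLU(A_norm @ H @ W)
```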
![]() |
Statistical shape atlas of cardiac anatomy | Blanca Rodriguez, Pablo Lamata | Computational Biology and Health Informatics | MSc | Description: Cardiac remodelling is the change in shape of the anatomy caused by disease processes. 3D computational meshes encode shape variation in cardiac anatomy, and offer higher diagnostic value than conventional geometrical metrics. Better shape metrics will enable earlier detection, more accurate stratification of disease, and more reliable evaluation of the remodelling response. This project will contribute to the development of a toolkit for the construction of anatomical atlases of cardiac anatomy, and to its translation to clinical adoption. The student will learn about the challenges and opportunities of the cross-disciplinary field between image analysis, computational modelling and cardiology, and be part of a project with a big potential impact on the management of cardiovascular diseases. Prerequisites: Motivation. Good programming skills. Experience with computational graphics and image analysis is an advantage. |
|||||||||||
![]() |
A Light-Colour Geolocator | Alex Rogers | Systems | B | C | MSc | A light-level geolocator is a small device that uses measurements of the timing and duration of sunset and sunrise to estimate its location on the Earth. This is commonly used for tracking the movements of long-distance migratory animals such as birds. Modern sensors now provide the possibility of measuring light-colour as well as intensity. Widely used in laptops and smartphones to automatically adjust the screen colour to ambient conditions, these sensors provide intensity measurements in 11 spectral bands. In this project, you will explore whether this additional colour information can improve the accuracy of traditional light-level geolocators. You will model how ambient light colour changes over the course of the day and then use this model to estimate the location of a prototype sensor that collects hourly ambient light-colour measurements. The work will test the feasibility of this approach for use within future wildlife tracking tags. | |||||||||
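For intuition about the underlying inference, here is a small Python sketch of classical light-level geolocation from day length and solar noon. It uses the standard sunrise-equation approximations, ignores refraction and the equation of time, and is our own illustration; the colour-based extension proposed in this project goes beyond it:

```python
import numpy as np

def latitude_from_day_length(day_of_year, day_length_hours):
    """Latitude (degrees) from observed day length via the sunrise equation
    cos(H) = -tan(lat) * tan(decl). Unreliable near the equinoxes, where
    day length barely varies with latitude."""
    decl = np.radians(-23.44 * np.cos(2 * np.pi * (day_of_year + 10) / 365))
    H = np.radians(7.5 * day_length_hours)   # half the day arc, in degrees
    return np.degrees(np.arctan(-np.cos(H) / np.tan(decl)))

def longitude_from_solar_noon(solar_noon_utc_hours):
    """Each hour of solar-noon offset from 12:00 UTC is 15 degrees."""
    return 15.0 * (12.0 - solar_noon_utc_hours)
```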
![]() |
Acoustic source localisation with AudioMoth | Alex Rogers | Systems | B | C | MSc | AudioMoth (https://www.openacousticdevices.info) is an open-source low-cost acoustic recorder that is widely used for environmental monitoring. It has recently been extended to use the pulse-per-second signal from a GPS receiver to synchronise recordings. This allows large numbers of AudioMoth devices to be deployed, all generating precisely aligned recordings that can then be used to determine the location of acoustic events captured by multiple recordings. In this project, you will develop the necessary software to identify and localise acoustic events from such synchronised recordings. The resulting software will be used by conservation biologists who seek to use synchronised recordings to count individual animals and to study the spatial and temporal patterns of calls within groups of animals. | |||||||||
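The usual starting point is time-difference-of-arrival estimation between pairs of recorders. A minimal NumPy sketch (illustrative only; real recordings would need filtering and sub-sample interpolation):

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air, approximately

def time_delay(x, y, fs):
    """Delay (seconds) of y relative to x, via peak cross-correlation.
    x, y: equal-length mono recordings sampled at fs Hz."""
    corr = np.correlate(x, y, mode="full")
    lag = np.argmax(corr) - (len(y) - 1)
    return lag / fs

# A delay of t seconds means the source is about 343*t metres closer to
# one recorder; each recorder pair thus constrains the source to a
# hyperbola (2D) or hyperboloid (3D), intersected across pairs.
```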
![]() |
Chaos and the LGP-30 | Alex Rogers | Systems | B | C | MSc | The Royal McBee LGP-30 was an early desk-sized computer developed in 1956. It used a clever design, with a large spinning drum holding 16KB of memory, from which bits were read and processed sequentially. Famously, this computer was used by Edward Lorenz, running code written by Ellen Fetter and Margaret Hamilton, to discover and demonstrate the core properties of chaotic systems: deterministic nonperiodic flow and sensitive dependence on initial conditions. Lorenz’s original paper presented the first hand-drawn plots of what is now known as the Lorenz attractor (www.astro.puc.cl/~rparra/tools/PAPERS/lorenz1962.pdf). It described in detail how the calculations were performed and provided printouts of computed results over multiple iterations. Unfortunately, the original code that generated these results is lost. In this project, you will use a simulator of the LGP-30 to reverse engineer this lost code from the details in the paper. In doing so, we hope to rediscover the design choices, made by Ellen Fetter and Margaret Hamilton, that were necessary to perform this computation on such a limited computer. | |||||||||
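For reference, the system in question is easy to reproduce on a modern machine; a minimal Python sketch with a simple forward-Euler step (Lorenz's own scheme was a two-step "double approximation" method, so printouts from the paper will not match this exactly):

```python
import numpy as np

def lorenz_orbit(steps=5000, dt=0.01, state=(0.0, 1.0, 0.0),
                 sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz equations with forward Euler."""
    x, y, z = state
    orbit = []
    for _ in range(steps):
        dx = sigma * (y - x)
        dy = x * (rho - z) - y
        dz = x * y - beta * z
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        orbit.append((x, y, z))
    return np.array(orbit)
```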
![]() |
EXECUTIVE and WAITLIST for ARM M0+ | Alex Rogers | Systems | B | C | MSc | Most modern real-time operating systems for small microcontrollers, such as those using an ARM M0+ processor, allow for a small number of processes which are defined at compile time. In contrast, one of the first examples of a real-time operating system, deployed on the Apollo Guidance Computer (AGC), was more sophisticated. It allowed short WAITLIST tasks to be scheduled at set intervals. Longer EXECUTIVE jobs could be created with different priorities at runtime, typically by the WAITLIST tasks themselves, and these would be completed using whatever spare processor time was available. Famously, during the Apollo 11 landing, the AGC running in the lunar lander ran out of space for new EXECUTIVE jobs, triggering a 1202 alarm. Although low priority jobs updating the astronauts’ displays failed to run, the higher priority jobs controlling the spacecraft continued as normal and the crew landed safely. This approach to real-time operating system design appears to still offer some advantages. To explore these advantages, in this project, you will develop a real-time operating system for an ARM M0+ processor modelled on this approach (a toy simulation of the scheduling pattern is sketched below). Prerequisites: Digital Systems useful but not essential. |
|||||||||
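A discrete-time caricature of the AGC pattern in Python, to fix the idea; all class and method names are ours, and a real implementation would of course live in the kernel of the target RTOS:

```python
import heapq, itertools

class ToyAGCScheduler:
    """Time-triggered WAITLIST tasks plus priority-ordered EXECUTIVE jobs
    run in whatever spare time remains."""
    def __init__(self):
        self.waitlist = []              # heap of (due_time, seq, task)
        self.jobs = []                  # heap of (priority, seq, job)
        self._seq = itertools.count()   # tie-breaker for equal keys

    def at(self, time, task):
        heapq.heappush(self.waitlist, (time, next(self._seq), task))

    def spawn(self, priority, job):     # lower number = higher priority
        heapq.heappush(self.jobs, (priority, next(self._seq), job))

    def run(self, until):
        for now in range(until + 1):
            while self.waitlist and self.waitlist[0][0] <= now:
                _, _, task = heapq.heappop(self.waitlist)
                task(self, now)         # tasks typically spawn jobs
            if self.jobs:               # spare time: run the best job
                _, _, job = heapq.heappop(self.jobs)
                job(now)
```

In this toy model, a 1202-style overload corresponds to the jobs heap growing faster than spare time can drain it.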
![]() |
Resurrecting Extinct Computers | Alex Rogers | Systems | B | C | MSc | While the architecture of current reduced instruction set processors is well established, and relatively static, the early days of computing saw extensive experimentation and exploration of alternative designs. Commercial processors developed during the 1960s, 1970s and 1980s included stack machines, LISP machines and massively parallel machines, such as the Connection Machine (CM-1) consisting of 65,536 individual one-bit processors connected as a 12-dimensional hypercube. This period also saw the development of the first single chip microprocessors, such as the Intel 4004, and the first personal computers, such as the Altair 8800 using the Intel 8080 microprocessor. This project will attempt to resurrect one of these extinct designs (or a scaled-down version if necessary) using software simulation or a low-cost field-programmable gate array (FPGA). You will be required to research the chosen processor, using both original and modern sources, and then develop a register-level description of the device that can be implemented in software or on an FPGA. The final device should be able to run the software of the original. Prerequisites: Digital Systems and Computer Architecture useful but not essential. |
|||||||||
![]() |
Applications of ad hoc security | Bill Roscoe | Security | MSc | My group have developed novel technology for developing security between pairs and groups of devices such as mobile phones. This technology clearly has many potential applications: the objective of this project is to build an implementation directed at an application of the student's choosing, for example email, healthcare or social networking. See my papers with Bangdao Chen, Ronald Kainda and Long Nguyen on my web page. Prerequisites: Security | |||||||||||
![]() |
Developing and using SVA | Bill Roscoe | Automated Verification | MSc | SVA is a front end for FDR that analyses shared variable programs. See Chapters 18 and 19 of Understanding Concurrent Systems. It has the potential to support a number of projects, including: (a) building a version that accepts a subset of some standard programming language as opposed to the custom language it presently uses; (b) extending its functionality; (c) case studies using it. (a) and (b) would both involve programming in CSP and Java, the two languages SVA is written in. Prerequisites: Concurrency for (a) and (b) | |||||||||||
![]() |
Modelling and verifying systems in Timed CSP and FDR | Bill Roscoe | Automated Verification | B | C | MSc | Timed CSP reinterprets the CSP language in a real-time setting and has a semantics in which the exact times of events are recorded as well as their order. Originally devised in the 1980s, it has only just been implemented as an alternative mode for FDR. The objective of this project is to take one or more examples of timed concurrent systems from the literature, implement them in Timed CSP, and where possible compare the performance of these models with similar examples running on other analysis tools such as Uppaal. References: Understanding Concurrent Systems, especially Chapter 15, and Model Checking Timed CSP, from AWR's web list of publications. |
|||||||||
![]() |
Category-theoretic syntactic models of programming languages | Philip Saville, Sam Staton | Programming Languages | C | MSc | Syntactic models -- categories constructed from the syntax of a programming language -- play a key role in category-theoretic denotational semantics. Showing such models exist and satisfy a suitable universal property amounts to giving a sound and complete semantic interpretation for the language in question. Often this involves carefully studying the interplay between program features and categorical structures. The three main aims of the project are as follows. Firstly, to construct syntactic models for two idealised effectful functional programming languages, namely Moggi's monadic metalanguage [1] and computational lambda calculus [2]. Next, to prove their universal properties, and finally to use these to give syntactic translations between the languages. The starting point would be to understand the categorical semantics of the simply-typed lambda calculus, the monadic metalanguage and the computational lambda calculus (for reference, the relevant monad laws are recalled below). Extensions would include exploring extensionality / well-pointedness and constructing fully abstract syntactic models of these languages. [1] E. Moggi, "Notions of computation and monads," 1991 [2] E. Moggi, "Computational lambda-calculus and monads," 1989 Pre-requisite courses: - Categories, proofs and processes Useful background courses: - Principles of programming languages - Lambda calculus and types
|
||||||||||
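As background for the monadic metalanguage, the structure a syntactic model must carry is (at minimum) a Kleisli triple; for reference, the three laws in standard notation (our summary, not taken from the project description):

```latex
% Kleisli triple (T, \eta, (-)^*): T maps an object A to an object TA of
% computations, \eta_A : A \to TA includes values as trivial computations,
% and f^* : TA \to TB sequences computations for each f : A \to TB.
\eta_A^{\,*} = \mathrm{id}_{TA}, \qquad
f^* \circ \eta_A = f, \qquad
(g^* \circ f)^* = g^* \circ f^* .
```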
![]() |
A fast portable grep | Simon Smith | Human Centered Computing | B | C | MSc | A classic paper by Ken Thompson describes a translation of regular expressions into nondeterministic finite automata, with the transition functions of the automata represented as dynamically generated machine code for the IBM 7090 computer. The aim of this project is to recreate Thompson's presentation using a more recent JIT framework called Thunder, yielding a fast implementation of the grep utility for pattern matching in files. Another page has more details and hints. | |||||||||
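For reference, the portable (non-JIT) baseline such a grep would compete against is the standard set-of-states NFA simulation; a minimal Python sketch, with the transition table hand-written here where Thompson's construction would normally build it from the regex:

```python
def nfa_accepts(moves, eps, start, accept, text):
    """Set-of-states NFA simulation in the style of Thompson's matcher.
    moves: dict (state, char) -> set of states
    eps:   dict state -> set of states reachable by an epsilon edge"""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for r in eps.get(q, ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in text:
        current = closure({r for q in current
                             for r in moves.get((q, ch), ())})
    return accept in current

# NFA for (a|b)*abb: state 0 loops on a/b, then spells out "abb".
moves = {(0, 'a'): {0, 1}, (0, 'b'): {0}, (1, 'b'): {2}, (2, 'b'): {3}}
assert nfa_accepts(moves, {}, 0, 3, "aabb")
```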
![]() |
Adding timeouts to micro:bian | Michael Spivey | Programming Languages | B | C | MSc | For some of the tasks that follow, it will help to add a new form of the receive system call that provides a timeout. This does express something that cannot be done with a timer process and ordinary receive: it is not possible to cancel an alarm call from the timer process without risk of deadlock, because if the client process decides to cancel the alarm call at the same time as the timer process decides to deliver it, then they are stuck in a deadly embrace, each refusing to receive a message the other is committed to sending. Moving the timeout into the kernel allows it to be cancelled automatically when the receive completes. A suitable API will allow micro:bian to continue to function without a timer process, but allows a timer process to be installed and, via a special system call, to inform the kernel of the passage of time. | |||||||||
![]() |
Better register allocation | Michael Spivey | Programming Languages | B | C | MSc | The compiler we currently study in the last part of the Compilers course allocates registers greedily during the instruction selection phase. Even on the ARM, this can lead to unnecessary register-to-register moves, but a variant of the compiler for AMD64 suffers from the problem more severely. The goal of this project is to implement a better register allocation scheme that operates as a separate pass after instruction selection. The pass will work from an explicit representation of the instruction sequence without assigned registers, and will be able to take into account move instructions marked in the instruction sequence, wherever possible assigning the same register to both the source and target of the move, so that the instruction itself can be eliminated. Source material for the project can be found in the book, A retargetable C compiler by Fraser and Hanson, and in articles referenced in that book. | |||||||||
![]() |
Bluetooth for micro:bian | Michael Spivey | Programming Languages | B | C | MSc | A module already exists that implements radio communication using a proprietary on-air protocol that is shared with the standard micro:bit firmware. Open source implementations of the Bluetooth Low Energy protocol also exist, for example as part of Apache MyNewt. Port one of these to work with micro:bian. [The Nordic proprietary softdevice or 'blob' is not likely to help us much with this, being oblivious to any operating system.] | |||||||||
![]() |
Fast micro:bian context switch with floating point | Michael Spivey | Programming Languages | B | C | MSc | At present, the context-switch code in micro:bian does not save and restore the floating point registers on chips that have them. That means client programs cannot use the floating point unit, or more accurately, that at most one thread can use it. ARM provide a method, documented [somewhere], that allows for lazy saving of the floating-point registers. At the least, the kernel should save the floating point state for a process only if it has used the floating point unit; but better than that, it is possible to exploit the hardware to save the floating point state only if some other process wants to use floating point. This makes for efficient treatment of interrupts: if the device driver process doesn't use floating point, then there is no need to save the floating point registers when control passes from a background process to the driver, or to restore them when control returns to the background process. | |||||||||
![]() |
Handy links | Michael Spivey | Programming Languages | B | C | MSc | Concise documentation of the micro:bian API. A page about context switch for the Cortex-M3. Datasheet and reference manual for the FRDM-KL25Z. An app note about interrupt-driven I2C in Kinetis microcontrollers. The truth about timing for neopixels. | |||||||||
![]() |
Multiprocessor micro:bian | Michael Spivey | Programming Languages | B | C | MSc | The RP2040 chip on the Raspberry Pi Pico board has two Cortex-M0 cores. Make micro:bian run on the two cores symmetrically, using spinlocks to provide mutual exclusion for the kernel. | |||||||||
![]() |
Porting micro:bian to new boards | Michael Spivey | Programming Languages | B | C | MSc | Port micro:bian to an ARM-based microcontroller other than the Nordic chips and to boards other than the micro:bit. This implies adapting the micro:bian microkernel to the features of the other chip core, and writing device drivers for the peripherals, beginning with the serial port. For one-chip boards, it's probably best to begin by using an external USB-to-serial adapter. | |||||||||
![]() |
Projects related to micro:bian | Michael Spivey | Programming Languages | B | C | MSc | micro:bian is a very small embedded operating system that is used in the first year course on Digital Systems. At present, it runs on the ARM-based microcontroller on the BBC micro:bit board, but it would be good to port it to other boards and other processors. Unlike most other embedded operating systems, micro:bian provides process synchronization based on message passing, with interrupts converted into messages at the lowest level in the system. This encourages a style of programming where maximum advantage is made of concurrent processes, with minimum use of shared variables. The design is inspired by Andrew Tanenbaum's operating system Minix, an implementation of the Unix system call interface that uses message passing inside. At present, micro:bian is specific to the BBC micro:bit. But plenty of other inexpensive ARM-based development boards exist, such as the Freescale FRDM-KL25Z and FRDM-K64F boards, and many other boards that are part of the MBED ecosystem. The aim of these projects is to port micro:bian to one or more of these other boards, identifying and isolating the parts that are board-specific and providing multiple implementations of device driver processes so that micro:bian applications are as far as possible portable between boards. In addition, we would like to support a wider range of peripherals and communication methods, such as USB or Bluetooth: for complicated protocols like these, a good way forward would be to obtain an open-source software stack and modify it to work with micro:bian. Microcontroller development boards are divided into what I call "one-chip" and "two-chip" boards. On a two-chip board like the micro:bit, there are two microcontrollers: one that we program, and another microcontroller with fixed firmware that looks after USB communication, and is able to program the first chip using its debugging interface. The second chip also acts as a serial-to-USB adapter for the serial port on the main chip. The micro:bit and the Freescale boards are like this. On a one-chip board, there is no such second microcontroller, and the main chip contains bootloader firmware. These boards can be programmed over USB using the bootloader, or by using a separate debug probe (which probably contains an ARM-based microcontroller of its own). For a serial interface, you can either use a separate USB-to-serial adapter connected to the UART pins on the board, or include a USB driver in the firmware. Other boards don't have the switches and lights of the micro:bit but some, including the FRDM boards, support Arduino-compatible add-on 'shield' boards, and we could easily make a shield that replicates the features of the micro:bit. | |||||||||
![]() |
SD card storage for micro:bian | Michael Spivey | Programming Languages | B | C | MSc | The interface for SD cards is a variant of SPI, and they have a well-documented protocol. Implement a block storage service that uses this interface, and on top of it a file system, choosing either the MS-DOS disk format or the one used by Minix, either of which will allow the cards to be mounted on a Linux host. Open source implementations of the MS-DOS file system exist in the Arduino world as a guide. | |||||||||
![]() |
USB for micro:bian | Michael Spivey | Programming Languages | B | C | MSc | Simple, portable libraries exist that implement USB communications on microcontrollers, and some of them claim to be adaptable to multiple real-time operating systems. Choose one of them and incorporate it into micro:bian, adapting the OS interface to fit in with the message-passing style of concurrency. For a feasibility study, it's not necessary to implement every kind of USB device – a simulated serial port is enough. For additional hints, https://github.com/Spivoxity/avr-usb has examples of simple USB devices implemented with bit-banging on ATtiny microcontrollers, using a freely available USB library. | |||||||||
![]() |
micro:bian under pthreads | Michael Spivey | Programming Languages | B | C | MSc | Unix provides libraries that allow processes to consist of multiple threads in the same address space. For development, simulation and debugging, it is desirable to have an implementation of the micro:bian API on top of such a library, so that concurrent micro:bian programs can run on a host computer. There's a lab exercise for an ancient operating systems course that exploits this idea. | |||||||||
![]() |
### *CHOP*: Composable Higher-Order Processes | Bernard Sufrin | Programming Languages | B | C | MSc | The **Go** language runtime system has deservedly gained a reputation for efficiency and scalability across architectures with numbers of processors ranging from one to hundreds. But the **Go** language itself leaves something to be desired by those who seek higher-level ways of directly realizing systems as concurrent compositions of *well-specified* components communicating by means of channels. The goal of this project is to define a concurrent language that, like **Go**, encourages *inter-module information sharing by channel communication* (in contrast to encouraging the implementation of *inter-module communication by memory sharing*). Like [Communicating Scala Objects](CSO/cpa2008-cso-2016revision.pdf), which was modelled, to a certain extent, on [*OCCAM 3*](http://en.wikipedia.org/wiki/Occam_%28programming_language%29), it should incorporate robust language constructs for concurrent composition, alternation, and termination of composites. *CHOP* programming should have something of the flavour of *CSO* programming, but its design should be semantically conservative -- in the sense that it should be possible for *CHOP* programs to be simply translatable into efficient *Go* programs. Recent developments in the *Go* type system may make this easier than would have been possible earlier in the language's development. We have done some preliminary development of a *Go* library that supports CSO-like parallel composition, and CSO-like termination of networks of connected components using the propagation of channel closures. |
|||||||||
![]() |
### *GoOsE*: an extensible operating system kernel | Bernard Sufrin | Programming Languages | B | C | MSc | The goal of this project is to attempt to answer the question: *What would a small extensible operating system kernel implemented in* **Go** *look like?* We are not particularly ambitious, but we think it might be possible to build something useful for Raspberry-Pi-like machines along the lines of Martin Richards's venerable microkernel *TRIPOS* (http://www.cl.cam.ac.uk/~mr10/Cintpos.html). |
|||||||||
![]() |
### *Gobol*: dynamically-typed, prototype-based, object oriented language | Bernard Sufrin | Programming Languages | B | C | MSc | The goal of this project is to reimplement, in *Go*, an interpreter (or compiler+virtual machine) for a concurrent programming language based on the prototype-based object-oriented language [*Obol*](https://www.cs.ox.ac.uk/people/bernard.sufrin/personal/obol.pdf). |
|||||||||
![]() |
CSO-F -- Communicating Scala Objects with Fibres | Bernard Sufrin | Programming Languages | B | C | MSc | The [Scala](http://www.scala-lang.org/) language is a powerful object oriented language, with a type system that is much more flexible and expressive than Java's. Scala's compilers translate into code for the Java Virtual Machine. It is particularly convenient to express new programming constructs in Scala. In 2008 we designed and implemented an OCCAM/CSP-style concurrency library for Scala, called CSO (see [this paper](http://users.comlab.ox.ac.uk/bernard.sufrin/CSO/cpa2008-cso.pdf), and [the CSO distribution](http://users.comlab.ox.ac.uk/bernard.sufrin/CSO/)). The library has been used in a variety of applications as well as for teaching the principles of concurrent programming. The present CSO implementation maps each running process to a (kernel) thread -- usually taken from a thread pool. But kernel threads -- even when pooled -- are heavyweight entities, and their multiplexing across the resources of a real machine is not as efficient as we would like. The time is more than ripe for a reimplementation of CSO using a lighter-weight basis. Project [LOOM](https://cr.openjdk.java.net/~rpressler/loom/Loom-Proposal.html) has provided an implementation of **Fibres** that may do the trick. The aim of this project is to explore the use of Fibres as the basis for CSO processes. Much of the present CSO implementation should be reusable. |
|||||||||
![]() |
Modeless Structure Editing | Bernard Sufrin | Programming Languages | B | C | MSc | Oege de Moor and I wrote a paper called [Modeless Structure Editing](http://web.comlab.ox.ac.uk/oucl/work/bernard.sufrin/edit.pdf) for Tony Hoare's retirement symposium. What we had in mind was a family of modeless editors whose underlying model of the document being edited is akin to its *abstract syntax tree*. It's clear what this means if the document represents, for example, a program in a programming language or a proof in a proof calculus; but we conjecture that structured prose documents will (to some extent) be amenable to this approach. The first phase of this project will construct (in Scala) a prototype structure editing toolkit implementing the model explained in this paper. Subsequent phases will provide one or two case studies in its use. |
|||||||||
![]() |
"Parallel reasoning in Sequoia" | David Tena Cucala, Bernardo Cuenca Grau | Artificial Intelligence and Machine Learning, Data and Knowledge | C | MSc | Sequoia is a state-of-the-art system developed in the Information Systems Group of the University of Oxford for automated reasoning in OWL 2, a standard ontology language in the Semantic Web. Sequoia uses consequence-based calculus, a novel and promising approach towards fast and scalable reasoning. Consequence-based algorithms are naturally amenable to parallel reasoning; however, the current version of Sequoia does not take advantage of this possibility. This project aims at implementing and testing parallel reasoning capabilities in Sequoia. The student involved in this project will acquire knowledge of state-of-the-art techniques for reasoning in OWL 2. They will develop an implementation that exploits the characteristics of Sequoia's underlying algorithm to achieve parallel reasoning, and they will perform an evaluation of the system's performance against other state-of-the-art reasoners. Prerequisites: only suitable for someone who has done the |
||||||||||
![]() |
Multi-Modal Partially Labelled Streams | Philip Torre | MSc | Data in large systems often arrives as streams and is multi-modal, e.g. textual, images, videos, and/or sound. All this data is accumulated while jointly changing in distribution. Moreover, much of the data presented by the stream is only partially labelled. We seek to study the problem of training models on partially labelled streams in a multi-modal setting. In particular, we seek to find new, effective algorithms that perform joint self-supervised continual learning on the unlabelled data while learning from the labelled portion of the stream in a supervised fashion. |
![]() |
Robust Continual Learning: Effective Learning from Adversarial Streams | Philip Torre | MSc | Continual learning is an open problem with various real-world applications. While there is a large body of work on learning from large corpora of static, well-curated data, learning from streams where the distribution of the data varies per time step is a far more challenging problem. We seek to formulate settings where there is an adversary in the loop that can potentially alter or perturb the samples presented by the stream. The adversary's goal is to break continual learning algorithms without being detected. On the other hand, efficient robust continual learners can preserve high accuracy even in the worst-case scenario where every sample presented by the stream has been adversarially manipulated. We want to formulate this exciting new problem and propose effective robust continual learning algorithms for such a challenging setting. |
![]() |
ARETHA: A Transparent and Respectful Virtual Assistant for the Home | Max Van Kleek | Human Centered Computing | B | C | MSc | Virtual assistants like Alexa and Siri have become hugely popular conveniences in homes worldwide. However, they also pose unprecedented privacy risks by being black-box vehicles of corporate surveillance by the likes of Amazon, Apple and others. Is it possible to design a virtual assistant that can be guaranteed to protect user privacy, and be fully transparent and controllable by end-users? This project will involve designing and developing a fully modular, open source virtual assistant called ARETHA from the ground up, using open source components, which can provide strong guarantees about being privacy-preserving and pro-user. | |||||||||
![]() |
Privopticon: A privacy-preserving mesh OS for self-observation | Max Van Kleek | Human Centered Computing | B | C | MSc | Self-monitoring has many potential applications within the home, such as the ability to understand important health and activity rhythms within a household automatically. But such monitoring activities also carry extreme privacy risks. Can we design new kinds of sensing architectures that are designed to preserve inhabitants' privacy? We have made initial inroads with a new mesh operating system prototype for Raspberry Pi hardware for new classes of privacy-preserving self-monitoring applications for the home, which will provide user-configurable degrees of information fidelity, have built-in forgetting, and be accountable by design. We need your help to try and evaluate different kinds of methods for achieving these goals. |
|||||||||
![]() |
3D demos of geometric concepts | Irina Voiculescu | Computational Biology and Health Informatics | B | The Geometric Modelling course (Part B) deals with several interesting concepts which would be best visualised in a suite of applications. A coherent suite of 3D demos could easily become a useful tool for this course, as well as for users worldwide. This project would be most suitable for a candidate who already has some experience using a 3D graphics library of their choice and wants to improve this skill. The mathematical concepts are well-documented. | |||||||||||
![]() |
3D environment for Hand Physiotherapy | Irina Voiculescu | Computational Biology and Health Informatics | B | C | After hand surgery, patients almost always need physiotherapy to help with their recovery. As part of this, the patient will need to perform hand exercises at home. However, the patient may not always do the exercises correctly, or they might forget to do their exercises. The goal of this project is to use the Leap Motion to create a user-friendly GUI which a patient could use to aid them with their home exercises. The interface would show the user where their hand should be and they would then need to follow the movements. It could be web-based or downloadable software. It would need to be tailored to the patient so it contained their specific required exercises, which could be input by the physiotherapist. It would need to store data on how the patient is doing and feed this data back to the patient, and possibly also to the physiotherapist via the internet. If internet-based, patient confidentiality and security would need to be considered. This project would be performed in close collaboration with a physiotherapist, an orthopaedic hand surgeon, and a post-doctoral researcher based at the Nuffield Orthopaedic Centre. | ||||||||||
![]() |
3D printing medical scan data | Irina Voiculescu | Computational Biology and Health Informatics | B | C | MSc | Computed tomography (CT) scanning is a ubiquitous scanning modality. It produces volumes of data representing internal parts of a human body. Scans are usually output in a standard imaging format (DICOM) and come as a series of axial slices (i.e. slices across the length of the person's body, in planes perpendicular to the imaginary straight line along the person's spine.) The slices most frequently come at a resolution of 512 x 512 voxels, achieving an accuracy of about 0.5 to 1mm of tissue per voxel, and can be viewed and analysed using a variety of tools. The distance between slices is a parameter of the scanning process and is typically much larger, about 5mm. During the analysis of CT data volumes it is often useful to correct for the large spacing between slices. For example when preparing a model for 3D printing, the axial voxels would appear elongated. These could be corrected through an interpolation process along the spinal axis. This project is about the interpolation process, either in the raw data output by the scanner, or in the post-processed data which is being prepared for further analysis or 3D printing. The output models would ideally be files in a format compatible with 3D printing, such as STL. The main aesthetic feature of the output would be measurable as a smoothness factor, parameterisable by the user. Existing DICOM image analysis software designed within the Spatial Reasoning Group at Oxford is available to use as part of the project. |
|||||||||
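As a flavour of the interpolation step described in this project, a minimal Python sketch using SciPy to resample the slice axis (illustrative parameter names; a real pipeline would read the spacing from the DICOM headers):

```python
import numpy as np
from scipy.ndimage import zoom

def resample_slice_axis(volume, slice_spacing_mm, pixel_mm):
    """Interpolate between CT slices so voxels become roughly cubic,
    e.g. 5 mm slice spacing brought down to ~1 mm in-plane resolution.
    volume: array of shape (slices, rows, cols); order=1 is trilinear."""
    factor = slice_spacing_mm / pixel_mm
    return zoom(volume, (factor, 1.0, 1.0), order=1)
```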
![]() |
3D stereo display of medical scan data | Irina Voiculescu, Stuart Golodetz | Computational Biology and Health Informatics, Systems | B | C | The Medical Imaging research group has been working with a variety of data sourced from CT and MRI scans. This data comes in collections of (generally greyscale) slices which together make up 3D images. Our group has developed software to generate 3D models of the major organs in these images. This project aims to develop a simple augmented reality simulation for the Oculus Rift which will render these organs within a transparent model of a human and allow the user to walk around the model so as to view the organs from any angle. This has a number of possible applications, including to train medical students and to help surgeons to explain medical procedures to their patients. | ||||||||||
![]() |
Exact Algorithms for Complex Root Isolation | Irina Voiculescu | Computational Biology and Health Informatics | B | C | MSc | Not available in 2013/14. Isolating the complex roots of a polynomial can be achieved using subdivision algorithms. Traditional Newton methods can be applied in conjunction with interval arithmetic. Previous work (jointly with Prof Chee Yap and MSc student Narayan Kamath) has compared the performance of three operators: Moore's, Krawczyk's and Hansen-Sengupta's. This work makes extensive use of the CORE library, which is a collection of C++ classes for exact computation with algebraic real numbers and arbitrary precision arithmetic. CORE defines multiple levels of operation over which a program can be compiled and executed. Each of these levels provide stronger guarantees on exactness, traded against efficiency. Further extensions of this work can include (and are not limited to): (1) Extending the range of applicability of the algorithm at CORE's Level 1; (2) Making an automatic transition from CORE's Level 1 to the more detailed Level 2 when extra precision becomes necessary; (3) Designing efficiency optimisations to the current approach (such as confirming a single root or analysing areas potentially not containing a root with a view to discarding them earlier in the process); (4) Tackling the isolation problem using a continued fraction approach. The code has been included and is available within the CORE repository. Future work can continue to be carried out in consultation with Prof Yap at NYU. |
|||||||||
![]() |
Gesture recognition using Leap Motion | Irina Voiculescu | Computational Biology and Health Informatics | B | C | MSc | Scientists in the Experimental Psychology Department study patients with a variety of motor difficulties, including apraxia - a condition usually following stroke which involves lack of control of a patient over their hands or fingers. Diagnosis and rehabilitation are traditionally carried out by Occupational Therapists. In recent years, computer-based tests have been developed in order to remove the human subjectivity from the diagnosis, and in order to enable the patient to carry out a rehabilitation programme at home. One such test involves users being asked to carry out static gestures above a Leap Motion sensor, and these gestures being scored according to a variety of criteria. A prototype has been constructed to gather data, and some data has been gathered from a few controls and patients. In order to deploy this as a clinical tool into the NHS, there is need for a systematic data collection and analysis tool, based on machine learning algorithms to help classify the data into different categories. Algorithms are also needed in order to classify data from stroke patients, and to assess the degree of severity of their apraxia. Also, the graphical user interface needs to be extended to give particular kinds of feedback to the patient in the form of home exercises, as part of a rehabilitation programme. This project was originally set up in collaboration with Prof Glyn Humphreys, Watts Professor of Experimental Psychology. Due to Glyn's untimely death a new co-supervisor needs to be found in the Experimental Psychology Department. It is unrealistic to assume this project can run in the summer of 2016. |
|||||||||
![]() |
Identifying features in MRI scan data | Irina Voiculescu | Computational Biology and Health Informatics | B | C | MSc | In recent years, medical diagnosis using a variety of scanning modalities has become quasi-universal and has brought about the need for computer analysis of digital scans. Members of the Spatial Reasoning research group have developed image processing software for CT (tomography) scan data. The program partitions (segments) images into regions with similar properties. These images are then analysed further so that particular features (such as bones, organs or blood vessels) can be segmented out. The team's research continues to deal with each of these two separate meanings of medical image segmentation. The existing software is written in C++ and features carefully-crafted and well-documented data structures and algorithms for image manipulation. In certain areas of surgery (e.g. orthopaedic surgery involving the hip and knee joints) the magnetic resonance scanning modality (MRI) is preferred, both because of its safety (no radiation involved) and because of its increased visualisation potential. This project is about converting MRI scan data into a format that can become compatible with existing segmentation algorithms. The data input would need to be integrated into the group's analysis software in order then to carry out 3D reconstructions and other measurements. This project is co-supervised by Professor David Murray MA, MD, FRCS (Orth), Consultant Orthopaedic Surgeon at the Nuffield Orthopaedic Centre and the Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), and by Mr Hemant Pandit MBBS, MS (Orth), DNB (Orth), FRCS (Orth), DPhil (Oxon), Orthopaedic Surgeon / Honorary Senior Clinical Lecturer, Oxford Orthopaedic Engineering Centre (OOEC), NDORMS. | |||||||||
![]() |
Reinforcement learning techniques for games | Irina Voiculescu | Computational Biology and Health Informatics | B | C | This project is already taken for 2017-2018
Psychology has inspired and informed a number of machine learning methods. Decisions within an algorithm can be made so as to improve an overall aim of maximising a (cumulative) reward. Machine learning methods in this class are known as Reinforcement Learning. A basic reinforcement learning model consists of establishing a number of environment states, a set of valid actions, and rules for transitioning between states. Applying this model to the rules of a board game means that the machine can be made to learn how to play a simple board game by playing a large number of games against itself. The goal of this project is to set up a reinforcement learning environment for a simple board game with a discrete set of states (such as Backgammon). If time permits, this will be extended to a simple geometric game (such as Pong) where the states may have to be parameterised in terms of geometric actions to be taken at each stage in the game. A minimal sketch of the tabular setting follows below. |
||||||||||
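A tabular Q-learning sketch in Python, to fix the ideas; the `env` interface (`reset`, `actions`, `step`) is illustrative, not a standard API, and self-play for a board game would supply moves from the current policy for both sides:

```python
import random
from collections import defaultdict

def q_learning(env, episodes=10_000, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning against an environment exposing reset() -> state,
    actions(state) -> list, step(state, action) -> (state, reward, done)."""
    Q = defaultdict(float)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            acts = env.actions(s)
            if random.random() < epsilon:                 # explore
                a = random.choice(acts)
            else:                                         # exploit
                a = max(acts, key=lambda act: Q[(s, act)])
            s2, r, done = env.step(s, a)
            target = r if done else r + gamma * max(
                Q[(s2, a2)] for a2 in env.actions(s2))
            Q[(s, a)] += alpha * (target - Q[(s, a)])     # TD update
            s = s2
    return Q
```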
![]() |
Simple drawing analysis | Irina Voiculescu | Computational Biology and Health Informatics | B | C | MSc | Scientists in the Experimental Psychology Department study patients with a variety of motor difficulties, including apraxia - a condition usually following stroke which involves lack of control of a patient over their hands or fingers. Diagnosis and rehabilitation are traditionally carried out by Occupational Therapists. In recent years, computer-based tests have been developed in order to remove the human subjectivity from the diagnosis, and in order to enable the patient to carry out a rehabilitation programme at home. One such test involves users drawing simple figures on a tablet, and these figures being scored according to a variety of criteria. Data has already been gathered from 200 or so controls, and is being analysed for a range of parameters in order to assess what a neurotypical person could achieve when drawing such simple figures. Further machine learning analysis could help classify such data into different categories. Algorithms are also needed in order to classify data from stroke patients, and to assess the degree of severity of their apraxia. This project was originally co-supervised by Prof Glyn Humphreys, Watts Professor of Experimental Psychology. Due to Glyn's untimely death a new co-supervisor needs to be found in the Experimental Psychology Department. It is unrealistic to assume this project can run in the summer of 2016. |
|||||||||
![]() |
Efficient linear algebra for block structured matrices | Jonathan Whiteley | Computational Biology and Health Informatics | B | C | Many large systems of linear equations are sparse, i.e. if the matrix that describes the linear system is of size N times N, where N is large, there are very few non-zero entries in each row. Under these conditions there may not be enough memory to store the whole matrix, and so only the non-zero entries are stored. This prevents techniques such as LU decomposition being used to solve the linear system; instead an iterative technique is used, such as the conjugate gradient method for symmetric positive definite matrices, or GMRES for more general matrices. The number of iterations needed with these techniques can be large, rendering them inefficient. To prevent this, preconditioning techniques are used: if the linear system is defined by Ax=b, then a preconditioner P is used and the system solved is instead PAx = Pb, where P is cheap to calculate and both PAx and Pb are cheap to evaluate. In this project we will investigate matrices with a block structure that arises in many fields, such as constrained optimisation and continuum mechanics. We will utilise the block structure of these matrices to heuristically derive candidate preconditioners, and compare their performance (a small illustrative setup is sketched below). Prerequisites: linear algebra, continuous mathematics |
||||||||||
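A minimal Python/SciPy sketch of the kind of experiment involved, using a block-Jacobi preconditioner as a stand-in for the block-structure-aware preconditioners the project would actually derive (test matrix and block size are arbitrary choices of ours):

```python
import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import LinearOperator, cg

# A sparse symmetric positive definite test system.
n, k = 1000, 50                       # matrix size, block size
R = sparse_random(n, n, density=0.001, random_state=0)
A = (R @ R.T + 10 * identity(n)).tocsc()
b = np.ones(n)

# Block-Jacobi preconditioner: invert k-by-k diagonal blocks exactly.
blocks = [np.linalg.inv(A[i:i + k, i:i + k].toarray())
          for i in range(0, n, k)]

def apply_preconditioner(r):
    return np.concatenate([blocks[j] @ r[j * k:(j + 1) * k]
                           for j in range(len(blocks))])

M = LinearOperator((n, n), matvec=apply_preconditioner)
x, info = cg(A, b, M=M)               # preconditioned conjugate gradients
```

Comparing iteration counts with and without `M` is exactly the sort of performance comparison the project describes.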
![]() |
Parameter recovery for models described by differential equations | Jonathan Whiteley | Computational Biology and Health Informatics | B | C | Phenomena in many fields are described by differential equations where a quantity of interest (for example, depending on the phenomena modelled, a reaction rate, a capacitance, or a subject's cardiac output) appears as a parameter in the differential equation. We then determine the quantity of interest by choosing parameters in the differential equation so that the computed solution of the differential equation using these parameters matches experimental data as closely as possible. This may be posed as an optimisation problem that may be tackled by either a classical optimisation approach (as seen in the continuous mathematics course) or a Bayesian optimisation approach. The aim of this project is to compare these approaches. Prerequisites: linear algebra, continuous mathematics, probability | ||||||||||
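A minimal Python sketch of the classical-optimisation route, on a toy problem of our own devising (recovering a decay rate from noisy observations); the Bayesian alternative would replace the point estimate with a posterior over the parameter:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize_scalar

# Toy setting: recover the rate k in y' = -k y from noisy observations.
rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 5.0, 20)
y_obs = np.exp(-0.8 * t_obs) + 0.01 * rng.standard_normal(t_obs.size)

def misfit(k):
    """Sum-of-squares mismatch between the model solution and the data."""
    sol = solve_ivp(lambda t, y: -k * y, (0.0, 5.0), [1.0], t_eval=t_obs)
    return np.sum((sol.y[0] - y_obs) ** 2)

k_hat = minimize_scalar(misfit, bounds=(0.01, 5.0), method="bounded").x
# k_hat should land close to the true rate of 0.8
```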
![]() |
Model Checking LTL on Markov Chains | James Worrell | Automated Verification | B | C | The goal of this project is to write a program that model checks a Markov chain against an LTL formula, i.e., calculates the probability that the formula is satisfied. The two main algorithmic tasks are to efficiently compile LTL formulas into automata and then to solve systems of linear equations arising from the product of the Markov chain and the automaton. An important aspect of this project is to make use of an approach that avoids determinising the automaton that represents the LTL formula. This project builds on material contained in the Logic and Proof and Models of Computation courses. Reading: J-M. Couvreur, N. Saheb and G. Sutre. An optimal automata approach to LTL model checking of probabilistic systems. Proceedings of LPAR'03, LNCS 2850, Springer 2003. |
||||||||||
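The linear-algebra half of the task reduces, in the simplest case, to computing reachability probabilities in the product chain. A minimal NumPy sketch (our own simplification: it assumes the target is reached with positive probability from every non-target state, so that the system is nonsingular):

```python
import numpy as np

def reach_probabilities(P, target):
    """P(eventually reach a target state), per state, for a finite Markov
    chain with transition matrix P and boolean target mask.
    Solves (I - Q) x = b over the non-target states."""
    non = ~target
    Q = P[np.ix_(non, non)]                  # moves among non-target states
    b = P[np.ix_(non, target)].sum(axis=1)   # one-step entry into the target
    x = np.ones(P.shape[0])                  # target states have probability 1
    x[non] = np.linalg.solve(np.eye(non.sum()) - Q, b)
    return x
```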
![]() |
Topics in Linear Dynamical Systems | James Worrell | Automated Verification | C | MSc | A linear dynamical system is a discrete- or continuous-time system whose dynamics is given by a linear function of the current state. Examples include Markov chains, linear recurrence sequences (such as the Fibonacci sequence), and linear differential equations. This project involves investigating the decidability and complexity of various reachability problems for linear dynamical systems. It would suit a mathematically oriented student. Linear algebra is essential, and number theory or complexity theory is desirable. A relevant paper is Ventsislav Chonev, Joël Ouaknine, James Worrell: The orbit problem in higher dimensions. STOC 2013: 941-950. | ||||||||||
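To make the object of study concrete, a tiny Python sketch showing the Fibonacci sequence as the orbit of a linear dynamical system (exact integer arithmetic via object dtype; the reachability questions the project studies ask whether such an orbit ever hits a given set):

```python
import numpy as np

# x_{n+1} = M x_n with M = [[1, 1], [1, 0]] and x_0 = (1, 0) = (F_1, F_0).
M = np.array([[1, 1], [1, 0]], dtype=object)   # object dtype: exact integers

def fib(n):
    x = np.array([1, 0], dtype=object)
    for _ in range(n):
        x = M @ x
    return x[1]                                # F_n

assert [fib(n) for n in range(8)] == [0, 1, 1, 2, 3, 5, 8, 13]
```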
![]() |
Distributed Programming with Distributed Protocols in Scala | Nobuko Yoshida | Programming Languages | C | Session type systems facilitate the development of concurrent and distributed software by enabling programmers to describe communication protocols in types. Not only does this aid understanding and correctness of communicating code, it can guarantee desirable behavioural properties (e.g. deadlock-freedom). Concurrency libraries, e.g. Effpi, enable the use of session types in real-world programming languages and systems. Unfortunately, in practice, such libraries can require a significant number of channels between protocol participants, potentially hindering maintenance and becoming a source of error. The goal of this project is to raise the level of abstraction in the Effpi concurrency library. It will first focus on safely abstracting away low-level binary channels that are currently exposed to the programmer. The project will explore possible abstractions, e.g. session endpoints and skeletons of communication. Further work would be to explore additional abstractions to improve the correctness and maintainability of concurrent software using the Effpi library. Reference: http://dx.doi.org/10.1145/3314221.3322484 |
Distributed Programming with Distributed Protocols in Scala | Nobuko Yoshida | Programming Languages | B | Session type systems facilitate the development of concurrent and distributed software by enabling programmers to describe communication protocols in types. Not only does this aid the understanding and correctness of communicating code, it can guarantee desirable behavioural properties (e.g. deadlock-freedom). Concurrency libraries, e.g. Effpi (Scala), enable the use of session types in real-world programming languages and systems. The goal of this project is to understand the basics of session types, implement distributed protocols in Scala, and investigate their usability. Reference: [1] Session Types https://dl.acm.org/doi/pdf/10.1145/2873052 [2] http://dx.doi.org/10.1145/3314221.3322484
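For a flavour of the underlying idea — in a toy encoding, not Effpi's actual API — binary session types can be written as nested tuples, and two channel endpoints are compatible exactly when their types are dual, which is what rules out communication mismatches:

```python
# Toy encoding of binary session types: ("!", payload, cont) sends,
# ("?", payload, cont) receives, "end" closes the session.
def dual(t):
    if t == "end":
        return "end"
    op, payload, cont = t
    return ("?" if op == "!" else "!", payload, dual(cont))

client = ("!", "order", ("?", "price", "end"))  # send an order, await a price
server = dual(client)
print(server)  # ('?', 'order', ('!', 'price', 'end'))
```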
Formalism of The Go Language | Nobuko Yoshida | Programming Languages | B | Go is a popular statically typed programming language designed and implemented by Google, and used, among others, by Uber, Netflix, and Docker. The recent release of Go 1.18 finally realised the long-awaited addition of generics (parametric polymorphism) to the Go type system. Much academic work has been conducted to ensure that generics are correctly implemented, but there is still more to do [1,2,3,4]. This project aims to give a survey of those four papers and to undertake one of the following mini-projects: (1) build examples using the existing prototypes from [1,2,3,4], or (2) survey important features of Go 1.18, such as type inference and type sets [5], that are not covered in [1,2,3,4]. [1] https://dl.acm.org/doi/pdf/10.1145/3428217 [2] https://dl.acm.org/doi/10.1007/978-3-030-89051-3_7 [3] https://dl.acm.org/doi/10.1007/978-3-031-16912-0_7 [4] https://arxiv.org/abs/2208.06810v2 [5] https://go.googlesource.com/proposal/+/master/design/43651-type-parameters.md
Mechanisation of Distributed Protocol Specifications | Nobuko Yoshida | Programming Languages | C | Multiparty session types (MPST) offer a specification and verification framework for concurrency: communicating systems can be safely implemented in a distributed fashion when well-typed against local session types, provided that such local types are obtained by projection of a single choreographic protocol (global type). Multiple projects are available; please get in touch with nobuko.yoshida@cs.ox.ac.uk for more detail. Possible aims are: (i) exploring and implementing an algorithm for MPST protocol compositionality, or (ii) formalising correctness of advanced MPST systems (e.g., featuring merge, subtyping, or delegation). Students with a strong interest in the mechanisation of MPST in proof assistants (Coq, Isabelle, Idris) are very welcome to reach out. [1] Mechanisation Paper https://dl.acm.org/doi/10.1145/3453483.3454041 [2] Multiparty Session Types Paper https://link.springer.com/chapter/10.1007/978-3-030-36987-3_5
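To illustrate what projection means — in a toy encoding far simpler than the MPST systems of [1,2], covering only straight-line message sequences with no branching or merging — a global type listing the messages exchanged between roles can be projected onto each participant's local type:

```python
# Global type: ("msg", sender, receiver, label, continuation) or "end".
def project(g, role):
    if g == "end":
        return "end"
    _, frm, to, label, cont = g
    rest = project(cont, role)
    if role == frm:
        return ("send", to, label, rest)
    if role == to:
        return ("recv", frm, label, rest)
    return rest  # this role is not involved in this message

g = ("msg", "Buyer", "Seller", "order",
     ("msg", "Seller", "Buyer", "price", "end"))
print(project(g, "Seller"))
# ('recv', 'Buyer', 'order', ('send', 'Buyer', 'price', 'end'))
```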
Program Transformation of Distributed Protocol Specification | Nobuko Yoshida | Programming Languages | C | Session types [1] describe patterns of communication in concurrent and distributed systems. Moreover, they guarantee certain desirable behavioural properties, such as deadlock freedom. Although session type libraries exist for a range of programming languages, there is little programmer support for their integration into existing and legacy code. This raises the level of expertise required to introduce and maintain session-typed systems, and can consequently be a source of error. The goal of this project is to develop tools and techniques that can assist programmers in the introduction and manipulation of session types in legacy code. Inspired by similar work for parallelism [2], the project will focus on program transformation techniques, including refactoring [2,3,4], to develop safe and semi-automatic means not only to introduce session types, but to modify existing session types in situ. [1] N. Yoshida and L. Gheri: A Very Gentle Introduction to Multiparty Session Types. ICDCIT 2020: 73-93 [2] C. Brown, et al.: Refactoring GrPPI: Generic Refactoring for Generic Parallelism in C++. Int. J. Parallel Program. 48(4): 603-625 (2020) [3] T. Mens and T. Tourwé: A Survey of Software Refactoring. IEEE Trans. Software Eng. 30(2): 126-139 (2004) [4] S. J. Thompson and H. Li: Refactoring tools for functional languages. J. Funct. Program. 23(3): 293-350 (2013) [5] R. N. S. Rowe, et al.: Characterising renaming within OCaml's module system: theory and implementation. PLDI 2019: 950-965
Rust programming language for communication and distribution | Nobuko Yoshida | Programming Languages | C | Rust has been the most beloved programming language since 2016 according to Stack Overflow's annual developer survey. Thanks to its efficiency and memory safety, it is now one of the most popular languages for large-scale concurrent applications, used in Mozilla's Servo browser engine, Stratis (a file-system manager for Fedora), and Microsoft Azure. Memory safety is one of the core principles of Rust but does not extend to communication safety, making the implementation of deadlock-free protocols challenging. Our group has been working on implementing a library for asynchronous Rust programming, and we wish to tackle several challenges such as refinement types, dependent types, and reversible computing with Rust. Such projects can be both theoretical and practical. EXAMPLE (1): Dependent types and asynchronous message reordering. In this project, we are interested in developing the dependent type theory for asynchronous message reordering [2] and using Rumpsteak to implement the theory [1]. EXAMPLE (2): Reversible computing. In this project, we are interested in using Rust to implement a reversible process calculus. Reversible process calculi are widely studied, in particular for debugging [3]. Here we want to explore the usefulness of reversibility as a programming primitive, similar to [4]. Direct applications include fault tolerance (e.g. when merged with affine MPST [5]) and speculative execution. Reference: [1] Zak Cutner, Nobuko Yoshida, Martin Vassor: Deadlock-Free Asynchronous Message Reordering in Rust with Multiparty Session Types. PPoPP '22: 246-261. https://dl.acm.org/doi/10.1145/3503221.3508404 [2] Silvia Ghilezan, Jovanka Pantovic, Ivan Prokic, Alceste Scalas, Nobuko Yoshida: Precise Subtyping for Asynchronous Multiparty Sessions. POPL 2021: 16:1-16:28. https://dl.acm.org/doi/10.1145/3434297 [3] See for instance the CauDEr: https://github.com/mistupv/cauder [4] Controlling Reversibility in Higher-Order Pi, Ivan Lanese, Claudio Antares Mezzina, Alan Schmitt & Jean-Bernard Stefani, https://doi.org/10.1007/978-3-642-23217-6_20 [5] http://mrg.doc.ic.ac.uk/publications/affine-rust-programming-with-multiparty-session-types/
Rust programming language for communication and distribution | Nobuko Yoshida | Programming Languages | B | Rust has been the most beloved programming language since 2016 according to Stack Overflow's annual developer survey. Thanks to its efficiency and memory safety, it is now one of the most popular languages for large-scale concurrent applications, used in Mozilla's Servo browser engine, Stratis (a file-system manager for Fedora), and Microsoft Azure. Memory safety is one of the core principles of Rust but does not extend to communication safety, making the implementation of deadlock-free protocols challenging. Our group has been working on implementing a library for asynchronous Rust programming. The mini-projects focus on either: (1) surveying recent work on Rust with session types and implementing an example using [1]; or (2) surveying recent work on Rust with session types and studying a theory of asynchronous message reordering in Rust [2]. Reference: [1] Zak Cutner, Nobuko Yoshida, Martin Vassor: Deadlock-Free Asynchronous Message Reordering in Rust with Multiparty Session Types. PPoPP '22: 246-261. https://dl.acm.org/doi/10.1145/3503221.3508404 [2] Silvia Ghilezan, Jovanka Pantovic, Ivan Prokic, Alceste Scalas, Nobuko Yoshida: Precise Subtyping for Asynchronous Multiparty Sessions. POPL 2021: 16:1-16:28. https://dl.acm.org/doi/10.1145/3434297
Survey of Mechanisation of Distributed Protocol Specifications, Session Types | Nobuko Yoshida | Programming Languages | B | Session type systems for distributed and concurrent computations encompass concepts such as interfaces, communication protocols, and contracts. The session type of a software component specifies its expected patterns of interaction using expressive type languages, so types can be used to determine automatically whether the component interacts correctly with other components. There is a growing body of recent work on the mechanisation of session types in proof assistants (Coq, Isabelle, Idris). This project gives an updated survey of the mechanisation of session types. [1] Example of a mechanisation paper in Coq https://dl.acm.org/doi/10.1145/3453483.3454041 [2] Multiparty Session Types Paper https://link.springer.com/chapter/10.1007/978-3-030-36987-3_5
Survey of Session Types Literature | Nobuko Yoshida | Programming Languages | B | This project produces an updated literature survey of session types. Session type systems for distributed and concurrent computations encompass concepts such as interfaces, communication protocols, and contracts. The session type of a software component specifies its expected patterns of interaction using expressive type languages, so types can be used to determine automatically whether the component interacts correctly with other components. Session type systems are studied in the context of theory (including automata theory, semantics, linear logic and type theories), programming languages (Java, Scala, Go, Rust, TypeScript, PureScript, MPI-C, C, Python, Erlang, F*, F#, Haskell, OCaml, and more) and system applications. Four surveys exist, but they are already outdated. The project produces an updated survey focusing on theory, programming languages and/or applications. [1] Theory https://dl.acm.org/doi/pdf/10.1145/2873052 [2] Tools https://www.awesomebooks.com/book/9788793519824/behavioural-types-from-theory-to-tools-river-publishers-series-in-automation-control-and-robotics [3] Security https://www.sciencedirect.com/science/article/pii/S2352220815000851?via%3Dihub [4] Programming Languages https://doi.org/10.1561/2500000031
The Go Language with Generic Types | Nobuko Yoshida | Programming Languages | C | Go is a popular statically typed programming language designed and implemented by Google, and used, among others, by Uber, Netflix, and Docker. The recent release of Go 1.18 finally realised the long-awaited addition of generics (parametric polymorphism) to the Go type system. Much academic work has been conducted to ensure that generics are correctly implemented, but there is still more to do [1,2,3,4]. This project aims to further reduce the gap between theory and practice, allowing the Go Team to improve their implementation of generics in Go. Project 1: Formalise (and prove correct) a dictionary-passing translation from Featherweight Generic Go (FGG) to a lambda calculus with pattern matching or another suitable low-level language. Project 2: The current model of generics in Go (FGG) does not include important features of Go 1.18 such as type inference and type sets [5]. The proposed project would be to formalise a true FGG (including the aforementioned features) along with a correct translation. [1] https://dl.acm.org/doi/pdf/10.1145/3428217 [2] https://dl.acm.org/doi/10.1007/978-3-030-89051-3_7 [3] https://dl.acm.org/doi/10.1007/978-3-031-16912-0_7 [4] https://arxiv.org/abs/2208.06810v2 [5] https://go.googlesource.com/proposal/+/master/design/43651-type-parameters.md
Ethical Web and Data Architecture for Gig Workers | Jun Zhao, Rui Zhao | Human Centered Computing | MSc | Description: The Oxford Ethical Web and Data Architecture (EWADA) project (ewada.ox.ac.uk) explores how to re-decentralise power on the web and provide individuals with more data autonomy and transparency. EWADA works closely with the Solid project (solidproject.org), an open-source project started at MIT, to investigate how to build such an ethical web for everyone. One area we are looking into is supporting gig workers (such as Uber drivers) in gaining more transparency about how their pay rate is calculated by large platforms, and in gaining more autonomy. In this project, you will explore what tools and technologies already exist to support gig workers free from the monopoly of large platforms. You will then design and implement a Solid-based prototype to give gig workers more control of their data and allow them to gain value from their data without losing control of data privacy. You will evaluate the project by assessing the robustness and soundness of the prototype. Stronger students could also work with gig workers to assess the usefulness of the system.
Fairness in Distributed/Decentralised Machine Learning | Jun Zhao, Naman Goel | Human Centered Computing | MSc | An emerging paradigm of machine learning in distributed settings is federated learning. The idea is that the data of different agents stays with them and learning happens in a relatively "decentralised" manner. One notable aspect of federated learning is the non-iid nature of the data across different clients. In the first part of the project, the student will identify the technical challenges that this poses for fairness properties of the trained machine learning model. The goal of fair machine learning is to ensure that decisions taken by machine learning systems don't discriminate based on sensitive attributes like race and gender. This part will involve getting familiar with the literature and some experiments or simulations. The second part of the project will involve proposing, implementing and evaluating solutions to address the problem. Further, the student will also have the opportunity to work on proving formal guarantees for their algorithm. Prerequisites: Good understanding of and hands-on experience with machine learning and statistics, proficiency in Python, interest in ML fairness. Students are encouraged to reach out to Naman Goel (naman.goel@cs.ox.ac.uk) to discuss more project ideas related to the above topics.
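As a rough sketch of the setting (a toy example with hypothetical data and no fairness machinery yet), federated averaging trains a shared model while each client's non-iid data stays local:

```python
# Minimal FedAvg sketch on a linear model with non-iid clients: each client's
# features are drawn from a differently shifted distribution, and only model
# parameters (never data) are shared with the server.
import numpy as np

rng = np.random.default_rng(0)

def make_client(shift):
    X = rng.normal(shift, 1.0, size=(100, 2))   # non-iid: per-client shift
    y = X @ np.array([1.0, -2.0]) + rng.normal(0.0, 0.1, 100)
    return X, y

clients = [make_client(s) for s in (-2.0, 0.0, 3.0)]
w = np.zeros(2)                        # global model

for _ in range(50):                    # communication rounds
    local_ws = []
    for X, y in clients:
        w_local = w.copy()
        for _ in range(5):             # a few local gradient steps
            grad = 2 * X.T @ (X @ w_local - y) / len(y)
            w_local -= 0.01 * grad
        local_ws.append(w_local)
    w = np.mean(local_ws, axis=0)      # server averages the local models

print(w)  # should be close to [1, -2]
```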
Fairness in Dynamic Settings | Jun Zhao, Naman Goel | Human Centered Computing | MSc | The goal of fair machine learning is to ensure that decisions taken by machine learning systems don't discriminate based on sensitive attributes like race and gender. Most of the fair machine learning literature focuses on settings in which individuals can't change the features that a model takes as input. However, in many situations, individuals can modify their features to obtain a positive decision, based on a priori information or the feedback/explanation provided by the model/decision-maker. How should fairness be defined in these settings, such that certain individuals or groups do not face disproportionate obstacles in achieving better outcomes? In this project, the student will start by empirically evaluating the existing approaches to fairness. In collaboration with the supervisor, the student will then define appropriate notion(s) of fairness for the above settings and develop techniques to satisfy those notions. The student will implement the technique and use both simulations and real-world datasets to evaluate its effectiveness. Further, depending on interest and time, there will also be the opportunity to work on proving formal guarantees for the algorithm and/or a human subject experiment. Prerequisites: Familiarity with machine learning, proficiency in Python, interest in ML fairness. Students are encouraged to reach out to Naman Goel (naman.goel@cs.ox.ac.uk) to discuss more project ideas related to the above topics.
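A toy simulation (with entirely hypothetical assumptions) of why static fairness notions can break down once individuals adapt their features:

```python
# Individuals can raise a feature at group-dependent cost to cross a decision
# threshold, so an identical threshold for both groups can still yield
# unequal outcomes after strategic adaptation.
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
group = rng.integers(0, 2, n)
x = rng.normal(0.0, 1.0, n)                 # initial feature, same for groups
budget = np.where(group == 0, 0.8, 0.3)     # hypothetical unequal effort budget

threshold = 1.0
# Anyone within budget of the threshold raises their feature just enough.
x_new = np.where((x < threshold) & (x + budget >= threshold), threshold, x)

for g in (0, 1):
    rate = (x_new[group == g] >= threshold).mean()
    print(f"group {g}: acceptance rate after adaptation = {rate:.2f}")
```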
Fairness in Human-AI Collaboration | Jun Zhao, Naman Goel | Human Centered Computing | MSc | The goal of fair machine learning is to ensure that decisions taken by machine learning systems don't discriminate based on sensitive attributes like race and gender. A limitation of current approaches to fair machine learning is that they do not account for the behaviour of the humans who use these ML systems as decision-support tools. This may result in fairness being violated in the final decisions. In this project, the student will explore this aspect more formally. In particular, we are interested in understanding under what conditions fairness outcomes can improve if the human has more control over the output and internal design of the ML system. The student will be encouraged to use simulations and/or analytical methods to derive the conditions. Depending on time and interest, there will also be an opportunity to conduct human subject experiments. Prerequisites: Familiarity with machine learning, proficiency in Python, interest in ML fairness and behavioural studies. Students are encouraged to reach out to Naman Goel (naman.goel@cs.ox.ac.uk) to discuss more project ideas related to the above topics.
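A small simulation sketch (assumptions are illustrative) of the phenomenon described: a model satisfying demographic parity whose final decisions, after human overrides, do not:

```python
# The model's scores are independent of the sensitive attribute, so its own
# decisions satisfy demographic parity by construction; a human who overrides
# the model differently per group breaks parity in the final decisions.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
group = rng.integers(0, 2, n)               # sensitive attribute: 0 or 1
score = rng.uniform(0, 1, n)                # model score, independent of group

model_dec = score > 0.5
# Hypothetical human behaviour: accepts the model's decision for group 0,
# but demands a higher score from group 1 in 30% of cases.
override = (group == 1) & (rng.uniform(0, 1, n) < 0.3)
final_dec = np.where(override, score > 0.7, model_dec)

for dec, name in [(model_dec, "model"), (final_dec, "final")]:
    rates = [dec[group == g].mean() for g in (0, 1)]
    print(f"{name}: positive rate group0={rates[0]:.2f}, group1={rates[1]:.2f}")
```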
Improving the experience of MPC in decentralized contexts | Jun Zhao, Rui Zhao | Human Centered Computing | C | MSc | This project concerns multi-party computation (MPC) over data held in SoLiD (Social Linked Data) pods or other Personal Data Storage (PDS) systems. Such decentralized PDSs have been gaining attention in recent years, and supporting MPC in such scenarios is crucial for privacy and user control. We have benchmarked two different models of applying MPC in decentralized contexts, and developed an architecture and implementation for performing MPC on SoLiD, a popular framework for enabling decentralised data autonomy for web users. This implementation demonstrates the architecture well, but is not easy for ordinary users to use. This project aims to make privacy-preserving computation more usable, by optimising the efficiency of the computation or by pre-configuring MPC parameters (protocol, number of clients, number of players, etc.) for the users.
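For readers unfamiliar with MPC, here is a minimal sketch of its core primitive (additive secret sharing over a prime field); it is illustrative only and unrelated to the project's actual stack:

```python
# Additive secret sharing: a secret is split into shares that individually
# reveal nothing, yet sums of shares can be computed without reconstruction.
import random

P = 2**61 - 1                     # a prime modulus

def share(secret, n_players):
    # Split `secret` into n additive shares that sum to it mod P.
    shares = [random.randrange(P) for _ in range(n_players - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

# Two parties each share their input among 3 players; each player adds its
# shares locally, so the sum is revealed without either input being exposed.
a_shares = share(42, 3)
b_shares = share(100, 3)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
print(sum(sum_shares) % P)        # 142
```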
User Interface for understanding and reasoning on data Terms of Use | Jun Zhao, Rui Zhao | Human Centered Computing | C | MSc | Description: Ticking "I have read and agree to the Terms of Service" without reading it when registering online accounts is known as "the biggest lie on the Internet". We are all guilty of ticking without reading, but we recognise the necessity of such terms. Data Terms of Use (DToU) is a similar but more approachable concept: it specifies what data users need to follow before, during, and after using the data. DToU tend to be human-understandable, to encourage more data users to comply with them. In a decentralised web context, where everyone becomes a data provider, everyone will need to write their own DToU. In addition, not only must DToU govern the direct use of data; the use of derived data is also subject to them. Handling such DToU manually is labour-consuming and error-prone, and formalisation and automated reasoning have the potential to improve the handling and understanding of DToU. We have previously developed a formal DToU language and a reasoning engine for decentralised contexts, with the potential to be used in wider scenarios. In this project, the student is expected to investigate how usable this formal logic-based DToU language is, and to assess the performance of the framework in the decentralised setting at a large scale.
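A toy sketch (a deliberate simplification, not the EWADA DToU language) of why machine-readable terms help with derived data:

```python
# Each datum carries a set of permitted purposes; derived data inherits the
# intersection of its parents' permissions, and any use is checked against it.
def derive(*policies):
    out = policies[0].copy()
    for p in policies[1:]:
        out &= p                  # derivation can only narrow permitted use
    return out

def permitted(policy, purpose):
    return purpose in policy

weather = {"research", "advertising"}
location = {"research"}
combined = derive(weather, location)        # derived data: {"research"}
print(permitted(combined, "advertising"))   # False: derivation narrowed use
```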
Topics in Algorithms, Complexity, and Combinatorial Optimisation | Standa Živný | Algorithms and Complexity Theory | C | MSc | Prof Živný is willing to supervise projects in the area of algorithms, complexity, and combinatorial optimisation; in particular, on problems related to convex relaxations (linear and semidefinite programming relaxations), submodular functions, and the algorithms for and complexity of homomorphism problems and Constraint Satisfaction Problems. Examples of supervised projects include extensions of min-cuts in graphs, analysis of randomised algorithms for graph and hypergraph colourings, and sparsification of graphs and hypergraphs. The projects would suit mathematically oriented students with an interest in the rigorous analysis of algorithms and in applications of combinatorics and probabilistic methods to computer science.
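As one classic example from this area (illustrative only, not a project deliverable), Karger's randomised contraction algorithm for global min-cut fits in a few lines:

```python
# Karger's algorithm: repeatedly contract a random edge until two
# super-vertices remain; the edges crossing them form a candidate cut.
import random

def karger_min_cut(edges, n_vertices):
    parent = list(range(n_vertices))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v
    remaining = n_vertices
    pool = edges[:]
    while remaining > 2:
        u, v = random.choice(pool)          # contract a random remaining edge
        parent[find(v)] = find(u)
        remaining -= 1
        pool = [(a, b) for (a, b) in pool if find(a) != find(b)]
    return sum(1 for (u, v) in edges if find(u) != find(v))

# Two triangles joined by two edges: the true min cut has size 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (2, 4)]
# Repeat to boost the probability of finding the true min cut.
print(min(karger_min_cut(edges, 5) for _ in range(50)))
```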