
Reducing memory consumption of the cryptographic protocol verifier ProVerif with hash consing techniques

Supervisor

Suitable for

MSc in Advanced Computer Science
Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C
Computer Science, Part B

Abstract

Security protocols aim to secure communications. They are used in many applications, e.g. establishing secure channels over the Internet, secure messaging, electronic voting and mobile communications. Their design is known to be error-prone, and flaws are difficult to fix once a protocol is widely deployed. Hence, a common practice is to analyze the security of a protocol using formal techniques, and in particular automatic tools. This approach has led to the development of several verification tools, such as ProVerif (https://proverif.inria.fr), Tamarin (https://tamarin-prover.github.io) and DeepSec (https://deepsec-prover.github.io), which rely on symbolic models to express protocols and their security properties. Symbolic, automated verification has been very successful at revealing vulnerabilities and at proving the absence of entire families of attacks. For instance, these tools have been used to verify the security of several major protocols (e.g. TLS 1.3, the 5G protocols and the Belenios electronic voting protocol).

While the recent overhaul of ProVerif [1] significantly improved its running time, it did little for its memory consumption: verifying complex real-life protocols, such as TLS, can require between 10 and 100 GB of memory depending on the type of security properties being proved. This memory problem is crucial as we reach the limits of the capabilities of our computers. We aim to change the internal representation of messages within ProVerif by using hash consing techniques [2], which maximize the sharing of memory. In particular, two semantically equal messages will be guaranteed to also be physically equal, i.e. they will point to the same address in memory. This will require reworking most of the implemented algorithms operating on messages (resolution, unification, matching, subsumption, ...). Since the internal algorithm of ProVerif is based mostly on the saturation of Horn clauses, maximal sharing of memory within each Horn clause, but also across the set of Horn clauses generated by ProVerif, should considerably reduce ProVerif's memory consumption. Maximizing memory sharing within one clause is fairly straightforward. For sets of Horn clauses, however, the problem becomes much harder because variables sometimes need to be considered distinct within two Horn clauses.
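
To illustrate the core idea, here is a minimal sketch of hash consing, written in Python for readability (ProVerif itself is implemented in OCaml, and [2] describes a type-safe OCaml library). The class names and the intern-table layout below are illustrative assumptions, not ProVerif's actual data structures: every term is built through a global table that maps each structural shape to a unique node, so semantic equality becomes physical (pointer) equality.

```python
class Term:
    """Hash-consed first-order terms: a head symbol applied to subterms.
    All construction goes through __new__, which interns each shape."""
    __slots__ = ("head", "args", "tid")
    _table = {}    # maps (head, child identities) -> unique node
    _next_id = 0   # unique integer tag, handy as a hash key

    def __new__(cls, head, args=()):
        # Children are already hash-consed, so their identities (id)
        # are a sound and O(1) key for the structural shape.
        key = (head, tuple(id(a) for a in args))
        cached = cls._table.get(key)
        if cached is not None:
            return cached
        node = super().__new__(cls)
        node.head, node.args = head, tuple(args)
        node.tid = cls._next_id
        Term._next_id += 1
        cls._table[key] = node
        return node

# Two semantically equal messages now share one physical node:
t1 = Term("enc", (Term("m"), Term("k")))
t2 = Term("enc", (Term("m"), Term("k")))
assert t1 is t2   # equality is a pointer comparison, in O(1)
```

Beyond the O(1) equality test, the shared representation means a term that occurs many times across clauses is stored once, which is exactly the memory saving the project targets; a production version would also need weak references so unused terms can be garbage collected, as discussed in [2].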

Objectives:         

  • get familiar with hash consing techniques and implement a library for managing the new representation of terms.
  • study the classical algorithm for unification on DAG terms and propose an adaptation to our new term structure.
  • design similar algorithms for resolution, matching, subsumption, and unification modulo an equational theory.
  • propose an algorithm for sharing memory across a set of Horn clauses and establish an upper bound on the complexity of the problem.
  • implement these algorithms and evaluate their efficiency on the provided benchmarks.
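
As a hint of what the unification objective involves, the sketch below (in Python, with hypothetical `Var`/`App` classes; it is not ProVerif code) shows syntactic unification adapted to shared terms: because hash-consed subterms are physically shared, caching already-processed pairs by node identity avoids re-unifying the same pair of subterms, the key idea behind efficient unification on DAGs [3].

```python
class Var:
    __slots__ = ("name",)
    def __init__(self, name): self.name = name

class App:
    __slots__ = ("head", "args")
    def __init__(self, head, args=()): self.head, self.args = head, args

def unify(s, t):
    """Return a substitution (keyed by variable identity) unifying s and t,
    or None if the terms are not unifiable."""
    subst = {}
    seen = set()  # pairs of node identities already unified

    def walk(u):  # follow variable bindings to their current value
        while isinstance(u, Var) and id(u) in subst:
            u = subst[id(u)]
        return u

    def occurs(v, u):  # occurs check: does variable v appear in u?
        u = walk(u)
        if u is v:
            return True
        return isinstance(u, App) and any(occurs(v, a) for a in u.args)

    def go(u, w):
        u, w = walk(u), walk(w)
        if u is w or (id(u), id(w)) in seen:
            return True  # shared node or already-unified pair: skip in O(1)
        seen.add((id(u), id(w)))
        if isinstance(u, Var):
            if occurs(u, w):
                return False
            subst[id(u)] = w
            return True
        if isinstance(w, Var):
            return go(w, u)
        if u.head != w.head or len(u.args) != len(w.args):
            return False
        return all(go(a, b) for a, b in zip(u.args, w.args))

    return subst if go(s, t) else None

# f(x, x) unifies with f(a, y), binding both x and y to a:
x, y = Var("x"), Var("y")
a = App("a")
res = unify(App("f", (x, x)), App("f", (a, y)))
assert res is not None and res[id(x)] is a and res[id(y)] is a
```

The `seen` cache is what makes sharing pay off: on hash-consed terms, repeated subterms are the same physical node, so each distinct pair is examined at most once instead of once per occurrence. A real implementation would use union-find rather than a plain dictionary of bindings, as in the almost-linear algorithms covered in [3].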

Bibliography: Reading the papers and slides below is not mandatory to complete the project, but they provide good context for it.

[1] Bruno Blanchet, Vincent Cheval, and Véronique Cortier. ProVerif with lemmas, induction, fast subsumption, and much more. In Proceedings of the 43rd IEEE Symposium on Security and Privacy (S&P'22). IEEE Computer Society Press, May 2022.

[2] Jean-Christophe Filliâtre and Sylvain Conchon. Type-safe modular hash-consing. In Proceedings of the ACM SIGPLAN Workshop on ML (ML 2006), pages 12–19.

[3] https://www3.risc.jku.at/education/courses/ss2018/unification/slides/02_Syntactic_Unification_Improved_Algorithms_handout.pdf

Pre-requisites: Logic and Proof, Functional Programming