Giorgio Orsi


Dr Giorgio Orsi

Senior Researcher

Leaving date: 6th January 2016


Semantic Web, Knowledge Representation and Reasoning, Mobile and Context-aware Databases, Information and Data Extraction.


About me

I was born in 1982 in Magenta, near Milan (Italy). The town was the site of the Battle of Magenta in 1859, fought by the army of Napoleon III, and the color magenta was named after that battle.

I received a BSc (2004) and MSc (2006) in Computer Science and Engineering, and a PhD (2011) in Information Engineering from the Polytechnic University of Milan (Politecnico di Milano).

Track record in research

My research lies in the area of Information Engineering and deals with the algorithmic aspects of large-scale data processing and with the logical foundations of knowledge representation and reasoning.

I started my research career in 2006, when I was granted an MSc Research Assistant position at the Department of Electronics, Information and Bio-Engineering (DEIB) of the Politecnico di Milano. At that time, my research dealt with problems in ontology-based information integration, in particular schema and ontology matching.

In January 2008, I was awarded a full three-year PhD scholarship from the Italian Ministry of Education, Universities and Research (MIUR). My doctoral studies were carried out at the Politecnico di Milano under the supervision of Prof. Letizia Tanca and were devoted to the logical foundations of ontology-based information integration and personalization. Part of this research was carried out at the University of California at Los Angeles (UCLA), under the supervision of Prof. Carlo Zaniolo, and at the University of Oxford, under the supervision of Prof. Georg Gottlob, FRS.

The main outcome of the research was a series of algorithms for context-aware schema and ontology matching, implemented in the X-SOM system. The results were published in the proceedings of major application-oriented computer science conferences (ESWC’07, MDM’07, FQAS’09, EDBT’11, CIKM’11) and in journals such as the Communications of the ACM. For this research, I also received a Doctor Europaeus award in 2011.

This engineering-oriented research had practical impact on industrial products. The one with the greatest impact was the Metoda SAFE-Card system, which provides time-critical support to paramedic personnel during emergency care of patients with chronic cardiovascular diseases. SAFE received the 4th Innovation Best Practices Award from the Confederation of Italian Industry (the Italian counterpart of the CBI) in 2009. Two byproducts of this research were MicroJena (2007) and, later, AndroJena (2011). MicroJena was the world's first open-source API for ontological data processing on Java ME devices.

In January 2011, I was granted a post-doctoral research position at the Department of Computer Science of the University of Oxford and a James Martin fellowship at the Institute for the Future of Computing of the Oxford Martin School. My research was concerned with large-scale, automated and probabilistic reasoning, with particular emphasis on applications to ontological databases and web data extraction. The major results were the definition of a general algorithm for ontological query answering under existential data dependencies (ICDE’11, TODS’14) and its applications to conceptual modeling (FoSSaCS’12, RuleML’15). I also contributed to the definition of a new theoretical framework for probabilistic reasoning in expressive knowledge representation languages (UAI’12).

In 2013 I was selected by the Department of Computer Science of the University of Oxford for submission to the REF 2014 exercise as an independent Early-Career Researcher (ECR).

Since April 2013, my research has focused on large-scale web data extraction, with major contributions in the areas of entity recognition (PVLDB’13), web automation (WWW’12, VLDBJ’13), and autonomous distributed data preparation processes (WWW’12, PVLDB’14). A major contribution in this area has been a formalism that supports the modeling and verification of distributed, large-scale data processes based on synchronised networks of relational transducers.

Since April 2015, I have been a co-investigator of the EPSRC Programme Grant VADA (Value-Added Data System), which deals with problems in the area of autonomous, distributed big data preparation processes. The consortium includes the data management groups of the University of Oxford, the University of Edinburgh, and the University of Manchester. Industrial partners include Facebook, Microsoft, Huawei, and Alliance Bernstein.

I am co-founder and Head of Data Engineering of Wrapidity Ltd, a big data extraction company.

My recent results on autonomous distributed data wrangling processes lay the foundations for the research in web data management I intend to pursue in the coming years. Research on Big Data Wrangling is attracting substantial interest in the US industry and research communities, as it enables value-added applications such as data analytics, mobile personalised assistants (e.g., Siri, Google Now, Cortana), and IoT.


Selected Publications

