A Recommender System for Scientific Referees Based on Bibliographic Databases and Knowledge Graphs

Supervisors

Suitable for

Mathematics and Computer Science
Computer Science and Philosophy
Computer Science
Mathematics and Computer Science, Part C
Computer Science and Philosophy, Part C
Computer Science, Part C

Abstract

Specify, design, and implement a recommender system for referees in Computer Science and other areas. The system should work as follows. A user submits the title and list of authors of a paper or project to be reviewed, along with some keywords. The system then compiles a ranked list of suggested referees and, for each referee, gives some hints as to why that referee is deemed competent. The system should also handle constraints, for example that referees should not have recent joint publications with the authors.
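
As an illustration of the joint-publication constraint, the following is a minimal Python sketch, not a prescribed design. The `Publication` record, the function names, and the five-year default for what counts as "recent" are all assumptions made for this example; the project itself leaves these choices open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    # Hypothetical record type: a set of author names and a publication year.
    authors: frozenset
    year: int


def conflicted(candidate, paper_authors, publications, current_year, window=5):
    """Return True if the candidate is an author of the submitted paper or
    co-authored with one of its authors less than `window` years ago.
    The window length is an assumption, not a value given by the project."""
    paper_authors = frozenset(paper_authors)
    if candidate in paper_authors:
        return True
    for pub in publications:
        recent = current_year - pub.year < window
        if recent and candidate in pub.authors and pub.authors & paper_authors:
            return True
    return False


def filter_referees(ranked, paper_authors, publications, current_year, window=5):
    """Drop conflicted candidates while preserving the ranking order."""
    return [
        c for c in ranked
        if not conflicted(c, paper_authors, publications, current_year, window)
    ]


pubs = [
    Publication(frozenset({"alice", "bob"}), 2023),
    Publication(frozenset({"alice", "carol"}), 2010),
]
allowed = filter_referees(["bob", "carol", "dave", "alice"], {"alice"}, pubs, 2024)
print(allowed)
```

Here "bob" is excluded because of a recent joint paper with "alice", whereas "carol" remains eligible because their joint paper lies outside the assumed window.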

The design and implementation should follow a knowledge-based approach and use reasoning techniques and graph/network analysis to derive appropriate referees. Data from publicly available databases such as DBLP and ORCID should be loaded into a large integrated knowledge graph (KG). This KG should be enriched with further attributes, and appropriate recursive queries (e.g. in the form of Datalog programs) that select referees should be designed and tested.
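
To make the idea of a recursive query concrete, here is a small Python sketch that evaluates a two-rule Datalog program bottom-up (semi-naive evaluation). The predicates `coauthor` and `linked` are hypothetical names chosen for this example, and the code does not use VADALOG or its syntax; it only shows the kind of fixpoint computation such a program denotes.

```python
def coauthor_edges(publications):
    """Derive the symmetric coauthor(X, Y) relation from author sets."""
    edges = set()
    for authors in publications:
        for a in authors:
            for b in authors:
                if a != b:
                    edges.add((a, b))
    return edges


def transitive_closure(edges):
    """Semi-naive evaluation of the (hypothetical) Datalog program

        linked(X, Y) :- coauthor(X, Y).
        linked(X, Y) :- coauthor(X, Z), linked(Z, Y).
    """
    # Index coauthor by its second argument: preds[z] = {x | coauthor(x, z)}.
    preds = {}
    for x, z in edges:
        preds.setdefault(z, set()).add(x)

    linked = set(edges)   # facts derived by the first rule
    delta = set(edges)    # facts that were new in the previous round
    while delta:
        new = set()
        for z, y in delta:
            for x in preds.get(z, ()):
                if (x, y) not in linked:
                    new.add((x, y))
        linked |= new
        delta = new       # only new facts can trigger further derivations
    return linked


papers = [{"a", "b"}, {"b", "c"}, {"d", "e"}]
closure = transitive_closure(coauthor_edges(papers))
print(("a", "c") in closure, ("a", "d") in closure)
```

In a referee-selection setting, a relation of this kind could feed further rules, for instance to measure how closely a candidate is connected to the authors, but how it is used is a design decision for the project.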

We envisage that the student will engage in the following steps and activities:

  • Elaborate a more precise problem definition and specify the desired system functionality in more detail.
  • Search for open data that can be used, initially only for the field of Computer Science. Understand how this data can be accessed and downloaded. Assess how this data can best be used towards the above goals.
  • Get acquainted with knowledge graph technology and software (e.g. the VADALOG system, which will be freely available).
  • Experiment with recursive queries and establish various ways of obtaining good referees. For example, consider a similarity-based approach in which an author and a referee are defined to be ‘similar’ if they publish in similar venues, where the definition of venue ‘similarity’ is in turn based in part on the similarity of the authors publishing in those venues. Find efficient (possibly probabilistic) algorithms for approximately computing such similarities.
  • Evaluate various referee-finding methods by interviewing domain experts (e.g. Computer Science academics, who can judge the appropriateness of the computed lists of referees, or can suggest better referees). Based on this evaluation, improve the system.
  • Design and implement the final system, add a simple but pleasant user interface, and make it available as a Web service. Evaluate the overall result.
  • Carry out all further necessary tasks which have not been listed here. 
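
The mutually recursive author/venue similarity mentioned in the steps above can be sketched as follows. This is one possible reading only: it resembles SimRank applied to a bipartite author-venue graph, which the project description does not prescribe. The function name, the `decay` factor, and the iteration count are assumptions, and the computation is exact and quadratic in the number of authors and venues, so it is not the efficient approximate algorithm the project asks for.

```python
from itertools import product


def bipartite_simrank(author_venues, decay=0.8, iterations=5):
    """Iterate two mutually dependent similarity tables:
    authors are similar if they publish in similar venues, and
    venues are similar if similar authors publish in them."""
    venue_authors = {}
    for author, venues in author_venues.items():
        for v in venues:
            venue_authors.setdefault(v, set()).add(author)

    authors = list(author_venues)
    venues = list(venue_authors)
    # Start from the identity: every item is similar only to itself.
    sim_a = {(a, b): float(a == b) for a, b in product(authors, repeat=2)}
    sim_v = {(v, w): float(v == w) for v, w in product(venues, repeat=2)}

    def update(items, neighbours, other_sim):
        new = {}
        for x, y in product(items, repeat=2):
            if x == y:
                new[(x, y)] = 1.0
                continue
            nx, ny = neighbours[x], neighbours[y]
            if not nx or not ny:
                new[(x, y)] = 0.0
                continue
            # Average similarity over all pairs of neighbours, damped by decay.
            total = sum(other_sim[(p, q)] for p in nx for q in ny)
            new[(x, y)] = decay * total / (len(nx) * len(ny))
        return new

    for _ in range(iterations):
        # Both updates read the tables from the previous round.
        new_a = update(authors, author_venues, sim_v)
        new_v = update(venues, venue_authors, sim_a)
        sim_a, sim_v = new_a, new_v
    return sim_a, sim_v


data = {"ann": {"VLDB", "PODS"}, "bob": {"VLDB", "PODS"}, "cat": {"CHI"}}
sim_a, sim_v = bipartite_simrank(data)
print(sim_a[("ann", "bob")] > sim_a[("ann", "cat")])
```

At the scale of DBLP, an exact all-pairs computation like this would be impractical; sampling-based estimates (for example via random walks) are one direction for the probabilistic approximations the step refers to.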

Prerequisites: A strong, motivated student with an interest in several of the following topics: databases, software, knowledge graphs, AI, logic, reasoning, and Big Data. Good practical skills are required, together with a deep understanding of algorithms and recursion, in order to develop new efficient methods for querying and processing large amounts of data.