
Information access through search engines and digital libraries: past, present, and future

Maristella Agosti

Information Retrieval (IR) identifies the activities that a person – the user – has to conduct in order to choose from a collection of documents those that will satisfy a specific and contingent information need. The aim of IR therefore is to help and support the user in choosing those documents from among the available ones that are more likely to satisfy their information need. When the collection of documents reaches a size that makes manual inspection of the documents prohibitive, as is the case with web pages and web documents, the collection and the application of the retrieval function are automatically managed through an information retrieval system, also called a search engine. This means that an information retrieval system models and implements the retrieval process based on user input in order to produce an output, i.e. a ranked list of documents most likely to be of interest to the user.
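As a rough illustration of the last point, the sketch below takes a user query and a small collection and returns a ranked list of documents. It is only a minimal example under stated assumptions: the TF-IDF weighting, the whitespace tokenizer, the function names `tokenize` and `rank`, and the three sample documents are all hypothetical choices made for illustration and are not taken from the talk.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real text analysis.
    return text.lower().split()


def rank(query, documents):
    """Return (document index, score) pairs, highest score first.

    The score is a plain sum of TF-IDF weights over the query terms,
    chosen here only as a simple, well-known ranking function.
    """
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for index, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term]:
                # Rarer terms across the collection contribute more.
                score += tf[term] * math.log(1 + n_docs / df[term])
        scores.append((index, score))

    # Python's sort is stable, so ties keep collection order.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores


# Hypothetical three-document collection.
docs = [
    "digital libraries preserve cultural heritage collections",
    "search engines rank web pages for a user query",
    "museums and archives curate cultural heritage",
]
ranking = rank("cultural heritage archives", docs)
```

Here `ranking` places the third document first because it matches all three query terms, while the second document, which matches none, receives a score of zero and comes last.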

Digital Libraries (DL) have been steadily progressing over time and they now determine how citizens and organizations study, learn, access and interact with their cultural heritage collections. Despite their name, DL are not only the digital counterpart of traditional libraries, but they also deal with other kinds of cultural heritage institutions, such as archives and museums, i.e. institutions typically referred to as Libraries, Archives and Museums (LAM). In the context of LAM, unifying a variety of organizational settings and providing more integrated access to their contents are aspects of the utmost importance. Although the type of materials may differ and professional practices may vary, LAM share an overlapping set of functions. User needs have propelled the evolution of the Digital Library System (DLS) towards systems that make it possible to design and implement this overlapping set of functions.

IR and DL have the common aim of satisfying user information needs across huge document collections. However, they deal with document collections that differ in the way they are built, in the type of documents that are managed and in the contents that are made available. In particular, Digital Libraries are also concerned with the management, preservation, curation and enrichment of collections describing and/or constituting cultural heritage artifacts.

The presentation shows how, in the early days, both IR and DL were mostly concerned with retrieving and providing access to relevant information, which was then typically processed and used outside the systems providing it. Today, however, there is an increasing need to engage users with digital content through personalized search or by allowing them to add their own content, in the form of annotations, to collection resources.

The presentation is organized as follows: the first part sets the scene by addressing the peculiar aspects of IR and DL, with emphasis on the mutual influence discernible between the two areas at the time. The second part presents the results achieved in the two areas and highlights the possibilities, but also the problems, of access to information that end users experience today. The third and final part shows the issues that urgently need to be addressed to overcome some limitations of current systems and to contribute to more effective access to information by end users. Some of the relevant issues include: supporting complex user tasks, where information search is just one step of a more complex process; understanding how users interact with search systems and managed resources in order to personalize and adapt search strategies and algorithms; developing the scientific paradigm for IR and DL to embrace data science and the fourth paradigm.

Speaker bio

Professor Maristella Agosti is full professor in computer science at the Department of Information Engineering, University of Padua, Italy. In 1987 she set up within the Department the Information Management Systems (IMS) research group that she currently leads. Since October 2014 she has been Advisor of the First and Second Level Degrees in Computer Engineering offered by the Department. Her research focuses primarily on information retrieval, databases and digital libraries.

Since 2015 she has been a Member of the Class of Mathematical and Natural Sciences of the Galileiana Academy of Arts and Science, formerly Patavina Academy (founded in 1599); from 2012 to 2015 she was a National Correspondent Member of the same Academy.

From 2011 to 2013 she was a member of the national evaluation team of experts for the Industrial and Information Engineering area of the Italian Research Assessment Exercise (VQR 2004-10) performed by the National Agency for the Evaluation of Universities and Research Institutes (ANVUR). She is also currently a member of ACM, IEEE-CS, AICA (Italian computer science association) and AIUCD (Italian digital humanities association).

Her research interests are information access through search engines and digital libraries, user-oriented keyword-based search systems for structured data, annotation of digital contents, evaluation of digital libraries and archives, design and development of advanced services for archives and digital libraries, and user interaction with digital cultural heritage collections.

She has worked in computing science research for more than 35 years, edited 18 books and published over 200 articles in journals and conference proceedings. She has been and is currently involved in a number of research projects funded by the EU, MIUR, CNR and other funding bodies.

She is or has been principal investigator or senior researcher of several European Commission research projects including CULTURA (CULTivating Understanding and Research through Adaptivity), PROMISE European network of excellence (Participative Research labOratory for Multimedia and Multilingual Information Systems Evaluation), EuropeanaConnect, TrebleCLEF, TELplus, SAPIR, DELOS, and IDOMENEUS.

She was chair of the Steering Committee of the International Conference on Theory and Practice of Digital Libraries (TPDL) for the mandate from 2009 to 2012. She is a member of the Editorial board of the International Journal on Digital Libraries (Springer-Verlag) and for many years has been a member of the Editorial boards of the Information Processing and Management and Information Retrieval journals. In 1990 she designed and launched the European Summer School in Information Retrieval (ESSIR). In 2005, together with Costantino Thanos and other Italian experts, she designed and launched the Italian Research Conference on Digital Library Systems (IRCDL). She is and has been a member of programme committees of relevant conferences and events, including ACM SIGIR, CIKM, ECDL, TPDL, ACM/IEEE JCDL, ICADL, CoLIS, ECIR, and EDBT.

Web page on the department website, URL: http://www.dei.unipd.it/persona/EE67F726BB7CFF2CBB92409D8CC021EB

Partial list of publications in DBLP, URL: http://www.informatik.uni-trier.de/~ley/pers/hd/a/Agosti:Maristella.html. Google Scholar Citations: http://scholar.google.it/citations?user=gItyeokAAAAJ&hl=it.

Address: Department of Information Engineering, University of Padua, Via Gradenigo 6/a, I-35131 Padova, Italy. Tel: +39 049 827 7650. Email: maristella.agosti@unipd.it

 

 
