OAEI 2012::Large BioMed Track

NEWS:

2012 results available now!

General description

This track consists of finding alignments between the Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer Institute Thesaurus (NCI). These ontologies are semantically rich and contain tens of thousands of classes.

The UMLS Metathesaurus has been selected as the basis for the track reference alignments (see oaei2012_umls_reference for details). UMLS is currently the most comprehensive effort for integrating independently developed medical thesauri and ontologies, including FMA, SNOMED CT, and NCI. The integration of new UMLS sources combines automatic techniques, expert assessment, and auditing protocols.

Data sets

The Large BioMed Track consists of three matching problems. The complete datasets for the OAEI 2012 campaign can be downloaded as a zip file.

Modalities and SEALS support

This track has two main objectives. On the one hand, it intends to evaluate the performance of matching systems when matching real, large-scale ontologies. On the other hand, it aims at creating an error-free "silver standard" reference alignment by "harmonising" the output of different matching and debugging systems, together with the current UMLS mapping sets. See OAEI 2011.5 harmonisation.

Regarding the use of background knowledge, the OAEI rules state that a resource (i.e. a third biomedical ontology) especially designed for the test is not allowed. In particular, matching systems using UMLS as background knowledge will have an advantage, since the reference alignment is also based on UMLS. Nevertheless, it will be interesting to evaluate the performance of a system with and without specialised background knowledge. Moreover, matching systems using UMLS may be especially helpful in the creation of the proposed "silver standard" reference alignment.

Modality 1: standard matching

For this modality, the generated alignment should be an optimal solution to the matching problem with respect to both recall and precision. In the evaluation we will focus on the F-measure. Furthermore, we also encourage the creation of an error-free output, that is, the extracted mappings together with the ontologies should not lead to (many) unsatisfiabilities.
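As an illustration of the measures mentioned above, the following minimal sketch computes precision, recall, and F-measure for a generated alignment against a reference alignment. It assumes that a mapping can be represented as a simple (source entity, target entity) pair and that the F-measure is the balanced harmonic mean of precision and recall; the function name and the toy entity names are hypothetical and are not part of the track data or of the SEALS evaluation code.

```python
from typing import Set, Tuple

# Assumption for illustration: a mapping is a (source_entity, target_entity) pair.
Mapping = Tuple[str, str]


def evaluate_alignment(system: Set[Mapping], reference: Set[Mapping]) -> Tuple[float, float, float]:
    """Return (precision, recall, f_measure) of `system` against `reference`."""
    correct = len(system & reference)  # mappings found in both sets
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data with made-up entity names (not taken from FMA, NCI, or SNOMED CT files).
reference = {
    ("fma:Heart", "nci:Heart"),
    ("fma:Lung", "nci:Lung"),
    ("fma:Liver", "nci:Liver"),
    ("fma:Kidney", "nci:Kidney"),
}
system = {
    ("fma:Heart", "nci:Heart"),
    ("fma:Lung", "nci:Lung"),
    ("fma:Skin", "nci:Bone"),
}

precision, recall, f_measure = evaluate_alignment(system, reference)
print(f"P={precision:.3f} R={recall:.3f} F={f_measure:.3f}")
```

In this toy case two of the three generated mappings are correct and two of the four reference mappings are found, so a system that returns fewer but more accurate mappings trades recall for precision; the F-measure balances the two.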

The evaluation of Modality 1 will be run with the support of SEALS. This requires that you wrap your matching system in a way that allows us to execute it on the SEALS platform.

Modality 2: mapping debugging (optional)

Mapping debugging systems are also welcome to provide a revised version of the original UMLS mappings, similar to the currently provided refinements.

We aim at harmonising different revised subsets of the UMLS mappings together with the outputs of the participants from Modality 1 in order to create an error-free "silver standard" reference alignment. Participant outputs will also be compared against the silver standard in order to analyse how different they are w.r.t. the other systems.
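The harmonisation procedure itself is not specified in this description. Purely as a sketch, and assuming a simple voting scheme in which a mapping is kept when at least a given number of input sets contain it, the idea could look as follows. The function names, the `min_votes` threshold, and the toy mappings are all hypothetical; the comparison against the silver standard is shown here as the size of the symmetric difference, which is only one possible way to quantify how different two mapping sets are.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Assumption for illustration: a mapping is a (source_entity, target_entity) pair.
Mapping = Tuple[str, str]


def harmonise(mapping_sets: Iterable[Set[Mapping]], min_votes: int) -> Set[Mapping]:
    """Keep every mapping that occurs in at least `min_votes` of the input sets."""
    votes: Counter = Counter()
    for mappings in mapping_sets:
        votes.update(mappings)  # each set contributes at most one vote per mapping
    return {mapping for mapping, count in votes.items() if count >= min_votes}


def difference_size(output: Set[Mapping], silver: Set[Mapping]) -> int:
    """Number of mappings that are in exactly one of the two sets."""
    return len(output ^ silver)


# Toy inputs: one revised UMLS subset and two participant outputs (made-up names).
umls_subset = {("a", "x"), ("b", "y"), ("c", "z")}
system_1 = {("a", "x"), ("b", "y")}
system_2 = {("a", "x"), ("c", "w")}

silver = harmonise([umls_subset, system_1, system_2], min_votes=2)
print(sorted(silver))
print(difference_size(system_2, silver))
```

A higher `min_votes` value yields a smaller set on which more inputs agree, while a lower value keeps more mappings at the risk of retaining erroneous ones.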

Modality 2 is optional and will be run off-line.