OAEI 2019::Large BioMed Track

General description

This track consists of finding alignments between the Foundational Model of Anatomy (FMA), SNOMED CT, and the National Cancer Institute Thesaurus (NCI). These ontologies are semantically rich and contain tens of thousands of classes.

UMLS Metathesaurus has been selected as the basis for the track reference alignments. UMLS is currently the most comprehensive effort for integrating independently-developed medical thesauri and ontologies, including FMA, SNOMED CT, and NCI.

Datasets and Evaluation

The alignment generated by a system should maximize both precision and recall with respect to the 2019 reference alignment.

The complete datasets for the OAEI 2019 campaign can be downloaded as a zip file (LargeBioMed_dataset_oaei2019.zip [17Mb]) or accessed via the HOBBIT or SEALS platforms.

Since 2014, instead of repairing the original UMLS reference alignments by removing mappings leading to unsatisfiable classes, we flagged the incoherence-causing mappings in the alignment by setting the relation to "?" (unknown). These "?" mappings will neither be considered as positive nor as negative when evaluating the participating ontology matching systems, but will simply be ignored. This way, systems that do not perform mapping repair are not penalized for finding mappings that (despite causing incoherences) may or may not be correct, and systems that do perform mapping repair are not penalized for removing such mappings either.
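The scoring rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the official OAEI evaluation code: the function name `evaluate`, the dictionary-based reference representation, and the shortened class identifiers are all hypothetical.

```python
from typing import Dict, Set, Tuple

Mapping = Tuple[str, str]  # (source class, target class); identifiers are illustrative


def evaluate(system: Set[Mapping], reference: Dict[Mapping, str]) -> Dict[str, float]:
    """Precision, recall and F-measure, ignoring reference mappings flagged '?'."""
    positives = {m for m, rel in reference.items() if rel != "?"}
    ignored = {m for m, rel in reference.items() if rel == "?"}

    # Flagged mappings count neither for nor against the system,
    # so they are removed from its output before scoring.
    considered = system - ignored
    true_positives = considered & positives

    precision = len(true_positives) / len(considered) if considered else 0.0
    recall = len(true_positives) / len(positives) if positives else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f_measure": f_measure}


# Hypothetical toy data: two regular reference mappings and one flagged "?".
reference = {
    ("fma:Heart", "nci:Heart"): "=",
    ("fma:Lung", "nci:Lung"): "=",
    ("fma:Cell", "nci:Cell"): "?",
}
system = {
    ("fma:Heart", "nci:Heart"),   # correct
    ("fma:Cell", "nci:Cell"),     # flagged "?", ignored
    ("fma:Liver", "nci:Kidney"),  # not in the reference, counts as wrong
}
scores = evaluate(system, reference)
print(scores)
```

In this toy case the flagged mapping is dropped from both sides, so a system that outputs it and a system that repairs it away would receive the same scores.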

Nevertheless, we will still pay special attention to the number of unsatisfiabilities caused by the mappings computed by a participating system. Thus, we encourage system developers to implement mapping repair techniques or reuse state-of-the-art techniques.

Regarding the use of background knowledge, the OAEI rules state that a resource (i.e. a third biomedical ontology) especially designed for the test is not allowed. In particular, matching systems using the UMLS Metathesaurus as background knowledge will have a notable advantage, since the reference alignment is also based on the UMLS Metathesaurus.

Modalities

HOBBIT Support and System Preparation

The evaluation under the HOBBIT platform requires that you wrap your matching system in a way that allows us to execute it using the HOBBIT infrastructure (see OAEI 2019 HOBBIT evaluation details here).

A description of how the HOBBIT largebio benchmark was prepared is available here.

A system participating in the largebio benchmark should explicitly declare that it implements the benchmark API: bench:LargebioAPI. In practical terms, the system adapter should interpret the information sent by the benchmark and communicate back the results (a file containing the alignment in RDF Alignment format). See the implementation of the LogMap system and its metadata description (system.ttl).
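The results file mentioned above is written in RDF Alignment format. A minimal sketch of serializing a list of mappings into that format is shown below. It covers only the file content, not the communication with the HOBBIT benchmark. The function name `build_alignment`, the ontology IRIs, and the class IRIs are hypothetical, and the element layout should be checked against the Alignment API documentation before use.

```python
import xml.etree.ElementTree as ET

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"


def build_alignment(onto1, onto2, mappings):
    """Serialize (entity1, entity2, relation, confidence) tuples as an XML string."""
    ET.register_namespace("", ALIGN_NS)
    ET.register_namespace("rdf", RDF_NS)

    def tag(name):
        return f"{{{ALIGN_NS}}}{name}"

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    alignment = ET.SubElement(root, tag("Alignment"))
    ET.SubElement(alignment, tag("xml")).text = "yes"
    ET.SubElement(alignment, tag("level")).text = "0"
    ET.SubElement(alignment, tag("type")).text = "??"
    ET.SubElement(alignment, tag("onto1")).text = onto1
    ET.SubElement(alignment, tag("onto2")).text = onto2

    for entity1, entity2, relation, confidence in mappings:
        cell = ET.SubElement(ET.SubElement(alignment, tag("map")), tag("Cell"))
        ET.SubElement(cell, tag("entity1"), {f"{{{RDF_NS}}}resource": entity1})
        ET.SubElement(cell, tag("entity2"), {f"{{{RDF_NS}}}resource": entity2})
        measure = ET.SubElement(cell, tag("measure"), {f"{{{RDF_NS}}}datatype": XSD_FLOAT})
        measure.text = str(confidence)
        ET.SubElement(cell, tag("relation")).text = relation

    return ET.tostring(root, encoding="unicode")


# Hypothetical IRIs, for illustration only.
xml_text = build_alignment(
    "http://example.org/fma.owl",
    "http://example.org/nci.owl",
    [
        ("http://example.org/fma.owl#Heart", "http://example.org/nci.owl#Heart", "=", 1.0),
        ("http://example.org/fma.owl#Lung", "http://example.org/nci.owl#Lung", "=", 0.9),
    ],
)
print(xml_text)
```

Each mapping becomes one `Cell` holding the two entities, a confidence `measure`, and a `relation` such as "=".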

Benchmark API: bench:LargebioAPI

SEALS Support and System Preparation

The evaluation under the SEALS platform requires that you wrap your matching system in a way that allows us to execute it using the SEALS client (see OAEI 2019 SEALS evaluation details).

Required input for the SEALS OMT client:

Repository: http://repositories.seals-project.eu/tdrs/
Suite-ID: largebio
Version-ID: see specific task

All Largebio matching tasks

The following Version-ID executes all six largebio matching tasks. For individual matching tasks, refer to the Version-IDs below.

Version-ID: largebio-all_tasks_2016

FMA-NCI matching problem

Task 1: FMA-NCI small fragments

This task consists of matching two (relatively) small fragments of FMA and NCI. The FMA fragment contains 3,696 classes (5% of FMA), while the NCI fragment contains 6,488 classes (10% of NCI).

Version-ID: largebio-fma_nci_small_2016

Task 2: FMA-NCI whole ontologies

This task consists of matching the whole FMA and NCI ontologies, which contain 78,989 and 66,724 classes, respectively.

Version-ID: largebio-fma_nci_whole_2016

FMA-SNOMED matching problem

Task 3: FMA-SNOMED small fragments

This task consists of matching two (relatively) small fragments of FMA and SNOMED. The FMA fragment contains 10,157 classes (13% of FMA), while the SNOMED fragment contains 13,412 classes (5% of SNOMED).

Version-ID: largebio-fma_snomed_small_2016

Task 4: FMA whole ontology with SNOMED large fragment

This task consists of matching the whole FMA, which contains 78,989 classes, with a large SNOMED fragment that contains 122,464 classes (40% of SNOMED).

Version-ID: largebio-fma_snomed_whole_2016

SNOMED-NCI matching problem

Task 5: SNOMED-NCI small fragments

This task consists of matching two (relatively) small fragments of SNOMED and NCI. The SNOMED fragment contains 51,128 classes (17% of SNOMED), while the NCI fragment contains 23,958 classes (36% of NCI).

Version-ID: largebio-snomed_nci_small_2016

Task 6: NCI whole ontology with SNOMED large fragment

This task consists of matching the whole NCI, which contains 66,724 classes, with a large SNOMED fragment that contains 122,464 classes (40% of SNOMED).

Version-ID: largebio-snomed_nci_whole_2016
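To give a rough sense of scale, the class counts listed for the six tasks can be turned into the number of candidate class pairs a naive all-against-all comparison would face. The sketch below is only a back-of-the-envelope illustration: the `TASKS` table and the `candidate_pairs` helper are hypothetical names, and the product is a naive upper bound rather than a description of how any participating system works.

```python
# Version-ID -> (classes in first ontology, classes in second ontology),
# using the class counts given in the task descriptions above.
TASKS = {
    "largebio-fma_nci_small_2016": (3_696, 6_488),
    "largebio-fma_nci_whole_2016": (78_989, 66_724),
    "largebio-fma_snomed_small_2016": (10_157, 13_412),
    "largebio-fma_snomed_whole_2016": (78_989, 122_464),
    "largebio-snomed_nci_small_2016": (51_128, 23_958),
    "largebio-snomed_nci_whole_2016": (122_464, 66_724),
}


def candidate_pairs(version_id: str) -> int:
    """Number of class pairs in a naive all-against-all comparison."""
    first, second = TASKS[version_id]
    return first * second


for version_id in TASKS:
    print(f"{version_id}: {candidate_pairs(version_id):,} candidate class pairs")
```

The whole-ontology tasks involve billions of candidate pairs, compared with tens of millions for the smallest fragment task.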