OAEI 2012::Large BioMed Track

SNOMED-NCI matching problem

We have split the SNOMED-NCI matching problem into three tasks involving different fragments of SNOMED and NCI. The reference alignments are the same for all three tasks; the complexity, however, differs in terms of both performance and scalability, since larger ontologies also involve more possible candidate mappings.
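
To make the scalability point concrete, here is a rough sketch that counts the naive candidate space (every SNOMED class paired with every NCI class) for each task, using the class counts given in the task descriptions further down this page. The exhaustive cross product is an assumption made purely for illustration; real matchers typically prune or index candidates rather than enumerate all pairs, and the function name is hypothetical.

```python
# Class counts (SNOMED fragment, NCI fragment) as stated in the task descriptions.
TASKS = {
    "Task 1": (51_128, 23_958),
    "Task 2": (122_464, 49_795),
    "Task 3": (122_464, 66_724),
}


def naive_candidate_pairs(snomed_classes: int, nci_classes: int) -> int:
    """Size of the full class cross product (an illustrative upper bound)."""
    return snomed_classes * nci_classes


if __name__ == "__main__":
    for name, (snomed, nci) in TASKS.items():
        print(f"{name}: {naive_candidate_pairs(snomed, nci):,} candidate pairs")
```

Under this assumption the candidate space grows from roughly 1.2 billion pairs in Task 1 to roughly 8.2 billion pairs in Task 3, while the reference alignments stay the same.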

Note that the ontologies have been normalised for the OAEI; as a result, the synonyms of concept names are provided as "rdfs:label" annotations.

The complete datasets for the OAEI 2012 campaign can be downloaded as a zip file.

Reference alignments

There are two UMLS-based reference alignments for the SNOMED-NCI matching tasks. Note that, at the time of creating the datasets, we could not compute a refined UMLS alignment set with Alcomo. The new version of Alcomo, however, has been shown to be able to cope with SNOMED-NCI.

Original UMLS mappings: 18,844 mappings ("=")
Refined UMLS mappings (LogMap): 18,324 mappings ("=", "<", ">")

Test Suite Information

Required input for SEALS OMT client:

Repository: http://seals-test.sti2.at/tdrs-web/
Suite-ID: cf0378d9-da30-4b58-b937-192028ed4961
Version-ID: see specific task

Note that, if you are using the OWL API, the parameter "-DentityExpansionLimit=100000000" should be passed to the JVM in order to load large ontologies.
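
If the matcher is launched from a script, the flag can be placed on the "java" command line before the "-jar" argument. The sketch below only assembles such a command and does not run it; the jar name "my-matcher.jar" and the helper build_java_command are hypothetical placeholders, not part of the SEALS client.

```python
import shlex

# JVM system property quoted in the note above.
ENTITY_EXPANSION_FLAG = "-DentityExpansionLimit=100000000"


def build_java_command(jar_path: str, *program_args: str) -> list[str]:
    """Return a java command line with the entity expansion limit raised.

    JVM options must come before -jar; anything after the jar path is
    passed to the program itself rather than to the JVM.
    """
    return ["java", ENTITY_EXPANSION_FLAG, "-jar", jar_path, *program_args]


if __name__ == "__main__":
    # "my-matcher.jar" is a hypothetical placeholder for your own tool.
    command = build_java_command("my-matcher.jar")
    print(shlex.join(command))
```

The ordering matters: an option placed after the jar path would be treated as a program argument and would not change the JVM setting.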

Task 1: SNOMED-NCI small fragments

This task consists of matching two (relatively) small fragments of SNOMED and NCI. The SNOMED fragment contains 51,128 classes (17% of SNOMED), while the NCI fragment contains 23,958 classes (36% of NCI).

Version-ID: d4721f5f-0bb1-4b59-8e81-e3c7ad38f06b

Task 2: SNOMED-NCI large fragments

This task consists of matching two (relatively) large fragments of SNOMED and NCI. The SNOMED fragment contains 122,464 classes (40% of SNOMED), while the NCI fragment contains 49,795 classes (75% of NCI).

Version-ID: 611c0450-5230-4b2c-a8fb-80280292e9e5

Task 3: SNOMED-NCI whole ontologies

This task consists of matching the whole NCI ontology, which contains 66,724 classes, with a large SNOMED fragment that contains 122,464 classes (40% of SNOMED).

Version-ID: f85f75d6-2b63-440d-bfb8-bc239fa12f2c
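
For convenience, the SEALS parameters listed on this page can be gathered into a single structure. This is only a sketch: the names SealsTask and TASKS and the field layout are assumptions chosen for illustration, and how the values are handed to the SEALS OMT client is not shown here.

```python
from dataclasses import dataclass

# Values copied from the "Test Suite Information" section of this page.
REPOSITORY = "http://seals-test.sti2.at/tdrs-web/"
SUITE_ID = "cf0378d9-da30-4b58-b937-192028ed4961"


@dataclass(frozen=True)
class SealsTask:
    """Identifiers for one SNOMED-NCI task (the structure is illustrative)."""
    name: str
    version_id: str
    repository: str = REPOSITORY
    suite_id: str = SUITE_ID


# Version-IDs as listed under each task; all tasks share repository and suite.
TASKS = {
    1: SealsTask("SNOMED-NCI small fragments", "d4721f5f-0bb1-4b59-8e81-e3c7ad38f06b"),
    2: SealsTask("SNOMED-NCI large fragments", "611c0450-5230-4b2c-a8fb-80280292e9e5"),
    3: SealsTask("SNOMED-NCI whole ontologies", "f85f75d6-2b63-440d-bfb8-bc239fa12f2c"),
}

if __name__ == "__main__":
    for number, task in TASKS.items():
        print(f"Task {number}: suite={task.suite_id} version={task.version_id}")
```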