Ontology Alignment Evaluation Initiative - OAEI-2020 Campaign

Large BioMed Track


Contact

If you have any questions or suggestions related to the results of this track, or if you notice any kind of error (wrong numbers, incorrect information about a matching system, etc.), feel free to write an email to ernesto [.] jimenez [.] ruiz [at] gmail [.] com

Evaluation setting

We have run the evaluation on an Ubuntu 18 laptop with an Intel Core i5-6300HQ CPU @ 2.30 GHz (4 cores), allocating 15 GB of RAM.

Precision, Recall and F-measure have been computed with respect to a UMLS-based reference alignment. Systems are ordered by F-measure.
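The computation of these scores can be sketched as follows, treating an alignment as a set of mappings (pairs of entities). The mapping data below is purely illustrative, not taken from the actual evaluation:

```python
# Sketch: Precision, Recall and F-measure of a system's alignment
# against a reference alignment, with alignments modelled as sets
# of (entity1, entity2) pairs.

def precision_recall_fmeasure(system: set, reference: set):
    """Return (precision, recall, F-measure) for a set of mappings."""
    tp = len(system & reference)  # mappings also present in the reference
    precision = tp / len(system) if system else 0.0
    recall = tp / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Toy example with hypothetical mappings:
ref = {("fma:A", "nci:A"), ("fma:B", "nci:B"), ("fma:C", "nci:C")}
sys_out = {("fma:A", "nci:A"), ("fma:B", "nci:B"), ("fma:D", "nci:D")}
p, r, f = precision_recall_fmeasure(sys_out, ref)
# here p = r = f = 2/3: 2 of the 3 system mappings are in the reference
```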

Unique mappings are the mappings computed by a system that are not predicted by any other participant (including variants).
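Under the same set-based view, a system's unique mappings amount to a set difference against the union of all other participants' outputs; the system names and mappings below are hypothetical:

```python
# Sketch: mappings produced by `target` that no other participant produced.

def unique_mappings(target: str, outputs: dict) -> set:
    """outputs maps system name -> set of (entity1, entity2) mappings."""
    others = set().union(*(m for name, m in outputs.items() if name != target))
    return outputs[target] - others

# Toy example:
outputs = {
    "SystemA": {("x", "y"), ("p", "q")},
    "SystemB": {("x", "y")},
    "SystemC": {("x", "y"), ("r", "s")},
}
u = unique_mappings("SystemC", outputs)  # {("r", "s")}
```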

Check out the supporting scripts to reproduce the evaluation: https://github.com/ernestojimenezruiz/oaei-evaluation

Participation and success

In the OAEI 2020 largebio track, 8 participating systems were able to complete at least one of the tasks within an 8-hour timeout. Six systems completed all six largebio tasks. DESKMatcher and ALOD2Vec failed on the larger tasks with an "OutOfMemoryException".

Use of background knowledge

LogMapBio uses BioPortal as a mediating-ontology provider; that is, it retrieves from BioPortal the top 10 ontologies most suitable for the matching task.

LogMap uses normalisations and spelling variants from the general-purpose biomedical SPECIALIST Lexicon.

AML has three sources of background knowledge which can be used as mediators between the input ontologies: the Uber Anatomy Ontology (Uberon), the Human Disease Ontology (DOID) and the Medical Subject Headings (MeSH).

Alignment coherence

Together with Precision, Recall, F-measure and runtimes, we have also evaluated the coherence of the alignments. We report (1) the number of unsatisfiable classes when reasoning over the input ontologies together with the computed mappings, and (2) the ratio/degree of unsatisfiable classes with respect to the size of the union of the input ontologies.

We have used the OWL 2 reasoner HermiT to compute the number of unsatisfiable classes. For the cases in which HermiT could not process the input ontologies and the mappings within 2 hours, we provide a lower bound on the number of unsatisfiable classes (indicated by ≥), computed with the OWL 2 EL reasoner ELK.
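The reported incoherence degree is simply the ratio of unsatisfiable classes to the total number of classes in the union of the input ontologies. A minimal sketch with made-up counts (the ≥ prefix marks an ELK-derived lower bound):

```python
# Sketch: format an incoherence entry as reported in the tables below.
# The counts here are hypothetical, not taken from the evaluation.

def incoherence_degree(unsat: int, n_classes_union: int,
                       lower_bound: bool = False) -> str:
    """Ratio of unsatisfiable classes over the union's class count."""
    degree = 100.0 * unsat / n_classes_union
    prefix = "\u2265" if lower_bound else ""  # "≥" for ELK lower bounds
    return f"{prefix}{unsat:,} unsatisfiable ({prefix}{degree:.1f}%)"

print(incoherence_degree(500, 10_000))                    # 500 unsatisfiable (5.0%)
print(incoherence_degree(500, 10_000, lower_bound=True))  # ≥500 unsatisfiable (≥5.0%)
```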

As in previous OAEI editions, only two systems include mapping repair facilities: AML and LogMap (including its LogMapBio variant). Both systems produce relatively clean outputs in the FMA-NCI and FMA-SNOMED cases; however, in the SNOMED-NCI cases the AML mappings lead to a number of unsatisfiable classes. The results also show that even the most precise alignment sets may lead to a large number of unsatisfiable classes, which underlines the importance of techniques to assess the coherence of the generated alignments.


1. System runtimes and task completion

System  Task 1  Task 2  Task 3  Task 4  Task 5  Task 6  Average  # Tasks
(Tasks 1-2: FMA-NCI; Tasks 3-4: FMA-SNOMED; Tasks 5-6: SNOMED-NCI)
LogMapLt 2 9 2 15 9 18 9 6
ATBox 8 41 8 54 34 75 37 6
AML 38 82 101 181 629 381 235 6
LogMap 9 130 52 624 211 719 291 6
LogMapBio 1,238 1,447 1,643 7,046 2,489 4,069 2,989 6
Wiktionary 258 14,135 697 24,379 2,926 18,360 10,126 6
ALOD2Vec 178 - - - - - 178 1
DESKMatcher 816 - - - - - 816 1
# Systems 8 6 6 6 6 6 1,835 38
Table 1: System runtimes (s) and task completion.


2. Results for the FMA-NCI matching problem

Task 1: FMA-NCI small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 38 2,723 71 0.958 0.910 0.933 2 0.020%
LogMap 2 2,766 4 0.945 0.902 0.923 2 0.020%
LogMapBio 1,238 2,882 68 0.923 0.918 0.920 2 0.020%
Wiktionary 258 2,610 3 0.967 0.864 0.913 2,552 25.1%
ALOD2Vec 178 2,751 129 0.918 0.868 0.892 7,671 75.3%
LogMapLt 2 2,480 10 0.967 0.819 0.887 2,104 20.7%
ATBox 8 2,317 5 0.981 0.781 0.870 314 3.1%
DESKMatcher 816 2,181 1,441 0.309 0.241 0.271 10,129 99.5%
Table 2: Results for the largebio task 1.

Task 2: FMA-NCI whole ontologies

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 82 3,109 442 0.806 0.881 0.842 2 0.013%
LogMap 9 2,668 33 0.867 0.805 0.835 3 0.019%
LogMapBio 1,447 2,855 88 0.830 0.828 0.829 2 0.013%
LogMapLt 9 3,458 70 0.676 0.819 0.741 5,554 36.1%
Wiktionary 14,136 4,067 507 0.601 0.863 0.709 8,128 52.8%
ATBox 41 2,807 265 0.696 0.688 0.692 9,313 60.5%
Table 3: Results for the largebio task 2.


3. Results for the FMA-SNOMED matching problem

Task 3: FMA-SNOMED small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 101 6,988 905 0.923 0.762 0.835 0 0%
LogMapBio 1,643 6,530 112 0.930 0.702 0.800 1 0.004%
LogMap 52 6,282 1 0.947 0.690 0.798 1 0.004%
ATBox 8 6,179 231 0.972 0.648 0.778 8,271 35.1%
Wiktionary 697 1,723 32 0.960 0.218 0.355 774 3.3%
LogMapLt 2 1,642 3 0.968 0.208 0.342 771 3.3%
Table 4: Results for the largebio task 3.

Task 4: FMA whole ontology with SNOMED large fragment

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
LogMapBio 7,046 6,470 162 0.832 0.648 0.729 0 0%
LogMap 624 6,540 271 0.811 0.642 0.717 0 0%
AML 181 8,163 2,818 0.685 0.710 0.697 0 0%
Wiktionary 24,379 2,034 227 0.782 0.218 0.341 989 3.0%
LogMapLt 15 1,820 26 0.851 0.208 0.334 974 2.9%
ATBox 54 1,880 124 0.801 0.207 0.329 958 2.9%
Table 5: Results for the largebio task 4.


4. Results for the SNOMED-NCI matching problem

Task 5: SNOMED-NCI small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 629 14,740 2,428 0.906 0.746 0.818 ≥3,967 ≥5.3%
LogMapBio 2,489 13,595 767 0.913 0.694 0.789 ≥0 ≥0%
LogMap 9 12,462 12 0.957 0.666 0.785 ≥0 ≥0%
Wiktionary 2,926 11,260 216 0.941 0.578 0.716 ≥51,724 ≥68.9%
LogMapLt 9 10,921 86 0.949 0.566 0.709 ≥60,447 ≥80.5%
ATBox 34 9,763 53 0.970 0.517 0.674 ≥55,868 ≥74.4%
Table 6: Results for the largebio task 5.


Task 6: NCI whole ontology with SNOMED large fragment

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 381 14,196 2,209 0.862 0.687 0.765 ≥535 ≥0.6%
LogMap 719 13,230 105 0.874 0.650 0.746 ≥1 ≥0.001%
LogMapBio 4,069 13,495 929 0.825 0.625 0.711 ≥0 ≥0%
LogMapLt 18 12,864 525 0.798 0.566 0.662 ≥72,865 ≥87.1%
Wiktionary 18,361 13,668 1,188 0.765 0.577 0.658 ≥68,466 ≥81.8%
ATBox 75 10,621 245 0.870 0.509 0.642 ≥65,543 ≥78.3%
Table 7: Results for the largebio task 6.