Results OAEI 2012::Large BioMed Track
Since no OWL 2 reasoner has so far been shown to cope with the integration of SNOMED and NCI via mappings [url], the satisfiability results have been estimated using the Dowling-Gallier algorithm [url] for propositional Horn satisfiability (implemented in LogMap's repair facility).
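To illustrate the kind of check involved, the following is a minimal sketch of forward-chaining Horn satisfiability in the style of Dowling-Gallier. It is not LogMap's implementation: the function name `horn_sat`, the clause representation and the toy encoding of classes as atoms are assumptions made for this example only.

```python
from collections import deque


def horn_sat(clauses):
    """Decide satisfiability of a set of propositional Horn clauses.

    Each clause is a pair (body, head): body is a collection of atoms and
    head is an atom, or None for a clause of the form "body -> false".
    Returns (satisfiable, true_atoms), where true_atoms holds the atoms
    derived by forward chaining before the procedure stopped.
    """
    remaining = []    # per clause: number of body atoms not yet derived
    watchers = {}     # atom -> indices of the clauses whose body contains it
    true_atoms = set()
    queue = deque()

    for i, (body, head) in enumerate(clauses):
        body = set(body)
        remaining.append(len(body))
        for atom in body:
            watchers.setdefault(atom, []).append(i)
        if not body:                      # empty body: a fact or a contradiction
            if head is None:
                return False, true_atoms
            if head not in true_atoms:
                true_atoms.add(head)
                queue.append(head)

    # Each atom is enqueued at most once and each body occurrence is visited
    # once, so the running time is linear in the total size of the clauses.
    while queue:
        atom = queue.popleft()
        for i in watchers.get(atom, ()):
            remaining[i] -= 1
            if remaining[i] == 0:         # the whole body has been derived
                head = clauses[i][1]
                if head is None:
                    return False, true_atoms
                if head not in true_atoms:
                    true_atoms.add(head)
                    queue.append(head)
    return True, true_atoms


# Toy encoding (an assumption of this example): to test class A, assert A as
# a fact; A is subsumed by B and by C, and B and C are disjoint.
clauses = [((), "A"), (("A",), "B"), (("A",), "C"), (("B", "C"), None)]
satisfiable, derived = horn_sat(clauses)
print(satisfiable)
```

In this toy encoding, an unsatisfiable result for the clause set that asserts `A` corresponds to `A` being an unsatisfiable class. A propositional check of this kind only approximates full OWL 2 reasoning, which is why the results are described as estimated.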
The SNOMED-NCI matching problem is a step up in difficulty from the FMA-SNOMED matching problem and, in general, runtimes and results are slightly worse. Furthermore, Hertuda and HotMatch, which were able to complete the small FMA-NCI and the small FMA-SNOMED tasks, failed to complete the small SNOMED-NCI task in less than 24 hours.
Six systems provided an F-measure higher than our baseline LogMapLt, and their F-measures were very close to each other. On the other hand, GOMMA, MapSSS and AROMA did not surpass the LogMapLt results. LogMap-noe provided the best results in terms of recall and F-measure, while ServOMap generated the most precise mappings.
As in the FMA-NCI and FMA-SNOMED matching problems, precision tends to increase when comparing against the original UMLS mapping set, while recall decreases.
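For reference, the scores discussed above follow the usual set-based definitions of precision, recall and F-measure. The sketch below assumes mappings are represented as hashable pairs of entity identifiers; the function names are hypothetical, and the track's actual evaluation may treat some reference mappings specially.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(system_mappings, reference_mappings):
    """Return (precision, recall, F-measure) of a mapping set against a reference."""
    system = set(system_mappings)
    reference = set(reference_mappings)
    correct = len(system & reference)     # mappings found in both sets
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall, f_measure(precision, recall)


# Consistency check against the LogMap-noe row of the small SNOMED-NCI task:
# precision 0.897 and recall 0.644 (original UMLS) are reported with F-measure 0.750.
print(round(f_measure(0.897, 0.644), 3))
```

The same check with the refined UMLS figures for LogMap-noe (precision 0.893, recall 0.659) gives the reported F-measure of 0.758 after rounding.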
Runtimes were also good in general: 7 systems completed the task in less than 4 minutes. YAM++ required more than 30 minutes, while AROMA and MapSSS needed more than 4 and almost 8 hours to complete the task, respectively.
LogMap (with its two variants) generated a set of output mappings that did not lead to any unsatisfiable class when reasoning (using the Dowling-Gallier algorithm) together with the input ontologies. The rest of the systems generated mapping sets that led to a degree of incoherence greater than 50%.
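The "Degree" column of the tables below can be read as a simple ratio. The sketch assumes that the degree of incoherence is the number of unsatisfiable classes divided by the total number of classes in the union of the input ontologies; that total is not stated in this section, so the figure used in the example is made up.

```python
def incoherence_degree(num_unsat, total_classes):
    """Percentage of classes that are unsatisfiable (assumed definition)."""
    if total_classes <= 0:
        raise ValueError("total_classes must be positive")
    return 100.0 * num_unsat / total_classes


# Hypothetical figures: 25 unsatisfiable classes out of 1,000 classes in total.
print(f"{incoherence_degree(25, 1000):.2f}%")
```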
System | Time (s) | # Mappings | Original UMLS Precision | Original UMLS Recall | Original UMLS F-measure | Refined UMLS (LogMap) Precision | Refined UMLS (LogMap) Recall | Refined UMLS (LogMap) F-measure | Average Precision | Average Recall | Average F-measure | Incoherence: All Unsat. | Incoherence: Degree | Incoherence: Root Unsat. |
LogMap-noe | 211 | 13,525 | 0.897 | 0.644 | 0.750 | 0.893 | 0.659 | 0.758 | 0.895 | 0.652 | 0.754 | 0 | 0% | 0 |
LogMap | 221 | 13,454 | 0.899 | 0.642 | 0.749 | 0.895 | 0.657 | 0.758 | 0.897 | 0.649 | 0.753 | 0 | 0% | 0 |
GOMMA_Bk | 226 | 12,294 | 0.946 | 0.617 | 0.747 | 0.931 | 0.625 | 0.748 | 0.939 | 0.621 | 0.747 | 48,681 | 64.83% | 863 |
YAM++ | 1,901 | 11,961 | 0.951 | 0.604 | 0.739 | 0.940 | 0.614 | 0.743 | 0.946 | 0.609 | 0.741 | 50,089 | 66.71% | 471 |
ServOMapL | 147 | 11,730 | 0.960 | 0.598 | 0.737 | 0.947 | 0.606 | 0.739 | 0.954 | 0.602 | 0.738 | 62,367 | 83.06% | 657 |
ServOMap | 153 | 10,829 | 0.972 | 0.558 | 0.709 | 0.959 | 0.567 | 0.713 | 0.965 | 0.563 | 0.711 | 51,020 | 67.95% | 467 |
LogMapLt | 54 | 10,947 | 0.953 | 0.554 | 0.700 | 0.938 | 0.560 | 0.701 | 0.945 | 0.557 | 0.701 | 61,269 | 81.60% | 801 |
GOMMA | 197 | 10,555 | 0.948 | 0.531 | 0.680 | 0.931 | 0.536 | 0.680 | 0.939 | 0.533 | 0.680 | 42,813 | 57.02% | 851 |
AROMA | 15,624 | 11,783 | 0.861 | 0.538 | 0.662 | 0.848 | 0.545 | 0.664 | 0.854 | 0.542 | 0.663 | 70,491 | 93.88% | 1,286 |
MapSSS | 27,381 | 9,608 | 0.795 | 0.405 | 0.537 | 0.783 | 0.411 | 0.539 | 0.789 | 0.408 | 0.538 | 46,083 | 61.37% | 794 |
MapSSS and AROMA failed to complete the task involving the big fragments of SNOMED and NCI after more than 24 hours of execution.
In general, there were no big differences in terms of F-measure with respect to the small SNOMED-NCI task. Only LogMap, whose recall decreased, lost its second position, and GOMMA_Bk, which generated less precise mappings, was relegated to the sixth position. As in the previous task, LogMap-noe provided the best results in terms of recall and F-measure, while ServOMap generated the most precise mappings.
Runtimes were roughly two to three times longer than in the small task, but in most cases the task was finished in less than 10 minutes.
Regarding mapping coherence, LogMap-noe provided a clean output. LogMap, which computes an estimation of the overlap (fragments) between the input ontologies, failed to detect and repair 3 unsatisfiable classes that were outside the computed ontology fragments.
System | Time (s) | # Mappings | Original UMLS Precision | Original UMLS Recall | Original UMLS F-measure | Refined UMLS (LogMap) Precision | Refined UMLS (LogMap) Recall | Refined UMLS (LogMap) F-measure | Average Precision | Average Recall | Average F-measure | Incoherence: All Unsat. | Incoherence: Degree | Incoherence: Root Unsat. |
LogMap-noe | 575 | 13,184 | 0.882 | 0.617 | 0.726 | 0.877 | 0.631 | 0.734 | 0.879 | 0.624 | 0.730 | 0 | 0% | 0 |
YAM++ | 6,127 | 13,083 | 0.864 | 0.600 | 0.708 | 0.854 | 0.610 | 0.712 | 0.859 | 0.605 | 0.710 | 104,492 | 60.66% | 618 |
ServOMapL | 363 | 12,784 | 0.870 | 0.590 | 0.703 | 0.858 | 0.599 | 0.705 | 0.864 | 0.594 | 0.704 | 136,909 | 79.48% | 1,101 |
LogMap | 514 | 12,142 | 0.877 | 0.565 | 0.687 | 0.872 | 0.578 | 0.695 | 0.874 | 0.571 | 0.691 | 3 | 0.002% | 2 |
ServOMap | 282 | 11,632 | 0.896 | 0.553 | 0.684 | 0.885 | 0.562 | 0.687 | 0.891 | 0.558 | 0.686 | 110,253 | 64.00% | 820 |
GOMMA_Bk | 638 | 15,644 | 0.730 | 0.606 | 0.662 | 0.718 | 0.613 | 0.662 | 0.724 | 0.610 | 0.662 | 116,451 | 67.60% | 2,741 |
LogMapLt | 104 | 12,741 | 0.819 | 0.553 | 0.660 | 0.805 | 0.560 | 0.661 | 0.812 | 0.557 | 0.661 | 131,073 | 76.09% | 2,201 |
GOMMA | 527 | 12,320 | 0.802 | 0.524 | 0.634 | 0.787 | 0.529 | 0.633 | 0.795 | 0.527 | 0.634 | 96,945 | 56.28% | 1,621 |
AROMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
MapSSS | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
Precision decreased in all systems, while recall remained almost unchanged for most of them, and none of them reached an F-measure of 0.7. YAM++ produced the best mapping set in terms of F-measure, while ServOMap and GOMMA_Bk generated the mappings with the best precision and recall, respectively. LogMap-noe lost its first position since it provided less comprehensive mappings.
Apart from the baseline LogMapLt, which needed about 3 minutes, ServOMap, ServOMapL and LogMap were the fastest tools and required 11, 12 and 16 minutes, respectively. GOMMA (with its two variants) required more than 30 minutes, while YAM++ required more than 8 hours.
As in the previous task, LogMap-noe provided a clean output, while LogMap failed to detect and repair a few unsatisfiable classes because it computes an estimation of the overlap between the input ontologies.
System | Time (s) | # Mappings | Original UMLS Precision | Original UMLS Recall | Original UMLS F-measure | Refined UMLS (LogMap) Precision | Refined UMLS (LogMap) Recall | Refined UMLS (LogMap) F-measure | Average Precision | Average Recall | Average F-measure | Incoherence: All Unsat. | Incoherence: Degree | Incoherence: Root Unsat. |
YAM++ | 30,155 | 14,103 | 0.794 | 0.594 | 0.680 | 0.785 | 0.604 | 0.683 | 0.790 | 0.599 | 0.681 | 238,593 | 63.91% | 979 |
ServOMapL | 738 | 13,964 | 0.796 | 0.590 | 0.678 | 0.785 | 0.598 | 0.679 | 0.791 | 0.594 | 0.678 | 286,790 | 76.82% | 1,557 |
LogMap | 955 | 13,011 | 0.816 | 0.564 | 0.667 | 0.812 | 0.577 | 0.674 | 0.814 | 0.570 | 0.671 | 16 | 0.004% | 10 |
LogMap-noe | 1,505 | 13,058 | 0.813 | 0.563 | 0.666 | 0.809 | 0.577 | 0.673 | 0.811 | 0.570 | 0.670 | 0 | 0% | 0 |
ServOMap | 654 | 12,462 | 0.835 | 0.552 | 0.664 | 0.824 | 0.560 | 0.667 | 0.829 | 0.556 | 0.666 | 230,055 | 61.63% | 1,546 |
GOMMA_Bk | 1,940 | 17,045 | 0.669 | 0.605 | 0.635 | 0.658 | 0.612 | 0.634 | 0.663 | 0.608 | 0.635 | 239,708 | 64.21% | 4,297 |
LogMapLt | 178 | 14,043 | 0.743 | 0.553 | 0.634 | 0.731 | 0.560 | 0.634 | 0.737 | 0.557 | 0.634 | 305,648 | 81.87% | 3,160 |
GOMMA | 1,820 | 13,693 | 0.720 | 0.523 | 0.606 | 0.707 | 0.528 | 0.605 | 0.714 | 0.526 | 0.606 | 215,959 | 57.85% | 2,614 |
AROMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
MapSSS | - | - | - | - | - | - | - | - | - | - | - | - | - | - |