Results OAEI 2012::Large BioMed Track
As shown in the tables below, the FMA-SNOMED matching problem was harder than the FMA-NCI problem in both size and complexity. Matching systems therefore required more time to complete the task and, in general, provided worse results in terms of F-measure. Furthermore, MaasMatch, Wmatch and AUTOMSv2, which were able to complete the small FMA-NCI task, failed to complete the small FMA-SNOMED task in less than 24 hours.
Six systems obtained an average F-measure greater than 0.75, whereas the other six systems that completed the task (including our baseline) failed to provide a recall higher than 0.4. GOMMA-bk provided the best results in terms of both recall and F-measure, while the baseline LogMapLt provided the best precision, closely followed by ServOMapL. GOMMA-bk is a step ahead of the other systems since it managed to provide a mapping set with very high recall; the use of background knowledge was key in this matching task.
As in the FMA-NCI matching problem, precision tends to increase when evaluating against the original UMLS mapping set, while recall decreases. This is expected, since the refined reference sets are subsets of the original one in which the mappings leading to unsatisfiable classes have been removed.
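To make the evaluation measures concrete, the following is a minimal sketch of how precision, recall and F-measure are computed for a mapping set against a reference alignment. The entity names and mapping sets are toy examples, not actual UMLS data; mappings are reduced to plain entity pairs, ignoring confidence values and mapping relations.

```python
# Minimal sketch: precision, recall and F-measure of a system's mapping
# set against a reference alignment. Mappings are modelled as plain
# (source_entity, target_entity) pairs; all entity names are illustrative.

def evaluate(system_mappings, reference_mappings):
    """Return (precision, recall, F-measure) of system vs. reference."""
    system = set(system_mappings)
    reference = set(reference_mappings)
    true_positives = len(system & reference)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall > 0 else 0.0)
    return precision, recall, f_measure

# Toy reference sets: the refined set drops two conflicting mappings
# from the original one, so it is a strict subset.
original = {("fma:Liver", "snomed:Liver"),
            ("fma:Heart", "snomed:Heart"),
            ("fma:Lung", "snomed:Lung"),
            ("fma:Hand", "snomed:HandPart")}
refined = original - {("fma:Lung", "snomed:Lung"),
                      ("fma:Hand", "snomed:HandPart")}
system = {("fma:Liver", "snomed:Liver"),
          ("fma:Heart", "snomed:Heart"),
          ("fma:Lung", "snomed:Lung")}

for name, reference in (("original", original), ("refined", refined)):
    p, r, f = evaluate(system, reference)
    print(f"{name}: P={p:.3f} R={r:.3f} F={f:.3f}")
```

In this toy run the system scores P=1.000/R=0.750 against the original set but P=0.667/R=1.000 against the refined one, reproducing the precision/recall shift observed in the tables.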
Runtimes were also very positive in general: eight systems completed the task in less than 6 minutes. MapSSS required almost 1 hour, while Hertuda, HotMatch and AROMA needed around 5, 9 and 14 hours, respectively.
LogMap, unlike LogMap-noe, failed to detect and repair two unsatisfiable classes because they fell outside the computed ontology fragments (the overlapping). The remaining systems, even those providing highly precise mappings such as ServOMapL, generated mapping sets with a high incoherence degree.
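The incoherence analysis in the tables counts the classes that become unsatisfiable once the mappings are added to the input ontologies as equivalence axioms. The sketch below, assuming owlready2 with the HermiT reasoner and hypothetical file names, IRIs and mapping list, shows the idea; note that a complete OWL 2 reasoner does not scale to ontologies of this size, so the actual track evaluation relies on more scalable, approximate reasoning techniques.

```python
# Sketch of the incoherence analysis, assuming mappings are interpreted
# as OWL equivalence axioms. File names, IRIs and the mapping list are
# hypothetical; this only works on small fragments, since HermiT does
# not scale to the whole FMA or SNOMED ontologies.
from owlready2 import get_ontology, default_world, sync_reasoner

fma = get_ontology("file://fma_fragment.owl").load()
snomed = get_ontology("file://snomed_fragment.owl").load()

# Mappings produced by a matcher: (FMA IRI, SNOMED IRI) pairs.
mappings = [("http://example.org/fma#Liver",
             "http://example.org/snomed#Liver")]

with fma:
    for iri1, iri2 in mappings:
        cls1, cls2 = default_world[iri1], default_world[iri2]
        if cls1 is not None and cls2 is not None:
            cls1.equivalent_to.append(cls2)   # merge via equivalence axiom

sync_reasoner()                               # classify with HermiT

unsatisfiable = list(default_world.inconsistent_classes())
total = len(list(fma.classes())) + len(list(snomed.classes()))
print(f"Unsatisfiable classes: {len(unsatisfiable)} "
      f"(incoherence degree: {len(unsatisfiable) / total:.2%})")
```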
| System | Time (s) | # Mappings | P (orig. UMLS) | R (orig. UMLS) | F (orig. UMLS) | P (LogMap ref.) | R (LogMap ref.) | F (LogMap ref.) | P (Alcomo ref.) | R (Alcomo ref.) | F (Alcomo ref.) | P (avg.) | R (avg.) | F (avg.) | All Unsat. | Unsat. Degree | Root Unsat. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GOMMA_Bk | 148 | 8,598 | 0.958 | 0.914 | 0.935 | 0.860 | 0.912 | 0.885 | 0.862 | 0.912 | 0.886 | 0.893 | 0.913 | 0.903 | 13,685 | 58.06% | 4,674 |
| ServOMapL | 39 | 6,346 | 0.985 | 0.694 | 0.814 | 0.884 | 0.691 | 0.776 | 0.892 | 0.696 | 0.782 | 0.920 | 0.694 | 0.791 | 10,584 | 44.91% | 3,056 |
| YAM++ | 326 | 6,421 | 0.972 | 0.693 | 0.809 | 0.870 | 0.688 | 0.769 | 0.879 | 0.694 | 0.776 | 0.907 | 0.692 | 0.785 | 14,534 | 61.67% | 3,150 |
| LogMap-noe | 63 | 6,363 | 0.964 | 0.681 | 0.799 | 0.877 | 0.688 | 0.771 | 0.889 | 0.696 | 0.781 | 0.910 | 0.688 | 0.784 | 0 | 0% | 0 |
| LogMap | 65 | 6,164 | 0.965 | 0.660 | 0.784 | 0.876 | 0.666 | 0.756 | 0.889 | 0.674 | 0.767 | 0.910 | 0.667 | 0.769 | 2 | 0.01% | 2 |
| ServOMap | 46 | 6,008 | 0.985 | 0.657 | 0.788 | 0.880 | 0.652 | 0.749 | 0.888 | 0.656 | 0.755 | 0.918 | 0.655 | 0.764 | 8,165 | 34.64% | 2,721 |
| GOMMA | 54 | 3,667 | 0.926 | 0.377 | 0.536 | 0.834 | 0.377 | 0.520 | 0.865 | 0.390 | 0.538 | 0.875 | 0.381 | 0.531 | 2,058 | 8.73% | 206 |
| MapSSS | 3,129 | 3,458 | 0.798 | 0.306 | 0.442 | 0.719 | 0.307 | 0.430 | 0.737 | 0.313 | 0.440 | 0.751 | 0.309 | 0.438 | 9,084 | 38.54% | 389 |
| AROMA | 51,191 | 5,227 | 0.555 | 0.322 | 0.407 | 0.507 | 0.327 | 0.397 | 0.519 | 0.333 | 0.406 | 0.527 | 0.327 | 0.404 | 21,083 | 89.45% | 2,296 |
| HotMatch | 31,718 | 2,139 | 0.875 | 0.208 | 0.336 | 0.812 | 0.214 | 0.339 | 0.842 | 0.222 | 0.351 | 0.843 | 0.214 | 0.342 | 907 | 3.85% | 104 |
| LogMapLt | 14 | 1,645 | 0.975 | 0.178 | 0.301 | 0.902 | 0.183 | 0.304 | 0.936 | 0.189 | 0.315 | 0.938 | 0.183 | 0.307 | 773 | 3.28% | 21 |
| Hertuda | 17,625 | 3,051 | 0.578 | 0.196 | 0.292 | 0.533 | 0.201 | 0.292 | 0.555 | 0.208 | 0.303 | 0.555 | 0.201 | 0.296 | 1,020 | 4.33% | 47 |
MapSSS, HotMatch and Hertuda failed to complete the task involving the big fragments of FMA and SNOMED within 24 hours of execution.
ServOMapL provided the best results in terms of F-measure and precision, whereas GOMMA-bk obtained the best recall. As in the FMA-NCI matching task involving big fragments, F-measures generally decreased with respect to the small matching task. The largest variations affected GOMMA-bk and GOMMA, whose average precision dropped from 0.893 and 0.875 to 0.571 and 0.389, respectively. Interestingly, the background knowledge used by GOMMA-bk kept recall high but could not prevent this drop in precision. Furthermore, runtimes were 4 to 10 times higher for all systems, with the exception of AROMA, whose runtime increased from 14 to 17 hours.
LogMap (with its two variants) generated a clean output where the mappings together with the input ontologies did not lead to any unsatisfiable class.
| System | Time (s) | # Mappings | P (orig. UMLS) | R (orig. UMLS) | F (orig. UMLS) | P (LogMap ref.) | R (LogMap ref.) | F (LogMap ref.) | P (Alcomo ref.) | R (Alcomo ref.) | F (Alcomo ref.) | P (avg.) | R (avg.) | F (avg.) | All Unsat. | Unsat. Degree | Root Unsat. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ServOMapL | 234 | 6,563 | 0.945 | 0.689 | 0.797 | 0.847 | 0.686 | 0.758 | 0.857 | 0.692 | 0.766 | 0.883 | 0.689 | 0.774 | 55,970 | 32.36% | 1,192 |
| ServOMap | 315 | 6,272 | 0.941 | 0.655 | 0.773 | 0.841 | 0.650 | 0.734 | 0.849 | 0.655 | 0.740 | 0.877 | 0.654 | 0.749 | 143,316 | 82.85% | 1,320 |
| YAM++ | 3,780 | 7,003 | 0.879 | 0.684 | 0.769 | 0.787 | 0.679 | 0.729 | 0.797 | 0.686 | 0.737 | 0.821 | 0.683 | 0.746 | 69,345 | 40.09% | 1,360 |
| LogMap-noe | 521 | 6,450 | 0.886 | 0.635 | 0.740 | 0.805 | 0.640 | 0.713 | 0.821 | 0.651 | 0.726 | 0.837 | 0.642 | 0.727 | 0 | 0% | 0 |
| LogMap | 484 | 6,292 | 0.883 | 0.617 | 0.726 | 0.800 | 0.621 | 0.699 | 0.815 | 0.631 | 0.711 | 0.833 | 0.623 | 0.712 | 0 | 0% | 0 |
| GOMMA_Bk | 636 | 12,614 | 0.613 | 0.858 | 0.715 | 0.548 | 0.852 | 0.667 | 0.551 | 0.855 | 0.670 | 0.571 | 0.855 | 0.684 | 75,910 | 43.88% | 3,344 |
| GOMMA | 437 | 5,591 | 0.412 | 0.256 | 0.316 | 0.370 | 0.255 | 0.302 | 0.386 | 0.265 | 0.314 | 0.389 | 0.259 | 0.311 | 7,343 | 4.25% | 480 |
| AROMA | 62,801 | 2,497 | 0.684 | 0.190 | 0.297 | 0.638 | 0.197 | 0.300 | 0.660 | 0.203 | 0.310 | 0.661 | 0.196 | 0.303 | 54,459 | 31.48% | 271 |
| LogMapLt | 96 | 1,819 | 0.882 | 0.178 | 0.296 | 0.816 | 0.183 | 0.299 | 0.846 | 0.189 | 0.309 | 0.848 | 0.183 | 0.302 | 2,994 | 1.73% | 24 |
| MapSSS | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HotMatch | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Hertuda | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
AROMA failed to complete the matching task involving the whole FMA and SNOMED ontologies in less than 24 hours.
The results in terms of both precision and recall did not change significantly and, as in the previous task, ServOMapL provided the best results in terms of F-measure and precision, while GOMMA-bk obtained the best recall.
Runtimes for ServOMap, ServOMapL, LogMapLt and LogMap (in its two variants) were in line with the previous matching task; the computation times for GOMMA, GOMMA-bk and YAM++, however, increased considerably. GOMMA (in its two variants) required more than 30 minutes, while YAM++ required more than 6 hours.
As in the previous tasks, the LogMap and LogMap-noe mappings had a very low incoherence degree.
| System | Time (s) | # Mappings | P (orig. UMLS) | R (orig. UMLS) | F (orig. UMLS) | P (LogMap ref.) | R (LogMap ref.) | F (LogMap ref.) | P (Alcomo ref.) | R (Alcomo ref.) | F (Alcomo ref.) | P (avg.) | R (avg.) | F (avg.) | All Unsat. | Unsat. Degree | Root Unsat. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ServOMapL | 517 | 6,605 | 0.939 | 0.688 | 0.794 | 0.842 | 0.686 | 0.756 | 0.851 | 0.691 | 0.763 | 0.877 | 0.688 | 0.772 | 99,726 | 25.86% | 2,862 |
| ServOMap | 532 | 6,320 | 0.933 | 0.655 | 0.770 | 0.835 | 0.650 | 0.731 | 0.842 | 0.655 | 0.737 | 0.870 | 0.653 | 0.746 | 273,242 | 70.87% | 2,617 |
| YAM++ | 23,900 | 7,044 | 0.872 | 0.682 | 0.765 | 0.780 | 0.678 | 0.725 | 0.791 | 0.685 | 0.734 | 0.814 | 0.681 | 0.742 | 106,107 | 27.52% | 3,393 |
| LogMap | 612 | 6,312 | 0.877 | 0.615 | 0.723 | 0.795 | 0.619 | 0.696 | 0.811 | 0.629 | 0.708 | 0.828 | 0.621 | 0.710 | 10 | 0.003% | 0 |
| LogMap-noe | 791 | 6,406 | 0.866 | 0.616 | 0.720 | 0.782 | 0.617 | 0.690 | 0.801 | 0.631 | 0.706 | 0.816 | 0.621 | 0.706 | 10 | 0.003% | 0 |
| GOMMA_Bk | 1,893 | 12,829 | 0.602 | 0.858 | 0.708 | 0.538 | 0.852 | 0.660 | 0.542 | 0.855 | 0.663 | 0.561 | 0.855 | 0.677 | 119,657 | 31.03% | 5,289 |
| LogMapLt | 171 | 1,823 | 0.880 | 0.178 | 0.296 | 0.814 | 0.183 | 0.299 | 0.844 | 0.189 | 0.309 | 0.846 | 0.183 | 0.301 | 4,938 | 1.28% | 37 |
| GOMMA | 1,994 | 5,823 | 0.370 | 0.239 | 0.291 | 0.332 | 0.239 | 0.278 | 0.347 | 0.248 | 0.289 | 0.350 | 0.242 | 0.286 | 10,752 | 2.79% | 609 |
| AROMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| MapSSS | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| HotMatch | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Hertuda | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |