Ontology Alignment Evaluation Initiative - OAEI-2012 Campaign

Results OAEI 2012::Large BioMed Track

Results OAEI 2012 FMA-SNOMED matching problem

As the following tables show, the FMA-SNOMED matching problem was harder than the FMA-NCI problem, both in size and in complexity. Matching systems therefore required more time to complete the task and, in general, produced worse results in terms of F-measure. Furthermore, MaasMatch, Wmatch and AUTOMSv2, which were able to complete the small FMA-NCI task, failed to complete the small FMA-SNOMED task in less than 24 hours.

FMA-SNOMED small fragments

Six systems obtained an average F-measure greater than 0.75; the other six systems that completed the task (including our baseline) failed to reach a recall higher than 0.4. GOMMA-bk provided the best results in terms of both recall and F-measure, while the baseline LogMapLt provided the best precision, closely followed by ServOMapL. GOMMA-bk is clearly ahead of the other systems because it managed to produce a mapping set with very high recall; its use of background knowledge was key in this matching task.

As in the FMA-NCI matching problem, precision tends to increase when evaluating against the original UMLS mapping set, while recall decreases.
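As a sanity check, the F-measure columns of the tables are simply the harmonic mean of the corresponding precision and recall columns. A minimal Python sketch, verifying two rows (GOMMA_Bk and ServOMapL, Original UMLS columns) of the small-fragments table below:

```python
# F-measure is the harmonic mean of precision and recall.
def f_measure(precision, recall):
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Original-UMLS columns of the small-fragments table:
# GOMMA_Bk:  P=0.958, R=0.914 -> F-measure 0.935
# ServOMapL: P=0.985, R=0.694 -> F-measure 0.814
print(round(f_measure(0.958, 0.914), 3))  # 0.935
print(round(f_measure(0.985, 0.694), 3))  # 0.814
```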

Runtimes were also very positive in general: eight systems completed the task in less than 6 minutes. MapSSS required almost 1 hour, while Hertuda, HotMatch and AROMA needed 5, 9 and 14 hours to complete the task, respectively.

LogMap, unlike LogMap-noe, failed to detect and repair two unsatisfiable classes because they fell outside the computed (overlapping) ontology fragments. The remaining systems, even those producing highly precise mappings such as ServOMapL, generated mapping sets with a high incoherence degree.
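As far as the numbers allow us to check, the "Degree" column of the incoherence analysis is the ratio of unsatisfiable classes ("All Unsat.") to the total number of classes in the merged input, a total the tables do not state. A minimal Python sketch (the helper name and this interpretation are ours) recovering that implied total from two rows of the small-fragments table:

```python
# Assumption (ours): Degree = All Unsat. / (total classes in the merged
# input), so the unstated total can be recovered as All Unsat. / Degree.
def implied_total_classes(all_unsat, degree_percent):
    return all_unsat / (degree_percent / 100.0)

# Two rows of the small-fragments table should imply the same total:
t1 = implied_total_classes(13685, 58.06)  # GOMMA_Bk row  -> ~23,570
t2 = implied_total_classes(10584, 44.91)  # ServOMapL row -> ~23,567
assert abs(t1 - t2) < 50  # they agree up to rounding of the Degree column
```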


(P = Precision, R = Recall, F = F-measure)
System  Time (s)  # Mappings | Original UMLS: P  R  F | Refined UMLS (LogMap): P  R  F | Refined UMLS (Alcomo): P  R  F | Average: P  R  F | Incoherence Analysis: All Unsat.  Degree  Root Unsat.
GOMMA_Bk 148 8,598 0.958 0.914 0.935 0.860 0.912 0.885 0.862 0.912 0.886 0.893 0.913 0.903 13,685 58.06% 4,674
ServOMapL 39 6,346 0.985 0.694 0.814 0.884 0.691 0.776 0.892 0.696 0.782 0.920 0.694 0.791 10,584 44.91% 3,056
YAM++ 326 6,421 0.972 0.693 0.809 0.870 0.688 0.769 0.879 0.694 0.776 0.907 0.692 0.785 14,534 61.67% 3,150
LogMap-noe 63 6,363 0.964 0.681 0.799 0.877 0.688 0.771 0.889 0.696 0.781 0.910 0.688 0.784 0 0% 0
LogMap 65 6,164 0.965 0.660 0.784 0.876 0.666 0.756 0.889 0.674 0.767 0.910 0.667 0.769 2 0.01% 2
ServOMap 46 6,008 0.985 0.657 0.788 0.880 0.652 0.749 0.888 0.656 0.755 0.918 0.655 0.764 8,165 34.64% 2,721
GOMMA 54 3,667 0.926 0.377 0.536 0.834 0.377 0.520 0.865 0.390 0.538 0.875 0.381 0.531 2,058 8.73% 206
MapSSS 3,129 3,458 0.798 0.306 0.442 0.719 0.307 0.430 0.737 0.313 0.440 0.751 0.309 0.438 9,084 38.54% 389
AROMA 51,191 5,227 0.555 0.322 0.407 0.507 0.327 0.397 0.519 0.333 0.406 0.527 0.327 0.404 21,083 89.45% 2,296
HotMatch 31,718 2,139 0.875 0.208 0.336 0.812 0.214 0.339 0.842 0.222 0.351 0.843 0.214 0.342 907 3.85% 104
LogMapLt 14 1,645 0.975 0.178 0.301 0.902 0.183 0.304 0.936 0.189 0.315 0.938 0.183 0.307 773 3.28% 21
Hertuda 17,625 3,051 0.578 0.196 0.292 0.533 0.201 0.292 0.555 0.208 0.303 0.555 0.201 0.296 1,020 4.33% 47


FMA-SNOMED big fragments

MapSSS, HotMatch and Hertuda failed to complete the task involving the big fragments of FMA and SNOMED within 24 hours of execution.

ServOMapL provided the best results in terms of both F-measure and precision, whereas GOMMA-bk obtained the best recall. As in the FMA-NCI matching task involving big fragments, F-measures generally decreased with respect to the small matching task. The largest variations affected GOMMA-bk and GOMMA, whose average precision dropped from 0.893 and 0.875 to 0.571 and 0.389, respectively. This is an interesting fact: the background knowledge used by GOMMA-bk kept recall high but could not prevent the drop in precision. Furthermore, runtimes were 4 to 10 times higher for all systems, with the exception of AROMA, whose runtime increased only from 14 to 17 hours.

Both variants of LogMap generated a clean output: the mappings, together with the input ontologies, did not lead to any unsatisfiable class.
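The unsatisfiable classes counted in these tables arise when the mappings, combined with the axioms of the input ontologies, force some class to be empty. A toy Python sketch of the simplest such pattern (the class names are invented, and real systems use a DL reasoner rather than this shortcut):

```python
# Toy pattern: a class mapped (as equivalent) to two classes that are
# declared disjoint becomes unsatisfiable in the merged ontology.
disjoint = {("Artery", "Vein")}                      # invented axiom in ontology 2
mappings = {("fma:X", "Artery"), ("fma:X", "Vein")}  # equivalence mappings

def unsat_classes(mappings, disjoint):
    targets = {}
    for src, tgt in mappings:
        targets.setdefault(src, set()).add(tgt)
    # a source class equivalent to two disjoint targets can have no instances
    return {src for src, ts in targets.items()
            for a, b in disjoint if a in ts and b in ts}

print(unsat_classes(mappings, disjoint))  # {'fma:X'}
```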


(P = Precision, R = Recall, F = F-measure)
System  Time (s)  # Mappings | Original UMLS: P  R  F | Refined UMLS (LogMap): P  R  F | Refined UMLS (Alcomo): P  R  F | Average: P  R  F | Incoherence Analysis: All Unsat.  Degree  Root Unsat.
ServOMapL 234 6,563 0.945 0.689 0.797 0.847 0.686 0.758 0.857 0.692 0.766 0.883 0.689 0.774 55,970 32.36% 1,192
ServOMap 315 6,272 0.941 0.655 0.773 0.841 0.650 0.734 0.849 0.655 0.740 0.877 0.654 0.749 143,316 82.85% 1,320
YAM++ 3,780 7,003 0.879 0.684 0.769 0.787 0.679 0.729 0.797 0.686 0.737 0.821 0.683 0.746 69,345 40.09% 1,360
LogMap-noe 521 6,450 0.886 0.635 0.740 0.805 0.640 0.713 0.821 0.651 0.726 0.837 0.642 0.727 0 0% 0
LogMap 484 6,292 0.883 0.617 0.726 0.800 0.621 0.699 0.815 0.631 0.711 0.833 0.623 0.712 0 0% 0
GOMMA_Bk 636 12,614 0.613 0.858 0.715 0.548 0.852 0.667 0.551 0.855 0.670 0.571 0.855 0.684 75,910 43.88% 3,344
GOMMA 437 5,591 0.412 0.256 0.316 0.370 0.255 0.302 0.386 0.265 0.314 0.389 0.259 0.311 7,343 4.25% 480
AROMA 62,801 2,497 0.684 0.190 0.297 0.638 0.197 0.300 0.660 0.203 0.310 0.661 0.196 0.303 54,459 31.48% 271
LogMapLt 96 1,819 0.882 0.178 0.296 0.816 0.183 0.299 0.846 0.189 0.309 0.848 0.183 0.302 2,994 1.73% 24
MapSSS - - - - - - - - - - - - - - - - -
HotMatch - - - - - - - - - - - - - - - - -
Hertuda - - - - - - - - - - - - - - - - -


FMA-SNOMED whole ontologies

AROMA failed to complete the matching task involving the whole FMA and SNOMED ontologies in less than 24 hours.

The results in terms of both precision and recall did not change significantly and, as in the previous task, ServOMapL provided the best results in terms of F-measure and precision, while GOMMA-bk obtained the best recall.

Runtimes for ServOMap, ServOMapL, LogMapLt and LogMap (with its two variants) were in line with the previous matching task; the computation times for GOMMA, GOMMA-bk and YAM++, however, increased considerably. GOMMA (with its two variants) required more than 30 minutes, while YAM++ required more than 6 hours.

LogMap and LogMap-noe mappings, as in previous tasks, had a very low incoherence degree.


(P = Precision, R = Recall, F = F-measure)
System  Time (s)  # Mappings | Original UMLS: P  R  F | Refined UMLS (LogMap): P  R  F | Refined UMLS (Alcomo): P  R  F | Average: P  R  F | Incoherence Analysis: All Unsat.  Degree  Root Unsat.
ServOMapL 517 6,605 0.939 0.688 0.794 0.842 0.686 0.756 0.851 0.691 0.763 0.877 0.688 0.772 99,726 25.86% 2,862
ServOMap 532 6,320 0.933 0.655 0.770 0.835 0.650 0.731 0.842 0.655 0.737 0.870 0.653 0.746 273,242 70.87% 2,617
YAM++ 23,900 7,044 0.872 0.682 0.765 0.780 0.678 0.725 0.791 0.685 0.734 0.814 0.681 0.742 106,107 27.52% 3,393
LogMap 612 6,312 0.877 0.615 0.723 0.795 0.619 0.696 0.811 0.629 0.708 0.828 0.621 0.710 10 0.003% 0
LogMap-noe 791 6,406 0.866 0.616 0.720 0.782 0.617 0.690 0.801 0.631 0.706 0.816 0.621 0.706 10 0.003% 0
GOMMA_Bk 1,893 12,829 0.602 0.858 0.708 0.538 0.852 0.660 0.542 0.855 0.663 0.561 0.855 0.677 119,657 31.03% 5,289
LogMapLt 171 1,823 0.880 0.178 0.296 0.814 0.183 0.299 0.844 0.189 0.309 0.846 0.183 0.301 4,938 1.28% 37
GOMMA 1,994 5,823 0.370 0.239 0.291 0.332 0.239 0.278 0.347 0.248 0.289 0.350 0.242 0.286 10,752 2.79% 609
AROMA - - - - - - - - - - - - - - - - -
MapSSS - - - - - - - - - - - - - - - - -
HotMatch - - - - - - - - - - - - - - - - -
Hertuda - - - - - - - - - - - - - - - - -