Learning classifiers from distributed, ontology-extended data sources
Conference Paper


MeSH Major

  • Cardiovascular Diseases
  • Data Mining
  • Decision Support Techniques
  • Electronic Health Records


abstract

  • There is an urgent need for sound approaches to integrative and collaborative analysis of large, autonomous (and hence, inevitably semantically heterogeneous) data sources in several increasingly data-rich application domains. In this paper, we precisely formulate and solve the problem of learning classifiers from such data sources, in a setting where each data source has a hierarchical ontology associated with it and semantic correspondences between data source ontologies and a user ontology are supplied. The proposed approach yields algorithms for learning a broad class of classifiers (including Bayesian networks, decision trees, etc.) from semantically heterogeneous distributed data with strong performance guarantees relative to their centralized counterparts. We illustrate the application of the proposed approach in the case of learning Naive Bayes classifiers from distributed, ontology-extended data sources. © Springer-Verlag Berlin Heidelberg 2006.
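The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each source can translate its own attribute values into user-ontology terms through a supplied mapping, and that Naive Bayes needs nothing from a source beyond frequency counts. All function names, the `level` attribute, and the toy data are hypothetical.

```python
from collections import Counter
import math


def local_counts(records, mapping):
    """Compute count statistics at one source, expressed in user-ontology terms.

    records: list of (attribute_dict, class_label) pairs in the source's own vocabulary.
    mapping: dict from (attribute, source_value) to a user-ontology value (assumed given).
    """
    class_counts = Counter()
    joint_counts = Counter()
    for attrs, label in records:
        class_counts[label] += 1
        for attr, value in attrs.items():
            # Values without a mapping entry are kept unchanged.
            user_value = mapping.get((attr, value), value)
            joint_counts[(attr, user_value, label)] += 1
    return class_counts, joint_counts


def merge_counts(stats):
    """Sum the per-source counts; no raw records leave their source."""
    total_class, total_joint = Counter(), Counter()
    for class_counts, joint_counts in stats:
        total_class.update(class_counts)
        total_joint.update(joint_counts)
    return total_class, total_joint


def predict(attrs, class_counts, joint_counts, domain_sizes):
    """Naive Bayes prediction with Laplace smoothing from merged counts."""
    n = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in sorted(class_counts.items()):
        score = math.log(count / n)
        for attr, value in attrs.items():
            numerator = joint_counts[(attr, value, label)] + 1
            denominator = count + domain_sizes[attr]
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy sources that name the same concepts differently.
source_a = [({"level": "Ugrad"}, "yes"), ({"level": "Grad"}, "no")]
map_a = {("level", "Ugrad"): "undergraduate", ("level", "Grad"): "graduate"}
source_b = [({"level": "Bachelor"}, "yes"), ({"level": "Master"}, "no"),
            ({"level": "Bachelor"}, "yes")]
map_b = {("level", "Bachelor"): "undergraduate", ("level", "Master"): "graduate"}

class_counts, joint_counts = merge_counts(
    [local_counts(source_a, map_a), local_counts(source_b, map_b)]
)
label = predict({"level": "undergraduate"}, class_counts, joint_counts, {"level": 2})
```

Under these assumptions the merged counts are identical to the counts one would obtain by first translating and pooling all records in one place, which is one simple way to read the abstract's comparison with centralized counterparts. The sketch does not cover hierarchical ontologies in which sources describe data at different levels of abstraction.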

publication date

  • January 2006



  • Conference Paper

Additional Document Info

start page

  • 363

end page

  • 373


volume

  • 4081 LNCS