Information integration from semantically heterogeneous biological data sources
International Classification of Diseases
We present the first prototype of INDUS (Intelligent Data Understanding System), a federated, query-centric system for information integration and knowledge acquisition from distributed, semantically heterogeneous data sources that can be viewed (conceptually) as tables. INDUS employs ontologies and inter-ontology mappings, to enable a user to view a collection of such data sources (regardless of location, internal structure and query interfaces) as though they were a collection of tables structured according to an ontology supplied by the user. This allows INDUS to answer user queries against distributed, semantically heterogeneous data sources without the need for a centralized data warehouse or a common global ontology.