Graph theoretic modeling of large-scale semantic networks.
During the past several years, social network analysis methods have been used to model many complex real-world phenomena, including social networks, transportation networks, and the Internet. Graph theoretic methods, based on an elegant representation of entities and relationships, have been used in computational biology to study biological networks; however, they have not yet been widely adopted by the greater informatics community. The graphs produced are generally large, sparse, and complex, and share common global topological properties. In this review of research (1998-2005) on large-scale semantic networks, we used a tailored search strategy to identify articles involving both a graph theoretic perspective and semantic information. Thirty-one relevant articles were retrieved. Most (28 of 31, 90.3%) involved an investigation of a real-world network. These networks included corpora, thesauri, dictionaries, large computer programs, biological neuronal networks, word association networks, and files on the Internet. Twenty-two of the 28 (78.6%) involved a graph composed of words or phrases. Fifteen of the 28 (53.6%) mentioned evidence of small-world characteristics in the network investigated. Eleven of the 28 (39.3%) reported a scale-free topology, which tends to have a similar appearance when examined at varying scales. The results of this review indicate that networks generated from natural language have topological properties common to other natural phenomena. It has not yet been determined whether artificial, human-curated terminology systems in biomedicine share these properties. Large network analysis methods have potential application in a variety of areas of informatics, such as the development of controlled vocabularies and the characterization of a given domain.
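As an illustration of the topological properties mentioned above, the following is a minimal sketch, not a method taken from any of the reviewed articles. Small-world networks are usually characterized by high clustering together with short average path lengths, and scale-free networks by a heavy-tailed degree distribution, so the sketch computes those three quantities for a word graph. The word pairs in `edges` and all function names are hypothetical and chosen only for illustration; a real analysis would use a large corpus or thesaurus and compare the results against a random graph of the same size.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (word, word) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def average_clustering(graph):
    """Mean local clustering coefficient; nodes of degree < 2 contribute 0."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Each edge between two neighbours is seen twice, hence the / 2.
        links = sum(1 for u in nbrs for v in graph[u] if v in nbrs) / 2
        total += links / (k * (k - 1) / 2)
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered node pairs."""
    total, pairs = 0, 0
    for source in graph:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def degree_distribution(graph):
    """Map each degree to the number of nodes having that degree."""
    counts = {}
    for nbrs in graph.values():
        counts[len(nbrs)] = counts.get(len(nbrs), 0) + 1
    return counts


# Hypothetical word-association pairs, used only to exercise the functions.
edges = [
    ("cat", "dog"),
    ("dog", "bone"),
    ("cat", "bone"),
    ("cat", "mouse"),
    ("mouse", "cheese"),
]
graph = build_graph(edges)

print("average clustering:", average_clustering(graph))
print("average path length:", average_path_length(graph))
print("degree distribution:", degree_distribution(graph))
```

A toy graph of this size cannot show small-world or scale-free behavior; the sketch only indicates which quantities such studies typically measure.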