Representing nested semantic information in a linear string of text using XML.


MeSH Major

  • Programming Languages
  • Semantics
  • Software Design

abstract

  • XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting, as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not nest properly within other features. We encountered this problem while marking up medical text with structured semantic information from a Natural Language Processor. Traditional approaches to this problem separate the structured information from the actual text markup. This paper introduces an alternative solution, which tightly integrates the semantic structure with the text. The resulting XML markup preserves the linearity of the medical texts and can therefore be easily expanded with additional types of information.
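
The nesting problem described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's markup scheme, so everything below is an assumption made for illustration: the sentence, the element names `s`, `w`, `neg`, and `finding`, the `in` attribute, and the identifiers `neg1` and `find1` are all hypothetical. The sketch first shows that two overlapping annotations are rejected by an XML parser, then shows one generic workaround in which each word stays in linear order and carries the identifiers of the structures it belongs to.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: <neg> and <finding> overlap without nesting,
# so this string is not well-formed XML.
OVERLAPPING = (
    "<s>no <neg>evidence of <finding>pneumonia</neg> or effusion</finding></s>"
)

# One generic workaround (an assumption, not necessarily the paper's scheme):
# keep the words in their original order and let each word list the
# identifiers of every semantic structure that contains it.
LINEAR = (
    "<s>"
    "<w>no</w>"
    '<w in="neg1">evidence</w>'
    '<w in="neg1">of</w>'
    '<w in="neg1 find1">pneumonia</w>'
    '<w in="find1">or</w>'
    '<w in="find1">effusion</w>'
    "</s>"
)


def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


def spans_by_id(xml_text):
    """Group the words of each structure under its identifier."""
    spans = defaultdict(list)
    for word in ET.fromstring(xml_text).iter("w"):
        for span_id in word.get("in", "").split():
            spans[span_id].append(word.text)
    return dict(spans)


def plain_text(xml_text):
    """Recover the original word sequence from the linear markup."""
    return " ".join(word.text for word in ET.fromstring(xml_text).iter("w"))
```

In this sketch, `is_well_formed(OVERLAPPING)` fails while `is_well_formed(LINEAR)` succeeds, and `spans_by_id(LINEAR)` recovers both overlapping spans even though the markup itself is a flat sequence of words. Because the text stays linear, a further annotation type could be added as another identifier on the affected words without restructuring the tree.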

publication date

  • January 2002

type

  • Academic Article

language

  • eng

PubMed Central ID

  • PMC2244450

PubMed ID

  • 12463856

Additional Document Info

start page

  • 405

end page

  • 409