Design patterns for the development of electronic health record-driven phenotype extraction algorithms. Academic Article

Overview

MeSH

  • Data Curation
  • Phenotype

MeSH Major

  • Algorithms
  • Biological Ontologies
  • Data Mining
  • Electronic Health Records
  • Genomics
  • Natural Language Processing
  • Pattern Recognition, Automated

abstract

  • Design patterns, in the context of software development and ontologies, provide generalized approaches and guidance for solving commonly occurring problems or addressing common situations, and are typically informed by intuition, heuristics and experience. While the biomedical literature contains broad coverage of specific phenotype algorithm implementations, no work to date has attempted to generalize common approaches into design patterns, which may then be distributed to the informatics community to efficiently develop more accurate phenotype algorithms. Using phenotyping algorithms stored in the Phenotype KnowledgeBase (PheKB), we conducted an independent iterative review to identify recurrent elements within the algorithm definitions. We extracted and generalized recurrent elements in these algorithms into candidate patterns. The authors then assessed the candidate patterns for validity by group consensus, and annotated them with attributes. A total of 24 electronic Medical Records and Genomics (eMERGE) phenotypes available in PheKB as of January 25, 2013 were downloaded and reviewed. From these, a total of 21 phenotyping patterns were identified, which are available as an online data supplement. Repeatable patterns within phenotyping algorithms exist, and when codified and cataloged may help to educate both experienced and novice algorithm developers. The dissemination and application of these patterns have the potential to decrease the time to develop algorithms, while improving portability and accuracy. Copyright © 2014 Elsevier Inc. All rights reserved.

publication date

  • October 2014

has subject area

  • Algorithms
  • Biological Ontologies
  • Data Curation
  • Data Mining
  • Electronic Health Records
  • Genomics
  • Natural Language Processing
  • Pattern Recognition, Automated
  • Phenotype

Research

keywords

  • Journal Article

Identity

Language

  • eng

PubMed Central ID

  • PMC4194216

Digital Object Identifier (DOI)

  • 10.1016/j.jbi.2014.06.007

PubMed ID

  • 24960203

Additional Document Info

start page

  • 280

end page

  • 286

volume

  • 51