Integrating data from natural language processing into a clinical information system.
Hospital Information Systems
Information Storage and Retrieval
Medical Records Systems, Computerized
Natural Language Processing
Demographic data extracted from discharge summaries by natural language processing were compared with data gathered by a conventional hospital admitting system. Discrepancies were noted in names, age, sex, race, and ethnicity. Some differences are attributable to errors in collection: interaction with the patient, dictation, transcription, and data entry. Very few differences were due to errors in natural language processing. Other differences can be used to critique existing data or to enhance them with more detailed information. Discrepancies in data as elementary as patient demographics raise the issue of how to resolve conflicts when neither source is known to be more reliable. Clinical repositories can represent conflicting data from multiple sources, but clinical information systems must then bear the cost of increased complexity in the application programs that use the data.
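As an illustration only, the idea of keeping conflicting values from both sources, rather than silently choosing one, might be sketched as follows. This is not the system described in the abstract: the `Observation` type, the `find_discrepancies` function, the field names, and the sample records are all hypothetical, and the normalization rule (case and whitespace only) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One value for one demographic field, tagged with its source (hypothetical type)."""
    field: str
    value: str
    source: str  # assumed labels: "admitting" or "nlp"


def normalize(value: str) -> str:
    # Assumption: differences in case and spacing are not real discrepancies.
    return " ".join(value.strip().lower().split())


def find_discrepancies(admitting: dict, nlp: dict) -> dict:
    """Return {field: [Observation, Observation]} for fields where both sources
    report a value and the normalized values differ. Both values are kept,
    since neither source is assumed to be more reliable."""
    conflicts = {}
    for field in sorted(admitting.keys() & nlp.keys()):
        a, n = admitting[field], nlp[field]
        if normalize(a) != normalize(n):
            conflicts[field] = [
                Observation(field, a, "admitting"),
                Observation(field, n, "nlp"),
            ]
    return conflicts


# Made-up records for illustration; not data from the study.
admitting_record = {"name": "Smith, John", "age": "64", "sex": "M"}
nlp_record = {"name": "smith,  john", "age": "65", "sex": "m", "race": "white"}

conflicts = find_discrepancies(admitting_record, nlp_record)
for field, observations in conflicts.items():
    print(field, [(o.source, o.value) for o in observations])
```

In this sketch a field reported by only one source (here `race`) is not a conflict; it is the kind of additional detail that could enhance the existing record. Any application that reads such a structure has to decide what to do with two values for one field, which is the added complexity the abstract refers to.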