A comparison of ground truth estimation methods. Academic Article

Overview

abstract

  • PURPOSE: Knowledge of the exact shape of a lesion, or ground truth (GT), is necessary for the development of diagnostic tools by means of algorithm validation, measurement metric analysis, and accurate size estimation. Four methods that estimate GTs from multiple readers' documentations by considering the spatial location of voxels were compared: thresholded Probability-Map at 0.50 (TPM(0.50)) and at 0.75 (TPM(0.75)), simultaneous truth and performance level estimation (STAPLE), and truth estimate from self distances (TESD). METHODS: A subset of the publicly available Lung Image Database Consortium archive was used, selecting pulmonary nodules documented by all four radiologists. The pair-wise similarities between the estimated GTs were analyzed by computing the respective Jaccard coefficients. Then, with respect to the readers' marking volumes, the estimated volumes were ranked and the sign test of the differences between them was performed. RESULTS: (a) The rank variations among the four methods and the volume differences between STAPLE and TESD are not statistically significant; (b) TPM(0.50) estimates are statistically larger; (c) TPM(0.75) estimates are statistically smaller; (d) there is some spatial disagreement in the estimates, as shown by the one-sided 90% confidence intervals between TPM(0.75) and TPM(0.50), TPM(0.75) and STAPLE, TPM(0.75) and TESD, TPM(0.50) and STAPLE, TPM(0.50) and TESD, and STAPLE and TESD, respectively: [0.67, 1.00], [0.67, 1.00], [0.77, 1.00], [0.93, 1.00], [0.85, 1.00], [0.85, 1.00]. CONCLUSIONS: The method used to estimate the GT is important: the differences highlighted that STAPLE and TESD, notwithstanding a few weaknesses, appear to be equally viable as GT estimators, while the increased availability of computing power is decreasing the appeal of TPMs. Ultimately, the choice between these two GT estimation methods depends on the specific characteristics of the marked data with respect to the two elements that differentiate the methods: the relative reliabilities of the readers and the reliability of the region boundaries.
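
For readers unfamiliar with the two simplest quantities named in the abstract, the following is a minimal illustrative sketch of a thresholded probability map and a Jaccard coefficient. It is not the authors' implementation. The function names, the representation of each reader's outline as a set of voxel coordinates, the inclusive threshold (a voxel is kept when the fraction of readers marking it is at least the threshold), and the toy 2D data are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def probability_map(reader_masks):
    """Map each marked voxel to the fraction of readers who included it.

    reader_masks: list of sets of voxel coordinates, one set per reader
    (hypothetical representation chosen for this sketch).
    """
    counts = Counter(voxel for mask in reader_masks for voxel in mask)
    n_readers = len(reader_masks)
    return {voxel: Fraction(c, n_readers) for voxel, c in counts.items()}


def thresholded_probability_map(reader_masks, threshold):
    """Keep voxels whose reader-agreement fraction is >= threshold.

    The inclusive comparison is an assumption of this sketch.
    """
    pmap = probability_map(reader_masks)
    return {voxel for voxel, p in pmap.items() if p >= threshold}


def jaccard(a, b):
    """Jaccard coefficient |A n B| / |A u B| of two voxel sets."""
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty estimates agree fully
    return len(a & b) / len(union)


# Toy 2D outlines from four hypothetical readers.
readers = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},
    {(0, 0), (0, 1), (1, 1)},
    {(0, 1), (1, 1), (1, 2)},
    {(0, 1), (1, 1)},
]

tpm50 = thresholded_probability_map(readers, 0.50)  # marked by >= 2 of 4
tpm75 = thresholded_probability_map(readers, 0.75)  # marked by >= 3 of 4

print(sorted(tpm50))                    # [(0, 0), (0, 1), (1, 1)]
print(sorted(tpm75))                    # [(0, 1), (1, 1)]
print(round(jaccard(tpm50, tpm75), 3))  # 0.667
```

In this toy case the TPM(0.75) estimate is a subset of the TPM(0.50) estimate, which is consistent with the abstract's observation that the higher threshold yields smaller volumes. STAPLE and TESD are more involved and are not sketched here.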

publication date

  • December 9, 2009

Research

keywords

  • Algorithms
  • Image Interpretation, Computer-Assisted
  • Pattern Recognition, Automated
  • Solitary Pulmonary Nodule
  • Tomography, X-Ray Computed

Identity

Scopus Document Identifier

  • 77953649414

Digital Object Identifier (DOI)

  • 10.1007/s11548-009-0401-3

PubMed ID

  • 20033494

Additional Document Info

volume

  • 5

issue

  • 3