Accurately quantifying low-abundant targets amid similar sequences by revealing hidden correlations in oligonucleotide microarray data


MeSH Major

  • Oligonucleotide Array Sequence Analysis
  • RNA, Ribosomal
  • Sequence Analysis, RNA

abstract

  • Microarrays have enabled the determination of how thousands of genes are expressed to coordinate function within single organisms. Yet applications to natural or engineered communities where different organisms interact to produce complex properties are hampered by theoretical and technological limitations. Here we describe a general method to accurately identify low-abundant targets in systems containing complex mixtures of homologous targets. We combined an analytical predictor of nonspecific probe-target interactions (cross-hybridization) with an optimization algorithm that iteratively deconvolutes true probe-target signal from raw signal affected by spurious contributions (cross-hybridization, noise, background, and unequal specific hybridization response). The method was capable of quantifying, with unprecedented specificity and accuracy, ribosomal RNA (rRNA) sequences in artificial and natural communities. Controlled experiments with spiked rRNA into artificial and natural communities demonstrated the accuracy of identification and quantitative behavior over different concentration ranges. Finally, we illustrated the power of this methodology for accurate detection of low-abundant targets in natural communities. We accurately identified Vibrio taxa in coastal marine samples at their natural concentrations (<0.05% of total bacteria), despite the high potential for cross-hybridization by hundreds of different coexisting rRNAs, suggesting this methodology should be expandable to any microarray platform and system requiring accurate identification of low-abundant targets amid pools of similar sequences.
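The deconvolution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a generic non-negative least-squares solver based on multiplicative updates, and it assumes a linear model in which each probe's background-corrected signal is the sum of contributions from all targets. The `affinity` matrix (diagonal entries for specific hybridization, off-diagonal entries for cross-hybridization), the function name `deconvolve`, and all numbers are hypothetical and chosen only for illustration.

```python
def deconvolve(signal, affinity, iterations=5000, eps=1e-12):
    """Estimate non-negative target abundances x such that affinity * x ~ signal.

    signal[i]      -- background-corrected intensity of probe i (hypothetical input)
    affinity[i][j] -- assumed response of probe i to one unit of target j;
                      off-diagonal entries model cross-hybridization
    Generic multiplicative-update non-negative least squares,
    not the method used in the paper.
    """
    n_probes, n_targets = len(affinity), len(affinity[0])
    x = [1.0] * n_targets  # strictly positive starting guess

    # A^T s does not change between iterations, so compute it once.
    at_s = [sum(affinity[i][j] * signal[i] for i in range(n_probes))
            for j in range(n_targets)]

    for _ in range(iterations):
        # Signal predicted by the current abundance estimate: A x
        pred = [sum(affinity[i][j] * x[j] for j in range(n_targets))
                for i in range(n_probes)]
        # A^T (A x)
        at_pred = [sum(affinity[i][j] * pred[i] for i in range(n_probes))
                   for j in range(n_targets)]
        # Multiplicative update keeps every estimate non-negative.
        x = [x[j] * at_s[j] / (at_pred[j] + eps) for j in range(n_targets)]
    return x


if __name__ == "__main__":
    # Hypothetical three-probe, three-target system with a rare second target.
    affinity = [[1.0, 0.3, 0.1],
                [0.2, 1.0, 0.3],
                [0.1, 0.2, 1.0]]
    true_abundance = [100.0, 0.5, 20.0]
    raw = [sum(a * t for a, t in zip(row, true_abundance)) for row in affinity]
    print(deconvolve(raw, affinity))
```

In this toy system the raw signal of the second probe is dominated by cross-hybridization from the two abundant targets, so reading it directly would greatly overestimate the rare target. Solving for all abundances jointly is what separates the specific contribution from the spurious ones. A real analysis would also need noise and background terms, which this sketch omits.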

publication date

  • September 12, 2006

type

  • Academic Article

language

  • eng

PubMed Central ID

  • PMC1559406

Digital Object Identifier (DOI)

  • 10.1073/pnas.0601476103

PubMed ID

  • 16950880

Additional Document Info

start page

  • 13629

end page

  • 34

volume

  • 103

issue

  • 37