MATE-CLEVER: Mendelian-inheritance-aware discovery and genotyping of midsize and long indels. Academic Article

Overview

MeSH

  • Computer Simulation
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Inheritance Patterns

MeSH Major

  • Algorithms
  • Genetic Variation
  • Genome, Human
  • Genotyping Techniques
  • INDEL Mutation
  • Sequence Analysis, DNA

abstract

  • Accurately predicting and genotyping indels longer than 30 bp has remained a central challenge in next-generation sequencing (NGS) studies. While indels of up to 30 bp are reliably processed by standard read aligners and the Genome Analysis Toolkit (GATK), longer indels have still resisted proper treatment. Also, discovering and genotyping longer indels has become particularly relevant owing to the increasing attention in globally concerted projects. We present MATE-CLEVER (Mendelian-inheritance-AtTEntive CLique-Enumerating Variant findER) as an approach that accurately discovers and genotypes indels longer than 30 bp from contemporary NGS reads with a special focus on family data. For enhanced quality of indel calls in family trios or quartets, MATE-CLEVER integrates statistics that reflect the laws of Mendelian inheritance. MATE-CLEVER's performance rates for indels longer than 30 bp are on a par with those of the GATK for indels shorter than 30 bp, achieving up to 90% precision overall, with >80% of calls correctly typed. In predicting de novo indels longer than 30 bp in family contexts, MATE-CLEVER even raises the standards of the GATK. MATE-CLEVER achieves precision and recall of ∼63% on indels of 30 bp and longer versus 55% in both categories for the GATK on indels of 10-29 bp. A special version of MATE-CLEVER has contributed to indel discovery, in particular for indels of 30-100 bp, the 'NGS twilight zone of indels', in the Genome of the Netherlands Project. Availability: http://clever-sv.googlecode.com/

publication date

  • December 15, 2013

has subject area

  • Algorithms
  • Computer Simulation
  • Genetic Variation
  • Genome, Human
  • Genotyping Techniques
  • High-Throughput Nucleotide Sequencing
  • Humans
  • INDEL Mutation
  • Inheritance Patterns
  • Sequence Analysis, DNA

Research

keywords

  • Journal Article

Identity

Language

  • eng

PubMed Central ID

  • PMC3842759

Digital Object Identifier (DOI)

  • 10.1093/bioinformatics/btt556

PubMed ID

  • 24072733

Additional Document Info

start page

  • 3143

end page

  • 3150

volume

  • 29

number

  • 24