Large-scale discovery and characterization of protein regulatory motifs in eukaryotes. Academic Article

Overview

abstract

  • The increasing ability to generate large-scale, quantitative proteomic data has brought with it the challenge of analyzing such data to discover the sequence elements that underlie systems-level protein behavior. Here we show that short, linear protein motifs can be efficiently recovered from proteome-scale datasets such as sub-cellular localization, molecular function, half-life, and protein abundance data using an information theoretic approach. Using this approach, we have identified many known protein motifs, such as phosphorylation sites and localization signals, and discovered a large number of candidate elements. We estimate that ~80% of these are novel predictions in that they do not match a known motif in both sequence and biological context, suggesting that post-translational regulation of protein behavior is still largely unexplored. These predicted motifs, many of which display preferential association with specific biological pathways and non-random positioning in the linear protein sequence, provide focused hypotheses for experimental validation.

publication date

  • December 29, 2010

Research

keywords

  • Amino Acid Motifs
  • Proteomics

Identity

PubMed Central ID

  • PMC3012054

Scopus Document Identifier

  • 78650821937

Digital Object Identifier (DOI)

  • 10.1371/journal.pone.0014444

PubMed ID

  • 21206902

Additional Document Info

volume

  • 5

issue

  • 12