An independent component analysis confounding factor correction framework for identifying broad impact expression quantitative trait loci
Genome-Wide Association Study
Quantitative Trait Loci
Genome-wide expression Quantitative Trait Loci (eQTL) studies in humans have provided numerous insights into the genetics of both gene expression and complex diseases. While the majority of eQTL identified in genome-wide analyses impact a single gene, eQTL that impact many genes are particularly valuable for network modeling and disease analysis. To enable the identification of such broad impact eQTL, we introduce CONFETI: Confounding Factor Estimation Through Independent component analysis. CONFETI is designed to address two conflicting issues when searching for broad impact eQTL: the need to account for non-genetic confounding factors that can lower the power of the analysis or produce broad impact eQTL false positives, and the tendency of methods that account for confounding factors to model broad impact eQTL as non-genetic variation. The key advance of the CONFETI framework is the use of Independent Component Analysis (ICA) to identify variation likely caused by broad impact eQTL when constructing the sample covariance matrix used for the random effect in a mixed model. We show that CONFETI has better performance than other mixed model confounding factor methods when considering broad impact eQTL recovery from synthetic data. We also used the CONFETI framework and these same confounding factor methods to identify eQTL that replicate between matched twin pair datasets in the Multiple Tissue Human Expression Resource (MuTHER), the Depression Genes Networks study (DGN), the Netherlands Study of Depression and Anxiety (NESDA), and multiple tissue types in the Genotype-Tissue Expression (GTEx) consortium. These analyses identified both cis-eQTL and trans-eQTL impacting individual genes, and CONFETI had better or comparable performance to other mixed model confounding factor analysis methods when identifying such eQTL. In these analyses, we were able to identify and replicate a few broad impact eQTL although the overall number was small even when applying CONFETI. In light of these results, we discuss the broad impact eQTL that have been previously reported from the analysis of human data and suggest that considerable caution should be exercised when making biological inferences based on these reported eQTL.