Quantifying the impact of non-coding variants on transcription factor-DNA binding

  • Jingkang Zhao
  • Dongshunyi Li
  • Jungkyun Seo
  • Andrew S. Allen
  • Raluca Gordân

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Many recent studies have emphasized the importance of genetic variants and mutations in cancer and other complex human diseases. The overwhelming majority of these variants occur in non-coding portions of the genome, where they can have a functional impact by disrupting regulatory interactions between transcription factors (TFs) and DNA. Here, we present a method for assessing the impact of non-coding mutations on TF-DNA interactions, based on regression models of DNA-binding specificity trained on high-throughput in vitro data. We use ordinary least squares (OLS) to estimate the parameters of the binding model for each TF, and we show that our predictions of TF binding changes due to DNA mutations correlate well with measured changes in gene expression. In addition, by leveraging distributional results associated with OLS estimation, for each predicted change in TF binding we also compute a normalized score (z-score) and a significance value (p-value) reflecting our confidence that the mutation affects TF binding. We use this approach to analyze a large set of pathogenic non-coding variants, and we show that these variants lead to significant differences in TF binding between alleles, compared to a control set of common variants. Thus, our results indicate that there is a strong regulatory component to the pathogenic non-coding variants identified thus far.
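The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the k-mer count features, the value of `K`, the synthetic training data, and all function names are assumptions made for illustration. It fits an OLS model, then uses the standard OLS covariance of the estimated coefficients to attach a z-score and a two-sided p-value (normal approximation) to the predicted binding difference between two alleles.

```python
import itertools
import math
import random

K = 2  # k-mer length (assumption; kept small so the example runs quickly)
KMERS = ["".join(p) for p in itertools.product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_counts(seq):
    """Encode a DNA sequence as a vector of k-mer counts (hypothetical feature map)."""
    x = [0.0] * len(KMERS)
    for i in range(len(seq) - K + 1):
        x[INDEX[seq[i:i + K]]] += 1.0
    return x


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; add data or reduce features")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_ols(X, y):
    """Return (beta, sigma2, (X^T X)^-1) for the model y = X beta + noise."""
    n, p = len(X), len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * t for row, t in zip(X, y)) for i in range(p)]
    xtx_inv = invert(xtx)
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    rss = sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, t in zip(X, y))
    return beta, rss / (n - p), xtx_inv  # unbiased residual variance estimate


def variant_effect(ref, alt, beta, sigma2, xtx_inv):
    """Predicted binding change (alt - ref), its z-score, and a two-sided p-value."""
    d = [a - r for a, r in zip(kmer_counts(alt), kmer_counts(ref))]
    delta = sum(di * b for di, b in zip(d, beta))
    # Var(d . beta_hat) = sigma^2 * d^T (X^T X)^-1 d
    var = sigma2 * sum(d[i] * xtx_inv[i][j] * d[j]
                       for i in range(len(d)) for j in range(len(d)))
    if var <= 0.0:  # identical feature vectors: no predicted change
        return delta, 0.0, 1.0
    z = delta / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return delta, z, p


# Synthetic stand-in for high-throughput in vitro binding data (assumption).
rng = random.Random(0)
true_beta = [rng.gauss(0.0, 1.0) for _ in KMERS]
seqs = ["".join(rng.choice("ACGT") for _ in range(12)) for _ in range(400)]
X = [kmer_counts(s) for s in seqs]
y = [sum(b * v for b, v in zip(true_beta, x)) + rng.gauss(0.0, 0.5) for x in X]

beta, sigma2, xtx_inv = fit_ols(X, y)
delta, z, p = variant_effect("ACGTACGTACGT", "ACGTACTTACGT", beta, sigma2, xtx_inv)
print(f"delta={delta:.3f}  z={z:.2f}  p={p:.3g}")
```

Only the k-mers overlapping the mutated position differ between the two alleles, so the difference vector `d` is sparse and the variance term depends only on the corresponding entries of the coefficient covariance matrix.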

Original language: English
Title of host publication: Research in Computational Molecular Biology - 21st Annual International Conference, RECOMB 2017, Proceedings
Editors: S. Cenk Sahinalp
Publisher: Springer Verlag
Pages: 336-352
Number of pages: 17
ISBN (Print): 9783319569697
State: Published - 2017
Externally published: Yes
Event: 21st Annual International Conference on Research in Computational Molecular Biology, RECOMB 2017 - Hong Kong, China
Duration: 3 May 2017 to 7 May 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10229 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st Annual International Conference on Research in Computational Molecular Biology, RECOMB 2017
Country/Territory: China
City: Hong Kong
Period: 3/05/17 to 7/05/17

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Non-coding variants
  • Regression models
  • TF-DNA binding
