A clustering approach for motif discovery in ChIP-Seq dataset

Chun Xiao Sun, Yu Yang, Hua Wang, Wen Hu Wang

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Chromatin immunoprecipitation combined with next-generation sequencing (ChIP-Seq) has enabled the identification of transcription factor binding sites (TFBSs) on a genome-wide scale. To discover TFBSs effectively and efficiently in the thousand or more DNA sequences of a ChIP-Seq data set, we propose a new algorithm named AP-ChIP. First, we set two thresholds based on probabilistic analysis to construct and then filter the cluster subsets. Next, we apply Affinity Propagation (AP) clustering to the candidate cluster subsets to find potential motifs. Experimental results on simulated data show that AP-ChIP makes nearly accurate predictions of TFBSs in a reasonable time. We also test the validity of AP-ChIP on a real ChIP-Seq data set.
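
The abstract does not state how the two thresholds are derived, so the following is only a minimal sketch of the general idea of building a candidate subset of l-mers before clustering. It assumes the standard planted motif search setting, in which two instances of an (l, d) motif differ from each other in at most 2d positions, and uses 2d as a single illustrative threshold. The function names (`hamming`, `kmers`, `candidate_subset`, `consensus`) and the toy sequences are hypothetical and not taken from AP-ChIP; the Affinity Propagation step itself is not shown.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def kmers(seq: str, l: int):
    """Yield every length-l substring (l-mer) of seq."""
    for i in range(len(seq) - l + 1):
        yield seq[i:i + l]


def candidate_subset(reference: str, sequences: list[str], threshold: int) -> list[str]:
    """From each sequence, take the l-mer closest to `reference` and keep it
    only if it lies within `threshold` mismatches (illustrative filter)."""
    l = len(reference)
    subset = []
    for seq in sequences:
        # default=None covers sequences shorter than l (no l-mers at all)
        best = min(kmers(seq, l), key=lambda m: hamming(reference, m), default=None)
        if best is not None and hamming(reference, best) <= threshold:
            subset.append(best)
    return subset


def consensus(members: list[str]) -> str:
    """Most frequent base in each column; ties are broken alphabetically."""
    return "".join(max(sorted(set(col)), key=col.count) for col in zip(*members))


# Toy data (hypothetical): a 6-mer motif planted with at most d = 1 mutation.
d = 1
reference = "ACGTAC"
sequences = ["TTACGTACGG", "GGACGAACTT", "CCTCGTACAA", "GGGGGGGGGG"]

subset = candidate_subset(reference, sequences, threshold=2 * d)
motif = consensus(subset)
```

In this toy run the last sequence contributes nothing because its closest 6-mer is five mismatches away from the reference. A subset built this way could then be handed to a clustering method such as Affinity Propagation, with a similarity like the negative Hamming distance; that choice is likewise an assumption here.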

Original language: English
Article number: 802
Journal: Entropy
Volume: 21
Issue number: 8
State: Published - 2019

Keywords

  • ChIP-Seq
  • Motif discovery
  • Planted motif search
  • Transcription factor binding sites
