In genetics, an enhancer is a short region of DNA that can be bound by proteins to increase the likelihood that transcription of a particular gene will occur. These proteins are usually referred to as transcription factors.

ChIP-seq is used primarily to determine how transcription factors and other chromatin-associated proteins influence phenotype-affecting mechanisms. Determining how proteins interact with DNA to regulate gene expression is essential for fully understanding many biological processes and disease states. This epigenetic information is complementary to genotype and expression analysis.

ChIP-seq technology

ChIP-seq is currently seen primarily as an alternative to ChIP-chip, which requires a hybridization array. The array necessarily introduces some bias, because coverage is restricted to a fixed number of probes.

Sequencing, by contrast, is thought to have less bias, although the sequencing bias of different sequencing technologies is not yet fully understood. Specific DNA sites in direct physical interaction with transcription factors and other proteins can be isolated by chromatin immunoprecipitation. ChIP produces a library of target DNA sites bound to a protein of interest in vivo.

Massively parallel sequence analyses are used in conjunction with whole-genome sequence databases to analyze the interaction pattern of any protein with DNA, or the pattern of any epigenetic chromatin modification.
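The idea of locating protein-bound sites from sequenced reads can be illustrated with a minimal sketch. It assumes the reads have already been aligned to a reference genome and reduced to start positions on one chromosome. The function names, the window size, the fold threshold, and the toy positions are all hypothetical, and real peak callers use considerably more careful statistics (for example, library-size normalization and significance testing).

```python
from collections import Counter

def count_reads_per_window(read_starts, window_size):
    """Count aligned read start positions falling in each fixed-size window."""
    counts = Counter()
    for pos in read_starts:
        counts[pos // window_size] += 1
    return counts

def enriched_windows(chip_starts, control_starts, window_size=200,
                     min_fold=3.0, pseudocount=1.0):
    """Return (start, end, fold) for windows where ChIP reads exceed control.

    Simplification: read counts are compared directly, without normalizing
    for the total number of reads in each library.
    """
    chip = count_reads_per_window(chip_starts, window_size)
    control = count_reads_per_window(control_starts, window_size)
    hits = []
    for window, n_chip in sorted(chip.items()):
        fold = (n_chip + pseudocount) / (control.get(window, 0) + pseudocount)
        if fold >= min_fold:
            hits.append((window * window_size, (window + 1) * window_size, fold))
    return hits

# Made-up read start positions, for illustration only.
chip_reads = [105, 120, 130, 150, 160, 199, 450, 900, 910]
control_reads = [110, 460, 905, 1500]
peaks = enriched_windows(chip_reads, control_reads)
print(peaks)
```

In this toy input only the first window accumulates enough ChIP reads relative to the control to pass the threshold.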

Studying enhancers helps researchers understand the process of gene expression. An enhancer's function does not depend on its distance from, or its orientation relative to, the target gene. This makes enhancers difficult to locate. New technologies such as ChIP-seq (chromatin immunoprecipitation sequencing) are emerging, through which enhancers can be predicted.

Most methods rely on p300 binding sites and DNase I hypersensitive sites (DHSs) to collect positive training samples. These samples are sometimes imprecise, which leads to unsatisfactory prediction performance.
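As a rough illustration of how such positive samples might be assembled, the sketch below keeps only the p300 peaks that overlap at least one DHS. Requiring an overlap is an assumption made here for illustration, not a rule taken from any particular published method. The interval format (chromosome, start, end; half-open), the function names, and the coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def positive_candidates(p300_peaks, dhs_sites):
    """Keep p300 peaks supported by at least one overlapping DHS."""
    return [peak for peak in p300_peaks
            if any(overlaps(peak, dhs) for dhs in dhs_sites)]

# Made-up coordinates, for illustration only.
p300 = [("chr1", 1000, 1400), ("chr1", 5000, 5300), ("chr2", 200, 600)]
dhs = [("chr1", 1300, 1700), ("chr2", 700, 900)]
positives = positive_candidates(p300, dhs)
print(positives)
```

Any imprecision in the underlying peak or DHS calls carries straight through to the training set, which is the weakness noted above.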

In this article, a method based on support vector machines is proposed to investigate the presence of enhancers in cell lines and tissues using Enhancer Atlas. Enhancer prediction was performed in models of diseases related to heart and lung tissues. Experimental results show that the proposed method outperformed earlier state-of-the-art methods on the specific cell lines. The findings also indicate that predicting enhancers is much easier in adult or young tissue samples than in fetal samples.
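The general shape of a support vector machine classifier for sequences can be sketched as follows. This is not the article's method: the features (dinucleotide frequencies), the trainer (plain subgradient descent on the regularized hinge loss, with no bias term for brevity), the hyperparameters, and the toy sequences are all assumptions chosen to keep the example self-contained.

```python
from itertools import product

def kmer_features(seq, k=2):
    """Return k-mer frequencies of seq in a fixed lexicographic k-mer order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = max(len(seq) - k + 1, 1)
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        km = seq[i:i + k]
        if km in counts:
            counts[km] += 1
    return [counts[km] / total for km in kmers]

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            for j in range(len(w)):
                # The hinge term contributes only when the margin is violated.
                hinge = yi * xi[j] if margin < 1 else 0.0
                w[j] -= lr * (lam * w[j] - hinge)
    return w

def predict(w, x):
    """Return +1 (enhancer-like) or -1 according to the sign of w . x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Made-up toy sequences: GC-rich "enhancers" versus AT-rich background.
positives = ["GCGCGGCCGCGC", "CCGGCGCGGCCG", "GGCCGCGCCGGC"]
negatives = ["ATATTAATATTA", "TTAATATAATAT", "AATTATATTAAT"]
X = [kmer_features(s) for s in positives + negatives]
y = [1] * len(positives) + [-1] * len(negatives)
w = train_linear_svm(X, y)
print(predict(w, kmer_features("CGCGCCGGCGCG")),
      predict(w, kmer_features("TATAATTATATA")))
```

A real study would draw labelled sequences from a curated resource and evaluate on held-out data. The toy classes here are trivially separable and say nothing about actual enhancer composition.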