Predicting Regulatory Regions and their Target Genes to Enhance Genomic Data Analysis
Recent advances in next-generation sequencing have produced a massive collection of regulatory genomic datasets, including genome-wide measurements of cell-line-specific chromatin states, where the chromatin state is defined by the collection of histone modifications, DNA methylation, open chromatin maps, and nucleosome positioning. Such datasets are now available for integration into annotation pipelines to define regulatory elements, such as enhancers, promoters, and insulators, that together control the expression of a gene. A major challenge is to characterize these regulatory elements and to connect them to morphological, physiological (e.g., diseases), and molecular (e.g., cell-type-specific gene expression levels) phenotypes. To this end, one of the goals of the Epigenome-Based Cellular Phenotyping (EBCP) project of the CPCP (S. Keles and S. Roy, project leads) is to develop novel computational methods to link regulatory elements to genes.