Stefan Bekiranov, PhD
Physical modeling of microarray hybridization; Analysis of genomic tiling array data; Bioinformatics; Computational biology; Regulatory networks
We are a computational biology lab that works on elucidating the complex cross-regulation of histone modifications and their subsequent regulation of transcription and DNA replication. In collaboration with Dr. Marty Mayo’s lab, we are studying the regulatory network that drives the epithelial-to-mesenchymal transition (EMT) in non-small cell lung cancer (NSCLC). Lung cancer is the number one malignancy in the U.S., causing over 160,000 deaths last year. More than eighty percent of these malignancies are NSCLC. EMT is believed to be essential to carcinoma progression, causing disruption of intracellular architecture, decreased cell-cell adhesion, and loss of cellular polarization. Carcinomas that have undergone EMT are characteristically more motile, invasive and metastatic. Although EMT is believed to be an important step in cancer progression, little is known about the regulatory program that produces the phenotypic properties associated with mesenchymal cancer cells.
We discovered that de-differentiation from epithelial to mesenchymal cancer cells results in extensive epigenomic reprogramming, with 10,000 genes differentially expressed. Given the extensive upregulation of gene expression during EMT, we hypothesized that this reprogramming must require altered expression of the epigenomic effectors responsible for regulating covalent modifications on histones. Indeed, mesenchymal cancer cells upregulate genes that encode enzymes that establish covalent modifications (writers), enzymes that remove specific post-translational modifications (erasers), and proteins with PHD, Chromo and Tudor domains that recognize the modifications (readers). Many of these effector proteins are associated with tumor promotion via repression of tumor suppressor genes. We found upregulated genes encoding proteins that alter the mono-, di- and tri-methylation status of histone H3 K4, K9, K27, and K36 as well as H4K20. Consistent with the importance of H3K27me3 for gene silencing and maintenance of stem cell properties, we found that mesenchymal cancer cells upregulated Ezh1, which writes H3K27me3; members of the PRC1 complex, which read H3K27me3; and JMJD3, the enzyme that removes H3K27me3. Hallmarks of EMT are also upregulated, including extracellular matrix proteins, cytokines/chemokines, metabolic enzymes, and the master-switch transcription factors Twist, Snail, Slug, and SIP1. Guided by these results, we are currently mapping a number of histone modifications, including methylated and acetylated residues, the histone variants H2A.Z and H3.3, and the transcription factors Twist, Snail, Slug, SIP1 and NF-κB, using chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-Seq) to characterize critical components of the regulatory network driving EMT.
Recent studies increasingly portray the regulation of histone modifications as a dependency network. For example, H2B ubiquitination triggers methylation of H3K4 and H3K79. Deubiquitination of H2B leads to tri-methylation of H3K36 in open reading frames. In budding yeast, H3K14 acetylation by the NuA3 histone acetyltransferase requires H3K4 and H3K36 to be methylated. We hypothesize that by applying powerful machine learning methods to genome-wide maps of EMT regulatory factors, we will be able to characterize this complex epigenomic regulatory network. Specifically, we are applying two methods: Bayesian Networks (BN) and Multivariate Adaptive Regression Splines (MARS). A BN is a probabilistic graphical model that represents a joint probability distribution as a product of conditional probabilities. The structure of this product is represented by a directed acyclic graph (DAG), that is, a directed graph with no cycles. Arrows in the DAG can represent causal dependencies between variables (nodes). The variables (nodes) in our study are the regulatory factors, including histone modifications/variants, transcription factors, DNA methylation and gene expression levels. We are interested in learning the structure of the BN from the data. MARS is a generalization of an additive linear model that accounts for non-linear relationships between input (e.g., mark levels) and output (e.g., gene expression levels) variables. The main goal of this project is to build and test predictive epigenomic regulatory models of EMT. A significant advantage of applying predictive modeling to this system is that the gene expression response of critical EMT markers during the transition is known. This allows us to rank regulatory component knockdowns according to their predicted impact on expression of these EMT markers and to test whether we observe the expected phenotypic outcomes.
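As a rough illustration of how a BN factorizes a joint distribution along a DAG, and of how such a model could be queried for the predicted effect of a knockdown, here is a minimal Python sketch. It is not our actual model: the three-node chain (H2Bub, H3K4me3, expr), the binary on/off states, and every probability value are hypothetical, chosen only to echo the H2B ubiquitination example above. The structure is fixed by hand here, whereas in our work it is learned from the data.

```python
from itertools import product

# Hypothetical three-node chain: H2Bub -> H3K4me3 -> expr.
# Node names and all probabilities are illustrative assumptions.
PARENTS = {
    "H2Bub": (),
    "H3K4me3": ("H2Bub",),
    "expr": ("H3K4me3",),
}

# P(node = 1 | parent states), keyed by the tuple of parent states.
P_ON = {
    "H2Bub": {(): 0.4},
    "H3K4me3": {(0,): 0.1, (1,): 0.8},
    "expr": {(0,): 0.2, (1,): 0.9},
}


def joint(assign, clamped=None):
    """Joint probability of one full assignment as a product of
    per-node conditional probabilities. A clamped node (a crude stand-in
    for a knockdown) has its factor replaced by an indicator."""
    clamped = clamped or {}
    prob = 1.0
    for node, parents in PARENTS.items():
        if node in clamped:
            prob *= 1.0 if assign[node] == clamped[node] else 0.0
            continue
        key = tuple(assign[p] for p in parents)
        p_on = P_ON[node][key]
        prob *= p_on if assign[node] == 1 else 1.0 - p_on
    return prob


def prob_on(target, clamped=None):
    """P(target = 1), summing the joint over all binary assignments."""
    nodes = list(PARENTS)
    total = 0.0
    for states in product((0, 1), repeat=len(nodes)):
        assign = dict(zip(nodes, states))
        if assign[target] == 1:
            total += joint(assign, clamped)
    return total


baseline = prob_on("expr")
knockdown = prob_on("expr", clamped={"H2Bub": 0})
print(f"P(expr on), baseline:           {baseline:.3f}")
print(f"P(expr on), H2Bub clamped off:  {knockdown:.3f}")
```

Comparing the baseline with the clamped value gives a predicted impact for that one perturbation; repeating the comparison for each candidate factor is one simple way such predictions could be ranked. Exhaustive enumeration is only practical for toy networks of this size.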
Through our collaboration with the Mayo lab, we are currently testing the hypotheses that these regulatory network models can: 1) identify functional dependencies between regulatory elements, and 2) pinpoint candidate enzymes required to achieve epigenetic reprogramming in mesenchymal lung cancer cells. Specifically, we are testing these models via siRNA knockdown of predicted EMT regulatory network components followed by qRT-PCR, Western blot, microarray and ChIP-Seq. We are also testing the importance of predicted regulatory factors for the induction and maintenance of mesenchymal cancer cells. To accomplish this, we perform four biological assays: 1) migration, 2) invasion, 3) wound healing, and 4) secretion of mesenchymal biomarkers. This work will potentially lead to the discovery of master regulators of EMT and novel targets for the diagnosis and treatment of lung cancer.