Genomic Engineering Group / InteLAB
Friday, 24 November 2017
Comparative Genomics

Fragile X syndrome

Fragile X syndrome is a frequent cause of mental retardation resulting from the absence of Fragile X Mental Retardation Protein (FMRP), encoded by the fmr1 gene. FMRP contains RNA-binding motifs (KH domains and an RGG box), as well as a nuclear localization signal (NLS) and a nuclear export signal (NES). The presence of these motifs suggests that FMRP might function in RNA transport from the nucleus to the cytoplasm. Some studies suggest that FMRP is also involved in the translation of certain mRNAs, mainly at synapses.

As part of our efforts to understand the physiological functions of the fmr1 gene, we studied FMRP expression and the regulatory 5’ untranslated region. Other studies have examined the interaction of transcription factors with the fmr1 gene, but only within the promoter region. We used comparative genomics and pattern-recognition programs (the Vista tools) to characterize possible binding sites in a 111 kbp sequence upstream of the gene and to determine which transcription factors could be relevant for fmr1 expression. The results showed several DNA segments conserved between human and Drosophila that are strong candidate transcription factor binding sites (TFBSs). CEBP, FOXD3, HNF3B, NKX2.5, NKX6.2, OCT1, PAX4 and STAT1 sites are the most frequent in the 111 kbp regulatory region (5’-UTR). Alignment between human and other mammalian species revealed further sequences, containing 109 TFBSs in the most conserved regions. In the mouse alignment, the most conserved and clustered sites are those for CEBP, GATA1 and STAT1, with seven sites in a 100 bp window, followed by those for CETS1P54, GATA2, GATA3, OCT1, PAX and STAT5A, with five sites in 100 bp. NKX6.2 sites cluster as four sites per 100 bp in the mouse alignment. Factors such as STAT1, NKX6.2 and OCT1, among others, are expressed in neurons and are involved in brain development. This is evidence that these transcription factors may be involved in the expression of FMRP in cerebral cells. Our analysis identified possible TFBSs that should now be investigated experimentally to establish their role in the mechanism of fmr1 expression.


Identification of Human TFBSs using the Drosophila melanogaster dfxr Gene by in silico Footprinting
Mutations in the fmr1 gene are the main cause of Fragile X Syndrome (FXS). The underlying mutation was identified as a CGG repeat expansion in the regulatory region of fmr1. This expansion may cause transcriptional silencing and loss of the gene product, FMRP. FMRP is an RNA-binding protein that contains two ribonucleoprotein K homology (KH) domains and an arginine- and glycine-rich domain. Drosophila melanogaster has a single FXR-related gene, dfxr (also named dfmr1), that is a homolog of the mammalian fmr1/fxr gene family (fmr1, fxr1 and fxr2). The protein encoded by dfxr (dFXR) has all the functional motifs found in the human FXR proteins, and dFXR mutant phenotypes are consistent with the synaptic defects seen in FXS patients. These observations are an important argument for using flies as a model to study FXS.
In silico or phylogenetic footprinting is a technique that compares genomic sequences across species to predict gene regulatory sites. The approach rests on the idea that transcription factor binding sites (TFBSs) are preferentially conserved over the course of evolution, so identifying them can give important insight into the mechanisms of fmr1 gene regulation. Some studies have experimentally identified TFBSs in the promoter region of the fmr1 gene. Our aim is to study the complete regulatory 5’UTR using computational phylogenetic footprinting tools.
Computational (in silico) screening of TFBSs may support future in vivo determination of TF binding sites (in vivo footprinting), so we performed a global pairwise alignment of the human fmr1 gene, chrX_146588158_146738157 (150,000 bp), with the dfxr gene (including its complete 5’UTR). The alignment was made with the AVID program and mVISTA, with the sequences masked to improve the alignment. Possible TFBSs were analyzed with rVISTA, a computational tool that predicts sites with the Match program, based on TRANSFAC Professional 7.4, and then determines which of the predicted sites are aligned and conserved between the species in the alignment. Core similarity was set to 0.75 and the matrix similarity cut-off to 0.85. We found 64 possible TFBSs in the studied region of fmr1. To analyze the results, we examined TFBS clustering: genes are often regulated by multiple transcription factors, so potential TFBSs tend to be clustered or adjacent to each other. Individual clustering with the 2–100 setting retained only 15 TFBSs. We also increased the matrix similarity to find the most conserved TFBSs. The remaining factors are now being examined to identify which ones are already linked to gene expression in neurons of the central nervous system.
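The filtering step described above can be illustrated with a minimal sketch. It assumes the predicted sites have already been exported into simple records; the `PredictedSite` class, its field names and the toy values are hypothetical and do not reproduce the rVISTA or Match output format. Only the two cut-offs (0.75 and 0.85) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictedSite:
    """One predicted binding site (hypothetical record layout)."""
    factor: str               # transcription factor name, e.g. "STAT1"
    start: int                # start coordinate on the human sequence
    core_similarity: float    # score of the matrix core, 0..1
    matrix_similarity: float  # score of the whole matrix, 0..1
    conserved: bool           # True if aligned and conserved in the other species


def filter_sites(sites, min_core=0.75, min_matrix=0.85):
    """Keep sites that pass both score cut-offs and are conserved."""
    return [
        s for s in sites
        if s.conserved
        and s.core_similarity >= min_core
        and s.matrix_similarity >= min_matrix
    ]


# Toy data, invented for illustration only.
sites = [
    PredictedSite("STAT1", 120, 0.90, 0.88, True),    # passes all filters
    PredictedSite("OCT1", 450, 0.70, 0.91, True),     # core similarity too low
    PredictedSite("PAX4", 800, 0.95, 0.80, True),     # matrix similarity too low
    PredictedSite("NKX6.2", 990, 0.99, 0.97, False),  # not conserved
]

kept = filter_sites(sites)
print([s.factor for s in kept])
```

Raising `min_matrix` in this sketch corresponds to the step of increasing matrix similarity to retain only the best-matching sites.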
Identifying the transcription factors that regulate gene expression, and the corresponding transcription factor binding sites (TFBSs) within the DNA sequence, is an essential step toward understanding gene function. The results of this study can help us predict TFBSs with considerable accuracy and use them to better plan experimental investigation, hopefully improving our understanding of fmr1 gene regulation.


Comparative Genomics of the fmr1 Gene Regulatory Sequence for Transcription Factor Binding Sites Analysis
As part of our efforts to understand the physiological functions of the fmr1 gene, we studied FMRP expression and the regulatory 5’ untranslated region of the fmr1 gene. Other studies have examined the interaction of transcription factors with the fmr1 gene, but only within the promoter region.
Here we used comparative genomics and pattern-recognition programs (the Vista tools) to characterize possible binding sites in a 111 kbp sequence upstream of the gene and to determine which transcription factors could be relevant for fmr1 expression.
The results showed several DNA segments conserved between human and Drosophila that are strong candidate transcription factor binding sites (TFBSs). CEBP, FOXD3, HNF3B, NKX2.5, NKX6.2, OCT1, PAX4 and STAT1 sites are the most frequent in the 111 kbp regulatory region (5’-UTR). Alignment between human and other mammalian species revealed further sequences, containing 109 TFBSs in the most conserved regions. In the mouse alignment, the most conserved and clustered sites are those for CEBP, GATA1 and STAT1, with seven sites in a 100 bp window, followed by those for CETS1P54, GATA2, GATA3, OCT1, PAX and STAT5A, with five sites in 100 bp. NKX6.2 sites cluster as four sites per 100 bp in the mouse alignment.
Factors such as STAT1, NKX6.2 and OCT1, among others, are expressed in neurons and are involved in brain development. This is evidence that these transcription factors may be involved in the expression of FMRP in cerebral cells. Our analysis identified possible TFBSs that should now be investigated experimentally to establish their role in the mechanism of fmr1 expression.
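The "sites per 100 bp window" figures above can be computed with a simple sliding window. The sketch below is an illustration under assumptions, not the clustering routine of the Vista tools: the function names and the toy coordinates are hypothetical, and a site is counted by its start position only.

```python
from collections import defaultdict


def max_sites_in_window(positions, window=100):
    """Largest number of site start positions that fall within any span of `window` bp."""
    pos = sorted(positions)
    best = 0
    left = 0
    for right, p in enumerate(pos):
        # Shrink the window from the left until it spans fewer than `window` bp.
        while p - pos[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best


def clustered_factors(sites, min_sites=4, window=100):
    """Map each factor to its densest cluster size, keeping those with >= min_sites."""
    by_factor = defaultdict(list)
    for factor, start in sites:
        by_factor[factor].append(start)
    result = {}
    for factor, positions in by_factor.items():
        n = max_sites_in_window(positions, window)
        if n >= min_sites:
            result[factor] = n
    return result


# Toy (factor, start) pairs, invented for illustration only.
toy_sites = [
    ("STAT1", 10), ("STAT1", 35), ("STAT1", 60), ("STAT1", 95), ("STAT1", 400),
    ("OCT1", 20), ("OCT1", 500),
]

print(clustered_factors(toy_sites))  # {'STAT1': 4}
```

Counting by start position keeps the sketch short; a stricter variant could require the whole site, start to end, to lie inside the window.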


Transcriptional Regulation Biological Model of the fmr1 Gene
The functions of the Fragile X Mental Retardation Protein, FMRP, in the organism are still unknown; however, structural evidence suggests that the protein is involved in nuclear export, cytoplasmic transport, and/or translational control of target mRNAs. The transport of mRNA from the nucleus to distal dendrites, followed by protein synthesis at postsynaptic sites, is involved in synaptic plasticity and is one of the ways in which protein expression is regulated in relation to neuronal development and behavior.
The expression of the fmr1 gene is also modulated (activated or inhibited) by transcription factors (TFs) binding to specific DNA sequences. Combinations of TFs bound at different places in the regulatory region can modulate transcription in specific tissues or at different developmental stages. Regulatory regions can be located next to the gene, at its 5' and 3' ends, in intronic regions, or even far away from the gene, several kbp upstream of the transcription start.
Identifying the transcription factors that regulate a gene's expression is a step toward understanding its transcription rules and its function in the organism. We are interested in the transcription rules of fmr1. Using the Vista programs, we compared the regulatory DNA sequence of the human fmr1 gene with homologous sequences from other mammals, looking for conserved transcription factor binding sites (TFBSs), which are important evidence of which proteins may be involved in regulating fmr1 transcription.
Our comparative results showed many TFBSs in the human fmr1 regulatory region and, consequently, many possible regulatory proteins that bind the DNA sequence and modulate fmr1 expression. We analyzed TFBS clustering to select the best candidate regulatory proteins. Clustering raises the likelihood that a predicted site is a genuine binding site, because gene expression is frequently regulated by multiple transcription factors whose sites lie close to one another.
Sites for the OCT1, NKX6.2, PAX4, STAT1 and STAT5A proteins were found clustered as four sites in a 100 bp window, which suggests that these five proteins, among others, may be involved in regulating fmr1 expression. From these findings, together with in vivo experimental evidence, we established a biological model of transcriptional regulation that describes some of the rules of fmr1 transcription.
Our results can contribute to a better understanding of FMRP function and its role in brain development. They also suggest that computational methods can be highly useful in biological and medical research on FXS.

 