56 research outputs found
Statistical method for modeling sequencing data from different technologies in longitudinal studies with application to Huntington disease
Advancement of gene expression measurements in longitudinal studies enables the identification of genes associated with disease severity over time. However, problems arise when the technology used to measure gene expression differs between time points. Observed differences between the results obtained at different time points can be caused by technical differences. Modeling the two measurements jointly over time might provide insight into the causes of these different results. Our work is motivated by a study of gene expression data of blood samples from Huntington disease patients, which were obtained using two different sequencing technologies. At time point 1, DeepSAGE technology was used to measure the gene expression, with a subsample also measured using RNA-Seq technology. At time point 2, all samples were measured using RNA-Seq technology. Significant associations between gene expression measured by DeepSAGE and disease severity using data from the first time point could not be replicated by the RNA-Seq data from the second time point. We modeled the relationship between the two sequencing technologies using the data from the overlapping samples. We used linear mixed models with either DeepSAGE or RNA-Seq measurements as the dependent variable and disease severity as the independent variable. In conclusion, (1) for one out of 14 genes, the initial significant result could be replicated with both technologies using data from both time points; (2) statistical efficiency is lost due to disagreement between the two technologies, measurement error when predicting gene expressions, and the need to include additional parameters to account for possible differences
Single-cell immune profiling reveals thymus-seeding populations, T cell commitment, and multilineage development in the human thymus
T cell development in the mouse thymus has been studied extensively, but less is known regarding T cell development in the human thymus. We used a combination of single-cell techniques and functional assays to perform deep immune profiling of human T cell development, focusing on the initial stages of prelineage commitment. We identified three thymus-seeding progenitor populations that also have counterparts in the bone marrow. In addition, we found that the human thymus physiologically supports the development of monocytes, dendritic cells, and NK cells, as well as limited development of B cells. These results are an important step toward monitoring and guiding regenerative therapies in patients after hematopoietic stem cell transplantation
Exacerbated inflammatory signaling underlies aberrant response to BMP9 in pulmonary arterial hypertension lung endothelial cells
Imbalanced transforming growth factor beta (TGFβ) and bone morphogenetic protein (BMP) signaling are postulated to favor a pathological pulmonary endothelial cell (EC) phenotype in pulmonary arterial hypertension (PAH). BMP9 is shown to reinstate BMP receptor type-II (BMPR2) levels and thereby mitigate hemodynamic and vascular abnormalities in several animal models of pulmonary hypertension (PH). Yet, responses of the pulmonary endothelium of PAH patients to BMP9 are unknown. Therefore, we treated primary PAH patient-derived and healthy pulmonary ECs with BMP9 and observed that stimulation induces transient transcriptional signaling associated with the process of endothelial-to-mesenchymal transition (EndMT). However, solely PAH pulmonary ECs showed signs of a mesenchymal trans-differentiation characterized by a loss of VE-cadherin, induction of transgelin (SM22α), and reorganization of the cytoskeleton. In the PAH cells, a prolonged EndMT signaling was found accompanied by sustained elevation of pro-inflammatory, pro-hypoxic, and pro-apoptotic signaling. Herein we identified interleukin-6 (IL6)-dependent signaling to be the central mediator required for the BMP9-induced phenotypic change in PAH pulmonary ECs. Furthermore, we were able to target the BMP9-induced EndMT process by an IL6 capturing antibody that normalized autocrine IL6 levels, prevented mesenchymal transformation, and maintained a functional EC phenotype in PAH pulmonary ECs. In conclusion, our results show that the BMP9-induced aberrant EndMT in PAH pulmonary ECs is dependent on exacerbated pro-inflammatory signaling mediated through IL6
CpG Deamination Creates Transcription Factor–Binding Sites with High Efficiency
The formation of new transcription factor–binding sites (TFBSs) has a major impact on the evolution of gene regulatory networks. Clearly, single nucleotide mutations arising within genomic DNA can lead to the creation of TFBSs. Are molecular processes inducing single nucleotide mutations contributing equally to the creation of TFBSs? In the human genome, a spontaneous deamination of methylated cytosine in the context of CpG dinucleotides results in the creation of thymine (C → T), and this mutation has the highest rate among all base substitutions. CpG deamination has been ascribed a role in silencing of transposons and induction of variation in regional methylation. We have previously shown that CpG deamination created thousands of p53-binding sites within genomic sequences of Alu transposons. Interestingly, we have defined a ∼30 bp region in Alu sequence, which, depending on a pattern of CpG deamination, can be converted to functional p53-, PAX-6-, and Myc-binding sites. Here, we have studied single nucleotide mutational events leading to creation of TFBSs in promoters of human genes and in genomic regions bound by such key transcription factors as Oct4, NANOG, and c-Myc. We document that CpG deamination events can create TFBSs with much higher efficiency than other types of mutational events. Our findings add a new role to CpG methylation: We propose that deamination of methylated CpGs constitutes one of the evolutionary forces acting on mutational trajectories of TFBSs formation contributing to variability in gene regulation
Transcriptional Autoregulatory Loops Are Highly Conserved in Vertebrate Evolution
BACKGROUND: Feedback loops are the simplest building blocks of transcriptional regulatory networks, and their behavior in the course of evolution is therefore of prime interest. METHODOLOGY: We address the question of whether autoregulatory feedback loops are enriched in higher organisms. First, we count the number of autoregulatory loops based on predicted autoregulatory binding sites. We then compare this count to estimates obtained either by assuming that each (conserved) gene has the same chance of being a target of a given factor or by assuming that each conserved sequence position has an equal chance of being a binding site of the factor. CONCLUSIONS: We demonstrate that the numbers of putative autoregulatory loops conserved between human and fugu, danio, or chicken are significantly higher than expected. Moreover, we show that conserved autoregulatory binding sites cluster close to the factors' transcription start sites. We conclude that transcriptional autoregulatory feedback loops constitute a core transcriptional network motif and that their conservation has been maintained throughout the evolution of higher vertebrates
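The first null model in the abstract above (every gene equally likely to be a target of a given factor) can be illustrated with a minimal Python sketch. This is one possible reading of the abstract and not the authors' implementation: the function names, the input layout, and all numbers are hypothetical. Under this reading, a factor with t predicted targets among N genes regulates its own gene with probability t/N, so the expected loop count is the sum of t/N over all factors.

```python
def expected_autoregulatory_loops(targets_per_factor, n_genes):
    """Expected loop count if each gene is equally likely to be a target.

    targets_per_factor: dict mapping a factor name to its number of
    predicted target genes (hypothetical input format).
    n_genes: total number of genes considered.
    """
    # A factor with t targets hits its own gene with probability t / n_genes.
    return sum(t / n_genes for t in targets_per_factor.values())


def fold_enrichment(observed, expected):
    """Ratio of observed to expected loop counts."""
    return observed / expected


# Made-up numbers, for illustration only.
targets = {"TF_A": 200, "TF_B": 50, "TF_C": 750}
expected = expected_autoregulatory_loops(targets, n_genes=10_000)
enrichment = fold_enrichment(observed=2, expected=expected)
```

A ratio well above 1 would point in the direction the abstract reports, although an actual analysis would also need a significance test, which this sketch omits.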
Identification of Y-Box Binding Protein 1 As a Core Regulator of MEK/ERK Pathway-Dependent Gene Signatures in Colorectal Cancer Cells
Transcriptional signatures are an indispensable source of correlative information on disease-related molecular alterations on a genome-wide level. Numerous candidate genes involved in disease and in factors of predictive, as well as of prognostic, value have been deduced from such molecular portraits, e.g. in cancer. However, mechanistic insights into the regulatory principles governing global transcriptional changes are lagging behind extensive compilations of deregulated genes. To identify regulators of transcriptome alterations, we used an integrated approach combining transcriptional profiling of colorectal cancer cell lines treated with inhibitors targeting the receptor tyrosine kinase (RTK)/RAS/mitogen-activated protein kinase pathway, computational prediction of regulatory elements in promoters of co-regulated genes, and chromatin-based and functional cellular assays. We identified commonly co-regulated, proliferation-associated target genes that respond to the MAPK pathway. We recognized E2F and NFY transcription factor binding sites as prevalent motifs in those pathway-responsive genes and confirmed the predicted regulatory role of Y-box binding protein 1 (YBX1) by reporter gene, gel shift, and chromatin immunoprecipitation assays. We also validated the MAPK-dependent gene signature in colorectal cancers and provided evidence for the association of YBX1 with poor prognosis in colorectal cancer patients. This suggests that MEK/ERK-dependent, YBX1-regulated target genes are involved in executing malignant properties
Skewed X-inactivation is common in the general female population
X-inactivation is a well-established dosage compensation mechanism ensuring that X-chromosomal genes are expressed at comparable levels in males and females. Skewed X-inactivation is often explained by negative selection of one of the alleles. We demonstrate that imbalanced expression of the paternal and maternal X-chromosomes is common in the general population and that the random nature of the X-inactivation mechanism can be sufficient to explain the imbalance. To this end, we analyzed blood-derived RNA and whole-genome sequencing data from 79 female children and their parents from the Genome of the Netherlands project. We calculated the median ratio of the paternal over total counts at all X-chromosomal heterozygous single-nucleotide variants with coverage ≥10. We identified two individuals where the same X-chromosome was inactivated in all cells. Imbalanced expression of the two X-chromosomes (ratios ≤0.35 or ≥0.65) was observed in nearly 50% of the population. The empirically observed skewing is explained by a theoretical model where X-inactivation takes place in an embryonic stage in which eight cells give rise to the hematopoietic compartment. Genes escaping X-inactivation are expressed from both alleles and therefore demonstrate less skewing than inactivated genes. Using this characteristic, we identified three novel escapee genes (SSR4, REPS2, and SEPT6), but did not find support for many previously reported escapee genes in blood. Our collective data suggest that skewed X-inactivation is common in the general population. This may contribute to manifestation of symptoms in carriers of recessive X-linked disorders. We recommend that X-inactivation results should not be used lightly in the interpretation of X-linked variants
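The skewing measure described in the abstract above (median ratio of paternal over total counts at heterozygous X-chromosomal variants with coverage ≥10, with ratios ≤0.35 or ≥0.65 counted as imbalanced) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the input format, and the example counts are assumptions, and only the thresholds come from the abstract.

```python
from statistics import median


def paternal_ratio(snvs, min_coverage=10):
    """Median paternal/total read ratio over heterozygous X-chromosomal SNVs.

    snvs: iterable of (paternal_count, maternal_count) pairs, one per
    variant (hypothetical input format).
    Returns None when no variant reaches the coverage threshold.
    """
    ratios = [p / (p + m) for p, m in snvs if p + m >= min_coverage]
    return median(ratios) if ratios else None


def is_skewed(ratio, low=0.35, high=0.65):
    """Apply the imbalance thresholds quoted in the abstract."""
    return ratio <= low or ratio >= high


# Made-up counts; the third variant is dropped because its coverage is 4.
example = [(8, 2), (30, 10), (3, 1), (12, 8)]
ratio = paternal_ratio(example)
skewed = is_skewed(ratio)
```

With these made-up counts the three retained ratios are 0.8, 0.75, and 0.6, so the median is 0.75 and the sample would be classified as skewed.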
Bioinformatics of eukaryotic gene regulation
Understanding the mechanisms which control gene expression is one of the fundamental problems of molecular biology. Detailed experimental studies of regulation are laborious due to the complex and combinatorial nature of interactions among the molecules involved. Therefore, computational techniques are used to suggest candidate mechanisms for further investigation. This thesis presents three methods that improve the prediction of the regulation of gene transcription. The first approach finds binding sites recognized by a transcription factor based on statistical over-representation of short motifs in a set of promoter sequences. A successful application of this method to several gene families of yeast is shown. More advanced techniques are needed for the analysis of gene regulation in higher eukaryotes. Libraries provide hundreds of profiles recognized by transcription factors. Dependencies between them result in multiple predictions of the same binding sites, which must later be filtered out. The second method presented here offers a way to reduce the number of profiles by identifying similarities between them. Still, the complex nature of the interactions between transcription factors makes reliable prediction of binding sites difficult. Exploiting independent sources of information reduces the rate of false predictions. The third method proposes a novel approach associating gene annotations with regulation by multiple transcription factors and the binding sites recognized by them. The utility of the method is demonstrated on several well-known sets of transcription factors. RNA interference provides an efficient way of down-regulating gene expression. Difficulties in predicting efficient siRNA sequences motivated the development of a library containing siRNA sequences and related experimental details described in the literature.
This library, presented in the last chapter, is publicly available at http://www.human-sirna-database.ne
Relation between GC-contents of transcription factor's regulatory regions and corresponding PSCMs.
Left: human sequences were used for regulatory region GC-content calculation. Right: fragments of human sequence conserved in fugu.
Comparison of fractions of genes and factors targeted by a transcription factor.
Each point corresponds to a transcription factor f ∈ G_F. Horizontal axes provide the fraction of factors predicted to have a conserved binding site of f. Vertical axes give the fraction of regulated genes. Three cases are shown: no conservation, conservation to danio, and conservation to fugu.