ViFi: accurate detection of viral integration and mRNA fusion reveals indiscriminate and unregulated transcription in proximal genomic regions in cervical cancer.
The integration of viral sequences into the host genome is an important driver of tumorigenesis in many virus-mediated cancers, notably cervical cancer and hepatocellular carcinoma. We present ViFi, a computational method that combines phylogenetic methods with reference-based read mapping to detect viral integrations. In contrast with read-based reference mapping approaches, ViFi is faster, and shows high precision and sensitivity on both simulated and biological data, even when the integrated virus is a novel strain or highly mutated. We applied ViFi to matched genomic and mRNA data from 68 cervical cancer samples from TCGA and found high concordance between the two. Surprisingly, viral integration resulted in a dramatic transcriptional upregulation in all proximal elements, including LINEs and LTRs that are not normally transcribed. This upregulation is highly correlated with the presence of a viral gene fused with a downstream human element. Moreover, genomic rearrangements suggest the formation of apparent circular extrachromosomal (ecDNA) human-viral structures. Our results suggest the presence of apparent small circular fusion viral/human ecDNA, which correlates with indiscriminate and unregulated expression of proximal genomic elements, potentially contributing to the pathogenesis of HPV-associated cervical cancers. ViFi is available at https://github.com/namphuon/ViFi
Coevolved mutations reveal distinct architectures for two core proteins in the bacterial flagellar motor
Switching of bacterial flagellar rotation is caused by large domain movements of the FliG protein triggered by binding of the signal protein CheY to FliM. FliG and FliM form adjacent multi-subunit arrays within the basal body C-ring. The movements alter the interaction of the FliG C-terminal (FliGC) "torque" helix with the stator complexes. Atomic models based on the Salmonella entrovar C-ring electron microscopy reconstruction have implications for switching, but lack consensus on the relative locations of the FliG armadillo (ARM) domains (amino-terminal (FliGN), middle (FliGM) and FliGC) as well as changes during chemotaxis. The generality of the Salmonella model is challenged by the variation in motor morphology and response between species. We studied coevolved residue mutations to determine the unifying elements of switch architecture. Residue interactions, measured by their coevolution, were formalized as a network, guided by structural data. Our measurements reveal a common design with dedicated switch and motor modules. The FliM middle domain (FliMM) has extensive connectivity most simply explained by conserved intra- and inter-subunit contacts. In contrast, FliG has patchy, complex architecture. Conserved structural motifs form interacting nodes in the coevolution network that wire FliMM to the FliGC C-terminal, four-helix motor module (C3-6). FliG C3-6 coevolution is organized around the torque helix, differently from other ARM domains. The nodes form separated, surface-proximal patches that are targeted by deleterious mutations as in other allosteric systems. The dominant node is formed by the EHPQ motif at the FliMM-FliGM contact interface and adjacent helix residues at a central location within FliGM. The node interacts with nodes in the N-terminal FliGC α-helix triad (ARM-C) and FliGN. ARM-C, separated from C3-6 by the MFVF motif, has poor intra-network connectivity consistent with its variable orientation revealed by structural data. ARM-C could be the convertor element that provides mechanistic and species diversity.
JK was supported by Medical Research Council grant U117581331. SK was supported by seed funds from Lahore University of Management Sciences (LUMS) and the Molecular Biology Consortium.
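The idea of turning residue coevolution into a network can be illustrated with a small sketch. The abstract above does not say which coevolution measure was used, so the mutual information score, the toy alignment, and the function names below are all assumptions chosen for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column(alignment, j):
    """Return column j of a list of equal-length aligned sequences."""
    return [seq[j] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    pa = Counter(col_a)
    pb = Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    return sum(
        (c / n) * log2((c / n) / ((pa[a] / n) * (pb[b] / n)))
        for (a, b), c in pab.items()
    )


def coevolution_edges(alignment, threshold):
    """Hypothetical helper: link column pairs whose MI reaches the threshold."""
    length = len(alignment[0])
    edges = []
    for i, j in combinations(range(length), 2):
        mi = mutual_information(column(alignment, i), column(alignment, j))
        if mi >= threshold:
            edges.append((i, j, mi))
    return edges


# Toy alignment (made up): columns 0 and 1 vary together, column 2 is invariant.
toy_alignment = ["AKL", "AKL", "GEL", "GEL"]
edges = coevolution_edges(toy_alignment, threshold=0.5)
```

Each edge is a pair of alignment positions with its score. Real analyses typically also correct for phylogenetic bias and finite sampling, which this sketch ignores.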
Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site
We introduce a novel method to screen the promoters of a set of genes with
shared biological function, against a precompiled library of motifs, and find
those motifs which are statistically over-represented in the gene set. The gene
sets were obtained from the functional Gene Ontology (GO) classification; for
each set and motif we optimized the sequence similarity score threshold,
independently for every location window (measured with respect to the TSS),
taking into account the location dependent nucleotide heterogeneity along the
promoters of the target genes. We performed a high throughput analysis,
searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of
more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology
classes and for 412 known DNA motifs. When combined with binding site and
location conservation between human and mouse, the method identifies with high
probability functional binding sites that regulate groups of biologically
related genes. We found many location-sensitive functional binding events and
showed that they clustered close to the TSS. Our method and findings were put
to several experimental tests. By allowing a "flexible" threshold and combining
our functional class and location specific search method with conservation
between human and mouse, we are able to identify reliably functional TF binding
sites. This is an essential step towards constructing regulatory networks and
elucidating the design principles that govern transcriptional regulation of
expression. The promoter region proximal to the TSS appears to be of central
importance for regulation of transcription in human and mouse, just as it is in
bacteria and yeast.
Comment: 31 pages, including Supplementary Information and figure
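The over-representation step in the abstract above can be sketched with a generic hypergeometric test. This is an assumed stand-in: the abstract describes location-specific, optimized score thresholds, and its actual statistic is not given here. The gene identifiers and function names are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the motif."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def motif_enrichment(genes_with_motif, gene_set, background):
    """Hypothetical helper: count motif hits in a gene set and return (hits, p-value)."""
    background = set(background)
    gene_set = set(gene_set) & background
    hits = set(genes_with_motif) & background
    k = len(gene_set & hits)
    p_value = hypergeom_sf(k, len(background), len(hits), len(gene_set))
    return k, p_value


# Toy data (made up): 10 background genes, 4 carry the motif, 3 are in the GO class.
background = [f"g{i}" for i in range(10)]
with_motif = ["g0", "g1", "g2", "g3"]
go_class = ["g0", "g1", "g2"]
k, p_value = motif_enrichment(with_motif, go_class, background)
```

In a screen over many motifs, gene classes, and location windows, the resulting p-values would also need a multiple-testing correction.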
Genomic characterization of Gli-activator targets in sonic hedgehog-mediated neural patterning
Sonic hedgehog (Shh) acts as a morphogen to mediate the specification of distinct cell identities in the ventral neural tube through a Gli-mediated (Gli1-3) transcriptional network. Identifying Gli targets in a systematic fashion is central to the understanding of the action of Shh. We examined this issue in differentiating neural progenitors in mouse. An epitope-tagged Gli-activator protein was used to directly isolate cis-regulatory sequences by chromatin immunoprecipitation (ChIP). ChIP products were then used to screen custom genomic tiling arrays of putative Hedgehog (Hh) targets predicted from transcriptional profiling studies, surveying 50-150 kb of non-transcribed sequence for each candidate. In addition to identifying expected Gli-target sites, the data predicted a number of unreported direct targets of Shh action. Transgenic analysis of binding regions in Nkx2.2, Nkx2.1 (Titf1) and Rab34 established these as direct Hh targets. These data also facilitated the generation of an algorithm that improved in silico predictions of Hh target genes. Together, these approaches provide significant new insights into both tissue-specific and general transcriptional targets in a crucial Shh-mediated patterning process
A Third Approach to Gene Prediction Suggests Thousands of Additional Human Transcribed Regions
The identification and characterization of the complete ensemble of genes is a main goal of deciphering the digital information stored in the human genome. Many algorithms for computational gene prediction have been described, ultimately derived from two basic concepts: (1) modeling gene structure and (2) recognizing sequence similarity. Successful hybrid methods combining these two concepts have also been developed. We present a third orthogonal approach to gene prediction, based on detecting the genomic signatures of transcription, accumulated over evolutionary time. We discuss four algorithms based on this third concept: Greens and CHOWDER, which quantify mutational strand biases caused by transcription-coupled DNA repair, and ROAST and PASTA, which are based on strand-specific selection against polyadenylation signals. We combined these algorithms into an integrated method called FEAST, which we used to predict the location and orientation of thousands of putative transcription units not overlapping known genes. Many of the newly predicted transcriptional units do not appear to code for proteins. The new algorithms are particularly apt at detecting genes with long introns and lacking sequence conservation. They therefore complement existing gene prediction methods and will help identify functional transcripts within many apparent “genomic deserts.”
Computational solutions for omics data
High-throughput experimental technologies are generating increasingly massive and complex genomic data sets. The sheer enormity and heterogeneity of these data threaten to make the arising problems computationally infeasible. Fortunately, powerful algorithmic techniques lead to software that can answer important biomedical questions in practice. In this Review, we sample the algorithmic landscape, focusing on state-of-the-art techniques, the understanding of which will aid the bench biologist in analysing omics data. We spotlight specific examples that have facilitated and enriched analyses of sequence, transcriptomic and network data sets.
National Institutes of Health (U.S.) (Grant GM081871)
The physicist's guide to one of biotechnology's hottest new topics: CRISPR-Cas
Clustered regularly interspaced short palindromic repeats (CRISPR) and
CRISPR-associated proteins (Cas) constitute a multi-functional, constantly
evolving immune system in bacteria and archaea cells. A heritable, molecular
memory is generated of phage, plasmids, or other mobile genetic elements that
attempt to attack the cell. This memory is used to recognize and interfere with
subsequent invasions from the same genetic elements. This versatile prokaryotic
tool has also been used to advance applications in biotechnology. Here we
review a large body of CRISPR-Cas research to explore themes of evolution and
selection, population dynamics, horizontal gene transfer, specific and
cross-reactive interactions, cost and regulation, non-immunological CRISPR
functions that boost host cell robustness, as well as applicable mechanisms for
efficient and specific genetic engineering. We offer future directions that can
be addressed by the physics community. Physical understanding of the CRISPR-Cas
system will advance uses in biotechnology, such as developing cell lines and
animal models, cell labeling and information storage, combatting antibiotic
resistance, and human therapeutics.
Comment: 75 pages, 15 figures, Physical Biology (2018)
Structural investigation of the molecular mechanisms underlying titin elasticity and signaling
Titin is a giant protein that spans >1µm from the Z-disc to the M-line, forming an intrasarcomeric filament system in vertebrate striated muscle, which is not only essential for the assembly of the sarcomere, but also critical for myofibril signaling and metabolism. Furthermore, it provides the sarcomere with resting tension, elasticity and restoring forces upon stretch, ensuring the correct positioning of the actin-myosin motors during muscle function. Titin is composed of ~300 immunoglobulin (Ig) and fibronectin-III (FnIII) domains, arranged in linear tandems. They are interspersed by an auto-inhibited Ser kinase (TK) close to its C-terminus as well as several unique sequences, most prominently a differentially spliced stretch rich in PEVK residues which localizes to the I-band part of titin where its elastic properties reside. There, the PEVK segment is flanked by a long Ig tandem, which together act as serial molecular springs that determine titin elastic response.
The focus of this work lay in the elucidation of the molecular mechanisms governing titin I-band elasticity and the recruitment of the M-line signalosome around TK involved in the control of myofibril turnover and the trophic state of muscle. To that effect, we have elucidated the crystal structure of a six-Ig fragment representative of the elastic Ig-tandem at 3.3 Å resolution. The model reveals the molecular principles of Ig-arraying at the skeletal I-band of titin as mediated by conserved Ig-Ig transition motifs. Regular domain arrangements within this fragment point to the existence of higher-order organization in the fine structure of the filament, which is confirmed by EM data on a 19-mer poly-Ig segment. Our findings indicate a long-range, supra-order in the skeletal I-band of titin, where assembly of Ig domains into dynamical super-motifs is essential for the elastic function of the filament. We propose a novel model of spring mechanism for poly-Ig elasticity in titin based on a “carpenter ruler” model of skeletal I-band architecture. Furthermore, we have focused on the recruitment of the ubiquitin ligase MuRF1 to the M-line signalosome through its specific interaction with titin domains A168-A170. MuRF1 contains several oligomerization motifs in succession, which indicates a possible need for tight regulation. We have therefore analyzed their influence on the oligomeric state of the protein. Our SEC-MALS data showed that the α-helical region of MuRF1 is dimeric in isolation, while in combination with the preceding B-Box domain, itself a dimerization motif, higher-order assembly is induced, which might be of physiological importance. We could also show that higher-order assembly of MuRF1 did not disrupt binding to A168-A170 in pull-down assays. Further biophysical or structural characterization of the complex of A168-A170 with MuRF1 constructs was hindered by the severely compromised solubility of the complex.
Finally, we have successfully solved the crystal structure of the FnIII-Kin-Ig region of twitchin, which corresponds to titin A170-TK-M1. The N-terminal linker wraps around the kinase domain and positions the preceding FnIII domain in such a way that it blocks the autoregulatory tail in its inhibitory position. Thus, from the structure we could conclude that stretch-activation of Twc kinase seems unlikely and instead propose phosphorylation of Y104 as a possible activation mechanism.
Our findings illustrate how the structural and functional diversity in titin’s modular architecture has evolved not only on the basis of individual domains. Rather, functionality often involves adaptation of several neighboring domains or even whole Ig tandems/super-repeats. This is reflected in variations in mechanical and dynamic properties observed in different parts of the chain and highlights the necessity of working with representative multi-domain fragments to gain a comprehensive understanding of the titin chain.