2,559 research outputs found

    Functional Diversity and Structural Disorder in the Human Ubiquitination Pathway

    The ubiquitin-proteasome system plays a central role in cellular regulation and protein quality control (PQC). The system is built as a pyramid of increasing complexity, with two E1 (ubiquitin activating), a few dozen E2 (ubiquitin conjugating) and several hundred E3 (ubiquitin ligase) enzymes. By collecting and analyzing E3 sequences from the KEGG BRITE database and literature, we assembled a coherent dataset of 563 human E3s and analyzed their various physical features. We found an increase in structural disorder of the system with multiple disorder predictors (IUPred - E1: 5.97%, E2: 17.74%, E3: 20.03%). E3s that can bind E2 and substrate simultaneously (single subunit E3, ssE3) have significantly higher disorder (22.98%) than E3s in which E2 binding (multi RING-finger, mRF, 0.62%), scaffolding (6.01%) and substrate binding (adaptor/substrate recognition subunits, 17.33%) functions are separated. In ssE3s, the disorder was localized in the substrate/adaptor binding domains, whereas the E2-binding RING/HECT-domains were structured. To demonstrate the involvement of disorder in E3 function, we applied normal modes and molecular dynamics analyses to show how a disordered and highly flexible linker in human CBL (an E3 that acts as a regulator of several tyrosine kinase-mediated signalling pathways) facilitates long-range conformational changes bringing substrate and E2-binding domains towards each other and thus assisting in ubiquitin transfer. E3s with multiple interaction partners (as evidenced by data in STRING) also possess elevated levels of disorder (hubs, 22.90% vs. non-hubs, 18.36%). Furthermore, a search in PDB uncovered 21 distinct human E3 interactions, in 7 of which the disordered region of E3s undergoes induced folding (or mutual induced folding) in the presence of the partner.
In conclusion, our data highlight the primary role of structural disorder in the functions of E3 ligases, which manifests itself in the substrate/adaptor binding functions as well as in the mechanism of ubiquitin transfer by long-range conformational transitions. © 2013 Bhowmick et al.

    Transcriptional regulatory networks controlling woolliness in peach in response to preharvest gibberellin application and cold storage

    BACKGROUND: Postharvest fruit conservation relies on low temperatures and manipulations of hormone metabolism to maintain sensory properties. Peaches are susceptible to chilling injuries, such as ‘woolliness’, which is caused by juice loss leading to a ‘woolly’ fruit texture. Application of gibberellic acid at the initial stages of pit hardening impairs woolliness incidence; however, the mechanisms controlling the response remain unknown. We have employed genome-wide transcriptional profiling to investigate the effects of gibberellic acid application and cold storage on harvested peaches. RESULTS: Approximately half of the investigated genes exhibited significant differential expression in response to the treatments. Cellular and developmental process gene ontologies were overrepresented among the differentially regulated genes, whereas sequences in cell death and immune response categories were underrepresented. Gene set enrichment demonstrated a predominant role of cold storage in repressing the transcription of genes associated with cell wall metabolism. In contrast, genes involved in hormone responses exhibited a more complex transcriptional response, indicating an extensive network of crosstalk between hormone signaling and low temperatures. Time course transcriptional analyses demonstrate the large contribution of gene expression regulation to the biochemical changes leading to woolliness in peach. CONCLUSION: Overall, our results provide insights into the mechanisms controlling the complex phenotypes associated with postharvest textural changes in peach and suggest that hormone-mediated reprogramming prior to pit hardening affects the onset of chilling injuries. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12870-015-0659-2) contains supplementary material, which is available to authorized users.

    Modelling and recognition of protein contact networks by multiple kernel learning and dissimilarity representations

    Multiple kernel learning is a paradigm which employs a properly constructed chain of kernel functions able to simultaneously analyse different data or different representations of the same data. In this paper, we propose a hybrid classification system based on a linear combination of multiple kernels defined over multiple dissimilarity spaces. The core of the training procedure is the joint optimisation of kernel weights and the selection of representatives in the dissimilarity spaces. This equips the system with a two-fold knowledge discovery phase: by analysing the weights, it is possible to check which representations are more suitable for solving the classification problem, whereas the pivotal patterns selected as representatives can give further insights into the modelled system, possibly with the help of field experts. The proposed classification system is tested on real proteomic data in order to predict proteins' functional role starting from their folded structure: specifically, a set of eight representations is drawn from the graph-based description of the folded protein. The proposed multiple kernel-based system has also been benchmarked against a clustering-based classification system that is likewise able to exploit multiple dissimilarities simultaneously. Computational results show remarkable classification capabilities, and the knowledge discovery analysis is in line with current biological knowledge, suggesting the reliability of the proposed system.

    Insights into the regulation of intrinsically disordered proteins in the human proteome by analyzing sequence and gene expression data

    Background: Disordered proteins need to be expressed to carry out specified functions; however, their accumulation in the cell can potentially cause major problems through protein misfolding and aggregation. Gene expression levels, mRNA decay rates, microRNA (miRNA) targeting and ubiquitination have critical roles in the degradation and disposal of human proteins and transcripts. Here, we describe a study examining these features to gain insights into the regulation of disordered proteins. Results: In comparison with ordered proteins, disordered proteins have a greater proportion of predicted ubiquitination sites. The transcripts encoding disordered proteins also have higher proportions of predicted miRNA target sites and higher mRNA decay rates, both of which are indicative of the observed lower gene expression levels. The results suggest that the disordered proteins and their transcripts are present in the cell at low levels and/or for a short time before being targeted for disposal. Surprisingly, we find that for a significant proportion of highly disordered proteins, all four of these trends are reversed. Predicted estimates for miRNA targets, ubiquitination and mRNA decay rate are low in the highly disordered proteins that are constitutively and/or highly expressed. Conclusions: Mechanisms are in place to protect the cell from these potentially dangerous proteins. The evidence suggests that the enrichment of signals for miRNA targeting and ubiquitination may help prevent the accumulation of disordered proteins in the cell. Our data also provide evidence for a mechanism by which a significant proportion of highly disordered proteins (with high expression levels) can escape rapid degradation to allow them to successfully carry out their function

    WNT-DEPENDENT REGENERATIVE FUNCTION IS INDUCED IN LEUKEMIA-INITIATING AC133BRIGHT CELLS

    The Cancer Stem Cell model supports the notion that leukemia is initiated and maintained in vivo by a small fraction of leukemia-initiating cells (LICs). Previous studies have suggested the involvement of the Wnt signaling pathway in Acute Myeloid Leukemia (AML) through its ability to sustain the development of LICs. A novel hematopoietic stem and progenitor cell marker, monoclonal antibody AC133, recognizes the CD34bright CD38- subset of human acute myeloid leukemia cells, suggesting that it may be an early marker for the LICs. During the first part of my PhD program, we evaluated the ability of the leukemic AC133+ fraction to engraft following xenotransplantation in the immunodeficient Rag2-/-γc-/- mouse model. The results showed that the surface marker AC133 is able to enrich for the cell fraction that contains the LICs. Building on our previously reported data, derived from expression profiling analysis performed in normal (n=10) and leukemic (n=33) human long-term reconstituting AC133+ cells, we revealed that ligand-dependent Wnt signaling is induced in AML through a diffuse expression and release of WNT10B, a hematopoietic stem cell regeneration-associated molecule. In situ detection performed on bone marrow biopsies of AML patients showed activation of the Wnt pathway, through the concomitant presence of the ligand WNT10B and of the active dephosphorylated β-catenin form, suggesting an autocrine/paracrine-type ligand-dependent activation mechanism. In consideration of the link between hematopoietic regeneration and developmental signaling, we transplanted primary AC133+ AML A46 cells into developing zebrafish. This biosensor model revealed the formation of ectopic structures by activation of dorsal organizer markers that act downstream of the Wnt pathway. These results suggested that the misappropriation of Wnt-associated functions can promote pathological stem cell-like regenerative responsiveness.
The analyses performed in situ retained information on cellular localization, enabling determination of the activity status of individual cells and providing a view of the tumor environment. With this in mind, during the second part of my PhD program, I set up a new in situ method for localized detection and genotyping of individual transcripts directly in cells and tissues. The mRNA in situ detection technique is based on padlock probe ligation and target-primed rolling circle amplification, allowing single-nucleotide resolution in heterogeneous tissues. The mRNA in situ detection performed on bone marrow biopsies derived from AML patients showed a diffuse localization pattern of the WNT10B molecule in the tissue. Conversely, only the AC133bright cell population shows the Wnt signaling activation signature represented by the cytoplasmic accumulation and nuclear translocation of the active form of β-catenin. Although we previously showed that the regenerative function of the WNT signaling pathway is defined by the up-regulation of the WNT10B, WNT10A, WNT2B and WNT6 loci, we identified WNT10B as a major locus associated with the regenerative function and over-expressed by all AML patients. Through molecular evaluation of the WNT10B transcript, we isolated an aberrant splicing variant (WNT10BIVS1) that identifies the Non Core-Binding Factor Leukemia (NCBFL) class and whose potential role is discussed. Moreover, we demonstrate that the "leukemia stem cell" function, present in the cell population enriched for the marker AC133bright, is strictly related to the regenerative function associated with WNT signaling, defining the key role of the WNT10B ligand as a specific molecular marker for leukemogenesis. This thesis defines new suitable approaches to characterize the leukemia-initiating cells (LICs) and suggests WNT10B as a new suitable target for AML.

    Machine Learning Guided Exploration of an Empirical Ribozyme Fitness Landscape

    Okinawa Institute of Science and Technology Graduate University. Doctor of Philosophy. The fitness landscape of a biomolecule is a representation of its activity as a function of its sequence. Properties of a fitness landscape determine how evolution proceeds. Therefore, the distribution of functional variants and, more importantly, the connectivity of these variants within the sequence space are important scientific questions. Exploration of these spaces, however, is impeded by the combinatorial explosion of the sequence space. High-throughput experimental methods have recently reduced this impediment, but only modestly. Better computational methods are needed to fully utilize the rich information from these experimental data and to better understand the properties of the fitness landscape. In this work, I seek to improve this exploration process by combining data from a massively parallel experimental assay with smart library design using advanced computational techniques. I focus on an artificial RNA enzyme, or ribozyme, that can catalyze a ligation reaction between two RNA fragments. This chemistry is analogous to that of modern RNA polymerase enzymes and therefore represents an important reaction in the origin of life. In the first chapter, I discuss the background to this work in the context of the evolutionary theory of fitness landscapes and its implications in biotechnology. In chapter 2, I explore the use of processes borrowed from the field of evolutionary computation to solve optimization problems using real experimental sequence-activity data. In chapter 3, I investigate the use of supervised machine learning models to extract information on epistatic interactions from the dataset collected during multiple rounds of directed evolution. I investigate and experimentally validate the extent to which a deep learning model can be used to guide a completely computational evolutionary algorithm towards distant regions of the fitness landscape.
In the final chapter, I perform a comprehensive experimental assay of the combinatorial region explored by the deep learning-guided evolutionary algorithm. Using this dataset, I analyze higher-order epistasis and attempt to explain the increased predictability of the region sampled by the algorithm. Finally, I provide the first experimental evidence of a large RNA ‘neutral network’. Altogether, this work represents the most comprehensive experimental and computational study of the RNA ligase ribozyme fitness landscape to date, providing important insights into the evolutionary search space possibly explored during the earliest stages of life. Doctoral thesis.

    Protein Functional Surfaces: Global Shape Matching and Local Spatial Alignments of Ligand Binding Sites

    Background: Protein surfaces comprise only a fraction of the total residues but are the most conserved functional features of proteins. Surfaces performing identical functions are found in proteins lacking any sequence or fold similarity. While biochemical activity can be attributed to a few key residues, the broader surrounding environment plays an equally important role. Results: We describe a methodology that attempts to optimize two components, global shape and local physicochemical texture, for evaluating the similarity between a pair of surfaces. Surface shape similarity is assessed using a three-dimensional object recognition algorithm and physicochemical texture similarity is assessed through a spatial alignment of conserved residues between the surfaces. The comparisons are used in tandem to efficiently search the Global Protein Surface Survey (GPSS), a library of annotated surfaces derived from structures in the PDB, for studying evolutionary relationships and uncovering novel similarities between proteins. Conclusion: We provide an assessment of our method using library retrieval experiments for identifying functionally homologous surfaces binding different ligands, functionally diverse surfaces binding the same ligand, and binding surfaces of ubiquitous and conformationally flexible ligands. Results using surface similarity to predict function for proteins of unknown function are reported. Additionally, an automated analysis of the ATP binding surface landscape is presented to provide insight into the correlation between surface similarity and function for structures in the PDB and for the subset of protein kinases.

    Tubulin tyrosination regulates synaptic function and is disrupted in Alzheimer's disease

    Microtubules play fundamental roles in the maintenance of neuronal processes and in synaptic function and plasticity. While dynamic microtubules are mainly composed of tyrosinated tubulin, long-lived microtubules contain detyrosinated tubulin, suggesting that the tubulin tyrosination/detyrosination cycle is a key player in the maintenance of microtubule dynamics and neuronal homeostasis, conditions which go awry in neurodegenerative diseases. In the tyrosination/detyrosination cycle, the C-terminal tyrosine of α-tubulin is removed by tubulin carboxypeptidases and re-added by tubulin tyrosine ligase. Here we show that tubulin tyrosine ligase hemizygous mice exhibit decreased tyrosinated microtubules, reduced dendritic spine density, and both synaptic plasticity and memory deficits. We further report decreased tubulin tyrosine ligase expression in sporadic and familial Alzheimer's disease, and reduced microtubule dynamics in human neurons harboring the familial APP-V717I mutation. Finally, we show that synapses visited by dynamic microtubules are more resistant to oligomeric amyloid β peptide toxicity and that expression of tubulin tyrosine ligase, by restoring microtubule entry into spines, suppresses the loss of synapses induced by amyloid β peptide. Together, our results demonstrate that a balanced tyrosination/detyrosination tubulin cycle is necessary for the maintenance of synaptic plasticity, is protective against amyloid β peptide-induced synaptic damage, and that this balance is lost in Alzheimer's disease, providing evidence that defective tubulin retyrosination may contribute to circuit dysfunction during neurodegeneration in Alzheimer's disease.

    Associated bacteria affect sexual reproduction by altering gene expression and metabolic processes in a biofilm inhabiting diatom

    Diatoms are unicellular algae with a fundamental role in global biogeochemical cycles as major primary producers at the base of aquatic food webs. In recent years, chemical communication between diatoms and associated bacteria has emerged as a key factor in diatom ecology, spurred by conceptual and technological advancements to study the mechanisms underlying these interactions. Here, we use a combination of physiological, transcriptomic, and metabolomic approaches to study the influence of naturally coexisting bacteria, Maribacter sp. and Roseovarius sp., on the sexual reproduction of the biofilm inhabiting marine pennate diatom Seminavis robusta. While Maribacter sp. severely reduces the reproductive success of S. robusta cultures, Roseovarius sp. slightly enhances it. Contrary to our expectation, we demonstrate that the effect of the bacterial exudates is not caused by altered cell-cycle regulation prior to the switch to meiosis. Instead, Maribacter sp. exudates cause a reduced production of diproline, the sexual attraction pheromone of S. robusta. Transcriptomic analyses show that this is likely an indirect consequence of altered intracellular metabolic fluxes in the diatom, especially those related to amino acid biosynthesis, oxidative stress response, and biosynthesis of defense molecules. This study provides the first insights into the influence of bacteria on diatom sexual reproduction and adds a new dimension to the complexity of a still understudied phenomenon in natural diatom populations

    Machine learning methods for genomic high-content screen data analysis applied to deduce organization of endocytic network

    High-content screens are widely used to gain insight into the mechanistic organization of biological systems. Chemical and/or genomic interferences are used to modulate molecular machinery, then light microscopy and quantitative image analysis yield a large number of parameters describing the phenotype. However, extracting functional information from such high-content datasets (e.g. links between cellular processes or functions of unknown genes) remains challenging. This work is devoted to the analysis of a multi-parametric image-based genomic screen of endocytosis, the process whereby cells take up cargoes (signals and nutrients) and distribute them into different subcellular compartments. The complexity of the quantitative endocytic data was approached using different Machine Learning techniques, namely clustering methods, Bayesian networks, principal and independent component analysis, and artificial neural networks. The main goal of such an analysis is to predict possible modes of action of screened genes and also to find candidate genes that may be involved in a process of interest. The number of degrees of freedom of the multidimensional phenotypic space was identified using the data distributions, and then the high-content data were deconvolved into separate signals from different cellular modules. Some of those basic signals (phenotypic traits) were straightforward to interpret in terms of known molecular processes; the other components gave insight into interesting directions for further research. The phenotypic profiles of perturbation of individual genes are sparse in the coordinates of the basic signals and therefore intrinsically suggest their functional roles in cellular processes. Being a very fundamental process, endocytosis is specifically modulated by a variety of different pathways in the cell; therefore, endocytic phenotyping can be used for analysis of non-endocytic modules in the cell.
The proposed approach can also be generalized for analysis of other high-content screens.
Contents:
Objectives
Chapter 1 Introduction
1.1 High-content biological data
1.1.1 Different perturbation types for HCS
1.1.2 Types of observations in HTS
1.1.3 Goals and outcomes of MP HTS
1.1.4 An overview of the classical methods of analysis of biological HT- and HCS data
1.2 Machine learning for systems biology
1.2.1 Feature selection
1.2.2 Unsupervised learning
1.2.3 Supervised learning
1.2.4 Artificial neural networks
1.3 Endocytosis as a system process
1.3.1 Endocytic compartments and main players
1.3.2 Relation to other cellular processes
Chapter 2 Experimental and analytical techniques
2.1 Experimental methods
2.1.1 RNA interference
2.1.2 Quantitative multiparametric image analysis
2.2 Detailed description of the endocytic HCS dataset
2.2.1 Basic properties of the endocytic dataset
2.2.2 Control subset of genes
2.3 Machine learning methods
2.3.1 Latent variables models
2.3.2 Clustering
2.3.3 Bayesian networks
2.3.4 Neural networks
Chapter 3 Results
3.1 Selection of labeled data for training and validation based on KEGG information about genes pathways
3.2 Clustering of genes
3.2.1 Comparison of clustering techniques on control dataset
3.2.2 Clustering results
3.3 Independent components as basic phenotypes
3.3.1 Algorithm for identification of the best number of independent components
3.3.2 Application of ICA on the full dataset and on separate assays of the screen
3.3.3 Gene annotation based on revealed phenotypes
3.3.4 Searching for genes with target function
3.4 Bayesian network on endocytic parameters
3.4.1 Prediction of pathway based on parameters values using Naïve Bayesian Classifier
3.4.2 General Bayesian Networks
3.5 Neural networks
3.5.1 Autoencoders as nonlinear ICA
3.5.2 siRNA sequence motives discovery with deep NN
3.6 Biological results
3.6.1 Rab11 ZNF-specific phenotype found by ICA
3.6.2 Structure of BN revealed dependency between endocytosis and cell adhesion
Chapter 4 Discussion
4.1 Machine learning approaches for discovery of phenotypic patterns
4.1.1 Functional annotation of unknown genes based on phenotypic profiles
4.1.2 Candidate genes search
4.2 Adaptation to other HCS data and generalization
Chapter 5 Outlook and future perspectives
5.1 Handling sequence-dependent off-target effects with neural networks
5.2 Transition between machine learning and systems biology models
Acknowledgements
References
Appendix
A.1 Full list of cellular and endocytic parameters
A.2 Description of independent components of the full dataset
A.3 Description of independent components extracted from separate assays of the HC