27 research outputs found

    SNARE-CNN: a 2D convolutional neural network architecture to identify SNARE proteins from high-throughput sequencing data

    Deep learning is increasingly used to solve problems in many fields with state-of-the-art performance. It can also be applied in bioinformatics to reduce the need for manual feature extraction while reaching high performance. This study uses deep learning to predict SNARE proteins, which perform one of the most vital molecular functions in life science. Functional loss of SNARE proteins has been implicated in a variety of human diseases (e.g., neurodegenerative disorders, mental illness, and cancer). Creating a precise model to identify their functions is therefore crucial for understanding these diseases and designing drug targets. Our SNARE-CNN model, which uses two-dimensional convolutional neural networks and position-specific scoring matrix profiles, identified SNARE proteins with a sensitivity of 76.6%, specificity of 93.5%, accuracy of 89.7%, and MCC of 0.7 on the cross-validation dataset. We also evaluated the model on an independent dataset, and the result shows that it avoids overfitting. Compared with other state-of-the-art methods, this approach achieved significant improvement in all of the metrics. This study provides an effective model for identifying SNARE proteins and a basis for further research applying deep learning in bioinformatics, especially in protein function prediction. SNARE-CNN is freely available at https://github.com/khanhlee/snare-cnn
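The four metrics quoted in this abstract (sensitivity, specificity, accuracy, MCC) all derive from a binary confusion matrix. The sketch below shows the standard formulas; the function name and the counts in the usage example are hypothetical and are not taken from the SNARE-CNN paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the paper's data).
example = binary_metrics(tp=80, fn=20, tn=270, fp=30)
print(example)
```

MCC is often reported alongside accuracy for imbalanced datasets, because accuracy alone can look high when the negative class dominates.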

    FGF2 Affects Parkinson's Disease-Associated Molecular Networks Through Exosomal Rab8b/Rab31

    Ras-associated binding (Rab) proteins are small GTPases that regulate the trafficking of membrane components during endocytosis and exocytosis, including the release of extracellular vesicles (EVs). Parkinson's disease (PD) is one of the most prevalent neurodegenerative disorders in the elderly population. In PD, pathological proteins such as alpha-synuclein (alpha-Syn) are transmitted in EVs from one neuron to another and ultimately across brain regions, thereby facilitating the spread of pathology. We recently demonstrated that fibroblast growth factor-2 (FGF2) enhances the release of EVs and delineated the proteomic signature of FGF2-triggered EVs in cultured primary hippocampal neurons. Out of 235 significantly upregulated proteins, we found that FGF2 specifically enriched EVs for the two Rab family members Rab8b and Rab31. Consequently, we investigated the interactions of Rab8b and Rab31 using a network analysis approach in order to estimate the global influence of their enrichment in EVs. To achieve this, we delineated a protein-protein interaction network (PPiN) for these Rabs and identified the proteins associated with PD in various cellular components of the central nervous system (CNS), in different brain regions, and in the enteric nervous system (ENS). A total of 126 direct or indirect interactions were reported for the two Rab candidates, of which 114 are Rab8b interactions and 54 are Rab31 interactions, resulting in individual interaction scores (IS) of 90.48% and 42.86%, respectively. These results demonstrate for the first time the relevance of FGF2-induced Rab enrichment in EVs and its potential to regulate PD pathophysiology.
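The reported interaction scores are consistent with each Rab's interaction count divided by the 126 total interactions. The sketch below reproduces that arithmetic; treating the IS as this simple ratio is an assumption inferred from the numbers in the abstract, and the function name is hypothetical.

```python
def interaction_score(partner_interactions, total_interactions):
    """Share of all network interactions involving one protein, in percent.

    Assumption: the interaction score (IS) is count / total * 100,
    which matches the figures quoted in the abstract.
    """
    return round(100 * partner_interactions / total_interactions, 2)


TOTAL = 126  # direct or indirect interactions reported for both Rabs

print(interaction_score(114, TOTAL))  # Rab8b -> 90.48
print(interaction_score(54, TOTAL))   # Rab31 -> 42.86
```

Because 114 + 54 exceeds 126, the two scores are not expected to sum to 100%; this would be the case if some interactions are counted for both Rabs.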

    Machine Learning Guided Exploration of an Empirical Ribozyme Fitness Landscape

    Okinawa Institute of Science and Technology Graduate University. Doctor of Philosophy. Doctoral thesis.
    The fitness landscape of a biomolecule is a representation of its activity as a function of its sequence. Properties of a fitness landscape determine how evolution proceeds. Therefore, the distribution of functional variants and, more importantly, the connectivity of these variants within the sequence space are important scientific questions. Exploration of these spaces, however, is impeded by the combinatorial explosion of the sequence space. High-throughput experimental methods have recently reduced this impediment, but only modestly. Better computational methods are needed to fully utilize the rich information in these experimental data and to better understand the properties of the fitness landscape. In this work, I seek to improve this exploration process by combining data from massively parallel experimental assays with smart library design using advanced computational techniques. I focus on an artificial RNA enzyme, or ribozyme, that can catalyze a ligation reaction between two RNA fragments. This chemistry is analogous to that of modern RNA polymerase enzymes and therefore represents an important reaction in the origin of life. In the first chapter, I discuss the background to this work in the context of the evolutionary theory of fitness landscapes and its implications for biotechnology. In chapter 2, I explore the use of processes borrowed from the field of evolutionary computation to solve optimization problems using real experimental sequence-activity data. In chapter 3, I investigate the use of supervised machine learning models to extract information on epistatic interactions from the dataset collected during multiple rounds of directed evolution. I investigate and experimentally validate the extent to which a deep learning model can be used to guide a completely computational evolutionary algorithm towards distant regions of the fitness landscape.
In the final chapter, I perform a comprehensive experimental assay of the combinatorial region explored by the deep learning-guided evolutionary algorithm. Using this dataset, I analyze higher-order epistasis and attempt to explain the increased predictability of the region sampled by the algorithm. Finally, I provide the first experimental evidence of a large RNA ‘neutral network’. Altogether, this work represents the most comprehensive experimental and computational study of the RNA ligase ribozyme fitness landscape to date, providing important insights into the evolutionary search space possibly explored during the earliest stages of life.

    Discovering gene functional relationships using a literature-based NMF model

    The rapid growth of the biomedical literature and genomic information presents a major challenge for determining the functional relationships among genes. Several bioinformatics tools have been developed to extract and identify gene relationships from various biological databases. However, an intuitive user-interface tool that allows the biologist to determine functional relationships among genes is still not available. In this study, we develop a Web-based bioinformatics software environment called FAUN or Feature Annotation Using Nonnegative matrix factorization (NMF) to facilitate both the discovery and classification of functional relationships among genes. Both the computational complexity and parameterization of NMF for processing gene sets are discussed. We tested FAUN on three manually constructed gene document collections, and then used it to analyze several microarray-derived gene sets obtained from studies of the developing cerebellum in normal and mutant mice. FAUN provides utilities for collaborative knowledge discovery and identification of new gene relationships from text streams and repositories (e.g., MEDLINE). It is particularly useful for the validation and analysis of gene associations suggested by microarray experimentation. The FAUN site is publicly available at http://grits.eecs.utk.edu/faun
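Nonnegative matrix factorization approximates a nonnegative matrix V (for example, genes by literature terms) as a product W·H of two smaller nonnegative matrices, so that each gene is described by a few additive "features". The sketch below uses the classic multiplicative update rules in pure Python; it is an illustrative assumption, not FAUN's implementation, and the toy matrix, function names, and parameter defaults are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) into W (n x rank) and H (rank x m), all nonnegative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy gene-by-term matrix with two obvious groups (hypothetical data).
V = [[2, 1, 0, 0],
     [4, 2, 0, 0],
     [0, 0, 3, 1],
     [0, 0, 6, 2]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

The two main parameters to choose are the rank (number of features) and the number of iterations; the result also depends on the random initialization, which is why a fixed seed is used here.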

    Doctor of Philosophy

    Dissertation. Electron microscopy can visualize synapses at nanometer resolution and can thereby capture the fine structure of these contacts. However, this imaging method lacks three key elements: temporal information, protein visualization, and large-volume reconstruction. For my dissertation, I developed three methods in electron microscopy that overcome these limitations. First, I developed a method to freeze neurons at any desired time point after a stimulus to study the synaptic vesicle cycle. Second, I developed a method to couple super-resolution fluorescence microscopy and electron microscopy to pinpoint the location of proteins in electron micrographs at nanometer resolution. Third, I collaborated with computer scientists to develop methods for semi-automated reconstruction of the nervous system. I applied these techniques to answer two fundamental questions in synaptic biology. Which vesicles fuse in response to a stimulus? How are synaptic vesicles recovered at synapses after fusion? Only vesicles that are in direct contact with the plasma membrane fuse upon stimulation. The active zone in C. elegans is broad, but primed vesicles are concentrated around the dense projection. Following exocytosis of synaptic vesicles, synaptic vesicle membrane was recovered rapidly at two distinct locations at a synapse: the dense projection and adherens junctions. These studies suggest that there may be a novel form of ultrafast endocytosis.

    Characterizing of Robo downstream signalling to promote direct neurogenesis

    The size and degree of folding of the mammalian cortex are pivotal factors that affect species’ cognitive abilities and sensorimotor skills. The cerebral cortex is the main region in the mammalian brain that governs complex cognitive behaviors. The development of the cortex depends on the amplification of neural stem cells (NSCs) and neural progenitors (NPs), and on the generation and differentiation of postmitotic neurons. There are two main types of NPs in the mouse neocortex (NCx): apical radial glial cells (aRGCs) and intermediate progenitor cells (IPCs). Robo receptors play an important role in regulating the amplification of cortical progenitors. The absence of Robo receptor signalling, combined with alteration of the Notch signalling pathway, in the mouse NCx leads to an overproduction of poorly functional IPCs. The cortices of amniotes that evolutionarily precede mammals (such as reptiles and birds) exhibit a predominance of direct neurogenesis during development, in which aRGCs produce neurons directly. Intriguingly, Robo receptors as well as Notch signalling play a major role in attenuating this mode of neurogenesis. This hypothesis was validated in several brain structures with phyletic antiquity, confirming that Robo receptors are essential in the shift towards indirect neurogenesis during the evolution and expansion of the cerebral cortex. However, little is known about the precise signalling cascade or interactors employed by Robo to initiate direct neurogenesis. In this thesis, we demonstrated the transcriptomic differences between the developing mouse NCx and olfactory bulb (OB), where direct neurogenesis is predominant compared with the NCx, using single-cell RNA sequencing (scRNAseq). We showed that aRGC populations are differentially enriched between these regions. We traced lineage trajectories of indirect and direct neurogenesis and validated the expression of several differentially expressed genes between the two regions.
We used the Robo intracellular domain (ICD), a region considered a constitutively active form of the Robo receptor, and identified the protein interactors that bind it. Following that, we demonstrated Robo ICD localization to the nucleus. We discovered that the conserved cytoplasmic domains of Robo play an important role in Robo ICD nucleocytoplasmic localization and in the induction of direct neurogenesis in the mouse NCx. Next, we showed that Robo ICD localizes to chromatin and that experimental gain of function of Robo ICD causes transcriptional changes in the NCx and in vitro. Additionally, we showed that loss of function of Nup107, a nuclear pore complex (NPC) protein and one of the Robo ICD protein interactors, induces direct neurogenesis in the mouse NCx and chick lateral pallium. Taken together, our findings suggest a transcriptional role that Robo ICD exerts by binding DNA and, consequently, its conserved role in moderating direct neurogenesis.

    Characterising A/E pathogens’ Type III secretion system effector proteins

    The human diarrhoeal pathogens enteropathogenic Escherichia coli and enterohaemorrhagic E. coli, and their murine analogue Citrobacter rodentium, are a family of enteric bacteria that cause characteristic attaching and effacing lesions at the site of intestinal colonisation. Lesion formation and successful infection depend on a Type III Secretion System and its cognate effector proteins. Once translocated into the host cytosol, effectors subvert several mammalian cell processes to create and maintain an infectious niche. With the advent of high-throughput screening techniques, the rate of effector discovery has surpassed that of their biochemical investigation. This work therefore explores the function of seven effectors, three uncharacterised and four previously described. Through machine learning algorithms, two completely novel C. rodentium effectors were identified and designated NleN and NleO. While nleN encodes a truncation of a conserved effector, NleO disrupts the host cytoskeleton by binding and cleaving the Rho GTPase Rac1 at a unique scission site. A third C. rodentium effector, EspS, which has previously been demonstrated to modulate intestinal pathology, binds the host mitochondrial protein TRIAP1 and contributes to the remodelling of host lipids during infection of C57Bl/6 mice. Finally, the contribution of EspZ, TccP, EspT and NleG to the recently described pyroptotic cell death pathway in human macrophages is explored. The findings presented herein contribute to the ever-growing repertoire of effector protein knowledge and underscore the importance of studying effector synergy during infection. Open Access.

    Guidelines for the use and interpretation of assays for monitoring autophagy (4th edition)

    This work was supported by the National Institute of General Medical Sciences [GM131919]. In 2008, we published the first set of guidelines for standardizing research in autophagy. Since then, this topic has received increasing attention, and many scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Thus, it is important to formulate on a regular basis updated guidelines for monitoring autophagy in different organisms. Despite numerous reviews, there continues to be confusion regarding acceptable methods to evaluate autophagy, especially in multicellular eukaryotes. Here, we present a set of guidelines for investigators to select and interpret methods to examine autophagy and related processes, and for reviewers to provide realistic and reasonable critiques of reports that are focused on these processes. These guidelines are not meant to be a dogmatic set of rules, because the appropriateness of any assay largely depends on the question being asked and the system being used. Moreover, no individual assay is perfect for every situation, calling for the use of multiple techniques to properly monitor autophagy in each experimental setting. Finally, several core components of the autophagy machinery have been implicated in distinct autophagic processes (canonical and noncanonical autophagy), implying that genetic approaches to block autophagy should rely on targeting two or more autophagy-related genes that ideally participate in distinct steps of the pathway. Along similar lines, because multiple proteins involved in autophagy also regulate other cellular pathways including apoptosis, not all of them can be used as a specific marker for bona fide autophagic responses. Here, we critically discuss current methods of assessing autophagy and the information they can, or cannot, provide. Our ultimate goal is to encourage intellectual and technical innovation in the field. Postprint. Peer reviewed.