
    Improved Core Genes Prediction for Constructing well-supported Phylogenetic Trees in large sets of Plant Species

    Inferring well-supported phylogenetic trees that precisely reflect the evolutionary process is a challenging task that depends entirely on how the related core genes are found. In previous computational biology studies, many similarity-based algorithms, mainly relying on the calculation of sequence alignment matrices, have been proposed to find them. In these approaches, a significantly high similarity score between two coding sequences extracted by a given annotation tool is taken to mean that they are the same gene. In a previous article, we presented a quality test approach (QTA) that improves core gene quality by combining two annotation tools (namely NCBI, a partially human-curated database, and DOGMA, an efficient annotation algorithm for chloroplasts). This method takes advantage of both sequence similarity and gene features to guarantee that the core genome contains correct and well-clustered coding sequences (i.e., genes). In this article, we show how useful such well-defined core genes are for biomolecular phylogenetic reconstruction, by investigating various subsets of core genes at various family or genus levels, leading to subtrees with strong bootstrap support that are finally merged into a well-supported supertree. Comment: 12 pages, 7 figures, IWBBIO 2015 (3rd International Work-Conference on Bioinformatics and Biomedical Engineering)

    Enabling comparative modeling of closely related genomes: Example genus Brucella

    For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstruction is far from trivial, owing to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models of closely related genomes, using fifteen strains of the genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella and were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol to improve automated annotations within the SEED database, and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support the accurate prediction of phenotypes such as pathogenicity, media requirements or type of respiration. We thank Jean Jacques Letesson, Maite Iriarte, Stephan Kohler and David O'Callaghan for their input on improving specific annotations. This project has been funded by the United States National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272200900040C, awarded to BW Sobral, and by the United States National Science Foundation under Grant MCB-1153357, awarded to CS Henry. J.P.F. acknowledges funding from [FRH/BD/70824/2010] of the FCT (Portuguese Foundation for Science and Technology) Ph.D. scholarship.

    Complex N-Glycans Influence the Spatial Arrangement of Voltage Gated Potassium Channels in Membranes of Neuronal-Derived Cells

    The intrinsic electrical properties of a neuron depend on the expression of voltage gated potassium (Kv) channel isoforms, as well as their distribution and density in the plasma membrane. Recently, we showed that N-glycosylation site occupancy of Kv3.1b modulated its placement in the cell body and neurites of a neuronal-derived cell line, B35 neuroblastoma cells. To extrapolate this mechanism to other N-glycosylated Kv channels, we evaluated the impact of N-glycosylation occupancy of Kv3.1a and Kv1.1 channels. Western blots revealed that wild type Kv3.1a and Kv1.1 α-subunits had complex and oligomannose N-glycans, respectively, and that abolishment of the N-glycosylation site(s) generated Kv proteins without N-glycans. Total internal reflection fluorescence microscopy images revealed that N-glycans of Kv3.1a contributed to its placement in the cell membrane, while N-glycans had no effect on the distribution of Kv1.1. Based on particle analysis of EGFP-Kv proteins in the adhered membrane, glycosylated forms of Kv3.1a, Kv1.1, and Kv3.1b differed in the number, size or density of Kv protein clusters in the cell membrane of neurites and cell body of B35 cells. Differences were also observed between the unglycosylated forms of the Kv proteins. Cell dissociation assays revealed that cell-cell adhesion was increased by the presence of complex N-glycans of Kv3.1a, as for Kv3.1b, whereas cell adhesion was similar in B35 cells containing the oligomannose and unglycosylated Kv1.1 subunits. Our findings provide direct evidence that N-glycans of Kv3.1 splice variants contribute to the placement of these glycoproteins in the plasma membrane of neuronal-derived cells, whereas no such contribution was found for those of Kv1.1. Further, when the cell membrane distribution of the Kv channel was modified by N-glycans, the cell-cell adhesion properties were altered. Our study demonstrates that N-glycosylation of Kv3.1a, like that of Kv3.1b, provides a mechanism for the distribution of these proteins to the cell body and outgrowths and thereby can generate different voltage-dependent conductances in these membranes.

    FLORA: a novel method to predict protein function from structure in diverse superfamilies

    Predicting protein function from structure remains an active area of interest, particularly for the structural genomics initiatives where a substantial number of structures are initially solved with little or no functional characterisation. Although global structure comparison methods can be used to transfer functional annotations, the relationship between fold and function is complex, particularly in functionally diverse superfamilies that have evolved through different secondary structure embellishments to a common structural core. The majority of prediction algorithms employ local templates built on known or predicted functional residues. Here, we present a novel method (FLORA) that automatically generates structural motifs associated with different functional sub-families (FSGs) within functionally diverse domain superfamilies. Templates are created purely on the basis of their specificity for a given FSG, and the method neither makes a prior prediction of functional sites nor assumes specific physico-chemical properties of residues. FLORA is able to accurately discriminate between homologous domains with different functions and substantially outperforms (a 2–3 fold increase in coverage at low error rates) popular structure comparison methods and a leading function prediction method. We benchmark FLORA on a large data set of enzyme superfamilies from all three major protein classes (α, β, αβ) and demonstrate the functional relevance of the motifs it identifies. We also provide novel predictions of enzymatic activity for a large number of structures solved by the Protein Structure Initiative. Overall, we show that FLORA is able to effectively detect functionally similar protein domain structures purely by using patterns of structural conservation of all residues.

    The speciation of the proteome

    Introduction: In proteomics, a paradoxical situation has developed in recent years. On the one hand, it is basic knowledge that proteins are post-translationally modified and occur in different isoforms. On the other hand, the protein expression concept disregards post-translational modifications by connecting protein names directly with function. Discussion: Optimal proteome coverage is today reached by bottom-up liquid chromatography/mass spectrometry. But quantification at the peptide level in shotgun or bottom-up approaches by liquid chromatography and mass spectrometry completely ignores that a given peptide may exist in an unmodified form and in several differently modified forms. The acceptance of the protein species concept is a basic prerequisite for meaningful quantitative analyses in functional proteomics. In discovery approaches, only top-down analyses, which separate the protein species before digestion, identification and quantification by two-dimensional gel electrophoresis or protein liquid chromatography, allow the correlation between changes of a biological situation and function. Conclusion: To obtain biologically relevant information, kinetics and systems biology have to be performed at the protein species level, which is the major challenge in proteomics today.

    Analysis of congenital disorder of glycosylation-Id in a yeast model system shows diverse site-specific under-glycosylation of glycoproteins

    Asparagine-linked glycosylation is a common post-translational modification of proteins in eukaryotes. Mutations in the human ALG3 gene cause changed levels and altered glycan structures on mature glycoproteins and are the cause of a severe congenital disorder of glycosylation (CDG-Id). Diverse glycoproteins are also under-glycosylated in Saccharomyces cerevisiae alg3 mutants. Here we analyzed site-specific glycosylation occupancy in this yeast model system using peptide-N-glycosidase F to label glycosylation sites with an asparagine-aspartate conversion that creates a new endoproteinase AspN cleavage site, followed by proteolytic digestion, and detection of peptides and glycopeptides by LC-ESI-MS/MS. We used this analytical method to identify and measure site-specific glycosylation occupancy in alg3 mutant and wild type yeast strains. We found decreased site-specific N-glycosylation occupancy in the alg3 knockout strain, preferentially at Asn-Xaa-Ser sequences located in secondary structural elements, features previously associated with poor glycosylation efficiency. Furthermore, we identified 26 glycosylation sites that had not previously been verified experimentally. Our results provide insights into the underlying mechanisms of disease in CDG-Id, and our methodology will be useful for site-specific glycosylation analysis in many model systems and clinical applications.
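
The labelling idea in the abstract above (peptide-N-glycosidase F turns an occupied asparagine into aspartate, which creates a new AspN cleavage site) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the toy sequence are hypothetical, the sequon is taken to be the usual Asn-Xaa-Ser/Thr with Xaa not Pro, and AspN is simplified to cleave only on the N-terminal side of Asp.

```python
import re

# N-glycosylation sequon: Asn-Xaa-Ser/Thr, where Xaa is any residue except Pro.
# The lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 0-based positions of Asn residues that sit in a sequon."""
    return [m.start() for m in SEQUON.finditer(protein)]


def deglycosylate(protein, occupied):
    """Mimic PNGase F: each occupied Asn position is converted to Asp (N -> D)."""
    return "".join("D" if i in occupied else aa for i, aa in enumerate(protein))


def aspn_digest(protein):
    """Simplified AspN digest: cleave on the N-terminal side of every Asp."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa == "D" and i > start:
            peptides.append(protein[start:i])
            start = i
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


# Hypothetical toy sequence with two sequons (NVS and NGT) and one non-sequon (NPT).
toy = "MKNVSAEKNPTGLNGTR"
sites = find_sequons(toy)           # candidate sites
treated = deglycosylate(toy, {2})   # assume only the first site is occupied
print(sites, aspn_digest(treated), aspn_digest(toy))
```

In this sketch an occupied site shows up as an extra cleavage product after treatment, whereas an unoccupied site leaves the peptide intact; comparing the two peptide populations is what makes a per-site occupancy estimate possible.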

    Glycan Structures Contain Information for the Spatial Arrangement of Glycoproteins in the Plasma Membrane

    Glycoconjugates at the cell surface are crucial for cells to communicate with each other and the extracellular microenvironment. While it is generally accepted that glycans are vectorial biopolymers, their information content is unclear. This report provides evidence that distinct N-glycan structures influence the spatial arrangement of two integral membrane glycoproteins, Kv3.1 and E-cadherin, at the adherent membrane, which in turn alters cellular properties. Distinct N-glycan structures were generated by heterologous expression of these glycoproteins in parental and glycosylation mutant Chinese hamster ovary cell lines. Unlike the N-linked glycans, the O-linked glycans of the mutant cell lines are similar to those of the parental cell line. Western and lectin blots of total membranes and GFP immunopurified samples, combined with glycosidase digestion reactions, were employed to verify that the glycoproteins had predominantly complex, oligomannose, and bisecting type N-glycans from Pro(-)5, Lec1, and Lec10B cell lines, respectively. Based on total internal reflection fluorescence and differential interference contrast microscopy techniques, and cellular assays of live parental and glycosylation mutant CHO cells, we propose that glycoproteins with complex, oligomannose or bisecting type N-glycans relay information for localization of glycoproteins to various regions of the plasma membrane in both a glycan-specific and protein-specific manner, and furthermore that cell-cell interactions are required for deciphering much of this information. These distinct spatial arrangements also impact cell adhesion and migration. Our findings provide direct evidence that N-glycan structures of glycoproteins contribute significantly to the information content of cells.

    Data mining for the discovery of patterns in respiratory diseases in Bogotá, Colombia

    Research work. This project applies data mining, using the K-means clustering algorithm, to build a descriptive model from the analysis of the data, with the aim of identifying possible patterns of behaviour in respiratory diseases in the city of Bogotá. The set of clusters generated with the RapidMiner tool is based on data collected over a five-year period, from 2012 to 2016, covering the number of cases associated with 184 respiratory disease diagnoses in patients aged 0 to 5 years. Contents: 1. General overview 2. Objectives 3. Justification 4. Scope 5. Reference framework 6. Methodology 7. Data sources and their variables 8. Design 9. Selection of clustering algorithms 10. Recognising patterns from the collected information 11. Conclusions 12. Future work 13. Bibliographic references 14. Appendices. Undergraduate degree: Systems Engineer
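
The K-means clustering named in the abstract above can be sketched in a few lines of plain Python. This is only an illustration of the standard Lloyd iteration, not the RapidMiner operator used in the project; the function name, the starting centroids and the (age, case count) values are invented for the example.

```python
def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on tuples of numbers, starting from given centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return labels, centroids


# Hypothetical records: (patient age in years, number of reported cases).
records = [(0, 120), (1, 110), (2, 100), (4, 30), (5, 20)]
labels, centres = kmeans(records, centroids=[(0, 120), (5, 20)])
print(labels, centres)
```

With real data the features would normally be scaled before clustering, since an unscaled case count dominates the distance, and the number of clusters and the starting centroids would have to be chosen and justified rather than fixed by hand as here.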