54 research outputs found

pfsearchV3: a code acceleration and heuristic to search PROSITE profiles

Author: Bougueleret Lydie
Bridge Alan
Cerutti Lorenzo
Pagni Marco
Schuepbach Thierry
Xenarios Ioannis
Publication date: 02/08/2017

Summary: The PROSITE resource provides a rich and well annotated source of signatures in the form of generalized profiles that allow protein domain detection and functional annotation. One of the major limiting factors in the application of PROSITE in genome and metagenome annotation pipelines is the time required to search protein sequence databases for putative matches. We describe an improved and optimized implementation of the PROSITE search tool pfsearch that, combined with a newly developed heuristic, addresses this limitation. On a modern x86_64 hyper-threaded quad-core desktop computer, the new pfsearchV3 is two orders of magnitude faster than the original algorithm. Availability and implementation: Source code and binaries of pfsearchV3 are freely available for download at http://web.expasy.org/pftools/#pfsearchV3, implemented in C and supported on Linux. PROSITE generalized profiles including the heuristic cut-off scores are available at the same address. Contact: [email protected]

RERO DOC Digital Library

ViralZone: a knowledge resource to understand virus diversity

Author: Bairoch Amos
Bougueleret Lydie
de Castro Edouard
Hulo Chantal
Le Mercier Philippe
Masson Patrick
Xenarios Ioannis
Publication date: 02/08/2017

The molecular diversity of viruses complicates the interpretation of viral genomic and proteomic data. To make sense of viral gene functions, investigators must be familiar with the virus host range, replication cycle and virion structure. Our aim is to provide a comprehensive resource bridging together textbook knowledge with genomic and proteomic sequences. The ViralZone web resource (www.expasy.org/viralzone/) provides fact sheets on all known virus families/genera with easy access to sequence data. A selection of reference strains (RefStrain) provides annotated standards to circumvent the exponential increase of virus sequences. Moreover, ViralZone offers a complete set of detailed and accurate virion pictures.

RERO DOC Digital Library

ViralZone: recent updates to the virus knowledge resource

Author: Bitter Hans
Bougueleret Lydie
De Castro Edouard
Essioux Laurent
Gruenbaum Lore
Hulo Chantal
Le Mercier Philippe
Masson Patrick
Xenarios Ioannis
Publication date: 02/08/2017

ViralZone (http://viralzone.expasy.org) is a knowledge repository that allows users to learn about viruses including their virion structure, replication cycle and host-virus interactions. The information is divided into viral fact sheets that describe virion shape, molecular biology and epidemiology for each viral genus, with links to the corresponding annotated proteomes of UniProtKB. Each viral genus page contains detailed illustrations, text and PubMed references. This new update provides a linked view of viral molecular biology through 133 new viral ontology pages that describe common steps of viral replication cycles shared by several viral genera. This viral cell-cycle ontology is also represented in UniProtKB in the form of annotated keywords. In this way, users can navigate from the description of a replication-cycle event, to the viral genus concerned, and the associated UniProtKB protein records.

RERO DOC Digital Library

UniPathway: a resource for the exploration and annotation of metabolic pathways

Author: Axelsen Kristian B.
Bairoch Amos
Bougueleret Lydie
Bridge Alan
Coissac Eric
Coudert Elisabeth
Keller Guillaume
Morgat Anne
Viari Alain
Xenarios Ioannis
Publication date: 02/08/2017

UniPathway (http://www.unipathway.org) is a fully manually curated resource for the representation and annotation of metabolic pathways. UniPathway provides explicit representations of enzyme-catalyzed and spontaneous chemical reactions, as well as a hierarchical representation of metabolic pathways. This hierarchy uses linear subpathways as the basic building block for the assembly of larger and more complex pathways, including species-specific pathway variants. All of the pathway data in UniPathway has been extensively cross-linked to existing pathway resources such as KEGG and MetaCyc, as well as sequence resources such as the UniProt KnowledgeBase (UniProtKB), for which UniPathway provides a controlled vocabulary for pathway annotation. We introduce here the basic concepts underlying the UniPathway resource, with the aim of allowing users to fully exploit the information provided by UniPathway.

RERO DOC Digital Library

New and continuing developments at PROSITE

Author: Bougueleret Lydie
Bridge Alan
Cerutti Lorenzo
Cuche Béatrice A.
de Castro Edouard
Hulo Nicolas
Sigrist Christian J. A.
Xenarios Ioannis
Publication date: 02/08/2017

PROSITE (http://prosite.expasy.org/) consists of documentation entries describing protein domains, families and functional sites, as well as associated patterns and profiles to identify them. It is complemented by ProRule, a collection of rules, which increases the discriminatory power of these profiles and patterns by providing additional information about functionally and/or structurally critical amino acids. PROSITE signatures, together with ProRule, are used for the annotation of domains and features of UniProtKB/Swiss-Prot entries. Here, we describe recent developments that allow users to perform whole-proteome annotation as well as a number of filtering options that can be combined to perform powerful targeted searches for biological discovery. The latest version of PROSITE (release 20.85, of 30 August 2012) contains 1308 patterns, 1039 profiles and 1041 ProRule.

RERO DOC Digital Library

HAMAP in 2015: updates to the protein family classification and annotation system

Author: Auchincloss Andrea H.
Baratin Delphine
Bougueleret Lydie
Bridge Alan
Coudert Elisabeth
Cuche Béatrice A.
de Castro Edouard
Keller Guillaume
Pedruzzi Ivo
Poux Sylvain
Redaschi Nicole
Rivoire Catherine
Xenarios Ioannis
Publication date: 02/08/2017

HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.

RERO DOC Digital Library

Genetic Variations and Diseases in UniProtKB/Swiss-Prot: The Ins and Outs of Expert Manual Curation.

Author: Alan Bridge
Anne Estreicher
Arnaud Gos
Ioannis Xenarios
Jerven Bolleman
Lionel Breuza
Lydie Bougueleret
Maria Livia Famiglietti
Nicole Redaschi
Sylvain Poux
Sébastien Géhant
Publication venue: Wiley
Publication date: 01/01/2014

During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.

Crossref

Serveur académique lausannois

PubMed Central

HAMAP in 2013, new developments in the protein family classification and annotation system

Author: Auchincloss Andrea H.
Baratin Delphine
Bougueleret Lydie
Bridge Alan
Coudert Elisabeth
Cuche Béatrice A.
de Castro Edouard
Keller Guillaume
Pedruzzi Ivo
Poux Sylvain
Redaschi Nicole
Rivoire Catherine
Xenarios Ioannis
Publication date: 02/08/2017

HAMAP (High-quality Automated and Manual Annotation of Proteins, available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles.

RERO DOC Digital Library

Rhea—a manually curated resource of biochemical reactions

Author: Alcántara Rafael
Axelsen Kristian B.
Belda Eugeni
Bougueleret Lydie
Bridge Alan
Cao Hong
Coudert Elisabeth
de Matos Paula
Ennis Marcus
Morgat Anne
Owen Gareth
Steinbeck Christoph
Turner Steve
Xenarios Ioannis
Publication date: 02/08/2017

Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive resource of expert-curated biochemical reactions. Rhea provides a non-redundant set of chemical transformations for use in a broad spectrum of applications, including metabolic network reconstruction and pathway inference. Rhea includes enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list), transport reactions and spontaneously occurring reactions. Rhea reactions are described using chemical species from the Chemical Entities of Biological Interest ontology (ChEBI) and are stoichiometrically balanced for mass and charge. They are extensively manually curated with links to source literature and other public resources on metabolism including enzyme and pathway databases. This cross-referencing facilitates the mapping and reconciliation of common reactions and compounds between distinct resources, which is a common first step in the reconstruction of genome scale metabolic networks and models.

RERO DOC Digital Library

The SwissLipids knowledgebase for lipid biology

Author: Aimo Lucila
Bougueleret Lydie
Bridge Alan
David Fabrice P.A.
Gleizes Anne
Götz Lou
Hyka-Nouspikel Nevila
Kuznetsov Dmitry
Liechti Robin
Niknejad Anne
Riezman Howard
van der Goot F. Gisou
Xenarios Ioannis
Publication date: 02/08/2017

Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it. Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology: SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge. Availability: SwissLipids is freely available at http://www.swisslipids.org/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

RERO DOC Digital Library