629 research outputs found

The ProDom database of protein domain families: more emphasis on 3D

Author: Beausse Yoann
Bru Catherine
Carrère Sébastien
Courcelle Emmanuel
Dalmar Sandrine
Kahn Daniel
Publication venue: Oxford University Press
Publication date: 17/12/2004

ProDom is a comprehensive database of protein domain families generated from the global comparison of all available protein sequences. Recent improvements include the use of three-dimensional (3D) information from the SCOP database; a completely redesigned web interface (http://www.toulouse.inra.fr/prodom.html); visualization of ProDom domains on 3D structures; coupling of ProDom analysis with the Geno3D homology modelling server; and Bayesian inference of evolutionary scenarios for ProDom families. In addition, we have developed ProDom-SG, a ProDom-based server dedicated to the selection of candidate proteins for structural genomics.

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

MACHOS: Markov clusters of homologous subsequences

Author: Altschul
Apic
Bateman
Benson
Berman
Birney
Bork
Bru
Carninci
Dorit
Enright
Finn
Fitch
Gracy
Hall
Harlow
Heger
Holm
Huang
John
Jones
Koonin
Krause
Kriventseva
Kunin
Lankester
Lund
Margoliash
Mark A. Ragan
Owen
Pearson
Price
Richardson
Servant
Simon Wong
Smith
The Uniprot Consortium
van Dongen
Yona
Yona
Zuckerkandl
Zuckerkandl
Publication venue: Oxford University Press
Publication date: 01/01/2008

Motivation: The classification of proteins into homologous groups (families) allows their structure and function to be analysed and compared in an evolutionary context. The modular nature of eukaryotic proteins presents a considerable challenge to the delineation of families, as different local regions within a single protein may share common ancestry with distinct, even mutually exclusive, sets of homologs, thereby creating an intricate web of homologous relationships if full-length sequences are taken as the unit of evolution. We attempt to disentangle this web by developing a fully automated pipeline to delineate protein subsequences that represent sensible units for homology inference, and clustering them into putatively homologous families using the Markov clustering algorithm.

CiteSeerX

Crossref

PubMed Central

University of Queensland eSpace

A distributed computation of Interpro Pfam, PROSITE and ProDom for protein annotation

Author: BIOFOCO Network
Costa Marcos Mota
Lopes Irving R. M.
Melo Alba Cristina Magalhães Alves de
Ribeiro Edward de Oliveira
Ribeiro Victor B. R.
Walter Maria Emília Machado Telles
Zerlotini Gustavo G.
Publication venue: FUNPEC-RP
Publication date: 01/01/2005

InterPro is a widely used tool for protein annotation in genome sequencing projects, but it demands a large amount of computation and is a very time-consuming step. We present a strategy for executing the InterPro programs that use the Pfam, PROSITE and ProDom databases in a distributed environment built on a Java-based messaging system. We developed a two-layer scheduling architecture for the distributed infrastructure, then ran experiments and analyzed the results. Our distributed system performed much better than InterPro Pfam, PROSITE and ProDom running on a centralized platform. This approach seems appropriate and promising for computationally demanding tools used in biological applications.

CiteSeerX

Repository Open Access to Scientific Information from Embrapa

Repositório Institucional da Universidade de Brasília

Draft genomes of two Artocarpus plants, jackfruit (A. heterophyllus) and breadfruit (A. altilis)

Author: Featherston Jonathan
Hendre Prasad S.
Jamnadass Ramni
Jiang Sanjie
Kao Shu-Min
Kariba Robert
Liu Huan
Liu Min
Liu Xin
Muchugi Alice
Muthemba Samuel
Sahu Sunil Kumar
Song Bo
Van de Peer Yves
Van Deynze Allen
Xu Xun
Yang Huanming
Yssel Anna
Zerega Nyree J. C.
Publication venue: MDPI AG
Publication date: 24/12/2019

Two of the most economically important plants in the Artocarpus genus are jackfruit (A. heterophyllus Lam.) and breadfruit (A. altilis (Parkinson) Fosberg). Both species are long-lived trees that have been cultivated for thousands of years in their native regions. Today they are grown throughout tropical to subtropical areas as an important source of starch and other valuable nutrients. There are hundreds of breadfruit varieties that are native to Oceania, of which the most commonly distributed types are seedless triploids. Jackfruit is likely native to the Western Ghats of India and produces one of the largest tree-borne fruit structures (reaching up to 45 kg). To date, there is limited genomic information for these two economically important species. Here, we generated 273 Gb and 227 Gb of raw data from jackfruit and breadfruit, respectively. The high-quality reads from jackfruit were assembled into 162,440 scaffolds totaling 982 Mb with 35,858 genes. Similarly, the breadfruit reads were assembled into 180,971 scaffolds totaling 833 Mb with 34,010 genes. A total of 2822 and 2034 expanded gene families were found in jackfruit and breadfruit, respectively, enriched in pathways including starch and sucrose metabolism, photosynthesis, and others. The copy numbers of several starch synthesis-related genes were found to be increased in jackfruit and breadfruit compared to closely related species, and their tissue-specific expression might explain the sugar-rich and starch-rich characteristics of both species. Overall, the publication of high-quality genomes for jackfruit and breadfruit provides information about their specific composition and the underlying genes involved in sugar and starch metabolism.

Multidisciplinary Digital Publishing Institute

Ghent University Academic Bibliography

Copenhagen University Research Information System

UPSpace at the University of Pretoria

Protein function annotation using protein domain family resources

Author: Das S
Orengo CA
Publication date: 15/01/2016

As a result of the genome sequencing and structural genomics initiatives, we have a wealth of protein sequence and structural data. However, only about 1% of these proteins have experimental functional annotations. As a result, computational approaches that can predict protein functions are essential in bridging this widening annotation gap. This article reviews the current approaches of protein function prediction using structure- and sequence-based classification of protein domain family resources, with a special focus on functional families in the CATH-Gene3D resource.

UCL Discovery

The InterPro protein families database: the classification resource after 15 years

The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 201

RERO DOC Digital Library

The InterPro protein families and domains database: 20 years on

Author: Bateman A
Blum M
Bork P
Bridge A
Chang H-Y
Chuguransky S
Finn RD
Gough J
Grego T
Haft DH
Kandasaamy S
Letunic I
Marchler-Bauer A
Mi H
Mitchell A
Natale DA
Necci M
Nuka G
Orengo CA
Pandurangan AP
Paysan-Lafosse T
Qureshi M
Raj S
Richardson L
Rivoire C
Salazar GA
Sigrist CJA
Sillitoe I
Thanki N
Thomas PD
Tosatto SCE
Williams L
Wu CH
Publication date: 06/11/2020

The InterPro database (https://www.ebi.ac.uk/interpro/) provides an integrative classification of protein sequences into families, and identifies functionally important domains and conserved sites. InterProScan is the underlying software that allows protein and nucleic acid sequences to be searched against InterPro's signatures. Signatures are predictive models which describe protein families, domains or sites, and are provided by multiple databases. InterPro combines signatures representing equivalent families, domains or sites, and provides additional information such as descriptions, literature references and Gene Ontology (GO) terms, to produce a comprehensive resource for protein classification. Founded in 1999, InterPro has become one of the most widely used resources for protein family annotation. Here, we report the status of InterPro (version 81.0) in its 20th year of operation, and its associated software, including updates to database content, the release of a new website and REST API, and performance improvements in InterProScan.

UCL Discovery

The InterPro protein families database: the classification resource after 15 years

Author: Attwood T.K.
Bateman A.
Bork P.
Chang H.Y.
Daugherty L.
Finn R.D.
Fraser M.
Gough J.
Guyot D.
Haft D.
Huang H.
Hunter S.
Kahn D.
Letunic I.
Lopez R.
McAnulla C.
McMenamin C.
Mi H.
Mitchell A.
Natale D.A.
Nuka G.
Oates M.
Orengo C.
Pesseat S.
Punta M.
Rato C.
Redaschi N.
Rivoire C.
Sangrador-Vegas A.
Scheremetjew M.
Sigrist C.J.
Sillitoe I.
Thomas P.D.
Wu C.H.
Xenarios I.
Yong S.Y.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2015

Serveur académique lausannois

The University of Manchester - Institutional Repository

Plant protein-coding gene families: emerging bioinformatics approaches

Author: Altschul
Andreeva
Attwood
Beers
Benson
Bru
Cambra
Carretero-Paulet
Chain
Chen
Cochrane
Cuff
de Lima Morais
Del Bem
Enright
Faro
Feng
Finn
Fraser
Frech
Garcia-Lorenzo
Guilfoyle
Guindon
Haft
Hunter
Kaminuma
Kersey
Klimke
Kolodziejczyk
Kotsyfakis
Lees
Leinonen
Letunic
Li
Li
Lijavetzky
Lima
Liolios
Lu
Manuel Martinez
Marchler-Bauer
Martinez
Martinez
Martinez
Mi
Moreno-Risueno
Mugford
Nikolskaya
Nissen
Paterson
Pearson
Perez-Rodriguez
Philippe
Plett
Proost
Pruitt
Rautengarten
Rawlings
Remington
Roberts
Rouard
Sigrist
Singh
Swaminathan
Takahashi
Tatusov
Tian
Tyler
UniProt_Consortium
Van de Peer
Vercammen
Wang
Yu
Publication venue: Elsevier BV
Publication date: 01/01/2011

Protein-coding gene families are sets of similar genes with a shared evolutionary origin and, generally, with similar biological functions. In plants, the size and role of gene families have been only partially addressed. However, suitable bioinformatics tools are being developed to cluster the enormous number of sequences currently available in databases. Specifically, comparative genomic databases promise to become powerful tools for gene family annotation in plant clades. In this review, I evaluate the data retrieved from various gene family databases, the ease with which they can be extracted and how useful the extracted information is.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM