
43 research outputs found

Ranking-aware integration and explorative search of distributed bio-data

Author: Ghisalberti G
Masseroli M
Picozzi M
Publication date: 01/01/2012

Archivio istituzionale della ricerca - Politecnico di Milano

A new bioinformatics analysis tools framework at EMBL–EBI

Author: Edgar
F. Valentin
H. McWilliam
Hull
J. Paern
Katoh
Lassmann
Lopez
M. Goujon
Notredame
Pearson
Quevillon
R. Lopez
S. Squizzato
W. Li
Publication venue: Oxford University Press

The EMBL-EBI provides access to various mainstream sequence analysis applications. These include sequence similarity search services such as BLAST, FASTA and InterProScan, and multiple sequence alignment tools such as ClustalW, T-Coffee and MUSCLE. Through the sequence similarity search services, users can search mainstream sequence databases such as EMBL-Bank and UniProt, and more than 2000 completed genomes and proteomes. We present here a new framework, aimed at both novice and expert users, that exposes novel methods of obtaining annotations and visualizing sequence analysis results through one uniform and consistent interface. These services are available over the web and via Web Services interfaces for users who require systematic access or want to interface with customized pipelines and workflows using common programming languages. The framework features novel result visualizations and integration of domain and functional predictions for protein database searches. It is available at http://www.ebi.ac.uk/Tools/sss for sequence similarity searches and at http://www.ebi.ac.uk/Tools/msa for multiple sequence alignments.

Crossref

PubMed Central

Uncovering hidden biodiversity in the Cryptophyta: New picoplanktonic clades from clone library studies at the Helgoland time series site in the southern German Bight.

Author: Medlin LK
Metfies M
Piwosz K
Publication venue: Vie et Milieu
Publication date: 01/06/2017

Cryptophyceae are an important group in marine phytoplankton, but little is known about the occurrence and distribution of individual species. Recently, with the use of molecular probes and microarray technology, it has been shown that species related to Teleaulax spp. or Chroomonas spp. (clades 4 and 6) contributed most to cryptophycean biomass in the North Sea. The single probe for clades 4 and 6 recognises members of both clades and cannot separate them. Here, we increase the genetic diversity of our investigations of cryptophycean diversity in the North Sea by sequencing 18S rRNA clone libraries made from fractionated water samples, specifically to examine the picoplanktonic fraction and to determine whether clade 4 or clade 6 was dominant among the cryptophytes. We focused on samples from the spring phytoplankton bloom in 2004 because the microarray signals were strongest at this time. Excluding chimeric sequences, we detected nine cryptophycean OTUs, seven of which fell into the Teleaulax/Plagioselmis branch, whereas two grouped with Geminigera spp. Our results indicate that these OTUs, affiliated with clade 4, may be an important component of the cryptophyte community during the spring bloom in the North Sea.

Plymouth Marine Science Electronic Archive (PlyMSEA)

The EMBL Nucleotide Sequence Database

Author: Aldebert Philippe
Althorpe Nicola
Apweiler Rolf
Baker Wendy
Baldwin Alastair
Bates Kirsty
Browne Paul
Castro Matias
Cochrane Guy
Diez Federico Garcia
Duggan Karyn
Eberhardt Ruth
Faruque Nadeem
Gamble John
Harte Nicola
Kanz Carola
Kulikova Tamara
Lin Quan
Lombard Vincent
Lopez Rodrigo
Mancuso Renato
McHale Michelle
Nardone Francesco
Silventoinen Ville
Sobhany Siamak
Stoehr Peter
Tuli Mary Ann
Tzouvara Katerina
van den Broek Alexandra
Vaughan Robert
Wu Dan
Zhu Weimin
Publication venue: Oxford University Press
Publication date: 17/12/2004

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation (TPA) and alignments. Automated procedures are provided for submissions from large-scale sequencing projects and data from the European Patent Office. New and updated data records are distributed daily and the whole EMBL Nucleotide Sequence Database is released four times a year. Access to the sequence data is provided via ftp and several WWW interfaces. With the web-based Sequence Retrieval System (SRS) it is also possible to link nucleotide data to other specialist molecular biology databases maintained at the EBI. Other tools are available for sequence similarity searching (e.g. FASTA and BLAST). Changes over the past year include the removal of the sequence length limit, the launch of the EMBLCDSs dataset, extension of the Sequence Version Archive functionality and the revision of quality rules for TPA data

Crossref

PubMed Central

Ranking-aware integration and explorative search of distributed bio-data

Author: Ghisalberti Giorgio
Masseroli Marco
Picozzi Matteo
Publication date: 13/02/2014

Open Access Repository

Analysis of the Human Kinome Using Methods Including Fold Recognition Reveals Two Novel Kinases

Author: A Bateman
AG Murzin
AS Yang
Ayelet Starr
B Rost
Bostjan Kobe
C Aoyama
C Yuan
CH Wu
DL Wheeler
DM Daigle
E Ackerstaff
E Birney
ED Scheeff
G Manning
GJ Bartlett
HM Berman
J Gough
JM Chandonia
K Glunde
Kristine M. Briedis
L Holm
N Alexandrov
N Kannan
NN Alexandrov
Philip E. Bourne
R Lopez
S Brenner
S Cheek
SE Brenner
SF Altschul
SK Hanks
WC Hon
WW Li
X Ye
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: Protein sequence similarity is a commonly used criterion for inferring the unknown function of a protein from a protein of known function. However, proteins can diverge significantly over time such that sequence similarity is difficult, if not impossible, to find. In some cases, a structural similarity remains over long evolutionary time scales and once detected can be used to predict function. Methodology/Principal Findings: Here we employed a high-throughput approach to assign structural and functional annotation to the human proteome, focusing on the collection of human protein kinases, the human kinome. We compared human protein sequences to a library of domains from known structures using WU-BLAST, PSI-BLAST, and 123D. This approach utilized both sequence comparison and fold recognition methods. The resulting set of potential protein kinases was cross-checked against previously identified human protein kinases, and analyzed for conserved kinase motifs. Conclusions/Significance: We demonstrate that our structure-based method can be used to identify both typical and atypical human protein kinases. We also identify two potentially novel kinases that contain an interesting combination o

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

RiboSubstrates: a web application addressing the cleavage specificities of ribozymes in designated genomes

Author: Bergeron Lucien Junior
Brière Francis P
Elela Sherif Abou
Lucier Jean-François
Ouellette Rodney
Perreault Jean-Pierre
Publication venue: BioMed Central
Publication date: 01/10/2006

BACKGROUND: RNA-dependent gene silencing is becoming a routine tool used in laboratories worldwide. One of the important remaining hurdles in the selection of the target sequence, if not the most important one, is the design of tools that have minimal off-target effects (i.e. that cleave only the desired sequence). Increasingly, in the current dawn of the post-genomic era, there is a heavy reliance on tools that are suitable for high-throughput functional genomics; consequently, more and more bioinformatic software is becoming available. However, to date none has been designed to satisfy the ever-increasing need for the accurate selection of targets for a specific silencing reagent. RESULTS: In order to overcome this hurdle we have developed RiboSubstrates. This integrated bioinformatic software permits the searching of a cDNA database for all potential substrates for a given ribozyme. This includes the mRNAs that perfectly match the specific requirements of a given ribozyme, as well as those including wobble base pairs and mismatches. The results generated allow rapid selection of sequences suitable as targets for RNA degradation. The current web-based RiboSubstrates version permits the identification of potential gene targets for both SOFA-HDV ribozymes and hammerhead ribozymes. Moreover, a minimal template for the search of siRNAs is also available. This flexible and reliable tool is easily adaptable for use with any RNA tool (i.e. other ribozymes, deoxyribozymes and antisense), and may use the information present in any cDNA bank. CONCLUSION: RiboSubstrates should become an essential step for all, even including "non-RNA biologists", who endeavor to develop a gene-inactivation system.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

WormBase: a comprehensive data resource for Caenorhabditis biology and genomics

Author: Antoshechkin Igor
Bastiani Carol
Bieri Tamberlyn
Blasiar Darin
Bradnam Keith
Canaran Payan
Chan Juancarlos
Chen Chao-Kung
Chen Nansheng
Chen Wen J.
Cunningham Fiona
Davis Paul
Durbin Richard
Harris Todd W.
Kenny Eimear
Kishore Ranjana
Lawson Daniel
Lee Raymond
Muller Hans-Michael
Nakamura Cecilia
Ozersky Philip
Pai Shraddha
Petcherski Andrei
Rogers Anthony
Sabo Aniko
Schwarz Erich M.
Spieth John
Stein Lincoln D.
Sternberg Paul W.
Van Auken Kimberly
Wang Qinghua
Publication venue: Oxford University Press
Publication date: 17/12/2004

WormBase (http://www.wormbase.org), the model organism database for information about Caenorhabditis elegans and related nematodes, continues to expand in breadth and depth. Over the past year, WormBase has added multiple large-scale datasets including SAGE, interactome, 3D protein structure datasets and NCBI KOGs. To accommodate this growth, the International WormBase Consortium has improved the user interface by adding new features to aid in navigation, visualization of large-scale datasets, advanced searching and data mining. Internally, we have restructured the database models to rationalize the representation of genes and to prepare the system to accept the genome sequences of three additional Caenorhabditis species over the coming year

CiteSeerX

Crossref

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

Caltech Authors

PBmice: an integrated database system of piggyBac (PB) insertional mutations and their characterizations in mice

Author: Amsterdam
Amsterdam
Austin
Auwerx
Beebe
Cary
Clark
Cooley
Ding
Dupuy
Fahrer
Fraser
Fraser
Handler
Horie
Hrab de Angelis
Ivics
J. Zhou
K. Jin
Kuromori
L. V. Sun
L. Wang
L. Ye
L. Zhu
Lopez
Lorenzen
Luo
M. Han
Mochizuki
N. Gu
Nolan
Nolan
Pargent
Rathkolb
S. Ding
Sarkar
Soewarto
Stanford
T. Xu
Thibault
Uren
Varmus
W. Yang
Wu
Wu
X. Wu
X. Xie
Y. Liu
Y. Su
Y. Zhong
Y. Zhuang
Zhang
Publication venue: Oxford University Press

The DNA transposon piggyBac (PB) is a newly established mutagen for large-scale mutagenesis in mice. We have designed and implemented an integrated database system called PBmice (PB Mutagenesis Information CEnter) for storing, retrieving and displaying the information derived from PB insertions (INSERTs) in the mouse genome. This system is centered on INSERTs, with information including their genomic locations and flanking genomic sequences, the expression levels of the hit genes, and the expression patterns of the trapped genes if a trapping vector was used. It also archives mouse phenotyping data linked to INSERTs, and allows users to conduct quick and advanced searches for genotypic and phenotypic information relevant to a particular INSERT or a set of INSERTs. Sequence-based information can be cross-referenced with other genomic databases such as Ensembl. The BLAST and GBrowse tools used in PBmice offer enhanced search and display for additional information relevant to INSERTs. The total number and genomic distribution of PB INSERTs, as well as the availability of each PB insertional LINE, can also be viewed with user-friendly interfaces. PBmice is freely available at http://www.idmshanghai.cn/PBmice or http://www.scbit.org/PBmice/

Crossref

PubMed Central

Comparative genomics of the syndecans defines an ancestral genomic context associated with matrilins in vertebrates

Author: Adams Josephine C
Chakravarti Ritu
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The syndecans are the major family of transmembrane proteoglycans in animals and are known for multiple roles in cell interactions and growth factor signalling during development, inflammatory response, wound-repair and tumorigenesis. Although syndecans have been cloned from several invertebrate and vertebrate species, the extent of conservation of the family across the animal kingdom is unknown and there are gaps in our knowledge of chordate syndecans. Here, we develop a new level of knowledge for the whole syndecan family, by combining molecular phylogeny of syndecan protein sequences with analysis of the genomic contexts of syndecan genes in multiple vertebrate organisms. RESULTS: We identified syndecan-encoding sequences in representative Cnidaria and throughout the Bilateria. The C1 and C2 regions of the cytoplasmic domain are highly conserved throughout the animal kingdom. We identified in the variable region a universally-conserved leucine residue and a tyrosine residue that is conserved throughout the Bilateria. Of all the genomes examined, only tetrapod and fish genomes encode multiple syndecans. No syndecan-1 was identified in fish. The genomic context of each vertebrate syndecan gene is syntenic between human, mouse and chicken, and this conservation clearly extends to syndecan-2 and -3 in T. nigroviridis. In addition, tetrapod syndecans were found to be encoded from paralogous chromosomal regions that also contain the four members of the matrilin family. Whereas the matrilin-3 and syndecan-1 genes are adjacent in tetrapods, this chromosomal region appears to have undergone extensive lineage-specific rearrangements in fish. CONCLUSION: Throughout the animal kingdom, syndecan extracellular domains have undergone rapid change and elements of the cytoplasmic domains have been very conserved. The four syndecan genes of vertebrates are syntenic across tetrapods, and synteny of the syndecan-2 and -3 genes is apparent between tetrapods and fish. 
In vertebrates, each of the four family members is encoded from paralogous genomic regions in which members of the matrilin family are also syntenic between tetrapods and fish. This genomic organization appears to have been set up after the divergence of urochordates (Ciona) and vertebrates. The syndecan-1 gene appears to have been lost relatively early in the fish lineage. These conclusions provide the basis for a new model of syndecan evolution in vertebrates and a new perspective for analyzing the roles of syndecans in cells and whole organisms.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central