Phyloinformatics in the age of Wikipedia
This talk describes a mapping between the NCBI taxonomy database and Wikipedia. These two databases were chosen because the NCBI taxonomy contains all the taxa for which sequences are publicly available, and for many taxa Wikipedia is the first site returned in a Google search on that taxon's scientific name. The NCBI web pages for nearly 53,000 NCBI taxa now have a link to the corresponding page in Wikipedia
Gene Similarity-based Approaches for Determining Core-Genes of Chloroplasts
In computational biology and bioinformatics, understanding the evolutionary
processes among related organisms has received a lot of attention in recent
decades. However, accurate methodologies are still needed to trace the
evolution of gene content. In a previous work, two novel approaches
based on sequence similarities and gene features were proposed. More
precisely, we proposed to use gene names, sequence similarities, or both,
obtained either from NCBI or from the DOGMA annotation tool. DOGMA has the
advantage of being an up-to-date, accurate automatic tool specifically designed
for chloroplasts, whereas NCBI holds high-quality human-curated genes (together
with wrongly annotated ones). The key idea of the former proposal was to take
the best from these two tools. However, that first proposal was limited by name
variations and spelling errors on the NCBI side, leading to core trees of low
quality. In this paper, these flaws are fixed by improving the comparison of
NCBI and DOGMA results, and by relaxing constraints on gene names while adding
a stage of post-validation on gene sequences. Two stages of similarity
measures, on names and then on sequences, are thus proposed for sequence
clustering. This improves on the results that can be obtained using either NCBI
or DOGMA alone. Results obtained with this quality control test are further
investigated and compared with previously released ones, on both computational
and biological aspects, considering a set of 99 chloroplastic genomes.
Comment: 4 pages, IEEE International Conference on Bioinformatics and
Biomedicine (BIBM 2014)
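The two-stage idea in this abstract (a relaxed comparison of gene names, followed by post-validation on the sequences) can be illustrated with a minimal sketch. This is not the authors' implementation: the normalization rule, both thresholds, and the use of Python's `difflib` in place of a real sequence-alignment score are assumptions made only for illustration.

```python
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Relax name comparison: ignore case and any non-alphanumeric characters."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; a stand-in for a proper alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def same_gene(gene_a, gene_b, name_threshold=0.8, seq_threshold=0.9):
    """Each gene is a (name, sequence) pair; both thresholds are hypothetical.

    Stage 1: relaxed match on gene names.
    Stage 2: post-validation on the gene sequences.
    """
    name_a, seq_a = gene_a
    name_b, seq_b = gene_b
    if similarity(normalize_name(name_a), normalize_name(name_b)) < name_threshold:
        return False
    return similarity(seq_a, seq_b) >= seq_threshold
```

With these assumptions, `same_gene(("psbA", "ATGACTGCAATT"), ("PSB-A", "ATGACTGCAATT"))` accepts the pair despite the spelling variation, while a pair with matching names but unrelated sequences is rejected at the second stage.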
The NIF LinkOut Broker: A Web Resource to Facilitate Federated Data Integration using NCBI Identifiers
This paper describes the NIF LinkOut Broker (NLB) that has been built as part of the Neuroscience Information Framework (NIF) project. The NLB is designed to coordinate the assembly of links to neuroscience information items (e.g., experimental data, knowledge bases, and software tools) that are (1) accessible via the Web, and (2) related to entries in the National Center for Biotechnology Information’s (NCBI’s) Entrez system. The NLB collects these links from each resource and passes them to the NCBI which incorporates them into its Entrez LinkOut service. In this way, an Entrez user looking at a specific Entrez entry can LinkOut directly to related neuroscience information. The information stored in the NLB can also be utilized in other ways. A second approach, which is operational on a pilot basis, is for the NLB Web server to create dynamically its own Web page of LinkOut links for each NCBI identifier in the NLB database. This approach can allow other resources (in addition to the NCBI Entrez) to LinkOut to related neuroscience information. The paper describes the current NLB system and discusses certain design issues that arose during its implementation
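The second approach described above, in which the broker builds its own page of links for each NCBI identifier, amounts to grouping collected links by identifier. The sketch below shows one way that could look; the record layout, function names, and example URLs are hypothetical and are not taken from the NLB system itself.

```python
from collections import defaultdict


def build_linkout_index(records):
    """Group (ncbi_id, resource_name, url) records by NCBI identifier."""
    index = defaultdict(list)
    for ncbi_id, resource_name, url in records:
        index[ncbi_id].append((resource_name, url))
    return dict(index)


def render_links(index, ncbi_id):
    """Render a plain-text list of links for one identifier."""
    lines = [f"LinkOut links for {ncbi_id}:"]
    for resource_name, url in index.get(ncbi_id, []):
        lines.append(f"  {resource_name}: {url}")
    return "\n".join(lines)


# Hypothetical records; the identifiers and URLs are placeholders.
records = [
    ("12345", "ResourceA", "https://example.org/a/12345"),
    ("12345", "ResourceB", "https://example.org/b/12345"),
    ("67890", "ResourceA", "https://example.org/a/67890"),
]
index = build_linkout_index(records)
page = render_links(index, "12345")
```

Keeping the index keyed by identifier means the same data can serve both uses the abstract mentions: exporting links to NCBI and generating a per-identifier page for other resources.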
See a Need, Fill a Need —Reaching Out to the Bioinformatics Research Community at Iowa State University
This article describes my efforts in organizing the National Center for Biotechnology Information (NCBI) Field Guide workshop in March 2006 and four NCBI mini-courses in April 2007 at Iowa State University. It also includes an overview of academic libraries that are providing bioinformatics support and summarizes library involvement in hosting NCBI courses. A discussion of how hosting the NCBI courses has influenced my collection development, instruction, and liaison activities and suggestions to librarians about how to get involved with bioinformatics is also included
NCBI BLAST: a better web interface
Basic Local Alignment Search Tool (BLAST) is a sequence similarity search program. The public interface of BLAST, http://www.ncbi.nlm.nih.gov/blast, at the NCBI website has recently been reengineered to improve usability and performance. Key new features include simplified search forms, improved navigation, a list of recent BLAST results, saved search strategies and a documentation directory. Here, we describe the BLAST web application's new features, explain design decisions and outline plans for future improvement
Database resources of the National Center for Biotechnology Information
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov
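The Entrez Programming Utilities named in this abstract are queried over HTTP. As a rough sketch, the snippet below only constructs an ESearch request URL and sends nothing; the helper name and the example query are illustrative choices, and the current E-utilities documentation should be consulted for usage policies and optional parameters such as API keys.

```python
from urllib.parse import urlencode

# Base address of the ESearch utility.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for an Entrez database and a query term."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{ESEARCH_BASE}?{query}"


# Example: look up a scientific name in the Taxonomy database.
url = build_esearch_url("taxonomy", "Arabidopsis thaliana", retmax=5)
```

Fetching the URL (for instance with `urllib.request.urlopen`) returns a list of matching record identifiers, which other utilities can then use to retrieve the records themselves.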
Database resources of the National Center for Biotechnology Information
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov
Cost effective, experimentally robust differential-expression analysis for human/mammalian, pathogen and dual-species transcriptomics.
As sequencing read length has increased, researchers have quickly adopted longer reads for their experiments. Here, we examine 14 pathogen or host-pathogen differential gene expression data sets to assess whether using longer reads is warranted. A variety of data sets was used to assess what genomic attributes might affect the outcome of differential gene expression analysis including: gene density, operons, gene length, number of introns/exons and intron length. No genome attribute was found to influence the data in principal components analysis, hierarchical clustering with bootstrap support, or regression analyses of pairwise comparisons that were undertaken on the same reads, looking at all combinations of paired and unpaired reads trimmed to 36, 54, 72 and 101 bp. Read pairing had the greatest effect when there was little variation in the samples from different conditions or in their replicates (e.g. little differential gene expression). But overall, 54 and 72 bp reads were typically most similar. Given differences in costs and mapping percentages, we recommend 54 bp reads for organisms with no or few introns and 72 bp reads for all others. In a third of the data sets, read pairing had absolutely no effect, despite paired reads having twice as much data. Therefore, single-end reads seem robust for differential-expression analyses, but in eukaryotes paired-end reads are likely desired to analyse splice variants and should be preferred for data sets that are acquired with the intent to be community resources that might be used in secondary data analyses
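The comparison design described in this abstract (the same reads trimmed to several lengths, each analysed as paired or unpaired) can be enumerated in a few lines. The sketch below is illustrative only: it assumes trimming keeps the first N bases of each read, which the abstract does not state, and the function names are hypothetical.

```python
from itertools import product

# Read lengths examined in the study, in base pairs.
READ_LENGTHS = (36, 54, 72, 101)
LAYOUTS = ("single", "paired")


def trim_read(sequence: str, length: int) -> str:
    """Keep the first `length` bases (assumed trimming direction)."""
    return sequence[:length]


def comparison_conditions():
    """All (read length, layout) combinations to compare."""
    return list(product(READ_LENGTHS, LAYOUTS))


# A synthetic 101 bp read used only to demonstrate trimming.
full_read = "ACGT" * 25 + "A"
trimmed = {length: trim_read(full_read, length) for length in READ_LENGTHS}
```

Trimming the same original reads, rather than resequencing at each length, keeps the underlying data identical across conditions, so differences in the downstream analysis can be attributed to read length and pairing.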