186 research outputs found

    Biodiversity informatics: the challenge of linking data and the role of shared identifiers

    A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that much of the information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers (such as DOIs and LSIDs), and the implementation of services that link those identifiers.
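    The link-ranking idea above can be sketched in a few lines. The graph below is a toy example with invented identifiers (not real accession numbers or specimen codes), and the function is a minimal textbook PageRank, not the review's own implementation:

    ```python
    # Minimal PageRank sketch over a directed link graph between database records.
    def pagerank(links, damping=0.85, iterations=50):
        """Compute PageRank for a dict mapping node -> list of outgoing links."""
        nodes = set(links) | {t for targets in links.values() for t in targets}
        rank = {n: 1.0 / len(nodes) for n in nodes}
        for _ in range(iterations):
            new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
            for node, targets in links.items():
                if targets:
                    share = damping * rank[node] / len(targets)
                    for t in targets:
                        new_rank[t] += share
            rank = new_rank
        return rank

    # Toy graph: a specimen record cited by two sequences and a paper.
    graph = {
        "seq:AY123456": ["specimen:MVZ-1001"],
        "seq:AY123457": ["specimen:MVZ-1001"],
        "paper:doi-1":  ["specimen:MVZ-1001", "seq:AY123456"],
        "specimen:MVZ-1001": [],
    }
    ranks = pagerank(graph)
    # The heavily linked specimen record receives the highest rank.
    ```

    The ordering (not the absolute values) is what matters for ranking search results: records that many other records link to float to the top.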

    BioGUID: resolving, discovering, and minting identifiers for biodiversity informatics

    Background: Linking together the data of interest to biodiversity researchers (including specimen records, images, taxonomic names, and DNA sequences) requires services that can mint, resolve, and discover globally unique identifiers (including, but not limited to, DOIs, HTTP URIs, and LSIDs). Results: BioGUID implements a range of services, the core ones being an OpenURL resolver for bibliographic resources and an LSID resolver. The LSID resolver supports Linked Data-friendly resolution using HTTP 303 redirects and content negotiation. Additional services include journal ISSN look-up, author name matching, and a tool to monitor the status of biodiversity data providers. Conclusion: BioGUID is available at http://bioguid.info/. Source code is available from http://code.google.com/p/bioguid/.
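    The 303-plus-content-negotiation pattern can be sketched as a plain function returning the status and Location header a resolver would send. The URL routes below are illustrative, not BioGUID's actual ones:

    ```python
    # Linked Data-style resolution: a request for an identifier is answered
    # with "303 See Other", and the redirect target depends on the client's
    # Accept header (machine-readable RDF vs. a human-readable page).
    def resolve(lsid, accept="text/html"):
        """Return (status, Location) for a 303 redirect, per content negotiation."""
        if "application/rdf+xml" in accept:
            location = f"http://bioguid.info/rdf/{lsid}"   # metadata for machines
        else:
            location = f"http://bioguid.info/page/{lsid}"  # page for humans
        return 303, location

    status, loc = resolve("urn:lsid:example.org:names:12345",
                          accept="application/rdf+xml")
    ```

    The 303 status matters: it tells the client that the identifier denotes a thing (a name, a specimen), and the redirect target is merely a document about that thing.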

    Oracle Database 10g: a platform for BLAST search and Regular Expression pattern matching in life sciences

    As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and consequently can be used as flexible platforms for the implementation of knowledgebases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disks to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and Regular Expression Searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.htm
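    The in-database pattern-matching idea can be sketched without an Oracle instance. The example below uses Python's sqlite3 as a stand-in (Oracle 10g exposes the equivalent functionality through REGEXP_LIKE); the sequences and the motif are invented:

    ```python
    # Regular-expression matching *inside* the database, so rows are filtered
    # before any data leaves the secure store. SQLite's REGEXP operator calls a
    # user-supplied regexp(pattern, value) function, registered here.
    import re
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.create_function(
        "REGEXP", 2, lambda pattern, value: re.search(pattern, value) is not None
    )
    conn.execute("CREATE TABLE proteins (id TEXT, seq TEXT)")
    conn.executemany("INSERT INTO proteins VALUES (?, ?)",
                     [("P1", "MKTAYIAKQR"), ("P2", "MNPLLILTFV")])

    # Find sequences containing a (made-up) motif K.A: lysine, any residue,
    # then alanine -- pre-filtering in SQL instead of exporting the table.
    hits = conn.execute(
        "SELECT id FROM proteins WHERE seq REGEXP ?", ("K.A",)
    ).fetchall()
    ```

    In Oracle the same filter would read `WHERE REGEXP_LIKE(seq, 'K.A')`; the point in both systems is that the pattern runs next to the data rather than in client code.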

    The NIF LinkOut Broker: A Web Resource to Facilitate Federated Data Integration using NCBI Identifiers

    This paper describes the NIF LinkOut Broker (NLB) that has been built as part of the Neuroscience Information Framework (NIF) project. The NLB is designed to coordinate the assembly of links to neuroscience information items (e.g., experimental data, knowledge bases, and software tools) that are (1) accessible via the Web, and (2) related to entries in the National Center for Biotechnology Information's (NCBI's) Entrez system. The NLB collects these links from each resource and passes them to the NCBI, which incorporates them into its Entrez LinkOut service. In this way, an Entrez user looking at a specific Entrez entry can LinkOut directly to related neuroscience information. The information stored in the NLB can also be utilized in other ways. A second approach, which is operational on a pilot basis, is for the NLB Web server to dynamically generate its own Web page of LinkOut links for each NCBI identifier in the NLB database. This approach can allow other resources (in addition to the NCBI Entrez) to LinkOut to related neuroscience information. The paper describes the current NLB system and discusses certain design issues that arose during its implementation.
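    The broker idea, collect links from many resources keyed by NCBI identifier, then serve them either as a LinkOut feed or as a per-identifier page, can be sketched as follows. The resource names, identifiers, and URLs are invented for illustration and are not the NLB's actual data:

    ```python
    # A toy link broker: resources register links against NCBI identifiers,
    # and a page of outbound links can be generated per identifier on demand.
    from collections import defaultdict

    broker = defaultdict(list)

    def register(ncbi_id, resource, url):
        """A resource announces one link related to an NCBI Entrez entry."""
        broker[ncbi_id].append((resource, url))

    register("pubmed:15260895", "NeuronDB", "https://example.org/neurondb/123")
    register("pubmed:15260895", "ModelDB", "https://example.org/modeldb/456")

    def linkout_page(ncbi_id):
        """Dynamically build a minimal HTML list of links for one identifier."""
        rows = "".join(f'<li><a href="{u}">{r}</a></li>'
                       for r, u in broker[ncbi_id])
        return f"<ul>{rows}</ul>"
    ```

    The same registry can feed both delivery paths described in the abstract: a batch export to NCBI's LinkOut service, and the pilot per-identifier Web pages.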

    Consequences of refining biological networks through detailed pathway information: From genes to proteoforms

    Biological networks can be used to model molecular processes, understand disease progression, and find new treatment strategies. This thesis investigated how refining the design of biological networks influences their structure, and how this can be used to improve the specificity of pathway analyses. First, we investigated the potential of using more detailed molecular data in current human biological pathways. We verified that there are enough proteoform annotations, i.e. information about proteins in specific post-translational states, for systematic analyses, and characterized the structure of gene-centric versus proteoform-centric network representations of pathways. Next, we enabled the programmatic search and mining of pathways using different models for biomolecules, including proteoforms. We notably designed a generic proteoform-matching algorithm enabling the flexible mapping of experimental data to the theoretical representation in reference databases. Finally, we constructed pathway-based networks using different degrees of detail in the representation of biochemical reactions. We included information overlooked in most standard network representations: small molecules, isoforms, and post-translational modifications. Structural properties such as network size, degree distribution, and connectivity in both global and local subnetworks were analysed to quantify the impact of the added molecular entities.
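    The flavor of flexible proteoform matching can be sketched as a comparison of modification sets under different strictness modes. The modes, modification names, and sites below are invented for illustration; they are not the thesis's actual algorithm:

    ```python
    # An observed proteoform is modeled as a set of (modification, site) pairs
    # on a protein; matching against a reference entry can be strict (exact
    # annotation) or permissive (observed PTMs are a subset of the reference).
    def matches(observed_ptms, reference_ptms, mode="subset"):
        """Compare two sets of (modification, site) pairs under a mode."""
        observed, reference = set(observed_ptms), set(reference_ptms)
        if mode == "strict":   # every annotated PTM observed, and nothing else
            return observed == reference
        if mode == "subset":   # every observed PTM must appear in the reference
            return observed <= reference
        raise ValueError(f"unknown mode: {mode}")

    reference = {("phospho", 15), ("acetyl", 2)}
    assert matches({("phospho", 15)}, reference, mode="subset")
    assert not matches({("phospho", 15)}, reference, mode="strict")
    ```

    The design point is that experimental data rarely carries complete modification annotation, so a useful matcher must let the caller choose how much agreement with the reference to demand.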

    nuID: a universal naming scheme of oligonucleotides for Illumina, Affymetrix, and other microarrays

    Background: Oligonucleotide probes that are sequence identical may have different identifiers between manufacturers and even between different versions of the same company's microarray; and sometimes the same identifier is reused and represents a completely different oligonucleotide, resulting in ambiguity and potentially mis-identification of the genes hybridizing to that probe. Results: We have devised a unique, non-degenerate encoding scheme that can be used as a universal representation to identify an oligonucleotide across manufacturers. We have named the encoded representation 'nuID', for nucleotide universal identifier. Inspired by the fact that the raw sequence of the oligonucleotide is the true definition of identity for a probe, the encoding algorithm uniquely and non-degenerately transforms the sequence itself into a compact identifier (a lossless compression). In addition, we added a redundancy check (checksum) to validate the integrity of the identifier. These two steps, encoding plus checksum, result in an nuID, which is a unique, non-degenerate, permanent, robust and efficient representation of the probe sequence. For commercial applications that require the sequence identity to be confidential, we have an encryption schema for nuID. We demonstrate the utility of nuIDs for the annotation of Illumina microarrays, and we believe it has universal applicability as a source-independent naming convention for oligomers. Reviewers: This article was reviewed by Itai Yanai, Rong Chen (nominated by Mark Gerstein), and Gregory Schuler (nominated by David Lipman).
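    The encode-plus-checksum idea can be sketched as below. This is a deliberately simplified illustration, two bits per base packed into a 64-character alphabet with a small invented checksum header, and differs in detail from the published nuID algorithm:

    ```python
    # Lossless probe-sequence encoding: 3 bases (2 bits each) -> 1 character,
    # with a leading header character carrying a checksum and the pad count.
    ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz0123456789_.")
    BITS = {"A": 0, "C": 1, "G": 2, "T": 3}

    def encode(seq):
        pad = (3 - len(seq) % 3) % 3           # pad to a multiple of 3 bases
        padded = seq + "A" * pad
        chars = []
        for i in range(0, len(padded), 3):
            b1, b2, b3 = (BITS[c] for c in padded[i:i + 3])
            chars.append(ALPHABET[(b1 << 4) | (b2 << 2) | b3])  # 6 bits/char
        body = "".join(chars)
        checksum = sum(ALPHABET.index(c) for c in body) % 16
        return ALPHABET[(checksum << 2) | pad] + body  # header + body

    def decode(nuid):
        header, body = ALPHABET.index(nuid[0]), nuid[1:]
        checksum, pad = header >> 2, header & 3
        if checksum != sum(ALPHABET.index(c) for c in body) % 16:
            raise ValueError("corrupted identifier")
        bases = []
        for ch in body:
            v = ALPHABET.index(ch)
            bases += ["ACGT"[(v >> 4) & 3], "ACGT"[(v >> 2) & 3], "ACGT"[v & 3]]
        return "".join(bases)[: len(bases) - pad if pad else None]
    ```

    Because the identifier is a lossless function of the sequence itself, two probes get the same identifier exactly when they have the same sequence, regardless of manufacturer, and the checksum lets a resolver reject corrupted identifiers instead of silently mis-mapping them.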

    myTea: Connecting the Web to Digital Science on the Desktop

    No full text
    Bioinformaticians regularly access the hundreds of databases and tools that are available to them on the Web. None of these tools communicate with each other, causing the scientist to copy results manually from a Web site into a spreadsheet or word processor. myGrid's Taverna has made it possible to create templates (workflows) that automatically run searches using these databases and tools, cutting what previously took days of work down to hours, and enabling the automated capture of experimental details. What is still missing in the capture process, however, is the detail of work done on that material once it moves from the Web to the desktop: if a scientist runs a process on some data, there is nothing to record why that action was taken; it is likewise not easy to publish a record of this process back to the community on the Web. In this paper, we present a novel interaction framework, built on Semantic Web technologies and grounded in usability design practice, in particular the Making Tea method. Through this work, we introduce a new model of practice designed specifically to (1) support the scientists' interactions with data from the Web to the desktop, (2) provide automatic annotation of process to capture what has previously been lost, and (3) associate provenance services automatically with that data in order to enable meaningful interrogation of the process and controlled sharing of the results.