
70,899 research outputs found

A logical approach to working with biological databases

Author: Angelopoulos Nicos
Giamas Georgios
Publication venue: CEUR

It has been argued before that Prolog is a strong candidate for research and code development in bioinformatics and computational biology. This position has been based on both the intrinsic strengths of Prolog and recent advances in its technologies. Here we strengthen the case for the deployment and penetration of Prolog into bioinformatics, by introducing biodb, a comprehensive and extensible system for working with biological data. We focus on databases that translate between biological products and product-to-product interactions, the latter of which can be visualised as graphs. This library allows easy access to high quality data in two formats: as Prolog fact files and as SQLite databases. On-demand downloading of prepacked data files in these two formats is supported in all operating system architectures as well as reconstruction from latest data files from the curated databases. The methods used to deliver the data are transparent to the user and the data are delivered in the familiar format of Prolog facts.

Online Research @ Cardiff

Semantic distillation: a method for clustering objects by their contextual specificity

Author: AN Langville
Chris Godsil and Gordon Royle
CJ Rijsbergen van
DM Cvetković
F Fouss
I Yanai
J Mercer
J Shi
JC Bezdek
K Pearson
LA Zadeh
M Belkin
M Campanino
Miklós Rédei
MLD Chiara
MW Berry
N Aronszajn
P Baldi
P Gärdenfors
R Baeza-Yates
R Fan
R Homayouni
RR Coifman
S Vishveshwara
ST Wang
Sándor Dominich
Publication date: 01/01/2007

Techniques for data mining, latent semantic analysis, contextual search of databases, etc. were developed long ago by computer scientists working on information retrieval (IR). Experimental scientists from all disciplines, having to analyse large collections of raw experimental data (astronomical, physical, biological, etc.), have developed powerful methods for their statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement, unifying the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that when formulated at an abstract level, problems from IR, from statistical data analysis, and from physical measurement theories are very similar and hence can profitably be cross-fertilised; and second, to propose a novel method of fuzzy hierarchical clustering, termed semantic distillation and strongly inspired by the theory of quantum measurement, which we developed to analyse raw data coming from various types of experiments on DNA arrays. We illustrate the method by analysing DNA array experiments and clustering the genes of the array according to their specificity.

Comment: Accepted for publication in Studies in Computational Intelligence, Springer-Verla

arXiv.org e-Print Archive

CiteSeerX

Crossref

HAL-Rennes 1

Subject benchmark statement: forensic science

Publication venue: Quality Assurance Agency for Higher Education
Publication date: 01/01/2012

Digital Education Resource Archive

Ontological theory for ontological engineering: Biomedical systems information integration

Author: Ceusters Werner
Fielding James M.
Simon Jonathan
Smith Barry
Publication date: 01/01/2004

Software application ontologies have the potential to become the keystone in state-of-the-art information management techniques. It is expected that these ontologies will support the sort of reasoning power required to navigate large and complex terminologies correctly and efficiently. Yet there is one problem in particular that continues to stand in our way. As these terminological structures increase in size and complexity, and the drive to integrate them inevitably swells, it is clear that the level of consistency required for such navigation will become correspondingly difficult to maintain. While descriptive semantic representations are certainly a necessary component of any adequate ontology-based system, so long as ontology engineers rely solely on semantic information, without a sound ontological theory informing their modeling decisions, this goal will surely remain out of reach. In this paper we describe how Language and Computing nv (L&C), along with The Institute for Formal Ontology and Medical Information Sciences (IFOMIS), are working towards developing and implementing just such a theory, combining the open software architecture of L&C’s LinkSuite™ with the philosophical rigor of IFOMIS’s Basic Formal Ontology. In this way we aim to move beyond the more or less simple controlled vocabularies that have dominated the industry to date.

PhilPapers

CiteSeerX

The representation of protein complexes in the Protein Ontology

Author: Arighi Cecilia
Blake Judith
Bult Carol
Drabkin Harold
D’Eustachio Peter
Evsikov Alexei
Natale Darren
Roberts Natalia
Ruttenberg Alan
Smith Barry
Wu Cathy
Publication date: 01/01/2011

Representing species-specific proteins and protein complexes in ontologies that are both human and machine-readable facilitates the retrieval, analysis, and interpretation of genome-scale data sets. Although existing protein-centric informatics resources provide the biomedical research community with well-curated compendia of protein sequence and structure, these resources lack formal ontological representations of the relationships among the proteins themselves. The Protein Ontology (PRO) Consortium is filling this informatics resource gap by developing ontological representations and relationships among proteins and their variants and modified forms. Because proteins are often functional only as members of stable protein complexes, the PRO Consortium, in collaboration with existing protein and pathway databases, has launched a new initiative to implement logical and consistent representation of protein complexes. We describe here how the PRO Consortium is meeting the challenge of representing species-specific protein complexes, how protein complex representation in PRO supports annotation of protein complexes and comparative biology, and how PRO is being integrated into existing community bioinformatics resources. The PRO resource is accessible at http://pir.georgetown.edu/pro/

PhilPapers

BioCloud Search EnGene: Surfing Biological Data on the Cloud

Author: Dessi Nicoletta
Milia Gabriele
Pascariello E
Pes Barbara
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

The massive production and spread of biomedical data around the web introduces new challenges related to identifying computational approaches for providing quality search and browsing of web resources. This paper presents BioCloud Search EnGene (BSE), a cloud application that facilitates searching and integration of the many layers of biological information offered by public large-scale genomic repositories. Grounded in the concept of a dataspace, BSE is built on top of a cloud platform that severely curtails issues associated with scalability and performance. Like popular online gene portals, BSE adopts a gene-centric approach: researchers can find their information of interest by means of a simple “Google-like” query interface that accepts standard gene identifiers as keywords. We present the BSE architecture and functionality and discuss how our strategies contribute to successfully tackling big data problems in querying gene-based web resources. BSE is publicly available at: http://biocloud-unica.appspot.com/

Archivio istituzionale della ricerca - Università di Cagliari

Infectious Disease Ontology

Technological developments have resulted in tremendous increases in the volume and diversity of the data and information that must be processed in the course of biomedical and clinical research and practice. Researchers are at the same time under ever greater pressure to share data and to take steps to ensure that data resources are interoperable. The use of ontologies to annotate data has proven successful in supporting these goals and in providing new possibilities for the automated processing of data and information. In this chapter, we describe different types of vocabulary resources and emphasize those features of formal ontologies that make them most useful for computational applications. We describe current uses of ontologies and discuss future goals for ontology-based computing, focusing on its use in the field of infectious diseases. We review the largest and most widely used vocabulary resources relevant to the study of infectious diseases and conclude with a description of the Infectious Disease Ontology (IDO) suite of interoperable ontology modules that together cover the entire infectious disease domain

PhilPapers

CiteSeerX

Crossref

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: SAGE Publications
Publication date: 01/07/2003

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work

Crossref

White Rose Research Online

Ontology-based knowledge representation of experiment metadata in biological data mining

Author: Burke Squires
Carl Dahlke
Hagler Herb
Jamie Lee
Jeff Wiser
Jennifer Cai
Karp David
Megan Kong
Patrick Dunn
Richard Scheuermann
Smith Barry
Yu Qian
Publication date: 01/01/2009

According to the PubMed resource from the U.S. National Library of Medicine, over 750,000 scientific articles were published in the ~5000 biomedical journals worldwide in 2007 alone. The vast majority of these publications include results from hypothesis-driven experimentation in overlapping biomedical research domains. Unfortunately, the sheer volume of information being generated by the biomedical research enterprise has made it virtually impossible for investigators to stay aware of the latest findings in their domain of interest, let alone to assimilate and mine data from related investigations for purposes of meta-analysis. While computers have the potential to assist investigators in the extraction, management and analysis of these data, the information contained in the traditional journal publication still consists largely of unstructured, free-text descriptions of study design, experimental application and results interpretation, making it difficult for computers to gain access to the content of what is being conveyed without significant manual intervention. In order to circumvent these roadblocks and make the most of the output from the biomedical research enterprise, a variety of related standards in knowledge representation are being developed, proposed and adopted in the biomedical community. In this chapter, we will explore the current status of efforts to develop minimum information standards for the representation of a biomedical experiment, ontologies composed of shared vocabularies assembled into subsumption hierarchical structures, and extensible relational data models that link the information components together in a machine-readable and human-useable framework for data mining purposes.

PhilPapers

Subject benchmark statement: forensic science: draft for consultation

Publication venue: Quality Assurance Agency for Higher Education
Publication date: 01/01/2012

Digital Education Resource Archive