A User-centric Framework for Accessing Biological Sources and Tools

Author: E. Schallehn
J. Zhao
J.W. Ely
M. Samsonova
P. Buneman
P. Lord
R.D. Stevens
T. Etzold
Publication venue: ScholarlyCommons
Publication date: 01/01/2005

Biologists face two problems in interpreting their experiments: the integration of their data with information from multiple heterogeneous sources and data analysis with bioinformatics tools. It is difficult for scientists to choose between the numerous sources and tools without assistance. Following a thorough analysis of scientists’ needs during the querying process, we found that biologists express preferences concerning the sources to be queried and the tools to be used. Interviews also showed that the querying process itself – the strategy followed – differs between scientists. In response to these findings, we have introduced a user-centric framework that allows various querying processes to be specified. We have then developed the BioGuide system, which helps scientists to choose suitable sources and tools, find complementary information in sources, and deal with divergent data. It is generic in that each user can adapt it to provide answers that respect his/her preferences and are obtained following his/her strategies.

Crossref

ScholarlyCommons@Penn

Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.

Author: Bichutskiy Vadim Y
Brachmann Rainer K
Colman Richard
Lathrop Richard H
Publication venue: eScholarship, University of California
Publication date: 01/01/2006

Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

Directory of Open Access Journals

eScholarship - University of California

Towards Exascale Scientific Metadata Management

Author: Blanas Spyros
Byna Surendra
Publication date: 29/03/2015

Advances in technology and computing hardware are enabling scientists from all areas of science to produce massive amounts of data using large-scale simulations or observational facilities. In this era of data deluge, effective coordination between the data production and the analysis phases hinges on the availability of metadata that describe the scientific datasets. Existing workflow engines have been capturing a limited form of metadata to provide provenance information about the identity and lineage of the data. However, much of the data produced by simulations, experiments, and analyses still need to be annotated manually in an ad hoc manner by domain scientists. Systematic and transparent acquisition of rich metadata becomes a crucial prerequisite to sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and domain-agnostic metadata management infrastructure that can meet the demands of extreme-scale science is notable by its absence. To address this gap in scientific data management research and practice, we present our vision for an integrated approach that (1) automatically captures and manipulates information-rich metadata while the data is being produced or analyzed and (2) stores metadata within each dataset to permeate metadata-oblivious processes and to query metadata through established and standardized data access interfaces. We motivate the need for the proposed integrated approach using applications from plasma physics, climate modeling and neuroscience, and then discuss research challenges and possible solutions

arXiv.org e-Print Archive

eScholarship - University of California

A File System Abstraction for Sense and Respond Systems

Author: Abu-Ghazaleh Nael
Brown Geoffrey
Chiu Kenneth
Pisupati Bhanu
Tilak Sameer
Publication date: 01/01/2005

The heterogeneity and resource constraints of sense-and-respond systems pose significant challenges to system and application development. In this paper, we present a flexible, intuitive file system abstraction for organizing and managing sense-and-respond systems based on the Plan 9 design principles. A key feature of this abstraction is the ability to support multiple views of the system via filesystem namespaces. Constructed logical views present an application-specific representation of the network, thus enabling high-level programming of the network. Concurrently, structural views of the network enable resource-efficient planning and execution of tasks. We present and motivate the design using several examples, outline research challenges and our research plan to address them, and describe the current state of implementation. Comment: 6 pages, 3 figures. Workshop on End-to-End, Sense-and-Respond Systems, Applications, and Services, in conjunction with MobiSys '0

arXiv.org e-Print Archive

CiteSeerX

Linked Data - the story so far

Author: Berners-Lee Tim
Bizer Christian
Heath Tom
Publication venue: IGI Global
Publication date: 01/01/2009

The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions— the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward

Southampton (e-Prints Soton)

MAnnheim DOCument Server

From access and integration to mining of secure genomic data sets across the grid

Author: Sinnott R.O.
Publication venue: Elsevier BV
Publication date: 01/01/2007

The BRIDGES project (Biomedical Research Informatics Delivered by Grid Enabled Services), funded by the UK Department of Trade and Industry (DTI), has developed a Grid infrastructure to support cardiovascular research. This includes the provision of a compute Grid and a data Grid infrastructure with security at its heart. In this paper we focus on the BRIDGES data Grid. A primary aim of the BRIDGES data Grid is to help control the complexity of accessing and integrating a myriad of genomic data sets through simple Grid-based tools. We outline these tools and how they are delivered to end-user scientists. We also describe how these tools are to be extended in the BBSRC-funded Grid Enabled Microarray Expression Profile Search (GEMEPS) to provide a richer vocabulary of search capabilities for mining microarray data sets. As with BRIDGES, fine-grained Grid security underpins GEMEPS.

Enlighten

University of Melbourne Institutional Repository

2011 Strategic roadmap for Australian research infrastructure

Author: Department of Innovation, Industry, Science and Research
Publication venue: Science and Research

The 2011 Roadmap articulates the priority research infrastructure areas of a national scale (capability areas) to further develop Australia’s research capacity and improve innovation and research outcomes over the next five to ten years. The capability areas have been identified through considered analysis of input provided by stakeholders, in conjunction with specialist advice from Expert Working Groups. It is intended that the Strategic Framework will provide a high-level policy framework, which will include principles to guide the development of policy advice and the design of programs related to the funding of research infrastructure by the Australian Government. Roadmapping has been identified in the Strategic Framework Discussion Paper as the most appropriate prioritisation mechanism for national, collaborative research infrastructure. The strategic identification of capability areas through a consultative roadmapping process was also validated in the report of the 2010 NCRIS Evaluation. The 2011 Roadmap is primarily concerned with medium to large-scale research infrastructure. However, any requirements for landmark infrastructure (typically involving an investment in excess of $100 million over five years from the Australian Government) identified in this process will be noted. NRIC has also developed a ‘Process to identify and prioritise Australian Government landmark research infrastructure investments’, which is currently under consideration by the government as part of broader deliberations relating to research infrastructure. NRIC will have strategic oversight of the development of the 2011 Roadmap as part of its overall policy view of research infrastructure.

Analysis and Policy Observatory (APO)

Path-based systems to guide scientists in the maze of biological data sources

Author: Cohen-Boulakia Sarah
Davidson Susan B.
Froidevaux Christine
Lacroix Zoe
Vidal Maria-Esther
Publication venue: ScholarlyCommons
Publication date: 24/08/2006

Fueled by novel technologies capable of producing massive amounts of data for a single experiment, scientists are faced with an explosion of information which must be rapidly analyzed and combined with other data to form hypotheses and create knowledge. Today, numerous biological questions can be answered without entering a wet lab. Scientific protocols designed to answer these questions can be run entirely on a computer. Biological resources are often complementary, focused on different objects and reflecting various experts' points of view. Exploiting the richness and diversity of these resources is crucial for scientists. However, as the number of resources increases, scientists face the problem of selecting sources and tools when interpreting their data. In this paper, we analyze the way in which biologists express and implement scientific protocols, and we identify the requirements for a system which can guide scientists in constructing protocols to answer new biological questions. We present two such systems, BioNavigation and BioGuide, dedicated to helping scientists select resources by following suitable paths within the growing network of interconnected biological resources.

ScholarlyCommons@Penn

Current Trends and New Challenges of Databases and Web Applications for Systems Driven Biological Research

Author: Kim Do Han
Sreenivasaiah Pradeep Kumar
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2010

The dynamic and rapidly evolving nature of systems-driven research imposes special requirements on the technology, approach, design and architecture of computational infrastructure, including databases and Web applications. Several solutions have been proposed to meet these expectations, and novel methods have been developed to address the persisting problems of data integration. It is important for researchers to understand the different technologies and approaches. Having familiarized themselves with the pros and cons of the existing technologies, researchers can exploit their capabilities to the maximum potential for integrating data. In this review we discuss the architecture, design and key technologies underlying some of the prominent databases and Web applications. We describe their roles in the integration of biological data and investigate some of the emerging design concepts and computational technologies that are likely to have a key role in the future of systems-driven biomedical research.

Crossref

PubMed Central

Frontiers - Publisher Connector