A P2P Integration Architecture for Protein Resources
The availability of a direct pathway from a primary sequence (de novo or DNA derived) to macromolecular structure to biological function using computer-based tools is the ultimate goal for a protein scientist. Today's state-of-the-art protein resources and ongoing research and experiments provide the raw data that can enable protein scientists to achieve at least some steps of this goal. Thus, protein scientists are looking to broaden their benchtop research from the specific to the large body of electronic information now available. However, the burden currently falls on the scientist to manually interface with each data resource, integrate the required information, and then finally interpret the results. Their discoveries are impeded by the lack of tools that can not only bring together integrated information from several known data resources, but also weave in information as it is discovered and brought online by other research groups. We propose a novel peer-to-peer based architecture that allows protein scientists to share resources in the form of data and tools within their community, facilitating ad hoc, decentralized sharing of data. In this paper, we present an overview of this integration architecture and briefly describe the tools that are essential to this framework.
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also help to provide an easy way for new practitioners to understand
this complex area of research. Comment: 46 pages, 16 figures, Technical Report
XML-based approaches for the integration of heterogeneous bio-molecular data
Background: Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, bio-medical and bioinformatics research, but also raising new problems for their integration and computational processing. Results: In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion: XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, which makes effective integration of bioinformatics data schemes difficult. The adoption of a few semantic-rich standard formats is urgent to achieve a seamless integration of the current biological resources.
Global Grids and Software Toolkits: A Study of Four Grid Middleware Technologies
Grid is an infrastructure that involves the integrated and collaborative use
of computers, networks, databases and scientific instruments owned and managed
by multiple organizations. Grid applications often involve large amounts of
data and/or computing resources that require secure resource sharing across
organizational boundaries. This makes Grid application management and
deployment a complex undertaking. Grid middlewares provide users with seamless
computing ability and uniform access to resources in the heterogeneous Grid
environment. Several software toolkits and systems have been developed, most of
which are results of academic research projects, all over the world. This
chapter will focus on four of these middlewares: UNICORE, Globus, Legion and
Gridbus. It also presents our implementation of a resource broker for UNICORE
as this functionality was not supported in it. A comparison of these systems on
the basis of the architecture, implementation model and several other features
is included. Comment: 19 pages, 10 figures
Grid-based semantic integration of heterogeneous data resources: Implementation on a HealthGrid
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. The semantic integration of geographically distributed and heterogeneous data
resources still remains a key challenge in Grid infrastructures. Today's
mainstream Grid technologies hold the promise to meet this challenge in a
systematic manner, making data applications more scalable and manageable. The
thesis conducts a thorough investigation of the problem, the state of the art, and
the related technologies, and proposes an Architecture for Semantic Integration of
Data Sources (ASIDS) addressing the semantic heterogeneity issue. It defines a
simple mechanism for the interoperability of heterogeneous data sources in order
to extract or discover information regardless of their different semantics. The
constituent technologies of this architecture include Globus Toolkit (GT4) and
OGSA-DAI (Open Grid Services Architecture Data Access and Integration)
alongside other web services technologies such as XML (Extensible Markup
Language). To show this, the ASIDS architecture was implemented and tested in a
realistic setting by building an exemplar application prototype on a HealthGrid
(pilot implementation).
The study followed an empirical research methodology and was informed by
extensive literature surveys and a critical analysis of the relevant technologies and
their synergies. The two literature reviews, together with the analysis of the
technology background, have provided a good overview of the current Grid and
HealthGrid landscape, produced some valuable taxonomies, explored new paths
by integrating technologies, and more importantly illuminated the problem and
guided the research process towards a promising solution. Yet the primary
contribution of this research is an approach that uses contemporary Grid
technologies for integrating heterogeneous data resources that have semantically
different data fields (attributes). It has been practically demonstrated (using a
prototype HealthGrid) that discovery in semantically integrated distributed data
sources can be feasible by using mainstream Grid technologies, which have been
shown to have some significant advantages over non-Grid based approaches.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventory the impact and legal consequences of these technical advances and point out future directions of research.
Schema matching in a peer-to-peer database system
Includes bibliographical references (p. 112-118). Peer-to-peer or P2P systems are applications that allow a network of peers to share resources in a scalable and efficient manner. My research is concerned with the use of P2P systems for sharing databases. To allow data mediation between peers' databases, schema mappings need to exist; these are mappings between semantically equivalent attributes in different peers' schemas. Mappings can either be defined manually or found semi-automatically using a technique called schema matching. However, schema matching has not been used much in dynamic environments, such as P2P networks. Therefore, this thesis investigates how to enable effective semi-automated schema matching within a P2P network.