Astrolabe: Curating, Linking and Computing Astronomy's Dark Data
Where appropriate repositories are not available to support all relevant
astronomical data products, data can fall into darkness: unseen and unavailable
for future reference and re-use. Some data in this category are legacy or old
data, but newer datasets are also often uncurated and could remain "dark". This
paper provides a description of the design motivation and development of
Astrolabe, a cyberinfrastructure project that addresses a set of community
recommendations for locating and ensuring the long-term curation of dark or
otherwise at-risk data and integrated computing. This paper also describes the
outcomes of the series of community workshops that informed creation of
Astrolabe. According to participants in these workshops, much astronomical dark
data currently exist that are not curated elsewhere, as well as software that
can only be executed by a few individuals and therefore becomes unusable
because of changes in computing platforms. Astronomical research questions and
challenges would be better addressed with integrated data and computational
resources that fall outside the scope of existing observatory and space mission
projects. As a solution, the design of the Astrolabe system is aimed at
developing new resources for management of astronomical data. The project is
based in CyVerse cyberinfrastructure technology and is a collaboration between
the University of Arizona and the American Astronomical Society. Overall the
project aims to support open access to research data by leveraging existing
cyberinfrastructure resources and promoting scientific discovery by making
potentially-useful data in a computable format broadly available to the
astronomical community.
Comment: Accepted for publication in the Astrophysical Journal Supplement
Series, 22 pages, 2 figures
Reliable scientific service compositions
Distributed service-oriented architectures (SOAs) are increasingly used by users who are insufficiently skilled in the art of distributed system programming. A good example is computational scientists who build large-scale distributed systems using service-oriented Grid computing infrastructures. Computational scientists use these infrastructures to build scientific applications, which are composed from basic Web services into larger orchestrations using workflow languages such as the Business Process Execution Language (BPEL). For these users, reliability of the infrastructure is of significant importance, and it has to be provided in the presence of hardware or operational failures. The primitives available to achieve such reliability currently leave much to be desired by users who do not necessarily have a strong education in distributed system construction. We characterise scientific service compositions and the environment they operate in by introducing the notion of global scientific BPEL workflows. We outline the threats to the reliability of such workflows and discuss the limited support that available specifications and mechanisms provide to achieve reliability. Furthermore, we propose a line of research to address the identified issues by investigating autonomic mechanisms that assist computational scientists in building, executing and maintaining reliable workflows.
The state of peer-to-peer network simulators
Networking research often relies on simulation to test and evaluate new ideas. An important requirement of this process is that results must be reproducible so that other researchers can replicate, validate and extend existing work. We look at the landscape of simulators for research in peer-to-peer (P2P) networks by conducting a survey of a combined total of over 280 papers from before and after 2007 (the year of the last survey in this area), and comment on the large quantity of research using bespoke, closed-source simulators. We propose a set of criteria that P2P simulators should meet, and poll the P2P research community for their agreement. We aim to drive the community towards performing their experiments on simulators that allow others to validate their results.
Formal analysis techniques for gossiping protocols
We give a survey of formal verification techniques that can be used to corroborate existing experimental results for gossiping protocols in a rigorous manner. We present properties of interest for gossiping protocols and discuss how various formal evaluation techniques can be employed to predict them.
Exploring heterogeneity of unreliable machines for p2p backup
P2P architecture is a viable option for enterprise backup. In contrast to
dedicated backup servers, nowadays a standard solution, making backups directly
on an organization's workstations should be cheaper (as existing hardware is
used), more efficient (as there is no single bottleneck server) and more
reliable (as the machines are geographically dispersed).
We present the architecture of a p2p backup system that uses pairwise
replication contracts between a data owner and a replicator. In contrast to
standard p2p storage systems that use a DHT directly, the contracts allow our
system to optimize replicas' placement depending on a specific optimization
strategy, and so to take advantage of the heterogeneity of the machines and the
network. Such optimization is particularly appealing in the context of backup:
replicas can be geographically dispersed, the load sent over the network can be
minimized, or the optimization goal can be to minimize the backup/restore time.
However, managing the contracts, keeping them consistent and adjusting them in
response to a dynamically changing environment is challenging.
We built a scientific prototype and ran the experiments on 150 workstations
in the university's computer laboratories and, separately, on 50 PlanetLab
nodes. We found that the main factor affecting the quality of the system is
the availability of the machines. Yet, our main conclusion is that it is
possible to build an efficient and reliable backup system on highly unreliable
machines (our computers had just 13% average availability).