12,826 research outputs found
Evaluating the SiteStory Transactional Web Archive With the ApacheBench Tool
Conventional Web archives are created by periodically crawling a web site and
archiving the responses from the Web server. Although easy to implement and
common deployed, this form of archiving typically misses updates and may not be
suitable for all preservation scenarios, for example a site that is required
(perhaps for records compliance) to keep a copy of all pages it has served. In
contrast, transactional archives work in conjunction with a Web server to
record all pages that have been served. Los Alamos National Laboratory has
developed SiteSory, an open-source transactional archive written in Java
solution that runs on Apache Web servers, provides a Memento compatible access
interface, and WARC file export features. We used the ApacheBench utility on a
pre-release version of to measure response time and content delivery time in
different environments and on different machines. The performance tests were
designed to determine the feasibility of SiteStory as a production-level
solution for high fidelity automatic Web archiving. We found that SiteStory
does not significantly affect content server performance when it is performing
transactional archiving. Content server performance slows from 0.076 seconds to
0.086 seconds per Web page access when the content server is under load, and
from 0.15 seconds to 0.21 seconds when the resource has many embedded and
changing resources.Comment: 13 pages, Technical Repor
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the rst six months. The project aim is to scale the Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the e ectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Event-Driven Network Model for Space Mission Optimization with High-Thrust and Low-Thrust Spacecraft
Numerous high-thrust and low-thrust space propulsion technologies have been
developed in the recent years with the goal of expanding space exploration
capabilities; however, designing and optimizing a multi-mission campaign with
both high-thrust and low-thrust propulsion options are challenging due to the
coupling between logistics mission design and trajectory evaluation.
Specifically, this computational burden arises because the deliverable mass
fraction (i.e., final-to-initial mass ratio) and time of flight for low-thrust
trajectories can can vary with the payload mass; thus, these trajectory metrics
cannot be evaluated separately from the campaign-level mission design. To
tackle this challenge, this paper develops a novel event-driven space logistics
network optimization approach using mixed-integer linear programming for space
campaign design. An example case of optimally designing a cislunar propellant
supply chain to support multiple lunar surface access missions is used to
demonstrate this new space logistics framework. The results are compared with
an existing stochastic combinatorial formulation developed for incorporating
low-thrust propulsion into space logistics design; our new approach provides
superior results in terms of cost as well as utilization of the vehicle fleet.
The event-driven space logistics network optimization method developed in this
paper can trade off cost, time, and technology in an automated manner to
optimally design space mission campaigns.Comment: 38 pages; 11 figures; Journal of Spacecraft and Rockets (Accepted);
previous version presented at the AAS/AIAA Astrodynamics Specialist
Conference, 201
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale the Erlang’s radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Maintaining consistency in distributed systems
In systems designed as assemblies of independently developed components, concurrent access to data or data structures normally arises within individual programs, and is controlled using mutual exclusion constructs, such as semaphores and monitors. Where data is persistent and/or sets of operation are related to one another, transactions or linearizability may be more appropriate. Systems that incorporate cooperative styles of distributed execution often replicate or distribute data within groups of components. In these cases, group oriented consistency properties must be maintained, and tools based on the virtual synchrony execution model greatly simplify the task confronting an application developer. All three styles of distributed computing are likely to be seen in future systems - often, within the same application. This leads us to propose an integrated approach that permits applications that use virtual synchrony with concurrent objects that respect a linearizability constraint, and vice versa. Transactional subsystems are treated as a special case of linearizability
A generic persistence model for CLP systems (and two useful implementations)
This paper describes a model of persistence in (C)LP languages and two different and practically very useful ways to implement this model in current systems. The fundamental idea is that persistence is a characteristic of certain dynamic predicates (Le., those which encapsulate
state). The main effect of declaring a predicate persistent is that the dynamic changes made to such predicates persist from one execution to the next one. After proposing a syntax for declaring persistent predicates, a simple, file-based implementation of the concept is presented and
some examples shown. An additional implementation is presented which stores persistent predicates in an external datábase. The abstraction of the concept of persistence from its implementation allows developing applications
which can store their persistent predicates alternatively in files or databases with only a few simple changes to a declaration stating the location and modality used for persistent storage. The paper presents the model, the implementation approach in both the cases of using files
and relational databases, a number of optimizations of the process (using information obtained from static global analysis and goal clustering), and performance results from an implementation of these ideas
- …