MPWide: a light-weight library for efficient message passing over wide area networks
We present MPWide, a light-weight communication library which allows
efficient message passing over a distributed network. MPWide has been designed
to connect applications running on distributed (super)computing resources, and
to maximize the communication performance on wide area networks for those
without administrative privileges. It can be used to provide message-passing
between applications, move files, and make very fast connections in
client-server environments. MPWide has already been applied to enable
distributed cosmological simulations across up to four supercomputers on two
continents, and to couple two different bloodflow simulations to form a
multiscale simulation. Comment: accepted by the Journal of Open Research Software, 13 pages, 4
figures, 1 table
Deliverable JRA1.1: Evaluation of current network control and management planes for multi-domain network infrastructure
This deliverable includes a compilation and evaluation of available control and management architectures and protocols applicable to a multilayer infrastructure in a multi-domain Virtual Network environment. The scope of this deliverable is mainly focused on the virtualization of the resources within a network and at processing nodes. The virtualization of the FEDERICA infrastructure allows the provisioning of its available resources to users by means of FEDERICA slices. A slice is seen by the user as a real physical network under his/her domain; however, it maps to a logical partition (a virtual instance) of the physical FEDERICA resources. A slice is built to exhibit to the highest degree all the principles applicable to a physical network (isolation, reproducibility, manageability, ...). Currently, there are no standard definitions available for network virtualization or its associated architectures. Therefore, this deliverable proposes the Virtual Network layer architecture and evaluates a set of Management and Control Planes that can be used for the partitioning and virtualization of the FEDERICA network resources. This evaluation has been performed taking into account an initial set of FEDERICA requirements; a possible extension of the selected tools will be evaluated in future deliverables. The studies described in this deliverable define the virtual architecture of the FEDERICA infrastructure. During this activity, the need has been recognised to establish a new set of basic definitions (taxonomy) for the building blocks that compose the so-called slice, i.e. the virtual network instantiation (which is virtual with regard to the abstracted view made of the building blocks of the FEDERICA infrastructure) and its architectural plane representation. These definitions will be established as a common nomenclature for the FEDERICA project. User requirements are another important aspect when defining a new architecture.
It is crucial that the resulting architecture fits the demands that users may have. Since this deliverable has been produced at the same time as the contact process with users, carried out by the project activities related to the Use Case definitions, JRA1 has proposed a set of basic Use Cases to be considered as a starting point for its internal studies. When researchers want to experiment with their developments, they need not only network resources on their slices, but also a slice of the processing resources. These processing slice resources are understood as virtual machine instances that users can configure to behave as software routers or end nodes, on which to download the software protocols or applications they have produced and want to assess in a realistic environment. Hence, this deliverable also studies the APIs of several virtual machine management software products in order to identify which best suits FEDERICA's needs. Postprint (published version)
Pando: Personal Volunteer Computing in Browsers
The large penetration and continued growth in ownership of personal
electronic devices represent a freely available and largely untapped source of
computing power. To leverage it, we present Pando, a new volunteer computing
tool based on a declarative concurrent programming model and implemented using
JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying
number of failure-prone personal devices contributed by volunteers to
parallelize the application of a function on a stream of values, by using the
devices' browsers. We show that Pando can provide throughput improvements
compared to a single personal device, on a variety of compute-bound
applications including animation rendering and image processing. We also show
the flexibility of our approach by deploying Pando on personal devices
connected over a local network, on Grid5000, a France-wide computing grid in a
virtual private network, and on seven PlanetLab nodes distributed in a wide area
network over Europe. Comment: 14 pages, 12 figures, 2 tables
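The idea of applying a function to a stream of values on failure-prone devices can be illustrated with a small sketch. This is not Pando's implementation or API (Pando is written in JavaScript); it is a sequential Python simulation under the assumption that a failed device simply raises an error and its value is handed to another device. The names `map_with_reassignment`, `reliable_worker`, and `make_flaky_worker` are hypothetical.

```python
from collections import deque


def map_with_reassignment(func, values, workers, max_attempts=5):
    """Apply func to every value, reassigning a value when a worker fails.

    workers is a list of callables worker(func, value) that may raise
    RuntimeError to simulate a device dropping out (an assumption of this
    sketch). Results are returned in input order.
    """
    pending = deque((index, value, 0) for index, value in enumerate(values))
    results = {}
    turn = 0
    while pending:
        index, value, attempts = pending.popleft()
        if attempts >= max_attempts:
            raise RuntimeError(f"value at index {index} failed {attempts} times")
        # Round-robin over the currently known workers.
        worker = workers[turn % len(workers)]
        turn += 1
        try:
            results[index] = worker(func, value)
        except RuntimeError:
            # The device failed: put the value back so another worker gets it.
            pending.append((index, value, attempts + 1))
    return [results[i] for i in range(len(results))]


def reliable_worker(func, value):
    """A device that never fails."""
    return func(value)


def make_flaky_worker(fail_every):
    """Build a device that fails on every fail_every-th call."""
    calls = {"count": 0}

    def worker(func, value):
        calls["count"] += 1
        if calls["count"] % fail_every == 0:
            raise RuntimeError("simulated device failure")
        return func(value)

    return worker


def square(x):
    return x * x


if __name__ == "__main__":
    devices = [reliable_worker, make_flaky_worker(2)]
    print(map_with_reassignment(square, [1, 2, 3, 4], devices))
```

A real deployment would dispatch values concurrently and let devices join or leave at any time; the sketch only shows the reassignment logic that keeps results complete and ordered despite failures.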
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters system, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects for middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware.
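The structure described above, a graph of discrete tasks whose edges are explicit input and output dependencies, can be sketched in a few lines. This is a minimal illustration rather than any middleware discussed in the report: it uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+), and the function name `run_task_graph` and the example task names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_task_graph(tasks, dependencies):
    """Run tasks in dependency order and record which ones were ready together.

    tasks maps a task name to a callable that receives a dict holding the
    outputs of its dependencies. dependencies maps a task name to the list of
    task names whose outputs it consumes (missing entries mean no inputs).
    """
    sorter = TopologicalSorter({name: dependencies.get(name, []) for name in tasks})
    sorter.prepare()
    outputs = {}
    waves = []
    while sorter.is_active():
        # Every task returned here has all of its inputs available, so a
        # dispatcher could run the whole wave in parallel.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        for name in ready:
            inputs = {dep: outputs[dep] for dep in dependencies.get(name, [])}
            outputs[name] = tasks[name](inputs)
            sorter.done(name)
    return outputs, waves


if __name__ == "__main__":
    example_tasks = {
        "load_a": lambda inputs: 2,
        "load_b": lambda inputs: 3,
        "multiply": lambda inputs: inputs["load_a"] * inputs["load_b"],
        "report": lambda inputs: f"result={inputs['multiply']}",
    }
    example_dependencies = {
        "multiply": ["load_a", "load_b"],
        "report": ["multiply"],
    }
    outputs, waves = run_task_graph(example_tasks, example_dependencies)
    print(waves)
    print(outputs["report"])
```

The sketch runs each wave sequentially; when tasks are very short, as the report notes for MTC, the time spent in the dispatch loop itself becomes the overhead that matters.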
FiVO/QStorMan Semantic Toolkit for Supporting Data-Intensive Applications in Distributed Environments
In this paper we present a semantic-based approach for supporting data-intensive applications in distributed environments. The approach is characterized by the use of explicit definitions of non-functional quality parameters for storage systems, semantic descriptions of the available storage infrastructure, and monitoring data concerning the infrastructure workload and users' operations, along with an implementation of the approach in the form of a toolkit called FiVO/QStorMan. In particular, we describe the semantic descriptions that are exploited in the storage resource provisioning process. In addition, the paper describes the results of an experimental evaluation of the toolkit, which confirm the effectiveness of the proposed approach for storage resource provisioning.
A Toolkit For Storage QoS Provisioning For Data-Intensive Applications
This paper describes a programming toolkit developed in the PL-Grid project, named QStorMan, which supports storage QoS provisioning for data-intensive applications in distributed environments. QStorMan exploits knowledge-oriented methods for matching storage resources to non-functional requirements, which are defined for a data-intensive application. In order to support various usage scenarios, QStorMan provides two interfaces: programming libraries and a web portal. The interfaces allow the requirements to be defined either directly in the application source code or through an intuitive graphical interface. The first way provides finer granularity, e.g., each portion of data processed by an application can define a different set of requirements. The second is aimed at supporting legacy applications whose source code cannot be modified. The toolkit has been evaluated using synthetic benchmarks and the production infrastructure of PL-Grid, in particular its storage infrastructure, which utilizes the Lustre file system.
Executing Bag of Distributed Tasks on Virtually Unlimited Cloud Resources
A Bag-of-Distributed-Tasks (BoDT) application is a collection of identical
and independent tasks, each of which requires a piece of input data located
around the world. As a result, Cloud computing offers an effective way to
execute a BoDT application, as it not only consists of multiple geographically
distributed data centres but also allows a user to pay only for what she
actually uses. In this paper, we consider executing BoDT on the Cloud using
virtually unlimited cloud resources. A heuristic algorithm is proposed to find
an execution plan that takes budget constraints into account. Compared with
other approaches, with the same given budget, our algorithm is able to reduce
the overall execution time by up to 50%.
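The abstract does not describe the heuristic itself, so the following is only a generic greedy sketch of budget-constrained planning, not the authors' algorithm. It assumes every task can run in parallel on its own resource, that each task has a list of candidate placements given as `(site, hours, cost)`, and that the goal is to shrink the longest task time without exceeding the budget. The function name `plan_under_budget` and the sample data are hypothetical.

```python
def plan_under_budget(options, budget):
    """Greedily pick one placement per task to reduce the longest task time.

    options maps a task name to a list of (site, hours, cost) tuples.
    Returns (plan, makespan, total_cost); raises ValueError when even the
    cheapest placements exceed the budget.
    """
    # Start from the cheapest placement for every task.
    plan = {task: min(choices, key=lambda c: c[2]) for task, choices in options.items()}
    total = sum(choice[2] for choice in plan.values())
    if total > budget:
        raise ValueError("budget too small even for the cheapest plan")
    while True:
        # The slowest task determines the overall execution time.
        slowest = max(plan, key=lambda task: plan[task][1])
        current = plan[slowest]
        # Strictly faster placements for that task that still fit the budget.
        affordable = [
            choice
            for choice in options[slowest]
            if choice[1] < current[1] and total - current[2] + choice[2] <= budget
        ]
        if not affordable:
            break
        best = min(affordable, key=lambda c: (c[1], c[2]))
        total += best[2] - current[2]
        plan[slowest] = best
    makespan = max(choice[1] for choice in plan.values())
    return plan, makespan, total


if __name__ == "__main__":
    sample_options = {
        "t1": [("eu", 4.0, 1.0), ("us", 2.0, 3.0)],
        "t2": [("eu", 3.0, 1.0), ("us", 1.0, 4.0)],
    }
    print(plan_under_budget(sample_options, budget=5.0))
    print(plan_under_budget(sample_options, budget=7.0))
```

The loop terminates because each iteration strictly shortens one task and the number of placements is finite. Being greedy, it can stop at a plan that is not optimal, which is the usual trade-off of a heuristic against exhaustive search.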
A Cyberinfrastructure for BigData Transportation Engineering
Big Data-driven transportation engineering has the potential to improve
utilization of road infrastructure, decrease traffic fatalities, improve fuel
consumption, and decrease construction worker injuries, among other benefits.
Despite these benefits, research on Big Data-driven transportation engineering
is difficult today due to the computational expertise required to get started.
This work proposes BoaT, a transportation-specific programming language, and
its Big Data infrastructure, which are aimed at lowering this barrier to entry.
Our evaluation, which uses over two dozen research questions from six
categories, shows that research is easier to realize as a BoaT computer
program, is an order of magnitude faster when this program is run, and exhibits
a 12-14x decrease in storage requirements.