Search CORE

9,691 research outputs found

Many-Task Computing and Blue Waters

Author: Armstrong Timothy G.
Katz Daniel S.
Wilde Michael
Wozniak Justin M.
Zhang Zhao
Publication venue
Publication date: 01/01/2012
Field of study

This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters systems, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects to middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware

arXiv.org e-Print Archive

CiteSeerX

Enabling virtual radio functions on software defined radio for future wireless networks

Author: DaSilva Luiz
Jiao Xianjun
Liu Wei
Marquez-Barja Johann
Moerman Ingrid
Pollin Sofie
Santos Joao F.
van de Belt Jonathan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

Today's wired networks have become highly flexible, thanks to the fact that an increasing number of functionalities are realized by software rather than dedicated hardware. This trend is still in its early stages for wireless networks, but it has the potential to improve the network's flexibility and resource utilization regarding both the abundant computational resources and the scarce radio spectrum resources. In this work we provide an overview of the enabling technologies for network reconfiguration, such as Network Function Virtualization, Software Defined Networking, and Software Defined Radio. We review frequently used terminology such as softwarization, virtualization, and orchestration, and how these concepts apply to wireless networks. We introduce the concept of Virtual Radio Function, and illustrate how softwarized/virtualized radio functions can be placed and initialized at runtime, allowing radio access technologies and spectrum allocation schemes to be formed dynamically. Finally we focus on embedded Software-Defined Radio as an end device, and illustrate how to realize the placement, initialization and configuration of virtual radio functions on such kind of devices

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Scalable data abstractions for distributed parallel computations

Author: Hanlon James
Hollis Simon J.
May David
Publication venue
Publication date: 03/10/2012
Field of study

The ability to express a program as a hierarchical composition of parts is an essential tool in managing the complexity of software and a key abstraction this provides is to separate the representation of data from the computation. Many current parallel programming models use a shared memory model to provide data abstraction but this doesn't scale well with large numbers of cores due to non-determinism and access latency. This paper proposes a simple programming model that allows scalable parallel programs to be expressed with distributed representations of data and it provides the programmer with the flexibility to employ shared or distributed styles of data-parallelism where applicable. It is capable of an efficient implementation, and with the provision of a small set of primitive capabilities in the hardware, it can be compiled to operate directly on the hardware, in the same way stack-based allocation operates for subroutines in sequential machines

arXiv.org e-Print Archive

Explore Bristol Research

ReSHAPE: A Framework for Dynamic Resizing and Scheduling of Homogeneous Applications in a Parallel Environment

Author: Ribbens Calvin J.
Sudarsan Rajesh
Publication venue
Publication date: 01/01/2007
Field of study

Applications in science and engineering often require huge computational resources for solving problems within a reasonable time frame. Parallel supercomputers provide the computational infrastructure for solving such problems. A traditional application scheduler running on a parallel cluster only supports static scheduling where the number of processors allocated to an application remains fixed throughout the lifetime of execution of the job. Due to the unpredictability in job arrival times and varying resource requirements, static scheduling can result in idle system resources thereby decreasing the overall system throughput. In this paper we present a prototype framework called ReSHAPE, which supports dynamic resizing of parallel MPI applications executed on distributed memory platforms. The framework includes a scheduler that supports resizing of applications, an API to enable applications to interact with the scheduler, and a library that makes resizing viable. Applications executed using the ReSHAPE scheduler framework can expand to take advantage of additional free processors or can shrink to accommodate a high priority application, without getting suspended. In our research, we have mainly focused on structured applications that have two-dimensional data arrays distributed across a two-dimensional processor grid. The resize library includes algorithms for processor selection and processor mapping. Experimental results show that the ReSHAPE framework can improve individual job turn-around time and overall system throughput.Comment: 15 pages, 10 figures, 5 tables Submitted to International Conference on Parallel Processing (ICPP'07

arXiv.org e-Print Archive

Computer Science Technical Reports @Virginia Tech

CiteSeerX

Recommended from our members

On the conditions for efficient interoperability with threads: An experience with PGAS languages using Cray communication domains

Author: Ibrahim KZ
Yelick K
Publication venue: eScholarship, University of California
Publication date: 01/01/2014
Field of study

Today's high performance systems are typically built from shared memory nodes connected by a high speed network. That architecture, combined with the trend towards less memory per core, encourages programmers to use a mixture of message passing and multithreaded programming. Unfortunately, the advantages of using threads for in-node programming are hindered by their inability to efficiently communicate between nodes. In this work, we identify some of the performance problems that arise in such hybrid programming environments and characterize conditions needed to achieve high communication performance for multiple threads: addressability of targets, separability of communication paths, and full direct reachability to targets. Using the GASNet communication layer on the Cray XC30 as our experimental platform, we show how to satisfy these conditions. We also discuss how satisfying these conditions is influenced by the communication abstraction, implementation constraints, and the interconnect messaging capabilities. To evaluate these ideas, we compare the communication performance of a thread-based node runtime to a process-based runtime. Without our GASNet extensions, thread communication is significantly slower than processes - up to 21x slower. Once the implementation is modified to address each of our conditions, the two runtimes have comparable communication performance. This allows programmers to more easily mix models like OpenMP, CILK, or pthreads with a GASNet-based model like UPC, with the associated performance, convenience and interoperability advantages that come from using threads within a node. © 2014 ACM

eScholarship - University of California

On the Benefit of Virtualization: Strategies for Flexible Server Allocation

Author: Arora Dushyant
Feldmann Anja
Schaffrath Gregor
Schmid Stefan
Publication venue
Publication date: 01/01/2010
Field of study

Virtualization technology facilitates a dynamic, demand-driven allocation and migration of servers. This paper studies how the flexibility offered by network virtualization can be used to improve Quality-of-Service parameters such as latency, while taking into account allocation costs. A generic use case is considered where both the overall demand issued for a certain service (for example, an SAP application in the cloud, or a gaming application) as well as the origins of the requests change over time (e.g., due to time zone effects or due to user mobility), and we present online and optimal offline strategies to compute the number and location of the servers implementing this service. These algorithms also allow us to study the fundamental benefits of dynamic resource allocation compared to static systems. Our simulation results confirm our expectations that the gain of flexible server allocation is particularly high in scenarios with moderate dynamics

arXiv.org e-Print Archive

CiteSeerX

MPG.PuRe