7,411 research outputs found

BriskStream: Scaling Data Stream Processing on Shared-Memory Multicore Architectures

Author: He Bingsheng
He Jiong
Zhang Shuhao
Zhou Amelie Chi
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 07/04/2019

We introduce BriskStream, an in-memory data stream processing system (DSPS) specifically designed for modern shared-memory multicore architectures. BriskStream's key contribution is an execution plan optimization paradigm, namely RLAS, which takes the relative location (i.e., NUMA distance) of each pair of producer-consumer operators into consideration. We propose a branch-and-bound-based approach with three heuristics to resolve the resulting nontrivial optimization problem. The experimental evaluations demonstrate that BriskStream yields much higher throughput and better scalability than existing DSPSs on multi-core architectures when processing different types of workloads. Comment: To appear in SIGMOD'1

arXiv.org e-Print Archive

Crossref

ScholarBank@NUS

Strategies for dynamic appointment making by container terminals

Author: Douma Albert
Mes Martijn
Publication venue: University of Twente, BETA Research School for Operations Management and Logistics
Publication date: 01/01/2012

We consider a container terminal that has to make appointments with barges dynamically, in real time, and partly automatically. The challenge for the terminal is to make appointments with only limited knowledge about future arriving barges, and in view of uncertainty and disturbances, such as uncertain arrival and handling times, as well as cancellations and no-shows. We illustrate this problem using an innovative implementation project which is currently running in the Port of Rotterdam. This project aims to align barge rotations and terminal quay schedules by means of a multi-agent system. In this paper, we take the perspective of a single terminal that will participate in this planning system, and focus on the decision-making capabilities of its intelligent agent. We focus on the question of how the terminal operator can optimize, on an operational level, the utilization of its quay resources, while making reliable appointments with barges, i.e., with a guaranteed departure time. We explore two approaches: (i) an analytical approach based on the value of having certain intervals within the schedule and (ii) an approach based on sources of flexibility that are naturally available to the terminal. We use simulation to gain insight into the benefits of these approaches. We conclude that a major increase in utilization degree could be achieved only by deploying the sources of flexibility, without harming the waiting time of barges too much.

University of Twente Research Information

The HdpH DSLs for scalable reliable computation

Author: Maier Patrick
Stewart Robert
Trinder Phil
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014

The statelessness of functional computations facilitates both parallelism and fault recovery. Faults and non-uniform communication topologies are key challenges for emergent large-scale parallel architectures. We report on HdpH and HdpH-RS, a pair of Haskell DSLs designed to address these challenges for irregular task-parallel computations on large distributed-memory architectures. Both DSLs share an API combining explicit task placement with sophisticated work stealing. HdpH focuses on scalability by making placement and stealing topology-aware, whereas HdpH-RS delivers reliability by means of fault-tolerant work stealing. We present operational semantics for both DSLs and investigate conditions for semantic equivalence of HdpH and HdpH-RS programs, that is, conditions under which topology awareness can be transparently traded for fault tolerance. We detail how the DSL implementations realise topology awareness and fault tolerance. We report an initial evaluation of scalability and fault tolerance on a 256-core cluster and on up to 32K cores of an HPC platform.

Heriot Watt Pure

Crossref

Stirling Online Research Repository (RIOXX)

Sheffield Hallam University Research Archive

Enlighten

Stirling Online Research Repository

Data Replication and Its Alignment with Fault Management in the Cloud Environment

Author: Xie Fei
Publication venue: School of Computing and Information Technology
Publication date: 01/01/2021

Exponential data growth has become one of the major challenges worldwide. It may cause a series of negative impacts such as network overloading, high system complexity, and inadequate data security. Cloud computing was developed as a novel paradigm to alleviate massive data processing challenges with its on-demand services and distributed architecture. Data replication has been proposed to strategically distribute the data access load by creating multiple data copies at multiple cloud data centres. A cloud environment that applies replicas not only achieves a decrease in response time, an increase in data availability, and a more balanced resource load but also protects the cloud environment against upcoming faults. A reactive fault tolerance strategy is also required to handle faults that have already occurred. As a result, data replication strategies should be aligned with reactive fault tolerance strategies to achieve a complete management chain in the cloud environment. In this thesis, a data replication and fault management framework is proposed to establish decentralised, overarching management of the cloud environment. Three data replication strategies are first proposed based on this framework. A replica creation strategy is proposed to reduce the total cost by jointly considering data dependency and access frequency in the replica creation decision-making process. In addition, a cloud-map-oriented and cost-efficiency-driven replica creation strategy is proposed to achieve the optimal cost reduction per replica in the cloud environment. The local and remote data relationships are further analysed by creating two novel data dependency types, Within-DataCentre Data Dependency and Between-DataCentre Data Dependency, according to the data location.
Furthermore, a network-performance-based replica selection strategy is proposed to avoid potential network overloading problems and, at the same time, to increase the number of concurrently running instances.

Research Online

Introducing distributed dynamic data-intensive (D3) science: Understanding applications and infrastructure

Author: Chue Hong Neil
Jha Shantenu
Katz Daniel S.
Luckow Andre
Rana Omer
Simmhan Yogesh
Publication venue: 'Wiley'
Publication date: 12/09/2016

A common feature across many science and engineering applications is the amount and diversity of data and computation that must be integrated to yield insights. Data sets are growing larger and becoming distributed, and their location, availability and properties are often time-dependent. Collectively, these characteristics give rise to dynamic distributed data-intensive applications. While "static" data applications have received significant attention, the characteristics, requirements, and software systems for the analysis of large volumes of dynamic, distributed data, and data-intensive applications have received relatively less attention. This paper surveys several representative dynamic distributed data-intensive application scenarios, provides a common conceptual framework to understand them, and examines the infrastructure used in support of applications. Comment: 38 pages, 2 figures

arXiv.org e-Print Archive

Online Research @ Cardiff

Edinburgh Research Explorer

Open Access Repository of IISc Research Publications

Helping Poor Working Parents Get Ahead: Federal Funds for New State Strategies and Systems

Author: Harry J. Holzer
Karin Martinson
Publication venue: Urban Institute
Publication date: 07/07/2008

Examines the cost-effectiveness of state job advancement systems and outlines a proposal for federally funding programs that provide more education and training, greater access to better-paying jobs, and more robust financial incentives and supports

IssueLab