3,612 research outputs found
Extending and Implementing the Self-adaptive Virtual Processor for Distributed Memory Architectures
Many-core architectures of the future are likely to have distributed memory
organizations and need fine grained concurrency management to be used
effectively. The Self-adaptive Virtual Processor (SVP) is an abstract
concurrent programming model which can provide this, but the model and its
current implementations assume a single address space shared memory. We
investigate and extend SVP to handle distributed environments, and discuss a
prototype SVP implementation which transparently supports execution on
heterogeneous distributed memory clusters over TCP/IP connections, while
retaining the original SVP programming model
Model-driven Scheduling for Distributed Stream Processing Systems
Distributed Stream Processing frameworks are being commonly used with the
evolution of Internet of Things(IoT). These frameworks are designed to adapt to
the dynamic input message rate by scaling in/out.Apache Storm, originally
developed by Twitter is a widely used stream processing engine while others
includes Flink, Spark streaming. For running the streaming applications
successfully there is need to know the optimal resource requirement, as
over-estimation of resources adds extra cost.So we need some strategy to come
up with the optimal resource requirement for a given streaming application. In
this article, we propose a model-driven approach for scheduling streaming
applications that effectively utilizes a priori knowledge of the applications
to provide predictable scheduling behavior. Specifically, we use application
performance models to offer reliable estimates of the resource allocation
required. Further, this intuition also drives resource mapping, and helps
narrow the estimated and actual dataflow performance and resource utilization.
Together, this model-driven scheduling approach gives a predictable application
performance and resource utilization behavior for executing a given DSPS
application at a target input stream rate on distributed resources.Comment: 54 page
Coordination Implications of Software Coupling in Open Source Projects
The effect of software coupling on the quality of software has been studied quite widely since the seminal paper on software modularity by Parnas [1]. However, the effect of the increase in software coupling on the coordination of the developers has not been researched as much. In commercial software development environments there normally are coordination mechanisms in place to manage the coordination requirements due to software dependencies. But, in the case of Open Source software such coordination mechanisms are harder to implement, as the developers tend to rely solely on electronic means of communication. Hence, an understanding of the changing coordination requirements is essential to the management of an Open Source project. In this paper we study the effect of changes in software coupling on the coordination requirements in a case study of a popular Open Source project called JBoss
The AXIOM software layers
AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA).The Software Layers developed at the AXIOM project are explained.OmpSs provides an easy way to execute heterogeneous codes in multiple cores. People and objects will soon share the same digital network for information exchange in a world named as the age of the cyber-physical systems. The general expectation is that people and systems will interact in real-time. This poses pressure onto systems design to support increasing demands on computational power, while keeping a low power envelop. Additionally, modular scaling and easy programmability are also important to ensure these systems to become widespread. The whole set of expectations impose scientific and technological challenges that need to be properly addressed.The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this aim, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics).Peer ReviewedPostprint (author's final draft
Reconfigurable interconnects in DSM systems: a focus on context switch behavior
Recent advances in the development of reconfigurable optical interconnect technologies allow for the fabrication of low cost and run-time adaptable interconnects in large distributed shared-memory (DSM) multiprocessor machines. This can allow the use of adaptable interconnection networks that alleviate the huge bottleneck present due to the gap between the processing speed and the memory access time over the network. In this paper we have studied the scheduling of tasks by the kernel of the operating system (OS) and its influence on communication between the processing nodes of the system, focusing on the traffic generated just after a context switch. We aim to use these results as a basis to propose a potential reconfiguration of the network that could provide a significant speedup
High-Performance Distributed ML at Scale through Parameter Server Consistency Models
As Machine Learning (ML) applications increase in data size and model
complexity, practitioners turn to distributed clusters to satisfy the increased
computational and memory demands. Unfortunately, effective use of clusters for
ML requires considerable expertise in writing distributed code, while
highly-abstracted frameworks like Hadoop have not, in practice, approached the
performance seen in specialized ML implementations. The recent Parameter Server
(PS) paradigm is a middle ground between these extremes, allowing easy
conversion of single-machine parallel ML applications into distributed ones,
while maintaining high throughput through relaxed "consistency models" that
allow inconsistent parameter reads. However, due to insufficient theoretical
study, it is not clear which of these consistency models can really ensure
correct ML algorithm output; at the same time, there remain many
theoretically-motivated but undiscovered opportunities to maximize
computational throughput. Motivated by this challenge, we study both the
theoretical guarantees and empirical behavior of iterative-convergent ML
algorithms in existing PS consistency models. We then use the gleaned insights
to improve a consistency model using an "eager" PS communication mechanism, and
implement it as a new PS system that enables ML algorithms to reach their
solution more quickly.Comment: 19 pages, 2 figure
- …