Polynomial time algorithms for multicast network code construction

Author: Chou Philip A.
Effros Michelle
Egner Sebastian
Jaggi Sidharth
Jain Kamal
Sanders Peter
Tolhuizen Ludo M. G. M.
Publication date: 01/01/2005

The famous max-flow min-cut theorem states that a source node s can send information through a network (V, E) to a sink node t at a rate determined by the min-cut separating s and t. Recently, it has been shown that this rate can also be achieved for multicasting to several sinks, provided that the intermediate nodes are allowed to re-encode the information they receive. We demonstrate examples of networks where the achievable rates obtained by coding at intermediate nodes are arbitrarily larger than if coding is not allowed. We give deterministic polynomial time algorithms and even faster randomized algorithms for designing linear codes for directed acyclic graphs with edges of unit capacity. We extend these algorithms to integer capacities and to codes that are tolerant to edge failures.
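The min-cut bound mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's code-construction algorithm: it only computes the smallest source-to-sink max-flow on the standard "butterfly" example, using a plain Edmonds-Karp routine. The names `max_flow`, `multicast_rate_bound`, and `BUTTERFLY` are hypothetical and chosen for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest residual paths."""
    # Build a residual graph that also holds reverse edges with capacity 0.
    residual = {}
    for (u, v), c in capacity.items():
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        residual[u][v] = residual[u].get(v, 0) + c
        residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity on the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Butterfly network (assumed example): unit-capacity edges,
# source "s", sinks "t1" and "t2".
BUTTERFLY = {
    ("s", "a"): 1, ("s", "b"): 1,
    ("a", "t1"): 1, ("b", "t2"): 1,
    ("a", "c"): 1, ("b", "c"): 1,
    ("c", "d"): 1,
    ("d", "t1"): 1, ("d", "t2"): 1,
}


def multicast_rate_bound(capacity, source, sinks):
    """Smallest source-to-sink min-cut over all sinks."""
    return min(max_flow(capacity, source, t) for t in sinks)
```

Calling `multicast_rate_bound(BUTTERFLY, "s", ["t1", "t2"])` gives the rate that, according to the abstract, coding at intermediate nodes can achieve for both sinks at once. In this example the single edge from `c` to `d` is the bottleneck that plain routing cannot share between the two sinks.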

CiteSeerX

Crossref

Caltech Authors

Explore Bristol Research

Recommended from our members

FutureGRID: A Program for long-term research into GRID systems architecture

Author: Crowcroft Jon
Hand SM
Harris TL
Herbert AJ
Parker Michael A
Pratt IA
Publication date: 26/06/2008

Proceedings of the 2003 UK e-Science All Hands Meeting, 31st August - 3rd September, Nottingham, UK. This is a project to carry out research into long-term GRID architecture in the University of Cambridge Computer Laboratory and the Cambridge eScience Center, with support from the Microsoft Research Laboratory, Cambridge. It is part of a larger vision for future systems architectures for public computing platforms, including both scientific GRID and commodity-level computing such as games, peer-to-peer computing, and storage services, based on work in the laboratories in recent years on massively scalable distributed systems for storage, computation, content distribution and collaboration [26].

Apollo (Cambridge)

The Chameleon Architecture for Streaming DSP Applications

Author: Burgwal Marcel D. van de
Heysters Paul M.
Hölzenspies Philip K.F.
Kokkeler André B.J.
Smit Gerard J.M.
Wolkotte Pascal T.
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2007

We focus on architectures for streaming DSP applications such as wireless baseband processing and image processing. We aim at a single generic architecture that is capable of dealing with different DSP applications. This architecture has to be energy efficient and fault tolerant. We introduce a heterogeneous tiled architecture and present the details of a domain-specific reconfigurable tile processor called Montium. This reconfigurable processor has a small footprint (1.8 mm² in a 130 nm process), is power efficient and exploits the locality of reference principle. Reconfiguring the device is very fast; for example, loading the coefficients for a 200 tap FIR filter is done within 80 clock cycles. The tiles on the tiled architecture are connected to a Network-on-Chip (NoC) via a network interface (NI). Two NoCs have been developed: a packet-switched and a circuit-switched version. Both provide two types of services: guaranteed throughput (GT) and best effort (BE). For both NoCs, estimates of power consumption are presented. The NI synchronizes data transfers, and configures and starts/stops the tile processor. For dynamically mapping applications onto the tiled architecture, we introduce a run-time mapping tool.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

University of Twente Research Information

Improving Oblivious Reconfigurable Networks with High Probability

Author: Amir Daniel
Kleinberg Robert
Shrivastav Vishal
Weatherspoon Hakim
Wilson Tegan
Publication date: 28/08/2023

Oblivious Reconfigurable Networks (ORNs) use rapidly reconfiguring switches to create a dynamic time-varying topology. Prior theoretical work on ORNs has focused on the tradeoff between maximum latency and guaranteed throughput. This work shows that by relaxing the notion of guaranteed throughput to an achievable rate with high probability, one can achieve a significant improvement in the latency/throughput tradeoff. For a fixed maximum latency, we show that almost twice the maximum possible guaranteed throughput rate can be achieved with high probability. Alternatively, for a fixed throughput value, relaxing to achievement with high probability decreases the maximum latency to almost the square root of the latency required to guarantee the throughput rate. We first give a lower bound on the best maximum latency possible given an achieved throughput rate with high probability. This is done using an LP duality style argument. We then give a family of ORN designs which achieves these tradeoffs. The connection schedule is based on the Vandermonde Basis Scheme of Amir, Wilson, Shrivastav, Weatherspoon, Kleinberg, and Agarwal, although the period and routing scheme differ significantly. We prove achievable throughput with high probability by interpreting the amount of flow on each edge as a sum of negatively associated variables, and applying a Chernoff bound. This gives us a design with maximum latency that is tight with our lower bound (up to a log factor) for almost all constant throughput values. Comment: 19 pages, 1 figure

arXiv.org e-Print Archive

Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs

Author: Assadi Sepehr
Bateni MohammadHossein
Bernstein Aaron
Mirrokni Vahab
Stein Cliff
Publication date: 27/12/2018

As massive graphs become more prevalent, there is a rapidly growing need for scalable algorithms that solve classical graph problems, such as maximum matching and minimum vertex cover, on large datasets. For massive inputs, several different computational models have been introduced, including the streaming model, the distributed communication model, and the massively parallel computation (MPC) model that is a common abstraction of MapReduce-style computation. In each model, algorithms are analyzed in terms of resources such as space used or rounds of communication needed, in addition to the more traditional approximation ratio. In this paper, we give a single unified approach that yields better approximation algorithms for matching and vertex cover in all these models. The highlights include:
* The first one pass, significantly-better-than-2-approximation for matching in random arrival streams that uses subquadratic space, namely a $(1.5+\epsilon)$-approximation streaming algorithm that uses $O(n^{1.5})$ space for constant $\epsilon > 0$.
* The first 2-round, better-than-2-approximation for matching in the MPC model that uses subquadratic space per machine, namely a $(1.5+\epsilon)$-approximation algorithm with $O(\sqrt{mn} + n)$ memory per machine for constant $\epsilon > 0$.
By building on our unified approach, we further develop parallel algorithms in the MPC model that give a $(1 + \epsilon)$-approximation to matching and an $O(1)$-approximation to vertex cover in only $O(\log\log{n})$ MPC rounds and $O(n/\mathrm{poly}\log{(n)})$ memory per machine. These results settle multiple open questions posed in the recent paper of Czumaj et al. [STOC 2018].

arXiv.org e-Print Archive

Crossref

Parallel bug-finding in concurrent programs via reduced interleaving instances

Author: Fischer Bernd
La Torre Salvatore
Nguyen Truc L
Parlato Gennaro
Schrammel Peter
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2017

Concurrency poses a major challenge for program verification, but it can also offer an opportunity to scale when subproblems can be analysed in parallel. We exploit this opportunity here and use a parametrizable code-to-code translation to generate a set of simpler program instances, each capturing a reduced set of the original program's interleavings. These instances can then be checked independently in parallel. Our approach does not depend on the tool that is chosen for the final analysis, is compatible with weak memory models, and amplifies the effectiveness of existing tools, making them find bugs faster and with fewer resources. We use Lazy-CSeq as an off-the-shelf final verifier to demonstrate that our approach is able, already with a small number of cores, to find bugs in the hardest known concurrency benchmarks in a matter of minutes, whereas other dynamic and static tools fail to do so in hours.

Southampton (e-Prints Soton)

Crossref

Archivio della Ricerca - Università di Salerno

Sussex Research Online

Achieving High Performance and High Productivity in Next Generational Parallel Programming Languages

Author: Kumar Vivek
Publication date: 01/01/2014

Processor design has turned toward parallelism and heterogeneous cores to achieve performance and energy efficiency. Developers find high-level languages attractive because they use abstraction to offer productivity and portability over hardware complexities. To achieve performance, some modern implementations of high-level languages use work-stealing scheduling for load balancing of dynamically created tasks. Work-stealing is a promising approach for effectively exploiting software parallelism on parallel hardware. A programmer who uses work-stealing explicitly identifies potential parallelism and the runtime then schedules work, keeping otherwise idle hardware busy while relieving overloaded hardware of its burden. However, work-stealing comes with substantial overheads. These overheads arise as a necessary side effect of the implementation and hamper parallel performance. In addition to runtime-imposed overheads, there is a substantial cognitive load associated with ensuring that parallel code is data-race free. This dissertation explores the overheads associated with achieving high performance parallelism in modern high-level languages. My thesis is that, by exploiting existing underlying mechanisms of managed runtimes, and by extending existing language design, high-level languages will be able to deliver productivity and parallel performance at the levels necessary for widespread uptake.
The key contributions of my thesis are: 1) a detailed analysis of the key sources of overhead associated with a work-stealing runtime, namely sequential and dynamic overheads; 2) novel techniques to reduce these overheads that use rich features of managed runtimes such as the yieldpoint mechanism, on-stack replacement, dynamic code-patching, exception handling support, and return barriers; 3) comprehensive analysis of the resulting benefits, which demonstrates that work-stealing overheads can be significantly reduced, leading to substantial performance improvements; and 4) a small set of language extensions that achieve both high performance and high productivity with minimal programmer effort. A managed runtime forms the backbone of any modern implementation of a high-level language. Managed runtimes enjoy the benefits of a long history of research and their implementations are highly optimized. My thesis demonstrates that bringing these highly optimized features together with the expressiveness of high-level languages gives further hope for achieving high performance and high productivity on modern parallel hardware.
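The scheduling policy that this abstract calls work-stealing can be sketched as a small, single-threaded simulation. This is only an illustration under simplifying assumptions (opaque tasks that spawn no children, workers stepped in round-robin, the longest queue chosen as victim). It is not the dissertation's runtime, and the name `run_work_stealing` is hypothetical.

```python
from collections import deque


def run_work_stealing(initial_tasks, num_workers):
    """Simulate workers in round-robin; return (tasks run per worker, steals)."""
    queues = [deque() for _ in range(num_workers)]
    queues[0].extend(initial_tasks)  # assumption: all work starts on worker 0
    executed = [0] * num_workers
    steals = 0
    while any(queues):
        for w in range(num_workers):
            if queues[w]:
                # The owner takes the newest task from its own queue.
                queues[w].pop()
            else:
                victims = [q for q in queues if q]
                if not victims:
                    break  # no work left anywhere
                # An idle worker takes the oldest task from the longest queue.
                max(victims, key=len).popleft()
                steals += 1
            executed[w] += 1
    return executed, steals
```

For example, `run_work_stealing(range(8), 2)` spreads the eight tasks across both workers even though they all start on worker 0. The steal counter marks the points where a real runtime would pay the coordination cost of taking work from another worker's queue.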

The Australian National University