25,118 research outputs found

Transformations of High-Level Synthesis Codes for High-Performance Computing

Author: Besta Maciej
Hoefler Torsten
Licht Johannes de Fine
Meierhans Simon
Publication date: 29/10/2019

Specialized hardware architectures promise a major step in performance and energy efficiency over the traditional load/store devices currently employed in large scale computing systems. The adoption of high-level synthesis (HLS) from languages such as C/C++ and OpenCL has greatly increased programmer productivity when designing for such platforms. While this has enabled a wider audience to target specialized hardware, the optimization principles known from traditional software design are no longer sufficient to implement high-performance codes. Fast and efficient codes for reconfigurable platforms are thus still challenging to design. To alleviate this, we present a set of optimizing transformations for HLS, targeting scalable and efficient architectures for high-performance computing (HPC) applications. Our work provides a toolbox for developers, where we systematically identify classes of transformations, the characteristics of their effect on the HLS code and the resulting hardware (e.g., increases data reuse or resource consumption), and the objectives that each transformation can target (e.g., resolve interface contention, or increase parallelism). We show how these can be used to efficiently exploit pipelining, on-chip distributed fast memory, and on-chip streaming dataflow, allowing for massively parallel architectures. To quantify the effect of our transformations, we use them to optimize a set of throughput-oriented FPGA kernels, demonstrating that our enhancements are sufficient to scale up parallelism within the hardware constraints. With the transformations covered, we hope to establish a common framework for performance engineers, compiler developers, and hardware developers, to tap into the performance potential offered by specialized hardware architectures using HLS.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Towards Tight Bounds for the Streaming Set Cover Problem

Author: Har-Peled Sariel
Indyk Piotr
Mahabadi Sepideh
Vakilian Ali
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 02/05/2016

We consider the classic Set Cover problem in the data stream model. For $n$ elements and $m$ sets ($m \geq n$) we give an $O(1/\delta)$-pass algorithm with a strongly sub-linear $\tilde{O}(mn^{\delta})$ space and logarithmic approximation factor. This yields a significant improvement over the earlier algorithm of Demaine et al. [DIMV14] that uses an exponentially larger number of passes. We complement this result by showing that the tradeoff between the number of passes and space exhibited by our algorithm is tight, at least when the approximation factor is equal to $1$. Specifically, we show that any algorithm that computes set cover exactly using $(\frac{1}{2\delta}-1)$ passes must use $\tilde{\Omega}(mn^{\delta})$ space in the regime of $m=O(n)$. Furthermore, we consider the problem in the geometric setting where the elements are points in $\mathbb{R}^2$ and sets are either discs, axis-parallel rectangles, or fat triangles in the plane, and show that our algorithm (with a slight modification) uses the optimal $\tilde{O}(n)$ space to find a logarithmic approximation in $O(1/\delta)$ passes. Finally, we show that any randomized one-pass algorithm that distinguishes between covers of size 2 and 3 must use a linear (i.e., $\Omega(mn)$) amount of space. This is the first result showing that a randomized, approximate algorithm cannot achieve a space bound that is sublinear in the input size. This indicates that using multiple passes might be necessary in order to achieve sub-linear space bounds for this problem while guaranteeing small approximation factors.

Comment: A preliminary version of this paper is to appear in PODS 201
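For readers unfamiliar with Set Cover, the following is a minimal sketch of the classical offline greedy heuristic, which is known to achieve a logarithmic approximation factor when the whole input fits in memory. It is not the multi-pass streaming algorithm from the abstract above; the function name `greedy_set_cover` and the toy instance are illustrative assumptions.

```python
def greedy_set_cover(universe, sets):
    """Classical offline greedy (NOT the streaming algorithm of the paper).

    Repeatedly picks the set that covers the most still-uncovered elements
    and returns the indices of the chosen sets.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Index of the set with the largest overlap with the uncovered elements.
        best = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        if not sets[best] & uncovered:
            raise ValueError("the given sets do not cover the universe")
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


# Hypothetical toy instance: n = 7 elements, m = 4 sets.
universe = range(1, 8)
sets = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7}, {1, 5, 7}]
cover = greedy_set_cover(universe, sets)
print(cover)
```

The sketch keeps every set in memory and rescans all of them in each round, which is exactly the cost a streaming algorithm with sub-linear space has to avoid.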

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Crossref

Asymptotic Analysis of Plausible Tree Hash Modes for SHA-3

Author: Atighehchi Kevin
Bonnecaze Alexis
Publication date: 01/01/2017

Discussions about the choice of a tree hash mode of operation for standardization have recently been undertaken. It appears that a single tree mode cannot adequately address all possible uses and specifications of a system. In this paper, we review the tree modes which have been proposed, we discuss their problems and propose remedies. We make the reasonable assumption that communicating systems have different specifications and that software applications are of different types (securing stored content or live-streamed content). Finally, we propose new modes of operation that address the resource usage problem for the three most representative categories of devices and we analyse their asymptotic behavior.

arXiv.org e-Print Archive

HAL - Normandie Université

HAL AMU

Directory of Open Access Journals

Ruhr-Universität Bochum (RUB): Open Journal Systems

Cryptology ePrint Archive

MOSDEN: A Scalable Mobile Collaborative Platform for Opportunistic Sensing Applications

Author: Georgakopoulos Dimitrios
Jayaraman Prem Prakash
Perera Charith
Zaslavsky Arkady
Publication date: 01/05/2014

Mobile smartphones along with embedded sensors have become an efficient enabler for various mobile applications including opportunistic sensing. The hi-tech advances in smartphones are opening up a world of possibilities. This paper proposes a mobile collaborative platform called MOSDEN that enables and supports opportunistic sensing at run time. MOSDEN captures and shares sensor data across multiple apps, smartphones and users. MOSDEN supports the emerging trend of separating sensors from application-specific processing, storing and sharing. MOSDEN promotes reuse and re-purposing of sensor data hence reducing the efforts in developing novel opportunistic sensing applications. MOSDEN has been implemented on Android-based smartphones and tablets. Experimental evaluations validate the scalability and energy efficiency of MOSDEN and its suitability towards real world applications. The results of evaluation and lessons learned are presented and discussed in this paper.

Comment: Accepted to be published in Transactions on Collaborative Computing, 2014. arXiv admin note: substantial text overlap with arXiv:1310.405

arXiv.org e-Print Archive

Crossref

Online Research @ Cardiff

Directory of Open Access Journals

RMIT Research Repository

MapReduce and Streaming Algorithms for Diversity Maximization in Metric Spaces of Bounded Doubling Dimension

Author: Ceccarello Matteo
Pietracaprina Andrea
Pucci Geppino
Upfal Eli
Publication date: 01/01/2017

Given a dataset of points in a metric space and an integer $k$, a diversity maximization problem requires determining a subset of $k$ points maximizing some diversity objective measure, e.g., the minimum or the average distance between two points in the subset. Diversity maximization is computationally hard, hence only approximate solutions can be hoped for. Although its applications are mainly in massive data analysis, most of the past research on diversity maximization focused on the sequential setting. In this work we present space and pass/round-efficient diversity maximization algorithms for the Streaming and MapReduce models and analyze their approximation guarantees for the relevant class of metric spaces of bounded doubling dimension. Like other approaches in the literature, our algorithms rely on the determination of high-quality core-sets, i.e., (much) smaller subsets of the input which contain good approximations to the optimal solution for the whole input. For a variety of diversity objective functions, our algorithms attain an $(\alpha+\epsilon)$-approximation ratio, for any constant $\epsilon>0$, where $\alpha$ is the best approximation ratio achieved by a polynomial-time, linear-space sequential algorithm for the same diversity objective. This improves substantially over the approximation ratios attainable in Streaming and MapReduce by state-of-the-art algorithms for general metric spaces. We provide extensive experimental evidence of the effectiveness of our algorithms on both real world and synthetic datasets, scaling up to over a billion points.

Comment: Extended version of http://www.vldb.org/pvldb/vol10/p469-ceccarello.pdf, PVLDB Volume 10, No. 5, January 201
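To make the "minimum distance" objective concrete, here is a minimal sequential sketch that picks $k$ points by farthest-first traversal, a standard greedy heuristic for max-min diversity, and then evaluates the objective. It is not the core-set based Streaming or MapReduce algorithm from the abstract above; the function names, the Euclidean metric, and the sample points are illustrative assumptions.

```python
import math
from itertools import combinations


def min_pairwise_distance(points):
    """Diversity objective: the smallest distance between any two chosen points."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def farthest_first(points, k):
    """Sequential greedy heuristic (NOT the paper's core-set construction).

    Starts from the first point and repeatedly adds the point whose distance
    to its nearest already-chosen point is largest.
    """
    chosen = [points[0]]
    while len(chosen) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in chosen))
        chosen.append(nxt)
    return chosen


# Hypothetical toy dataset in the Euclidean plane.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (10, 0)]
subset = farthest_first(points, 3)
print(subset, min_pairwise_distance(subset))
```

Each round of this sketch scans the whole dataset, so it assumes the input fits in memory, which is the limitation that core-sets are meant to remove.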

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova