Parallel data compression
Data compression schemes remove redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported, and areas for future research are suggested.
Real-time and distributed applications for dictionary-based data compression
The greedy approach to dictionary-based static text compression can be executed by a finite state machine.
When it is applied in parallel to different blocks of data independently, robustness is preserved
even on standard large-scale distributed systems with input files of arbitrary size. Beyond standard large
scale, compression effectiveness degrades because the data blocks become very small.
This paper presents a robust approach for extreme distributed systems, where this problem is fixed by
overlapping adjacent blocks and preprocessing the neighborhoods of the boundaries.
Moreover, we introduce the notion of a pseudo-prefix dictionary, which allows optimal compression by means
of a real-time semi-greedy procedure and a slight improvement on the compression ratio obtained by the
distributed implementations.
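The block-parallel scheme the abstract describes can be sketched as follows. This is an illustrative toy, not the paper's implementation: the static dictionary, the greedy longest-match parse, and the fixed block size are all assumptions, and the boundary-overlap preprocessing is omitted.

```python
# Sketch (illustrative assumptions, not the paper's code): greedy
# static-dictionary compression applied independently to fixed-size
# blocks, which is what makes the scheme trivially parallel.

def greedy_compress(text, dictionary):
    """Greedy longest-match parse: at each position emit the longest
    dictionary entry that matches, falling back to a single literal."""
    out, i = [], 0
    max_len = max(map(len, dictionary))
    while i < len(text):
        match = next((text[i:i + n]
                      for n in range(min(max_len, len(text) - i), 1, -1)
                      if text[i:i + n] in dictionary),
                     text[i])  # literal fallback
        out.append(match)
        i += len(match)
    return out

def block_parallel_compress(text, dictionary, block_size):
    """Compress each block independently; since no state is shared,
    each block could be handed to a separate worker."""
    blocks = [text[i:i + block_size] for i in range(0, len(text), block_size)]
    return [greedy_compress(b, dictionary) for b in blocks]
```

Because every block is parsed against the same static dictionary with no cross-block state, the loop over blocks is embarrassingly parallel; the paper's contribution addresses what this sketch ignores, namely the loss of matches that straddle block boundaries when blocks are tiny.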
An adaptive prefix-assignment technique for symmetry reduction
This paper presents a technique for symmetry reduction that adaptively
assigns a prefix of variables in a system of constraints so that the generated
prefix-assignments are pairwise nonisomorphic under the action of the symmetry
group of the system. The technique is based on McKay's canonical extension
framework [J. Algorithms 26 (1998), no. 2, 306-324]. Among key features of the
technique are (i) adaptability---the prefix sequence can be user-prescribed and
truncated for compatibility with the group of symmetries; (ii)
parallelizability---prefix-assignments can be processed in parallel
independently of each other; (iii) versatility---the method is applicable
whenever the group of symmetries can be concisely represented as the
automorphism group of a vertex-colored graph; and (iv) implementability---the
method can be implemented relying on a canonical labeling map for
vertex-colored graphs as the only nontrivial subroutine. To demonstrate the
practical applicability of our technique, we have prepared an experimental
open-source implementation of the technique and carry out a set of experiments
that demonstrate the ability to reduce symmetry on hard instances. Furthermore, we
demonstrate that the implementation effectively parallelizes to compute
clusters with multiple nodes via a message-passing interface.
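The core idea of keeping one representative per isomorphism class of prefix-assignments can be illustrated with a much simpler scheme than McKay's canonical extension framework. The sketch below is a hypothetical toy: it assumes the symmetry group is the full permutation group on interchangeable values (a value symmetry), and uses a brute-force lexicographic canonical form instead of a canonical labeling map for vertex-colored graphs.

```python
from itertools import permutations, product

# Toy isomorph rejection (illustrative; NOT McKay's canonical extension):
# enumerate assignments to a prefix of k variables over interchangeable
# values, keeping one representative per orbit of the value-permutation
# group. Group choice and canonical form are simplifying assumptions.

def canonical(assignment, values):
    """Lexicographically smallest relabeling of `assignment` under all
    permutations of `values` (the symmetry group of this toy example)."""
    return min(tuple(dict(zip(values, p))[a] for a in assignment)
               for p in permutations(values))

def nonisomorphic_prefixes(k, values):
    """All pairwise nonisomorphic prefix-assignments of length k.
    Each candidate is independent, so the loop parallelizes naturally."""
    seen = set()
    for a in product(values, repeat=k):
        seen.add(canonical(a, values))
    return sorted(seen)
```

For two interchangeable values and a prefix of length 3, the eight raw assignments collapse to four representatives, mirroring on a toy scale how symmetry reduction prunes the search: each surviving prefix-assignment can then be extended independently, which is the parallelizability property the abstract highlights.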
Mapping DSP algorithms to a reconfigurable architecture Adaptive Wireless Networking (AWGN)
This report discusses the Adaptive Wireless Networking (AWGN) project and presents its vision. The project's strategy is the implementation of multiple communication systems in dynamically reconfigurable heterogeneous hardware. An overview is given of a wireless LAN communication system, namely HiperLAN/2, and of a Bluetooth communication system, and possible implementations of these systems in a dynamically reconfigurable architecture are discussed. Suggestions for future activities in the Adaptive Wireless Networking project are also given.
Visualization on colour based flow vector of thermal image for movement detection during interactive session
Recently, thermal imaging has been exploited in applications such as motion and face detection, and it has drawn the attention of many researchers seeking to build such technology to improve lifestyle. This work proposes a technique to detect and identify motion in image sequences for application in security monitoring or outdoor surveillance. Conventional systems may produce false information in the presence of shadows. The methods employed in this work are the Canny edge detector and the Lucas-Kanade and Horn-Schunck optical flow algorithms, which overcome the major problem of thresholding methods: only intensity (pixel magnitude) is considered, rather than the relationships between pixels. The results are observed in the flow vector parameter and in the colour-based segmentation image for time frames from 1 to 10 seconds. The visualization of both parameters clarifies the movement and the changes of pixel intensity between two frames, supported by colour segmentation, for both smooth and rough motion. This technique may thus contribute to other applications such as biometrics, military systems, and surveillance machines.
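The Lucas-Kanade method mentioned above estimates motion from spatial and temporal intensity derivatives. A minimal one-dimensional sketch (an illustrative reduction, not the paper's pipeline, which works on 2-D thermal frames) solves the least-squares problem for a single translation v from the brightness-constancy relation It ≈ -v·Ix:

```python
# Minimal 1-D Lucas-Kanade sketch (illustrative assumption: a single
# global translation v between two frames, frame2(x) ~= frame1(x - v)).

def lucas_kanade_1d(frame1, frame2):
    """Least-squares estimate of the shift v between two 1-D frames."""
    # Central-difference spatial gradient Ix of the first frame.
    ix = [(frame1[i + 1] - frame1[i - 1]) / 2.0
          for i in range(1, len(frame1) - 1)]
    # Temporal derivative It at the same interior samples.
    it = [frame2[i] - frame1[i] for i in range(1, len(frame1) - 1)]
    # Normal equation of the 1-D least-squares problem It = -v * Ix.
    return -sum(gx * gt for gx, gt in zip(ix, it)) / sum(gx * gx for gx in ix)
```

The full 2-D algorithm solves the analogous 2x2 normal equations per window, which is why it recovers a flow vector per pixel rather than a single shift.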
Hardware Based Projection onto The Parity Polytope and Probability Simplex
This paper is concerned with the adaptation to hardware of methods for
Euclidean norm projections onto the parity polytope and probability simplex. We
first refine recent efforts to develop efficient methods of projection onto the
parity polytope. Our resulting algorithm can be configured to have either
O(d) average computational complexity or O(d log d) worst-case
complexity on a serial processor, where d
is the dimension of the projection space. We show how to adapt our projection
routine to hardware. Our projection method uses a sub-routine that involves
another Euclidean projection, onto the probability simplex. We therefore
explain how to adapt to hardware a well-known simplex projection algorithm. The
hardware implementations of both projection algorithms achieve area scalings of
O(d log^2 d) at a delay of
O(log^2 d). Finally, we present numerical results in
which we evaluate the fixed-point accuracy and resource scaling of these
algorithms when targeting a modern FPGA.
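The simplex-projection sub-routine the abstract refers to is a classical sort-based algorithm. The serial reference sketch below shows that algorithm (with sorting as its O(d log d) bottleneck); the paper's hardware mapping and fixed-point details are not represented here.

```python
# Serial reference for the well-known sort-based Euclidean projection
# onto the probability simplex: argmin_x ||x - y||_2 s.t. x >= 0 and
# sum(x) = 1. A plain-Python sketch, not the paper's hardware design.

def project_simplex(y):
    u = sorted(y, reverse=True)          # O(d log d) bottleneck
    cumsum, theta = 0.0, 0.0
    # Find the largest i with u[i-1] - (cumsum_i - 1)/i > 0; the
    # corresponding threshold theta defines the active coordinates.
    for i, ui in enumerate(u, start=1):
        cumsum += ui
        t = (cumsum - 1.0) / i
        if ui - t > 0:
            theta = t
    # Shift by theta and clip at zero.
    return [max(v - theta, 0.0) for v in y]
```

The result always lies on the simplex: coordinates are clipped to be non-negative and the threshold theta is chosen so the surviving coordinates sum to one. Replacing the sort with a hardware sorting network is the standard route to the polylogarithmic-delay implementations the abstract evaluates.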
CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication
Sparse matrix-vector multiplication (SpMV) is a fundamental building block
for numerous applications. In this paper, we propose CSR5 (Compressed Sparse
Row 5), a new storage format, which offers high-throughput SpMV on various
platforms including CPUs, GPUs and Xeon Phi. First, the CSR5 format is
insensitive to the sparsity structure of the input matrix. Thus the single
format can support an SpMV algorithm that is efficient both for regular
matrices and for irregular matrices. Furthermore, we show that the overhead of
the format conversion from the CSR to the CSR5 can be as low as the cost of a
few SpMV operations. We compare the CSR5-based SpMV algorithm with 11
state-of-the-art formats and algorithms on four mainstream processors using 14
regular and 10 irregular matrices as a benchmark suite. For the 14 regular
matrices in the suite, we achieve comparable or better performance over the
previous work. For the 10 irregular matrices, the CSR5 obtains average
performance improvement of 17.6%, 28.5%, 173.0% and 293.3% (up to 213.3%,
153.6%, 405.1% and 943.3%) over the best existing work on dual-socket Intel
CPUs, an nVidia GPU, an AMD GPU and an Intel Xeon Phi, respectively. For
real-world applications such as a solver with only tens of iterations, the CSR5
format can be more practical because of its low overhead for format conversion.
The source code of this work is downloadable at
https://github.com/bhSPARSE/Benchmark_SpMV_using_CSR5 (12 pages, 10 figures; in
Proceedings of the 29th ACM International Conference on Supercomputing, ICS '15).
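For context, CSR5 extends the baseline CSR format, whose SpMV kernel can be sketched in a few lines. This is the standard CSR reference (row pointers, column indices, values), not CSR5 itself: the tile layout, segmented sum, and cross-platform partitioning that define CSR5 are not shown.

```python
# Baseline CSR SpMV reference (the format CSR5 extends; CSR5's tiling
# and parallel segmented sum are deliberately omitted from this sketch).

def csr_spmv(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for A stored in CSR form:
    row r's nonzeros occupy vals[row_ptr[r]:row_ptr[r+1]],
    with their column positions in col_idx at the same slots."""
    y = []
    for r in range(len(row_ptr) - 1):
        y.append(sum(vals[k] * x[col_idx[k]]
                     for k in range(row_ptr[r], row_ptr[r + 1])))
    return y
```

The per-row loop is what makes plain CSR sensitive to sparsity structure (rows of very different lengths load-imbalance parallel workers), which is exactly the sensitivity the CSR5 abstract claims to remove.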