Discriminative fine-grained mixing for adaptive compression of data streams
Cataloged from PDF version of article. This paper introduces an adaptive compression algorithm for transfer of data streams across operators in stream processing
systems. The algorithm is adaptive in the sense that it can adjust the amount of compression applied based on the bandwidth, CPU, and
workload availability. It is discriminative in the sense that it can judiciously apply partial compression by selecting a subset of attributes
that can provide good reduction in the used bandwidth at a low cost. The algorithm relies on the significant differences that exist among
stream attributes with respect to their relative sizes, compression ratios, compression costs, and their amenability to application of
custom compressors. As part of this study, we present a modeling of uniform and discriminative mixing, and provide various greedy
algorithms and associated metrics to locate an effective setting when model parameters are available at run-time. Furthermore, we
provide online and adaptive algorithms for real-world systems in which system parameters that can be measured at run-time are limited.
We present a detailed experimental study that illustrates the superiority of discriminative mixing over uniform mixing.
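As a rough illustration of selecting a subset of attributes that gives good bandwidth reduction at low cost, the sketch below ranks attributes by bytes saved per unit of CPU cost and picks them greedily under a CPU budget. The metric, the `Attribute` fields, the budget, and all numbers are assumptions made for illustration; they are not taken from the paper, whose own greedy algorithms and metrics may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Attribute:
    """Hypothetical per-attribute statistics (all fields are illustrative)."""
    name: str
    size_bytes: float         # average uncompressed size per tuple
    compression_ratio: float  # compressed size / original size, in (0, 1]
    cpu_cost: float           # CPU cost to compress one value, must be > 0

    @property
    def bytes_saved(self) -> float:
        return self.size_bytes * (1.0 - self.compression_ratio)


def select_attributes(attributes: List[Attribute], cpu_budget: float) -> List[str]:
    """Greedily pick attributes by bytes saved per unit of CPU cost within the budget."""
    ranked = sorted(attributes, key=lambda a: a.bytes_saved / a.cpu_cost, reverse=True)
    chosen, spent = [], 0.0
    for attr in ranked:
        # Skip attributes that save nothing or that would exceed the budget.
        if attr.bytes_saved > 0 and spent + attr.cpu_cost <= cpu_budget:
            chosen.append(attr.name)
            spent += attr.cpu_cost
    return chosen


attrs = [
    Attribute("payload", size_bytes=400, compression_ratio=0.25, cpu_cost=6.0),
    Attribute("timestamp", size_bytes=8, compression_ratio=0.5, cpu_cost=1.0),
    Attribute("text", size_bytes=100, compression_ratio=0.5, cpu_cost=4.0),
]
selected = select_attributes(attrs, cpu_budget=8.0)
print(selected)
```

In this toy setting the large, highly compressible attribute is taken first, and the remaining budget goes to whichever cheaper attribute still fits.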
Partitioning functions for stateful data parallelism in stream processing
In this paper we study partitioning functions
for stream processing systems that employ stateful
data parallelism to improve application throughput.
In particular, we develop partitioning functions
that are effective under workloads where the domain
of the partitioning key is large and its value distribution
is skewed. We define various desirable properties
for partitioning functions, ranging from balance
properties such as memory, processing, and communication
balance, to structural properties such as compactness
and fast lookup, and adaptation properties such as
fast computation and minimal migration. We introduce
a partitioning function structure that is compact and
develop several associated heuristic construction techniques
that exhibit good balance and low migration cost
under skewed workloads. We provide experimental results
that compare our partitioning functions to more
traditional approaches such as uniform and consistent
hashing, under different workload and application characteristics,
and show superior performance.
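The two traditional baselines named in the abstract, uniform and consistent hashing, can be contrasted with a small sketch that measures how many keys change channel when a fifth channel is added to four. This covers only the baselines, not the paper's partitioning function; the helper names, the virtual-node count, and the synthetic keys are assumptions for illustration.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def uniform_partition(key: str, num_channels: int) -> int:
    """Uniform hashing: channel = hash(key) mod N."""
    return stable_hash(key) % num_channels


class ConsistentHashRing:
    """Consistent hashing with virtual nodes (the replica count is an illustrative choice)."""

    def __init__(self, num_channels: int, replicas: int = 100):
        points = sorted(
            (stable_hash(f"channel-{c}#{r}"), c)
            for c in range(num_channels)
            for r in range(replicas)
        )
        self._hashes = [h for h, _ in points]
        self._channels = [c for _, c in points]

    def partition(self, key: str) -> int:
        # First ring point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._hashes, stable_hash(key)) % len(self._hashes)
        return self._channels[i]


def migrated_fraction(before, after, keys) -> float:
    """Fraction of keys whose channel differs between two partitioning functions."""
    return sum(1 for k in keys if before(k) != after(k)) / len(keys)


keys = [f"key-{i}" for i in range(10_000)]
uniform_moved = migrated_fraction(
    lambda k: uniform_partition(k, 4), lambda k: uniform_partition(k, 5), keys
)
ring4, ring5 = ConsistentHashRing(4), ConsistentHashRing(5)
consistent_moved = migrated_fraction(ring4.partition, ring5.partition, keys)
print(f"uniform: {uniform_moved:.2f}, consistent: {consistent_moved:.2f}")
```

Under modulo hashing most keys are reassigned when the channel count changes, whereas on the ring only keys that land on the new channel move. Neither baseline accounts for key skew.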
Generic windowing support for extensible stream processing systems
Stream processing applications process high volume, continuous feeds from live data sources, employ data-in-motion analytics to analyze these feeds, and produce near real-time insights with low latency. One of the fundamental characteristics of such applications is the on-the-fly nature of the computation, which does not require access to disk resident data. Stream processing applications store the most recent history of streams in memory and use it to perform the necessary modeling and analysis tasks. This recent history is often managed using windows. All data stream management systems provide some form of windowing functionality. Windowing makes it possible to implement streaming versions of the traditionally blocking relational operators, such as streaming aggregations, joins, and sorts, as well as any other analytic operator that requires keeping the most recent tuples as state, such as time series analysis operators and signal processing operators. In this paper, we provide a categorization of different window types and policies employed in stream processing applications and give detailed operational semantics for various window configurations. We describe an extensibility mechanism that makes it possible to integrate windowing support into user-defined operators, enabling consistent syntax and semantics across system-provided and third-party toolkits of streaming operators. We describe the design and implementation of a runtime windowing library that significantly simplifies the construction of window-based operators by decoupling the handling of window policies and operator logic from each other. We present our experience using the windowing library to implement a relational operators toolkit and compare the efficacy of the solution to an earlier implementation that did not employ a common windowing library. Copyright (c) 2013 John Wiley & Sons, Ltd.
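One way to picture the decoupling of window policy handling from operator logic is a small count-based sliding window that owns eviction and triggering and hands the window contents to a callback. This is a minimal sketch under assumed policies (keep the last `size` tuples, fire every `trigger` insertions); the class and parameter names are hypothetical and do not reflect the library's actual interface.

```python
from collections import deque
from typing import Callable, Deque, List


class CountSlidingWindow:
    """Keeps the `size` most recent tuples and fires a callback every `trigger` insertions."""

    def __init__(self, size: int, trigger: int, on_trigger: Callable[[List[float]], None]):
        if size <= 0 or trigger <= 0:
            raise ValueError("size and trigger must be positive")
        # A bounded deque evicts the oldest tuple automatically once it is full.
        self._buffer: Deque[float] = deque(maxlen=size)
        self._trigger = trigger
        self._since_last = 0
        self._on_trigger = on_trigger

    def insert(self, value: float) -> None:
        self._buffer.append(value)
        self._since_last += 1
        if self._since_last == self._trigger:
            self._since_last = 0
            # Operator logic sees a snapshot of the window, never the bookkeeping.
            self._on_trigger(list(self._buffer))


# Operator logic: a streaming average, written without any window bookkeeping.
averages: List[float] = []
window = CountSlidingWindow(
    size=3, trigger=2, on_trigger=lambda w: averages.append(sum(w) / len(w))
)
for v in [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    window.insert(v)
print(averages)
```

Swapping the aggregation for another operator only changes the callback, which is the kind of separation the abstract describes.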
Auto-tuning similarity search algorithms on multi-core architectures
In recent times, large high-dimensional datasets have become ubiquitous.
Video and image repositories, financial, and sensor data are just a few examples of
such datasets in practice. Many applications that use such datasets require the retrieval
of data items similar to a given query item, or the nearest neighbors (NN or k-NN) of
a given item. Another common query is the retrieval of multiple sets of nearest neighbors,
i.e., multi k-NN, for different query items on the same data. With commodity
multi-core CPUs becoming more and more widespread at lower costs, developing parallel
algorithms for these search problems has become increasingly important. While
the core nearest neighbor search problem is relatively easy to parallelize, it is challenging
to tune it for optimality. This is due to the fact that the various performance-specific
algorithmic parameters, or “tuning knobs”, are inter-related and also depend on the data
and query workloads. In this paper, we present (1) a detailed study of the various tuning
knobs and their contributions to increasing the query throughput for parallelized
versions of the two most common classes of high-dimensional multi-NN search algorithms:
linear scan and tree traversal, and (2) an offline auto-tuner for setting these
knobs by iteratively measuring actual query execution times for a given workload and
dataset. We show experimentally that our auto-tuner reaches near-optimal performance
and significantly outperforms un-tuned versions of parallel multi-NN algorithms for
real video repository data on a variety of multi-core platforms. © Springer Science+Business Media New York 201
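The idea of an offline auto-tuner that measures actual query execution times can be sketched with a linear-scan multi k-NN search and a single tuning knob. The knob chosen here (worker count), the candidate values, and all function names are assumptions for illustration, not the paper's parameters. In standard CPython, threads will not speed up this CPU-bound loop, so the thread pool only stands in for the parallel structure.

```python
import heapq
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_linear_scan(data: Sequence[Point], query: Point, k: int) -> List[int]:
    """Indices of the k points closest to `query`, nearest first."""
    return heapq.nsmallest(
        k, range(len(data)), key=lambda i: squared_distance(data[i], query)
    )


def multi_knn(data: Sequence[Point], queries: Sequence[Point], k: int,
              num_workers: int) -> List[List[int]]:
    """Answers all queries over the same data; `num_workers` is the tuning knob."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda q: knn_linear_scan(data, q, k), queries))


def auto_tune(data: Sequence[Point], queries: Sequence[Point], k: int,
              candidates: Sequence[int] = (1, 2, 4)) -> int:
    """Offline tuner: time each candidate setting on the workload, keep the fastest."""
    best_setting, best_time = candidates[0], float("inf")
    for workers in candidates:
        start = time.perf_counter()
        multi_knn(data, queries, k, workers)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_setting, best_time = workers, elapsed
    return best_setting


data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
queries = [(0.1, 0.0), (5.4, 5.0)]
results = multi_knn(data, queries, k=2, num_workers=2)
best_workers = auto_tune(data, queries, k=2)
print(results, best_workers)
```

A real tuner would repeat each measurement and search several inter-related knobs jointly, since the abstract notes that the knobs depend on each other and on the workload.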
Absence of Magnetic Fluctuations in the Ferromagnetic/Topological Heterostructure EuS/Bi2Se3
Heterostructures of topological insulators and ferromagnets offer new
opportunities in spintronics and a route to novel anomalous Hall states. In one
such structure, EuS/Bi2Se3, a dramatic enhancement of the Curie
temperature was recently observed. We performed Raman spectroscopy on a similar
set of thin films to investigate the magnetic and lattice excitations.
Interfacial strain was monitored through its effects on the Bi2Se3
phonon modes while the magnetic system was probed through the EuS Raman mode.
Despite its appearance in bare EuS, the heterostructures lack the corresponding
EuS Raman signal. Through numerical calculations we rule out the possibility of
Fabry-Perot interference suppressing the mode. We attribute the absence of a
magnetic signal in EuS to a large charge transfer with the Bi2Se3.
This could provide an additional pathway for manipulating the magnetic,
optical, or electronic response of topological heterostructures. Comment: 6 pages, 3 figures
Elastic scaling for data stream processing
This article addresses the profitability problem associated with auto-parallelization of general-purpose distributed data stream processing applications. Auto-parallelization involves locating regions in the application's data flow graph that can be replicated at run-time to apply data partitioning, in order to achieve scale. In order to make auto-parallelization effective in practice, the profitability question needs to be answered: How many parallel channels provide the best throughput? The answer to this question changes depending on the workload dynamics and resource availability at run-time. In this article, we propose an elastic auto-parallelization solution that can dynamically adjust the number of channels used to achieve high throughput without unnecessarily wasting resources. Most importantly, our solution can handle partitioned stateful operators via run-time state migration, which is fully transparent to the application developers. We provide an implementation and evaluation of the system on an industrial-strength data stream processing platform to validate our solution.
4D visualization of embryonic, structural crystallization by single-pulse microscopy
In many physical and biological systems the transition from an amorphous to ordered native structure involves complex energy landscapes, and understanding such transformations requires not only their thermodynamics but also the structural dynamics during the process. Here, we extend our 4D visualization method with electron imaging to include the study of irreversible processes with a single pulse in the same ultrafast electron microscope (UEM) as used before in the single-electron mode for the study of reversible processes. With this augmentation, we report on the transformation of amorphous to crystalline structure with silicon as an example. A single heating pulse was used to initiate crystallization from the amorphous phase while a single packet of electrons imaged selectively in space the transformation as the structure continuously changes with time. From the evolution of crystallinity in real time and the changes in morphology, for nanosecond and femtosecond pulse heating, we describe two types of processes, one that occurs at early time and involves a nondiffusive motion and another that takes place on a longer time scale. Similar mechanisms of two distinct time scales may perhaps be important in biomolecular folding.
Quantum correlations in a few-atom spin-1 Bose-Hubbard model
We study the thermal quantum correlations and entanglement in the spin-1 Bose-Hubbard model with two and three particles. While we use negativity to calculate entanglement, more general non-classical correlations are quantified using a new measure based on a necessary and sufficient condition for a zero-discord state. We demonstrate that the energy level crossings in the ground state of the system are signalled by both the behavior of thermal quantum correlations and entanglement.
