Efficient Deterministic Replay Using Complete Race Detection
Data races can significantly affect the executions of multi-threaded
programs. Hence, one has to reproduce the outcomes of data races to
deterministically replay a multi-threaded program. However, data races are
hidden among an enormous number of memory operations in a program. Due to the
difficulty of accurately identifying data races, previous multi-threaded
deterministic record/replay schemes for commodity multi-processor systems give
up on recording data races directly. Consequently, they either record all shared
memory operations, which significantly slows down the production run, or
record synchronization only, which requires significant effort during
replay.
Inspired by the advances in data race detection, we propose an efficient
software-only deterministic replay scheme for commodity multi-processor
systems, which is named RacX. The key insight of RacX is as follows: although
it is NP-hard to accurately identify the existence of data races between a pair
of memory operations, we can find all potential data races in a
multi-threaded program, and the false positives among them can be reduced to a
small number with our automatic false positive reduction techniques. As a result,
RacX can efficiently monitor all potential data races to deterministically
replay a multi-threaded program.
To evaluate RacX, we have carried out experiments over a number of well-known
multi-threaded programs from the SPLASH-2 benchmark suite and large-scale
commercial programs. RacX can precisely reproduce production runs of these programs
with value determinism. On average, RacX causes only about 1.21%, 1.89%, 2.20%,
and 8.41% slowdown to the original run during recording (for 2-, 4-, 8- and
16-thread programs, respectively). The soundness, efficiency, scalability, and
portability of RacX demonstrate its superiority.
Comment: 18 pages, 7 figures
Finding the maximum eigenvalue of a class of tensors with applications in copositivity test and hypergraphs
Finding the maximum eigenvalue of a symmetric tensor is an important topic in
tensor computation and numerical multilinear algebra. This paper is devoted to
a semi-definite program algorithm for computing the maximum -eigenvalue of a
class of tensors with sign structure called -tensors. The class of
-tensors extends the well-studied nonnegative tensors and essentially
nonnegative tensors, and covers some important tensors arising naturally from
spectral hypergraph theory. Our algorithm is based on a new structured
sums-of-squares (SOS) decomposition result for a nonnegative homogeneous
polynomial induced by a -tensor. This SOS decomposition enables us to show
that computing the maximum -eigenvalue of an even order symmetric -tensor
is equivalent to solving a semi-definite program, and hence can be accomplished
in polynomial time. Numerical examples are given to illustrate that the
proposed algorithm can be used to find maximum -eigenvalue of an even order
symmetric -tensor with dimension up to . We present two applications
for our proposed algorithm: we first provide a polynomial time algorithm for
computing the maximum -eigenvalues of large size Laplacian tensors of
hyper-stars and hyper-trees; second, we show that the proposed SOS algorithm
can be used to test the copositivity of a multivariate form associated with
symmetric extended -tensors, whose order may be even or odd. Numerical
experiments illustrate that our structured semi-definite program algorithm is
effective and promising.
Trigonometric protocols for shortcuts to adiabatic transport of cold atoms in anharmonic traps
Shortcuts to adiabaticity have been proposed to speed up the "slow" adiabatic
transport of an atom or a wave packet of atoms. However, the freedom of the
inverse engineering approach with appropriate boundary conditions provides
thousands of trap trajectories for different purposes, for example, time and
energy minimizations. In this paper, we propose trigonometric protocols for
fast and robust atomic transport, taking into account cubic or quartic
anharmonicities. The numerical results show that such trigonometric
protocols, particularly the cosine ansatz, are more robust and that the
corresponding final energy excitation is smaller than for the sine trajectories
implemented in previous experiments.
Comment: 5 pages, 5 figures
A Hopf lemma and regularity for fractional Laplacians
In this paper, we study qualitative properties of the fractional
-Laplacian. Specifically, we establish a Hopf type lemma for positive weak
super-solutions of the fractional Laplacian equation with Dirichlet
condition. Moreover, an optimal condition is obtained to ensure
for smooth functions .
Comment: 20 pages
Understanding the Importance of Single Directions via Representative Substitution
Understanding the internal representations of deep neural networks (DNNs) is
crucial to explaining their behavior. The interpretation of individual units, which
are neurons in MLPs or convolution kernels in convolutional networks, has
received much attention given their fundamental role. However, recent research
(Morcos et al. 2018) presented a counterintuitive phenomenon, which suggests
that individual units with high class selectivity, called interpretable
units, contribute little to the generalization of DNNs. In this work, we
provide a new perspective on this counterintuitive phenomenon, which
makes sense once we introduce Representative Substitution (RS). Rather than
the selectivity of individual units for classes, RS refers to the independence of
a unit's representations within the same layer, without any annotation. Our
experiments demonstrate that interpretable units have high RS and are not
critical to the network's generalization. RS provides new insights into the
interpretation of DNNs and suggests that we need to focus on the independence
and relationship of the representations.
Comment: 4 pages, 6 figures
On Modular Training of Neural Acoustics-to-Word Model for LVCSR
End-to-end (E2E) automatic speech recognition (ASR) systems directly map
acoustics to words using a unified model. Previous works mostly focus on E2E
training of a single model that integrates the acoustic and language models into a
whole. Although E2E training benefits from sequence modeling and simplified
decoding pipelines, a large amount of transcribed acoustic data is usually
required, and traditional acoustic and language modelling techniques cannot be
utilized. In this paper, a novel modular training framework for E2E ASR is
proposed that trains the neural acoustic and language models separately during
the training stage, while still performing end-to-end inference in the decoding stage.
Here, an acoustics-to-phoneme model (A2P) and a phoneme-to-word model (P2W) are
trained using acoustic data and text data, respectively. A phone synchronous
decoding (PSD) module is inserted between A2P and P2W to reduce sequence
lengths without precision loss. Finally, the modules are integrated into an
acoustics-to-word model (A2W) and jointly optimized using acoustic data to
retain the advantage of sequence modeling. Experiments on a 300-hour
Switchboard task show significant improvement over the direct A2W model. The
efficiency of both training and decoding also benefits from the proposed
method.
Comment: accepted by ICASSP201
Dimer XXZ Spin Ladders: Phase diagram and a Non-Trivial Antiferromagnetic Phase
We study the dimer spin model on two-leg ladders with isotropic
Heisenberg interactions on the rung and anisotropic interactions along
the rail in an external field. Combining both analytical and numerical methods,
we set up the ground state phase diagram and investigate the quantum phase
transitions and the properties of the rich phases, including the fully polarized,
singlet dimer, Luttinger liquid, triplon solid, and a non-trivial gapped
antiferromagnetic phase. Analytical analyses based on solvable
effective Hamiltonians are presented to give a clear view of the phases and
transitions. Quantum Monte Carlo and exact diagonalization methods are employed
on finite systems to verify the exact nature of the phases and transitions. Of
all the phases, we pay special attention to the gapped antiferromagnetic
phase, which is shown to be a non-trivial one that exhibits the
time-reversal symmetry. We also discuss how our findings could be detected
experimentally in light of advances in ultracold-atom technology.
Comment: 13 pages, 7 figures
A density compensation-based path computing model for measuring semantic similarity
The shortest path between two concepts in a taxonomic ontology is commonly
used to represent the semantic distance between concepts in the edge-based
semantic similarity measures. Edge counting has long been considered
the default method for path computation: it is simple, intuitive and
has low computational complexity. However, a large lexical taxonomy such as
WordNet has irregular densities of links between concepts due to its broad
domain. Edge counting-based path computation cannot handle this
non-uniformity problem. In this paper, we argue that path computation
can be separated from the edge-based similarity measures to form various
general computing models. Therefore, in order to solve the problem of
non-uniform concept density in a large taxonomic ontology, we propose a
new path computing model based on the compensation of the local area density of
concepts, which is equal to the number of direct hyponyms of the subsumers of
concepts in their shortest path. This path model considers the local area
density of concepts as an extension of the edge-based path and converts the
local area density divided by their depth into a compensation for the edge-based
path with an adjustable parameter, an idea that has been shown to be consistent
with information theory. This model is a general path computing model and
can be applied in various edge-based similarity algorithms. The experimental
results show that the proposed path model improves the average correlation
between edge-based measures and human judgments on the Miller and Charles
benchmark from less than 0.8 to more than 0.85, and is considerably more
efficient than information content (IC) computation in a dynamic ontology,
thereby successfully solving the non-uniformity problem of taxonomic ontology.
Comment: 17 pages, 11 figures
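As a rough illustration of the quantities this abstract refers to, the sketch below computes the plain edge-counting path between two concepts and the number of direct hyponyms of each intermediate concept on that path. The toy taxonomy and the names `HYPERNYM`, `shortest_path` and `direct_hyponyms` are hypothetical and chosen for this example only. The abstract does not give the compensation formula, so the sketch does not attempt to reproduce it.

```python
from collections import deque

# Hypothetical toy taxonomy (an assumption, not from the paper):
# each entry maps a concept to its direct hypernym (parent).
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "bird": "animal",
}


def neighbours(concept):
    """Concepts one taxonomic edge away: all children plus the parent, if any."""
    linked = [child for child, parent in HYPERNYM.items() if parent == concept]
    if concept in HYPERNYM:
        linked.append(HYPERNYM[concept])
    return linked


def shortest_path(source, target):
    """Breadth-first search; returns the concepts on a shortest path, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in neighbours(current):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None


def direct_hyponyms(concept):
    """Number of concepts whose direct hypernym is `concept`."""
    return sum(1 for parent in HYPERNYM.values() if parent == concept)


path = shortest_path("dog", "cat")
edge_count = len(path) - 1  # plain edge-counting distance
# Local density of the intermediate concepts (the subsumers on the path).
density = {concept: direct_hyponyms(concept) for concept in path[1:-1]}
print(path, edge_count, density)
```

Edge counting treats every link on the path as equally long. The per-concept hyponym counts are the kind of local information a density-aware model could use to adjust that length.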
Making Availability as a Service in the Clouds
Cloud computing has achieved great success in the modern IT industry as an
excellent computing paradigm due to its flexible management and elastic
resource sharing. To date, cloud computing holds an irreplaceable position in
our socioeconomic system and influences almost every aspect of our daily life.
However, it is still in its infancy, and many problems remain. Besides the
hotly debated security problem, availability is also an urgent issue. With the
limited power of the availability mechanisms provided in present cloud platforms, we
can hardly get detailed availability information about current applications, such
as the root causes of availability problems, mean time to failure, etc. Thus a
new mechanism based on deep availability analysis is necessary and
beneficial. Following the prevalent terminology 'XaaS', this paper proposes a new
win-win concept for cloud users and providers, termed 'Availability as a
Service' (abbreviated as 'AaaS'). The aim of 'AaaS' is to provide comprehensive
and aim-specific runtime availability analysis services for cloud users by
integrating plenty of data-driven and model-driven approaches. To illustrate this
concept, we realize a prototype named 'EagleEye' with all features of 'AaaS'.
By subscribing to the corresponding services in 'EagleEye', cloud users can get
specific availability information about their applications deployed on a cloud
platform. We envision that this new kind of service will be merged into the cloud
management mechanism in the near future.
The A-Cycle Problem for Transverse Ising Ring
Traditionally, the transverse Ising model is mapped to the fermionic c-cycle
problem, which neglects the boundary effect due to the thermodynamic limit. If
we insist on a perfect periodic boundary condition, we get a so-called
a-cycle problem that has not been treated seriously so far (Lieb et al., 1961
Ann. of Phys. 16 407). In this work, we show a somewhat
surprising but exact result in this respect. We find that the parity of the number
of lattice sites, , in the a-cycle problem plays an unexpected role even in
the thermodynamic limit, , due to the boundary constraint.
We pay special attention to the system with ,
which is in contrast to the one with , because
the former suffers a ring frustration. As a new effect, we find the ring
frustration induces a low-energy gapless spectrum above the ground state. By
proving a theorem for a new type of Toeplitz determinant, we demonstrate that
the ground state in the gapless region exhibits a peculiar longitudinal
spin-spin correlation. The entangled nature of the ground state is also
disclosed by the evaluation of its entanglement entropy. At low temperatures,
new behavior of specific heat is predicted. We also propose an experimental
protocol for observing the new phenomenon due to the ring frustration.
Comment: 24 pages, 9 figures