Interrogating the Explanatory Power of Attention in Neural Machine Translation
Attention models have become a crucial component in neural machine
translation (NMT). They are often implicitly or explicitly used to justify the
model's decision in generating a specific token, but it has not yet been
rigorously established to what extent attention is a reliable source of
information in NMT. To evaluate the explanatory power of attention for NMT, we
examine the possibility of yielding the same prediction but with counterfactual
attention models that modify crucial aspects of the trained attention model.
Using these counterfactual attention mechanisms we assess the extent to which
they still preserve the generation of function and content words in the
translation process. Compared to a state-of-the-art attention model, our
counterfactual attention models produce 68% of function words and 21% of
content words in our German-English dataset. Our experiments demonstrate that
attention models by themselves cannot reliably explain the decisions made by an
NMT model.
Comment: Accepted at the 3rd Workshop on Neural Generation and Translation
(WNGT 2019) held at EMNLP-IJCNLP 2019 (Camera ready)
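The counterfactual test described in this abstract can be illustrated with a toy sketch. It assumes a drastically simplified decoder in which the predicted token depends only on a linear projection of the attention context vector. The two counterfactuals shown (uniform weights, and zeroing the largest weight then renormalizing) are illustrative choices, and all function names are hypothetical rather than taken from the paper.

```python
def context_vector(weights, encoder_states):
    """Attention-weighted sum of the encoder states."""
    dim = len(encoder_states[0])
    return [sum(w * h[d] for w, h in zip(weights, encoder_states))
            for d in range(dim)]


def predict_token(context, output_rows):
    """Index of the highest-scoring target token (toy linear output layer)."""
    scores = [sum(r * c for r, c in zip(row, context)) for row in output_rows]
    return max(range(len(scores)), key=scores.__getitem__)


def uniform_attention(weights):
    """Counterfactual: spread attention evenly over all source positions."""
    n = len(weights)
    return [1.0 / n] * n


def zero_out_max(weights):
    """Counterfactual: remove the most-attended position, then renormalize."""
    top = max(range(len(weights)), key=weights.__getitem__)
    zeroed = [0.0 if i == top else w for i, w in enumerate(weights)]
    total = sum(zeroed)
    if total == 0.0:  # nothing left to attend to; fall back to uniform
        return uniform_attention(weights)
    return [w / total for w in zeroed]


def prediction_preserved(weights, counterfactual, encoder_states, output_rows):
    """True if the predicted token survives the counterfactual attention."""
    original = predict_token(context_vector(weights, encoder_states), output_rows)
    altered = predict_token(
        context_vector(counterfactual(weights), encoder_states), output_rows)
    return original == altered
```

If a prediction is preserved under such a counterfactual, the original attention weights were not necessary for it, which is the sense in which attention fails to explain that decision. Counting preserved predictions separately for function and content words would give percentages of the kind the abstract reports.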
Spectral Perturbation and Reconstructability of Complex Networks
In recent years, many network perturbation techniques, such as topological
perturbations and service perturbations, were employed to study and improve the
robustness of complex networks. However, there is no general way to evaluate
the network robustness. In this paper, we propose a new global measure for a
network, the reconstructability coefficient θ, defined as the maximum
number of eigenvalues that can be removed, subject to the condition that the
adjacency matrix can be reconstructed exactly. Our main finding is that a
linear scaling law, E[θ] = aN, seems universal, in that it holds for all
networks that we have studied.
Comment: 9 pages, 10 figures
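A minimal NumPy sketch of how such a coefficient could be computed follows. It rests on assumptions the abstract does not state: eigenvalues are removed in order of increasing absolute value, the rebuilt matrix is thresholded at 1/2 to recover 0/1 entries, and the graph is simple and undirected (symmetric adjacency matrix, zero diagonal). The function names are hypothetical.

```python
import numpy as np


def reconstruct(adj, num_removed):
    """Zero the num_removed smallest-magnitude eigenvalues and rebuild a 0/1 matrix."""
    eigvals, eigvecs = np.linalg.eigh(adj)      # adj is assumed symmetric
    order = np.argsort(np.abs(eigvals))         # smallest |eigenvalue| first
    kept = eigvals.copy()
    kept[order[:num_removed]] = 0.0
    approx = (eigvecs * kept) @ eigvecs.T       # X diag(kept) X^T
    rebuilt = (approx > 0.5).astype(int)        # assumed threshold at 1/2
    np.fill_diagonal(rebuilt, 0)                # assumed: no self-loops
    return rebuilt


def reconstructability_coefficient(adj):
    """Largest number of removed eigenvalues that still allows exact reconstruction."""
    adj = np.asarray(adj, dtype=float)
    target = adj.astype(int)
    theta = 0
    for num_removed in range(adj.shape[0] + 1):
        if np.array_equal(reconstruct(adj, num_removed), target):
            theta = num_removed
    return theta
```

For the complete graph on four nodes the eigenvalues are 3, -1, -1, -1. Keeping only the eigenvalue 3 gives a matrix with every entry equal to 0.75, which thresholds back to the original graph, so this sketch returns 3.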
Ubiquitous Computing for Remote Cardiac Patient Monitoring: A Survey
New wireless technologies, such as wireless LAN and sensor networks, open new possibilities in telecardiology for monitoring vital parameters with wearable biomedical sensors. They give patients the freedom to be mobile while remaining under continuous monitoring, which improves the quality of patient care. This paper details the architecture and quality-of-service (QoS) characteristics of integrated wireless telecardiology platforms. It also discusses the currently promising hardware/software platforms for wireless cardiac monitoring and presents the design methodology and challenges for realistic implementation.
EFFICIENT UTILIZATION OF BARE METAL CORES WITH DYNAMIC MONITORING AND CALIBRATION
In existing cloud environments, it is not possible to mix, on the same server at the same time, workloads that use part of a processor core, or that use cores on a best-effort basis, with workloads that must both be assigned to a single core and have that core dedicated to their use (i.e., nothing else runs on the core). To address these challenges and inefficiencies, techniques are presented herein that support a division of resources in a way that they can then be appropriately assigned to workloads. One logical pool of cores may be assigned for workloads requiring shared resources and another pool may be assigned for workloads requiring dedicated resources. The boundary between those pools may shift dynamically as, for example, additional resources are required.
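The two-pool idea can be sketched as a small allocator. This is only an illustration under assumed rules, not the mechanism of the disclosure: cores are numbered 0..N-1, a single index marks the boundary between the shared and dedicated pools, and the class name, method names, and `min_shared` parameter are all hypothetical.

```python
class CorePools:
    """Cores [0, boundary) are shared; cores [boundary, total) are dedicated."""

    def __init__(self, total_cores, min_shared=1):
        self.total = total_cores
        self.min_shared = min_shared      # assumed floor for the shared pool
        self.boundary = total_cores       # start with every core shared
        self.pinned = {}                  # core id -> dedicated workload name

    def shared_cores(self):
        return list(range(self.boundary))

    def dedicated_cores(self):
        return list(range(self.boundary, self.total))

    def assign_dedicated(self, workload):
        """Pin a workload to its own core, growing the dedicated pool if needed."""
        for core in self.dedicated_cores():
            if core not in self.pinned:
                self.pinned[core] = workload
                return core
        if self.boundary - 1 < self.min_shared:
            raise RuntimeError("no core can be moved to the dedicated pool")
        self.boundary -= 1                # shift the boundary toward shared
        self.pinned[self.boundary] = workload
        return self.boundary

    def release(self, core):
        """Unpin a core and hand idle boundary cores back to the shared pool."""
        self.pinned.pop(core, None)
        while self.boundary < self.total and self.boundary not in self.pinned:
            self.boundary += 1
```

A real system would also have to migrate shared workloads off a core before dedicating it, and would drive the boundary from monitoring data; both are left out here.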
Improving Sparse Representation-Based Classification Using Local Principal Component Analysis
Sparse representation-based classification (SRC), proposed by Wright et al.,
seeks the sparsest decomposition of a test sample over the dictionary of
training samples, with classification to the most-contributing class. Because
it assumes test samples can be written as linear combinations of their
same-class training samples, the success of SRC depends on the size and
representativeness of the training set. Our proposed classification algorithm
enlarges the training set by using local principal component analysis to
approximate the basis vectors of the tangent hyperplane of the class manifold
at each training sample. The dictionary in SRC is replaced by a local
dictionary that adapts to the test sample and includes training samples and
their corresponding tangent basis vectors. We use a synthetic data set and
three face databases to demonstrate that this method can achieve higher
classification accuracy than SRC in cases of sparse sampling, nonlinear class
manifolds, and stringent dimension reduction.
Comment: Published in "Computational Intelligence for Pattern Recognition,"
editors Shyi-Ming Chen and Witold Pedrycz. The original publication is
available at http://www.springerlink.co
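The tangent-vector step of this abstract can be sketched with NumPy. This is a rough illustration under assumptions the abstract does not spell out: the neighborhood is the k nearest same-class samples in Euclidean distance, the neighbors are centered at the training sample itself, and the tangent basis is taken as the leading right singular vectors. The function name and parameters are hypothetical, and the sparse-coding step of SRC is omitted.

```python
import numpy as np


def local_tangent_basis(class_samples, index, num_neighbors, dim):
    """Estimate `dim` tangent basis vectors at one training sample via local PCA.

    class_samples: array of shape (n, features), all from the same class.
    """
    class_samples = np.asarray(class_samples, dtype=float)
    x = class_samples[index]
    dists = np.linalg.norm(class_samples - x, axis=1)
    # Nearest same-class neighbors, excluding the sample itself.
    order = [i for i in np.argsort(dists) if i != index][:num_neighbors]
    diffs = class_samples[order] - x            # assumed: center at the sample
    # Rows of vt are principal directions of the local neighborhood.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:dim]
```

The returned rows could then be appended to the dictionary alongside the training samples, which is one way to read the abstract's "training samples and their corresponding tangent basis vectors".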