Bayesian Design of Tandem Networks for Distributed Detection With Multi-bit Sensor Decisions
We consider the problem of decentralized hypothesis testing under
communication constraints in a topology where several peripheral nodes are
arranged in tandem. Each node receives an observation and transmits a message
to its successor, and the last node then decides which hypothesis is true. We
assume that the observations at different nodes are, conditioned on the true
hypothesis, independent and the channel between any two successive nodes is
considered error-free but rate-constrained. We propose a cyclic numerical
algorithm for designing the nodes using a person-by-person methodology
with the minimum expected error probability as a design criterion, where the
number of communicated messages is not necessarily equal to the number of
hypotheses. The number of peripheral nodes in the proposed method is in
principle arbitrary and the information rate constraints are satisfied by
quantizing the input of each node. The performance of the proposed method for
different information rate constraints, in a binary hypothesis test, is
compared to the optimum rate-one solution due to Swaszek and a method proposed
by Cover, and it is shown numerically that increasing the channel rate can
significantly enhance the performance of the tandem network. Simulation results
for M-ary hypothesis tests also show that increasing the channel rates
significantly improves the performance of the tandem network.
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
Exploiting partial reconfiguration through PCIe for a microphone array network emulator
Current Microelectromechanical Systems (MEMS) technology enables the deployment of relatively low-cost wireless sensor networks composed of MEMS microphone arrays for accurate sound source localization. However, evaluating and selecting the most accurate and power-efficient network topology is not trivial when considering dynamic MEMS microphone arrays. Although software simulators are usually considered, they involve computationally intensive tasks that take hours to days to complete. In this paper, we present an FPGA-based platform to emulate a network of microphone arrays. Our platform provides a controlled simulated acoustic environment in which to evaluate the impact of different network configurations, such as the number of microphones per array, the network topology, or the detection method used. Data fusion techniques, combining the data collected by each node, are used in this platform. The platform is designed to exploit the FPGA’s partial reconfiguration feature to increase the flexibility of the network emulator, and to increase performance through the high-bandwidth PCI-express interface. On the one hand, the network emulator gains flexibility by partially reconfiguring the nodes’ architecture at runtime. On the other hand, a set of strategies and heuristics for the proper use of partial reconfiguration accelerates the emulation by exploiting execution parallelism. Several experiments are presented to demonstrate some of the capabilities of our platform and the benefits of using partial reconfiguration.
Neural correlates of subjective timing precision and confidence
Human perceptual judgments are imprecise, as repeated exposures to the same physical stimulation (e.g. audio-visual inputs separated by a constant temporal offset) can result in different decisions. Moreover, there can be marked individual differences – precise judges will repeatedly make the same decision about a given input, whereas imprecise judges will make different decisions. The causes are unclear. We examined this using audio-visual (AV) timing and confidence judgments, in conjunction with electroencephalography (EEG) and multivariate pattern classification analyses. One plausible cause of differences in timing precision is that it scales with variance in the dynamics of evoked brain activity. Another possibility is that equally reliable patterns of brain activity are evoked, but there are systematic differences that scale with precision. Trial-by-trial decoding of input timings from brain activity suggested precision differences may not result from variable dynamics. Instead, precision was associated with evoked responses that were exaggerated (more different from baseline) ~300 ms after initial physical stimulations. We suggest excitatory and inhibitory interactions within a winner-take-all neural code for AV timing might exaggerate responses, such that evoked response magnitudes post-stimulation scale with encoding success.
TandemNet: Distilling Knowledge from Medical Images Using Diagnostic Reports as Optional Semantic References
In this paper, we introduce the semantic knowledge of medical images from
their diagnostic reports to provide an inspirational network training and an
interpretable prediction mechanism with our proposed novel multimodal neural
network, namely TandemNet. Inside TandemNet, a language model is used to
represent report text, which cooperates with the image model in a tandem
scheme. We propose a novel dual-attention model that facilitates high-level
interactions between visual and semantic information and effectively distills
useful features for prediction. In the testing stage, TandemNet can make
accurate image prediction with an optional report text input. It also
interprets its prediction by producing attention on the image and text
informative feature pieces, and further generating diagnostic report
paragraphs. Based on a dataset of pathological bladder cancer images and their
diagnostic reports (BCIDR), sufficient experiments demonstrate that our method
effectively learns and integrates knowledge from multiple modalities and obtains
significantly better performance than the compared baselines.
Comment: MICCAI2017 Ora