A Learning Framework for Morphological Operators using Counter-Harmonic Mean
We present a novel framework for learning morphological operators using the
counter-harmonic mean. It combines concepts from morphology and convolutional
neural networks. A thorough experimental validation analyzes the basic
morphological operators (dilation and erosion, opening and closing) as well as
the much more complex top-hat transform, for which we report a real-world
application from the steel industry. Using online learning and stochastic
gradient descent, our system learns both the structuring element and the
composition of operators. It scales well to large datasets and online settings.
Comment: Submitted to ISMM'1
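The following is a minimal sketch of the counter-harmonic mean (CHM) filter that the abstract builds on, assuming the usual definition (f^(p+1) * w) / (f^p * w) with a fixed flat window w and a fixed order p. The function name, the test signal and the window are illustrative assumptions; the paper learns the structuring element and the operator composition by stochastic gradient descent, which this sketch does not attempt.

```python
import numpy as np

def counter_harmonic_mean(f, w, p):
    """CHM filter of order p on a strictly positive 1D signal f.

    Computes (f^(p+1) correlated with w) / (f^p correlated with w).
    Large positive p approaches a flat dilation (local max),
    large negative p approaches a flat erosion (local min).
    """
    f = np.asarray(f, dtype=float)
    w = np.asarray(w, dtype=float)
    num = np.correlate(f ** (p + 1), w, mode="same")
    den = np.correlate(f ** p, w, mode="same")
    return num / den

# Hypothetical example data: values must stay strictly positive
signal = np.array([0.2, 0.9, 0.4, 0.3, 0.8, 0.1, 0.5])
window = np.ones(3)  # flat structuring element of width 3

dilation_like = counter_harmonic_mean(signal, window, p=50)
erosion_like = counter_harmonic_mean(signal, window, p=-50)
```

Because the order p enters the expression smoothly, it can in principle be treated as a trainable parameter, which is what makes the CHM attractive for gradient-based learning.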
Robust Distributed Fusion with Labeled Random Finite Sets
This paper considers the problem of the distributed fusion of multi-object
posteriors in the labeled random finite set filtering framework, using the
Generalized Covariance Intersection (GCI) method. Our analysis shows that GCI
fusion with labeled multi-object densities strongly relies on label
consistencies between local multi-object posteriors at different sensor nodes,
and hence suffers from a severe performance degradation when perfect label
consistencies are violated. Moreover, we mathematically analyze this phenomenon
from the perspective of the Principle of Minimum Discrimination Information and
the so-called yes-object probability. Inspired by the analysis, we propose a novel
and general solution for the distributed fusion with labeled multi-object
densities that is robust to label inconsistencies between sensors.
Specifically, the labeled multi-object posteriors are first marginalized to
their unlabeled posteriors, which are then fused using the GCI method. We also
introduce a principled method to construct the labeled fused density and
produce tracks formally. Based on the developed theoretical framework, we
present tractable algorithms for the family of generalized labeled
multi-Bernoulli (GLMB) filters including δ-GLMB, marginalized
δ-GLMB and labeled multi-Bernoulli filters. The robustness and
efficiency of the proposed distributed fusion algorithm are demonstrated in
challenging tracking scenarios via numerical experiments.
Comment: 17 pages, 23 figures
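As a hedged illustration of the GCI rule itself, the sketch below fuses two single-object Gaussian densities as the normalized weighted geometric mean p1^ω · p2^(1-ω), which for Gaussians reduces to covariance intersection. It does not cover the labeled multi-object case treated in the paper; the function name, the weight ω = 0.5 and the example numbers are assumptions.

```python
import numpy as np

def gci_fuse_gaussians(m1, P1, m2, P2, omega=0.5):
    """Fuse N(m1, P1) and N(m2, P2) as the normalized product p1^omega * p2^(1-omega).

    For Gaussian inputs the result is again Gaussian, with a weighted sum
    of information matrices (covariance intersection).
    """
    I1 = omega * np.linalg.inv(P1)           # weighted information of node 1
    I2 = (1.0 - omega) * np.linalg.inv(P2)   # weighted information of node 2
    P = np.linalg.inv(I1 + I2)               # fused covariance
    m = P @ (I1 @ m1 + I2 @ m2)              # fused mean
    return m, P

# Hypothetical local posteriors at two sensor nodes
m1, P1 = np.array([0.0, 0.0]), np.diag([4.0, 1.0])
m2, P2 = np.array([2.0, 2.0]), np.diag([1.0, 4.0])

m_fused, P_fused = gci_fuse_gaussians(m1, P1, m2, P2, omega=0.5)
```

The fused estimate leans toward whichever node is more certain along each axis, without assuming the two nodes' errors are independent.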
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
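A small sketch of the idea described above, under stated assumptions: pixels of a patch become nodes of a 4-connected grid graph, edge weights come from a Gaussian kernel on intensity differences, and the eigenvectors of the combinatorial Laplacian L = D - W serve as the graph Fourier basis. The function names, the kernel width sigma and the example patch are hypothetical, and the article surveys many other graph constructions.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian of a 4-connected grid with intensity-based weights."""
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-(diff ** 2) / sigma ** 2)
    return np.diag(W.sum(axis=1)) - W

def graph_lowpass(patch, keep, sigma=0.1):
    """Keep only the `keep` lowest graph frequencies of the patch."""
    L = grid_graph_laplacian(patch, sigma)
    eigvals, U = np.linalg.eigh(L)       # ascending graph frequencies
    coeffs = U.T @ patch.ravel()         # graph Fourier transform
    coeffs[keep:] = 0.0                  # discard high graph frequencies
    return (U @ coeffs).reshape(patch.shape), eigvals

# Hypothetical noisy patch with a vertical edge
rng = np.random.default_rng(0)
step = np.tile(np.array([0.1, 0.1, 0.9, 0.9]), (4, 1))
patch = step + 0.05 * rng.standard_normal((4, 4))

smoothed, freqs = graph_lowpass(patch, keep=4)
```

Because weights across the edge are close to zero, the low graph frequencies vary little within each side of the edge, so filtering smooths noise while tending to preserve the discontinuity.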
Distributed Bayesian Filtering using Logarithmic Opinion Pool for Dynamic Sensor Networks
The discrete-time Distributed Bayesian Filtering (DBF) algorithm is presented
for the problem of tracking a target dynamic model using a time-varying network
of heterogeneous sensing agents. In the DBF algorithm, the sensing agents
combine their normalized likelihood functions in a distributed manner using the
logarithmic opinion pool and the dynamic average consensus algorithm. We show
that each agent's estimated likelihood function globally exponentially
converges to an error ball centered on the joint likelihood function of the
centralized multi-sensor Bayesian filtering algorithm. We rigorously
characterize the convergence, stability, and robustness properties of the DBF
algorithm. Moreover, we provide an explicit bound on the time step size of the
DBF algorithm that depends on the time-scale of the target dynamics, the
desired convergence error bound, and the modeling and communication error
bounds. Furthermore, the DBF algorithm for linear-Gaussian models is cast into
a modified form of the Kalman information filter. The performance and robustness
properties of the DBF algorithm are validated using numerical simulations.
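The combination step can be sketched as follows, with a deliberate simplification: agents run plain (static) average consensus on their log-likelihoods over a fixed graph, rather than the dynamic average consensus over a time-varying network used by the DBF algorithm. With equal weights, the logarithmic opinion pool is the normalized geometric mean of the agents' likelihoods. The function name, the mixing matrix and the likelihood values are assumptions.

```python
import numpy as np

def logop_consensus(likelihoods, A, steps):
    """Approximate the equal-weight logarithmic opinion pool by consensus.

    likelihoods: (n_agents, n_states) array of strictly positive values.
    A: doubly stochastic mixing matrix matching the communication graph.
    Returns one normalized pooled likelihood per agent.
    """
    logs = np.log(likelihoods)
    for _ in range(steps):
        logs = A @ logs  # each agent averages its neighbours' log-likelihoods
    pooled = np.exp(logs - logs.max(axis=1, keepdims=True))  # stable exponent
    return pooled / pooled.sum(axis=1, keepdims=True)

# Hypothetical normalized likelihoods of three agents over three target states
likelihoods = np.array([[0.7, 0.2, 0.1],
                        [0.5, 0.3, 0.2],
                        [0.6, 0.3, 0.1]])
# Hypothetical doubly stochastic weights for a fully connected 3-agent network
A = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

per_agent = logop_consensus(likelihoods, A, steps=30)

# Centralized reference: normalized geometric mean of all likelihoods
central = np.exp(np.log(likelihoods).mean(axis=0))
central /= central.sum()
```

After enough consensus steps every row of `per_agent` approaches `central`, which mirrors the convergence claim in the abstract for this simplified static setting.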
CABE : a cloud-based acoustic beamforming emulator for FPGA-based sound source localization
Microphone arrays are gaining in popularity thanks to the availability of low-cost microphones. Applications including sonar, binaural hearing aid devices, acoustic indoor localization techniques and speech recognition have been proposed by several research groups and companies. In most of the available implementations, the microphones utilized are assumed to offer an ideal response in a given frequency domain. Several toolboxes and software packages can be used to obtain a theoretical response of a microphone array with a given beamforming algorithm. However, a tool facilitating the design of a microphone array that takes the non-ideal characteristics into account could not be found. Moreover, generating packages that facilitate the implementation on Field Programmable Gate Arrays has, to our knowledge, not been carried out yet. Visualizing the responses in 2D and 3D also poses an engineering challenge. To alleviate these shortcomings, a scalable Cloud-based Acoustic Beamforming Emulator (CABE) is proposed. The non-ideal characteristics of microphones are considered during the computations, and results are validated with acoustic data captured from microphones. It is also possible to generate hardware description language packages containing delay tables that facilitate the implementation of Delay-and-Sum beamformers in embedded hardware. Truncation error analysis can also be carried out for fixed-point signal processing. The effects of disabling a given group of microphones within the microphone array can also be calculated. Results and packages can be visualized with a dedicated client application. Users can create and configure several parameters of an emulation, including sound source placement, the shape of the microphone array and the required signal processing flow. Depending on the user configuration, 2D and 3D graphs showing the beamforming results, waterfall diagrams and performance metrics can be generated by the client application. The emulations are also validated with captured data from existing microphone arrays.
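To make the notion of a delay table concrete, here is a minimal sketch that computes integer sample delays for a far-field Delay-and-Sum beamformer on a planar array. It is not CABE's generator: the function name, the speed of sound, the array geometry and the sampling rate are all assumptions, and rounding to whole samples is one simple source of the truncation error the abstract mentions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def delay_table(mic_positions, angles_deg, fs):
    """Integer sample delays per steering angle (rows) and microphone (columns).

    A far-field plane wave reaches microphones with a larger projection onto
    the steering direction earlier, so those channels need a longer delay
    before all channels are summed.
    """
    mics = np.asarray(mic_positions, dtype=float)             # (M, 2) in metres
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    directions = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (A, 2)
    advance = directions @ mics.T / SPEED_OF_SOUND             # seconds, (A, M)
    delays = advance - advance.min(axis=1, keepdims=True)      # make non-negative
    return np.rint(delays * fs).astype(int)                    # round to samples

# Hypothetical 3-microphone linear array along the x axis, 34.3 mm spacing
mics = [[0.0, 0.0], [0.0343, 0.0], [0.0686, 0.0]]
table = delay_table(mics, angles_deg=[0, 90, 180], fs=10_000)
```

Each row of `table` could be exported as constants for one steering direction; a finer angular grid or a higher sampling rate reduces the rounding error at the cost of a larger table.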
Compressive Embedding and Visualization using Graphs
Visualizing high-dimensional data has been a focus in data analysis
communities for decades, which has led to the design of many algorithms, some
of which are now considered references (such as t-SNE). In our era
of overwhelming data volumes, the scalability of such methods has become more
and more important. In this work, we present a method that allows any
visualization or embedding algorithm to be applied to very large datasets by considering only
a fraction of the data as input and then extending the information to all data
points using a graph encoding its global similarity. We show that in most
cases, using only samples is sufficient to diffuse the
information to all data points. In addition, we propose quantitative
methods to measure the quality of embeddings and demonstrate the validity of
our technique on both synthetic and real-world datasets.
Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function
This paper addresses the problem of speech separation and enhancement from
multichannel convolutive and noisy mixtures, assuming known mixing
filters. We propose to perform the speech separation and enhancement task in
the short-time Fourier transform domain, using the convolutive transfer
function (CTF) approximation. Compared to time-domain filters, the CTF has far
fewer taps; consequently, it has fewer near-common zeros among channels and lower
computational complexity. The work proposes three speech-source recovery
methods, namely: i) the multichannel inverse filtering method, i.e. the
multiple input/output inverse theorem (MINT), is exploited in the CTF domain,
and for the multi-source case, ii) a beamforming-like multichannel inverse
filtering method applying single source MINT and using power minimization,
which is suitable whenever the source CTFs are not all known, and iii) a
constrained Lasso method, where the sources are recovered by minimizing the
ℓ1-norm to impose their spectral sparsity, with the constraint that the
ℓ2-norm fitting cost, between the microphone signals and the mixing model
involving the unknown source signals, is less than a tolerance. The noise can
be reduced by setting a tolerance onto the noise power. Experiments under
various acoustic conditions are carried out to evaluate the three proposed
methods. The comparison between them as well as with the baseline methods is
presented.
Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language
Processing