Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
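As a minimal sketch of the idea of connecting pixels with weights that reflect the image structure, the Python below builds a 4-connected grid graph over a small patch and evaluates the graph Laplacian quadratic form, which sums w_ij (x_i - x_j)^2 over all edges. The Gaussian weight on intensity differences, the parameter `sigma`, and both function names are assumptions made for illustration; they are not taken from the article.

```python
import math

def grid_graph_weights(patch, sigma=10.0):
    """Return {(i, j): w} for a 4-connected grid graph over a 2D patch.

    Pixels are indexed row-major. The weight is a Gaussian of the
    intensity difference (an assumed, commonly used choice).
    """
    rows, cols = len(patch), len(patch[0])
    weights = {}
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    diff = patch[r][c] - patch[rr][cc]
                    weights[(i, j)] = math.exp(-(diff * diff) / (sigma * sigma))
    return weights

def laplacian_quadratic_form(patch, weights):
    """Compute x^T L x as the weighted sum of squared differences over edges."""
    x = [v for row in patch for v in row]
    return sum(w * (x[i] - x[j]) ** 2 for (i, j), w in weights.items())

# A patch with one sharp vertical edge between two flat regions.
patch = [[10, 10, 200, 200],
         [10, 10, 200, 200],
         [10, 10, 200, 200]]

edge_aware = grid_graph_weights(patch, sigma=10.0)
uniform = {edge: 1.0 for edge in edge_aware}  # structure-blind baseline

smooth_edge_aware = laplacian_quadratic_form(patch, edge_aware)
smooth_uniform = laplacian_quadratic_form(patch, uniform)
print(smooth_edge_aware, smooth_uniform)
```

On the structure-aware graph the patch has almost no variation, because the edges that cross the intensity boundary receive near-zero weight. The quadratic form also equals the sum of squared graph Fourier coefficients weighted by the Laplacian eigenvalues, so a small value means the signal's energy sits at low graph frequencies.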
Real Time Turbulent Video Perfecting by Image Stabilization and Super-Resolution
Image and video quality in Long Range Observation Systems (LOROS) suffers from
atmospheric turbulence that causes small neighbourhoods in image frames to
chaotically move in different directions and substantially hampers visual
analysis of such image and video sequences. The paper presents a real-time
algorithm for perfecting turbulence-degraded videos by means of stabilization
and resolution enhancement. The latter is achieved by exploiting the turbulent
motion. The algorithm involves generation of a reference frame; estimation,
for each incoming video frame, of a local image displacement map with respect
to the reference frame; segmentation of the displacement map into two classes,
stationary and moving objects; and resolution enhancement of stationary objects
while preserving real motion. Experiments with synthetic and real-life
sequences have shown that the enhanced videos, generated in real time, exhibit
substantially better resolution and complete stabilization for stationary
objects while retaining real motion.
Comment: Submitted to The Seventh IASTED International Conference on
Visualization, Imaging, and Image Processing (VIIP 2007) August, 2007 Palma
de Mallorca, Spain
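Two of the steps named in the abstract, reference-frame generation and segmentation of the displacement map, can be sketched roughly as follows. This is not the paper's implementation: the per-pixel temporal median, the magnitude threshold, and both function names are assumptions chosen for illustration. Displacement estimation itself is left out, so the displacement map is supplied by hand.

```python
from statistics import median

def reference_frame(frames):
    """Estimate a reference frame as the per-pixel temporal median.

    Assumption: a temporal median is used here only as a simple
    stand-in for whatever reference estimator the paper employs.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[median(f[r][c] for f in frames) for c in range(cols)]
            for r in range(rows)]

def segment_displacements(displacement_map, threshold):
    """Label a pixel as moving (True) when its displacement magnitude
    exceeds the threshold; otherwise treat it as stationary (False)."""
    return [[(dx * dx + dy * dy) ** 0.5 > threshold for (dx, dy) in row]
            for row in displacement_map]

# Three tiny 1x3 frames; the last pixel is briefly occluded in frame 2.
frames = [[[10, 50, 90]],
          [[12, 50, 10]],
          [[11, 52, 91]]]
reference = reference_frame(frames)

# Hand-written (dx, dy) displacements for one incoming frame.
displacements = [[(0.5, -0.5), (1.0, 0.0), (6.0, 8.0)]]
moving = segment_displacements(displacements, threshold=2.0)
print(reference, moving)
```

Small displacements are attributed to turbulence and labelled stationary, while the large one is labelled as real motion, which is the two-class split the abstract describes.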
Digital Signal Processing
Contains introduction and reports on seventeen research projects.
U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742)
U.S. Navy - Office of Naval Research (Contract N00014-77-C-0266)
National Science Foundation (Grant ECS80-07102)
Bell Laboratories Fellowship
Amoco Foundation Fellowship
Schlumberger-Doll Research Center Fellowship
Sanders Associates, Inc.
Toshiba Company Fellowship
M.I.T. Vinton Hayes Fellowship
Hertz Foundation Fellowship
Automatic Speech Recognition Using LP-DCTC/DCS Analysis Followed by Morphological Filtering
Front-end feature extraction techniques have long been a critical component in Automatic Speech Recognition (ASR). Nonlinear filtering techniques are becoming increasingly important in this application, and are often better than linear filters at removing noise without distorting speech features. However, design and analysis of nonlinear filters are more difficult than for linear filters. Mathematical morphology, which creates filters based on shape and size characteristics, is a design structure for nonlinear filters. These filters are limited to minimum and maximum operations that introduce a deterministic bias into filtered signals.
This work develops filtering structures based on a mathematical morphology that utilizes the bias while emphasizing spectral peaks. The combination of peak emphasis via LP analysis with morphological filtering results in more noise-robust speech recognition.
To help understand the behavior of these pre-processing techniques, the deterministic and statistical properties of the morphological filters are compared to the properties of feature extraction techniques that do not employ such algorithms. The robust behavior of these algorithms for automatic speech recognition in the presence of rapidly fluctuating speech signals with additive and convolutional noise is illustrated. Examples of these nonlinear feature extraction techniques are given using the Aurora 2.0 and Aurora 3.0 databases. Features are computed using LP analysis alone to emphasize peaks, morphological filtering alone, or a combination of the two approaches. Although the best absolute results are normally obtained using a combination of the two methods, morphological filtering alone is nearly as effective and much more computationally efficient.
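The minimum and maximum operations mentioned above, and the deterministic bias they introduce, can be illustrated with a minimal 1D sketch. It assumes a flat structuring element of odd width with windows truncated at the borders; the function names and the toy spectrum are hypothetical and do not reproduce the filters developed in this work.

```python
def erode(signal, width):
    """Sliding minimum over a centred window (flat structuring element)."""
    half, n = width // 2, len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]

def dilate(signal, width):
    """Sliding maximum over a centred window (flat structuring element)."""
    half, n = width // 2, len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]

def opening(signal, width):
    """Erosion followed by dilation: removes peaks narrower than the window."""
    return dilate(erode(signal, width), width)

def closing(signal, width):
    """Dilation followed by erosion: fills valleys narrower than the window."""
    return erode(dilate(signal, width), width)

# Toy "spectrum": one narrow spike and one broader plateau.
spectrum = [1, 1, 5, 1, 1, 3, 3, 3, 1, 1]
opened = opening(spectrum, 3)
closed = closing(spectrum, 3)
print(opened)  # -> [1, 1, 1, 1, 1, 3, 3, 3, 1, 1]
print(closed)  # -> [1, 1, 5, 3, 3, 3, 3, 3, 1, 1]
```

The opening never exceeds the input and the closing never falls below it, which is the deterministic bias: the closing keeps both peaks and raises the valley between them, while the opening discards the narrow spike and keeps the plateau.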
Inference skipping for more efficient real-time speech enhancement with parallel RNNs
Deep neural network (DNN) based speech enhancement models have attracted
extensive attention due to their promising performance. However, it is
difficult to deploy a powerful DNN in real-time applications because of its
high computational cost. Typical compression methods such as pruning and
quantization do not make good use of the data characteristics. In this paper,
we introduce the Skip-RNN strategy into speech enhancement models with parallel
RNNs. The states of the RNNs update intermittently without interrupting the
update of the output mask, which leads to a significant reduction of
computational load without evident audio artifacts. To better leverage the
difference between the voice and the noise, we further regularize the skipping
strategy with voice activity detection (VAD) guidance, saving more
computational load. Experiments on a high-performance speech enhancement model,
dual-path convolutional recurrent network (DPCRN), show the superiority of our
strategy over strategies like network pruning or directly training a smaller
model. We also validate the generalization of the proposed strategy on two
other competitive speech enhancement models.
Comment: 11 pages, 8 figures, accepted by IEEE/ACM TASLP
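The intermittent state update can be sketched with a toy scalar recurrent cell. This is a simplified illustration under stated assumptions, not the paper's model: the cell, its weights, the fixed `increment`, and the sigmoid mask are hypothetical stand-ins, whereas a real Skip-RNN learns its update increment from the hidden state. An accumulated update probability is binarized into a gate; when the gate is 0 the previous state is copied and the recurrent cell is not evaluated, yet a mask is still produced for every frame.

```python
import math

def toy_cell(state, x):
    # Stand-in for a real RNN cell (hypothetical weights 0.5 and 1.0).
    return math.tanh(0.5 * state + x)

def run_with_skipping(inputs, increment=0.3):
    """Run the toy cell with intermittent state updates.

    Returns the binary update gates and the per-frame output masks.
    """
    state = 0.0
    update_prob = 1.0  # forces an update on the first frame
    updates, masks = [], []
    for x in inputs:
        gate = 1 if update_prob >= 0.5 else 0  # binarized update decision
        if gate:
            state = toy_cell(state, x)  # pay for the recurrent step
            update_prob = increment     # restart accumulation
        else:
            # Skip: keep the old state and accumulate update probability.
            update_prob = min(1.0, update_prob + increment)
        # The output is computed on every frame from the current state.
        mask = 1.0 / (1.0 + math.exp(-(state + x)))
        updates.append(gate)
        masks.append(mask)
    return updates, masks

updates, masks = run_with_skipping([0.2, -0.1, 0.4, 0.0, 0.3, -0.2])
print(updates)  # -> [1, 0, 1, 0, 1, 0]
print(sum(updates), "of", len(updates), "frames ran the recurrent cell")
```

With this fixed increment the cell runs on every other frame, halving the recurrent computation in the toy setting; the VAD guidance described in the abstract would further bias such skipping decisions.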