2,874 research outputs found

Graph Spectral Image Processing

Author: Cheung Gene
Magli Enrico
Ng Michael
Tanaka Yuichi
Publication date: 16/01/2018

The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph and apply GSP tools to process and analyze the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image and video processing. The topics covered include image compression, image restoration, image filtering, and image segmentation.
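To make the idea of treating an image patch as a graph signal concrete, here is a minimal sketch, not taken from the article. It assumes a 4-connected pixel graph with Gaussian weights on intensity differences (the weight function and the `sigma` value are illustrative assumptions) and evaluates the graph Laplacian quadratic form x^T L x, a common smoothness measure in GSP. The helper names `grid_edges` and `laplacian_quadratic` are hypothetical.

```python
import math


def grid_edges(patch, sigma=10.0):
    """Build a 4-connected pixel graph for a 2D patch (list of rows).

    Assumed edge weight: w_ij = exp(-(x_i - x_j)^2 / sigma^2), so pixels
    with similar intensity are strongly connected.
    """
    rows, cols = len(patch), len(patch[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    diff = patch[r][c] - patch[r2][c2]
                    w = math.exp(-(diff * diff) / (sigma * sigma))
                    edges.append(((r, c), (r2, c2), w))
    return edges


def laplacian_quadratic(signal, edges):
    """Return x^T L x = sum over edges of w_ij * (x_i - x_j)^2, with L = D - W."""
    total = 0.0
    for (r1, c1), (r2, c2), w in edges:
        d = signal[r1][c1] - signal[r2][c2]
        total += w * d * d
    return total


# A patch with one sharp vertical edge between dark and bright halves.
patch = [[10, 10, 200, 200] for _ in range(4)]

edge_aware = grid_edges(patch)                           # weights follow the image
uniform = [(a, b, 1.0) for a, b, _ in grid_edges(patch)]  # plain grid, all weights 1

smooth_on_graph = laplacian_quadratic(patch, edge_aware)
smooth_on_grid = laplacian_quadratic(patch, uniform)
```

With image-adapted weights, the sharp edge barely contributes to x^T L x, so the patch counts as smooth on its own graph even though it is not smooth on the plain grid.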

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Real Time Turbulent Video Perfecting by Image Stabilization and Super-Resolution

Author: A. Mitiche
A.T. Mohammed
B. Cohen
B. Ellerbroek
B. Horn
B.M. Welsh
B.R. Frieden
Barak Fishbain
D. Sadot
D.G. Sheppard
H.H. Nagel
Ianir A. Ideses
J. Weickert
L. Alvarez
L.J. Barron
L.P. Yaroslavsky
Leonid P. Yaroslavsky
S.C. Cheung
Y. Glick
Y. Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/04/2007

Image and video quality in Long Range Observation Systems (LOROS) suffers from atmospheric turbulence, which causes small neighbourhoods in image frames to move chaotically in different directions and substantially hampers visual analysis of such image and video sequences. The paper presents a real-time algorithm for perfecting turbulence-degraded videos by means of stabilization and resolution enhancement. The latter is achieved by exploiting the turbulent motion. The algorithm involves three stages: generation of a reference frame and estimation, for each incoming video frame, of a local image displacement map with respect to the reference frame; segmentation of the displacement map into two classes, stationary and moving objects; and resolution enhancement of stationary objects while preserving real motion. Experiments with synthetic and real-life sequences have shown that the enhanced videos, generated in real time, exhibit substantially better resolution and complete stabilization for stationary objects while retaining real motion.

Comment: Submitted to The Seventh IASTED International Conference on Visualization, Imaging, and Image Processing (VIIP 2007), August 2007, Palma de Mallorca, Spain
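The abstract does not say how the reference frame is built or how the displacement map is segmented. The sketch below is therefore only an illustration under stated assumptions: it uses a per-pixel temporal median as the reference frame and a fixed threshold on displacement magnitude to separate stationary from moving regions. Both choices, the `threshold` value, and the function names are hypothetical and not the paper's method.

```python
import math
from statistics import median


def reference_frame(frames):
    """Per-pixel temporal median over equally sized 2D frames (assumed estimator)."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


def segment_displacements(displacement_map, threshold=2.0):
    """Label each (dx, dy) vector by magnitude.

    Small displacements are attributed to turbulence ('stationary');
    large ones are treated as real motion ('moving'). The threshold is
    an illustrative assumption.
    """
    return [
        ["moving" if math.hypot(dx, dy) > threshold else "stationary"
         for dx, dy in row]
        for row in displacement_map
    ]


# Three tiny 1x2 frames; the second pixel has one outlier value.
frames = [[[10, 20]], [[12, 90]], [[11, 21]]]
reference = reference_frame(frames)

labels = segment_displacements([[(0.5, -0.3), (4.0, 3.0)]])
```

The median suppresses the transient outlier in the second pixel, which is why a robust temporal statistic is a plausible stand-in for a turbulence-free reference.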

arXiv.org e-Print Archive

Crossref

Digital Signal Processing

Author: Baggeroer Arthur B.
Bordley Thomas E.
Chan Philip
Curtis Susan R.
Davis Randall
Deadrick Douglas S.
Dove Webster P.
Dowla Farid U.
Frisk George V.
Griffin Daniel W.
Harrison William A.
Izraelevitz David
Lim Jae S.
Martinez Dennis M.
Milios Evangelos E.
Musicus Bruce R.
Myers Cory
Oppenheim Alan V.
Pappas Thrasyvoulos N.
Richard Michael D.
Sekiguchi Hiroshi
Sundaram Ramakrishnan
Wengrovitz Michael S.
Publication venue: Research Laboratory of Electronics (RLE) at the Massachusetts Institute of Technology (MIT)
Publication date: 01/01/1984

Contains introduction and reports on seventeen research projects.

Sponsors: U.S. Navy - Office of Naval Research (Contract N00014-81-K-0742); U.S. Navy - Office of Naval Research (Contract N00014-77-C-0266); National Science Foundation (Grant ECS80-07102); Bell Laboratories Fellowship; Amoco Foundation Fellowship; Schlumberger-Doll Research Center Fellowship; Sanders Associates, Inc.; Toshiba Company Fellowship; M.I.T. Vinton Hayes Fellowship; Hertz Foundation Fellowship

DSpace@MIT

Automatic Speech Recognition Using LP-DCTC/DCS Analysis Followed by Morphological Filtering

Author: Hix Penny
Publication venue: ODU Digital Commons
Publication date: 01/01/2006

Front-end feature extraction techniques have long been a critical component in Automatic Speech Recognition (ASR). Nonlinear filtering techniques are becoming increasingly important in this application, and are often better than linear filters at removing noise without distorting speech features. However, the design and analysis of nonlinear filters are more difficult than for linear filters. Mathematical morphology, which creates filters based on shape and size characteristics, is a design structure for nonlinear filters. These filters are limited to minimum and maximum operations that introduce a deterministic bias into filtered signals. This work develops filtering structures based on a mathematical morphology that utilizes the bias while emphasizing spectral peaks. The combination of peak emphasis via LP analysis with morphological filtering results in more noise-robust speech recognition. To help understand the behavior of these pre-processing techniques, the deterministic and statistical properties of the morphological filters are compared to the properties of feature extraction techniques that do not employ such algorithms. The robust behavior of these algorithms for automatic speech recognition in the presence of rapidly fluctuating speech signals with additive and convolutional noise is illustrated. Examples of these nonlinear feature extraction techniques are given using the Aurora 2.0 and Aurora 3.0 databases. Features are computed using LP analysis alone to emphasize peaks, morphological filtering alone, or a combination of the two approaches. Although the absolute best results are normally obtained using a combination of the two methods, morphological filtering alone is nearly as effective and much more computationally efficient.
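The minimum and maximum operations mentioned above, and the deterministic bias they introduce, can be illustrated with a small sketch of 1D flat morphological filters. This is a generic textbook construction, not the thesis's LP-DCTC/DCS front end. The window length `k` and the function names are assumptions chosen for illustration.

```python
def erode(x, k=3):
    """Sliding-window minimum with a flat window of odd length k (clipped at the ends)."""
    h = k // 2
    return [min(x[max(0, i - h): i + h + 1]) for i in range(len(x))]


def dilate(x, k=3):
    """Sliding-window maximum with a flat window of odd length k (clipped at the ends)."""
    h = k // 2
    return [max(x[max(0, i - h): i + h + 1]) for i in range(len(x))]


def opening(x, k=3):
    """Erosion followed by dilation: removes peaks narrower than the window."""
    return dilate(erode(x, k), k)


def closing(x, k=3):
    """Dilation followed by erosion: fills valleys narrower than the window."""
    return erode(dilate(x, k), k)


# A narrow spike (5) and a plateau of width 3 (2, 2, 2).
signal = [0, 1, 5, 1, 0, 2, 2, 2, 0]

opened = opening(signal)
closed = closing(signal)
```

Here the opening flattens the one-sample spike but keeps the three-sample plateau. The bias is one-sided by construction: the opening never exceeds the input and the closing never falls below it.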

Old Dominion University

Inference skipping for more efficient real-time speech enhancement with parallel RNNs

Author: Chen Kai
Le Xiaohuai
Lei Tong
Lu Jing
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 22/07/2022

Deep neural network (DNN) based speech enhancement models have attracted extensive attention due to their promising performance. However, it is difficult to deploy a powerful DNN in real-time applications because of its high computational cost. Typical compression methods such as pruning and quantization do not make good use of the data characteristics. In this paper, we introduce the Skip-RNN strategy into speech enhancement models with parallel RNNs. The states of the RNNs update intermittently without interrupting the update of the output mask, which leads to a significant reduction in computational load without evident audio artifacts. To better leverage the difference between the voice and the noise, we further regularize the skipping strategy with voice activity detection (VAD) guidance, saving more computational load. Experiments on a high-performance speech enhancement model, the dual-path convolutional recurrent network (DPCRN), show the superiority of our strategy over strategies like network pruning or directly training a smaller model. We also validate the generalization of the proposed strategy on two other competitive speech enhancement models.

Comment: 11 pages, 8 figures, accepted by IEEE/ACM TASLP
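The intermittent state update can be pictured with a toy sketch of the general skip-update idea: a binary gate decides at each step whether the recurrent state is recomputed or simply copied forward. This is a scalar illustration with hand-picked weights (`w_x`, `w_h`, `w_p`, `b_p` are all assumptions). It does not reproduce the paper's parallel-RNN models, its output-mask path, or its VAD-guided regularization.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def skip_rnn(inputs, w_x=0.8, w_h=0.5, w_p=1.0, b_p=-1.5):
    """Toy scalar RNN whose state update can be skipped.

    u_tilde accumulates an 'update probability'. When it reaches 0.5 the
    state is recomputed and u_tilde is reset; otherwise the previous state
    is copied and u_tilde keeps growing.
    """
    state = 0.0
    u_tilde = 1.0  # forces an update on the first step
    states, updated = [], []
    for x in inputs:
        update = u_tilde >= 0.5  # binarized gate
        if update:
            state = math.tanh(w_x * x + w_h * state)  # the costly step
        # On a skip, the state is carried over unchanged (no recurrent compute).
        delta = sigmoid(w_p * state + b_p)
        if update:
            u_tilde = delta
        else:
            u_tilde = u_tilde + min(delta, 1.0 - u_tilde)
        states.append(state)
        updated.append(update)
    return states, updated


frames = [0.1, 0.9, -0.3, 0.5, 0.2, 0.7]
states, updated = skip_rnn(frames)
update_ratio = sum(updated) / len(updated)
```

The fraction of steps that recompute the state (`update_ratio`) is a rough proxy for the saved recurrent computation. In a real model the gate parameters would be learned rather than fixed.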

arXiv.org e-Print Archive