1,404 research outputs found

Low bit rate digital speech signal processing systems

Author: Ahmadi S.
Publication venue: Department of Electrical Engineering, Imperial College London
Publication date: 01/01/1980

Imperial Users onl

Spiral - Imperial College Digital Repository

Estimation of Sparse MIMO Channels with Common Support

Author: Barbotin Yann
Hormati Ali
Rangan Sundeep
Vetterli Martin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2011

We consider the problem of estimating sparse communication channels in the MIMO context. In small to medium bandwidth communications, as in the current standards for OFDM and CDMA communication systems (with bandwidth up to 20 MHz), such channels are individually sparse and at the same time share a common support set. Since the underlying physical channels are inherently continuous-time, we propose a parametric sparse estimation technique based on finite rate of innovation (FRI) principles. Parametric estimation is especially relevant to MIMO communications as it allows for a robust estimation and concise description of the channels. The core of the algorithm is a generalization of conventional spectral estimation methods to multiple input signals with common support. We show the application of our technique for channel estimation in OFDM (uniformly/contiguous DFT pilots) and CDMA downlink (Walsh-Hadamard coded schemes). In the presence of additive white Gaussian noise, theoretical lower bounds on the estimation of sparse common-support (SCS) channel parameters in Rayleigh fading conditions are derived. Finally, an analytical spatial channel model is derived, and simulations on this model in the OFDM setting show that the symbol error rate (SER) is reduced by a factor of 2 (at 0 dB SNR) to 5 (at high SNR) compared to standard non-parametric methods such as lowpass interpolation. Comment: 12 pages / 7 figures. Submitted to IEEE Transactions on Communicatio

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

CiteSeerX

Data reduction for the transmission of time encoded speech.

Author: Longshaw Stephen
Publication date: 01/01/1985

OPUS

Orthogonal transform feasibility study

Author: Robinson G. S.

The application of various orthogonal transformations to communication was investigated, with particular emphasis placed on speech and visual signal processing. The fundamentals of the one- and two-dimensional orthogonal transforms and their application to speech and visual signals are treated in detail

NASA Technical Reports Server

A Review of Analog Audio Scrambling Methods for Residual Intelligibility

Author: Selvan P.Arul
Srinivasan A.
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 31/07/2012

This paper reviews the techniques available in different categories of audio scrambling schemes with respect to residual intelligibility. According to Shannon's secure communication theory, for the residual intelligibility to be zero the scrambled signal must represent a white signal. A scrambling scheme that has zero residual intelligibility is therefore said to be highly secure. Many analog audio scrambling algorithms that aim to achieve lower levels of residual intelligibility are available. This paper reviews the existing analog audio scrambling algorithms proposed so far, together with their properties and limitations. The aim is to provide insight for evaluating the analog audio scrambling schemes available to date. The review shows that the algorithms have their strengths and weaknesses and that no algorithm satisfies all the factors to the maximum extent. Keywords: residual intelligibility, audio scrambling, speech scramblin

International Institute for Science, Technology and Education (IISTE): E-Journals

Quantum Computing Assisted Speech Processing

Author: Strobl Melvin
Publication venue: Karlsruher Institut für Technologie
Publication date: 03/01/2022

Human-machine interaction in general, and speech processing in particular, are key disciplines in today's consumer electronics. Although the computing power of mobile devices has increased considerably in recent years, tasks such as speech recognition still rely mainly on cloud-based solutions. In such architectures, not only high accuracy but also a fast response time is essential for a realistic and user-friendly application. Modern approaches use machine learning for speech recognition, which requires high-performance hardware and extensive datasets. Besides the actual training and inference of such machine learning models, speech recognition requires the extraction of acoustic features from the recorded speech. Spectrograms have proven to be a well-suited feature space for this purpose and have become established in today's systems. An application of quantum computers to speech recognition was previously proposed in [YQC+20b], where a neural network trained on spectrograms manipulated by a quantum computer exceeded the validation accuracy of the classical approach. Quantum computers, however, are known above all for their superiority over classical computers in computing certain algorithms. Since the quantum Fourier transform, the equivalent of the classical Fourier transform on a quantum computer, is one such algorithm, the natural question arises, and with it the topic of this thesis, whether there are possibilities or even advantages in using the quantum Fourier transform for spectrogram generation. Investigating this question requires building a suitable framework in which a short-time quantum Fourier transform is developed, optimized and, where applicable, combined with noise suppression.
Subsequently, the accuracy of a neural network trained on the features generated by the short-time quantum Fourier transform is evaluated and discussed. Because speech synthesis, as another subcategory of speech processing, requires an entirely different framework and brings a whole set of further challenges, even though many insights gained from speech recognition carry over to it, this thesis concentrates exclusively on speech recognition. A modular approach allows different signal types and transforms to be swapped quickly and tested either in simulation or on real quantum computers. To assess the accuracy of the neural network given the features from different configurations of the short-time quantum Fourier transform, the architecture proposed in [YQC+20b] is used as a starting point, and its accuracy of 95.12 % serves as the reference value. Experiments show that quantum computers of the "Noisy Intermediate Scale Quantum" era are able to process the quantum Fourier transform of strongly band-limited harmonic oscillations. However, the limited access to more complex quantum computers, which are needed to meet the sampling-rate requirements of speech signals with respect to time and frequency resolution, prohibits an application in practical speech recognition scenarios. Using a simulation environment with the noise model of a quantum computer, combined with the approaches developed in this thesis, the spectrogram generated with the short-time quantum Fourier transform enables the neural network to reach a test accuracy of 89.92 %, although the speed-up potentially available on real devices is lost.
Although the accuracy does not exceed the reference, and the noise and capacity of "Noisy Intermediate Scale Quantum" devices limit the applicability of speech recognition with quantum advantage, the results motivate further investigation of practical applications of the quantum Fourier transform for speech processing
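
To illustrate the idea of a short-time quantum Fourier transform used for spectrogram generation, the following is a minimal classical sketch and not the framework from the thesis. It assumes that each frame is amplitude-encoded (normalized to unit length), that the frame length is a power of two so it maps onto a whole number of qubits, and that the spectrogram is read out as ideal measurement probabilities with no noise model and no window function. The function names and the test tone are hypothetical.

```python
import cmath
import math

def qft_amplitudes(state):
    """Apply the QFT matrix to a state vector (classical simulation)."""
    n = len(state)
    return [
        sum(a * cmath.exp(2j * math.pi * j * k / n) for j, a in enumerate(state))
        / math.sqrt(n)
        for k in range(n)
    ]

def stqft_spectrogram(signal, frame_len, hop):
    """Per-frame measurement probabilities |amplitude|^2 after the QFT."""
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        norm = math.sqrt(sum(v * v for v in frame))
        if norm == 0.0:
            frames.append([0.0] * frame_len)  # silent frame: nothing to encode
            continue
        state = [v / norm for v in frame]  # amplitude encoding (assumed)
        frames.append([abs(b) ** 2 for b in qft_amplitudes(state)])
    return frames

# Hypothetical test tone: 2 cycles per 8-sample frame (3 qubits per frame).
tone = [math.cos(2 * math.pi * 2 * t / 8) for t in range(32)]
spec = stqft_spectrogram(tone, frame_len=8, hop=4)
```

Because the QFT is unitary, each non-silent frame of `spec` sums to one; the normalization discards the absolute frame energy, which a practical pipeline would have to carry separately.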

KITopen

Picture coding in viewdata systems

Author: K. N. Ngan
Publication date: 01/01/1982

Viewdata systems in commercial use at present offer the facility for transmitting alphanumeric text and graphic displays via the public switched telephone network. An enhancement to the system would be to transmit true video images instead of graphics. Such a system, under development in Britain at present, uses Differential Pulse Code Modulation (DPCM) and a transmission rate of 1200 bits/sec. Error protection is achieved by the use of error protection codes, which increases the channel requirement. In this thesis, error detection and correction of DPCM coded video signals without the use of channel error protection is studied. The scheme operates entirely at the receiver by examining the local statistics of the received data to determine the presence of errors. Error correction is then undertaken by interpolation from adjacent correct or previously corrected data. DPCM coding of pictures has the inherent disadvantage of a slow build-up of the displayed picture at the receiver and difficulties with image size manipulation. In order to fit the pictorial information into a viewdata page, its size has to be reduced. Unitary transforms, typically the discrete Fourier transform (DFT), the discrete cosine transform (DCT) and the Hadamard transform (HT) enable lowpass filtering and decimation to be carried out in a single operation in the transform domain. Size reductions of different orders are considered and the merits of the DFT, DCT and HT are investigated. With limited channel capacity, it is desirable to remove the redundancy present in the source picture in order to reduce the bit rate. Orthogonal transformation decorrelates the spatial sample distribution and packs most of the image energy in the low order coefficients. This property is exploited in bit-reduction schemes which are adaptive to the local statistics of the different source pictures used. In some cases, bit rates of less than 1.0 bit/pel are achieved with satisfactory received picture quality. 
Unlike DPCM systems, transform coding has the advantage of being able to display rapidly a picture of low resolution by initial inverse transformation of the low order coefficients only. Picture resolution is then progressively built up as more coefficients are received and decoded. Different sequences of picture update are investigated to find that which achieves the best subjective quality with the fewest possible coefficients transmitted
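
The two transform-domain ideas in this abstract, size reduction in a single operation and progressive build-up from the low order coefficients, can be sketched in one dimension as follows. This is only an illustration under stated assumptions and not the scheme from the thesis: it uses an orthonormal DCT (one of the three transforms the abstract names), a single hypothetical 8-pel scan line instead of a two-dimensional picture, and no quantization or bit allocation. All function names are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(x)))
    return out

def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [
        sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k in range(n))
        for i in range(n)
    ]

def progressive(x, kept):
    """Reconstruct at full size from only the first `kept` coefficients."""
    c = dct(x)
    return idct(c[:kept] + [0.0] * (len(c) - kept))

def shrink(x, m):
    """Lowpass filter and decimate to m samples in one transform-domain step."""
    c = dct(x)
    gain = math.sqrt(m / len(x))  # keeps the mean level unchanged
    return idct([gain * v for v in c[:m]])

def rms_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))

row = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical 8-pel scan line
errors = [rms_error(row, progressive(row, kept)) for kept in (1, 2, 4, 8)]
half = shrink(row, 4)  # half-size version of the same line
```

Since the transform is orthonormal, the values in `errors` cannot increase as more coefficients arrive, which is the property that makes a coarse-to-fine picture update possible; `shrink` simply truncates the coefficient vector and inverts at the smaller size.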

Loughborough University Institutional Repository

"Rewiring" Filterbanks for Local Fourier Analysis: Theory and Practice

Author: Hirakawa Keigo
Wolfe Patrick J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/09/2009

This article describes a series of new results outlining equivalences between certain "rewirings" of filterbank system block diagrams, and the corresponding actions of convolution, modulation, and downsampling operators. This gives rise to a general framework of reverse-order and convolution subband structures in filterbank transforms, which we show to be well suited to the analysis of filterbank coefficients arising from subsampled or multiplexed signals. These results thus provide a means to understand time-localized aliasing and modulation properties of such signals and their subband representations--notions that are notably absent from the global viewpoint afforded by Fourier analysis. The utility of filterbank rewirings is demonstrated by the closed-form analysis of signals subject to degradations such as missing data, spatially or temporally multiplexed data acquisition, or signal-dependent noise, such as are often encountered in practical signal processing applications

arXiv.org e-Print Archive

Crossref