Hybrid Method for Digits Recognition using Fixed-Frame Scores and Derived Pitch
This paper presents a frame normalization procedure based on traditional dynamic time warping (DTW) using LPC coefficients. The redefined method, called the DTW frame-fixing method (DTW-FF), works by normalizing the word frames of the input against the reference frames. The study is motivated by a limitation of neural networks, which require a fixed number of input nodes when processing multiple inputs in parallel. The research therefore aims to reduce the computation and complexity of a neural network by reducing the number of inputs to the network. A dynamic warping process is used in which the local distance scores along the warping path are fixed and collected so that the scores span an equal number of frames. The paper also considers pitch as a contributing feature for speech recognition. Results showed good performance, with an improvement when pitch is used along with the DTW-FF feature. The convergence rate of steepest gradient descent is also compared with that of the conjugate gradient method; the convergence rate improves when the conjugate gradient method is introduced in the back-propagation algorithm.
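As a rough illustration of the frame-fixing idea described in this abstract, the sketch below aligns an input utterance to a reference with plain DTW and then collects one local distance score per reference frame, so the output length always equals the reference length. The function name `dtw_fixed_frame_scores`, the Euclidean local distance, and the averaging of scores that fall on the same reference frame are all assumptions made for illustration; the paper's exact DTW-FF procedure may differ.

```python
import math

def dtw_fixed_frame_scores(inp, ref):
    """Align inp to ref with DTW and return one local score per reference frame.

    inp and ref are lists of feature vectors (e.g. LPC coefficients per frame).
    Hypothetical helper: averaging scores per reference frame is an assumption.
    """
    n, m = len(inp), len(ref)
    # Local (frame-to-frame) Euclidean distances.
    dist = [[math.dist(inp[i], ref[j]) for j in range(m)] for i in range(n)]

    # Accumulated cost with the usual three-way DTW recursion.
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = dist[i - 1][j - 1] + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )

    # Backtrack along the warping path, grouping local scores by reference frame.
    buckets = [[] for _ in range(m)]
    i, j = n, m
    while i > 0 and j > 0:
        buckets[j - 1].append(dist[i - 1][j - 1])
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)

    # One averaged score per reference frame: a fixed-length feature vector.
    return [sum(b) / len(b) for b in buckets]
```

Because the output length depends only on the reference, inputs of any duration map to the same number of values, which is what a network with a fixed number of input nodes needs.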
Fourier phase and pitch-class sum
Music theorists have proposed two very different geometric models of musical objects, one based on voice leading and the other based on the Fourier transform. On the surface these models are completely different, but they converge in special cases, including many geometries that are of particular analytical interest. Accepted manuscript
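The two quantities named in this title can be computed directly for a pitch-class set. The sketch below is only an illustration: it assumes 12-tone equal temperament, pitch classes given as integers 0 to 11, and the sign convention exp(-2*pi*i*k*p/12) for the k-th Fourier component. The helper names are hypothetical and are not taken from the paper.

```python
import cmath
import math

def fourier_component(pitch_classes, k, modulus=12):
    """Return (magnitude, phase) of the k-th Fourier component of a pitch-class set.

    The sign convention exp(-2*pi*i*k*p/modulus) is an assumption; other
    conventions flip the sign of the phase.
    """
    total = sum(cmath.exp(-2j * math.pi * k * p / modulus) for p in pitch_classes)
    return abs(total), cmath.phase(total)

def pitch_class_sum(pitch_classes, modulus=12):
    """Sum of the pitch classes, reduced modulo the number of pitch classes."""
    return sum(pitch_classes) % modulus
```

For example, `pitch_class_sum([0, 4, 7])` gives the sum of a C major triad, and `fourier_component([0, 4, 8], 3)` gives the third component of an augmented triad, whose magnitude is maximal because the set divides the octave evenly.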
A Novel Method For Speech Segmentation Based On Speakers' Characteristics
Speech Segmentation is the process of change-point detection for partitioning an
input audio stream into regions each of which corresponds to only one audio
source or one speaker. One application of this system is in Speaker Diarization
systems. There are several methods for speaker segmentation; however, most of
the Speaker Diarization Systems use BIC-based Segmentation methods. The main
goal of this paper is to propose a new method for speaker segmentation with
higher speed than the current methods - e.g. BIC - and acceptable accuracy. Our
proposed method is based on the pitch frequency of the speech. The accuracy of
this method is similar to the accuracy of common speaker segmentation methods.
However, its computation cost is much less than theirs. We show that our method
is about 2.4 times faster than the BIC-based method, while the average accuracy
of the pitch-based method is slightly higher than that of the BIC-based method. Comment: 14 pages, 8 figures
Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification
Most speech signal analysis procedures for automatic detection of laryngeal pathologies rely mainly on parameters extracted from time-domain processing. Moreover, calculation of these parameters often requires prior pitch period estimation; therefore, their validity depends heavily on the robustness of pitch detection. This paper presents an alternative approach based on cepstral-domain processing, which has the advantage of not requiring pitch estimation and thus provides a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters already present in the literature, it has an easier physical interpretation while achieving similar performance standards
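To show why cepstral-domain processing needs no prior pitch estimate, the sketch below computes a short-term real cepstrum directly from a signal frame as the inverse DFT of the log magnitude spectrum. It is a generic textbook definition, not the specific parameter set used in the paper. The naive O(n^2) DFT, the function names, and the `floor` value used to avoid log(0) are assumptions chosen to keep the example dependency-free.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); fine for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out

def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum of one frame: inverse DFT of the log magnitude spectrum.

    `floor` is a hypothetical guard against taking log(0) on empty bins.
    """
    spectrum = dft(frame)
    log_mag = [math.log(max(abs(v), floor)) for v in spectrum]
    return [v.real for v in dft(log_mag, inverse=True)]
```

Each frame is processed on its own, so no pitch period has to be located first; in practice a windowed frame and an FFT routine would replace the naive transform.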
GOES-I/M ascent maneuvers from transfer orbit to station
The Geostationary Operational Environmental Satellite (GOES)-I/M station acquisition sequence consists nominally of three in-plane/out-of-plane maneuvers at apogee on the line of relative nodes and a small in-plane maneuver at perigee. Existing software to determine maneuver attitude, ignition time, and burn duration required modification to optimize the out-of-plane parts and admit the noninertial, three-axis stabilized attitude. The Modified Multiple Impulse Station Acquisition Maneuver Planning Program (SENARIO2) was developed from its predecessor, SCENARIO, to optimize the out-of-plane components of the impulsive delta-V vectors. Additional new features include computation of short-term J2 perturbations and output of all premaneuver and postmaneuver orbit elements, coarse maneuver attitudes, propellant usage, spacecraft antenna aspect angles, and ground station coverage. The output data are intended to be used in the launch window computation and by the maneuver targeting computation (General Maneuver (GMAN) Program) software. The maneuver targeting computation in GMAN was modified to admit the GOES-I/M maneuver attitude. Appropriate combinations of ignition time, burn duration, and attitude enable any reasonable target orbit to be achieved
Efficient Bayesian inference for harmonic models via adaptive posterior factorization
NOTICE: this is the author’s version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in NEUROCOMPUTING, [VOL 72, ISSUE 1-3, (2008)] DOI 10.1016/j.neucom.2007.12.05