Automatic Segmentation of Punjabi Speech Signal using Group Delay
This paper describes the automatic segmentation of continuous speech signals. The language used for segmentation is Punjabi, a widely spoken language. Like all other Indian languages, Punjabi is a syllabic language, so syllables are selected as the basic unit of segmentation. The traditional way of representing the speech signal is in terms of features derived from short-time Fourier analysis. It is difficult to compute the phase and to process the phase function directly from the Fourier transform (FT) phase. By processing the derivative of the FT phase instead, the information in the short-time FT phase function can be extracted. This paper describes the process of automatic segmentation of speech using the group delay technique, which segments continuous Punjabi speech into syllable-like units by using the high-resolution properties of group delay. The group delay function is found to be a better representation of the short-time energy (STE) function for syllable boundary detection.
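The abstract above does not spell out how the derivative of the FT phase is computed. A minimal sketch, assuming the standard identity that gives the group delay of a frame without explicit phase unwrapping: with X the DFT of x[n] and Y the DFT of n·x[n], the group delay is (Re X · Re Y + Im X · Im Y) / |X|². The function names `dft` and `group_delay` are hypothetical, and this is not necessarily the procedure the authors used.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform (dependency-free, for clarity)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_delay(frame, eps=1e-12):
    """Group delay (in samples) of one frame at each DFT bin.

    Uses tau = (X_R * Y_R + X_I * Y_I) / |X|^2, where Y is the DFT of
    n * x[n]. The eps floor (an assumption) avoids division by zero at
    spectral nulls.
    """
    spectrum = dft(frame)
    ramped = dft([t * v for t, v in enumerate(frame)])
    return [
        (a.real * b.real + a.imag * b.imag) / max(abs(a) ** 2, eps)
        for a, b in zip(spectrum, ramped)
    ]


# A unit impulse delayed by 3 samples has a group delay of 3 at every bin.
frame = [0.0] * 16
frame[3] = 1.0
delays = group_delay(frame)
```

For the delayed impulse, every entry of `delays` is 3 up to rounding error, which is a quick sanity check of the identity.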
Automatic Segmentation of Indonesian Speech into Syllables using Fuzzy Smoothed Energy Contour with Local Normalization, Splitting, and Assimilation
This paper discusses the use of the short-term energy contour of a speech signal, smoothed by a fuzzy-based method, to automatically segment the speech into syllabic units. Two additional procedures, local normalization and postprocessing, are proposed to improve the method. Testing on an Indonesian speech dataset shows that local normalization significantly improves the accuracy of fuzzy smoothing. In the postprocessing step, splitting missed short syllables reduces the deletion errors but increases the insertion errors. Conversely, assimilating a single-consonant segment into its previous or next segment reduces the insertion errors but increases the deletion errors. Applying splitting and then assimilation in sequence gives a significant improvement in accuracy and a reduction in deletion errors, at the cost of a slight increase in insertion errors.
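The general idea of segmenting on a short-term energy contour can be sketched as follows. This is an illustration only: it substitutes a plain moving average for the fuzzy smoothing described in the abstract and omits local normalization, splitting, and assimilation. All function names and parameters are hypothetical.

```python
def short_term_energy(signal, frame_len, hop):
    """Sum of squared samples in each frame of length frame_len."""
    return [
        sum(s * s for s in signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, hop)
    ]


def moving_average(values, width):
    """Centered moving average; a stand-in for the paper's fuzzy smoothing."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def local_minima(values):
    """Indices of interior local minima, used as candidate boundaries."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] <= values[i + 1]
    ]


# Two bursts separated by a short silence: the energy dip marks a boundary.
signal = [1] * 8 + [0] * 4 + [1] * 8
energy = short_term_energy(signal, frame_len=4, hop=4)
boundaries = local_minima(energy)
smoothed = moving_average(energy, width=3)
```

On real speech the contour would be smoothed before picking minima; the toy signal is clean enough that the minima are taken from the raw contour here.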
Time-varying group delay as a basis for clustering and segmentation of seismic signals
This paper discusses applications of group delay in the analysis of seismic vibration signals. A method based on an autoregressive (AR) model with a sliding window is used to track how the signal's properties vary over time. Time-frequency maps of group delay can be used to distinguish signals with different characteristics. Moreover, the method is robust to the choice of parameters of the sliding-window AR model. Applications of the time-frequency maps of group delay to signal segmentation and clustering are also discussed. In seismic analysis, the ability to distinguish signals of different seismic nature is very important, especially for safety in underground copper-ore mines. Tools that reveal the origin of a vibration will have a positive impact on the evaluation of hazard level.
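One way to build such a time-frequency map of group delay is sketched below, under stated assumptions: each window is fitted with an AR model by the autocorrelation method (Levinson-Durbin recursion), and the group delay of the all-pole model 1/A(z) is taken as the negative of the group delay of the polynomial A(z). The abstract does not give the authors' estimator or parameters, so the function names, window length, hop, and model order here are all hypothetical.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Biased (unnormalized) autocorrelation for lags 0..max_lag."""
    return [
        sum(x[t] * x[t + lag] for t in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """AR polynomial coefficients [1, a1, ..., ap] from autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:  # degenerate window (e.g. all zeros): stop early
            break
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        updated = a[:]
        for i in range(1, m):
            updated[i] = a[i] + k * a[m - i]
        updated[m] = k
        a = updated
        err *= 1.0 - k * k
    return a


def ar_group_delay(a, omegas, eps=1e-12):
    """Group delay (samples) of the all-pole model 1/A at each omega (rad/sample)."""
    delays = []
    for w in omegas:
        poly = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(a))
        ramp = sum(n * c * cmath.exp(-1j * w * n) for n, c in enumerate(a))
        tau_a = (poly.real * ramp.real + poly.imag * ramp.imag) / max(abs(poly) ** 2, eps)
        delays.append(-tau_a)  # the phase of 1/A is minus the phase of A
    return delays


def group_delay_map(signal, window, hop, order, omegas):
    """One row of group delays per sliding window: a time-frequency map."""
    rows = []
    for start in range(0, len(signal) - window + 1, hop):
        segment = signal[start:start + window]
        a = levinson_durbin(autocorrelation(segment, order), order)
        rows.append(ar_group_delay(a, omegas))
    return rows


signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(64)]
omegas = [0.2, 0.6, 1.0, 1.4]
gd_map = group_delay_map(signal, window=32, hop=16, order=2, omegas=omegas)
```

Rows of `gd_map` could then be compared between windows for segmentation or fed to a clustering method; that step is left out because the abstract does not specify it.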
Automatic annotation of musical audio for interactive applications
PhD
As machines become more and more portable, and part of our everyday life, it becomes
apparent that developing interactive and ubiquitous systems is an important
aspect of new music applications created by the research community. We are interested
in developing a robust layer for the automatic annotation of audio signals, to
be used in various applications, from music search engines to interactive installations,
and in various contexts, from embedded devices to audio content servers. We
propose adaptations of existing signal processing techniques to a real time context.
Amongst these annotation techniques, we concentrate on low and mid-level tasks
such as onset detection, pitch tracking, tempo extraction and note modelling. We
present a framework to extract these annotations and evaluate the performances of
different algorithms.
The first task is to detect onsets and offsets in audio streams within short latencies.
The segmentation of audio streams into temporal objects enables various
manipulations and the analysis of metrical structure. Evaluation of different algorithms
and their adaptation to real time are described. We then tackle the problem of
fundamental frequency estimation, again trying to reduce both the delay and the
computational cost. Different algorithms are implemented for real time and tested
on monophonic recordings and complex signals. Spectral analysis can be
used to label the temporal segments; the estimation of higher level descriptions is
approached. Techniques for modelling of note objects and localisation of beats are
implemented and discussed.
Applications of our framework include live and interactive music installations,
and more generally tools for composers and sound engineers. Speed optimisations
may bring a significant improvement to various automated tasks, such as
automatic classification and recommendation systems. We describe the design of
our software solution, for our research purposes and in view of its integration within
other systems.
EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents); EPSRC grants GR/R54620; GR/S75802/01
The role of HG in the analysis of temporal iteration and interaural correlation
- …