
    Sound and noise

    Sound and noise problems in space environment and human tolerance criteria at varying frequencies and intensities

    Effects of an Artificially Lengthened Vocal Tract on Glottal Closed Quotient in Untrained Male Voices

    The use of hard-walled narrow tubes, often called resonance tubes, for the purpose of voice therapy and voice training has a historical precedent and some theoretical support, but the mechanism of any potential benefit from the application of this technique has remained poorly understood. Fifteen vocally untrained male participants produced a series of spoken /Ψ/ vowels at a modal pitch and constant loudness, followed by a minute of repeated phonation into a hard-walled glass tube at the same pitch and loudness targets. The tube parameters and tube phonation task criteria were selected according to theoretical calculations predicting an increase in the acoustic load such that phonation would occur under conditions of near-maximum inertive reactance. Following tube phonation, each participant repeated a similar series of spoken /Ψ/ vowels. Electroglottography (EGG) was used to measure the glottal closed quotient (CQ) during each phase of the experiment. A single-subject, multiple-baseline design with direct replication across subjects was used to identify any changes in CQ across the phases of the experiment. Single-subject analysis using the method of Statistical Process Control (SPC) revealed statistically significant changes in CQ during tube phonation, but with no discernable pattern across the 15 participants. These results indicate that the use of resonance tubes can have a distinct effect on glottal closure, but the mechanism behind this change remains unclear. The implication is that vocal loading techniques such as this need to be studied further with specific attention paid to the underlying mechanism of any measured changes in glottal behavior, and especially to the role of instruction and feedback in the therapeutic and pedagogical application of these techniques
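The closed quotient itself is simple to compute once glottal cycles are delimited. Below is a minimal Python/NumPy sketch of a criterion-level CQ estimate from an EGG trace; the 35% amplitude threshold and the zero-crossing cycle picker are illustrative assumptions, not the study's exact procedure:

```python
import numpy as np

def closed_quotient(egg, threshold=0.35):
    """Estimate the mean glottal closed quotient (CQ) from an EGG signal
    using an amplitude-threshold criterion: within each glottal cycle the
    vocal folds are counted as 'closed' while the mean-removed EGG exceeds
    `threshold` of that cycle's peak-to-peak range."""
    x = np.asarray(egg, dtype=float) - np.mean(egg)
    # Crude cycle picker: positive-going zero crossings delimit cycles.
    # Real EGG analysis would use a more robust period detector.
    crossings = np.where((x[:-1] < 0) & (x[1:] >= 0))[0]
    cqs = []
    for a, b in zip(crossings[:-1], crossings[1:]):
        cycle = x[a:b]
        if len(cycle) < 4:
            continue
        lo, hi = cycle.min(), cycle.max()
        level = lo + threshold * (hi - lo)
        cqs.append(np.mean(cycle > level))  # fraction of cycle above threshold
    return float(np.mean(cqs)) if cqs else float("nan")

# Synthetic check: a 100 Hz sinusoid as a stand-in for an EGG waveform
fs = 8000
t = np.arange(fs) / fs
cq = closed_quotient(np.sin(2 * np.pi * 100 * t))
```

For a pure sinusoid this criterion yields a "CQ" of roughly 0.6 by construction; on real EGG signals the value reflects actual contact behaviour, and the choice of threshold materially affects it, which is one reason EGG-derived CQ values are only comparable under a fixed criterion.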

    Respiratory kinematics and the regulation of subglottic pressure for phonation of pitch jumps - a dynamic MRI study

    The respiratory system is a central part of voice production, as it contributes to the generation of subglottic pressure, which has an impact on voice parameters including fundamental frequency and sound pressure level. Both parameters need to be adjusted precisely during complex phonation tasks such as singing. In particular, the underlying functions of the diaphragm and rib cage in relation to the phonation of pitch jumps are not yet understood in detail. This study aims to analyse respiratory movements during phonation of pitch jumps using dynamic MRI of the lungs. Dynamic images of the breathing apparatus of 7 professional singers were acquired in the supine position during phonation of upwards and downwards pitch jumps in a high, medium, and low range of the singer's tessitura. Distances between characteristic anatomical landmarks in the lung were measured from the series of images obtained. During sustained phonation, the diaphragm elevated and the rib cage lowered in a monotonic manner. During downward pitch jumps, the diaphragm suddenly changed its movement direction and presented a short inspiratory activation, which was predominant in the posterior part and was associated with a shift of the cupola in an anterior direction. The magnitude of this inspiratory movement was greater for jumps that started at a higher fundamental frequency than for those that started at a lower one. In contrast, expiratory movements of the rib cage and anterior diaphragm were simultaneous and continued steadily during the jump. The data support the theory that subglottic pressure is regulated via a sudden diaphragm contraction during phonation of downward pitch jumps, while the rib cage is not involved in short-term adaptations. This strengthens the idea of a differentiated control of the rib cage and diaphragm as distinct functional units during singing phonation

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (the Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability for system designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use

    Statistical Spectral Parameter Estimation of Acoustic Signals with Applications to Byzantine Music

    Digitized acoustical signals of Byzantine music performed by Iakovos Nafpliotis are used to extract the fundamental frequency of each note of the diatonic scale. These empirical results are then contrasted to the theoretical suggestions and previous empirical findings. Several parametric and non-parametric spectral parameter estimation methods are implemented. These include: (1) Phase vocoder method, (2) McAulay-Quatieri method, (3) Levinson-Durbin algorithm, (4) YIN, (5) Quinn & Fernandes Estimator, (6) Pisarenko Frequency Estimator, (7) MUltiple SIgnal Characterization (MUSIC) algorithm, (8) Periodogram method, (9) Quinn & Fernandes Filtered Periodogram, (10) Rife & Vincent Estimator, and (11) the Fourier transform. Algorithm performance was very precise. The psychophysical aspect of human pitch discrimination is explored. The results of eight (8) psychoacoustical experiments were used to determine the aural just noticeable difference (jnd) in pitch and deduce patterns utilized to customize acceptable performable pitch deviation to the application at hand. These customizations [Acceptable Performance Difference (a new measure of frequency differential acceptability), Perceptual Confidence Intervals (a new concept of confidence intervals based on psychophysical experiment rather than statistics of performance data), and one based purely on music-theoretical asymphony] are proposed, discussed, and used in interpretation of results. The results suggest that Nafpliotis' intervals are closer to just intonation than Byzantine theory (with minor exceptions), something not generally found in Thrasivoulos Stanitsas' data. Nafpliotis' perfect fifth is identical to the just intonation, even though he overstretches his octave by fifteen (15) cents. His perfect fourth is also more just, as opposed to Stanitsas' fourth which is directionally opposite. Stanitsas' tendency to exaggerate the major third interval A4-F4 is still seen in Nafpliotis, but curbed.
This is the only noteworthy departure from just intonation, with Nafpliotis being exactly Chrysanthian (the most exaggerated theoretical suggestion of all) and Stanitsas overstretching it even more than Nafpliotis and Chrysanth. Nafpliotis ascends in the second tetrachord more robustly diatonically than Stanitsas. The results are reported and interpreted within the framework of Acceptable Performance Differences
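Of the estimators listed above, the periodogram method is the simplest to illustrate. The Python/NumPy sketch below (with illustrative parameters, not the study's settings) estimates a note's fundamental frequency by peak-picking a zero-padded periodogram:

```python
import numpy as np

def f0_periodogram(x, fs):
    """Estimate the fundamental frequency of a quasi-periodic signal by
    picking the strongest peak of a zero-padded, Hann-windowed periodogram.
    This plain periodogram estimator assumes the fundamental carries the
    most energy, which holds for the synthetic tone below but not for
    every voiced sound."""
    n = 4 * len(x)                      # zero-pad for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x)), n)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[0] = 0.0                   # ignore the DC bin
    return freqs[np.argmax(spectrum)]

fs = 44100
t = np.arange(4096) / fs
# A 440 Hz tone with a weaker octave partial, as a crude stand-in for a sung note
x = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 880 * t)
f0 = f0_periodogram(x, fs)
```

Peak-picking fails when a harmonic outweighs the fundamental, which is one reason a study like this compares the periodogram against dedicated pitch estimators such as YIN and subspace methods such as MUSIC.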

    CLASSIFICATION OF PARKINSON'S DISEASE AND OTHER NEUROLOGICAL DISORDERS USING VOICE FEATURE EXTRACTION AND REDUCTION TECHNIQUES

    This study aimed to differentiate individuals with Parkinson's disease (PD) from those with other neurological disorders (ND) by analyzing voice samples, considering the association between voice disorders and PD. Voice samples were collected from 76 participants using different recording devices and conditions, with participants instructed to sustain the vowel /a/ comfortably. PRAAT software was employed to extract features including autocorrelation (AC), cross-correlation (CC), and Mel frequency cepstral coefficients (MFCC) from the voice samples. Principal component analysis (PCA) was utilized to reduce the dimensionality of the features. Classification Tree (CT), Logistic Regression, Naive Bayes (NB), Support Vector Machines (SVM), and Ensemble methods were employed as supervised machine learning techniques for classification. Each method provided distinct strengths and characteristics, facilitating a comprehensive evaluation of their effectiveness in distinguishing PD patients from individuals with other neurological disorders. The Naive Bayes kernel, using seven PCA-derived components, achieved the highest accuracy rate of 86.84% among the tested classification methods. It is worth noting that classifier performance may vary based on the dataset and specific characteristics of the voice samples. In conclusion, this study demonstrated the potential of voice analysis as a diagnostic tool for distinguishing PD patients from individuals with other neurological disorders. By employing a variety of voice analysis techniques and utilizing different machine learning algorithms, including Classification Tree, Logistic Regression, Naive Bayes, Support Vector Machines, and Ensemble methods, a notable accuracy rate was attained. 
    However, further research and validation using larger datasets are required to consolidate and generalize these findings for future clinical applications.
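As a rough illustration of the PCA-plus-Naive-Bayes pipeline the study found most accurate, the following Python/NumPy sketch projects feature vectors onto seven principal components and fits a Gaussian Naive Bayes classifier. The data are synthetic stand-ins: the study's 76-speaker recordings and PRAAT-extracted features are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the study's data: two classes of acoustic
# feature vectors (the real work used AC, CC and MFCC features).
n, d = 200, 12
X = np.vstack([rng.normal(0.0, 1.0, (n, d)),    # "other neurological disorders"
               rng.normal(1.0, 1.0, (n, d))])   # "Parkinson's disease"
y = np.repeat([0, 1], n)

# PCA via SVD: keep the first k principal components (the study kept seven).
k = 7
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                                # projected features

def fit_gnb(Z, y):
    """Gaussian Naive Bayes: per-class, per-component mean and variance."""
    params = {}
    for c in np.unique(y):
        Zc = Z[y == c]
        params[c] = (Zc.mean(axis=0), Zc.var(axis=0) + 1e-9, len(Zc) / len(Z))
    return params

def predict_gnb(params, Z):
    """Pick the class with the highest Gaussian log-likelihood plus log-prior."""
    scores = []
    for c, (mu, var, prior) in params.items():
        ll = -0.5 * np.sum(np.log(2 * np.pi * var) + (Z - mu) ** 2 / var, axis=1)
        scores.append(ll + np.log(prior))
    return np.array(list(params.keys()))[np.argmax(scores, axis=0)]

params = fit_gnb(Z, y)
acc = np.mean(predict_gnb(params, Z) == y)  # in-sample accuracy on synthetic data
```

A real pipeline would cross-validate rather than score on training data; the 86.84% the study reports is a held-out figure, whereas the accuracy here is in-sample and only demonstrates the mechanics.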

    Proceedings of the Second International Mobile Satellite Conference (IMSC 1990)

    Presented here are the proceedings of the Second International Mobile Satellite Conference (IMSC), held June 17-20, 1990 in Ottawa, Canada. Topics covered include future mobile satellite communications concepts, aeronautical applications, modulation and coding, propagation and experimental systems, mobile terminal equipment, network architecture and control, regulatory and policy considerations, vehicle antennas, and speech compression

    A Study of Accomodation of Prosodic and Temporal Features in Spoken Dialogues in View of Speech Technology Applications

    Inter-speaker accommodation is a well-known property of human speech and human interaction in general. Broadly, it refers to the behavioural patterns of two (or more) interactants and the effect of the (verbal and non-verbal) behaviour of each on that of the other(s). Implementation of this behaviour in spoken dialogue systems is desirable as an improvement on the naturalness of human-machine interaction. However, traditional qualitative descriptions of accommodation phenomena do not provide sufficient information for such an implementation. Therefore, a quantitative description of inter-speaker accommodation is required. This thesis proposes a methodology of monitoring accommodation during a human-human or human-computer dialogue, which utilizes a moving average filter over sequential frames for each speaker. These frames are time-aligned across the speakers, hence the name Time Aligned Moving Average (TAMA). Analysis of spontaneous human dialogue recordings by means of the TAMA methodology reveals ubiquitous accommodation of prosodic features (pitch, intensity and speech rate) across interlocutors, and allows for statistical (time series) modeling of the behaviour, in a way which is meaningful for implementation in spoken dialogue system (SDS) environments. In addition, a novel dialogue representation is proposed that provides an additional point of view to that of TAMA in monitoring accommodation of temporal features (inter-speaker pause length and overlap frequency). This representation is a percentage turn distribution of individual speaker contributions in a dialogue frame, which circumvents strict attribution of speaker turns by considering both interlocutors as synchronously active. Both TAMA and turn distribution metrics indicate that correlation of average pause length and overlap frequency between speakers can be attributed to accommodation (a debated issue), and point to possible improvements in SDS “turn-taking” behaviour.
    Although the findings of the prosodic and temporal analyses can directly inform SDS implementations, further work is required in order to describe inter-speaker accommodation sufficiently, as well as to develop an adequate testing platform for evaluating the magnitude of perceived improvement in human-machine interaction. Therefore, this thesis constitutes a first step towards a convincingly useful implementation of accommodation in spoken dialogue systems
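The TAMA idea can be sketched compactly. The Python/NumPy example below assumes, purely for illustration, 20 s frames with a 10 s step and synthetic pitch tracks; the thesis's actual frame parameters and duration-weighting scheme are not reproduced here.

```python
import numpy as np

def tama(times, values, frame_len=20.0, frame_step=10.0, t_end=100.0):
    """Time Aligned Moving Average sketch: average a prosodic feature
    (e.g. pitch) over overlapping frames laid out on a common time axis,
    so both speakers' tracks are sampled at the same frame centres."""
    starts = np.arange(0.0, t_end - frame_len + 1e-9, frame_step)
    out = []
    for s in starts:
        mask = (times >= s) & (times < s + frame_len)
        out.append(values[mask].mean() if mask.any() else np.nan)
    return np.array(out)

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, 2000))
# Two synthetic speakers sharing a slow pitch trend (a crude stand-in
# for convergent, i.e. accommodating, behaviour) plus independent noise.
trend = 120 + 0.3 * t
a = trend + rng.normal(0, 5, t.size)
b = trend + 10 + rng.normal(0, 5, t.size)
fa, fb = tama(t, a), tama(t, b)
# Correlating the time-aligned frame series quantifies the convergence
r = np.corrcoef(fa, fb)[0, 1]
```

Correlating (or fitting time-series models to) such aligned frame series is what turns a qualitative notion of "accommodation" into a quantity an SDS could track online, which is the motivation the abstract describes.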

    Fractal based speech recognition and synthesis

    Transmitting a linguistic message is most often the primary purpose of speech communication, and it is the recognition of this message by machine that would be most useful. This research consists of two major parts. The first part presents a novel and promising approach for estimating the degree of recognition of speech phonemes and makes use of a new set of features based on fractals. The main methods of computing the fractal dimension of speech signals are reviewed, and a new speaker-independent speech recognition system developed at De Montfort University is described in detail. Finally, a Least Square Method as well as a novel Neural Network algorithm is employed to derive the recognition performance of the speech data. The second part of this work studies the synthesis of speech words, which is based mainly on the fractal dimension to create natural-sounding speech. The work shows that by careful use of the fractal dimension together with the phase of the speech signal to ensure consistent intonation contours, natural-sounding speech synthesis is achievable with word-level speech. In order to extend the flexibility of this framework, we focused on the filtering and the compression of the phase to maintain and produce natural-sounding speech. A ‘naturalness level’ is achieved as a result of the fractal characteristic used in the synthesis process. Finally, a novel speech synthesis system based on fractals developed at De Montfort University is discussed. Throughout our research, simulation experiments were performed on continuous speech data available from the Texas Instruments/Massachusetts Institute of Technology (TIMIT) database, which is designed to provide the speech research community with a standardised corpus for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems
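Fractal-dimension features of the kind this work builds on can be computed in several ways; one standard estimator is Higuchi's method, sketched below in Python/NumPy. The thesis reviews multiple fractal-dimension methods, so this particular choice is illustrative rather than the thesis's own algorithm.

```python
import numpy as np

def higuchi_fd(x, kmax=10):
    """Higuchi fractal dimension of a 1-D signal. Returns a value near 1
    for smooth curves and near 2 for very irregular ones, which is what
    makes it a candidate discriminative feature for speech frames."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)       # subsampled curve at scale k
            if len(idx) < 2:
                continue
            # Curve length at scale k, normalised per Higuchi (1988)
            L = np.sum(np.abs(np.diff(x[idx]))) * (n - 1) / ((len(idx) - 1) * k * k)
            lengths.append(L)
        lk.append(np.mean(lengths))
    # The fractal dimension is the slope of log L(k) against log(1/k)
    k_vals = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k_vals), np.log(lk), 1)
    return slope

rng = np.random.default_rng(2)
fd_line = higuchi_fd(np.linspace(0, 1, 1000))   # smooth ramp: FD close to 1
fd_noise = higuchi_fd(rng.normal(size=1000))    # white noise: FD close to 2
```

The spread between smooth and irregular signals is what a fractal feature set exploits: different phoneme classes (e.g. vowels versus fricatives) occupy different parts of that range, so the dimension can serve as a compact per-frame feature.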