6,664 research outputs found
Recommended from our members
Non-Negative Tensor Factorization Applied to Music Genre Classification
Music genre classification techniques are typically applied to the data matrix whose columns are the feature vectors extracted from music recordings. In this paper, a feature vector is extracted using a texture window of one sec, which enables the representation of any 30 sec long music recording as a time sequence of feature vectors, thus yielding a feature matrix. Consequently, by stacking the feature matrices associated to any dataset recordings, a tensor is created, a fact which necessitates studying music genre classification using tensors. First, a novel algorithm for non-negative tensor factorization (NTF) is derived that extends the non-negative matrix factorization. Several variants of the NTF algorithm emerge by employing different cost functions from the class of Bregman divergences. Second, a novel supervised NTF classifier is proposed, which trains a basis for each class separately and employs basis orthogonalization. A variety of spectral, temporal, perceptual, energy, and pitch descriptors is extracted from 1000 recordings of the GTZAN dataset, which are distributed across 10 genre classes. The NTF classifier performance is compared against that of the multilayer perceptron and the support vector machines by applying a stratified 10-fold cross validation. A genre classification accuracy of 78.9% is reported for the NTF classifier demonstrating the superiority of the aforementioned multilinear classifier over several data matrix-based state-of-the-art classifiers
Exploiting Nonlinear Recurrence and Fractal Scaling Properties for Voice Disorder Detection
Background: Voice disorders affect patients profoundly, and acoustic tools can potentially measure voice function objectively. Disordered sustained vowels exhibit wide-ranging phenomena, from nearly periodic to highly complex, aperiodic vibrations, and increased "breathiness". Modelling and surrogate data studies have shown significant nonlinear and non-Gaussian random properties in these sounds. Nonetheless, existing tools are limited to analysing voices displaying near periodicity, and do not account for this inherent biophysical nonlinearity and non-Gaussian randomness, often using linear signal processing methods insensitive to these properties. They do not directly measure the two main biophysical symptoms of disorder: complex nonlinear aperiodicity, and turbulent, aeroacoustic, non-Gaussian randomness. Often these tools cannot be applied to more severe disordered voices, limiting their clinical usefulness.

Methods: This paper introduces two new tools to speech analysis: recurrence and fractal scaling, which overcome the range limitations of existing tools by addressing directly these two symptoms of disorder, together reproducing a "hoarseness" diagram. A simple bootstrapped classifier then uses these two features to distinguish normal from disordered voices.

Results: On a large database of subjects with a wide variety of voice disorders, these new techniques can distinguish normal from disordered cases, using quadratic discriminant analysis, to overall correct classification performance of 91.8% plus or minus 2.0%. The true positive classification performance is 95.4% plus or minus 3.2%, and the true negative performance is 91.5% plus or minus 2.3% (95% confidence). This is shown to outperform all combinations of the most popular classical tools.

Conclusions: Given the very large number of arbitrary parameters and computational complexity of existing techniques, these new techniques are far simpler and yet achieve clinically useful classification performance using only a basic classification technique. They do so by exploiting the inherent nonlinearity and turbulent randomness in disordered voice signals. They are widely applicable to the whole range of disordered voice phenomena by design. These new measures could therefore be used for a variety of practical clinical purposes.

Affective games:a multimodal classification system
Affective gaming is a relatively new field of research that exploits human emotions to influence gameplay for an enhanced player experience. Changes in player’s psychology reflect on their behaviour and physiology, hence recognition of such variation is a core element in affective games. Complementary sources of affect offer more reliable recognition, especially in contexts where one modality is partial or unavailable. As a multimodal recognition system, affect-aware games are subject to the practical difficulties met by traditional trained classifiers. In addition, inherited game-related challenges in terms of data collection and performance arise while attempting to sustain an acceptable level of immersion. Most existing scenarios employ sensors that offer limited freedom of movement resulting in less realistic experiences. Recent advances now offer technology that allows players to communicate more freely and naturally with the game, and furthermore, control it without the use of input devices. However, the affective game industry is still in its infancy and definitely needs to catch up with the current life-like level of adaptation provided by graphics and animation
On the analysis of EEG power, frequency and asymmetry in Parkinson's disease during emotion processing
Objective: While Parkinson’s disease (PD) has traditionally been described as a movement disorder, there is growing evidence of disruption in emotion information processing associated with the disease. The aim of this study was to investigate whether there are specific electroencephalographic (EEG) characteristics that discriminate PD patients and normal controls during emotion information processing.
Method: EEG recordings from 14 scalp sites were collected from 20 PD patients and 30 age-matched normal controls. Multimodal (audio-visual) stimuli were presented to evoke specific targeted emotional states such as happiness, sadness, fear, anger, surprise and disgust. Absolute and relative power, frequency and asymmetry measures derived from spectrally analyzed EEGs were subjected to repeated ANOVA measures for group comparisons as well as to discriminate function analysis to examine their utility as classification indices. In addition, subjective ratings were obtained for the used emotional stimuli.
Results: Behaviorally, PD patients showed no impairments in emotion recognition as measured by subjective ratings. Compared with normal controls, PD patients evidenced smaller overall relative delta, theta, alpha and beta power, and at bilateral anterior regions smaller absolute theta, alpha, and beta power and higher mean total spectrum frequency across different emotional states. Inter-hemispheric theta, alpha, and beta power asymmetry index differences were noted, with controls exhibiting greater right than left hemisphere activation. Whereas intra-hemispheric alpha power asymmetry reduction was exhibited in patients bilaterally at all regions. Discriminant analysis correctly classified 95.0% of the patients and controls during emotional stimuli.
Conclusion: These distributed spectral powers in different frequency bands might provide meaningful information about emotional processing in PD patients
Machine Analysis of Facial Expressions
No abstract
- …