103 research outputs found
Natural Language Processing Methods for Acoustic and Landmark Event-Based Features in Speech-Based Depression Detection
The processing of speech as an explicit sequence of events is common in automatic speech recognition (linguistic events), but has received relatively little attention in paralinguistic speech classification despite its potential for characterizing broad acoustic event sequences. This paper proposes a framework for analyzing speech as a sequence of acoustic events, and investigates its application to depression detection. In this framework, acoustic space regions are tokenized to 'words' representing speech events at fixed or irregular intervals. This tokenization allows the exploitation of acoustic word features using proven natural language processing methods. A key advantage of this framework is its ability to accommodate heterogeneous event types: herein we combine acoustic words and speech landmarks, which are articulation-related speech events. Another advantage is the option to fuse such heterogeneous events at various levels, including the embedding level. Evaluation of the proposed framework on both controlled laboratory-grade supervised audio recordings and unsupervised self-administered smartphone recordings highlights its merits across both datasets, with the proposed landmark-dependent acoustic words achieving improvements in F1(depressed) of up to 15% and 13% for SH2-FS and DAIC-WOZ respectively, relative to acoustic speech baseline approaches.
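The tokenization idea in the abstract above can be illustrated with a minimal sketch. It assumes a nearest-centroid codebook and bigram counting; the codebook values, the frame vectors, and the function names are hypothetical and are not taken from the paper, whose actual tokenizer and features may differ.

```python
from collections import Counter
import math

def nearest_token(frame, codebook):
    """Return the index of the codebook centroid closest to `frame` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))

def tokenize(frames, codebook):
    """Map each acoustic frame to an 'acoustic word' label such as 'w0', 'w1', ..."""
    return [f"w{nearest_token(frame, codebook)}" for frame in frames]

def ngram_counts(tokens, n=2):
    """Count n-grams over the token sequence, as one would for text."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Hypothetical 2-D codebook (three regions of acoustic space) and five frames.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (3.8, 0.2), (0.2, 0.1)]

tokens = tokenize(frames, codebook)
bigrams = ngram_counts(tokens, n=2)
```

Once frames are mapped to discrete tokens, any text-style feature (n-gram counts, embeddings) can in principle be computed over the sequence, which is the property the framework relies on.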
A study on marine boundary layer processes in the ITCZ and non-ITCZ regimes over Indian Ocean with INDOEX IFP-99 data
A one-dimensional numerical planetary boundary layer (PBL) model was applied to simulate the dynamical and thermodynamical characteristics of the tropical Indian Ocean under varying convective regimes. Using sounding as well as surface meteorological data obtained during the INDOEX field phase, the PBL model was validated for three different regions within the INDOEX domain. The three regions identified were a coastal location representing suppressed convection, an open ocean region with medium convection, and a region of intense convection in the vicinity of the Inter-Tropical Convergence Zone (ITCZ). The model was integrated using observed soundings as initial as well as lateral boundary conditions, for a period of up to 48 h. The model-simulated surface fields as well as vertical profiles were compared with observations for the three cases. In general the model performance was good. The one-dimensional model could not simulate the dynamical features associated with advection and winds satisfactorily. However, the convective regimes were well simulated. As such, the PBL processes near the ITCZ were better simulated than those in the coastal regions. Results suggest that such a model can be used as a tool to develop high-resolution, time-varying profiles over data-sparse regions to enhance mesoscale analysis.
Multimodal Affect Models: An Investigation of Relative Salience of Audio and Visual Cues for Emotion Prediction
People perceive emotions via multiple cues, predominantly speech and visual cues, and a number of emotion recognition systems utilize both audio and visual cues. Moreover, the static aspects of emotion (the speaker's arousal level is high/low) and the dynamic aspects of emotion (the speaker is becoming more aroused) might be perceived via different expressive cues, and these two aspects are integrated to provide a unified sense of emotion state. However, existing multimodal systems focus on only a single aspect of emotion perception, and the contributions of different modalities toward modeling static and dynamic emotion aspects are not well explored. In this paper, we investigate the relative salience of audio and video modalities to emotion state prediction and emotion change prediction using a Multimodal Markovian affect model. Experiments conducted on the RECOLA database showed that the audio modality is better at modeling the emotion state of arousal and video is better for the emotion state of valence, whereas audio outperforms video in modeling emotion changes for both arousal and valence.
FracDetect: A novel algorithm for 3D fracture detection in digital fractured rocks
Fractures have a governing effect on the physical properties of fractured rocks, such as permeability. Accurate representation of 3D fractures is, therefore, required for precise analysis of digital fractured rocks. However, conventional segmentation methods fail to detect and label the fractures with aperture sizes near or below the resolution of 3D micro-computed tomographic (micro-CT) images, which are visible in the greyscale images, and where greyscale intensity convolution between different phases exists. In addition, conventional methods are highly subject to user interpretation. Herein, a novel algorithm for the automatic detection of fractures from greyscale 3D micro-CT images is proposed. The algorithm involves a low-level early vision stage, which identifies potential fractures, followed by a high-level interpretative stage, which enforces planar continuity to reject false positives and more reliably extract planar fractures from digital rock images. A manually segmented fractured shale sample was used as the ground truth, with which the efficacy of the algorithm in 3D fracture detection was validated. Following this, the proposed and conventional methods were applied to detect fractures in digital fractured coal and shale samples. Based on these analyses, the impact of fracture detection accuracy on the analysis of fractured rocks' physical properties was inferred.
Investigating word affect features and fusion of probabilistic predictions incorporating uncertainty in AVEC 2017
© 2017 Association for Computing Machinery. Predicting emotion intensity and severity of depression are both challenging and important problems within the broader field of affective computing. As part of AVEC 2017, we developed a number of systems to accomplish these tasks. In particular, word affect features, which derive human affect ratings (e.g. arousal and valence) from transcripts, were investigated for predicting depression severity and liking, showing great promise. A simple system based on the word affect features achieved an RMSE of 6.02 on the test set, yielding a relative improvement of 13.6% over the baseline. For the emotion prediction sub-challenge, we investigated multimodal fusion, which incorporated a measure of uncertainty associated with each prediction within an Output-Associative fusion framework for arousal and valence prediction, whilst liking prediction systems mainly focused on text-based features. Our best emotion prediction systems provided significant relative improvements over the baseline on the test set of 39.5%, 17.6%, and 29.3% for arousal, valence, and liking, respectively. Of particular note is that consistent improvements were observed when incorporating prediction uncertainty across various system configurations for predicting arousal and valence, suggesting the importance of taking prediction uncertainty into consideration for fusion and, more broadly, the advantages of probabilistic predictions.
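As a quick check on the depression-severity figures quoted above, the baseline RMSE implied by the two reported numbers can be back-calculated. This is a minimal sketch that assumes relative improvement is defined as (baseline − system) / baseline; the abstract does not state the definition, and the function name is hypothetical.

```python
def implied_baseline(system_rmse, relative_improvement):
    """Baseline RMSE implied by a system RMSE and a relative improvement.

    Assumes relative_improvement = (baseline - system) / baseline,
    which rearranges to baseline = system / (1 - relative_improvement).
    """
    return system_rmse / (1.0 - relative_improvement)

# Figures quoted in the abstract: RMSE 6.02, relative improvement 13.6%.
baseline = implied_baseline(6.02, 0.136)
print(round(baseline, 2))  # 6.97
```

Under that assumed definition, the two reported figures are consistent with a baseline RMSE of roughly 6.97.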
Awareness regarding eye donation among stakeholders in Srikakulam district in South India
Background
There is a huge need for the availability of transplantable donor corneas worldwide to reduce the burden of corneal blindness due to corneal opacity. Voluntary eye donation depends on the awareness levels of various stakeholders in the community. This study aimed to assess the awareness level regarding eye donation among various stakeholders in Srikakulam district in the state of Andhra Pradesh, India.
Methods
A total of 355 subjects were selected from the district using multistage random sampling. A pretested, semi-structured questionnaire was used to collect information regarding each individual’s awareness, knowledge, and perception regarding eye donation. Each response was scored individually and a total score was calculated. Univariate and multivariate regression analyses were used to determine the factors associated with willingness towards eye donation and increased awareness levels.
Results
Of the 355 subjects interviewed, 192 (54%) were male and 163 (46%) were female. The mean age of the stakeholders was 35.9 years (SD ±16.1) and all the study subjects were literate. Ninety-three percent of subjects were aware of the concept of eye donation. Knowledge levels were similar among the teaching community and persons engaged in social service, but lower among students (p < 0.05). Among the stakeholders, there was considerable ambiguity regarding whether persons currently wearing spectacles or suffering from a chronic illness could donate their eyes. Older age group (p < 0.001), female gender (p < 0.001) and education (p < 0.001) were associated with increased knowledge levels. Eighty-two percent of the subjects were willing to donate their eyes, and this was unaffected by gender or geographical location (rural vs urban).
Conclusions
Awareness levels and willingness to donate eyes are high among the stakeholders in Srikakulam district in India. The services of stakeholders could be utilized, in conjunction with other community-based eye donation counselors, to promote awareness regarding eye donation among the general population.
Prevalence of refractive errors among school-going children in a multistate study in India
Aim
Much existing data on childhood refractive error prevalence in India were gathered in local studies, many now dated. The aim of this study was to estimate the prevalence, severity and determinants of refractive errors among school-going children participating in a multistate vision screening programme across India.
Methods
In this cross-sectional study, vision screening was conducted in children aged 5–18 years at schools in five states using a pocket vision screener. Refractive error was measured using retinoscopy and subjective refraction, and was defined both by spherical equivalent (SE) and spherical ametropia, as myopia ≤−0.5 diopters (D), hyperopia ≥+1.0 D and/or astigmatism >0.5 D. Data from the eye with less refractive error were used to determine prevalence.
Results
Among 2 240 804 children (50.9% boys, mean age 11.5 years, SD ±3.3), the prevalence of SE myopia was 1.57% (95% CI 1.54% to 1.60%) at 5–9 years, 3.13% (95% CI 3.09% to 3.16%) at 10–14 years and 4.8% (95% CI 4.73% to 4.86%) at 15–18 years. Hyperopia prevalence was 0.59% (95% CI 0.57% to 0.61%), 0.54% (95% CI 0.53% to 0.56%) and 0.39% (95% CI 0.37% to 0.41%), respectively. When defined by spherical ametropia, these values for myopia were 0.84%, 2.50% and 4.24%, and those for hyperopia were 2.11%, 2.41% and 2.07%, respectively.
Myopia was associated with older age, female gender, private school attendance, urban location and state. The latter appeared to be driven by higher literacy rates.
Conclusions
Refractive error, especially myopia, is common in India. Differences in prevalence between states appear to be driven by literacy rates, suggesting that the burden of myopia may rise as literacy increases.
Role of metabolically active hormones in the insulin resistance associated with short-term glucocorticoid treatment
BACKGROUND: The mechanisms by which glucocorticoid therapy promotes obesity and insulin resistance are incompletely characterized. Modulations of the metabolically active hormones, tumour necrosis factor alpha (TNF alpha), ghrelin, leptin and adiponectin are all implicated in the development of these cardiovascular risk factors. Little is known about the effects of short-term glucocorticoid treatment on levels of these hormones. RESEARCH METHODS AND PROCEDURES: Using a blinded, placebo-controlled approach, we randomised 25 healthy men (mean (SD) age: 24.2 (5.4) years) to 5 days of treatment with either placebo or oral dexamethasone 3 mg twice daily. Fasting plasma TNF alpha, ghrelin, leptin and adiponectin were measured before and after treatment. RESULTS: Mean changes in all hormones did not differ between treatment arms, despite dexamethasone-related increases in body weight, blood pressure, HDL cholesterol and insulin. Changes in calculated indices of insulin sensitivity (HOMA-S, insulin sensitivity index) were strongly related to dexamethasone treatment (p < 0.001). DISCUSSION: Our data do not support a role for TNF alpha, ghrelin, leptin or adiponectin in the insulin resistance associated with short-term glucocorticoid treatment.
A microfluidic device with fluorimetric detection for intracellular components analysis
An integrated microfluidic system that couples lysis of two cell lines, L929 fibroblasts and A549 epithelial cells, with a fluorescence-based enzyme assay was developed to determine β-glucocerebrosidase activity. The microdevice, fabricated in poly(dimethylsiloxane), consists of three main parts: a chemical cell lysis zone based on the sheath flow geometry, a micromeander and an optical fiber detection zone. Unlike many methods described in the literature that are designed to analyse intracellular components, the presented system enables enzyme assays to be performed immediately after the cell lysis process. This reduces the effect of proteases released during lysis on the enzymes being determined. Glucocerebrosidase activity, the diagnostic marker for Gaucher’s disease, is most commonly measured in leukocytes and fibroblasts using 4-methylumbelliferyl-β-D-glucopyranoside as a synthetic β-glucoside. The enzyme cleavage releases the fluorescent product, i.e. 4-methylumbelliferone, and its fluorescence is measured as a function of time. The method of enzyme activity determination described in this paper was adapted for flow measurements in the microdevice. The curve of the enzymatic reaction progress was prepared for three reaction times obtained by applying different flow rates of the solutions introduced to the microsystem. Afterwards, the determined β-glucocerebrosidase activity was recalculated with regard to 10⁵ cells present in the samples used for the tests. The obtained results were compared with cuvette-based measurements. The lysosomal β-glucosidase activities determined in the microsystem were in good correlation with the values determined during macro-scale measurements.
The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016
The 2016 speaker recognition evaluation (SRE'16) is the latest edition in the series of benchmarking events conducted by the National Institute of Standards and Technology (NIST). I4U is a joint entry to SRE'16 resulting from the collaboration and active exchange of information among researchers from sixteen Institutes and Universities across 4 continents. The joint submission and several of its 32 sub-systems were among the top-performing systems. Much effort has been devoted to two major challenges, namely, unlabeled training data and dataset shift from Switchboard-Mixer to the new Call My Net dataset. This paper summarizes the lessons learned and presents the shared view of the sixteen research groups on recent advances, the major paradigm shift, and the common tool chain used in speaker recognition as witnessed in SRE'16. More importantly, we look into the intriguing question of fusing a large ensemble of sub-systems and the potential benefit of large-scale collaboration.