256 research outputs found
Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates
Recruitment Market Trend Analysis with Sequential Latent Variable Models
Recruitment market analysis provides valuable understanding of
industry-specific economic growth and plays an important role for both
employers and job seekers. With the rapid development of online recruitment
services, massive recruitment data have been accumulated and enable a new
paradigm for recruitment market analysis. However, traditional methods for
recruitment market analysis largely rely on the knowledge of domain experts and
classic statistical models, which are usually too general to model large-scale
dynamic recruitment data, and have difficulties to capture the fine-grained
market trends. To this end, in this paper, we propose a new research paradigm
for recruitment market analysis by leveraging unsupervised learning techniques
for automatically discovering recruitment market trends based on large-scale
recruitment data. Specifically, we develop a novel sequential latent variable
model, named MTLVM, which is designed for capturing the sequential dependencies
of corporate recruitment states and is able to automatically learn the latent
recruitment topics within a Bayesian generative framework. In particular, to
capture the variability of recruitment topics over time, we design hierarchical
dirichlet processes for MTLVM. These processes allow to dynamically generate
the evolving recruitment topics. Finally, we implement a prototype system to
empirically evaluate our approach based on real-world recruitment data in
China. Indeed, by visualizing the results from MTLVM, we can successfully
reveal many interesting findings, such as the popularity of LBS related jobs
reached the peak in the 2nd half of 2014, and decreased in 2015.Comment: 11 pages, 30 figure, SIGKDD 201
Quantum Spectrometry for Arbitrary Noise
We present a technique for recovering the spectrum of a non-Markovian bosonic bath and/or non-Markovian noises coupled to a harmonic oscillator. The treatment is valid under the conditions that the environment is large and hot compared to the oscillator, and that its temporal autocorrelation functions are symmetric with respect to time translation and reflection—criteria which we consider fairly minimal. We model a demonstration of the technique as deployed in the experimental scenario of a nanosphere levitated in a Paul trap, and show that it would effectively probe the spectrum of an electric field noise source from
1
0
2
to
1
0
6
Hz
with a resolution inversely proportional to the measurement time. This technique may be deployed in quantum sensing, metrology, computing, and in experimental probes of foundational questions
Towards a Robuster Interpretive Parsing
The input data to grammar learning algorithms often consist of overt forms that do not contain full structural descriptions. This lack of information may contribute to the failure of learning. Past work on Optimality Theory introduced Robust Interpretive Parsing (RIP) as a partial solution to this problem. We generalize RIP and suggest replacing the winner candidate with a weighted mean violation of the potential winner candidates. A Boltzmann distribution is introduced on the winner set, and the distribution’s parameter is gradually decreased. Finally, we show that GRIP, the Generalized Robust Interpretive Parsing Algorithm significantly improves the learning success rate in a model with standard constraints for metrical stress assignment
Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase speech recognition error rates
International audienceDespite years of speech recognition research, little is known about which words tend to be misrecognized and why. Previous work has shown that errors increase for infrequent words, short words, and very loud or fast speech, but many other presumed causes of error (e.g., nearby disfluencies, turn-initial words, phonetic neighborhood density) have never been carefully tested. The reasons for the huge differences found in error rates between speakers also remain largely mysterious. Using a mixed-effects regression model, we investigate these and other factors by analyzing the errors of two state-of-the-art recognizers on conversational speech. Words with higher error rates include those with extreme prosodic characteristics, those occurring turn-initially or as discourse markers, and : acoustically similar words that also have similar language model probabilities. Words preceding disfluent interruption points (first repetition tokens and words before fragments) also have higher error rates. Finally, even after accounting for other factors, speaker differences cause enormous variance in error rates, suggesting that speaker error rate variance is not fully explained by differences in word choice, fluency, or prosodic characteristics. We also propose that doubly confusable pairs, rather than high neighborhood density, may better explain phonetic neighborhood errors in human speech processing
Continuous Interaction with a Virtual Human
Attentive Speaking and Active Listening require that a Virtual Human be capable of simultaneous perception/interpretation and production of communicative behavior. A Virtual Human should be able to signal its attitude and attention while it is listening to its interaction partner, and be able to attend to its interaction partner while it is speaking – and modify its communicative behavior on-the-fly based on what it perceives from its partner. This report presents the results of a four week summer project that was part of eNTERFACE’10. The project resulted in progress on several aspects of continuous interaction such as scheduling and interrupting multimodal behavior, automatic classification of listener responses, generation of response eliciting behavior, and models for appropriate reactions to listener responses. A pilot user study was conducted with ten participants. In addition, the project yielded a number of deliverables that are released for public access
Bibliographic analysis on research publications using authors, categorical labels and the citation network
Modelling Students’ Thematically Associated Knowledge : Networked Knowledge from Affinity Statistics
Peer reviewe
Glucolipotoxicity initiates pancreatic beta-cell death through TNFR5/CD40-mediated STAT1 and NF-kappa B activation
Funded by Diabetes UK, NovoNordisk UK Research Foundation, and St. Bartholomew’s & The Royal London Charitable
Foundation grant awards
Learning and Long-Term Retention of Large-Scale Artificial Languages
Recovering discrete words from continuous speech is one of the first challenges facing language learners. Infants and adults can make use of the statistical structure of utterances to learn the forms of words from unsegmented input, suggesting that this ability may be useful for bootstrapping language-specific cues to segmentation. It is unknown, however, whether performance shown in small-scale laboratory demonstrations of “statistical learning” can scale up to allow learning of the lexicons of natural languages, which are orders of magnitude larger. Artificial language experiments with adults can be used to test whether the mechanisms of statistical learning are in principle scalable to larger lexicons. We report data from a large-scale learning experiment that demonstrates that adults can learn words from unsegmented input in much larger languages than previously documented and that they retain the words they learn for years. These results suggest that statistical word segmentation could be scalable to the challenges of lexical acquisition in natural language learning.National Science Foundation (U.S.) (NSF DDRIG #0746251
- …