1,318 research outputs found

    Context-Dependent Acoustic Modeling without Explicit Phone Clustering

    Phoneme-based acoustic modeling for large-vocabulary automatic speech recognition takes advantage of phoneme context. The large number of context-dependent (CD) phonemes and their highly varying statistics require tying or smoothing to enable robust training. Usually, Classification and Regression Trees are used for phonetic clustering, which is standard in Hidden Markov Model (HMM)-based systems. However, this solution introduces a secondary training objective and does not allow for end-to-end training. In this work, we address direct phonetic context modeling for the hybrid Deep Neural Network (DNN)/HMM that does not build on any phone clustering algorithm for the determination of the HMM state inventory. By performing different decompositions of the joint probability of the center phoneme state and its left and right contexts, we obtain a factorized network consisting of different components, trained jointly. Moreover, the representation of the phonetic context for the network relies on phoneme embeddings. The recognition accuracy of our proposed models on the Switchboard task is comparable to, and slightly better than, that of the hybrid model using the standard state-tying decision trees. Comment: Submitted to Interspeech 202

    A cost-sensitive learning algorithm for fuzzy rule-based classifiers

    Designing classifiers may follow different goals. Which goal to prefer depends on the given cost situation and the class distribution. For example, a classifier designed for best accuracy in terms of misclassifications may fail when the cost of misclassifying one class is much higher than that of the other. This paper presents a decision-theoretic extension that makes fuzzy rule generation cost-sensitive. Furthermore, it shows how interpretability aspects and the costs of feature acquisition can be accounted for during classifier design. Natural language text is used to explain the generated fuzzy rules and their design process.

    Prediction of treatment outcome in a clinical sample of problem drinkers: self-efficacy, alcohol expectancies, and readiness to change

    Cognitive processes related to client motivation are important mediators of alcoholism treatment outcome. The present study aimed to expand previous research on client motivation and treatment outcome by establishing the predictive utility of self-efficacy, alcohol expectancies, and readiness to change in a sample of alcohol-dependent inpatients (N = 83). Treatment outcome was assessed three months following discharge. According to self-reported alcohol use, 22 clients were classified as abstainers and 41 clients as relapsers. Twenty participants were lost to follow-up. Readiness to change and anticipated reinforcement from alcohol predicted abstinence at follow-up. Client motivation was unrelated to both frequency and quantity of alcohol use. In accordance with social learning theory, self-efficacy was inversely correlated with alcohol expectancies. The results of the present study suggest that, once abstinence has been violated, factors other than pretreatment motivation determine drinking behavior.

    RWTH ASR Systems for LibriSpeech: Hybrid vs Attention -- w/o Data Augmentation

    We present state-of-the-art automatic speech recognition (ASR) systems employing a standard hybrid DNN/HMM architecture compared to an attention-based encoder-decoder design for the LibriSpeech task. Detailed descriptions of the system development, including model design, pretraining schemes, training schedules, and optimization approaches, are provided for both system architectures. Both hybrid DNN/HMM and attention-based systems employ bi-directional LSTMs for acoustic modeling/encoding. For language modeling, we employ both LSTM and Transformer based architectures. All our systems are built using RWTH's open-source toolkits RASR and RETURNN. To the best knowledge of the authors, the results obtained when training on the full LibriSpeech training set are the best published currently, both for the hybrid DNN/HMM and the attention-based systems. Our single hybrid system even outperforms previous results obtained from combining eight single systems. Our comparison shows that on the LibriSpeech 960h task, the hybrid DNN/HMM system outperforms the attention-based system by 15% relative on the clean and 40% relative on the other test sets in terms of word error rate. Moreover, experiments on a reduced 100h-subset of the LibriSpeech training corpus show an even more pronounced margin between the hybrid DNN/HMM and attention-based architectures. Comment: Proceedings of INTERSPEECH 201

    Bayes on the court: Evidence for continuous prior-knowledge integration in virtual tennis returns

    Recent decades of research suggest that humans integrate current sensory information and prior expectations in a Bayesian way to guide behaviour. However, while Bayesian integration provides a powerful framework for perception, cognition and motor control, evidence has so far been largely limited to simple lab tasks (Beck et al., 2023). Here we provide evidence for core Bayesian predictions in a complex sensorimotor task at the limit of human performance: returning tennis serves at a speed of 180 km/h or even 260 km/h.

    Bayes on the court: Evidence for continuous prior-knowledge integration in virtual-reality tennis returns

    Due to noisy signals in the sensorimotor system, our perception is constantly subject to uncertainty. This is particularly evident in highly dynamic situations, such as returning a tennis serve. Fundamental research taking a Bayesian approach to decision-making and sensorimotor control argues that uncertainty is reduced by the reliability-weighted integration of current sensory information and accumulated prior knowledge (Körding & Wolpert, 2006). We therefore investigated this mechanism in a virtual-tennis return situation. To this end, 32 young adults (22 females and 10 males, Mage = 21.0, SD = 2.5) learned two probability distributions of serve impact locations in a within-subject design over two days; the distributions differed in whether their central tendency lay closer to the left or the right of the service field. The kinematic information in the serving movement remained identical over all trials due to the identical avatar simulation. The perceptual demands in tracking the ball were high because of a speed similar to a serve in professional tennis. As an indicator of participants’ expectation of the ball-bounce location in action, we assessed the gaze fixation after the predictive saccade before the ball’s bounce. We detected a shift of the fixation, relative to the ball’s actual impact location, towards the respective distribution’s central tendency, and this shift increased over the acquisition period. These results perfectly fit a Bayesian explanatory framework since (1) Körding and Wolpert’s (2006) claim that prior knowledge is integrated into tennis returns according to Bayesian principles is empirically confirmed, and (2) prior-knowledge integration must be understood as a dynamic process in which the eye movements in the early phase of the return movement are increasingly affected by accumulated prior knowledge, which, to our knowledge, was empirically confirmed for complex sensorimotor behaviour for the first time by our study. Körding, K. P., & Wolpert, D. M. (2006). Bayesian decision theory in sensorimotor control. Trends in Cognitive Sciences, 10(7), 319–326. doi: https://doi.org/10.1016/j.tics.2006.05.00

    The diversity of cyanobacterial metabolism: genome analysis of multiple phototrophic microorganisms

    Background: Cyanobacteria are among the most abundant organisms on Earth and represent one of the oldest and most widespread clades known in modern phylogenetics. As the only known prokaryotes capable of oxygenic photosynthesis, cyanobacteria are considered to be a promising resource for renewable fuels and natural products. Our efforts to harness the sun's energy using cyanobacteria would greatly benefit from an increased understanding of the genomic diversity across multiple cyanobacterial strains. In this respect, the advent of novel sequencing techniques and the availability of several cyanobacterial genomes offer new opportunities for understanding microbial diversity and metabolic organization and evolution in diverse environments. Results: Here, we report a whole genome comparison of multiple phototrophic cyanobacteria. We describe genetic diversity found within cyanobacterial genomes, specifically with respect to metabolic functionality. Our results are based on pair-wise comparison of protein sequences and concomitant construction of clusters of likely ortholog genes. We differentiate between core, shared and unique genes and show that the majority of genes are associated with a single genome. In contrast, genes with metabolic function are strongly overrepresented within the core genome that is common to all considered strains. The analysis of metabolic diversity within core carbon metabolism reveals parts of the metabolic networks that are highly conserved, as well as highly fragmented pathways. Conclusions: Our results have direct implications for resource allocation and further sequencing projects. It can be extrapolated that the number of newly identified genes still increases significantly with the number of newly sequenced genomes. Furthermore, genome analysis of multiple phototrophic strains allows us to obtain a detailed picture of metabolic diversity that can serve as a starting point for biotechnological applications and automated metabolic reconstructions.