
    Predicting the age of social network users from user-generated texts with word embeddings

    © 2016 FRUCT. Many web-based applications such as advertising or recommender systems often critically depend on demographic information, which may be unavailable for new or anonymous users. We study the problem of predicting demographic information based on user-generated texts on a Russian-language dataset from a large social network. We evaluate the efficiency of age prediction algorithms based on word2vec word embeddings and conduct a comprehensive experimental evaluation, comparing these algorithms with each other and with classical baseline approaches.

    Speaker Clustering for Multilingual Synthesis


    An investigation of the electrolytic plasma oxidation process for corrosion protection of pure magnesium and magnesium alloy AM50.

    In this study, silicate and phosphate EPO coatings were produced on pure magnesium using an AC power source. It was found that the silicate coatings possess good wear resistance, while the phosphate coatings provide better corrosion protection. A Design of Experiment (DOE) technique, the Taguchi method, was used to systematically investigate the effect of the EPO process parameters on the corrosion protection properties of a coated magnesium alloy AM50 using a DC power source. The experimental design consisted of four factors (treatment time, current density, and KOH and NaAlO₂ concentrations), with three levels of each factor. Potentiodynamic polarization measurements were conducted to determine the corrosion resistance of the coated samples. The optimized processing parameters are 12 minutes, 12 mA/cm² current density, 0.9 g/l KOH, and 15.0 g/l NaAlO₂. The percentage contribution of each factor, determined by analysis of variance (ANOVA), implies that the KOH concentration is the most significant factor affecting the corrosion resistance of the coatings, while treatment time is a major factor affecting the thickness of the coatings. (Abstract shortened by UMI.) Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .M323. Source: Masters Abstracts International, Volume: 44-03, page: 1479. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005.

    An investigation of grammar design in natural-language speech-recognition.

    With the growing interest in and demand for human-machine interaction, much work concerning speech-recognition has been carried out over the past three decades. Although a variety of approaches have been proposed to address speech-recognition issues, such as stochastic (statistical) techniques, grammar-based techniques, techniques integrated with linguistic features, and other approaches, recognition accuracy and robustness remain among the major problems that need to be addressed. At the state of the art, most commercial speech products are constructed using grammar-based speech-recognition technology. In this thesis, we investigate a number of features involved in grammar design in natural-language speech-recognition technology. We hypothesize that, within the same domain, a semantic grammar, which directly encodes some semantic constraints into the recognition grammar, achieves better accuracy but less robustness; a syntactic grammar defines a larger language and therefore has better robustness but less accuracy; and a word-sequence grammar, which includes neither semantics nor syntax, defines the largest language and is therefore the most robust but has very poor recognition accuracy. In this Master's thesis, we claim that proper grammar design can achieve the appropriate compromise between recognition accuracy and robustness. The thesis has been proven by experiments using the IBM Voice-Server SDK, which consists of a VoiceXML browser, IBM ViaVoice Speech Recognition and Text-To-Speech (TTS) engines, sample applications, and other tools for developing and testing VoiceXML applications. The experimental grammars are written in the Java Speech Grammar Format (JSGF), and the testing applications are written in VoiceXML. The tentative experimental results suggest that grammar design is a good area for further study. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2003 .S555. Source: Masters Abstracts International, Volume: 43-01, page: 0244. Adviser: Richard A. Frost. Thesis (M.Sc.)--University of Windsor (Canada), 2004.

    Ensemble deep learning: A review

    Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning models with multilayer processing architectures are showing better performance than shallow or traditional classification models. Deep ensemble learning models combine the advantages of both deep learning models and ensemble learning, such that the final model has better generalization performance. This paper reviews the state-of-the-art deep ensemble models and hence serves as an extensive summary for researchers. The ensemble models are broadly categorised into ensemble models like bagging, boosting and stacking, negative correlation based deep ensemble models, explicit/implicit ensembles, homogeneous/heterogeneous ensembles, decision fusion strategies, unsupervised, semi-supervised, reinforcement learning and online/incremental, multilabel based deep ensemble models. Applications of deep ensemble models in different domains are also briefly discussed. Finally, we conclude this paper with some future recommendations and research directions.

    Experimental study on extreme learning machine applications for speech enhancement

    In wireless telephony and audio data mining applications, it is desirable that noise suppression be robust against changing noise conditions and operate in real time (or faster). The learning effectiveness and speed of artificial neural networks are therefore critical factors in applications for speech enhancement tasks. To address these issues, we present an extreme learning machine (ELM) framework, aimed at the effective and fast removal of background noise from a single-channel speech signal, based on a set of randomly chosen hidden units and analytically determined output weights. Because feature learning with shallow ELM may not be effective for natural signals, such as speech, even with a large number of hidden nodes, hierarchical ELM (H-ELM) architectures are deployed by leveraging sparse auto-encoders. In this manner, we not only keep all the advantages of deep models in approximating complicated functions and maintaining strong regression capabilities, but we also overcome the cumbersome and time-consuming features of both greedy layer-wise pre-training and back-propagation (BP)-based fine tuning schemes, which are typically adopted for training deep neural architectures. The proposed ELM framework was evaluated on the Aurora–4 speech database. The Aurora–4 task provides relatively limited training data, and test speech data corrupted with both additive noise and convolutive distortions for matched and mismatched channels and signal-to-noise ratio (SNR) conditions. In addition, the task includes a subset of testing data involving noise types and SNR levels that are not seen in the training data. The experimental results indicate that when the amount of training data is limited, both ELM- and H-ELM-based speech enhancement techniques consistently outperform the conventional BP-based shallow and deep learning algorithms, in terms of standardized objective evaluations, under various testing conditions.

    Sparse and structured decomposition of audio signals on hybrid dictionaries using musical priors

    This paper investigates the use of musical priors for sparse expansion of audio signals of music, on an overcomplete dual-resolution dictionary taken from the union of two orthonormal bases that can describe both transient and tonal components of a music audio signal. More specifically, chord and metrical structure information are used to build a structured model that takes into account dependencies between coefficients of the decomposition, both for the tonal and for the transient layer. The denoising task application is used to provide a proof of concept of the proposed musical priors. Several configurations of the model are analyzed. Evaluation on monophonic and complex polyphonic excerpts of real music signals shows that the proposed approach provides results whose quality measured by the signal-to-noise ratio is competitive with state-of-the-art approaches, and more coherent with the semantic content of the signal. A detailed analysis of the model in terms of sparsity and in terms of interpretability of the representation is also provided, and shows that the model is capable of giving a relevant and legible representation of Western tonal music audio signals.