    KNOWLEDGE TRANSFER FOR RUSSIAN CONVERSATIONAL TELEPHONE AUTOMATIC SPEECH RECOGNITION

    This paper describes a method of knowledge transfer from an ensemble of neural network acoustic models to a student network. The method is used to reduce computational cost and improve the quality of the speech recognition system. The experiments consider two ways of generating class labels from the ensemble: interpolation with alignment, and a posteriori probabilities. Model quality was also studied as a function of the smoothing coefficient, which was built into the output log-linear classifier of the neural network (the softmax layer) and used in both the ensemble and the student network. Additionally, the initial and final learning rates were analyzed. We established a relationship between the use of the smoothing coefficient for generating the a posteriori probabilities and the learning-rate parameters. Finally, applying knowledge transfer to the automatic recognition of Russian conversational telephone speech reduced the WER (Word Error Rate) by 2.49% compared with the model trained on alignment from the ensemble of neural networks.
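    As a rough illustration of the idea in this abstract, the sketch below generates soft labels from an ensemble and scores a student against them. It is not the authors' implementation: it assumes the smoothing coefficient divides the logits before the softmax (as a temperature would) and that the ensemble posteriors are combined by a plain average. All function names are hypothetical.

```python
import math

def softmax_with_smoothing(logits, smoothing=1.0):
    """Softmax over logits divided by a smoothing coefficient.

    Assumption: the coefficient acts like a temperature; values above 1
    flatten the distribution, values below 1 sharpen it.
    """
    scaled = [z / smoothing for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_soft_labels(per_model_logits, smoothing=1.0):
    """Average the smoothed posteriors of every model in the ensemble.

    Assumption: a uniform average; the abstract does not specify weights.
    """
    posteriors = [softmax_with_smoothing(l, smoothing) for l in per_model_logits]
    n_models = len(posteriors)
    n_classes = len(posteriors[0])
    return [sum(p[i] for p in posteriors) / n_models for i in range(n_classes)]

def soft_cross_entropy(soft_targets, student_logits, smoothing=1.0):
    """Cross-entropy between ensemble soft labels and the student's output."""
    student = softmax_with_smoothing(student_logits, smoothing)
    return -sum(t * math.log(q) for t, q in zip(soft_targets, student))

# Toy frame with three output classes and a two-model ensemble.
teacher_logits = [[2.0, 1.0, 0.1], [1.5, 1.2, 0.3]]
targets = ensemble_soft_labels(teacher_logits, smoothing=2.0)
loss = soft_cross_entropy(targets, [1.0, 1.0, 1.0], smoothing=2.0)
```

    Using the same coefficient for the ensemble and the student mirrors the abstract's statement that it was applied in both; the toy logits are made up for the example.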

    Bringing Statistical Methodologies for Enterprise Integration of Conversational Agents

    Proceedings of: 9th International Conference on Practical Applications of Agents and Multiagent Systems (PAAMS 11). Salamanca, 6-8 April, 2011. In this paper we present a methodology for developing commercial conversational agents that avoids the effort of manually defining the dialog strategy for the dialog management module. Our corpus-based methodology selects the next system answer by means of a classification process that considers the complete dialog history. This way, system developers can employ standards like VoiceXML to simply define system prompts and the associated grammars that recognize the users' responses to each prompt, and the statistical dialog model automatically selects the next system prompt. We have applied this methodology to the development of an academic conversational agent. Funded by projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC 2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485), and DPS2008-07029-C02-02.

    Speckle observations of the binary asteroid (22) Kalliope with C2PU/PISCO

    We present new speckle measurements of the position of Linus, the satellite of the asteroid (22) Kalliope, obtained at the 1m C2PU-Epsilon telescope on the Plateau de Calern, France. Observations were made in the visible domain with the speckle camera PISCO. We obtained 122 measurements in February-March 2022 and April 2023, with a mean uncertainty close to 10 milli-arcseconds on the angular separation

    Towards a multimedia knowledge-based agent with social competence and human interaction capabilities

    We present work in progress on an intelligent embodied conversation agent in the basic care and healthcare domain. In contrast to most existing agents, the presented agent is intended to have the linguistic, cultural, social and emotional competence needed to interact with the elderly and migrants. It is composed of an ontology-based and reasoning-driven dialogue manager, multimodal communication analysis and generation modules, and a search engine for retrieving from the web the multimedia background content needed to conduct a conversation on a given topic. The presented work is funded by the European Commission under the contract number H2020-645012-RIA.

    Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

    In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to a K-nearest-neighbor classifier and to support vector machines. Two kernels are tested for the latter: linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with a linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is carried out first with respect to the classifiers' error rates and then to evaluate the information expressed by the classifiers' confusion matrices. © Springer Science+Business Media, LLC 2011

    Answering Non-Monotonic Queries in Relational Data Exchange

    Relational data exchange is the problem of translating relational data from a source schema into a target schema, according to a specification of the relationship between the source data and the target data. One of the basic issues is how to answer queries that are posed against target data. While consensus has been reached on the definitive semantics for monotonic queries, this issue turned out to be considerably more difficult for non-monotonic queries. Several semantics for non-monotonic queries have been proposed in the past few years. This article proposes a new semantics for non-monotonic queries, called the GCWA*-semantics. It is inspired by semantics from the area of deductive databases. We show that the GCWA*-semantics coincides with the standard open world semantics on monotonic queries, and we further explore the (data) complexity of evaluating non-monotonic queries under the GCWA*-semantics. In particular, we introduce a class of schema mappings for which universal queries can be evaluated under the GCWA*-semantics in polynomial time (data complexity) on the core of the universal solutions. Comment: 55 pages, 3 figures

    Is spoken language all-or-nothing? Implications for future speech-based human-machine interaction

    Recent years have seen significant market penetration for voice-based personal assistants such as Apple’s Siri. However, despite this success, user take-up is frustratingly low. This article argues that there is a habitability gap caused by the inevitable mismatch between the capabilities and expectations of human users and the features and benefits provided by contemporary technology. Suggestions are made as to how such problems might be mitigated, but a more worrisome question emerges: “is spoken language all-or-nothing”? The answer, based on contemporary views on the special nature of (spoken) language, is that there may indeed be a fundamental limit to the interaction that can take place between mismatched interlocutors (such as humans and machines). However, it is concluded that interactions between native and non-native speakers, or between adults and children, or even between humans and dogs, might provide critical inspiration for the design of future speech-based human-machine interaction.

    Achievement of the planetary defense investigations of the Double Asteroid Redirection Test (DART) mission

    NASA's Double Asteroid Redirection Test (DART) mission was the first to demonstrate asteroid deflection, and the mission's Level 1 requirements guided its planetary defense investigations. Here, we summarize DART's achievement of those requirements. On 2022 September 26, the DART spacecraft impacted Dimorphos, the secondary member of the Didymos near-Earth asteroid binary system, demonstrating an autonomously navigated kinetic impact into an asteroid with limited prior knowledge for planetary defense. Months of subsequent Earth-based observations showed that the binary orbital period was changed by –33.24 minutes, with two independent analysis methods each reporting a 1σ uncertainty of 1.4 s. Dynamical models determined that the momentum enhancement factor, β, resulting from DART's kinetic impact test is between 2.4 and 4.9, depending on the mass of Dimorphos, which remains the largest source of uncertainty. Over five dozen telescopes across the globe and in space, along with the Light Italian CubeSat for Imaging of Asteroids, have contributed to DART's investigations. These combined investigations have addressed topics related to the ejecta, dynamics, impact event, and properties of both asteroids in the binary system. A year following DART's successful impact into Dimorphos, the mission has achieved its planetary defense requirements, although work to further understand DART's kinetic impact test and the Didymos system will continue. In particular, ESA's Hera mission is planned to perform extensive measurements in 2027 during its rendezvous with the Didymos–Dimorphos system, building on DART to advance our knowledge and continue the ongoing international collaboration for planetary defense