2,542 research outputs found

    Statistical natural language generation for dialogue systems based on hierarchical models

    Due to the increasing presence of natural-language interfaces in our lives, natural language processing (NLP) is gaining popularity every year. Until recently, however, most research activity in this area was aimed at Natural Language Understanding (NLU), which is responsible for extracting meaning from natural language input. This is explained by the wider range of practical applications of NLU, such as machine translation, whereas Natural Language Generation (NLG) is mainly used to provide output interfaces and was therefore considered a user interface problem rather than a functionality issue. Generally speaking, NLG is the process of generating text from a semantic representation, which can be expressed in many different forms. A common application of NLG is the so-called Spoken Dialogue System (SDS), in which the user interacts directly by voice with a computer-based system to receive information or perform certain actions, for example buying a plane ticket or booking a table in a restaurant. Dialogue systems represent one of the most interesting applications within the field of speech technologies. The NLG part of such systems has usually been provided by templates, which only fill predefined gaps with the requested information. But now that SDSs are increasing in complexity, more advanced and user-friendly interfaces should be provided, creating a need for a more refined and adaptive approach. One solution to consider is NLG models based on statistical frameworks, in which the system's response to the user is generated in real time and adjusted to the user's performance, instead of just choosing a pertinent template. Thanks to their corpus-based approach, these systems are easy to adapt to different tasks across a range of information domains.
The aim of this work is to present a statistical approach to the problem of utterance generation, which uses cooperation between two different language models (LMs) in order to enhance the efficiency of the NLG module. At the higher level, a class-based language model is used to build the syntactic structure of the sentence. At the second level, a specific language model acts inside each class, dealing with the words. In the dialogue system described in this work, a user asks for information about bus schedules, route schemes, fares and special information. In each dialogue the user therefore has a specific dialogue goal, which the system needs to meet. Meeting this goal can serve as one measure of system performance, alongside the appropriateness of the generated utterances and the average dialogue length, which is important for an interactive information system. The work is organized as follows. Section 2 describes the basic approaches to the NLG task and considers their advantages and disadvantages. Section 3 presents the objective of this work. Section 4 explains the basic model and its novelty. Section 5 presents the details of the task features and the corpora employed. Section 6 contains the experimental results and their explanation, as well as the evaluation of the obtained results. Section 7 summarizes the conclusions and the proposals for future research.
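As a rough illustration of the two-level idea described in this abstract, the sketch below first samples a sequence of phrase classes from a class-level bigram model and then samples words inside each class from that class's own bigram model. Everything here is an assumption made for illustration: the class names, the probabilities, the bigram form of both models, and the function names are hypothetical and are not taken from the work itself, which would estimate its models from a corpus.

```python
import random

# Hypothetical upper-level model: bigram transitions between phrase classes.
CLASS_BIGRAMS = {
    "<s>": {"GREETING": 0.3, "INFORM_TIME": 0.7},
    "GREETING": {"INFORM_TIME": 1.0},
    "INFORM_TIME": {"INFORM_LINE": 0.6, "</s>": 0.4},
    "INFORM_LINE": {"</s>": 1.0},
}

# Hypothetical lower-level models: one word bigram model per class.
WORD_BIGRAMS = {
    "GREETING": {"<s>": {"hello": 1.0}, "hello": {"</s>": 1.0}},
    "INFORM_TIME": {
        "<s>": {"the": 1.0},
        "the": {"bus": 1.0},
        "bus": {"leaves": 1.0},
        "leaves": {"at": 1.0},
        "at": {"nine": 0.5, "ten": 0.5},
        "nine": {"</s>": 1.0},
        "ten": {"</s>": 1.0},
    },
    "INFORM_LINE": {
        "<s>": {"on": 1.0},
        "on": {"line": 1.0},
        "line": {"four": 1.0},
        "four": {"</s>": 1.0},
    },
}


def sample_sequence(bigrams, rng, max_len=20):
    """Sample symbols from a bigram model until the end marker or max_len."""
    sequence, previous = [], "<s>"
    for _ in range(max_len):
        choices = bigrams[previous]
        symbol = rng.choices(list(choices), weights=list(choices.values()))[0]
        if symbol == "</s>":
            break
        sequence.append(symbol)
        previous = symbol
    return sequence


def generate_utterance(rng):
    """Pick the class structure first, then fill each class with words."""
    classes = sample_sequence(CLASS_BIGRAMS, rng)
    words = []
    for phrase_class in classes:
        words.extend(sample_sequence(WORD_BIGRAMS[phrase_class], rng))
    return classes, " ".join(words)


if __name__ == "__main__":
    classes, text = generate_utterance(random.Random(0))
    print(classes)
    print(text)
```

Separating the two levels means the class model only has to capture the order of phrase types, while each word model only has to capture wording inside one phrase type.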

    Identifying Solar Flare Precursors Using Time Series of SDO/HMI Images and SHARP Parameters

    In this paper, we present several methods for constructing precursors of solar flare events, which show great promise for early prediction. A data pre-processing pipeline is built to extract useful data from multiple sources, Geostationary Operational Environmental Satellites (GOES) and Solar Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), to prepare inputs for machine learning algorithms. Two classification models are presented: classification of flares from quiet times for active regions and classification of strong versus weak flare events. We adopt deep learning algorithms to capture both the spatial and temporal information from HMI magnetogram data. Effective feature extraction and feature selection with raw magnetogram data using deep learning and statistical algorithms enable us to train classification models that perform almost as well as those using the active region parameters provided in HMI/Space-Weather HMI-Active Region Patch (SHARP) data files. Case studies show a significant increase in the prediction score around 20 hours before strong solar flare events.

    Visually Guided Sound Source Separation using Cascaded Opponent Filter Network

    The objective of this paper is to recover the original component signals from a mixture audio with the aid of visual cues of the sound sources. Such a task is usually referred to as visually guided sound source separation. The proposed Cascaded Opponent Filter (COF) framework consists of multiple stages, which recursively refine the source separation. A key element in COF is a novel opponent filter module that identifies and relocates residual components between sources. The system is guided by the appearance and motion of the source, and, for this purpose, we study different representations based on video frames, optical flows, dynamic images, and their combinations. Finally, we propose a Sound Source Location Masking (SSLM) technique, which, together with COF, produces a pixel-level mask of the source location. The entire system is trained end-to-end using a large set of unlabelled videos. We compare COF with recent baselines and obtain state-of-the-art performance on three challenging datasets (MUSIC, A-MUSIC, and A-NATURAL). Project page: https://ly-zhu.github.io/cof-net. Comment: main paper 14 pages, ref 3 pages, and supp 7 pages. Revised argument in section 3 and

    Bayesian distance metric learning and its application in automatic speaker recognition systems

    This paper proposes a state-of-the-art Automatic Speaker Recognition (ASR) system based on Bayesian distance metric learning as a feature extractor. In this modeling, I explored the constraints on the distance between modified and simplified i-vector pairs from the same speaker and from different speakers. The distance metric is approximated by a weighted covariance matrix built from the higher eigenvectors of the covariance matrix, which is used to estimate the posterior distribution of the metric distance. Given a speaker tag, I select the data pairs of different speakers with the highest cosine scores to form a set of speaker constraints. This collection captures the most discriminating variability between the speakers in the training data. This Bayesian distance learning approach achieves better performance than the most advanced methods. Furthermore, this method is insensitive to normalization compared to cosine scores, and it is very effective when training data are limited. The modified supervised i-vector based ASR system is evaluated on the NIST SRE 2008 database. The best performance of the combined cosine score, an EER of 1.767%, is obtained using LDA200 + NCA200 + LDA200, and the best performance of Bayes_dml, an EER of 1.775%, is obtained using LDA200 + NCA200 + LDA100. Bayes_dml overcomes the combined norm of cosine scores and gives the best result reported for the short2-short3 condition on NIST SRE 2008 data.
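The abstract compares systems by cosine score and equal error rate (EER). The sketch below shows, under simplifying assumptions, how a cosine score between two vectors and an approximate EER over a set of trial scores can be computed. It illustrates only the baseline scoring and the metric, not the Bayesian distance metric learning method itself; the function names, the threshold sweep, and the toy scores are hypothetical.

```python
import math


def cosine_score(u, v):
    """Cosine similarity between two equal-length vectors (e.g. i-vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the false acceptance rate (FAR) and the false
    rejection rate (FRR) are closest, as the mean of the two rates.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    # Toy scores, invented for illustration only.
    same_speaker = [0.9, 0.4]
    different_speaker = [0.6, 0.1]
    print(equal_error_rate(same_speaker, different_speaker))
```

With only a handful of trials the sweep is coarse; evaluations on large trial lists usually interpolate between neighbouring thresholds.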

    Modeling Memory: Exploring the Relationship between Word Overlap and Single Word Norms When Predicting Judgments and Recall

    This study examined the interactive relationship between associative, semantic, and thematic word pair strength when predicting item relatedness judgments and cued-recall performance. In Experiment One, 112 participants were shown word pairs with varied levels of associative, semantic, and thematic overlap (measured with forward strength, cosine, and latent semantic analysis) and were asked to judge how related item pairs were before taking a cued-recall test. Experiment One had four goals. First, the judgment of associative memory task (JAM) was expanded to include three types of judgments. Next, the interaction between database norms (FSG, COS, and LSA) was tested when predicting judgments and recall. Finally, JAM slopes calculated in Hypothesis One were used to predict recall. Experiment Two sought first to replicate the interaction findings from Experiment One using a new set of stimuli, and second to replicate these interactions when controlling for several single word norms. Overall, Experiment One found significant three-way interactions between the network norms when predicting judgments and recall. Experiment Two partially replicated these interactions. These results suggest that associative, semantic, and thematic memory networks form a set of interdependent memory systems used for both cognitive processes.

    The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting

    The numerous recent breakthroughs in machine learning (ML) make it imperative to consider carefully how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper is focused on the present and future role of machine learning in space weather. The purpose is twofold. On the one hand, we discuss previous works that use ML for space weather forecasting, focusing in particular on the few areas that have seen the most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flare occurrence, of coronal mass ejection propagation time, and of solar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a number of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box. Comment: under revie