21 research outputs found

    Language Modeling with Power Low Rank Ensembles

    We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling in which ensembles of low-rank matrices and tensors are used to obtain smoothed probability estimates of words in context. Our method can be understood as a generalization of n-gram modeling to non-integer n, and it includes standard techniques such as absolute discounting and Kneser-Ney smoothing as special cases. PLRE training is efficient, and our approach outperforms state-of-the-art modified Kneser-Ney baselines in perplexity on large corpora as well as in BLEU score on a downstream machine translation task.
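    The low-rank ensemble itself is the paper's contribution, but the Kneser-Ney special case it generalizes is easy to sketch. Below is a minimal, self-contained Python illustration of interpolated Kneser-Ney smoothing for a bigram model; the function names and the discount value are illustrative, not taken from the paper.

```python
from collections import Counter

def kneser_ney_bigram(tokens, discount=0.75):
    """Interpolated Kneser-Ney smoothing for a bigram LM.

    Returns a function prob(word, context) giving the smoothed
    probability P(word | context). A sketch of the classic special
    case that PLRE generalizes, not the low-rank ensemble itself.
    """
    bigrams = Counter(zip(tokens, tokens[1:]))
    unigrams = Counter(tokens[:-1])
    # Continuation counts: number of distinct left contexts per word.
    continuations = Counter(w for (_, w) in bigrams)
    # Follower counts: number of distinct words seen after each context.
    followers = Counter(v for (v, _) in bigrams)
    total_bigram_types = len(bigrams)

    def prob(word, context):
        c_vw = bigrams[(context, word)]
        c_v = unigrams[context]
        p_cont = continuations[word] / total_bigram_types
        if c_v == 0:
            # Unseen context: back off entirely to the continuation distribution.
            return p_cont
        # Interpolation weight: mass freed by discounting, spread over followers.
        lam = discount * followers[context] / c_v
        return max(c_vw - discount, 0.0) / c_v + lam * p_cont

    return prob

# Usage:
# p = kneser_ney_bigram("the cat sat on the mat".split())
# print(p("cat", "the"))
```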

    On adaptive decision rules and decision parameter adaptation for automatic speech recognition

    Recent advances in automatic speech recognition are accomplished by designing a plug-in maximum a posteriori decision rule such that the forms of the acoustic and language model distributions are specified and the parameters of the assumed distributions are estimated from a collection of speech and language training corpora. Maximum-likelihood point estimation is by far the most prevalent training method. However, due to the problems of unknown speech distributions, sparse training data, high spectral and temporal variability in speech, and possible mismatch between training and testing conditions, a dynamic training strategy is needed. To cope with changing speakers and speaking conditions in real operational settings for high-performance speech recognition, such paradigms incorporate a small amount of speaker- and environment-specific adaptation data into the training process. Bayesian adaptive learning is an optimal way to combine the prior knowledge in an existing collection of general models with a new set of condition-specific adaptation data. In this paper, the mathematical framework for Bayesian adaptation of acoustic and language model parameters is first described. Maximum a posteriori point estimation is then developed for hidden Markov models and for a number of useful parameter densities commonly used in automatic speech recognition and natural language processing.
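    As a concrete illustration of the MAP point estimation described here, the classic MAP re-estimation of a Gaussian mean in an HMM interpolates the prior mean with the data-driven estimate in proportion to how much adaptation data the Gaussian actually saw. A minimal sketch, assuming pre-computed occupation probabilities; the names and the default prior weight are illustrative, not the paper's.

```python
import numpy as np

def map_adapt_mean(prior_mean, frames, posteriors, tau=16.0):
    """MAP re-estimation of one Gaussian mean.

    prior_mean: (D,) mean from the speaker-independent model.
    frames:     (T, D) adaptation feature vectors.
    posteriors: (T,) occupation probabilities gamma_t for this
                Gaussian (e.g., from a forward-backward pass).
    tau:        prior weight controlling how strongly the prior
                mean anchors the estimate (illustrative default).

    Implements the standard formula
        mu_hat = (tau * mu_0 + sum_t gamma_t x_t) / (tau + sum_t gamma_t),
    which falls back to the prior mean with little data and
    approaches the ML estimate as adaptation data accumulates.
    """
    gamma_sum = posteriors.sum()
    weighted_sum = posteriors @ frames  # sum_t gamma_t x_t, shape (D,)
    return (tau * prior_mean + weighted_sum) / (tau + gamma_sum)
```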

    Testing word embeddings for Polish

    Distributional semantics postulates representing word meaning in the form of numeric vectors that capture, directly or indirectly, the contexts in which a word is used in large text data. This paper addresses the problem of constructing such models for the Polish language. It compares the effectiveness of models based on lemmas and on word forms, created with the Continuous Bag-of-Words (CBOW) and skip-gram approaches and trained on different Polish corpora. For the purposes of this comparison, the results of two typical tasks solved with the help of distributional semantics, i.e. synonymy and analogy recognition, are compared. The results show that it is not possible to identify one universal approach to vector creation applicable to all tasks. The most important factor is the quality and size of the data, but different training choices can also lead to significantly different results.
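    As an illustration of the analogy-recognition task used in this kind of evaluation, the standard 3CosAdd method answers "a is to b as c is to ?" by nearest-neighbor search around b - a + c. A minimal sketch assuming unit-normalized vectors stored in a plain dict; the data layout and the example words are assumptions, not the paper's setup.

```python
import numpy as np

def solve_analogy(emb, a, b, c, topn=1):
    """Answer "a is to b as c is to ?" with vector arithmetic (3CosAdd).

    emb: dict mapping word -> unit-normalized numpy vector, e.g.
         loaded from a CBOW or skip-gram model (assumed layout).
    Returns the topn candidate words ranked by cosine similarity
    to b - a + c, excluding the query words themselves.
    """
    target = emb[b] - emb[a] + emb[c]
    target = target / np.linalg.norm(target)
    scores = {
        w: float(v @ target)
        for w, v in emb.items()
        if w not in (a, b, c)
    }
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# Usage (with a loaded Polish model, lemma- or form-based; words illustrative):
# solve_analogy(emb, "krol", "krolowa", "mezczyzna")
```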

    Adaptation and Augmentation: Towards Better Rescoring Strategies for Automatic Speech Recognition and Spoken Term Detection

    Selecting the best prediction from a set of candidates is an essential problem for many spoken language processing tasks, including automatic speech recognition (ASR) and spoken keyword spotting (KWS). Generally, the selection is determined by a confidence score assigned to each candidate. Calibrating these confidence scores (i.e., rescoring them) can lead to better selections and improve system performance. This dissertation focuses on using tailored language models to rescore ASR hypotheses as well as keyword search results for ASR-based KWS. It introduces three kinds of rescoring techniques: (1) Freezing most model parameters while fine-tuning the output layer in order to adapt neural network language models (NNLMs) from the written domain to the spoken domain. Experiments on a large-scale Italian corpus show a 30.2% relative reduction in perplexity at the word-cluster level and a 2.3% relative reduction in WER in a state-of-the-art Italian ASR system. (2) Incorporating source application information associated with speech queries. By exploring a range of adaptation model architectures, we achieve a 21.3% relative reduction in perplexity compared to a fine-tuned baseline. Initial experiments using a state-of-the-art Italian ASR system show a 3.0% relative reduction in WER on top of an unadapted 5-gram LM. In addition, human evaluations show significant improvements from using the source application information. (3) Marrying machine learning algorithms (classification and ranking) with a variety of signals to rescore keyword search results in the context of KWS for low-resource languages. These systems, built for the IARPA BABEL Program, enhance search performance in terms of maximum term-weighted value (MTWV) across six low-resource languages: Vietnamese, Tagalog, Pashto, Turkish, Zulu, and Tamil.
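    Technique (1), freezing most parameters while fine-tuning only the output layer, is straightforward to express in a framework such as PyTorch. A minimal sketch, where the submodule name `decoder` is an assumption rather than the dissertation's actual model layout.

```python
import torch.nn as nn

def freeze_for_domain_adaptation(model: nn.Module, output_layer: str = "decoder"):
    """Freeze every parameter except the output projection.

    A sketch of written-to-spoken domain adaptation by fine-tuning
    only the output layer of an NNLM. `output_layer` is the assumed
    name of the final projection submodule in the model.
    """
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(output_layer)
    # Hand only the unfrozen parameters to the optimizer.
    return [p for p in model.parameters() if p.requires_grad]

# Usage:
# trainable = freeze_for_domain_adaptation(nnlm, output_layer="decoder")
# optimizer = torch.optim.Adam(trainable, lr=1e-4)
```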

    Log analysis aided by latent semantic mapping

    In an age of zero-day exploits and increased online attacks on computing infrastructure, operational security practitioners are becoming increasingly aware of the value of the information captured in log events. Analysis of these events is critical during incident response and in forensic investigations related to network breaches, hacking attacks, and data leaks. Such analysis has led to the discipline of Security Event Analysis, also known as Log Analysis. There are several challenges when dealing with events, foremost being the sheer volume at which events are generated and stored. Furthermore, events are often captured as unstructured data, with very little consistency in their formats or contents. In this environment, security analysts and implementers of Log Management (LM) or Security Information and Event Management (SIEM) systems face the daunting task of identifying, classifying, and disambiguating massive volumes of events before security analysis and automation can proceed. Latent Semantic Mapping (LSM) is a proven paradigm shown to be effective for, among other things, word clustering, document clustering, topic clustering, and semantic inference. This research is an investigation into the practical application of LSM in the discipline of Security Event Analysis, showing the value of using LSM to assist practitioners in identifying types of events, classifying events as belonging to certain sources or technologies, and disambiguating events from one another. The culmination of this research presents adaptations to traditional natural language processing techniques that improve the efficacy of LSM for Security Event Analysis. This research provides strong evidence supporting the wider adoption of LSM, as well as further investigation into Security Event Analysis assisted by LSM and other natural language processing or machine learning techniques.
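    The general pipeline LSM builds on is closely related to latent semantic analysis: weight a term-by-event matrix, reduce its rank with an SVD, and work in the resulting latent space. A minimal sketch using scikit-learn; the parameters and the clustering step are illustrative, not the thesis's specific adaptations.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

def cluster_log_events(events, n_topics=50, n_clusters=10):
    """Cluster raw log lines in a latent semantic space.

    events: list of log-line strings. Returns a cluster label per
    line, grouping events that use similar vocabulary even when
    their exact formats differ.
    """
    # Permissive tokenization: log lines mix words, paths, and IPs.
    tfidf = TfidfVectorizer(token_pattern=r"\S+", lowercase=True)
    X = tfidf.fit_transform(events)
    # Rank reduction via truncated SVD, analogous to LSM's SVD step.
    latent = TruncatedSVD(n_components=n_topics).fit_transform(X)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(latent)
```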