
Homogenous Ensemble Phonotactic Language Recognition Based on SVM Supervector Reconstruction

Author: Johnson Michael T
Liu Jia
Liu Wei-Wei
Zhang Wei-Qiang
Publication venue: e-Publications@Marquette
Publication date: 01/01/2014

Acoustic spoken language recognition (SLR) and phonotactic SLR systems are currently widely used for language recognition. To achieve better performance, researchers combine multiple subsystems, and the results are often much better than those of a single SLR system. Phonotactic SLR subsystems may vary in their acoustic feature vectors or include multiple language-specific phone recognizers and different acoustic models. These methods achieve good performance but usually at high computational cost. In this paper, a new form of diversification for phonotactic language recognition systems is proposed using vector space models built by support vector machine (SVM) supervector reconstruction (SSR). In this architecture, the subsystems share the same feature extraction, decoding, and N-gram counting preprocessing steps, but model the data in different vector spaces by using the SSR algorithm, without significant additional computation. We term this a homogeneous ensemble phonotactic language recognition (HEPLR) system. The system integrates three different SVM supervector reconstruction algorithms: relative SVM supervector reconstruction, functional SVM supervector reconstruction, and perturbing SVM supervector reconstruction. All of the algorithms are incorporated using a linear discriminant analysis-maximum mutual information (LDA-MMI) backend to improve language recognition evaluation (LRE) accuracy. Evaluated on the National Institute of Standards and Technology (NIST) LRE 2009 task, the proposed HEPLR system achieves better performance than a baseline phone recognition-vector space modeling (PR-VSM) system with minimal extra computational cost. The HEPLR system yields equal error rates (EER) of 1.39%, 3.63%, and 14.79%, representing relative improvements of 6.06%, 10.15%, and 10.53% over the baseline system, for the 30-, 10-, and 3-s test conditions, respectively.
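
The abstract reports only the HEPLR equal error rates and their relative improvements, not the baseline values. As a rough sanity check, the baseline EERs can be backed out from those two figures. The sketch below assumes that "relative improvement" means (baseline - system) / baseline, which the abstract does not state explicitly; the function name `implied_baseline_eer` is hypothetical and used only for illustration.

```python
def implied_baseline_eer(system_eer, relative_improvement):
    """Back out the baseline EER (in percent).

    Assumes relative_improvement = (baseline - system) / baseline,
    so baseline = system / (1 - relative_improvement).
    """
    return system_eer / (1.0 - relative_improvement)


# (HEPLR EER in percent, relative improvement as a fraction), from the abstract.
reported = {
    "30 s": (1.39, 0.0606),
    "10 s": (3.63, 0.1015),
    "3 s": (14.79, 0.1053),
}

for condition, (eer, rel) in reported.items():
    baseline = implied_baseline_eer(eer, rel)
    print(f"{condition}: HEPLR {eer:.2f}% EER, implied baseline {baseline:.2f}% EER")
```

Under that assumption, the implied baselines are estimates derived from rounded figures, not values reported by the authors.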

epublications@Marquette

Springer - Publisher Connector

Language Modeling with Power Low Rank Ensembles

Author: Dyer Chris
Parikh Ankur P.
Saluja Avneesh
Xing Eric P.
Publication date: 01/01/2014

We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling in which ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context. Our method can be understood as a generalization of n-gram modeling to non-integer n, and includes standard techniques such as absolute discounting and Kneser-Ney smoothing as special cases. PLRE training is efficient, and our approach outperforms state-of-the-art modified Kneser-Ney baselines in terms of perplexity on large corpora as well as BLEU score in a downstream machine translation task.
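
For readers unfamiliar with the smoothing techniques the abstract names as special cases, here is a minimal sketch of interpolated absolute discounting for a bigram model. It is not the PLRE method itself; the function name `train_absolute_discounting`, the discount value 0.75, and the toy corpus are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_absolute_discounting(tokens, discount=0.75):
    """Return prob(word, prev): an interpolated absolute-discounting bigram model."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    context_totals = Counter(tokens[:-1])  # how often each word occurs as a context
    followers = defaultdict(set)           # distinct words seen after each context
    for prev, word in bigrams:
        followers[prev].add(word)
    total = len(tokens)

    def prob(word, prev):
        p_unigram = unigrams[word] / total
        context_count = context_totals[prev]
        if context_count == 0:
            # Unseen context: fall back to the unigram estimate.
            return p_unigram
        # Subtract a fixed discount from every observed bigram count...
        discounted = max(bigrams[(prev, word)] - discount, 0.0) / context_count
        # ...and hand the freed probability mass to the lower-order model.
        backoff_weight = discount * len(followers[prev]) / context_count
        return discounted + backoff_weight * p_unigram

    return prob


corpus = "the cat sat on the mat the cat ran".split()
prob = train_absolute_discounting(corpus)
vocabulary = set(corpus)
print(sum(prob(w, "the") for w in vocabulary))  # should be very close to 1
```

Kneser-Ney smoothing keeps the same discounting step but replaces the unigram term with a continuation probability based on how many distinct contexts a word follows.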

arXiv.org e-Print Archive

CiteSeerX

Crossref

Combining Residual Networks with LSTMs for Lipreading

Author: Stafylakis Themos
Tzimiropoulos Georgios
Publication date: 22/05/2017

We propose an end-to-end deep learning architecture for word-level visual speech recognition. The system combines spatiotemporal convolutional, residual, and bidirectional Long Short-Term Memory networks. We train and evaluate it on the Lipreading In-The-Wild benchmark, a challenging database of 500 target words consisting of 1.28-second video excerpts from BBC TV broadcasts. The proposed network attains a word accuracy of 83.0%, a 6.8% absolute improvement over the current state of the art, without using information about word boundaries during training or testing. Comment: Submitted to Interspeech 201

arXiv.org e-Print Archive

Nottingham eTheses

Crossref

Exploring Russian Cyberspace: Digitally-Mediated Collective Action and the Networked Public Sphere

Author: Bruce Etling
Hal Roberts
John G. Palfrey Jr.
John Kelly
Karina Alexanyan
Robert Faris
Urs Gasser
Vladimir Barash
Publication venue: Berkman Center for Internet & Society at Harvard Law School
Publication date: 03/03/2012

This paper summarizes the major findings of a three-year research project investigating the Internet's impact on Russian politics, media, and society. We employed multiple methods to study online activity: mapping and study of the structure, communities, and content of the blogosphere; an analogous mapping and study of Twitter; content analysis of different media sources using automated and human-based evaluation approaches; and a survey of bloggers. These were augmented by infrastructure mapping, interviews, and background research. We find the emergence of a vibrant and diverse networked public sphere that constitutes an independent alternative to the more tightly controlled offline media and political space, as well as the growing use of digital platforms in social mobilization and civic action. Despite various indirect efforts to shape cyberspace into an environment that is friendlier towards the government, we find that the Russian Internet remains generally open and free, although the current degree of Internet freedom is in no way a prediction of the future of this contested space.

IssueLab