192,617 research outputs found

    Transfer learning of language-independent end-to-end ASR with language model fusion

    Full text link
    This work explores better adaptation methods to low-resource languages using an external language model (LM) under the framework of transfer learning. We first build a language-independent ASR system in a unified sequence-to-sequence (S2S) architecture with a shared vocabulary among all languages. During adaptation, we perform LM fusion transfer, where an external LM is integrated into the decoder network of the attention-based S2S model in the whole adaptation stage, to effectively incorporate linguistic context of the target language. We also investigate various seed models for transfer learning. Experimental evaluations using the IARPA BABEL data set show that LM fusion transfer improves performances on all target five languages compared with simple transfer learning when the external text data is available. Our final system drastically reduces the performance gap from the hybrid systems.Comment: Accepted at ICASSP201

    Emulating and evaluating hybrid memory for managed languages on NUMA hardware

    Get PDF
    Non-volatile memory (NVM) has the potential to become a mainstream memory technology and challenge DRAM. Researchers evaluating the speed, endurance, and abstractions of hybrid memories with DRAM and NVM typically use simulation, making it easy to evaluate the impact of different hardware technologies and parameters. Simulation is, however, extremely slow, limiting the applications and datasets in the evaluation. Simulation also precludes critical workloads, especially those written in managed languages such as Java and C#. Good methodology embraces a variety of techniques for evaluating new ideas, expanding the experimental scope, and uncovering new insights. This paper introduces a platform to emulate hybrid memory for managed languages using commodity NUMA servers. Emulation complements simulation but offers richer software experimentation. We use a thread-local socket to emulate DRAM and a remote socket to emulate NVM. We use standard C library routines to allocate heap memory on the DRAM and NVM sockets for use with explicit memory management or garbage collection. We evaluate the emulator using various configurations of write-rationing garbage collectors that improve NVM lifetimes by limiting writes to NVM, using 15 applications and various datasets and workload configurations. We show emulation and simulation confirm each other's trends in terms of writes to NVM for different software configurations, increasing our confidence in predicting future system effects. Emulation brings novel insights, such as the non-linear effects of multi-programmed workloads on NVM writes, and that Java applications write significantly more than their C++ equivalents. We make our software infrastructure publicly available to advance the evaluation of novel memory management schemes on hybrid memories

    Interrupt Timed Automata: verification and expressiveness

    Get PDF
    We introduce the class of Interrupt Timed Automata (ITA), a subclass of hybrid automata well suited to the description of timed multi-task systems with interruptions in a single processor environment. While the reachability problem is undecidable for hybrid automata we show that it is decidable for ITA. More precisely we prove that the untimed language of an ITA is regular, by building a finite automaton as a generalized class graph. We then establish that the reachability problem for ITA is in NEXPTIME and in PTIME when the number of clocks is fixed. To prove the first result, we define a subclass ITA- of ITA, and show that (1) any ITA can be reduced to a language-equivalent automaton in ITA- and (2) the reachability problem in this subclass is in NEXPTIME (without any class graph). In the next step, we investigate the verification of real time properties over ITA. We prove that model checking SCL, a fragment of a timed linear time logic, is undecidable. On the other hand, we give model checking procedures for two fragments of timed branching time logic. We also compare the expressive power of classical timed automata and ITA and prove that the corresponding families of accepted languages are incomparable. The result also holds for languages accepted by controlled real-time automata (CRTA), that extend timed automata. We finally combine ITA with CRTA, in a model which encompasses both classes and show that the reachability problem is still decidable. Additionally we show that the languages of ITA are neither closed under complementation nor under intersection

    Hybrid lemmatization in HuSpaCy

    Get PDF
    Lemmatization is still not a trivial task for morphologically rich languages. Previous studies showed that hybrid architectures usually work better for these languages and can yield great results. This paper presents a hybrid lemmatizer utilizing both a neural model, dictionaries and hand-crafted rules. We introduce a hybrid architecture along with empirical results on a widely used Hungarian dataset. The presented methods are published as three HuSpaCy models
    • …
    corecore