
5,340 research outputs found

N-gram language models for massively parallel devices

Author: Bogoychev Nikolay
Lopez Adam
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 12/08/2016

An Improved GPU Simulator For Spiking Neural P Systems

Author: Adorna Henry N.
Cabarle Francis George C.
Martínez del Amor Miguel Ángel
Publication venue: IEEE Computer Society
Publication date: 01/01/2011

Spiking Neural P (SNP) systems, variants of P systems (under Membrane and Natural computing), are computing models that acquire abstraction and inspiration from the way neurons 'compute' or process information. Similar to other P system variants, SNP systems are Turing complete models that by nature compute non-deterministically and in a maximally parallel manner. P systems usually trade (often exponential) space for (polynomial to constant) time. Due to this nature, P system variants are currently limited to parallel simulations, and several variants have already been simulated in parallel devices. In this paper we present an improved SNP system simulator based on graphics processing units (GPUs). Among other reasons, current GPUs are architected for massively parallel computations, thus making GPUs very suitable for SNP system simulation. The computing model, hardware/software considerations, and simulation algorithm are presented, as well as comparisons of the CPU-only and CPU-GPU based simulators. Funding: Ministerio de Ciencia e Innovación TIN2009–13192; Junta de Andalucía P08-TIC-0420.

idUS. Depósito de Investigación Universidad de Sevilla

Marker-based filtering of bilingual phrase pairs for SMT

Author: Sánchez-Martínez Felipe
Way Andy
Publication venue: European Association for Machine Translation
Publication date: 01/01/2009

State-of-the-art statistical machine translation systems make use of a large translation table obtained after scoring a set of bilingual phrase pairs automatically extracted from a parallel corpus. The number of bilingual phrase pairs extracted from a pair of aligned sentences grows exponentially as the length of the sentences increases; therefore, the number of entries in the phrase table used to carry out the translation may become unmanageable, especially when online, 'on demand' translation is required in real time. We describe the use of closed-class words to filter the set of bilingual phrase pairs extracted from the parallel corpus by taking into account the alignment information and the type of the words involved in the alignments. On four European language pairs, we show that our simple yet novel approach can filter the phrase table by up to a third yet still provide competitive results compared to the baseline. Furthermore, it provides a good balance between the unfiltered approach and pruning using stop words, where the deterioration in translation quality is unacceptably high.

Repositorio Institucional de la Universidad de Alicante

CiteSeerX

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

DCU Online Research Access Service

High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

Author: Chai J.
Ren J.
Su H. Y.
Wen M.
Wu N.
Zhang C. Y.
Publication venue: Společnost pro radioelektronické inženýrství
Publication date: 01/04/2012

This article presents two highly efficient parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory accessing dependence, and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times is obtained on STORM and of over 50 times on the GPU. The STORM implementation achieves real-time processing for 1080p at 30 fps, and the GPU-based version satisfies the requirements for 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms.

Directory of Open Access Journals

Digital library of Brno University of Technology

Decentralized Large-Scale Natural Language Processing Using Gossip Learning

Author: Alkathiri Abdul Aziz
Publication date: 23/06/2020

Pure OAI Repository

A Continuously Growing Dataset of Sentential Paraphrases

Author: He Hua
Lan Wuwei
Qiu Siyu
Xu Wei
Publication date: 01/01/2017

A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we present a new method to collect large-scale sentential paraphrases from Twitter by linking tweets through shared URLs. The main advantage of our method is its simplicity: it removes the classifier or human in the loop that previous work needed to select data before annotation and the subsequent application of paraphrase identification algorithms. We present the largest human-labeled paraphrase corpus to date, with 51,524 sentence pairs, and the first cross-domain benchmarking for automatic paraphrase identification. In addition, we show that more than 30,000 new sentential paraphrases can be easily and continuously captured every month at ~70% precision, and demonstrate their utility for downstream NLP tasks through phrasal paraphrase extraction. We make our code and data freely available. Comment: 11 pages, accepted to EMNLP 201

arXiv.org e-Print Archive

Crossref

Analysis of P systems simulation on CUDA

Author: Cecilia José M.
García Carrasco José M.
Guerrero Ginés D.
Martínez del Amor Miguel Ángel
Pérez Hurtado de Mendoza Ignacio
Pérez Jiménez Mario de Jesús
Publication venue: SARTECO: Sociedad de Arquitectura y Tecnología de Computadores
Publication date: 01/01/2009

GPUs (Graphics Processing Units) have been consolidated as massively data-parallel coprocessors for many general-purpose computations, and enable developers to utilize several levels of parallelism to obtain better performance of their applications. The massively parallel nature of certain computations leads to the use of GPUs as an underlying architecture, becoming a good alternative to other parallel approaches. P systems or membrane systems are theoretical devices inspired by the way that living cells work, providing computational models and a high-level computational modeling framework for biological systems. They are massively parallel, distributed, and non-deterministic systems. In this paper, we evaluate the GPU as the underlying architecture to simulate the class of recognizer P systems with active membranes. We analyze the performance of three simulators implemented on CPU, CPU-GPU and GPU respectively. We compare them using a presented P system as a benchmark, showing that the GPU is better suited than the CPU to simulate those P systems due to its massively parallel nature. Funding: Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC0420.

idUS. Depósito de Investigación Universidad de Sevilla

Implementing P Systems Parallelism by Means of GPUs

Author: Cecilia José M.
García José M.
Guerrero Ginés D.
Martínez del Amor Miguel Ángel
Pérez Hurtado de Mendoza Ignacio
Pérez Jiménez Mario de Jesús
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2010

Software development for Membrane Computing is growing, yielding new applications. Nowadays, the efficiency of P system simulators has become a critical point when working with large instances. The newest generation of GPUs (Graphics Processing Units) provides a massively parallel framework for general-purpose computations. We present GPUs as an alternative to obtain better performance in the simulation of P systems, and we illustrate it by giving a solution to the N-Queens problem as an example. Funding: Ministerio de Educación y Ciencia TIN2006-13425; Junta de Andalucía P08–TIC-0420.

idUS. Depósito de Investigación Universidad de Sevilla

Relay: A New IR for Machine Learning Frameworks

Author: Abadi Martin
Chen Tianqi
Krizhevsky Alex
Rotem Nadav
Shankar Asim
Vasilache Nicolas
Wei Richard
Wiltschko Alex
Publication venue: Association for Computing Machinery (ACM)
Publication date: 25/09/2018

Machine learning powers diverse services in industry including search, translation, recommendation systems, and security. The scale and importance of these models require that they be efficient, expressive, and portable across an array of heterogeneous hardware devices. These constraints are often at odds; to better accommodate them, we propose a new high-level intermediate representation (IR) called Relay. Relay is being designed as a purely functional, statically typed language with the goal of balancing efficient compilation, expressiveness, and portability. We discuss the goals of Relay and highlight its important design constraints. Our prototype is part of the open source NNVM compiler framework, which powers Amazon's deep learning framework MXNet.

arXiv.org e-Print Archive

Crossref