456 research outputs found

    Better synchronous binarization for machine translation

    Binarization of Synchronous Context-Free Grammars (SCFG) is essential for achieving polynomial time complexity of decoding for SCFG-parsing-based machine translation systems. In this paper, we first investigate the excess edge competition issue caused by a left-heavy binary SCFG derived with the method of Zhang et al. (2006). Then we propose a new binarization method to mitigate the problem by exploring other alternative equivalent binary SCFGs. We present an algorithm that iteratively improves the resulting binary SCFG, and empirically show that our method can improve a string-to-tree statistical machine translation system based on the synchronous binarization method of Zhang et al. (2006) on the NIST machine translation evaluation tasks.
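
    As a minimal illustration of the contiguity check that synchronous binarization rests on (a sketch only, not the algorithm of Zhang et al. (2006) or the paper's iterative method; the function name is hypothetical), the snippet below finds which adjacent nonterminal pairs on the source side of a synchronous rule can be merged into a virtual nonterminal, i.e. which pairs also occupy contiguous positions on the target side. Different choices among these pairs lead to the alternative equivalent binary SCFGs that the paper explores.

        # Sketch: one step of synchronous binarization viewed as a contiguity check.
        def binarizable_pairs(source_order, target_order):
            """source_order / target_order: the rule's nonterminals listed in
            source-side and target-side order, respectively."""
            pos_on_target = {nt: i for i, nt in enumerate(target_order)}
            pairs = []
            for i in range(len(source_order) - 1):
                a, b = source_order[i], source_order[i + 1]
                # adjacent on the source side; also contiguous on the target side?
                if abs(pos_on_target[a] - pos_on_target[b]) == 1:
                    pairs.append((a, b))
            return pairs

        # Example: source side X1 X2 X3, target side X2 X3 X1.
        print(binarizable_pairs(["X1", "X2", "X3"], ["X2", "X3", "X1"]))
        # -> [('X2', 'X3')]  (only X2 X3 can be merged in a single step)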

    Motion estimation and CABAC VLSI co-processors for real-time high-quality H.264/AVC video coding

    Real-time, high-quality video coding is gaining wide interest in the research and industrial communities for different applications. H.264/AVC, a recent standard for high-performance video coding, can be successfully exploited in several scenarios, including digital video broadcasting, high-definition TV and DVD-based systems, which require sustaining bitrates of up to tens of Mbit/s. To that end, this paper proposes optimized architectures for the most critical tasks of H.264/AVC: motion estimation and context-adaptive binary arithmetic coding. Post-synthesis results on sub-micron CMOS standard-cell technologies show that the proposed architectures can process 720 × 480 video sequences at 30 frames/s in real time and sustain more than 50 Mbit/s. The achieved circuit complexity and power consumption budgets make the architectures suitable for integration in complex VLSI multimedia systems based either on an AHB bus-centric on-chip communication system or on novel Network-on-Chip (NoC) infrastructures for Multi-Processor Systems-on-Chip (MPSoC).
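
    The paper's hardware architectures are not reproduced here, but as a point of reference for the motion-estimation task being accelerated, the following software sketch shows a plain full-search block-matching kernel (function name and parameters are illustrative): for one block of the current frame, it finds the displacement that minimizes the sum of absolute differences (SAD) over a search window in the reference frame.

        # Sketch: full-search block matching with a SAD cost.
        def full_search(cur, ref, bx, by, block=16, search=8):
            """cur, ref: 2-D lists of luma samples; (bx, by): top-left corner of the block."""
            h, w = len(ref), len(ref[0])
            best_mv, best_sad = None, float("inf")
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = by + dy, bx + dx
                    if ry < 0 or rx < 0 or ry + block > h or rx + block > w:
                        continue  # candidate falls outside the reference frame
                    sad = sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
                              for j in range(block) for i in range(block))
                    if sad < best_sad:
                        best_mv, best_sad = (dx, dy), sad
            return best_mv, best_sad

    In hardware, it is the many candidate positions and per-pixel absolute differences of such a kernel that are unrolled into parallel datapaths.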

    XNOR Neural Engine: a Hardware Accelerator IP for 21.6 fJ/op Binary Neural Network Inference

    Binary Neural Networks (BNNs) are promising to deliver accuracy comparable to conventional deep neural networks at a fraction of the cost in terms of memory and energy. In this paper, we introduce the XNOR Neural Engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM / standard cell memory. The XNE is able to fully compute convolutional and dense layers in autonomy or in cooperation with the core in the MCU to realize more complex behaviors. We show post-synthesis results in 65 nm and 22 nm technology for the XNE IP and post-layout results in 22 nm for the full MCU, indicating that this system can drop the energy cost per binary operation to 21.6 fJ per operation at 0.4 V, while remaining flexible and performant enough to execute state-of-the-art BNN topologies such as ResNet-34 in less than 2.2 mJ per frame at 8.9 fps. Comment: 11 pages, 8 figures, 2 tables, 3 listings. Accepted for presentation at CODES'18 and for publication in IEEE Transactions on Computer-Aided Design of Circuits and Systems (TCAD) as part of the ESWEEK-TCAD special issue.
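
    As a back-of-the-envelope illustration of why an XNOR engine is so cheap (a software sketch only, not the XNE datapath; the helper name is hypothetical): once weights and activations are constrained to {-1, +1} and packed as bits, a dot product reduces to an XNOR followed by a popcount.

        # Sketch: a binary dot product computed with XNOR + popcount.
        def binary_dot(a_bits, w_bits, n):
            """a_bits, w_bits: n-bit integers where bit = 1 encodes +1 and bit = 0 encodes -1."""
            xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)  # 1 wherever the two signs agree
            matches = bin(xnor).count("1")              # popcount
            return 2 * matches - n                      # equals the +/-1 dot product

        # Example: a = [+1, -1, +1, +1], w = [+1, +1, -1, +1]  ->  dot product 0
        print(binary_dot(0b1011, 0b1101, 4))  # -> 0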

    A Parallel Implementation of the Thresholding Problem by Using Tissue-Like P Systems

    In this paper we present a parallel algorithm to solve the thresholding problem by using Membrane Computing techniques. This bio-inspired algorithm has been implemented on a novel device architecture called CUDAℱ (Compute Unified Device Architecture). We present some examples, compare the execution times obtained, and present some research lines for the future. Ministerio de Ciencia e Innovación TIN2008-04487-E; Ministerio de Ciencia e Innovación TIN-2009-13192; Junta de Andalucía P08-TIC-04200; Ministerio de Educación y Ciencia MTM2009-12716; Junta de Andalucía PO6-TIC-02268; Universidad del País Vasco EHU09/0
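
    For reference, the thresholding problem itself amounts to the following per-pixel test (a plain sequential sketch, not the tissue-like P system or its CUDA implementation); because every pixel is handled independently, the problem maps naturally onto massively parallel devices.

        # Sketch: binarize a grey-level image against a threshold t.
        def threshold(image, t):
            """image: 2-D list of grey levels; returns the binarized image."""
            return [[1 if pixel >= t else 0 for pixel in row] for row in image]

        print(threshold([[12, 200], [90, 30]], t=100))  # -> [[0, 1], [0, 0]]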

    Smart CMOS image sensor for 3D measurement

    3D measurements are concerned with extracting visual information from the geometry of visible surfaces and interpreting the 3D coordinate data thus obtained, to detect or track the position or reconstruct the profile of an object, often in real time. These systems necessitate image sensors with high accuracy of position estimation and a high frame rate of data processing for handling large volumes of data. A standard imager cannot address the requirements of fast image acquisition and processing, which are the two figures of merit for 3D measurements. Hence, dedicated VLSI imager architectures are indispensable for designing these high-performance sensors. CMOS imaging technology provides the potential to integrate image-processing algorithms on the focal plane of the device, resulting in smart image sensors capable of achieving better processing features in handling massive image data. The objective of this thesis is to present a new architecture of smart CMOS image sensor for real-time 3D measurement using sheet-beam projection methods based on active triangulation. By organizing the vision sensor as an ensemble of linear sensor arrays, all working in parallel and processing the entire image in slices, the complexity of the image-processing task shifts from O(N²) to O(N). Inherent in the design is also the high level of parallelism needed to achieve massive parallel processing at the high frame rates required in 3D computation problems. This work demonstrates a prototype of the smart linear sensor incorporating full testability features to test and debug at both device and system levels. The salient features of this work are the asynchronous position-to-pulse-stream conversion, multiple-image binarization, high parallelism and a modular architecture, resulting in a frame rate and sub-pixel resolution suitable for real-time 3D measurements.
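
    As an illustration of the per-row computation behind sheet-beam active triangulation (a software sketch under the assumption of a single bright stripe per row; not the sensor's circuitry, and the function name is hypothetical): each row is processed independently by its own linear array, the pixels above a binarization threshold are selected, and the stripe position is estimated with sub-pixel resolution as their intensity-weighted centroid.

        # Sketch: sub-pixel stripe localization in one image row.
        def stripe_position(row, threshold):
            """row: one line of pixel intensities; returns the sub-pixel stripe column."""
            hits = [(i, v) for i, v in enumerate(row) if v >= threshold]
            if not hits:
                return None                              # no stripe detected in this row
            total = sum(v for _, v in hits)
            return sum(i * v for i, v in hits) / total   # intensity-weighted centroid

        print(stripe_position([5, 8, 40, 90, 35, 6], threshold=30))  # -> ~2.97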

    Algebraic decoder specification: coupling formal-language theory and statistical machine translation

    The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework in which a decoder is specified by an expression built from a fixed set of operations. As it stands, this framework accommodates contemporary syntax-based decoders, spans two levels of abstraction and, above all, encourages mutual stimulation between the theory of weighted tree automata and the application.
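
    The following is only a loose software analogy, not the thesis' formalism: it illustrates the idea of specifying a decoder as an expression over a fixed set of operations, each mapping a weighted set of hypotheses to another, so that a concrete decoder is a particular composition of these operations (all names are hypothetical).

        # Sketch: a decoder "specification" as a composition of weighted operations.
        from functools import reduce

        def apply_model(hyps, model):   # rescore a weighted hypothesis set with a model
            return {h: w * model.get(h, 0.0) for h, w in hyps.items()}

        def prune(hyps, k):             # keep only the k best-scoring hypotheses
            return dict(sorted(hyps.items(), key=lambda x: -x[1])[:k])

        def best(hyps):                 # read off the highest-scoring hypothesis
            return max(hyps, key=hyps.get)

        def decoder(*ops):              # an "expression" is a pipeline of operations
            return lambda hyps: reduce(lambda h, op: op(h), ops, hyps)

        spec = decoder(lambda h: apply_model(h, {"a": 0.9, "b": 0.4}),
                       lambda h: prune(h, 2),
                       best)
        print(spec({"a": 1.0, "b": 1.0, "c": 1.0}))  # -> 'a'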

    FPGA Implementation of Blob Recognition

    Real-time embedded vision systems can be used in a wide range of applications, and demand for them has therefore been increasing. In this thesis, an FPGA-based embedded vision system capable of recognizing objects in real time is presented. The proposed system architecture consists of multiple Intellectual Properties (IPs), which are used as a set of complex instructions by an integrated 32-bit MicroBlaze CPU. Each IP is tailored specifically to meet the needs of the application and, at the same time, to consume the minimum of FPGA logic resources. Integrating both hardware and software on a single FPGA chip, this system achieves the real-time performance of full VGA video processing at 32 frames per second (fps). In addition, this work introduces a new method, called Dual Connected Component Labelling (DCCL), suitable for FPGA implementation.
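
    For background, the classical two-pass connected-component labelling that blob recognition builds on is sketched below (the thesis' Dual Connected Component Labelling method is not reproduced here): the first pass assigns provisional labels and records label equivalences, and the second pass resolves them with a union-find structure.

        # Sketch: classical two-pass connected-component labelling, 4-connectivity.
        def label(binary):
            h, w = len(binary), len(binary[0])
            labels = [[0] * w for _ in range(h)]
            parent = [0]                     # union-find; parent[0] unused
            def find(x):
                while parent[x] != x:
                    x = parent[x]
                return x
            next_label = 1
            for y in range(h):               # first pass: provisional labels
                for x in range(w):
                    if not binary[y][x]:
                        continue
                    up = labels[y - 1][x] if y else 0
                    left = labels[y][x - 1] if x else 0
                    if up == 0 and left == 0:
                        parent.append(next_label)
                        labels[y][x] = next_label
                        next_label += 1
                    else:
                        candidates = [l for l in (up, left) if l]
                        labels[y][x] = min(candidates)
                        if up and left and up != left:          # record equivalence
                            parent[find(max(up, left))] = find(min(up, left))
            for y in range(h):               # second pass: resolve equivalences
                for x in range(w):
                    if labels[y][x]:
                        labels[y][x] = find(labels[y][x])
            return labels

        print(label([[1, 1, 0],
                     [0, 1, 0],
                     [1, 0, 1]]))
        # -> [[1, 1, 0], [0, 1, 0], [2, 0, 3]]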

    Syntactic and semantic features for statistical and neural machine translation

    Machine Translation (MT) for language pairs with long-distance dependencies and word reordering, such as German–English, is prone to producing output that is lexically or syntactically incoherent. Statistical MT (SMT) models used explicit or latent syntax to improve reordering; however, they failed to capture other long-distance dependencies. This thesis explores how explicit sentence-level syntactic information can improve translation for such complex linguistic phenomena. In particular, we work at the level of the syntactic-semantic interface, with representations conveying predicate-argument structures. These are essential to preserving semantics in translation, and SMT systems have long struggled to model them. String-to-tree SMT systems use explicit target syntax to handle long-distance reordering, but make strong independence assumptions which lead to inconsistent lexical choices. To address this, we propose a Selectional Preferences feature which models the semantic affinities between target predicates and their argument fillers using the target dependency relations available in the decoder. We found that our feature is not effective in a string-to-tree system for German→English and that the conditioning context is often wrong because of mistranslated verbs. To improve verb translation, we proposed a Neural Verb Lexicon Model (NVLM) incorporating sentence-level syntactic context from the source, which carries relevant semantic information for verb disambiguation. When used as an extra feature for re-ranking the output of a German→English string-to-tree system, the NVLM improved verb translation precision by up to 2.7% and recall by up to 7.4%. While the NVLM improved some aspects of translation, other syntactic and lexical inconsistencies are not addressed by a linear combination of independent models. In contrast to SMT, neural machine translation (NMT) avoids strong independence assumptions, thus generating more fluent translations and capturing some long-distance dependencies. Still, incorporating additional linguistic information can improve translation quality. We proposed a method for tightly coupling target words and syntax in the NMT decoder. To represent syntax explicitly, we used CCG supertags, which encode subcategorization information, capturing long-distance dependencies and attachments. Our method improved translation quality on several difficult linguistic constructs, including prepositional phrases, which are the most frequent type of predicate argument. These improvements over a strong baseline NMT system were consistent across two language pairs: 0.9 BLEU for German→English and 1.2 BLEU for Romanian→English.
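
    As one concrete way to picture the coupling of target words and syntax (a sketch under the assumption that the coupling takes the form of interleaving a CCG supertag before each target word in the decoder's output sequence; the thesis' exact scheme may differ), the snippet below builds such an interleaved target sequence, from which the decoder would predict tags and words alternately over a single output vocabulary.

        # Sketch: interleave CCG supertags with target words.
        def interleave(words, supertags):
            """Build the target sequence the decoder is trained to produce."""
            assert len(words) == len(supertags)
            seq = []
            for tag, word in zip(supertags, words):
                seq.extend([tag, word])   # syntax first, then the word it licenses
            return seq

        words = ["Peter", "reads", "a", "book"]
        tags  = ["NP", "(S\\NP)/NP", "NP/N", "N"]
        print(interleave(words, tags))
        # -> ['NP', 'Peter', '(S\\NP)/NP', 'reads', 'NP/N', 'a', 'N', 'book']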

    Object tracking using a many-core embedded system

    Object localization and tracking is essential for many practical applications, such as man-computer interaction, security and surveillance, robot competitions, and Industry 4.0. Because of the large amount of data present in an image, and the algorithmic complexity involved, this task can be computationally demanding, mainly for traditional embedded systems, due to their processing and storage limitations. This calls for investigation and experimentation with new approaches, such as emergent heterogeneous embedded systems, which promise higher performance without compromising energy efficiency. This work explores several real-time color-based object tracking techniques, applied to images supplied by an RGB-D sensor attached to different embedded platforms. The main motivation was to explore a heterogeneous Parallella board with a 16-core Epiphany co-processor to reduce image processing time. Another goal was to confront this platform with more conventional embedded systems, namely the popular Raspberry Pi family. In this regard, several processing options were pursued, from low-level implementations specially tailored to the Parallella, to higher-level multi-platform approaches. The results achieved allow us to conclude that the programming effort required to efficiently use the Epiphany co-processor is considerable. Also, for the selected case study, the performance attained was below the one offered by simpler approaches running on quad-core Raspberry Pi boards.
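
    As a platform-independent sketch of the color-based tracking step that such work maps onto the Parallella and Raspberry Pi boards (illustrative only; the function name and color bounds are hypothetical): pixels falling inside a target color range are segmented, the object is tracked as the centroid of that mask, and it is this per-pixel test that gets distributed across cores.

        # Sketch: color segmentation followed by centroid tracking.
        def track(frame, lo, hi):
            """frame: 2-D list of (r, g, b) tuples; lo/hi: inclusive channel bounds."""
            xs, ys = [], []
            for y, row in enumerate(frame):
                for x, (r, g, b) in enumerate(row):
                    if lo[0] <= r <= hi[0] and lo[1] <= g <= hi[1] and lo[2] <= b <= hi[2]:
                        xs.append(x)
                        ys.append(y)
            if not xs:
                return None                                    # object not visible
            return (sum(xs) / len(xs), sum(ys) / len(ys))      # object centroid (x, y)

        frame = [[(250, 10, 10), (0, 0, 0)],
                 [(240, 20, 5), (0, 0, 0)]]
        print(track(frame, lo=(200, 0, 0), hi=(255, 60, 60)))  # -> (0.0, 0.5)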