118 research outputs found

    Syntactic and semantic features for statistical and neural machine translation

    Machine Translation (MT) for language pairs with long distance dependencies and word reordering, such as German–English, is prone to producing output that is lexically or syntactically incoherent. Statistical MT (SMT) models used explicit or latent syntax to improve reordering, but they failed to capture other long distance dependencies. This thesis explores how explicit sentence-level syntactic information can improve translation for such complex linguistic phenomena. In particular, we work at the level of the syntactic-semantic interface with representations conveying the predicate-argument structures. These are essential to preserving semantics in translation, and SMT systems have long struggled to model them. String-to-tree SMT systems use explicit target syntax to handle long-distance reordering, but make strong independence assumptions which lead to inconsistent lexical choices. To address this, we propose a Selectional Preferences feature which models the semantic affinities between target predicates and their argument fillers using the target dependency relations available in the decoder. We found that our feature is not effective in a string-to-tree system for German→English and that often the conditioning context is wrong because of mistranslated verbs. To improve verb translation, we proposed a Neural Verb Lexicon Model (NVLM) incorporating sentence-level syntactic context from the source which carries relevant semantic information for verb disambiguation. When used as an extra feature for re-ranking the output of a German→English string-to-tree system, the NVLM improved verb translation precision by up to 2.7% and recall by up to 7.4%. While the NVLM improved some aspects of translation, other syntactic and lexical inconsistencies are not addressed by a linear combination of independent models.
In contrast to SMT, neural machine translation (NMT) avoids strong independence assumptions, thus generating more fluent translations and capturing some long-distance dependencies. Still, incorporating additional linguistic information can improve translation quality. We proposed a method for tightly coupling target words and syntax in the NMT decoder. To represent syntax explicitly, we used CCG supertags, which encode subcategorization information, capturing long distance dependencies and attachments. Our method improved translation quality on several difficult linguistic constructs, including prepositional phrases, which are the most frequent type of predicate arguments. These improvements over a strong baseline NMT system were consistent across two language pairs: 0.9 BLEU for German→English and 1.2 BLEU for Romanian→English

    Development of Automatic Digitization of Truck Number in Open Cast Mines Using Microcontroller

    Geological conditions in mines are extremely complicated, and there are many security problems. Production is illicitly transferred by unauthorized trucks from mine pits and at loading points. It is also removed improperly through manipulation of the truck weight at the weighbridge. Mining organizations are under the control of mafia, and countless more could be added to the mining mafia. An intelligent security system is needed to monitor truck numbers automatically, using image acquisition, automatic detection, recognition, communication technology, information technology, and microcontroller technology, to understand the working specification of the mining region. Tracking the number plate of a truck is an important task that demands an intelligent solution. Intelligent surveillance in an open cast mine security network using data acquisition is a prime task that protects the secure production of mines. An automatic truck number recognition technique is therefore used to recognize the registration number of the truck that transfers the mine production, and to track and record the amount of production. It also protects the mines, thus improving their security. To extract and recognize the number plate from the truck image, the system uses the MATLAB software tool. It is assumed that images of the truck have been captured with a digital camera. The data acquisition terminal uses the PIC16F877A microcontroller as a core chip for sending data. The data are communicated through a USB to TTL converter (RS232) with the main circuit to realize intelligent monitoring. To store the data permanently, the system uses an EEPROM chip. Alphanumeric characters on the plate are extracted and recognized using template images of alphanumeric characters. The proposed system performs real-time data monitoring to recognize the registration number plates of the trucks and obtain the required information.
It also maintains a history of the data and supports access control

    Algebraic decoder specification: coupling formal-language theory and statistical machine translation

    The specification of a decoder, i.e., a program that translates sentences from one natural language into another, is an intricate process, driven by the application and lacking a canonical methodology. The practical nature of decoder development inhibits the transfer of knowledge between theory and application, which is unfortunate because many contemporary decoders are in fact related to formal-language theory. This thesis proposes an algebraic framework where a decoder is specified by an expression built from a fixed set of operations. As yet, this framework accommodates contemporary syntax-based decoders, it spans two levels of abstraction, and, primarily, it encourages mutual stimulation between the theory of weighted tree automata and the application

    FPGA Implementation of Blob Recognition

    Real-time embedded vision systems can be used in a wide range of applications and therefore the demand has been increasing for them. In this thesis, an FPGA-based embedded vision system capable of recognizing objects in real time is presented. The proposed system architecture consists of multiple Intellectual Properties (IPs), which are used as a set of complex instructions by an integrated 32-bit CPU Microblaze. Each IP is tailored specifically to meet the needs of the application and at the same time to consume the minimum FPGA logic resources. Integrating both hardware and software on a single FPGA chip, this system can achieve the real-time performance of full VGA video processing at 32 frames per second (fps). In addition, this work comes up with a new method called Dual Connected Component Labelling (DCCL) suitable for FPGA implementation

    Cooperative development of logical modelling standards and tools with CoLoMoTo

    The identification of large regulatory and signalling networks involved in the control of crucial cellular processes calls for proper modelling approaches. Indeed, models can help elucidate properties of these networks, understand their behaviour and provide (testable) predictions by performing in silico experiments. In this context, qualitative, logical frameworks have emerged as relevant approaches, as demonstrated by a growing number of published models, along with new methodologies and software tools. This productive activity now requires a concerted effort to ensure model reusability and interoperability between tools. Following an outline of the logical modelling framework, we present the most important achievements of the Consortium for Logical Models and Tools, along with future objectives. Our aim is to advertise this open community, which welcomes contributions from all researchers interested in logical modelling or in related mathematical and computational developments.

    Prevention of Unauthorized Transport of Ore in Opencast Mines Using Automatic Number Plate Recognition

    Security in mining is a primary concern, which mainly affects the production cost. Efficiently detecting and deterring theft will maximize the profitability of any mining organization. Many illegal transportation cases were registered in spite of rules imposed by central and state governments under Section 23 (c) of MMDR Act 1957. Use of an automated checkpoint gate based on license plate recognition and a biometric fingerprint system for vehicle tracking enhances the security in mines. The method was tested under various conditions: clean number plates, clean fingerprints, dusty and faded number plates, dusty fingerprints, and number plates captured at varying distances. Under all of the above conditions, the pictures were processed by the ANPR and biometric fingerprint modules. The vehicle license number plate was captured using a digital camera, and the captured RGB image was converted to a grayscale image. Thresholding was done to remove unwanted areas from the grayscale image. The characters of the number plate were segmented using a Gabor filter. A track-sector matrix was generated by considering the number of pixels in each region and was matched with an existing template to identify the character. The fingerprint module scans the finger and matches it with the template created at the time of fingerprint registration at the machine. The microcontroller accepted the processed output in binary form from the ANPR and biometric fingerprint systems. The microcontroller processed the binary output, and the checkpoint gate was closed or opened based on the output provided by the microcontroller to the motor driver
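The first two preprocessing steps named in the abstract above (RGB to grayscale, then thresholding) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the BT.601 luma weights, the cutoff of 128, and the function names `to_grayscale` and `threshold` are all choices made here, and the abstract does not say which values or tools were used for these steps.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to single luma values.

    Assumption: ITU-R BT.601 weights; the abstract does not specify the formula.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


def threshold(gray_image, cutoff=128):
    """Binarize a grayscale image: 1 where the pixel is >= cutoff, else 0.

    Assumption: a fixed global cutoff; a real system might pick it adaptively.
    """
    return [[1 if px >= cutoff else 0 for px in row] for row in gray_image]


# A tiny hypothetical 2x2 image: white, black, light gray, dark blue-gray.
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (10, 20, 30)],
]
gray = to_grayscale(image)
binary = threshold(gray)
```

Segmentation and template matching would then operate on `binary`; those later stages are not sketched here because the abstract gives too little detail to reproduce them faithfully.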

    Adjunction in hierarchical phrase-based translation

    Leveraging the Intrinsic Switching Behaviors of Spintronic Devices for Digital and Neuromorphic Circuits

    With semiconductor technology scaling approaching atomic limits, novel approaches utilizing new memory and computation elements are sought in order to realize increased density, enhanced functionality, and new computational paradigms. Spintronic devices offer intriguing avenues to improve digital circuits by leveraging non-volatility to reduce static power dissipation and vertical integration for increased density. Novel hybrid spintronic-CMOS digital circuits are developed herein that illustrate enhanced functionality at reduced static power consumption and area cost. The developed spin-CMOS D Flip-Flop offers improved power-gating strategies by achieving instant store/restore capabilities while using 10 fewer transistors than typical CMOS-only implementations. The spin-CMOS Muller C-Element developed herein improves asynchronous pipelines by reducing the area overhead while adding enhanced functionality such as instant data store/restore and delay-element-free bundled data asynchronous pipelines. Spintronic devices also provide improved scaling for neuromorphic circuits by enabling compact and low power neuron and non-volatile synapse implementations while enabling new neuromorphic paradigms leveraging the stochastic behavior of spintronic devices to realize stochastic spiking neurons, which are more akin to biological neurons and commensurate with theories from computational neuroscience and probabilistic learning rules. Spintronic-based Probabilistic Activation Function circuits are utilized herein to provide a compact and low-power neuron for Binarized Neural Networks. Two implementations of stochastic spiking neurons with alternative speed, power, and area benefits are realized. 
Finally, a comprehensive neuromorphic architecture comprising stochastic spiking neurons, low-precision synapses with Probabilistic Hebbian Plasticity, and a novel non-volatile homeostasis mechanism is realized for subthreshold ultra-low-power unsupervised learning with robustness to process variations. Along with several case studies, implications for future spintronic digital and neuromorphic circuits are presented

    Unification-based constraints for statistical machine translation

    Morphology and syntax have both received attention in statistical machine translation research, but they are usually treated independently and the historical emphasis on translation into English has meant that many morphosyntactic issues remain under-researched. Languages with richer morphologies pose additional problems and conventional approaches tend to perform poorly when either source or target language has rich morphology. In both computational and theoretical linguistics, feature structures together with the associated operation of unification have proven a powerful tool for modelling many morphosyntactic aspects of natural language. In this thesis, we propose a framework that extends a state-of-the-art syntax-based model with a feature structure lexicon and unification-based constraints on the target-side of the synchronous grammar. Whilst our framework is language-independent, we focus on problems in the translation of English to German, a language pair that has a high degree of syntactic reordering and rich target-side morphology. We first apply our approach to modelling agreement and case government phenomena. We use the lexicon to link surface form words with grammatical feature values, such as case, gender, and number, and we use constraints to enforce feature value identity for the words in agreement and government relations. We demonstrate improvements in translation quality of up to 0.5 BLEU over a strong baseline model. We then examine verbal complex production, another aspect of translation that requires the coordination of linguistic features over multiple words, often with long-range discontinuities. We develop a feature structure representation of verbal complex types, using constraint failure as an indicator of translation error and use this to automatically identify and quantify errors that occur in our baseline system. 
A manual analysis and classification of errors informs an extended version of the model that incorporates information derived from a parse of the source. We identify clause spans and use model features to encourage the generation of complete verbal complex types. We are able to improve accuracy as measured using precision and recall against values extracted from the reference test sets. Our framework allows for the incorporation of rich linguistic information and we present sketches of further applications that could be explored in future work
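The idea of enforcing agreement through unification, described in the abstract above, can be illustrated with a minimal sketch. It assumes feature structures are plain nested dictionaries and omits reentrancy (structure sharing), so it is a simplification rather than the thesis's framework; the function name `unify` and the example lexical entries are hypothetical.

```python
def unify(fs1, fs2):
    """Unify two feature structures given as (possibly nested) dicts.

    Returns the merged structure, or None if any feature has clashing values.
    Simplification: no reentrancy or typed features.
    """
    result = dict(fs1)
    for feature, value in fs2.items():
        if feature not in result:
            # Feature only present in fs2: copy it over.
            result[feature] = value
        elif isinstance(result[feature], dict) and isinstance(value, dict):
            # Both sides are complex values: unify them recursively.
            sub = unify(result[feature], value)
            if sub is None:
                return None
            result[feature] = sub
        elif result[feature] != value:
            # Atomic values disagree: unification fails.
            return None
    return result


# Hypothetical German lexicon entries (illustrative values only).
determiner = {"agr": {"case": "dat", "number": "sg"}}
noun = {"agr": {"gender": "neut", "number": "sg"}}
plural_noun = {"agr": {"gender": "neut", "number": "pl"}}

agreeing = unify(determiner, noun)         # compatible: features are merged
clashing = unify(determiner, plural_noun)  # number clash: returns None
```

In a decoder, a failed unification such as `clashing` would mark a hypothesis as violating an agreement constraint, which is the kind of signal the abstract describes using both to constrain output and to detect errors.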