An open source rule induction tool for transfer-based SMT

Author: Graham Yvette
van Genabith Josef
Publication venue: Charles University, Prague
Publication date: 01/01/2009

In this paper we describe an open source tool for automatic induction of transfer rules. Transfer rule induction is carried out on pairs of dependency structures and their node alignment to produce all rules consistent with the node alignment. We describe an efficient algorithm for rule induction and give a detailed description of how to use the tool.

Irish Universities

DCU Online Research Access Service

System combination with extra alignment information

Author: Liu Qun
Okita Tsuyoshi
van Genabith Josef
Wu Xiaofeng
Publication date: 09/12/2012

This paper provides the system description of the IHMM team of Dublin City University for our participation in the system combination task in the Second Workshop on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid MT (ML4HMT-12). Our work is based on a confusion network-based approach to system combination. We propose a new method to build a confusion network for this: (1) incorporate extra alignment information extracted from the given metadata, treating it as sure alignments, into the results from IHMM, and (2) decode together with this information. We also heuristically set one of the system outputs as the default backbone. Our results show that this backbone, which is the RBMT system output, achieves a 0.11% improvement in BLEU over the backbone chosen by TER, while the extra information we added in the decoding part does not improve the results.

DCU Online Research Access Service

Integrating Semantic Knowledge to Tackle Zero-shot Text Classification

Author: Guo Yike
Lertvittayakumjorn Piyawat
Zhang Jingqing
Publication date: 01/01/2019

Insufficient or even unavailable training data for emerging classes is a major challenge in many classification tasks, including text classification. Recognising text documents of classes that have never been seen in the learning stage, so-called zero-shot text classification, is therefore difficult, and only a limited number of previous works have tackled this problem. In this paper, we propose a two-phase framework together with data augmentation and feature augmentation to solve this problem. Four kinds of semantic knowledge (word embeddings, class descriptions, class hierarchy, and a general knowledge graph) are incorporated into the proposed framework to deal with instances of unseen classes effectively. Experimental results show that each of the two phases, as well as their combination, achieves the best overall accuracy compared with baselines and recent approaches in classifying real-world texts under the zero-shot scenario. Comment: Accepted NAACL-HLT 201

arXiv.org e-Print Archive

Crossref

Spiral - Imperial College Digital Repository