Statistical Machine Translation Features with Multitask Tensor Networks

Author: Devlin Jacob
Huang Zhongqiang
Lamar Thomas
Makhoul John
Schwartz Richard
Setiawan Hendra
Zbib Rabih
Publication date: 01/01/2015

We present a three-pronged approach to improving Statistical Machine Translation (SMT), building on recent success in the application of neural networks to SMT. First, we propose new features based on neural networks to model various non-local translation phenomena. Second, we augment the architecture of the neural network with tensor layers that capture important higher-order interaction among the network units. Third, we apply multitask learning to estimate the neural network parameters jointly. Each of our proposed methods results in significant improvements that are complementary. The overall improvement is +2.7 and +1.8 BLEU points for Arabic-English and Chinese-English translation over a state-of-the-art system that already includes neural network features.
Comment: 11 pages (9 content + 2 references), 2 figures, accepted to ACL 2015 as a long paper
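To illustrate what a tensor layer with higher-order interactions can look like, here is a minimal sketch of a generic bilinear tensor layer. It is an assumption for illustration, not the formulation used in the paper: the function name `tensor_layer`, the parameter names `T`, `W`, `b`, and the `tanh` activation are all hypothetical choices.

```python
import math

def tensor_layer(x, T, W, b):
    """Generic bilinear tensor layer (illustrative assumption).

    For each output unit k: out[k] = tanh(x^T T[k] x + W[k] . x + b[k]).
    The x^T T[k] x term multiplies pairs of input units, which is the
    second-order interaction a plain linear layer cannot express.
    """
    n = len(x)
    out = []
    for T_k, W_k, b_k in zip(T, W, b):
        # Second-order term: weighted sum over all pairs of input units.
        bilinear = sum(x[i] * T_k[i][j] * x[j]
                       for i in range(n) for j in range(n))
        # Ordinary first-order (linear) term.
        linear = sum(w * xi for w, xi in zip(W_k, x))
        out.append(math.tanh(bilinear + linear + b_k))
    return out
```

With every entry of `T` set to zero, the layer reduces to an ordinary `tanh` layer, so the tensor slices can be read as an additive correction for pairwise interactions.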

arXiv.org e-Print Archive

Assessing spatiotemporal correlations from data for short-term traffic prediction using multi-task learning

Author: Casas Vilaró Jordi
Gavaldà Mestre Ricard
Mena Yedra Rafael
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Traffic flow prediction is a fundamental problem for efficient transportation control and management. However, most current data-driven traffic prediction work found in the literature has focused on predicting traffic from an individual task perspective, and has not fully leveraged the implicit knowledge present in a road network through space and time correlations. Such correlations are now far easier to isolate due to the recent profusion of traffic data sources and, more specifically, their wide geographic spread. In this paper, we take a multi-task learning (MTL) approach whose fundamental aim is to improve generalization performance by leveraging the domain-specific information contained in related tasks that are jointly learned. In addition, another common factor found in the literature is that a historical dataset is used for the calibration and assessment of the proposed approach, without dealing in any explicit or implicit way with the frequent challenges found in real-time prediction. In contrast, we adopt a different approach which treats the problem from the point of view of data streams: the learning procedure is undertaken online, giving greater importance to the most recent data, making data-driven decisions online, and undoing decisions which are no longer optimal. In the experiments presented, we achieve more compact and consistent knowledge in the form of rules automatically extracted from data, while maintaining or even improving, in some cases, the performance over single-task learning (STL).
Peer Reviewed. Postprint (published version)

Classifying Options for Deep Reinforcement Learning

Author: Arulkumaran Kai
Bharath Anil Anthony
Dilokthanakul Nat
Shanahan Murray
Publication date: 23/05/2016

In this paper we combine one method for hierarchical reinforcement learning - the options framework - with deep Q-networks (DQNs) through the use of different "option heads" on the policy network, and a supervisory network for choosing between the different options. We utilise our setup to investigate the effects of architectural constraints in subtasks with positive and negative transfer, across a range of network capacities. We empirically show that our augmented DQN has lower sample complexity when simultaneously learning subtasks with negative transfer, without degrading performance when learning subtasks with positive transfer.
Comment: IJCAI 2016 Workshop on Deep Reinforcement Learning: Frontiers and Challenges
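The "option heads" arrangement can be sketched roughly as follows: a shared trunk feeds several Q-value heads, and a separate supervisory network picks which head to act on. This is a toy illustration under stated assumptions, not the authors' architecture. Single linear layers stand in for deep networks, and the names `act`, `trunk`, `heads` and `supervisor` are hypothetical, as is the choice to let the supervisor read the raw state.

```python
def linear(x, weights, bias):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, a) for a in v]

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

def act(state, trunk, heads, supervisor):
    """Greedy action selection with option heads (illustrative assumption).

    trunk, supervisor and each entry of heads are (weights, bias) pairs.
    Returns (chosen option index, chosen action index).
    """
    # Shared features computed once and reused by every option head.
    features = relu(linear(state, *trunk))
    # Supervisory network scores the options from the raw state.
    option = argmax(linear(state, *supervisor))
    # Only the selected head's Q-values decide the action.
    q_values = linear(features, *heads[option])
    return option, argmax(q_values)
```

Because the heads share the trunk but keep separate output weights, subtasks can reuse common features while keeping their Q-value estimates apart, which is the kind of architectural constraint the abstract refers to.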

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

A hybrid representation based simile component extraction

Author: Cai Yi
Chen Junying
Li Qing
Ren Da
Tao Xiaohui
Zhang Pengfei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/03/2020

Simile, a special type of metaphor, can help people express their ideas more clearly. Simile component extraction is the task of extracting tenors and vehicles from sentences. The task has practical significance because it is useful for building cognitive knowledge bases. With the development of deep neural networks, researchers have begun to apply neural models to component extraction. Simile components should come from different domains (cross-domain). According to our observations, cross-domain words always belong to different concepts. Concept information is therefore important when deciding whether two words are simile components. However, existing models do not integrate concepts, so it is difficult for them to identify the concept of a word. Moreover, corpora for simile component extraction are limited: they contain many rare or unseen words whose representations are often inadequate. Existing models can hardly extract simile components accurately when sentences contain low-frequency words. To solve these problems, we propose a hybrid representation-based component extraction (HRCE) model. Each word in HRCE is represented at three levels: word level, concept level and character level. Concept-level representations help HRCE identify cross-domain words more accurately. Character-level representations help HRCE represent the meaning of a word more adequately, since words consist of characters and these characters partly convey the meaning of words. We conduct experiments to compare the performance of HRCE and existing models. The experimental results show that HRCE significantly outperforms current models.
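The three-level word representation can be illustrated with a small sketch. The abstract does not say how the levels are combined, so the concatenation used here, the averaging of character vectors, and all names (`hybrid_representation`, `word_emb`, `concept_of`, `concept_emb`, `char_emb`) are assumptions for illustration only.

```python
def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def hybrid_representation(word, word_emb, concept_of, concept_emb, char_emb, dim):
    """Concatenate word-, concept- and character-level vectors (assumed scheme).

    Unknown words or concepts fall back to a zero vector, so a rare word
    is still described by its characters.
    """
    zero = [0.0] * dim
    word_vec = word_emb.get(word, zero)
    concept_vec = concept_emb.get(concept_of.get(word), zero)
    # Character level: average the vectors of the characters we know.
    char_vec = mean_vector([char_emb[c] for c in word if c in char_emb], dim)
    return word_vec + concept_vec + char_vec
```

In this sketch an unseen word still receives a non-zero character-level part, which mirrors the abstract's motivation for handling low-frequency words.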