20 research outputs found

    Atomic-accuracy prediction of protein loop structures through an RNA-inspired ansatz

    Consistently predicting biopolymer structure at atomic resolution from sequence alone remains a difficult problem, even for small sub-segments of large proteins. Such loop prediction challenges, which arise frequently in comparative modeling and protein design, can become intractable as loop lengths exceed 10 residues and when surrounding side-chain conformations are erased. This article introduces a modeling strategy based on a 'stepwise ansatz', recently developed for RNA modeling, which posits that any realistic all-atom molecular conformation can be built up by residue-by-residue stepwise enumeration. When harnessed to a dynamic-programming-like recursion in the Rosetta framework, the resulting stepwise assembly (SWA) protocol enables enumerative sampling of a 12-residue loop at a significant but achievable cost of thousands of CPU-hours. In a previously established benchmark, SWA recovers crystallographic conformations with sub-Angstrom accuracy for 19 of 20 loops, compared to 14 of 20 by KIC modeling with a comparable expenditure of computational power. Furthermore, SWA gives high-accuracy results on an additional set of 15 loops highlighted in the biological literature for their irregularity or unusual length. Successes include cis-Pro touch turns, loops that pass through tunnels of other side-chains, and loops of lengths up to 24 residues. Remaining problem cases are traced to inaccuracies in the Rosetta all-atom energy function. In five additional blind tests, SWA achieves sub-Angstrom-accuracy models, including the first such success in a protein/RNA binding interface, the YbxF/kink-turn interaction in the fourth RNA-Puzzle competition. These results establish all-atom enumeration as a systematic approach to protein structure that can leverage high-performance computing and physically realistic energy functions to more consistently achieve atomic resolution.
    Comment: The identity of the four-loop blind-test protein and parts of Figure 5 have been omitted in this preprint to ensure confidentiality of the protein structure prior to its public release.
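
    As a rough illustration of the residue-by-residue build-up with pruning described in this abstract, the sketch below enumerates discretized backbone torsions one residue at a time and keeps only the lowest-scoring partial conformations. It is not the Rosetta SWA protocol: the torsion grid, placeholder energy function, and beam size are invented assumptions used only to make the enumeration idea concrete.

```python
# Toy sketch of stepwise, residue-by-residue enumeration with pruning.
# NOT the Rosetta SWA protocol; grid, energy, and beam size are placeholders.
from typing import List, Tuple

PHI_PSI_GRID = [(-60.0, -45.0), (-120.0, 130.0), (60.0, 40.0)]  # assumed coarse torsion grid

def energy(conformation: Tuple[Tuple[float, float], ...]) -> float:
    """Placeholder score: penalise consecutive identical torsion pairs."""
    return sum(1.0 for a, b in zip(conformation, conformation[1:]) if a == b)

def stepwise_enumerate(loop_length: int, keep: int = 50) -> List[Tuple]:
    partials = [()]  # start from an empty loop
    for _ in range(loop_length):
        # extend every retained partial conformation by one residue
        extended = [p + (torsion,) for p in partials for torsion in PHI_PSI_GRID]
        extended.sort(key=energy)      # score all extensions
        partials = extended[:keep]     # prune to the lowest-energy candidates
    return partials

if __name__ == "__main__":
    models = stepwise_enumerate(loop_length=12)
    print(len(models), "candidate 12-residue loops; best placeholder score:", energy(models[0]))
```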

    A Robust Parser-Interpreter for Jazz Chord Sequences

    Hierarchical structure similar to that associated with prosody and syntax in language can be identified in the rhythmic and harmonic progressions that underlie Western tonal music. Analysing such musical structure resembles natural language parsing: it requires the derivation of an underlying interpretation from an unstructured sequence of highly ambiguous elements—in the case of music, the notes. The task here is not merely to decide whether the sequence is grammatical, but rather to decide which among a large number of analyses it has. An analysis of this sort is a part of the cognitive processing performed by listeners familiar with a musical idiom, whether musically trained or not. Our focus is on the analysis of the structure of expectations and resolutions created by harmonic progressions. Building on previous work, we define a theory of tonal harmonic progression, which plays a role analogous to semantics in language. Our parser uses a formal grammar of jazz chord sequences, of a kind widely used for natural language processing (NLP), to map music, in the form of chord sequences used by performers, onto a representation of the structured relationships between chords. It uses statistical modelling techniques used for wide-coverage parsing in NLP to make practical parsing feasible in the face of considerable ambiguity in the grammar. Using machine learning over a small corpus of jazz chord sequences annotated with harmonic analyses, we show that grammar-based musical interpretation using simple statistical parsing models is more accurate than a baseline HMM. The experiment demonstrates that statistical techniques adapted from NLP can be profitably applied to the analysis of harmonic structure.
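
    To make the parsing analogy concrete, the sketch below runs probabilistic CKY over an invented toy grammar of chord functions and analyses a ii-V-I progression as a cadence. The paper's actual grammar formalism and statistical models are far richer; the rules, labels, and probabilities here are placeholder assumptions, not the authors' grammar.

```python
# Probabilistic CKY over a toy grammar of chord functions (placeholder rules).
from collections import defaultdict

# Binary rules: (left, right) -> list of (parent, probability); values are assumed.
RULES = {
    ("Dom", "Ton"): [("Cadence", 0.9)],
    ("Subdom", "Cadence"): [("Cadence", 0.6)],
}
# Lexical rules: chord symbol -> (function label, probability).
LEXICON = {"Dm7": [("Subdom", 1.0)], "G7": [("Dom", 1.0)], "Cmaj7": [("Ton", 1.0)]}

def cky(chords):
    n = len(chords)
    chart = defaultdict(dict)  # (i, j) -> {label: best probability}
    for i, chord in enumerate(chords):
        for label, p in LEXICON.get(chord, []):
            chart[(i, i + 1)][label] = p
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, pl in chart[(i, k)].items():
                    for right, pr in chart[(k, j)].items():
                        for parent, p_rule in RULES.get((left, right), []):
                            p = pl * pr * p_rule
                            if p > chart[(i, j)].get(parent, 0.0):
                                chart[(i, j)][parent] = p
    return chart[(0, n)]

print(cky(["Dm7", "G7", "Cmaj7"]))  # ii-V-I analysed as a cadence
```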

    Microplanning with Communicative Intentions: The SPUD System

    The process of microplanning in Natural Language Generation (NLG) encompasses a range of problems in which a generator must bridge underlying domain-specific representations and general linguistic representations. These problems include constructing linguistic referring expressions to identify domain objects, selecting lexical items to express domain concepts, and using complex linguistic constructions to concisely convey related domain facts. In this paper, we argue that such problems are best solved through a uniform, comprehensive, declarative process. In our approach, the generator directly explores a search space for utterances described by a linguistic grammar. At each stage of search, the generator uses a model of interpretation, which characterizes the potential links between the utterance and the domain and context, to assess its progress in conveying domain-specific representations. We further address the challenges for implementation and knowledge representation in this approach. We show how to implement this approach effectively by using the lexicalized tree-adjoining grammar formalism (LTAG) to connect structure to meaning and using modal logic programming to connect meaning to context. We articulate a detailed methodology for designing grammatical and conceptual resources.
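
    The following sketch illustrates, in a highly simplified form, the kind of search the abstract describes: utterance fragments (standing in for LTAG elementary trees) are added one at a time, each chosen by a model of interpretation that tracks which communicative goals are conveyed and which candidate referents remain. The fragments, goals, and referents are invented examples, and the greedy loop is only a caricature of the SPUD system's search.

```python
# Simplified greedy microplanning loop in the spirit of SPUD (not its implementation).
# Fragments stand in for lexicalized elementary trees; all data below is invented.
FRAGMENTS = {
    "the rabbit": {"conveys": {"type-rabbit"}, "compatible": {"r1", "r2"}},
    "white":      {"conveys": {"colour-white"}, "compatible": {"r1", "r3"}},
    "in the hat": {"conveys": {"location-hat"}, "compatible": {"r1"}},
}

def spud_greedy(goals, referents):
    utterance, remaining, candidates = [], set(goals), set(referents)
    while remaining or len(candidates) > 1:
        def score(name):
            f = FRAGMENTS[name]
            return (len(f["conveys"] & remaining),                        # goals newly conveyed
                    len(candidates) - len(candidates & f["compatible"]))  # distractors removed
        best = max((n for n in FRAGMENTS if n not in utterance), key=score, default=None)
        if best is None or score(best) == (0, 0):
            break  # no remaining fragment makes progress on interpretation
        utterance.append(best)
        remaining -= FRAGMENTS[best]["conveys"]
        candidates &= FRAGMENTS[best]["compatible"]
    return utterance, candidates

# Goal: describe referent r1 (type and location) so that only r1 remains compatible.
print(spud_greedy({"type-rabbit", "location-hat"}, {"r1", "r2", "r3"}))
```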

    Back-Translation Sampling by Targeting Difficult Words in Neural Machine Translation

    Neural Machine Translation has achieved state-of-the-art performance for several language pairs using a combination of parallel and synthetic data. Synthetic data is often generated by back-translating sentences randomly sampled from monolingual data using a reverse translation model. While back-translation has been shown to be very effective in many cases, it is not entirely clear why. In this work, we explore different aspects of back-translation, and show that words with high prediction loss during training benefit most from the addition of synthetic data. We introduce several variations of sampling strategies targeting difficult-to-predict words using prediction losses and frequencies of words. In addition, we target the contexts of difficult words and sample sentences that are similar in context. Experimental results for the WMT news translation task show that our method improves translation quality by up to 1.7 and 1.2 BLEU points over back-translation using random sampling for German-English and English-German, respectively.
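
    As a rough sketch of one difficulty-targeted sampling strategy of the kind the abstract mentions (not the paper's exact procedure), the code below ranks target-language words by their mean training prediction loss and then prefers monolingual sentences that contain many of these difficult words when selecting data to back-translate. Function names, the scoring rule, and the toy data are assumptions for illustration.

```python
# Sketch of difficulty-targeted selection of monolingual data for back-translation.
from collections import defaultdict

def difficult_words(token_losses, top_k=1000):
    """token_losses: iterable of (word, loss) pairs collected during forward-model training."""
    totals, counts = defaultdict(float), defaultdict(int)
    for word, loss in token_losses:
        totals[word] += loss
        counts[word] += 1
    mean_loss = {w: totals[w] / counts[w] for w in totals}
    return set(sorted(mean_loss, key=mean_loss.get, reverse=True)[:top_k])

def sample_monolingual(sentences, hard_words, budget):
    """Pick `budget` sentences with the highest proportion of difficult words."""
    def difficulty(sent):
        tokens = sent.split()
        return sum(t in hard_words for t in tokens) / max(len(tokens), 1)
    return sorted(sentences, key=difficulty, reverse=True)[:budget]

# Hypothetical usage: losses come from the forward model's training run; the
# selected sentences would then be back-translated by a reverse model.
losses = [("zeitgeist", 7.2), ("the", 0.4), ("zeitgeist", 6.8), ("cat", 1.1)]
hard = difficult_words(losses, top_k=2)
print(sample_monolingual(["the cat sat", "zeitgeist endures"], hard, budget=1))
```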

    The Importance of Being Recurrent for Modeling Hierarchical Structure

    Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks (Blevins et al., 2018) such as language modeling (Linzen et al., 2016; Gulordava et al., 2018) and neural machine translation (Shi et al., 2016). In contrast, the ability to model structured data with non-recurrent neural networks has received little attention despite their success in many NLP tasks (Gehring et al., 2017; Vaswani et al., 2017). In this work, we compare the two architectures—recurrent versus non-recurrent—with respect to their ability to model hierarchical structure and find that recurrency is indeed important for this purpose. The code and data used in our experiments are available at https://github.com/ketranm/fan_vs_rnn.
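
    To show the kind of diagnostic such comparisons rely on, the toy sketch below builds a tiny subject-verb agreement probe with an intervening plural attractor, where a model must track the singular head noun across the embedded phrase. It is not the paper's experimental setup; the data generator and the deliberately naive nearest-noun baseline are invented for illustration, and `predict_verb` would in practice wrap a trained recurrent or fully-attentional language model.

```python
# Toy subject-verb agreement probe with an attractor noun (illustrative only).
import random

def make_example(rng):
    subject = rng.choice(["the key", "the cat"])                    # singular head noun
    attractor = rng.choice(["to the cabinets", "near the dogs"])    # plural attractor
    return f"{subject} {attractor}", "is"  # correct verb agrees with the singular head

def agreement_accuracy(predict_verb, n=1000, seed=0):
    rng = random.Random(seed)
    examples = [make_example(rng) for _ in range(n)]
    return sum(predict_verb(prefix) == verb for prefix, verb in examples) / n

# A naive baseline that agrees with the nearest noun fails on every example;
# a model that tracks hierarchical structure should approach 1.0 instead.
naive = lambda prefix: "are" if prefix.split()[-1].endswith("s") else "is"
print(agreement_accuracy(naive))
```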

    The Impact of Deep Linguistic Processing on Parsing Technology
