Rapid prediction of NMR spectral properties with quantified uncertainty
Accurate calculation of specific spectral properties for NMR is an important step for molecular structure elucidation. Here we report the development of a novel machine learning technique for accurately predicting chemical shifts of both 1H and 13C nuclei, which exceeds DFT-accessible accuracy for 13C and 1H for a subset of nuclei while being orders of magnitude more performant. Our method produces estimates of uncertainty, allowing for robust and confident predictions, and suggests future avenues for improved performance.
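The abstract does not specify how the uncertainty estimates are produced; one common way to attach error bars to ML predictions is a small ensemble of regressors, sketched below. All class names and numbers here are hypothetical, not taken from the paper.

```python
# Hypothetical sketch: uncertainty-quantified prediction via a small
# ensemble of regressors. The spread of the ensemble's predictions
# serves as the uncertainty estimate.

class EnsemblePredictor:
    def __init__(self, models):
        self.models = models  # any objects with a .predict(x) -> float

    def predict(self, x):
        """Return (mean prediction, standard deviation across models)."""
        preds = [m.predict(x) for m in self.models]
        mean = sum(preds) / len(preds)
        var = sum((p - mean) ** 2 for p in preds) / len(preds)
        return mean, var ** 0.5

class ToyModel:
    """Stand-in for one trained shift regressor (illustrative only)."""
    def __init__(self, bias):
        self.bias = bias

    def predict(self, x):
        return 2.0 * x + self.bias

ens = EnsemblePredictor([ToyModel(b) for b in (-0.1, 0.0, 0.1)])
shift, sigma = ens.predict(3.0)  # e.g. a predicted 13C shift in ppm
```

A prediction with a large sigma can then be flagged as low-confidence rather than trusted blindly.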
Learning Semantic Correspondences in Technical Documentation
We consider the problem of translating high-level textual descriptions to
formal representations in technical documentation as part of an effort to model
the meaning of such documentation. We focus specifically on the problem of
learning translational correspondences between text descriptions and grounded
representations in the target documentation, such as formal representation of
functions or code templates. Our approach exploits the parallel nature of such
documentation, or the tight coupling between high-level text and the low-level
representations we aim to learn. Data is collected by mining technical
documents for such parallel text-representation pairs, which we use to train a
simple semantic parsing model. We report new baseline results on sixteen novel
datasets, including the standard library documentation for nine popular
programming languages across seven natural languages, and a small collection of
Unix utility manuals.
Comment: accepted to ACL-201
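As a rough illustration of the approach described above (not the authors' actual model), mined (description, signature) pairs can drive a simple co-occurrence scorer that ranks candidate signatures for a query. The pairs below are invented:

```python
# Illustrative sketch: learn word-level correspondences between text
# descriptions and function signatures from mined parallel pairs, then
# rank candidate signatures by a simple co-occurrence score.
from collections import Counter

pairs = [  # toy "mined" (description, signature) pairs
    ("return the length of a string", "len(s: str) -> int"),
    ("convert text to upper case", "upper(s: str) -> str"),
]

def tokenize_sig(sig):
    return sig.replace("(", " ").replace(")", " ").split()

# co-occurrence table: (text word, signature token) -> count
cooc = Counter()
for text, sig in pairs:
    for w in text.split():
        for t in tokenize_sig(sig):
            cooc[(w, t)] += 1

def score(text, sig):
    """Sum of co-occurrence counts between text words and sig tokens."""
    return sum(cooc[(w, t)]
               for w in text.split()
               for t in tokenize_sig(sig))

query = "length of a string"
candidates = [sig for _, sig in pairs]
best = max(candidates, key=lambda s: score(query, s))
```

Real semantic parsing models are far richer, but the parallel text–representation pairs supply the training signal in the same way.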
Polyglot Semantic Parsing in APIs
Traditional approaches to semantic parsing (SP) work by training individual
models for each available parallel dataset of text-meaning pairs. In this
paper, we explore the idea of polyglot semantic translation, or learning
semantic parsing models that are trained on multiple datasets and natural
languages. In particular, we focus on translating text to code signature
representations using the software component datasets of Richardson and Kuhn
(2017a,b). The advantage of such models is that they can be used for parsing a
wide variety of input natural languages and output programming languages, or
mixed input languages, using a single unified model. To facilitate modeling of
this type, we develop a novel graph-based decoding framework that achieves
state-of-the-art performance on the above datasets, and apply this method to
two other benchmark SP tasks.
Comment: accepted for NAACL-2018 (camera ready version)
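One standard way to realize a single unified model over many input/output language pairs (a common pattern, not necessarily this paper's exact mechanism) is to prefix each example with artificial language-identifier tokens, so one parameter set serves all datasets:

```python
# Sketch of the polyglot idea: route each example through a shared
# model by tagging it with its source and target languages.

def tag_example(text, src_lang, tgt_lang):
    """Prefix language identifiers so one model can handle the pair."""
    return f"<{src_lang}2{tgt_lang}> {text}"

ex = tag_example("return the square root of x", "en", "py")
```

Mixed-language input then needs no special handling: each example simply carries its own tags.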
SLM-based Digital Adaptive Coronagraphy: Current Status and Capabilities
Active coronagraphy is deemed to play a key role for the next generation of
high-contrast instruments, notably in order to deal with large segmented
mirrors that might exhibit time-dependent pupil merit function, caused by
missing or defective segments. To this purpose, we recently introduced a new
technological framework called digital adaptive coronagraphy (DAC), making use
of liquid-crystal spatial light modulators (SLMs) display panels operating as
active focal-plane phase mask coronagraphs. Here, we first review the latest
contrast performance, measured in laboratory conditions with monochromatic
visible light, and describe a few potential pathways to improve SLM
coronagraphic nulling in the future. We then unveil a few unique capabilities
of SLM-based DAC that were recently, or are currently in the process of being,
demonstrated in our laboratory, including NCPA wavefront sensing,
aperture-matched adaptive phase masks, coronagraphic nulling of multiple star
systems, and coherent differential imaging (CDI).
Comment: 14 pages, 9 figures, to appear in Proceedings of the SPIE, paper 10706-9
Applying Occam's Razor to Transformer-Based Dependency Parsing: What Works, What Doesn't, and What is Really Necessary
The introduction of pre-trained transformer-based contextualized word
embeddings has led to considerable improvements in the accuracy of graph-based
parsers for frameworks such as Universal Dependencies (UD). However, previous
works differ in various dimensions, including their choice of pre-trained
language models and whether they use LSTM layers. With the aims of
disentangling the effects of these choices and identifying a simple yet widely
applicable architecture, we introduce STEPS, a new modular graph-based
dependency parser. Using STEPS, we perform a series of analyses on the UD
corpora of a diverse set of languages. We find that the choice of pre-trained
embeddings has by far the greatest impact on parser performance and identify
XLM-R as a robust choice across the languages in our study. Adding LSTM layers
provides no benefits when using transformer-based embeddings. A multi-task
training setup outputting additional UD features may contort results. Taking
these insights together, we propose a simple but widely applicable parser
architecture and configuration, achieving new state-of-the-art results (in
terms of LAS) for 10 out of 12 diverse languages.
Comment: 14 pages, 1 figure; camera-ready version for IWPT 202
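To make the graph-based setup concrete: such a parser assigns a score to every candidate head–dependent arc and then decodes a tree. Below is a minimal greedy sketch with invented scores; real parsers like STEPS use learned biaffine scores and tree-constrained (e.g. MST) decoding.

```python
# Greedy arc decoding sketch: rows are dependents, columns are
# candidate heads (index 0 = ROOT). Scores are made up for the toy
# sentence "She reads books".

scores = [
    # ROOT  She   reads books   <- candidate heads
    [0.1,  0.0,  0.9,  0.0],   # She   -> reads
    [0.8,  0.1,  0.0,  0.1],   # reads -> ROOT
    [0.2,  0.0,  0.7,  0.1],   # books -> reads
]

def greedy_heads(scores):
    """For each dependent, return the index of its best-scoring head."""
    return [max(range(len(row)), key=lambda h: row[h]) for row in scores]

heads = greedy_heads(scores)  # [2, 0, 2]
```

The paper's finding is that what feeds this scorer (the choice of pre-trained embeddings) matters far more than extra architecture such as LSTM layers.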
Towards Opinion Mining from Reviews for the Prediction of Product Rankings
Opinion mining aims at summarizing the content of reviews for a specific brand, product, or manufacturer. However, the actual desire of a user is often one step further: produce a ranking corresponding to specific needs such that a selection process is supported. In this work, we aim to close this gap. We present the task of ranking products based on sentiment information and discuss the steps necessary to address it. This includes, on the one hand, the identification of gold rankings as a foundation for an objective function and evaluation and, on the other hand, methods to rank products based on review information. To demonstrate early results on this task, we employ real-world rankings as gold standards that are of interest to potential customers as well as product managers: in our case, the sales ranking provided by Amazon.com and the quality ranking by Snapsort.com. As baseline methods, we use average star ratings and review frequencies. Our best text-based approximation of the sales ranking achieves a Spearman's correlation coefficient of ρ = 0.23. On the Snapsort data, a ranking based on extracting comparisons leads to ρ = 0.51. In addition, we show that aspect-specific rankings can be used to measure the impact of specific aspects on the ranking.
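The evaluation metric used above, Spearman's ρ, compares two rankings via their squared rank differences. A minimal sketch with invented rankings (the formula below is the standard tie-free form):

```python
# Spearman's rank correlation between a gold ranking and a ranking
# derived from review signals. Rank positions here are made up.

def spearman_rho(xs, ys):
    """Spearman's rho for two rankings of the same items, no ties."""
    n = len(xs)
    d2 = sum((x - y) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

gold = [1, 2, 3, 4, 5]   # e.g. positions in a sales ranking
pred = [2, 1, 3, 5, 4]   # e.g. positions induced by avg star rating

rho = spearman_rho(gold, pred)
```

A ρ of 1 means the predicted ranking matches the gold ranking exactly; 0 means no monotone relationship.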