Cross-Lingual Adaptation using Structural Correspondence Learning
Cross-lingual adaptation, a special case of domain adaptation, refers to the
transfer of classification knowledge between two languages. In this article we
describe an extension of Structural Correspondence Learning (SCL), a recently
proposed algorithm for domain adaptation, for cross-lingual adaptation. The
proposed method uses unlabeled documents from both languages, along with a word
translation oracle, to induce cross-lingual feature correspondences. From these
correspondences a cross-lingual representation is created that enables the
transfer of classification knowledge from the source to the target language.
The main advantages of this approach over other approaches are its resource
efficiency and task specificity.
We conduct experiments in the area of cross-language topic and sentiment
classification involving English as source language and German, French, and
Japanese as target languages. The results show a significant improvement of the
proposed method over a machine translation baseline, reducing the relative
error due to cross-lingual adaptation by an average of 30% (topic
classification) and 59% (sentiment classification). We further report on
empirical analyses that reveal insights into the use of unlabeled data, the
sensitivity with respect to important hyperparameters, and the nature of the
induced cross-lingual correspondences.
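The core idea, inducing a shared representation from pivot word pairs and unlabeled documents, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a joint bag-of-words vocabulary over both languages, uses ridge least squares for the pivot predictors as a stand-in for whatever loss the article actually uses, and the function name `scl_projection`, its parameters, and the six-word toy vocabulary are all hypothetical.

```python
import numpy as np

def scl_projection(X, pivots, k, ridge=1.0):
    """Sketch of an SCL-style projection (assumed setup, not the paper's code).

    X      : (n_docs, n_features) unlabeled documents from both languages,
             encoded over one joint vocabulary.
    pivots : list of (source_col, target_col) pairs linked by a translation oracle.
    k      : number of dimensions kept for the cross-lingual representation.
    Returns theta with shape (k, n_features); a document x is mapped to theta @ x.
    """
    n_features = X.shape[1]
    pivot_cols = sorted({c for pair in pivots for c in pair})
    X_masked = X.astype(float).copy()
    X_masked[:, pivot_cols] = 0.0  # hide pivot words from their own predictors
    # Ridge normal equations, shared by all pivot predictors.
    gram = X_masked.T @ X_masked + ridge * np.eye(n_features)
    W = np.zeros((n_features, len(pivots)))
    for j, (s, t) in enumerate(pivots):
        # Label: does the document contain the pivot in either language?
        y = np.where((X[:, s] > 0) | (X[:, t] > 0), 1.0, -1.0)
        W[:, j] = np.linalg.solve(gram, X_masked.T @ y)
    # Leading left singular vectors capture correlations shared across pivots.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    return U[:, :k].T

# Invented toy vocabulary: 0 good, 1 excellent, 2 boring, 3 gut, 4 hervorragend, 5 langweilig
X = np.array([
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
theta = scl_projection(X, pivots=[(0, 3)], k=1)
```

In this toy setting, "excellent" and "hervorragend" receive the same weight in `theta` because both co-occur with the pivot pair (good, gut), which is the kind of cross-lingual correspondence the abstract refers to.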
Model construction in analysis and synthesis tasks
by Benno Maria Stein. Presumed submission date: 20.12.2001. Paderborn, Univ., habilitation thesis, 200
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants
Reasoning is a crucial part of natural language argumentation. To comprehend
an argument, one must analyze its warrant, which explains why its claim follows
from its premises. As arguments are highly contextualized, warrants are usually
presupposed and left implicit. Thus, the comprehension does not only require
language understanding and logic skills, but also depends on common sense. In
this paper we develop a methodology for reconstructing warrants systematically.
We operationalize it in a scalable crowdsourcing process, resulting in a freely
licensed dataset with warrants for 2k authentic arguments from news comments.
On this basis, we present a new challenging task, the argument reasoning
comprehension task. Given an argument with a claim and a premise, the goal is
to choose the correct implicit warrant from two options. Both warrants are
plausible and lexically close, but lead to contradicting claims. A solution to
this task will define a substantial step towards automatic warrant
reconstruction. However, experiments with several neural attention and language
models reveal that current approaches do not suffice.
Comment: Accepted as NAACL 2018 Long Paper; see details on the front page.
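The task format described above, one claim, one premise, and two candidate warrants of which exactly one is correct, can be sketched as a small data structure with an accuracy measure. The class name, field names, and the two example arguments are invented for illustration and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass(frozen=True)
class WarrantInstance:
    """One task item (hypothetical field names)."""
    claim: str
    premise: str
    warrant0: str
    warrant1: str
    correct: int  # index of the correct warrant: 0 or 1

def accuracy(instances: Sequence[WarrantInstance],
             choose: Callable[[WarrantInstance], int]) -> float:
    """Fraction of instances for which `choose` picks the correct warrant."""
    if not instances:
        return 0.0
    hits = sum(choose(inst) == inst.correct for inst in instances)
    return hits / len(instances)

def always_first(inst: WarrantInstance) -> int:
    """Trivial baseline that ignores the argument entirely."""
    return 0

# Invented examples; the two warrants are lexically close but support opposite claims.
toy = [
    WarrantInstance(
        claim="Cities should expand bike lanes",
        premise="Bike lanes reduce traffic injuries",
        warrant0="Reducing traffic injuries is worth the cost of new lanes",
        warrant1="Reducing traffic injuries is not worth the cost of new lanes",
        correct=0,
    ),
    WarrantInstance(
        claim="Libraries should not charge late fees",
        premise="Late fees keep some readers from returning",
        warrant0="Keeping readers away is an acceptable price for timely returns",
        warrant1="Keeping readers away is not an acceptable price for timely returns",
        correct=1,
    ),
]
baseline_accuracy = accuracy(toy, always_first)
```

Because the choice is binary, a baseline that ignores the argument scores around 50% on a balanced set, so any model has to beat that figure to show it reasons about the warrant at all.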
A keyquery-based classification system for CORE
We apply keyquery-based taxonomy composition to compute a classification system for the CORE dataset, a shared crawl of about 850,000 scientific papers. Keyquery-based taxonomy composition can be understood as a two-phase hierarchical document clustering technique that uses search queries as cluster labels. In the first phase, the document collection is indexed by a reference search engine, and each document is tagged with the search queries for which it is relevant, its so-called keyqueries. In the second phase, a hierarchical clustering is formed from the keyqueries in an iterative process. We use the explicit topic model ESA as the document retrieval model to index the CORE dataset in the reference search engine. Under the ESA retrieval model, documents are represented as vectors of similarities to Wikipedia articles, a methodology proven to be advantageous for text categorization tasks. Our paper presents the generated taxonomy and reports on quantitative properties such as document coverage and processing requirements.
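The ESA representation mentioned above, a document as a vector of similarities to concept articles, can be sketched as follows. This is a minimal illustration assuming raw term-count vectors and cosine similarity; the function name `esa_vector`, the four-term vocabulary, and the two concept articles are hypothetical, and the paper's actual weighting scheme is not stated in the abstract.

```python
import numpy as np

def esa_vector(doc_tf, concept_tf):
    """Cosine similarity of one document to every concept article.

    doc_tf     : (n_terms,) term weights of the document.
    concept_tf : (n_concepts, n_terms) term weights of the concept articles.
    Returns an (n_concepts,) vector; entries are 0 where a norm is zero.
    """
    doc_tf = np.asarray(doc_tf, dtype=float)
    concept_tf = np.asarray(concept_tf, dtype=float)
    dots = concept_tf @ doc_tf
    denom = np.linalg.norm(doc_tf) * np.linalg.norm(concept_tf, axis=1)
    return np.divide(dots, denom, out=np.zeros_like(dots), where=denom > 0)

# Invented vocabulary: matrix, solver, query, index
concepts = np.array([
    [3, 2, 0, 0],  # stand-in for an article on linear algebra
    [0, 0, 2, 3],  # stand-in for an article on information retrieval
])
doc = np.array([0, 0, 1, 1])
vec = esa_vector(doc, concepts)
```

The toy document shares no terms with the first concept and most of its terms with the second, so its ESA vector is concentrated on the second entry.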
Learning Overlap Optimization for Domain Decomposition Methods
The finite element method is a numerical simulation technique for solving partial differential equations. Domain decomposition provides a means for parallelizing the expensive simulation on modern computing architectures. Choosing the sub-domains for domain decomposition is a non-trivial task, and in this paper we show how this can be addressed with machine learning. Our method starts with a baseline decomposition, from which we learn tailored sub-domain overlaps from localized neighborhoods. An evaluation of 527 partial differential equations shows that our learned solutions improve the baseline decomposition with high consistency and by a statistically significant margin.
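To make the notion of sub-domain overlap concrete, the sketch below splits a one-dimensional grid into contiguous sub-domains and widens each by its own overlap. It only illustrates what a per-sub-domain overlap is; the function name and parameters are hypothetical, the overlaps are supplied by hand, and the learning procedure of the paper is not reproduced.

```python
def overlapping_subdomains(n_nodes, overlaps):
    """Split node indices 0..n_nodes-1 into len(overlaps) overlapping ranges.

    overlaps[i] is the number of nodes by which sub-domain i is extended
    on each side (clipped at the grid boundary).
    Returns a list of half-open (start, stop) index ranges.
    """
    n_parts = len(overlaps)
    ranges = []
    for i, overlap in enumerate(overlaps):
        start = i * n_nodes // n_parts        # non-overlapping baseline split
        stop = (i + 1) * n_nodes // n_parts
        ranges.append((max(0, start - overlap), min(n_nodes, stop + overlap)))
    return ranges

uniform = overlapping_subdomains(10, [1, 1])   # same overlap everywhere
tailored = overlapping_subdomains(10, [3, 0])  # overlap chosen per sub-domain
```

With a uniform overlap of one node the two ranges are (0, 6) and (4, 10); the tailored call widens only the first sub-domain, giving (0, 8) and (5, 10).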
Demanded Abstract Interpretation
Formal static analysis is seeing increasingly widespread adoption as a tool for verification and bug-finding, but even with powerful cloud infrastructure it can take minutes or hours for a
developer to get analysis results after a code change. This dissertation considers the problem of
making expressive and sophisticated static analyzers interactive by providing analysis results to
developers in as close to real time as possible. While existing techniques offer some demand-driven
or incremental aspects for certain classes of analysis, the fundamental challenge addressed by this
work is doing both for abstract interpretation in arbitrary domains.
This dissertation presents a technique, demanded abstract interpretation, that lifts analysis computations to a dependency graph structure in which incremental program edits and demand-driven evaluation of abstract semantics can be handled uniformly. Demanded abstract interpretation
draws inspiration from graph-based approaches to incremental computation, and is not only sound
and terminating but also from-scratch consistent with underlying batch analyses.
The approach is parametric in the choice of abstract domain, supporting a wide range of
analysis problems and enabling the reuse of highly-tuned existing domain implementations in our
demanded analysis framework without requiring any per-domain reasoning about incrementality or
demand. The complex, cyclic, and unbounded dependency structures that arise when analyzing
loops and recursive control flow in an infinite-height domain are a key challenge, which our approach
handles by dynamically extending novel acyclic encodings of such analysis computation.
This dissertation describes and formalizes demanded abstract interpretation techniques for both intraprocedural analysis and compositional interprocedural analysis. We also present promising
experimental results in a prototype analysis implementation, and describe some extensions to the
framework designed to confront practical resource constraints without sacrificing formal guarantees.
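The combination of demand-driven evaluation and incremental edits on a dependency graph can be illustrated with a generic memoizing graph. This sketch is not the dissertation's technique: it handles only acyclic dependencies, does not address loops or widening, and the class `DemandGraph`, its methods, and the interval example are hypothetical.

```python
class DemandGraph:
    """Acyclic dependency graph with lazy evaluation and edit invalidation."""

    def __init__(self):
        self.funcs = {}       # cell name -> (function, dependency names)
        self.cache = {}       # cell name -> memoized value
        self.dependents = {}  # cell name -> cells that read it
        self.evaluations = 0  # how many cells have been (re)computed

    def define(self, name, func, deps=()):
        """Add or edit a cell; cached results that depend on it are dropped."""
        self.funcs[name] = (func, tuple(deps))
        for dep in deps:
            self.dependents.setdefault(dep, set()).add(name)
        self._invalidate(name)

    def _invalidate(self, name):
        self.cache.pop(name, None)
        for user in self.dependents.get(name, ()):
            self._invalidate(user)

    def query(self, name):
        """Compute a cell only when asked, reusing cached dependencies."""
        if name not in self.cache:
            func, deps = self.funcs[name]
            self.cache[name] = func(*(self.query(dep) for dep in deps))
            self.evaluations += 1
        return self.cache[name]

# Abstract values are intervals (lo, hi), a simple stand-in abstract domain.
g = DemandGraph()
g.define("x", lambda: (0, 10))
g.define("y", lambda x: (x[0] + 1, x[1] + 1), deps=["x"])  # y = x + 1
g.define("z", lambda: (5, 5))                              # never queried below

first = g.query("y")            # computes x and y only
evals_after_first = g.evaluations
g.define("x", lambda: (0, 3))   # program edit: invalidates x and y
second = g.query("y")
evals_after_second = g.evaluations
```

Only the cells needed for the query are evaluated, and an edit recomputes only what depends on the changed cell; `z` is never computed because nothing demands it.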