342 research outputs found
Multiplanarity - a model for dependency structures in treebanks
Cited several times, e.g.: 1. Marco Kuhlmann and Joakim Nivre: Mildly non-projective dependency structures. In Proceedings of the COLING/ACL on Main conference poster sessions, p. 507--514. In series COLING-ACL '06. Sydney, Australia, 2006. 2. Carlos Gómez-Rodríguez and Joakim Nivre: A Transition-Based Parser for 2-Planar Dependency Structures. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 1492--1501, Uppsala, Sweden, 11-16 July 2010. ACL. 3. Marco Kuhlmann: Dependency Structures and Lexicalized Grammars. An Algebraic Approach. LNAI 6270. FoLLI Publications on Logic, Language and Information. Springer, 2010.
The number of treebanks available for different languages is growing steadily. A considerable portion of the recent treebanks use annotation schemes that are based on dependency syntax. In this paper, we give a model for linguistically adequate classes of dependency structures in treebanks. Our model is tested using the Danish Dependency Treebank. Lecerf's projectivity hypothesis assumes a constraint on linear word order in dependency analyses. Unfortunately, projectivity does not lend itself to adequate treatment of certain non-local syntactic phenomena which are extensively studied in the literature of constituent-based theories such as TG, GB, GPSG, TAG, and LFG. Among these phenomena are scrambling, topicalizations, WH-movements, cleft sentences, discontinuous NPs, and discontinuous negation. A few relaxed models somewhat similar to projectivity have been proposed. These include quasi-projectivity, planarity, pseudo-projectivity, meta-projectivity, and polarized dependency grammars. None of these models is motivated by formal language theory. The current work presents a new word-order model with a clear connection to formal language theory. The model, multiplanarity with a bounded number of planes, is based on planarity, which is itself a generalization of projectivity. Peer reviewed
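The word-order constraints named in this abstract can be illustrated with a minimal sketch. It assumes common textbook definitions that the abstract itself does not spell out: an arc is projective if every word strictly between head and dependent descends from the head, a structure is planar if no two arcs cross when drawn above the sentence, and a multiplanar structure splits its arcs into planes that are each crossing-free. The function names and the head-array encoding are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def arcs_from_heads(heads):
    """heads[i] is the 1-based head of word i+1; 0 marks the root (assumed encoding)."""
    return [(h, d) for d, h in enumerate(heads, start=1) if h != 0]


def crosses(a, b):
    """True if two arcs cross when both are drawn above the sentence."""
    (l1, r1), (l2, r2) = sorted(a), sorted(b)
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def is_planar(arcs):
    """Planar: no pair of arcs crosses."""
    return not any(crosses(a, b) for a, b in combinations(arcs, 2))


def is_projective(heads):
    """Projective: every word strictly between a head and its dependent
    is a descendant of that head."""
    def descends(node, ancestor):
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return False

    return all(
        descends(k, h)
        for h, d in arcs_from_heads(heads)
        for k in range(min(h, d) + 1, max(h, d))
    )


def is_valid_plane_assignment(arcs, planes):
    """planes[i] is the plane of arcs[i]; each plane must be crossing-free."""
    return not any(
        planes[i] == planes[j] and crosses(arcs[i], arcs[j])
        for i, j in combinations(range(len(arcs)), 2)
    )
```

Under these assumed definitions, the toy structure `heads = [3, 0, 2]` is planar but not projective, and `heads = [3, 4, 0, 3]` has one crossing that disappears once its arcs are split over two planes, for example with `planes = [0, 1, 0]`. The sketch only checks a given plane assignment; it does not search for the smallest number of planes.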
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics
Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling
Syntactic features play an essential role in identifying relationships in a
sentence. Previous neural network models often suffer from irrelevant
information introduced when subjects and objects are far apart. In
this paper, we propose to learn more robust relation representations from the
shortest dependency path through a convolutional neural network. We further
propose a straightforward negative sampling strategy to improve the assignment
of subjects and objects. Experimental results show that our method outperforms
the state-of-the-art methods on the SemEval-2010 Task 8 dataset.
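To make the notion of a shortest dependency path concrete, here is a minimal sketch that extracts the path between two tokens by breadth-first search over the undirected dependency tree. It covers only the path extraction step, not the convolutional model or the negative sampling strategy. The function name, the head-array encoding, and the toy parse in the usage note are assumptions for illustration and do not come from the paper.

```python
from collections import deque


def shortest_dependency_path(heads, source, target):
    """Return the 1-based token indices on the shortest path from source to target.

    heads[i] is the 1-based head of token i+1; 0 marks the root (assumed encoding).
    Dependency arcs are treated as undirected edges.
    """
    neighbours = {i: set() for i in range(1, len(heads) + 1)}
    for dependent, head in enumerate(heads, start=1):
        if head != 0:
            neighbours[dependent].add(head)
            neighbours[head].add(dependent)

    # Breadth-first search, remembering how each token was reached.
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for nxt in sorted(neighbours[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)

    if target not in previous:
        return []

    # Walk back from the target to the source and reverse the result.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1]
```

For a hypothetical parse of "The burst was caused by pressure" with `heads = [2, 4, 4, 0, 6, 4]`, the call `shortest_dependency_path(heads, 2, 6)` returns the path burst, caused, pressure as token indices, leaving out "The", "was" and "by".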
Conversational Exploratory Search via Interactive Storytelling
Conversational interfaces are likely to become a more efficient, intuitive and
engaging way for human-computer interaction than today's text or touch-based
interfaces. Current research efforts concerning conversational interfaces focus
primarily on question answering functionality, thereby neglecting support for
search activities beyond targeted information lookup. Users engage in
exploratory search when they are unfamiliar with the domain of their goal,
unsure about the ways to achieve their goals, or unsure about their goals in
the first place. Exploratory search is often supported by approaches from
information visualization. However, such approaches cannot be directly
translated to the setting of conversational search.
In this paper we investigate the affordances of interactive storytelling as a
tool to enable exploratory search within the framework of a conversational
interface. Interactive storytelling provides a way to navigate a document
collection at the pace and in the order a user prefers. In our vision, interactive
storytelling is to be coupled with a dialogue-based system that provides verbal
explanations and responsive design. We discuss challenges and sketch the
research agenda required to bring this vision to life. Comment: Accepted at ICTIR'17 Workshop on Search-Oriented Conversational AI
(SCAI 2017)
Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use
This article is a position paper about Amazon Mechanical Turk, the use of which has been steadily growing in language processing in the past few years. According to the mainstream opinion expressed in articles of the domain, this type of online working platform makes it possible to develop all sorts of quality language resources quickly and at a very low price, using people who do the work as a hobby. We shall demonstrate here that the situation is far from being that ideal. Our goal here is manifold: (1) to inform researchers, so that they can make their own choices, (2) to develop alternatives with the help of funding agencies and scientific associations, (3) to propose practical and organizational solutions in order to improve language resource development, while limiting the risks of ethical and legal issues without compromising on price or quality, (4) to introduce an Ethics and Big Data Charter for the documentation of language resources.
Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation
We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find that we get an order-of-magnitude increase in the rate of performance improvement. Johns Hopkins University Human Language Technology Center of Excellence