
    Multiplanarity: A Model for Dependency Structures in Treebanks

    Cited several times, e.g. by: 1. Marco Kuhlmann & Joakim Nivre: Mildly Non-Projective Dependency Structures. In Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pp. 507-514, Sydney, Australia, 2006. 2. Carlos Gómez-Rodríguez & Joakim Nivre: A Transition-Based Parser for 2-Planar Dependency Structures. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 1492-1501, Uppsala, Sweden, 11-16 July 2010. 3. Marco Kuhlmann: Dependency Structures and Lexicalized Grammars: An Algebraic Approach. LNAI 6270, FoLLI Publications on Logic, Language and Information, Springer, 2010.

    The number of treebanks available for different languages is growing steadily. A considerable portion of the recent treebanks use annotation schemes that are based on dependency syntax. In this paper, we give a model for linguistically adequate classes of dependency structures in treebanks. Our model is tested using the Danish Dependency Treebank. Lecerf's projectivity hypothesis assumes a constraint on linear word order in dependency analyses. Unfortunately, projectivity does not lend itself to adequate treatment of certain non-local syntactic phenomena which are extensively studied in the literature of constituent-based theories such as TG, GB, GPSG, TAG, and LFG. Among these phenomena are scrambling, topicalizations, WH-movements, cleft sentences, discontinuous NPs, and discontinuous negation. A few relaxed models somewhat similar to projectivity have been proposed. These include quasi-projectivity, planarity, pseudo-projectivity, meta-projectivity, and polarized dependency grammars. None of these models is motivated by formal language theory. The current work presents a new word-order model with a clear connection to formal language theory. The model, multiplanarity with a bounded number of planes, is based on planarity, which is itself a generalization of projectivity. Peer reviewed.
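
    The core notions above (projectivity, edge crossings, and planes of mutually non-crossing edges) can be made concrete with a small amount of code. The sketch below, in Python, checks projectivity of a head-annotated sentence and greedily estimates how many planes its edges need; the greedy plane assignment is only an illustrative heuristic and is not the construction used in the paper.

```python
# Hypothetical sketch: checking projectivity and greedily estimating the
# number of planes needed for a dependency analysis. The crossing test is
# the standard one; the greedy plane assignment is only a heuristic and is
# NOT the paper's construction.

def crosses(e1, e2):
    """Two dependency edges cross if their spans properly interleave."""
    (a1, b1), (a2, b2) = sorted(e1), sorted(e2)
    return a1 < a2 < b1 < b2 or a2 < a1 < b2 < b1

def is_projective(heads):
    """heads[i] = index of the head of word i, or None for the root (0-based)."""
    edges = [(h, d) for d, h in enumerate(heads) if h is not None]
    return not any(crosses(e1, e2)
                   for i, e1 in enumerate(edges) for e2 in edges[i + 1:])

def greedy_planes(heads):
    """Assign each edge to the lowest-numbered plane with no crossing edge."""
    edges = [(h, d) for d, h in enumerate(heads) if h is not None]
    planes = []  # each plane is a list of mutually non-crossing edges
    for e in edges:
        for plane in planes:
            if not any(crosses(e, f) for f in plane):
                plane.append(e)
                break
        else:
            planes.append([e])
    return len(planes)

# Toy non-projective analysis: word 1 is headed by word 3, word 2 by word 0.
heads = [None, 3, 0, 0]
print(is_projective(heads))   # False: edges (3,1) and (0,2) cross
print(greedy_planes(heads))   # 2 planes suffice here under the greedy heuristic
```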

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor in its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orient the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistics.
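
    As a concrete illustration of "measuring the amount of reordering" in a language pair, one simple and widely used proxy is to count crossing (discordant) links in a word alignment, a Kendall's-tau-style distance. The sketch below assumes a one-to-one alignment given as (source position, target position) pairs; it is only a minimal stand-in for the richer measures discussed in the survey.

```python
# Hypothetical sketch: quantify reordering in a sentence pair by counting
# discordant (crossing) pairs of alignment links, normalized by the number
# of word pairs. Assumes a 1-to-1 word alignment; not the survey's metric.

def reordering_score(alignment):
    """alignment: list of (source_pos, target_pos) pairs, one per source word."""
    alignment = sorted(alignment)                  # order links by source position
    targets = [t for _, t in alignment]
    n = len(targets)
    crossings = sum(1 for i in range(n) for j in range(i + 1, n)
                    if targets[i] > targets[j])    # discordant (reordered) pairs
    return crossings / (n * (n - 1) / 2) if n > 1 else 0.0

# Toy example where the second source word moves to the end of the target.
print(reordering_score([(0, 0), (1, 3), (2, 1), (3, 2)]))  # ~0.33: a third of pairs reordered
```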

    Semantic Relation Classification via Convolutional Neural Networks with Simple Negative Sampling

    Syntactic features play an essential role in identifying relationships in a sentence. Previous neural network models often suffer from irrelevant information introduced when subjects and objects are separated by a long distance. In this paper, we propose to learn more robust relation representations from the shortest dependency path through a convolutional neural network. We further propose a straightforward negative sampling strategy to improve the assignment of subjects and objects. Experimental results show that our method outperforms the state-of-the-art methods on the SemEval-2010 Task 8 dataset.
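
    To make the shortest-dependency-path idea concrete: given the head array produced by any dependency parser, the path between the subject and object tokens runs from each token up to their lowest common ancestor. The sketch below assumes 0-based token indices and an already-parsed toy sentence; the convolutional network that consumes the path, and the negative sampling strategy, are not shown.

```python
# Hypothetical sketch: extract the shortest dependency path (SDP) between a
# subject and an object token, given a head array from any dependency parser.
# The parse and token indices below are illustrative assumptions.

def path_to_root(i, heads):
    """Return the chain of token indices from i up to the root."""
    chain = [i]
    while heads[chain[-1]] is not None:
        chain.append(heads[chain[-1]])
    return chain

def shortest_dependency_path(subj, obj, heads):
    """SDP = subj's ancestors up to the lowest common ancestor, then down to obj."""
    up = path_to_root(subj, heads)
    down = path_to_root(obj, heads)
    common = next(x for x in up if x in down)          # lowest common ancestor
    return up[:up.index(common) + 1] + list(reversed(down[:down.index(common)]))

# Toy parse of "A burst of gas was caused by the explosion" (root = "caused").
words = ["A", "burst", "of", "gas", "was", "caused", "by", "the", "explosion"]
heads = [1, 5, 1, 2, 5, None, 5, 8, 6]
print([words[i] for i in shortest_dependency_path(1, 8, heads)])
# ['burst', 'caused', 'by', 'explosion']
```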

    Conversational Exploratory Search via Interactive Storytelling

    Conversational interfaces are likely to become a more efficient, intuitive, and engaging way for human-computer interaction than today's text- or touch-based interfaces. Current research efforts concerning conversational interfaces focus primarily on question answering functionality, thereby neglecting support for search activities beyond targeted information lookup. Users engage in exploratory search when they are unfamiliar with the domain of their goal, unsure about the ways to achieve their goals, or unsure about their goals in the first place. Exploratory search is often supported by approaches from information visualization. However, such approaches cannot be directly translated to the setting of conversational search. In this paper, we investigate the affordances of interactive storytelling as a tool to enable exploratory search within the framework of a conversational interface. Interactive storytelling provides a way to navigate a document collection at the pace and in the order a user prefers. In our vision, interactive storytelling is to be coupled with a dialogue-based system that provides verbal explanations and responsive design. We discuss challenges and sketch the research agenda required to bring this vision to life. Comment: Accepted at the ICTIR'17 Workshop on Search-Oriented Conversational AI (SCAI 2017).

    Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use

    This article is a position paper about Amazon Mechanical Turk, the use of which has been steadily growing in language processing in the past few years. According to the mainstream opinion expressed in articles of the domain, this type of online working platform allows all sorts of quality language resources to be developed quickly, at a very low price, by people doing it as a hobby. We shall demonstrate here that the situation is far from being that ideal. Our goal here is manifold: 1- to inform researchers, so that they can make their own choices; 2- to develop alternatives with the help of funding agencies and scientific associations; 3- to propose practical and organizational solutions in order to improve language resources development, while limiting the risks of ethical and legal issues and without giving up on price or quality; 4- to introduce an Ethics and Big Data Charter for the documentation of language resources.

    Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation

    We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the trend of diminishing returns that is commonly encountered. We present an active-learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find an order-of-magnitude increase in the rate of performance improvement. Johns Hopkins University Human Language Technology Center of Excellence.
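
    A minimal sketch of what an active-learning-style data solicitation loop could look like: score each unlabeled source sentence by how well its n-grams are already covered by the existing bitext, send the least-covered sentences out for (crowd) translation, and repeat. The coverage score and batch selection below are illustrative assumptions, not the selection criterion used in the paper.

```python
# Hypothetical sketch of an active-learning-style data solicitation loop:
# pick the unlabeled source sentences whose bigrams are least covered by the
# existing training pool, send them for annotation, retrain, and repeat.
# The coverage score is a simple stand-in, not the paper's criterion.

from collections import Counter

def ngrams(tokens, n=2):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def coverage_score(sentence, seen):
    """Fraction of the sentence's bigrams already seen in the training pool."""
    grams = ngrams(sentence.split())
    return sum(g in seen for g in grams) / len(grams) if grams else 1.0

def select_batch(pool, training_sentences, batch_size=2):
    """Pick the least-covered sentences as the next annotation batch."""
    seen = Counter(g for s in training_sentences for g in ngrams(s.split()))
    return sorted(pool, key=lambda s: coverage_score(s, seen))[:batch_size]

training = ["the cat sat on the mat", "the dog barked"]
pool = ["the cat sat quietly",
        "a completely novel construction appears",
        "the dog barked loudly"]
print(select_batch(pool, training))  # least-covered sentences are chosen first
```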