Parsing Early Modern English for Linguistic Search
This work addresses the question of whether the output of a state-of-the-art parser is accurate enough to support research in theoretical linguistics. In order to build reliable models of syntactic change, we aim to eventually parse the 1.5-billion-word Early English Books Online (EEBO) corpus. But since EEBO is not yet parsed, we begin by constructing and testing a parser on the 1.7-million-word Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME). In order to obtain robust results, we define an 8-fold split on PPCEME. We then evaluate the parser with evalb and, more relevantly for us, with a task-specific metric, namely its accuracy in parsing 6 sentence types necessary to track the rise of auxiliary do (as in They did not come vs. its historical precursor They came not). Retrieving the relevant sentences from the gold and test versions with CorpusSearch queries, we find that the parser's accuracy promises to be sufficient for our purposes. A remaining concern is the variability of the output, which we plan to address with three pieces of future work sketched in the conclusion.
Linguistic argumentation and logic: An alternative method approach in Arabic grammar
The paper highlights the relationship between linguistic argumentation and logic. Linguistic argumentation is a linguistic system that uses the meaning of expressions arranged in sentences to outline the full meaning of those sentences, since it is within sentences that the dependencies between expressions are constituted. Indeed, this connection between expressions strengthens the overall meaning, starting from the very foundations of sentence structure, in a logical connection of ideas. Within it lies the relation between words and the mind, which depends on the logic of interrelated utterances. To underline the significance of this way of thinking, the author turns to the theory of early Arabic grammar, which focuses on analogy rather than anomaly. The system's orientation toward analogy rests on a basic theory that implies the relation mentioned above, although some contemporary accounts reject this interpretation. In summarizing her analysis, the author considers similar theories of Latin grammar, which show a logical orientation that follows from the connection between linguistic argumentation and logic. In conclusion, the author states that the connection between words and logic proves to be a universal concept.
Concurrent Lexicalized Dependency Parsing: A Behavioral View on ParseTalk Events
The behavioral specification of an object-oriented grammar model is
considered. The model is based on full lexicalization, head-orientation via
valency constraints and dependency relations, inheritance as a means for
non-redundant lexicon specification, and concurrency of computation. The
computation model relies upon the actor paradigm, with concurrency entering
through asynchronous message passing between actors. In particular, we here
elaborate on principles of how the global behavior of a lexically distributed
grammar and its corresponding parser can be specified in terms of event type
networks and event networks, respectively.
Comment: 68kB, 5 pages, PostScript
A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them.
Comment: 44 pages, to appear in Computational Linguistics
Concurrent Lexicalized Dependency Parsing: The ParseTalk Model
A grammar model for concurrent, object-oriented natural language parsing is
introduced. Complete lexical distribution of grammatical knowledge is achieved
building upon the head-oriented notions of valency and dependency, while
inheritance mechanisms are used to capture lexical generalizations. The
underlying concurrent computation model relies upon the actor paradigm. We
consider message passing protocols for establishing dependency relations and
ambiguity handling.
Comment: 90kB, 7 pages, PostScript