
    Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly

    Recent years have seen considerable success in the generation of automatically obtained wide-coverage deep grammars for natural language processing, given reliable and large CFG-like treebanks. For research within the Lexical Functional Grammar (LFG) framework, these deep grammars are typically based on an extended PCFG parsing scheme from which dependencies are extracted. However, increasing success in statistical dependency parsing suggests that such deep grammar approaches to statistical parsing could be streamlined. In this paper we explore this novel approach to deep grammar parsing within the LFG framework for French, showing that the established integrated architecture obtains the best results for French (an f-score of 69.46).
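    As an illustration of the evaluation metric reported above, here is a minimal sketch (not from the paper) of a dependency f-score computed over gold and predicted dependency triples; the triple format and example data are assumptions:

        # Sketch: f-score as the harmonic mean of precision and recall over
        # (head, label, dependent) dependency triples.
        def f_score(gold: set, predicted: set) -> float:
            if not gold or not predicted:
                return 0.0
            correct = len(gold & predicted)
            precision = correct / len(predicted)
            recall = correct / len(gold)
            if precision + recall == 0:
                return 0.0
            return 2 * precision * recall / (precision + recall)

        gold = {("saw", "SUBJ", "I"), ("saw", "OBJ", "statue")}
        predicted = {("saw", "SUBJ", "I"), ("statue", "MOD", "liberty")}
        print(f_score(gold, predicted))  # 0.5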

    Patterns in syntactic dependency networks

    Many languages are spoken on Earth. Despite their diversity, many robust language universals are known to exist. All languages share syntax, i.e., the ability to combine words to form sentences. The origin of such traits is an issue of open debate. By using recent developments from the statistical physics of complex networks, we show that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns, such as the small-world phenomenon, scaling in the distribution of degrees, and disassortative mixing. Such previously unreported features of syntax organization are not a trivial consequence of the structure of sentences, but an emergent trait at the global scale.
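    A minimal sketch (not the authors' code) of the three statistics the abstract refers to, assuming the networkx library and using a random graph as a stand-in for a real syntactic dependency network:

        # Sketch: small-world indicators (clustering, path length), the degree
        # distribution, and degree assortativity (negative = disassortative mixing).
        import networkx as nx
        from collections import Counter

        # Erdos-Renyi graph as a placeholder for a dependency network
        G = nx.erdos_renyi_graph(1000, 0.01, seed=0)

        clustering = nx.average_clustering(G)
        # average_shortest_path_length requires a connected graph, so restrict
        # the computation to the giant component
        giant = G.subgraph(max(nx.connected_components(G), key=len))
        path_length = nx.average_shortest_path_length(giant)
        degree_distribution = Counter(d for _, d in G.degree())
        assortativity = nx.degree_assortativity_coefficient(G)

        print(f"C = {clustering:.3f}, L = {path_length:.2f}, r = {assortativity:.3f}")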

    How do individual cognitive differences relate to acceptability judgments?: A reply to Sprouse, Wagers, and Phillips

    Sprouse, Wagers, and Phillips (2012) carried out two experiments in which they measured individual differences in memory to test processing accounts of island effects. They found that these individual differences failed to predict the magnitude of island effects, and they construed these findings as counterevidence to processing-based accounts of island effects. Here, we take up several problems with their methods, their findings, and their conclusions. First, the arguments against processing accounts are based on null results using tasks that may be ineffective or inappropriate measures of working memory (the n-back and serial-recall tasks). The authors provide no evidence that these two measures predict judgments for other constructions that are difficult to process and yet are clearly grammatical. They assume that other measures of working memory would have yielded the same result, but provide no justification that they should. We further show that whether a working-memory measure relates to judgments of grammatical, hard-to-process sentences depends on how difficult the sentences are. In this light, the stimuli used by the authors present processing difficulties other than the island violations under investigation and may have been particularly hard to process. Second, the Sprouse et al. results are statistically in line with the hypothesis that island sensitivity varies with working memory. Three out of the four island types in their Experiment 1 show a significant relation between memory scores and island sensitivity, but the authors discount these findings on the grounds that the variance accounted for is too small to have much import. This interpretation, however, runs counter to standard practices in linguistics, psycholinguistics, and psychology.
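    A minimal sketch (not the authors' analysis) of the kind of test at issue: one common formulation of a per-participant island-sensitivity score (a differences-in-differences score over a 2x2 judgment design) correlated with a working-memory score. All names and data values below are invented placeholders; assumes scipy:

        # Sketch: does a memory score predict the size of the island effect?
        from scipy.stats import pearsonr

        def island_dd(z):
            # DD score: the extra penalty for a long dependency out of an
            # island, beyond the independent costs of length and island
            # structure. z maps conditions to mean z-scored judgments.
            return ((z["non_island_long"] - z["island_long"])
                    - (z["non_island_short"] - z["island_short"]))

        # (judgments, memory span) per participant -- placeholder data
        participants = [
            ({"non_island_short": 0.9, "island_short": 0.7,
              "non_island_long": 0.5, "island_long": -0.8}, 4.0),
            ({"non_island_short": 0.8, "island_short": 0.6,
              "non_island_long": 0.4, "island_long": -0.2}, 6.5),
            ({"non_island_short": 1.0, "island_short": 0.8,
              "non_island_long": 0.6, "island_long": -0.5}, 5.0),
        ]
        dd_scores = [island_dd(judgments) for judgments, _ in participants]
        spans = [span for _, span in participants]
        r, p = pearsonr(spans, dd_scores)
        print(f"r = {r:.2f}, p = {p:.3f}")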

    Discovery of Linguistic Relations Using Lexical Attraction

    This work has been motivated by two long-term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models, named lexical attraction models, which can represent long-distance relations between words, and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in, and the only input is raw text. Learning and processing are interdigitated. The processor uses the regularities detected by the learner to impose structure on the input. This structure enables the learner to detect higher-level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as "I saw the Statue of Liberty flying over New York."
    Comment: dissertation, 56 pages
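    A minimal sketch (not Yuret's program) of the information-theoretic quantity behind lexical attraction: pointwise mutual information between co-occurring word pairs, estimated from raw text. The toy corpus is a placeholder:

        # Sketch: PMI(w1, w2) = log2 p(w1, w2) / (p(w1) p(w2)); higher values
        # indicate a stronger "attraction" between the two words.
        import math
        from collections import Counter
        from itertools import combinations

        sentences = [["i", "saw", "the", "statue", "of", "liberty"],
                     ["the", "statue", "stood", "in", "new", "york"]]

        word_counts = Counter(w for s in sentences for w in s)
        pair_counts = Counter(frozenset(p) for s in sentences
                              for p in combinations(s, 2) if p[0] != p[1])
        n_words = sum(word_counts.values())
        n_pairs = sum(pair_counts.values())

        def pmi(w1, w2):
            p_pair = pair_counts[frozenset((w1, w2))] / n_pairs
            return math.log2(p_pair / ((word_counts[w1] / n_words)
                                       * (word_counts[w2] / n_words)))

        print(pmi("statue", "liberty"))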

    Crossings as a side effect of dependency lengths

    The syntactic structure of sentences exhibits a striking regularity: dependencies tend not to cross when drawn above the sentence. We investigate two competing explanations. The traditional hypothesis is that this trend arises from an independent principle of syntax that reduces crossings practically to zero. An alternative to this view is the hypothesis that crossings are a side effect of dependency lengths, i.e. sentences with shorter dependency lengths should tend to have fewer crossings. We are able to reject the traditional view in the majority of languages considered. The alternative hypothesis can lead to a more parsimonious theory of language.
    Comment: the discussion section has been expanded significantly; in press in Complexity (Wiley)
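    A minimal sketch (not the paper's code) of the two quantities being compared: total dependency length and the number of crossings when dependencies are drawn as arcs above the sentence. Each dependency is assumed to be a (head position, dependent position) pair:

        def total_length(deps):
            # dependency length = linear distance between head and dependent
            return sum(abs(h - d) for h, d in deps)

        def crossings(deps):
            # two arcs cross iff exactly one endpoint of one arc lies
            # strictly inside the span of the other
            arcs = [tuple(sorted(e)) for e in deps]
            count = 0
            for i in range(len(arcs)):
                for j in range(i + 1, len(arcs)):
                    (a, b), (c, d) = arcs[i], arcs[j]
                    if a < c < b < d or c < a < d < b:
                        count += 1
            return count

        deps = [(2, 1), (2, 5), (4, 3), (5, 4)]  # a crossing-free (projective) tree
        print(total_length(deps), crossings(deps))  # 6 0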