    Discovery of Linguistic Relations Using Lexical Attraction

    This work has been motivated by two long-term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models, named lexical attraction models, which can represent long-distance relations between words, and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in, and the only input is raw text. Learning and processing are interdigitated. The processor uses the regularities detected by the learner to impose structure on the input. This structure enables the learner to detect higher-level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as "I saw the Statue of Liberty flying over New York." Comment: dissertation, 56 pages
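
    The abstract above says only that lexical attraction is formalized using information theory; it does not give the formula. As a rough illustration, the sketch below scores ordered word pairs by pointwise mutual information (PMI), a standard association measure that is assumed here as a stand-in and is not taken from the dissertation. The function name `pmi_table`, the `window` parameter, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pmi_table(sentences, window=5):
    """Return PMI in bits for ordered word pairs seen within `window` positions.

    Illustrative stand-in for a lexical attraction score (an assumption,
    not the dissertation's actual definition).
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        word_counts.update(tokens)
        # Count each (left, right) pair where right follows left within the window.
        for i, left in enumerate(tokens):
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    table = {}
    for (left, right), n in pair_counts.items():
        p_pair = n / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        # PMI compares the observed pair probability with the independence baseline.
        table[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return table


# Toy corpus, invented for illustration only.
corpus = [
    "the statue of liberty",
    "the statue of liberty",
    "the bird was flying",
    "a plane was flying over new york",
]
scores = pmi_table(corpus)
for pair in [("new", "york"), ("statue", "liberty"), ("the", "of")]:
    print(pair, round(scores[pair], 2))
```

    On this toy corpus the pairs of content words score higher than the pair of function words, which is the kind of signal an unsupervised learner could use to prefer one candidate relation over another. A real system would need far more data and a principled way to select non-conflicting links within a sentence.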

    Learning Language from a Large (Unannotated) Corpus

    A novel approach to the fully automated, unsupervised extraction of dependency grammars and associated syntax-to-semantic-relationship mappings from large text corpora is described. The suggested approach builds on the authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well as on a number of prior papers and approaches from the statistical language learning literature. If successful, this approach would enable the mining of all the information needed to power a natural language comprehension and generation system, directly from a large, unannotated corpus. Comment: 29 pages, 5 figures, research proposal

    The polysemy of the Spanish verb sentir: a behavioral profile analysis

    This study investigates the intricate polysemy of the Spanish perception verb sentir (‘feel’) which, analogous to the more-studied visual perception verbs ver (‘see’) and mirar (‘look’), also displays an ample gamut of semantic uses in various syntactic environments. The investigation is based on a corpus-based behavioral profile (BP) analysis. Besides its methodological merits as a quantitative, systematic and verifiable approach to the study of meaning and to polysemy in particular, the BP analysis offers qualitative usage-based evidence for cognitive linguistic theorizing. With regard to the polysemy of sentir, the following questions were addressed: (1) What is the prototype of each cluster of senses? (2) How are the different senses structured: how many senses should be distinguished, i.e. which senses cluster together and which senses should be kept separate? (3) Which senses are more related to each other and which are highly distinguishable? (4) What morphosyntactic variables make them more or less distinguishable? The results show that two significant meaning clusters can be distinguished, which coincide with the division between the middle voice uses (sentirse) and the other uses (sentir). Within these clusters, a number of meaningful subclusters emerge, which seem to coincide largely with the more general semantic categories of physical, cognitive and emotional perception.

    Construction of a Phaffia rhodozyma strain with enhanced astaxanthin synthesis through targeted genetic modification of chemically mutagenized strains

    The aim of this work was to produce, for the first time, a Phaffia strain through a combination of chemical mutagenesis and targeted genetic modification (here: "metabolic engineering") whose astaxanthin synthesis is enhanced beyond the level achieved by mutagenesis alone. The chemical mutants provided by "DSM Nutritional Products" were analyzed and optimized for pigment stability and growth through a selection process, because the strains recovered from cryopreserved long-term cultures showed strong pigment instability and delayed growth. In an exploratory phase, carotenoid synthesis was analyzed, and it was found that no single reactions are affected in the mutants that would account for the upregulation of carotenoid synthesis. In the process, limitations were identified and then removed by transformation with expression plasmids carrying suitable genes, in order to achieve an even more efficient conversion of astaxanthin precursors into astaxanthin. Overexpression of the phytoene synthase/lycopene cyclase crtYB resulted in an increased carotenoid content with an unchanged proportion of astaxanthin. A second transformation with an expression cassette for the astaxanthin synthase asy further increased the carotenoid content and additionally removed a limitation in the metabolism of astaxanthin precursors, so that the transformant was able to convert nearly all intermediates of astaxanthin synthesis into astaxanthin (Gassel et al. 2013). It was shown that limitations known from experiments with the wild type could also be identified and compensated for in the mutants.

    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    Language Matters: A Guide To Everyday Questions About Language

    Is Ebonics really a dialect or simply bad English? Do women and men speak differently? Will computers ever really learn human language? Does offensive language harm children? These are only a few of the issues surrounding language that crop up every day. Most of us have very definite opinions on these questions one way or another. Yet as linguists Donna Jo Napoli and Vera Lee-Schoenfeld point out in this short and thoroughly readable volume, many of our most deeply held ideas about the nature of language and its role in our lives are either misconceived or influenced by myths and stereotypes. Language Matters provides a highly informative tour of the world of language, examining these and other vexing and controversial language-related questions. Throughout, Napoli and Lee-Schoenfeld encourage and lead the reader to use common sense and everyday experience rather than preconceived notions or technical linguistic expertise. Both their questions and their conclusions are surprising, sometimes provocative, and always entertaining. This thoroughly revised second edition updates the book with a new coauthor and includes new chapters on language and power, language extinction, and what it is linguists actually do. Language Matters is sure to engage both general readers and students of language and linguistics at any level.