Discovery of Linguistic Relations Using Lexical Attraction
This work has been motivated by two long term goals: to understand how humans
learn language and to build programs that can understand language. Using a
representation that makes the relevant features explicit is a prerequisite for
successful learning and understanding. Therefore, I chose to represent
relations between individual words explicitly in my model. Lexical attraction
is defined as the likelihood of such relations. I introduce a new class of
probabilistic language models named lexical attraction models which can
represent long distance relations between words and I formalize this new class
of models using information theory.
Within the framework of lexical attraction, I developed an unsupervised
language acquisition program that learns to identify linguistic relations in a
given sentence. The only explicitly represented linguistic knowledge in the
program is lexical attraction. There is no initial grammar or lexicon built in
and the only input is raw text. Learning and processing are interdigitated. The
processor uses the regularities detected by the learner to impose structure on
the input. This structure enables the learner to detect higher level
regularities. Using this bootstrapping procedure, the program was trained on
100 million words of Associated Press material and was able to achieve 60%
precision and 50% recall in finding relations between content-words. Using
knowledge of lexical attraction, the program can identify the correct relations
in syntactically ambiguous sentences such as "I saw the Statue of Liberty
flying over New York."
Comment: dissertation, 56 pages
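The abstract above says lexical attraction is formalized with information theory but does not give the formula. As a rough illustration only, the sketch below assumes pointwise mutual information over sentence-level co-occurrence as a stand-in for the attraction between two words. The function name `pmi_table` and the toy corpus are hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(sentences):
    """Estimate pointwise mutual information (in bits) for word pairs.

    Assumption: two words "co-occur" when they appear in the same sentence.
    This is a simplification chosen for illustration, not the dissertation's
    model of relations between words.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        # Count each word at most once per sentence, in a fixed order,
        # so that every pair has a single canonical key (a, b) with a < b.
        unique = sorted(set(sentence.lower().split()))
        word_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    n = len(sentences)
    table = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        table[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return table


if __name__ == "__main__":
    corpus = [
        "new york is large",
        "new york is busy",
        "the statue is large",
        "the city is busy",
    ]
    scores = pmi_table(corpus)
    print(scores[("new", "york")])  # pair that always appears together
    print(scores[("is", "new")])    # "is" appears in every sentence
```

In this toy corpus "new" and "york" receive a positive score because they always appear together, while "is" scores zero with "new" because it occurs in every sentence and so carries no information about its neighbors.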
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.
Comment: 29 pages, 5 figures, research proposal
The polysemy of the Spanish verb sentir: a behavioral profile analysis
This study investigates the intricate polysemy of the Spanish perception verb sentir (‘feel’) which, like the more widely studied visual perception verbs ver (‘see’) and mirar (‘look’), displays a wide range of semantic uses in various syntactic environments. The investigation is based on a corpus-based behavioral profile (BP) analysis. Besides its methodological merits as a quantitative, systematic and verifiable approach to the study of meaning and to polysemy in particular, the BP analysis offers qualitative usage-based evidence for cognitive linguistic theorizing. With regard to the polysemy of sentir, the following questions were addressed: (1) What is the prototype of each cluster of senses? (2) How are the different senses structured: how many senses should be distinguished, i.e. which senses cluster together and which senses should be kept separate? (3) Which senses are more related to each other and which are highly distinguishable? (4) What morphosyntactic variables make them more or less distinguishable? The results show that two significant meaning clusters can be distinguished, which coincide with the division between the middle voice uses (sentirse) and the other uses (sentir). Within these clusters, a number of meaningful subclusters emerge, which seem to coincide largely with the more general semantic categories of physical, cognitive and emotional perception.
Production of a Phaffia rhodozyma strain with enhanced astaxanthin synthesis via targeted genetic modification of chemically mutagenized strains
The aim of this work was to produce, for the first time, a Phaffia strain whose astaxanthin synthesis is enhanced beyond the level reached by mutagenesis alone, by combining chemical mutagenesis with targeted genetic modification (here: "metabolic engineering").
The chemical mutants provided by DSM Nutritional Products were analyzed and optimized for pigment stability and growth through a selection process, because the strains recovered from cryopreserved stock cultures showed strong pigment instability and delayed growth.
In an exploratory phase, carotenoid synthesis was analyzed, and it was found that no single reactions are affected in the mutants that would account for the upregulation of carotenoid synthesis. Limitations were identified in the process and removed by transformation with expression plasmids carrying suitable genes, in order to achieve an even more efficient conversion of astaxanthin precursors to astaxanthin. Overexpression of the phytoene synthase/lycopene cyclase crtYB resulted in an increased carotenoid content with an unchanged astaxanthin share. A second transformation with an expression cassette for the astaxanthin synthase asy further increased the carotenoid content and additionally removed a limitation in the metabolism of astaxanthin precursors, so that the transformant was able to convert nearly all intermediates of astaxanthin synthesis to astaxanthin (Gassel et al. 2013). It was shown that limitations known from experiments with the wild type could also be identified and compensated for in the mutants.
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.
Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201
Language Matters: A Guide To Everyday Questions About Language
Is Ebonics really a dialect or simply bad English? Do women and men speak differently? Will computers ever really learn human language? Does offensive language harm children? These are only a few of the issues surrounding language that crop up every day. Most of us have very definite opinions on these questions one way or another. Yet as linguists Donna Jo Napoli and Vera Lee-Schoenfeld point out in this short and thoroughly readable volume, many of our most deeply held ideas about the nature of language and its role in our lives are either misconceived or influenced by myths and stereotypes. Language Matters provides a highly informative tour of the world of language, examining these and other vexing and controversial language-related questions. Throughout, Napoli and Lee-Schoenfeld encourage and lead the reader to use common sense and everyday experience rather than preconceived notions or technical linguistic expertise. Both their questions and their conclusions are surprising, sometimes provocative, and always entertaining. This thoroughly revised second edition updates the book with a new coauthor and includes new chapters on language and power, language extinction, and what it is linguists actually do. Language Matters is sure to engage both general readers and students of language and linguistics at any level.