681 research outputs found

    Discovery of Linguistic Relations Using Lexical Attraction

    This work has been motivated by two long term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models named lexical attraction models which can represent long distance relations between words and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in and the only input is raw text. Learning and processing are interdigitated. The processor uses the regularities detected by the learner to impose structure on the input. This structure enables the learner to detect higher level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content-words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as "I saw the Statue of Liberty flying over New York."
    Comment: dissertation, 56 pages
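    The abstract above says that lexical attraction is formalized with information theory but does not give the formula. As a rough illustration only, the sketch below assumes pointwise mutual information (in bits) as a stand-in measure and scores unordered word pairs by sentence-level co-occurrence. The function name `lexical_attraction`, the toy corpus, and the co-occurrence counting are all assumptions made for this example; the dissertation's model scores linked word pairs in a learned structure, which this sketch does not attempt.

```python
import math
from collections import Counter


def lexical_attraction(sentences):
    """Score each unordered word pair by pointwise mutual information (bits).

    Hypothetical helper: probabilities are estimated as the fraction of
    sentences that contain a word (or both words of a pair).
    """
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        words = set(sentence.lower().split())
        word_counts.update(words)
        # Count each unordered pair once per sentence.
        pair_counts.update((a, b) for a in words for b in words if a < b)

    n = len(sentences)
    scores = {}
    for (a, b), joint in pair_counts.items():
        # PMI = log2( p(a, b) / (p(a) * p(b)) ), with p = count / n.
        scores[(a, b)] = math.log2(joint * n / (word_counts[a] * word_counts[b]))
    return scores


# Toy corpus, invented for illustration.
corpus = [
    "the statue stood in new york",
    "a plane was flying over new york",
    "the plane landed",
]
scores = lexical_attraction(corpus)
# "new" and "york" always appear together, so they attract more strongly
# than "plane" and "the", which share only one of three sentences.
print(scores[("new", "york")] > scores[("plane", "the")])
```

    On a corpus this small the numbers mean little; the point is only that pairs which co-occur more often than their individual frequencies predict receive higher scores.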

    SHOE: The extraction of hierarchical structure for machine learning of natural language


    Human Simulations of Vocabulary Learning

    The work reported here experimentally investigates a striking generalization about vocabulary acquisition: Noun learning is superior to verb learning in the earliest moments of child language development. The dominant explanation of this phenomenon in the literature invokes differing conceptual requirements for items in these lexical categories: Verbs are cognitively more complex than nouns and so their acquisition must await certain mental developments in the infant. In the present work, we investigate an alternative hypothesis; namely, that it is the information requirements of verb learning, not the conceptual requirements, that crucially determine the acquisition order. Efficient verb learning requires access to structural features of the exposure language and thus cannot take place until a scaffolding of noun knowledge enables the acquisition of clause-level syntax. More generally, we experimentally investigate the hypothesis that vocabulary acquisition takes place via an incremental constraint-satisfaction procedure that bootstraps itself into successively more sophisticated linguistic representations which, in turn, enable new kinds of vocabulary learning. If the experimental subjects were young children, it would be difficult to distinguish between this information-centered hypothesis and the conceptual change hypothesis. Therefore the experimental learners are adults. The items to be “acquired” in the experiments were the 24 most frequent nouns and 24 most frequent verbs from a sample of maternal speech to 18- to 24-month-old infants. The various experiments ask about the kinds of information that will support identification of these words as they occur in mother-to-child discourse. In Experiment 1, subjects were required to identify the words from observing several extralinguistic contexts for their use (silent videos in which mothers are seen uttering the “mystery word” several times to the infants, with each such use cued by a beep or a nonsense word). The findings under these conditions mimicked the known learning trajectory for infants at the inception of speech and comprehension: Nouns are learned far more efficiently than verbs. Experiment 2 showed that the Experiment 1 results are best understood as concreteness differences that are correlated with lexical class membership in the common usage of mothers to young children. Experiment 3 presented (different) subject groups with 24 verbs under varying information Conditions; namely: (1) extralinguistic information; (2) noun-co-occurrence information; (3) both (1) and (2); (4) syntactic-frame information but with nouns and verbs represented by nonsense words; (5) both (2) and (4); (6) both (1) and (5). Each Condition led to greater identification success than the preceding Condition. Moreover, not only the number but the type of verb that was efficiently learned was different under the different information conditions. We discuss these results as consistent with the incremental construction of a highly lexicalized grammar by cognitively and pragmatically sophisticated human infants, but inconsistent with a procedure in which lexical acquisition is independent of and antecedent to syntax acquisition.

    The marker hypothesis: a constructivist theory of language acquisition

    This thesis presents a theory of the early stages of first language acquisition. Language is characterised as constituting an instructional environment: diachronic change in language serves to maintain and enhance sources of structural marking which act as salient cues that guide the development of linguistic representations in the child's brain. Language learning is characterised as a constructivist process in which the underlying grammatical representation and modular structure arise out of developmental processes. In particular, I investigate the role of closed-class elements in language which obtain salience through their high occurrence frequency and which serve to both label and segment useful grammatical units. I adopt an inter-disciplinary approach which encompasses analyses of child language and agrammatic speech, psycholinguistic data, the development of a developmental linguistic theory based on the Dependency Grammar formalism, and a number of computational investigations of spoken language corpora. I conclude that language development is highly interactionist and that in trying to understand the processes involved in learning we must begin with the child and not with the end-point of adult linguistic competence.

    Integration of Action and Language Knowledge: A Roadmap for Developmental Robotics

    This position paper proposes that the study of embodied cognitive agents, such as humanoid robots, can advance our understanding of the cognitive development of complex sensorimotor, linguistic, and social learning skills. This in turn will benefit the design of cognitive robots capable of learning to handle and manipulate objects and tools autonomously, to cooperate and communicate with other robots and humans, and to adapt their abilities to changing internal, environmental, and social conditions. Four key areas of research challenges are discussed, specifically for the issues related to the understanding of: 1) how agents learn and represent compositional actions; 2) how agents learn and represent compositional lexica; 3) the dynamics of social interaction and learning; and 4) how compositional action and language representations are integrated to bootstrap the cognitive system. The review of specific issues and progress in these areas is then translated into a practical roadmap based on a series of milestones. These milestones provide a possible set of cognitive robotics goals and test scenarios, thus acting as a research roadmap for future work on cognitive developmental robotics. Copyright IEEE. Peer reviewed.