20 research outputs found

    A Robust Parsing Algorithm For Link Grammars

    In this paper we present a robust parsing algorithm based on the link grammar formalism for parsing natural languages. Our algorithm is a natural extension of the original dynamic programming recognition algorithm, which recursively counts the number of linkages between two words in the input sentence. The modified algorithm uses the notion of a null link in order to allow a connection between any pair of adjacent words, regardless of their dictionary definitions. The algorithm proceeds by making three dynamic programming passes. In the first pass, the input is parsed using the original algorithm, which enforces the constraints on links to ensure grammaticality. In the second pass, the total cost of each substring of words is computed, where cost is determined by the number of null links necessary to parse the substring. The final pass counts the total number of parses with minimal cost. All of the original pruning techniques have natural counterparts in the robust algorithm. When used together with memoization, these techniques enable the algorithm to run efficiently with cubic worst-case complexity. We have implemented these ideas and tested them by parsing the Switchboard corpus of conversational English. This corpus comprises approximately three million words of text, corresponding to more than 150 hours of transcribed speech collected from telephone conversations restricted to 70 different topics. Although only a small fraction of the sentences in this corpus are "grammatical" by standard criteria, the robust link grammar parser is able to extract relevant structure for a large portion of the sentences. We present the results of our experiments using this system, including the analyses of selected and random sentences from the corpus. Comment: 17 pages, compressed postscript
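The cost-then-count idea in the abstract above can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: real link grammar parsing works over connector lists and non-adjacent links, whereas this toy only splits a word sequence into chunks that a hypothetical dictionary accepts, charges one null link for each join between adjacent chunks, and folds the cost and counting passes into a single memoized recursion. The names `min_null_links`, `is_grammatical`, and `TOY_CHUNKS` are assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical "grammatical" multi-word chunks; a real parser would consult
# a link grammar dictionary instead of a fixed set.
TOY_CHUNKS = {
    ("i", "saw"),
    ("i", "saw", "it"),
    ("you", "know"),
    ("it", "you", "know"),
}


def is_grammatical(chunk):
    """Toy predicate: any single word, or a listed chunk, counts as parseable."""
    return len(chunk) == 1 or chunk in TOY_CHUNKS


def min_null_links(words, accepts=is_grammatical):
    """Return (minimal number of null links, number of minimal-cost splits)."""
    words = tuple(words)
    n = len(words)

    @lru_cache(maxsize=None)
    def best(start):
        # (cost, count) for the suffix words[start:]
        if start == n:
            return (0, 1)
        best_cost, count = None, 0
        for end in range(start + 1, n + 1):
            if not accepts(words[start:end]):
                continue
            rest_cost, rest_count = best(end)
            # One null link joins this chunk to the next, unless it is the last chunk.
            cost = rest_cost + (1 if end < n else 0)
            if best_cost is None or cost < best_cost:
                best_cost, count = cost, rest_count
            elif cost == best_cost:
                count += rest_count
        return (best_cost, count)

    return best(0)


print(min_null_links("um i saw it you know".split()))  # (2, 2)
```

Here the two minimal-cost analyses are "um | i saw it | you know" and "um | i saw | it you know", each needing two null links. Memoizing on the start index keeps the number of subproblems linear in this toy; the cubic bound quoted in the abstract applies to the full link grammar algorithm, not to this sketch.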

    On Parsing CHILDES

    Research on child language acquisition would benefit from the availability of a large body of syntactically parsed utterances between parents and children. We consider the problem of generating such a "treebank" from the CHILDES corpus, which currently contains primarily orthographically transcribed speech tagged for lexical category.

    Real time multimodal interaction with animated virtual human

    This paper describes the design and implementation of a real-time animation framework in which an animated virtual human is capable of performing multimodal interactions with a human user. The animation system consists of several functional components, namely perception, behaviour generation, and motion generation. The virtual human agent in the system has a complex underlying geometric structure with multiple degrees of freedom (DOFs). It relies on a virtual perception system to capture information from its environment and responds to the human user's commands with a combination of non-verbal behaviours, including co-verbal gestures, posture, and body motions, and simple utterances. A language processing module is incorporated to interpret the user's commands. In particular, an efficient motion generation method has been developed that combines motion-captured data with parameterized actions generated in real time to produce variations in the agent's behaviours depending on its momentary emotional state.

    Extracting human protein information from MEDLINE using a full-sentence parser

    Today, a fair number of systems are available for the task of processing biological data. The development of effective systems is of great importance since they can support both the research and the everyday work of biologists. It is well known that biological databases are large both in size and number; hence data processing technologies are required for the fast and effective management of the contents stored in databases like MEDLINE. A possible solution for content management is the application of natural language processing methods to help make this task easier. With our approach we would like to learn more about the interactions of human genes using full-sentence parsing. Given a sentence, the syntactic parser assigns to it a syntactic structure, which consists of a set of labelled links connecting pairs of words. The parser also produces a constituent representation of a sentence (showing noun phrases, verb phrases, and so on). Here we show experimentally that, using the syntactic information of each abstract, the biological interactions of genes can be predicted. Hence, it is worth developing the kind of information extraction (IE) system that can retrieve information about gene interactions just by using the syntactic information contained in these texts. Our IE system can handle certain types of gene interactions with the help of machine learning (ML) methodologies (Hidden Markov Models, Artificial Neural Networks, Decision Trees, Support Vector Machines). The experiments and practical usage show clearly that our system can provide a useful intuitive guide for biological researchers in their investigations and in the design of their experiments.

    e-Learning for English Based on Multimedia Database and Internet

    In this era of Internet delivery, learning over the Internet is becoming popular and can make teaching more efficient. This paper presents an Internet-based distance learning system for English, called the "multimedia English corpus", built on multimedia database and Internet technologies. It provides two major learning functions. The first offers databases of English articles, dialogs, and videos: a learner can practise English writing, reading, and listening by connecting to the corpus server from a Web browser. The system applies "semantic query" and "link grammar annotation", which raise queries from the keyword-based and content-based level to the semantic level; both techniques were used to construct the English multimedia corpus system. The main function of the system is to query English sentence patterns by keyword from the English multimedia corpus. The second function detects grammar errors in sentences written by students. It not only helps learners find their English grammar mistakes but also lets teachers identify the most frequent mistakes made by learners through the records of the corpus.

    Peut-on évaluer les outils d'acquisition de connaissances à partir de textes ?

    Despite years of hindsight and accumulated experience, it is difficult to get a clear picture of how far research on knowledge acquisition from texts has progressed. The lack of evaluation protocols does not make it easier to compare results. In this article, we examine the question of evaluating terminology and ontology acquisition tools, highlighting the main difficulties and describing our first proposals in this area.

    Ontologies Supporting Intelligent Agent-Based Assistance

    Intelligent agent-based assistants are systems that try to simplify people's computer-based work. Recent research on intelligent assistance has produced significant results in a variety of situations. Building such a system is a difficult task that requires expertise in numerous artificial intelligence and engineering disciplines. A key point in this kind of system is knowledge handling. The use of ontologies for representing domain knowledge and for supporting reasoning is becoming widespread in many areas, including intelligent assistance. In this paper we present how ontologies can be used to support intelligent assistance in a multi-agent system context. We show how ontologies may be spread over the multi-agent system architecture, highlighting their role in controlling user interaction and service description. We present in detail an ontology-based conversational interface for personal assistants, showing how to design an ontology for semantic interpretation and how the interpretation process uses it for semantic analysis. We also present how ontologies are used to describe decentralized services based on a multi-agent architecture.