
    Children as Models for Computers: Natural Language Acquisition for Machine Learning

    This paper focuses on a subfield of machine learning, the so-called grammatical inference. Roughly speaking, grammatical inference deals with the problem of inferring a grammar that generates a given set of sample sentences in some manner that is supposed to be realized by some inference algorithm. We discuss how the analysis and formalization of the main features of the process of human natural language acquisition may improve results in the area of grammatical inference.

    Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources

    The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach.

    Information retrieval and text mining technologies for chemistry

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field. A.V. and M.K. acknowledge funding from the European Community’s Horizon 2020 Program (project reference: 654021 - OpenMinted). M.K. additionally acknowledges the Encomienda MINETAD-CNIO as part of the Plan for the Advancement of Language Technology. O.R. and J.O. thank the Foundation for Applied Medical Research (FIMA), University of Navarra (Pamplona, Spain). 
This work was partially funded by Consellería de Cultura, Educación e Ordenación Universitaria (Xunta de Galicia), and FEDER (European Union), and the Portuguese Foundation for Science and Technology (FCT) under the scope of the strategic funding of UID/BIO/04469/2013 unit and COMPETE 2020 (POCI-01-0145-FEDER-006684). We thank Iñigo García-Yoldi for useful feedback and discussions during the preparation of the manuscript.

    Proceedings of the Conference on Natural Language Processing 2010

    This book contains state-of-the-art contributions to the 10th conference on Natural Language Processing, KONVENS 2010 (Konferenz zur Verarbeitung natürlicher Sprache), with a focus on semantic processing. KONVENS in general aims at offering a broad perspective on current research and developments within the interdisciplinary field of natural language processing. The central theme draws specific attention to linguistic aspects of meaning, covering deep as well as shallow approaches to semantic processing. The contributions address both knowledge-based and data-driven methods for modelling and acquiring semantic information, and discuss the role of semantic information in applications of language technology. The articles demonstrate the importance of semantic processing, and present novel and creative approaches to natural language processing in general. Some contributions focus on developing and improving NLP systems for tasks like Named Entity Recognition or Word Sense Disambiguation, on semantic knowledge acquisition and exploitation with respect to collaboratively built resources, or on harvesting semantic information in virtual games. Others are set within the context of real-world applications, such as Authoring Aids, Text Summarisation and Information Retrieval. The collection highlights the importance of semantic processing for different areas and applications in Natural Language Processing, and provides the reader with an overview of current research in this field.

    Open, Online, Calculus Help Forums: Learning About and From a Public Conversation

    This study is an exploration of participation, community, and mathematical understanding in an open, online, calculus help forum. These forums, populated by members from around the world, are locations where students post queries from their coursework and receive assistance from volunteer tutors. The site under investigation has a spontaneous participation structure, meaning that any forum member can respond to a query and contribute to an ongoing discussion. From earlier work, we know that such forums foster mathematical dialogue, contain exchanges with sophisticated pedagogical moves, and exhibit a strong sense of community. In this study, we delve deeper into the functional aspects of activity (such as student positioning and pedagogical moves), the benefits that accrue from participation in tutoring as a communal activity, and the mathematical understanding that is evident in the way problems on limits and related rates are framed and solutions constructed. Based on an observational methodology, we find that the forum provides tutoring for students and support for tutors that differs from what we expect of other learning environments, such as one-on-one tutoring and computer-based tutoring systems. Students position themselves with authority in the exchanges by making assertions and proposals of action, questioning or challenging others' proposals, and indicating when resolution has been achieved. Tutors, who generally have more experience and expertise than students, provide mathematical guidance, and, in exemplary exchanges, draw the student into making a mathematical discovery. The dedication of tutors to the forum community was evident in the presence of authentic, honest mathematical practices, in the generous provision of alternative perspectives on problems, and in the sincere correction of errors. Some student participants picked up on these aspects of community and expressed excitement and appreciation for this taste of mathematical discourse. 
The primary contribution of the tutors was their assistance in supporting students as they constructed productive framings for the exercises, and this was the help that students most needed. Having eavesdropped on this public conversation, we conclude that educational researchers, teachers, and designers of tutoring systems should listen to it as well.

    Mining Meaning from Wikipedia

    Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. Comment: An extensive survey of re-using information in Wikipedia in natural language processing, information retrieval and extraction and ontology building. Accepted for publication in International Journal of Human-Computer Studies.

    Tree Transducers and Formal Methods (Dagstuhl Seminar 13192)

    The aim of this Dagstuhl Seminar was to bring together researchers from various research areas related to the theory and application of tree transducers. Recently, interest in tree transducers has been revived due to surprising new applications in areas such as XML databases, security verification, programming language theory, and linguistics. This seminar therefore aimed to inspire the exchange of theoretical results and information regarding the practical requirements related to tree transducers.