6,683 research outputs found

    Unsupervised Extraction of Representative Concepts from Scientific Literature

    Full text link
    This paper studies the automated categorization and extraction of scientific concepts from titles of scientific articles, in order to gain a deeper understanding of their key contributions and facilitate the construction of a generic academic knowledgebase. Towards this goal, we propose an unsupervised, domain-independent, and scalable two-phase algorithm to type and extract key concept mentions into aspects of interest (e.g., Techniques, Applications, etc.). In the first phase of our algorithm we propose PhraseType, a probabilistic generative model which exploits textual features and limited POS tags to broadly segment text snippets into aspect-typed phrases. We extend this model to simultaneously learn aspect-specific features and identify academic domains in multi-domain corpora, since the two tasks mutually enhance each other. In the second phase, we propose an approach based on adaptor grammars to extract fine grained concept mentions from the aspect-typed phrases without the need for any external resources or human effort, in a purely data-driven manner. We apply our technique to study literature from diverse scientific domains and show significant gains over state-of-the-art concept extraction techniques. We also present a qualitative analysis of the results obtained.Comment: Published as a conference paper at CIKM 201

    Review of Paul M. Churchland, Plato's Camera

    Get PDF

    Bibliometric Survey on Incremental Learning in Text Classification Algorithms for False Information Detection

    Get PDF
    The false information or misinformation over the web has severe effects on people, business and society as a whole. Therefore, detection of misinformation has become a topic of research among many researchers. Detecting misinformation of textual articles is directly connected to text classification problem. With the massive and dynamic generation of unstructured textual documents over the web, incremental learning in text classification has gained more popularity. This survey explores recent advancements in incremental learning in text classification and review the research publications of the area from Scopus, Web of Science, Google Scholar, and IEEE databases and perform quantitative analysis by using methods such as publication statistics, collaboration degree, research network analysis, and citation analysis. The contribution of this study in incremental learning in text classification provides researchers insights on the latest status of the research through literature survey, and helps the researchers to know the various applications and the techniques used recently in the field

    Scientific Information Extraction with Semi-supervised Neural Tagging

    Full text link
    This paper addresses the problem of extracting keyphrases from scientific articles and categorizing them as corresponding to a task, process, or material. We cast the problem as sequence tagging and introduce semi-supervised methods to a neural tagging model, which builds on recent advances in named entity recognition. Since annotated training data is scarce in this domain, we introduce a graph-based semi-supervised algorithm together with a data selection scheme to leverage unannotated articles. Both inductive and transductive semi-supervised learning strategies outperform state-of-the-art information extraction performance on the 2017 SemEval Task 10 ScienceIE task.Comment: accepted by EMNLP 201
    corecore