    A Survey on Knowledge Graphs: Representation, Acquisition and Applications

    Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graphs covering research topics on 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs, and 4) knowledge-aware applications, and summarize recent breakthroughs and perspective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects: representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics, including meta relational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries on different tasks. In the end, we give a thorough outlook on several promising research directions.

    CrossNER: Evaluating Cross-Domain Named Entity Recognition

    Cross-domain named entity recognition (NER) models are able to cope with the scarcity of NER samples in target domains. However, most existing NER benchmarks lack domain-specialized entity types or do not focus on a certain domain, leading to a less effective cross-domain evaluation. To address these obstacles, we introduce a cross-domain NER dataset (CrossNER), a fully-labeled collection of NER data spanning five diverse domains with specialized entity categories for different domains. We also provide a domain-related corpus, since using it to continue pre-training language models (domain-adaptive pre-training) is effective for domain adaptation. We then conduct comprehensive experiments to explore the effectiveness of leveraging different levels of the domain corpus and pre-training strategies for domain-adaptive pre-training on the cross-domain task. Results show that focusing on the fractional corpus containing domain-specialized entities and utilizing a more challenging pre-training strategy in domain-adaptive pre-training are beneficial for NER domain adaptation, and our proposed method can consistently outperform existing cross-domain NER baselines. Nevertheless, experiments also illustrate the challenge of this cross-domain NER task. We hope that our dataset and baselines will catalyze research in the NER domain adaptation area. The code and data are available at https://github.com/zliucr/CrossNER. Comment: Accepted in AAAI-202

    A critical analysis of COVID-19 research literature: Text mining approach

    Objective: Among the stakeholders of COVID-19 research, clinicians in particular have difficulty keeping up with the deluge of SARS-CoV-2 literature while performing their much-needed clinical duties. By revealing major topics, this study proposes a text-mining approach as an alternative way to navigate large volumes of COVID-19 literature. Materials and methods: We obtained 85,268 references from the NIH COVID-19 Portfolio as of November 21. After exclusion based on inadequate abstracts, 65,262 articles remained in the final corpus. We utilized natural language processing to curate and generate the term list. We applied topic modeling analyses and multiple correspondence analyses to reveal the major topics and the associations among topics, journal countries, and publication sources. Results: In our text mining analyses of NIH’s COVID-19 Portfolio, we discovered two sets of eleven major research topics by analyzing abstracts and titles of the articles separately. The eleven major areas of COVID-19 research based on abstracts included the following topics: 1) Public Health, 2) Patient Care & Outcomes, 3) Epidemiologic Modeling, 4) Diagnosis and Complications, 5) Mechanism of Disease, 6) Health System Response, 7) Pandemic Control, 8) Protection/Prevention, 9) Mental/Behavioral Health, 10) Detection/Testing, 11) Treatment Options. Further analyses revealed that five (2, 3, 4, 5, and 9) of the eleven abstract-based topics showed a significant correlation (ranked from moderate to weak) with title-based topics. Conclusion: By offering a more dynamic, scalable, and responsive categorization of published literature, our study provides valuable insights to the stakeholders of COVID-19 research, particularly clinicians.

    COVID-19 datasets : a brief overview

    The outbreak of the COVID-19 pandemic has affected lives and socio-economic development around the world. The impact of the pandemic has motivated researchers from different domains to find effective solutions to diagnose, prevent, and estimate the pandemic and relieve its adverse effects. Numerous COVID-19 datasets have been built from these studies and are available to the public. These datasets can be used for disease diagnosis and case prediction, speeding up the solution of problems caused by the pandemic. To help researchers understand the various COVID-19 datasets, we examine them and provide an overview. We organise the majority of these datasets into three categories based on the category of applications, i.e., time-series, knowledge base, and media-based datasets. Organising COVID-19 datasets into appropriate categories can help researchers keep their focus on methodology rather than the datasets. In addition, applications and COVID-19 datasets suffer from a series of problems, such as privacy and quality. We discuss these issues as well as the potential of COVID-19 datasets. © 2022, ComSIS Consortium. All rights reserved.