A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs that represent structural relations between entities have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graphs, covering research topics on 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs,
and 4) knowledge-aware applications, and summarize recent breakthroughs and
prospective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized along four aspects: representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, we review embedding methods, path inference, and
logical rule reasoning. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries for different tasks. Finally,
we give a thorough outlook on several promising research directions.
CrossNER: Evaluating Cross-Domain Named Entity Recognition
Cross-domain named entity recognition (NER) models are able to cope with the
scarcity issue of NER samples in target domains. However, most of the existing
NER benchmarks lack domain-specialized entity types or do not focus on a
certain domain, leading to a less effective cross-domain evaluation. To address
these obstacles, we introduce a cross-domain NER dataset (CrossNER), a
fully-labeled collection of NER data spanning five diverse domains with
specialized entity categories for different domains. We also
provide a domain-related corpus, since using it to continue pre-training
language models (domain-adaptive pre-training) is effective for domain
adaptation. We then conduct comprehensive experiments to explore the
effectiveness of leveraging different levels of the domain corpus and
pre-training strategies to do domain-adaptive pre-training for the cross-domain
task. Results show that focusing on the fractional corpus containing
domain-specialized entities and utilizing a more challenging pre-training
strategy in domain-adaptive pre-training are beneficial for the NER domain
adaptation, and our proposed method can consistently outperform existing
cross-domain NER baselines. Nevertheless, experiments also illustrate the
challenge of this cross-domain NER task. We hope that our dataset and baselines
will catalyze research in the NER domain adaptation area. The code and data are
available at https://github.com/zliucr/CrossNER.
Comment: Accepted in AAAI-202
A critical analysis of COVID-19 research literature: Text mining approach
Objective: Among the stakeholders of COVID-19 research, clinicians particularly experience difficulty keeping up with the deluge of SARS-CoV-2 literature while performing their much-needed clinical duties. By revealing major topics, this study proposes a text-mining approach as an alternative to navigating large volumes of COVID-19 literature. Materials and methods: We obtained 85,268 references from the NIH COVID-19 Portfolio as of November 21. After exclusions based on inadequate abstracts, 65,262 articles remained in the final corpus. We utilized natural language processing to curate and generate the term list. We applied topic modeling analyses and multiple correspondence analyses to reveal the major topics and the associations among topics, journal countries, and publication sources. Results: In our text mining analyses of NIH’s COVID-19 Portfolio, we discovered two sets of eleven major research topics by analyzing abstracts and titles of the articles separately. The eleven major areas of COVID-19 research based on abstracts included the following topics: 1) Public Health, 2) Patient Care & Outcomes, 3) Epidemiologic Modeling, 4) Diagnosis and Complications, 5) Mechanism of Disease, 6) Health System Response, 7) Pandemic Control, 8) Protection/Prevention, 9) Mental/Behavioral Health, 10) Detection/Testing, 11) Treatment Options. Further analyses revealed that five (2, 3, 4, 5, and 9) of the eleven abstract-based topics showed a significant correlation (ranked from moderate to weak) with title-based topics. Conclusion: By offering a more dynamic, scalable, and responsive categorization of published literature, our study provides valuable insights to the stakeholders of COVID-19 research, particularly clinicians.
COVID-19 datasets: a brief overview
The outbreak of the COVID-19 pandemic affects lives and socio-economic development around the world. The impact of the pandemic has motivated researchers from different domains to find effective solutions to diagnose, prevent, and estimate the pandemic and relieve its adverse effects. Numerous COVID-19 datasets have been built from these studies and are available to the public. These datasets can be used for disease diagnosis and case prediction, speeding up the resolution of problems caused by the pandemic. To help researchers understand the various COVID-19 datasets, we examine and provide an overview of them. We organise the majority of these datasets into three categories based on the category of applications, i.e., time-series, knowledge base, and media-based datasets. Organising COVID-19 datasets into appropriate categories can help researchers keep their focus on methodology rather than the datasets. In addition, applications and COVID-19 datasets suffer from a series of problems, such as privacy and quality. We discuss these issues as well as the potential of COVID-19 datasets. © 2022, ComSIS Consortium. All rights reserved.