4,657 research outputs found
On Robustness and Bias Analysis of BERT-based Relation Extraction
Fine-tuned pre-trained models have achieved impressive performance on
standard natural language processing benchmarks. However, the generalizability
of the resulting models remains poorly understood. We do not know, for example,
whether excellent benchmark performance implies that a model generalizes well. In
this study, we analyze a fine-tuned BERT model from different perspectives
using relation extraction. We also characterize how different
generalization techniques behave with our proposed improvements. From
empirical experimentation, we find that BERT suffers bottlenecks in
robustness, as revealed by randomization, adversarial, and counterfactual tests, and
exhibits biases (i.e., selection and semantic biases). These findings highlight opportunities
for future improvement. Our open-sourced testbed DiagnoseRE is available at
\url{https://github.com/zjunlp/DiagnoseRE}.
Comment: work in progress
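For intuition, a randomization test of the kind described above can be sketched in a few lines. The snippet below is an illustrative stand-in, not the DiagnoseRE implementation: `predict` is any hypothetical relation classifier mapping a token list to a label, and the spans mark the two entities.

```python
import random

def shuffle_context(tokens, head_span, tail_span):
    """Randomization test: shuffle tokens outside the two entity spans.
    A robust relation classifier should not rely on the destroyed context."""
    protected = set(range(*head_span)) | set(range(*tail_span))
    ctx_idx = [i for i in range(len(tokens)) if i not in protected]
    ctx_tok = [tokens[i] for i in ctx_idx]
    random.shuffle(ctx_tok)
    out = list(tokens)
    for i, t in zip(ctx_idx, ctx_tok):
        out[i] = t
    return out

def robustness_drop(predict, examples, n_trials=5):
    """Fraction of originally-correct predictions that flip under shuffling."""
    flips, total = 0, 0
    for tokens, head, tail, gold in examples:
        if predict(tokens) != gold:
            continue  # only measure flips on examples the model got right
        total += 1
        for _ in range(n_trials):
            if predict(shuffle_context(tokens, head, tail)) != gold:
                flips += 1
                break
    return flips / max(total, 1)
```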
Structuring the Unstructured: Unlocking pharmacokinetic data from journals with Natural Language Processing
The development of a new drug is an increasingly expensive and inefficient process. Many drug candidates are discarded due to pharmacokinetic (PK) complications detected in clinical phases. It is critical to accurately estimate the PK parameters of new drugs before they are tested in humans, since these parameters will determine their efficacy and safety outcomes. Preclinical predictions of PK parameters are largely based on prior knowledge from other compounds, but much of this potentially valuable data is currently locked in the format of scientific papers. With an ever-increasing amount of scientific literature, automated systems are essential to exploit this resource efficiently. Developing text mining systems that can structure PK literature is critical to improving the drug development pipeline.
This thesis studied the development and application of text mining resources to accelerate the curation of PK databases. Specifically, the development of novel corpora and suitable natural language processing architectures in the PK domain were addressed. The work presented focused on machine learning approaches that can model the high diversity of PK studies, parameter mentions, numerical measurements, units, and contextual information reported across the literature. Additionally, architectures and training approaches that could efficiently deal with the scarcity of annotated examples were explored. The chapters of this thesis tackle the development of suitable models and corpora to (1) retrieve PK documents, (2) recognise PK parameter mentions, (3) link PK entities to a knowledge base and (4) extract relations between parameter mentions, estimated measurements, units and other contextual information. Finally, the last chapter of this thesis studied the feasibility of the whole extraction pipeline to accelerate tasks in drug development research.
The results from this thesis demonstrated the potential of text mining approaches to automatically generate PK databases that can aid researchers in the field and ultimately accelerate the drug development pipeline. Additionally, the thesis presented contributions to biomedical natural language processing by developing suitable architectures and corpora for multiple tasks, tackling novel entities and relations within the PK domain.
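As a toy illustration of the structure such a pipeline extracts, here is a minimal rule-based sketch for parameter mentions, measurements, and units. The thesis develops learned models; the vocabulary, unit list, and nearest-mention heuristic below are purely illustrative.

```python
import re

# Illustrative (non-exhaustive) PK parameter vocabulary and value-unit pattern.
PK_PARAMS = r"(clearance|half-life|AUC|Cmax|Tmax|volume of distribution)"
VALUE_UNIT = r"(\d+(?:\.\d+)?)\s*(mL/min|L/h|ng/mL|L/kg|h)"

def extract_pk(sentence: str):
    """Return (parameter, value, unit) triples found in one sentence."""
    params = [(m.group(1), m.start())
              for m in re.finditer(PK_PARAMS, sentence, re.I)]
    triples = []
    for m in re.finditer(VALUE_UNIT, sentence):
        # Naive relation heuristic: link each measurement to the nearest
        # preceding parameter mention.
        cands = [p for p in params if p[1] < m.start()]
        if cands:
            triples.append((cands[-1][0], float(m.group(1)), m.group(2)))
    return triples

print(extract_pk("The mean clearance was 4.8 mL/min and the half-life was 2.1 h."))
# [('clearance', 4.8, 'mL/min'), ('half-life', 2.1, 'h')]
```

Handling the real diversity of PK reporting is exactly what motivates the machine learning approaches studied in the thesis; rules like these break down quickly outside toy sentences.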
A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4
Large language models (LLMs) are a special class of pretrained language
models obtained by scaling up model size, pretraining corpus, and computation.
Because of their large size and pretraining on large volumes of text
data, LLMs exhibit special abilities that allow them to achieve remarkable
performance on many natural language processing tasks without any
task-specific training. The era of LLMs started with the OpenAI GPT-3 model, and the
popularity of LLMs has increased rapidly since the introduction of models
like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models,
including ChatGPT and GPT-4, as GPT-3 family large language models (GLLMs). With
the ever-rising popularity of GLLMs, especially in the research community,
there is a strong need for a comprehensive survey that summarizes the recent
research progress in multiple dimensions and can guide the research community
with insightful future research directions. We begin the survey with
foundation concepts like transformers, transfer learning, self-supervised
learning, pretrained language models, and large language models. We then present
a brief overview of GLLMs and discuss their performance on various
downstream tasks, in specific domains, and in multiple languages. We also discuss the
data labelling and data augmentation abilities of GLLMs, the robustness of
GLLMs, and the effectiveness of GLLMs as evaluators, and we conclude with
multiple insightful future research directions. In summary, this
comprehensive survey will serve as a good resource for both academia and
industry to stay updated with the latest research on GPT-3
family large language models.
Comment: Preprint under review, 58 pages
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs, which represent structural relations between entities, have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graphs covering the overall research landscape: 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs,
and 4) knowledge-aware applications, and we summarize recent breakthroughs and
prospective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized along four aspects: representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, embedding methods, path inference, and logical rule
reasoning are reviewed. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries for different tasks. In the
end, we offer a thorough outlook on several promising research directions.
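For concreteness, the classic TransE model shows how a representation space (real vectors) and a scoring function fit together. Below is a minimal NumPy sketch with toy dimensions and random, untrained embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_entities, n_relations = 50, 1000, 20

# TransE represents entities and relations in the same real vector space
# and models a true triple (h, r, t) as a translation: h + r ≈ t.
E = rng.normal(size=(n_entities, dim))   # entity embeddings
R = rng.normal(size=(n_relations, dim))  # relation embeddings

def score(h: int, r: int, t: int) -> float:
    """TransE scoring function: negative L2 distance ||h + r - t||.
    Higher (less negative) scores mean more plausible triples."""
    return -float(np.linalg.norm(E[h] + R[r] - E[t]))

def predict_tail(h: int, r: int, k: int = 5):
    """Link prediction for KG completion: rank all entities as candidate tails."""
    scores = -np.linalg.norm(E[h] + R[r] - E, axis=1)
    return np.argsort(scores)[::-1][:k]

print(score(0, 0, 1), predict_tail(0, 0))
```

Training would then push scores of observed triples above those of corrupted ones; other embedding families covered by the survey vary the representation space, the scoring function, the encoding model, or the auxiliary information used.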
RLSbench: Domain Adaptation Under Relaxed Label Shift
Despite the emergence of principled methods for domain adaptation under label
shift, their sensitivity to shifts in class-conditional distributions is
precariously underexplored. Meanwhile, popular deep domain adaptation
heuristics tend to falter when faced with shifts in label proportions. While
several papers modify these heuristics in attempts to handle label proportion
shifts, inconsistencies in evaluation standards, datasets, and baselines make
it difficult to gauge the current best practices. In this paper, we introduce
RLSbench, a large-scale benchmark for relaxed label shift, consisting of 500
distribution shift pairs spanning vision, tabular, and language modalities,
with varying label proportions. Unlike existing benchmarks, which primarily
focus on shifts in the class-conditional distribution $p(x \mid y)$, our
benchmark also focuses on shifts in the label marginal $p(y)$.
First, we assess 13 popular domain adaptation methods,
demonstrating more widespread failures under label proportion shifts than were
previously known. Next, we develop an effective two-step meta-algorithm that is
compatible with most domain adaptation heuristics: (i) pseudo-balance the data
at each epoch; and (ii) adjust the final classifier with an estimate of the
target label distribution. The meta-algorithm improves existing domain adaptation
heuristics under large label proportion shifts, often by 2--10\% accuracy
points, while conferring minimal effect (under 0.5\%) when label proportions do
not shift. We hope that these findings and the availability of RLSbench will
encourage researchers to rigorously evaluate proposed methods in relaxed label
shift settings. Code is publicly available at
https://github.com/acmi-lab/RLSbench.
Comment: Accepted at ICML 2023. Paper website:
https://sites.google.com/view/rlsbench
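The shape of the two-step meta-algorithm can be sketched as follows; the resampling and prior-correction details here are illustrative, not the paper's exact procedures.

```python
import numpy as np

def pseudo_balance(X, pseudo_labels, rng):
    """Step (i): resample so each pseudo-class appears equally often,
    blunting the effect of shifted label proportions during training."""
    classes, counts = np.unique(pseudo_labels, return_counts=True)
    n_per = counts.max()
    idx = np.concatenate([
        rng.choice(np.where(pseudo_labels == c)[0], n_per, replace=True)
        for c in classes
    ])
    return X[idx], pseudo_labels[idx]

def adjust_logits(logits, target_prior, train_prior):
    """Step (ii): shift the classifier's logits toward an estimate of the
    target label marginal (a standard prior-correction)."""
    return logits + np.log(target_prior) - np.log(train_prior)
```

Step (i) runs inside each training epoch on the model's current pseudo-labels; step (ii) is applied once at prediction time, with `target_prior` obtained from any label-shift estimator.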
Obsessive-compulsive disorder and related disorders: a comprehensive survey
Our aim was to present a comprehensive, updated survey on obsessive-compulsive disorder (OCD) and obsessive-compulsive related disorders (OCRDs) and their clinical management via literature review, critical analysis, and synthesis.
Deep Active Learning for Computer Vision: Past and Future
As an important data selection scheme, active learning has emerged as an
essential component when iterating on an Artificial Intelligence (AI) model. It
becomes even more critical in applications given the dominance of deep neural
network based models, which have large numbers of parameters and are data
hungry. Despite its indispensable role in developing AI models, research
on active learning is not as intensive as that on other research directions. In this
paper, we present a review of active learning through deep active learning
approaches from the following perspectives: 1) technical advancements in active
learning, 2) applications of active learning in computer vision, 3) industrial
systems that leverage, or have the potential to leverage, active learning for data
iteration, and 4) current limitations and future research directions. We expect
this paper to clarify the significance of active learning in the modern AI model
manufacturing process and to bring additional research attention to active
learning. By addressing data automation challenges and working in concert with
automated machine learning systems, active learning will facilitate the
democratization of AI technologies by boosting model production at scale.
Comment: Accepted by APSIPA Transactions on Signal and Information Processing
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts
Machine learning methods strive to acquire a robust model during training
that can generalize well to test samples, even under distribution shifts.
However, these methods often suffer from a performance drop due to unknown test
distributions. Test-time adaptation (TTA), an emerging paradigm, has the
potential to adapt a pre-trained model to unlabeled data during testing, before
making predictions. Recent progress in this paradigm highlights the significant
benefits of utilizing unlabeled data for training self-adapted models prior to
inference. In this survey, we divide TTA into several distinct categories,
namely, test-time (source-free) domain adaptation, test-time batch adaptation,
online test-time adaptation, and test-time prior adaptation. For each category,
we provide a comprehensive taxonomy of advanced algorithms, followed by a
discussion of different learning scenarios. Furthermore, we analyze relevant
applications of TTA and discuss open challenges and promising areas for future
research. A comprehensive list of TTA methods can be found at
\url{https://github.com/tim-learn/awesome-test-time-adaptation}.
Comment: Discussions, comments, and questions are all welcome at
\url{https://github.com/tim-learn/awesome-test-time-adaptation}
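To make the paradigm concrete, below is a minimal online test-time adaptation sketch in the spirit of entropy-minimization methods such as Tent; the survey covers many variants, and this is just one illustrative instance (it assumes the model contains normalization layers).

```python
import torch
import torch.nn as nn

def configure_tta(model: nn.Module):
    """Adapt only normalization-layer affine parameters, a common TTA choice."""
    model.train()  # use current-batch statistics in BatchNorm
    params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.LayerNorm)):
            params += list(m.parameters())
    for p in model.parameters():
        p.requires_grad_(False)
    for p in params:
        p.requires_grad_(True)
    return params

@torch.enable_grad()
def adapt_step(model, x, optimizer):
    """One online adaptation step on an unlabeled test batch:
    minimize the entropy of the model's own predictions, then predict."""
    logits = model(x)
    probs = logits.softmax(dim=1)
    entropy = -(probs * probs.log().clamp(min=-100)).sum(dim=1).mean()
    optimizer.zero_grad()
    entropy.backward()
    optimizer.step()
    return model(x).argmax(dim=1)
```

A typical driver would be `optimizer = torch.optim.SGD(configure_tta(model), lr=1e-3)` followed by `adapt_step(model, batch, optimizer)` on each incoming test batch.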