HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention
Twitter bot detection has become an increasingly important and challenging
task to combat online misinformation, facilitate social content moderation, and
safeguard the integrity of social platforms. Although existing graph-based
Twitter bot detection methods have achieved state-of-the-art performance, they
all rest on the homophily assumption, which holds that users with the same label
are more likely to be connected; this makes it easy for Twitter bots to disguise
themselves by following a large number of genuine users. To address this issue,
we propose HOFA, a novel graph-based Twitter bot detection framework that
combats the heterophilous disguise challenge with a homophily-oriented graph
augmentation module (Homo-Aug) and a frequency adaptive attention module
(FaAt). Specifically, the Homo-Aug extracts user representations and computes a
k-NN graph using an MLP and improves Twitter's homophily by injecting the k-NN
graph. For the FaAt, we propose an attention mechanism that adaptively serves
as a low-pass filter along a homophilic edge and a high-pass filter along a
heterophilic edge, preventing user features from being over-smoothed by their
neighborhood. We also introduce a weight guidance loss to guide the frequency
adaptive attention module. Our experiments demonstrate that HOFA achieves
state-of-the-art performance on three widely-acknowledged Twitter bot detection
benchmarks, which significantly outperforms vanilla graph-based bot detection
techniques and strong heterophilic baselines. Furthermore, extensive studies
confirm the effectiveness of our Homo-Aug and FaAt modules, and HOFA's ability
to demystify the heterophilous disguise challenge.
Comment: 11 pages, 7 figures
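The augmentation step described above, which computes user representations, builds a k-NN graph over them, and injects it into the original graph to raise homophily, can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's MLP-based implementation; `knn_augment` and its toy inputs are hypothetical.

```python
import numpy as np

def knn_augment(embeddings, edges, k=2):
    """Sketch of homophily-oriented augmentation: connect each node to its
    k nearest neighbours in embedding space (a stand-in for HOFA's
    MLP-derived user representations), then merge with the original edges."""
    n = embeddings.shape[0]
    # pairwise Euclidean distances between all embeddings
    d = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-loops
    knn = np.argsort(d, axis=1)[:, :k]   # k nearest neighbours per node
    new_edges = {(i, int(j)) for i in range(n) for j in knn[i]}
    return set(map(tuple, edges)) | new_edges

# toy example: two tight clusters of users, one original cross-cluster edge
emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
g = knn_augment(emb, [(0, 2)], k=1)
```

Nodes that are close in embedding space (and thus likely to share a label) become connected, while the original, possibly heterophilic, edges are retained.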
Advancing Adversarial Training by Injecting Booster Signal
Recent works have demonstrated that deep neural networks (DNNs) are highly
vulnerable to adversarial attacks. To defend against adversarial attacks, many
defense strategies have been proposed, among which adversarial training has
been demonstrated to be the most effective strategy. However, adversarial
training has been known to sometimes hurt natural accuracy, and many works
therefore focus on optimizing model parameters to handle the problem. Different from the
previous approaches, in this paper, we propose a new approach to improve the
adversarial robustness by using an external signal rather than model
parameters. In the proposed method, a well-optimized universal external signal
called a booster signal is injected into the outside of the image which does
not overlap with the original content. Then, it boosts both adversarial
robustness and natural accuracy. The booster signal and the model parameters
are optimized collaboratively, step by step, in parallel. Experimental results show that
the booster signal can improve both the natural and robust accuracies over the
recent state-of-the-art adversarial training methods. Also, optimizing the
booster signal is general and flexible enough to be adopted on any existing
adversarial training methods.
Comment: Accepted at IEEE Transactions on Neural Networks and Learning Systems
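The injection idea, placing the image on a larger canvas so that the optimizable signal occupies only the border and never overlaps the original content, can be sketched as below. This is illustrative NumPy code; `inject_booster` and the pad size are assumptions, and the actual collaborative optimization loop is omitted.

```python
import numpy as np

def inject_booster(image, booster, pad=2):
    """Sketch of booster-signal injection: write the signal into a larger
    canvas, then overwrite the centre with the untouched original image,
    so the booster lives only in the non-overlapping border region."""
    h, w = image.shape
    canvas = np.zeros((h + 2 * pad, w + 2 * pad))
    canvas += booster                         # booster covers the whole canvas...
    canvas[pad:pad + h, pad:pad + w] = image  # ...but the centre is the original
    return canvas

img = np.ones((4, 4))            # toy "image"
boost = np.full((8, 8), 0.5)     # toy booster signal (would be optimized)
out = inject_booster(img, boost, pad=2)
```

In the method itself, the booster values would be updated by gradient steps alongside the model weights; here they are fixed for illustration.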
Machine learning in solar physics
The application of machine learning in solar physics has the potential to
greatly enhance our understanding of the complex processes that take place in
the atmosphere of the Sun. By using techniques such as deep learning, we are
now in the position to analyze large amounts of data from solar observations
and identify patterns and trends that may not have been apparent using
traditional methods. This can help us improve our understanding of explosive
events like solar flares, which can have a strong effect on the Earth
environment. Predicting hazardous events on Earth becomes crucial for our
technological society. Machine learning can also improve our understanding of
the inner workings of the Sun itself by allowing us to go deeper into the data
and to propose more complex models to explain them. Additionally, the use of
machine learning can help to automate the analysis of solar data, reducing the
need for manual labor and increasing the efficiency of research in this field.
Comment: 100 pages, 13 figures, 286 references, accepted for publication as a
Living Review in Solar Physics (LRSP)
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Neural networks can be significantly compressed by pruning, leading to sparse
models requiring considerably less storage and floating-point operations while
maintaining predictive performance. Model soups (Wortsman et al., 2022) improve
generalization and out-of-distribution performance by averaging the parameters
of multiple models into a single one without increased inference time. However,
identifying models in the same loss basin to leverage both sparsity and
parameter averaging is challenging, as averaging arbitrary sparse models
reduces the overall sparsity due to differing sparse connectivities. In this
work, we address these challenges by demonstrating that exploring a single
retraining phase of Iterative Magnitude Pruning (IMP) with varying
hyperparameter configurations, such as batch ordering or weight decay, produces
models that are suitable for averaging and share the same sparse connectivity
by design. Averaging these models significantly enhances generalization
performance compared to their individual components. Building on this idea, we
introduce Sparse Model Soups (SMS), a novel method for merging sparse models by
initiating each prune-retrain cycle with the averaged model of the previous
phase. SMS maintains sparsity, exploits sparse network benefits being modular
and fully parallelizable, and substantially improves IMP's performance.
Additionally, we demonstrate that SMS can be adapted to enhance the performance
of state-of-the-art pruning-during-training approaches.
Comment: 9 pages, 5 pages of references, 7 pages of appendix
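The key property exploited here, that models retrained from the same IMP phase share one sparsity mask, so their parameter average keeps exactly the same sparse connectivity, can be illustrated with a toy NumPy sketch. `average_sparse` and the vectors below are hypothetical, not the SMS implementation.

```python
import numpy as np

def average_sparse(models, mask):
    """Average parameter vectors from retrained copies that share a single
    binary sparsity mask. Because every model is zero outside the mask,
    the average is sparse by design (no connectivity is lost)."""
    avg = np.mean(models, axis=0)
    return avg * mask  # redundant when all inputs respect the mask, but explicit

mask = np.array([1.0, 0.0, 1.0, 0.0])        # shared sparse connectivity
m1 = np.array([0.2, 0.0, 0.6, 0.0])          # retrained with one hyperparameter config
m2 = np.array([0.4, 0.0, 0.2, 0.0])          # retrained with another
soup = average_sparse([m1, m2], mask)
```

Averaging two arbitrary sparse models with different masks would instead union their nonzero positions and reduce overall sparsity, which is exactly the failure mode the shared-mask design avoids.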
Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision
Large language models (LLMs) have demonstrated remarkable capabilities out of
the box for a wide range of applications, yet accuracy remains a major growth
area, especially in mission-critical domains such as biomedicine. An effective
method to calibrate the confidence level on LLM responses is essential to
automatically detect errors and facilitate human-in-the-loop verification. An
important source of calibration signals stems from expert-stipulated
programmatic supervision, which is often available at low cost but has its own
limitations such as noise and coverage. In this paper, we introduce a Pareto
optimal self-supervision framework that can leverage available programmatic
supervision to systematically calibrate LLM responses by producing a risk score
for every response, without any additional manual effort. This is accomplished
by learning a harmonizer model to align LLM output with other available
supervision sources, which would assign higher risk scores to more uncertain
LLM responses and facilitate error correction. Experiments on standard relation
extraction tasks in biomedical and general domains demonstrate the promise of
this approach, with our proposed risk scores highly correlated with the real
error rate of LLMs. For the most uncertain test instances, dynamic prompting
based on our proposed risk scores results in significant accuracy improvement
for off-the-shelf LLMs, boosting GPT-3 results past state-of-the-art (SOTA)
weak supervision and GPT-4 results past SOTA supervised results on challenging
evaluation datasets.
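A toy version of the risk-scoring idea, scoring a response by its disagreement with the available programmatic supervision sources, might look like the following. The `risk_score` function is a deliberately simplified stand-in for the learned harmonizer model, and the abstention encoding is an assumption.

```python
def risk_score(llm_label, weak_labels):
    """Toy proxy for calibration via programmatic supervision: score an LLM
    response by the fraction of weak-supervision sources that disagree with
    it, ignoring sources that abstain (encoded here as None)."""
    votes = [w for w in weak_labels if w is not None]
    if not votes:
        return 0.5  # no signal at all: treat the response as neutral risk
    return sum(w != llm_label for w in votes) / len(votes)

# an LLM answer contradicted by one of three firing labeling functions
score = risk_score("A", ["A", "A", "B"])
```

Responses with high scores would then be routed to human verification or to dynamic re-prompting, as in the paper's error-correction experiments.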
Biomedical Entity Recognition by Detection and Matching
Biomedical named entity recognition (BNER) serves as the foundation for
numerous biomedical text mining tasks. Unlike general NER, BNER requires a
comprehensive grasp of the domain, and incorporating external knowledge beyond
training data poses a significant challenge. In this study, we propose a novel
BNER framework called DMNER. By leveraging the existing entity representation
model SAPBERT, we tackle BNER as a two-step process: entity boundary detection
and biomedical entity matching. DMNER exhibits applicability across multiple
NER scenarios: 1) In supervised NER, we observe that DMNER effectively
rectifies the output of baseline NER models, thereby further enhancing
performance. 2) In distantly supervised NER, combining MRC and AutoNER as span
boundary detectors enables DMNER to achieve satisfactory results. 3) For
training NER by merging multiple datasets, we adopt a framework similar to
DS-NER but additionally leverage ChatGPT to obtain high-quality phrases during
training. Through extensive experiments conducted on 10 benchmark datasets, we
demonstrate the versatility and effectiveness of DMNER.
Comment: 9 pages of content, 2 pages of appendix
Classification of knee osteoarthritis based on quantum-to-classical transfer learning
Quantum machine learning takes advantage of features such as superposition and entanglement in quantum computing to enable better-performing machine learning models. In this paper, we first propose an improved hybrid quantum convolutional neural network (HQCNN) model. The HQCNN model was used to pre-train on brain tumor MRI images. Next, the quantum-to-classical transfer learning (QCTL) approach is used to fine-tune and extract features based on the pre-trained weights. A hybrid quantum convolutional network structure was used to test on the knee osteoarthritis dataset (OAI), and standard metrics were quantitatively evaluated to verify the robustness of the classifier. The final experimental results show that the QCTL method can effectively classify knee osteoarthritis with a classification accuracy of 98.36%; the quantum-to-classical transfer learning method improves classification accuracy by 1.08%. How to use different encoding techniques in HQCNN models for medical image analysis remains a direction for future research.
Improving Heterogeneous Graph Learning with Weighted Mixed-Curvature Product Manifold
In graph representation learning, it is important that the complex geometric
structure of the input graph, e.g. hidden relations among nodes, is well
captured in embedding space. However, standard Euclidean embedding spaces have
a limited capacity in representing graphs of varying structures. A promising
candidate for the faithful embedding of data with varying structure is product
manifolds of component spaces of different geometries (spherical, hyperbolic,
or Euclidean). In this paper, we take a closer look at the structure of product
manifold embedding spaces and argue that each component space in a product
contributes differently to expressing structures in the input graph, hence
should be weighted accordingly. This is different from previous works which
consider the roles of different components equally. We then propose
WEIGHTED-PM, a data-driven method for learning embedding of heterogeneous
graphs in weighted product manifolds. Our method utilizes the topological
information of the input graph to automatically determine the weight of each
component in product spaces. Extensive experiments on synthetic and real-world
graph datasets demonstrate that WEIGHTED-PM is capable of learning better graph
representations with lower geometric distortion from input data, and performs
better on multiple downstream tasks, such as word similarity learning, top-n
recommendation, and knowledge graph embedding.
Incorporating Deep Q-Network with Multiclass Classification Algorithms
In this study, we explore how Deep Q-Network (DQN) might improve the
functionality of multiclass classification algorithms. We will use a benchmark
dataset from Kaggle to create a framework incorporating DQN with existing
supervised multiclass classification algorithms. The findings of this study
will bring insight into how deep reinforcement learning strategies may be used
to increase multiclass classification accuracy. Such strategies have already
been used in a number of fields, including image recognition, natural language
processing, and bioinformatics. This study focuses on the prediction of financial distress
in companies in addition to the wider application of Deep Q-Network in
multiclass classification. Identifying businesses that are likely to experience
financial distress is a crucial task in the fields of finance and risk
management. Whenever a business experiences serious challenges keeping its
operations going and meeting its financial responsibilities, it is said to be
in financial distress. This commonly happens when a company suffers a sharp and
sustained decline in profitability, cash-flow problems, or an unsustainable
level of debt.
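A common way to cast multiclass classification as a one-step reinforcement-learning problem is to treat each sample as a state, each class as an action, and give reward 1 for a correct prediction and 0 otherwise. A minimal tabular Q-update sketch of that framing follows; it is a simplified stand-in, since the study uses a Deep Q-Network rather than a table.

```python
import numpy as np

def q_update(Q, state, action, reward, lr=0.5):
    """One-step Q-learning update for a classification-as-RL framing:
    episodes end after a single action, so the target is just the reward
    (no bootstrapped next-state value)."""
    Q[state, action] += lr * (reward - Q[state, action])
    return Q

Q = np.zeros((2, 3))  # 2 discrete sample "states", 3 candidate classes
# the classifier picks class 1 for state 0 twice, correctly both times
Q = q_update(Q, state=0, action=1, reward=1.0)
Q = q_update(Q, state=0, action=1, reward=1.0)
```

With a DQN, the table is replaced by a network over sample features, but the one-step reward structure stays the same.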
Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations
The local explanation provides heatmaps on images to explain how
Convolutional Neural Networks (CNNs) derive their output. Due to its visual
straightforwardness, the method has been one of the most popular explainable AI
(XAI) methods for diagnosing CNNs. Through our formative study (S1), however,
we captured ML engineers' ambivalence about the local explanation: they see it
as a valuable and indispensable aid in building CNNs, yet the heuristic process
of detecting vulnerabilities with it exhausts them. Moreover,
steering the CNNs based on the vulnerability learned from the diagnosis seemed
highly challenging. To mitigate the gap, we designed DeepFuse, the first
interactive design that realizes the direct feedback loop between a user and
CNNs in diagnosing and revising CNN's vulnerability using local explanations.
DeepFuse helps CNN engineers to systematically search for "unreasonable" local
explanations and annotate new boundaries for those identified as
unreasonable in a labor-efficient manner. Next, it steers the model based on
the given annotation such that the model doesn't introduce similar mistakes. We
conducted a two-day study (S2) with 12 experienced CNN engineers. Using
DeepFuse, participants made a more accurate and "reasonable" model than the
current state-of-the-art. Also, participants found the way DeepFuse guides
case-based reasoning can practically improve their current practice. We provide
implications for design that explain how future HCI-driven design can move our
practice forward to make XAI-driven insights more actionable.
Comment: 32 pages, 6 figures, 5 tables. Accepted for publication in the
Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 2023