211 research outputs found
Study on the dynamics of golf swing and impedance control for a golf swing robot
Doctor of Engineering, Kochi University of Technology, conferred March 20, 2007 (Degree No. Kō 111)
Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective
Recent advancements in Large Language Models (LLMs) have facilitated the
development of Multimodal LLMs (MLLMs). Despite their impressive capabilities,
MLLMs often suffer from an over-reliance on unimodal biases (e.g., language
bias and vision bias), leading to incorrect answers in complex multimodal
tasks. To investigate this issue, we propose a causal framework to interpret
the biases in Visual Question Answering (VQA) problems. Within our framework,
we devise a causal graph to elucidate the predictions of MLLMs on VQA problems,
and assess the causal effect of biases through an in-depth causal analysis.
Motivated by the causal graph, we introduce MORE, a novel dataset of 12,000 VQA
instances designed to challenge MLLMs' abilities by requiring multi-hop
reasoning and the overcoming of unimodal biases.
Furthermore, we propose two strategies to mitigate unimodal biases and enhance
MLLMs' reasoning capabilities, including a Decompose-Verify-Answer (DeVA)
framework for limited-access MLLMs and the refinement of open-source MLLMs
through fine-tuning. Extensive quantitative and qualitative experiments offer
valuable insights for future research. Our project page is at
https://opencausalab.github.io/MORE
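To make the Decompose-Verify-Answer idea concrete, the following is a minimal sketch of such a loop for a limited-access (API-only) MLLM. The abstract does not specify DeVA's prompts or control flow, and `query_mllm` is a hypothetical wrapper around whatever vision-language API is used; treat this as an assumption-laden illustration, not the paper's implementation.

```python
def query_mllm(image, prompt: str) -> str:
    """Hypothetical call to a hosted multimodal LLM; replace with a real client."""
    raise NotImplementedError


def deva_answer(image, question: str, max_subquestions: int = 3) -> str:
    # 1) Decompose: ask the model to break the multi-hop question into sub-questions.
    decomposition = query_mllm(
        image,
        f"Break this question into at most {max_subquestions} simpler "
        f"sub-questions, one per line:\n{question}",
    )
    sub_questions = [q.strip() for q in decomposition.splitlines() if q.strip()]

    # 2) Verify: answer each sub-question and check it against the image,
    #    discouraging answers driven by language priors alone.
    verified_facts = []
    for sub_q in sub_questions[:max_subquestions]:
        answer = query_mllm(image, f"Answer using only visual evidence: {sub_q}")
        verdict = query_mllm(
            image,
            f"Is the statement '{sub_q} -> {answer}' supported by the image? yes/no",
        )
        if verdict.lower().startswith("yes"):
            verified_facts.append(f"{sub_q} {answer}")

    # 3) Answer: compose the final answer from verified facts only.
    context = "\n".join(verified_facts)
    return query_mllm(image, f"Facts:\n{context}\nNow answer: {question}")
```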
Reducing Communication for Split Learning by Randomized Top-k Sparsification
Split learning is a simple solution for Vertical Federated Learning (VFL),
which has drawn substantial attention in both research and application due to
its simplicity and efficiency. However, communication efficiency is still a
crucial issue for split learning. In this paper, we investigate multiple
communication reduction methods for split learning, including cut layer size
reduction, top-k sparsification, quantization, and L1 regularization. Through
analysis of the cut layer size reduction and top-k sparsification, we further
propose randomized top-k sparsification, to make the model generalize and
converge better. This is done by selecting top-k elements with a large
probability while also having a small probability to select non-top-k elements.
Empirical results show that compared with other communication-reduction
methods, our proposed randomized top-k sparsification achieves a better model
performance under the same compression level.
Comment: Accepted by IJCAI 202
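The core sampling rule described above (keep top-k elements with large probability, non-top-k elements with small probability) can be sketched as follows. The exact probabilities used in the paper are not given in the abstract; the values here are illustrative and chosen so that the expected number of kept entries stays near k.

```python
import numpy as np


def randomized_topk_sparsify(x: np.ndarray, k: int, p_top: float = 0.9) -> np.ndarray:
    """Keep roughly k entries of x: each top-k (by magnitude) entry is kept with
    probability p_top, each remaining entry with a small probability chosen so
    the expected number of kept entries is k. Illustrative, not the paper's exact rule."""
    flat = x.ravel()
    n = flat.size
    topk_idx = np.argpartition(np.abs(flat), -k)[-k:]

    # Small keep probability for non-top-k entries, large one for top-k entries.
    keep_prob = np.full(n, (k * (1.0 - p_top)) / max(n - k, 1))
    keep_prob[topk_idx] = p_top

    mask = np.random.rand(n) < keep_prob
    return np.where(mask, flat, 0.0).reshape(x.shape)
```

Applied to the cut-layer activations before transmission, this keeps the compression level of plain top-k while occasionally letting smaller elements through, which is the property the abstract credits for better generalization and convergence.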
Federated Unlearning via Active Forgetting
The increasing concerns regarding the privacy of machine learning models have
catalyzed the exploration of machine unlearning, i.e., a process that removes
the influence of training data on machine learning models. This concern also
arises in the realm of federated learning, prompting researchers to address the
federated unlearning problem. However, federated unlearning remains
challenging. Existing unlearning methods can be broadly categorized into two
approaches, i.e., exact unlearning and approximate unlearning. Firstly,
implementing exact unlearning, which typically relies on the
partition-aggregation framework, in a distributed manner does not improve time
efficiency theoretically. Secondly, existing federated (approximate) unlearning
methods suffer from imprecise data influence estimation, significant
computational burden, or both. To this end, we propose a novel federated
unlearning framework based on incremental learning, which is independent of
specific models and federated settings. Our framework differs from existing
federated unlearning methods that rely on approximate retraining or data
influence estimation. Instead, we leverage new memories to overwrite old ones,
imitating the process of active forgetting in neurology. Specifically, the
model to be unlearned serves as a student model that continuously learns from
randomly initialized teacher models. To prevent catastrophic forgetting of
non-target data, we utilize elastic weight consolidation to elastically
constrain weight changes. Extensive experiments on three benchmark
datasets demonstrate the efficiency and effectiveness of our proposed method.
Results of backdoor attacks further demonstrate that our proposed method
achieves satisfactory unlearning completeness.
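The combination of "new memories overwriting old ones" with elastic weight consolidation (EWC) can be sketched as a single training step, shown below. The loss weights, Fisher estimation, and teacher initialization are assumptions for illustration, not the paper's exact recipe.

```python
import copy
import torch
import torch.nn.functional as F


def make_random_teacher(model):
    """Randomly re-initialized copy of the model, used as the teacher (assumed init scheme)."""
    teacher = copy.deepcopy(model)
    for p in teacher.parameters():
        torch.nn.init.normal_(p, std=0.02)
    return teacher


def unlearn_loss(student, teacher, target_batch, fisher, old_params, ewc_lambda=100.0):
    """One active-forgetting update: the student imitates a randomly initialized
    teacher on the data to be unlearned, while an EWC penalty keeps parameters
    that matter for the retained (non-target) data close to their original values."""
    x, _ = target_batch
    with torch.no_grad():
        teacher_logits = teacher(x)  # the "new memory" that overwrites the old one

    student_logits = student(x)
    distill_loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )

    # EWC: penalize movement of parameters the Fisher information marks as
    # important for the non-target data.
    ewc_loss = 0.0
    for name, param in student.named_parameters():
        ewc_loss = ewc_loss + (fisher[name] * (param - old_params[name]) ** 2).sum()

    return distill_loss + ewc_lambda * ewc_loss
```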
Federated Learning for Short Text Clustering
Short text clustering has been popularly studied for its significance in
mining valuable insights from many short texts. In this paper, we focus on the
federated short text clustering (FSTC) problem, i.e., clustering short texts
that are distributed in different clients, which is a realistic problem under
privacy requirements. Unlike the centralized short text clustering problem, in
which short texts are stored on a central server, the FSTC problem has not been
explored yet. To fill this gap, we propose a Federated Robust Short
Text Clustering (FSTC) framework. FSTC includes two main modules, i.e., robust
short text clustering module and federated cluster center aggregation module.
The robust short text clustering module aims to train an effective short text
clustering model with local data in each client. We combine optimal transport,
which generates pseudo-labels, with a Gaussian-uniform mixture model, which
ensures the reliability of the pseudo-supervised data. The federated cluster
center aggregation module aims to exchange knowledge across clients without
sharing local raw data in an efficient way. The server aggregates the local
cluster centers from different clients and then sends the global centers back
to all clients in each communication round. Our empirical studies on three
short text clustering datasets demonstrate that FSTC significantly outperforms
the federated short text clustering baselines.
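The server-side step of the federated cluster center aggregation module can be sketched as a weighted average of local centers. The weighting by local cluster sizes and the assumption that clusters are already aligned across clients (i.e., share indices) are illustrative choices; the paper's exact rule may differ.

```python
import numpy as np


def aggregate_cluster_centers(local_centers, local_counts):
    """Average each cluster center across clients, weighted by how many local
    points each client assigned to that cluster.

    local_centers: list of arrays, each of shape (n_clusters, dim)
    local_counts:  list of arrays, each of shape (n_clusters,)
    """
    centers = np.stack(local_centers)               # (n_clients, n_clusters, dim)
    counts = np.stack(local_counts).astype(float)   # (n_clients, n_clusters)
    weights = counts / np.clip(counts.sum(axis=0, keepdims=True), 1e-8, None)
    global_centers = (weights[..., None] * centers).sum(axis=0)
    return global_centers                           # (n_clusters, dim), sent back to all clients
```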
Federated Large Language Model: A Position Paper
Large-scale language models (LLMs) have received significant attention and
found diverse applications across various domains, but their development
encounters challenges in real-world scenarios. These challenges arise from the
scarcity of public-domain data and the need to maintain the privacy of
private-domain data. To address these issues, federated
learning (FL) has emerged as a promising technology that enables collaborative
training of shared models while preserving decentralized data. We propose the
concept of federated LLM, which comprises three key components, i.e., federated
LLM pre-training, federated LLM fine-tuning, and federated LLM prompt
engineering. For each component, we discuss its advantage over traditional LLM
training methods and propose specific engineering strategies for
implementation. Furthermore, we explore the novel challenges introduced by the
integration of FL and LLM. We analyze existing solutions and identify potential
obstacles faced by these solutions within the context of federated LLM.
Comment: 11 pages, 4 figures
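As a concrete anchor for the federated LLM fine-tuning component, the following is a minimal FedAvg-style aggregation of client weights. The position paper discusses this component conceptually and does not prescribe this rule; aggregating only parameter-efficient adapter weights (rather than the full model) is an assumption made here to keep communication manageable.

```python
import torch


def fedavg_aggregate(client_state_dicts, client_sizes):
    """Weighted average of client (e.g., adapter-only) state dicts by local dataset size.
    Illustrative FedAvg sketch, not a prescription from the position paper."""
    total = float(sum(client_sizes))
    global_state = {}
    for key in client_state_dicts[0].keys():
        global_state[key] = sum(
            (size / total) * state[key].float()
            for state, size in zip(client_state_dicts, client_sizes)
        )
    return global_state
```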