Preconditioned Federated Learning
Federated Learning (FL) is a distributed machine learning approach that
enables model training in a communication-efficient and privacy-preserving
manner. The standard optimization method in FL is Federated Averaging (FedAvg),
which performs multiple local SGD steps between communication rounds. FedAvg
has been considered to lack adaptivity compared to modern first-order
adaptive optimizers. In this paper, we propose new communication-efficient
FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and
server-side adaptivity (PreFedOp). The proposed methods achieve adaptivity
through a novel covariance-matrix preconditioner. Theoretically, we provide
convergence guarantees for our algorithms. Empirical experiments show that our
methods achieve state-of-the-art performance in both i.i.d. and non-i.i.d. settings.
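As context for the baseline the paper builds on, here is a minimal sketch of FedAvg on a toy least-squares problem. All names and hyperparameters are illustrative; the paper's PreFed/PreFedOp additionally apply a covariance-matrix preconditioner, which is not reproduced here.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, steps=5):
    """A few local gradient steps on a least-squares objective (toy stand-in)."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg(clients, rounds=30, dim=3):
    """Plain FedAvg: each round, clients train locally and the server averages."""
    w = np.zeros(dim)
    for _ in range(rounds):
        local_models = [local_sgd(w.copy(), X, y) for X, y in clients]
        w = np.mean(local_models, axis=0)  # server-side model averaging
    return w

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ true_w))  # each client holds a noiseless shard
w_global = fedavg(clients)
print(np.round(w_global, 3))
```

With noiseless shards sharing one minimizer, the averaged model converges to `true_w`; the adaptive variants are motivated by harder, heterogeneous settings where plain averaging converges slowly.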
Joint Manufacturing and Onsite Microgrid System Control using Markov Decision Process and Neural Network Integrated Reinforcement Learning
Onsite microgrid generation systems with renewable sources are considered a promising complementary energy supply for manufacturing plants, especially during outages when energy from the grid is unavailable. Compared with the widely recognized resilience benefits of using such a system as a backup energy supply, its operation alongside the electricity grid to support manufacturing in non-emergency mode has been less investigated. In this paper, we propose a joint dynamic decision-making model for the optimal control of both the manufacturing system and the onsite generation system. A Markov Decision Process (MDP) is used to formulate the decision-making model, and a neural-network-integrated reinforcement learning algorithm is proposed to approximate the value function of the MDP under a given policy. A case study based on a manufacturing system and a typical onsite microgrid generation system is conducted to validate both the proposed MDP model and the solution strategy.
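The value-function machinery behind such an MDP can be illustrated on a tiny stand-in: exact policy evaluation versus iterative Bellman evaluation on a 3-state chain. All transition and reward numbers below are made up; the paper's method replaces the small table with a neural-network approximator for realistically large state spaces.

```python
import numpy as np

# Toy 3-state MDP under a fixed policy (hypothetical transitions and rewards).
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])   # P[s, s'] = transition probability
r = np.array([1.0, 0.0, -1.0])   # expected one-step reward per state
gamma = 0.9                      # discount factor

# Exact policy evaluation: solve the linear system V = r + gamma * P @ V.
V_exact = np.linalg.solve(np.eye(3) - gamma * P, r)

# Iterative evaluation: repeatedly apply the Bellman operator until it
# converges; a learned approximator plays this role when states are many.
V = np.zeros(3)
for _ in range(500):
    V = r + gamma * P @ V
print(np.max(np.abs(V - V_exact)))
```

The Bellman operator is a gamma-contraction, so the iteration converges geometrically to the same values as the linear solve.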
Energy Consumption Modeling of Stereolithography-Based Additive Manufacturing Toward Environmental Sustainability
Additive manufacturing (AM), also referred to as three-dimensional printing or rapid prototyping, has been implemented in various areas as one of the most promising new manufacturing technologies of the past three decades. Alongside the growing public interest in developing AM into a potential mainstream manufacturing approach, increasing concerns have been raised about environmental sustainability, especially energy consumption. To date, research efforts have been dedicated to quantitatively measuring and analyzing the energy consumption of AM processes, but they have covered only some types of AM processes and examined an incomplete set of factors that might influence energy consumption. In addition, energy consumption modeling for AM processes has not been comprehensively studied. To fill this research gap, this article presents a mathematical model of the energy consumption of stereolithography (SLA)-based processes. To validate the model, experiments are conducted to measure the actual energy consumption of an SLA-based AM machine. The design-of-experiments method is adopted to examine the impacts of different parameters, and their potential interactions, on overall energy consumption. To minimize total energy consumption, a response optimization method is used to identify the optimal combination of parameters. The surface quality of a product built with the optimal parameter set is compared with that of parts built under different parameter combinations. The comparison shows that the overall energy consumption of SLA-based AM processes can be significantly reduced through optimal parameter settings, without observable decay in product quality.
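A hypothetical first-order version of such an energy model can be sketched as per-layer curing energy (laser power times scan time) plus a fixed per-layer overhead for recoating and platform moves. Every parameter name and value here is an illustrative assumption, not the paper's fitted model or measured data.

```python
import math

def sla_energy_j(part_height_mm, layer_mm, layer_area_mm2,
                 laser_power_w=0.25, scan_speed_mm_s=200.0,
                 hatch_spacing_mm=0.1, overhead_j_per_layer=30.0):
    """Toy SLA energy model (illustrative parameters, not measured values)."""
    n_layers = math.ceil(part_height_mm / layer_mm)
    scan_length_mm = layer_area_mm2 / hatch_spacing_mm   # total hatch path per layer
    cure_j = laser_power_w * (scan_length_mm / scan_speed_mm_s)
    return n_layers * (cure_j + overhead_j_per_layer)

thick = sla_energy_j(50.0, 0.10, 400.0)  # fewer, thicker layers
thin = sla_energy_j(50.0, 0.05, 400.0)   # more, thinner layers
print(thick, thin)
```

Even this crude model shows why parameter choice matters: halving the layer thickness doubles the layer count and, with it, the fixed per-layer overhead, which is the kind of trade-off the design-of-experiments study quantifies rigorously.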
Privacy-Preserving Gradient Boosting Decision Trees
The Gradient Boosting Decision Tree (GBDT) has become a popular machine
learning model for a wide range of tasks in recent years. In this paper, we
study how to improve the model accuracy of GBDT while preserving the strong
guarantees of differential privacy. Sensitivity and privacy budget are two key
design aspects for the effectiveness of differentially private models.
Existing solutions for GBDT with differential privacy suffer from significant
accuracy loss due to overly loose sensitivity bounds and ineffective privacy
budget allocations (especially across the different trees in the GBDT model).
Loose sensitivity bounds require more noise to achieve a fixed privacy level,
and ineffective privacy budget allocations worsen the accuracy loss, especially
when the number of trees is large. Therefore, we propose a new GBDT training
algorithm that achieves tighter sensitivity bounds and more effective noise
allocations. Specifically, by investigating the properties of the gradients and
the contribution of each tree in GBDTs, we adaptively control the gradients of
the training data at each iteration and apply leaf-node clipping to tighten
the sensitivity bounds. Furthermore, we design a novel boosting framework that
allocates the privacy budget across trees so that the accuracy loss is further
reduced. Our experiments show that our approach achieves much better model
accuracy than other baselines.
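The interplay of clipping and noise can be sketched with a single differentially private leaf value. This is a simplified illustration of the general mechanism, not the paper's exact sensitivity bounds or budget-allocation scheme; the 2*clip/n sensitivity below is an assumed simplification.

```python
import numpy as np

def dp_leaf_value(gradients, clip=1.0, epsilon=0.5, rng=None):
    """Toy DP leaf value for squared loss:
    1) clip per-example gradients so one example's influence is bounded,
    2) add Laplace noise scaled to sensitivity / epsilon.
    The sensitivity formula is an illustrative simplification."""
    rng = rng if rng is not None else np.random.default_rng()
    g = np.clip(gradients, -clip, clip)   # clipping tightens the sensitivity bound
    leaf = -np.sum(g) / len(g)            # plain GBDT leaf value (squared loss)
    sensitivity = 2.0 * clip / len(g)     # assumed bound on one example's effect
    return leaf + rng.laplace(scale=sensitivity / epsilon)

grads = np.array([0.3, -0.5, 2.0, -3.0])
# With a huge epsilon the noise is negligible and the clipped leaf shows through.
almost_exact = dp_leaf_value(grads, clip=1.0, epsilon=1e6,
                             rng=np.random.default_rng(0))
```

A tighter clip bound means a smaller sensitivity and hence less noise at the same epsilon, which is exactly why the paper's tighter bounds translate into better accuracy.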
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
Recently, large language models (LLMs) have made significant advancements in
natural language understanding and generation. However, their potential in
computer vision remains largely unexplored. In this paper, we introduce a new,
exploratory approach that enables LLMs to process images using the Scalable
Vector Graphics (SVG) format. By leveraging the XML-based textual descriptions
of SVG representations instead of raster images, we aim to bridge the gap
between the visual and textual modalities, allowing LLMs to directly understand
and manipulate images without the need for parameterized visual components. Our
method facilitates simple image classification, generation, and in-context
learning using only LLM capabilities. We demonstrate the promise of our
approach across discriminative and generative tasks, highlighting its (i)
robustness against distribution shift, (ii) substantial improvements achieved
by tapping into the in-context learning abilities of LLMs, and (iii) image
understanding and generation capabilities with human guidance. Our code, data,
and models are available at https://github.com/mu-cai/svg-llm.
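The core idea, feeding vector-graphics markup to a text-only model, can be sketched in a few lines. This is a toy illustration; the actual prompts and pipeline are in the linked repository.

```python
def svg_circle(cx, cy, r):
    """Build a tiny SVG image as plain XML text."""
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="64" height="64">'
            f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="black"/></svg>')

def classification_prompt(svg_code, labels):
    """Wrap the SVG markup in a text prompt a language model can consume."""
    return ("The following SVG code draws an image:\n"
            f"{svg_code}\n"
            f"Answer with the single best label from: {', '.join(labels)}.")

prompt = classification_prompt(svg_circle(32, 32, 20), ["circle", "square"])
print(prompt)
```

Because the image is encoded as XML text rather than pixels, the LLM can read, classify, and even edit it with no vision components at all.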
FedEdge AI-TC: A Semi-supervised Traffic Classification Method based on Trusted Federated Deep Learning for Mobile Edge Computing
As a typical entity of MEC (Mobile Edge Computing), 5G CPE (Customer Premise
Equipment)/HGU (Home Gateway Unit) has proven to be a promising alternative to
traditional Smart Home Gateways. Network Traffic Classification (TC) is a vital
method for service-quality assurance and security management in communication
networks and has become a crucial functional entity in 5G CPE/HGU. In recent
years, many researchers have applied Machine Learning or Deep Learning (DL) to
TC, namely AI-TC, to improve its performance. However, AI-TC faces challenges,
including data dependency, resource-intensive traffic labeling, and user
privacy concerns. The limited computing resources of 5G CPE further complicate
efficient classification. Moreover, the "black box" nature of AI-TC models
raises transparency and credibility issues. This paper proposes the FedEdge
AI-TC framework, which leverages Federated Learning (FL) for reliable network
TC in 5G CPE. FL preserves user privacy through local training, iterative
model-parameter exchange, and centralized aggregation. A semi-supervised TC
algorithm based on a Variational Auto-Encoder (VAE) and a convolutional neural
network (CNN) reduces data dependency while maintaining accuracy. To enable
lightweight model deployment, the paper introduces XAI-Pruning, an AI model
compression method combined with DL model interpretability. Experimental
evaluation demonstrates FedEdge AI-TC's superiority over benchmarks in both
accuracy and classification efficiency. The framework enhances user privacy
and model credibility, offering a comprehensive solution for dependable and
transparent network TC in 5G CPE, thus enhancing service quality and security.
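The XAI-Pruning idea, compressing a model using interpretability signals, can be sketched as importance-guided pruning. The importance scores below are supplied directly as assumed inputs; in the paper they would be derived from model explanations, and none of these names come from the paper.

```python
import numpy as np

def prune_by_importance(weights, importance, keep_ratio=0.5):
    """Zero out the least-important weights (illustrative sketch only)."""
    k = max(1, int(len(weights) * keep_ratio))
    keep = np.argsort(importance)[-k:]   # indices of the k most important weights
    pruned = np.zeros_like(weights)
    pruned[keep] = weights[keep]         # everything else is pruned to zero
    return pruned

w = np.array([0.5, -1.2, 0.05, 2.0])
scores = np.array([0.9, 0.1, 0.2, 0.8])  # assumed explanation-derived importances
w_pruned = prune_by_importance(w, scores, keep_ratio=0.5)
print(w_pruned)
```

Pruning by explanation-derived importance rather than raw magnitude can retain small weights that matter for the model's decisions, which is the motivation for coupling compression with interpretability.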
Early Detection of Disease using Electronic Health Records and Fisher's Wishart Discriminant Analysis
Linear Discriminant Analysis (LDA) is a simple and effective technique for pattern classification and is widely used for early detection of diseases from Electronic Health Record (EHR) data. However, the performance of LDA for EHR data classification is frequently affected by two main factors: ill-posed estimation of the LDA parameters (e.g., the covariance matrix) and linear inseparability of the EHR data. To handle these two issues, this paper proposes a novel classifier, FWDA (Fisher's Wishart Discriminant Analysis), developed as a fast and robust nonlinear classifier. Specifically, FWDA first models the distribution of potential inverse covariance matrix estimates with a Wishart distribution estimated from the training data. FWDA then samples a group of inverse covariance matrices from this Wishart distribution, makes predictions with LDA classifiers based on the sampled inverse covariance matrices, and weight-averages the predictions via a Bayesian voting scheme. The voting weights are optimally updated to adapt to each new input, enabling nonlinear classification.
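The sampling-and-voting core of FWDA can be sketched with SciPy's Wishart distribution. The class means and Wishart hyperparameters below are toy assumptions; the paper additionally fits the Wishart to the training data and uses adaptive Bayesian voting weights rather than the uniform majority vote shown here.

```python
import numpy as np
from scipy.stats import wishart

d = 2
mu0, mu1 = np.array([-1.0, 0.0]), np.array([1.0, 0.0])  # assumed class means
# Wishart distribution over inverse-covariance candidates (toy hyperparameters).
prec_dist = wishart(df=d + 2, scale=np.eye(d) / (d + 2))

def fwda_predict(x, n_samples=50, seed=1):
    """Sample inverse covariances, classify with each LDA, majority-vote."""
    votes = 0
    for prec in prec_dist.rvs(n_samples, random_state=seed):
        w_vec = prec @ (mu1 - mu0)       # LDA direction for this sampled precision
        b = -0.5 * (mu0 + mu1) @ w_vec   # threshold at the class midpoint
        votes += int(x @ w_vec + b > 0)
    return int(votes > n_samples / 2)

label_pos = fwda_predict(np.array([2.0, 0.0]))
label_neg = fwda_predict(np.array([-2.0, 0.0]))
```

Averaging over sampled precision matrices instead of committing to one point estimate is what makes the classifier robust when the covariance estimate is ill-posed.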
In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search
Since large language models have approached human-level performance on many
tasks, it has become increasingly harder for researchers to find tasks that are
still challenging to the models. Failure cases usually come from the long-tail
distribution - data that an oracle language model could assign a probability on
the lower end of its distribution. Current methodologies such as prompt
engineering and crowdsourcing are insufficient for creating long-tail examples
because humans are constrained by cognitive biases. We propose a
Logic-Induced-Knowledge-Search (LINK) framework for systematically generating
long-tail knowledge statements. Grounded by a symbolic rule, we search for
long-tail values for each variable of the rule by first prompting an LLM, then
verifying the correctness of the values with a critic, and lastly pushing for
the long-tail distribution with a reranker. With this framework we construct a
dataset, Logic-Induced-Long-Tail (LINT), consisting of 200 symbolic rules and
50K knowledge statements spanning four domains. Human annotation finds
that 84% of the statements in LINT are factually correct. In contrast, ChatGPT
and GPT4 struggle to directly generate long-tail statements under the
guidance of logic rules, with only 56% and 78% of their statements correct,
respectively. Moreover, their "long-tail" generations in fact fall into the higher
likelihood range, and thus are not really long-tail. Our findings suggest that
LINK is effective for generating data in the long-tail distribution while
enforcing quality. LINT can be useful for systematically evaluating LLMs'
capabilities in the long-tail distribution. We challenge the models with a
simple entailment classification task using samples from LINT. We find that
ChatGPT's and GPT4's capability to identify incorrect knowledge drops by ~3% in
the long-tail distribution compared to the head distribution.
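The propose-verify-rerank loop can be sketched with stub functions standing in for the LLM proposer, the critic, and the likelihood-based reranker. All three stubs are toy stand-ins, not the paper's actual components.

```python
def link_search(propose, verify, rerank, slots, k=2):
    """For each rule variable: propose candidate values, keep only the ones
    the critic verifies, rerank toward the long tail, and keep the top k."""
    filled = {}
    for slot in slots:
        candidates = [v for v in propose(slot) if verify(slot, v)]
        filled[slot] = rerank(candidates)[:k]
    return filled

# Toy stand-ins: LINK uses an LLM proposer, a critic model, and a
# likelihood-based reranker in place of these.
propose = lambda slot: ["paris", "zanzibar city", "not-a-city"]
verify = lambda slot, v: v != "not-a-city"                 # critic rejects bad values
rerank = lambda vals: sorted(vals, key=len, reverse=True)  # crude "rarity" proxy

result = link_search(propose, verify, rerank, ["capital_city"])
print(result)
```

The separation of roles is the point: the proposer supplies breadth, the critic enforces factual correctness, and the reranker pushes the surviving values toward the low-likelihood tail.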