Preconditioned Federated Learning
Federated Learning (FL) is a distributed machine learning approach that
enables model training in a communication-efficient and privacy-preserving
manner. The standard optimization method in FL is Federated Averaging (FedAvg),
which performs multiple local SGD steps between communication rounds. FedAvg
has been considered to lack adaptivity compared to modern first-order
adaptive optimizers. In this paper, we propose new communication-efficient
FL algorithms based on two adaptive frameworks: local adaptivity (PreFed) and
server-side adaptivity (PreFedOp). The proposed methods achieve adaptivity
through a novel covariance-matrix preconditioner. Theoretically, we provide
convergence guarantees for our algorithms. Empirical experiments show that our
methods achieve state-of-the-art performance in both i.i.d. and non-i.i.d. settings.
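As context for the baseline the paper builds on, here is a minimal sketch of FedAvg on a toy least-squares problem. All names and hyperparameters are illustrative; the paper's PreFed/PreFedOp additionally apply a covariance-matrix preconditioner, which is not reproduced here.

```python
import numpy as np

def local_sgd(w, X, y, lr=0.1, steps=5):
    """A few local gradient steps on a least-squares objective (toy stand-in)."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def fedavg(clients, rounds=30, dim=3):
    """Plain FedAvg: each round, clients train locally and the server averages."""
    w = np.zeros(dim)
    for _ in range(rounds):
        local_models = [local_sgd(w.copy(), X, y) for X, y in clients]
        w = np.mean(local_models, axis=0)  # server-side model averaging
    return w

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(4):
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ true_w))  # each client holds a noiseless shard
w_global = fedavg(clients)
print(np.round(w_global, 3))
```

With noiseless shards sharing one minimizer, the averaged model converges to `true_w`; the adaptive variants are motivated by harder, heterogeneous settings where plain averaging converges slowly.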
Joint Manufacturing and Onsite Microgrid System Control using Markov Decision Process and Neural Network Integrated Reinforcement Learning
Onsite microgrid generation systems with renewable sources are considered a promising complementary energy supply for manufacturing plants, especially during outages when energy from the grid is unavailable. Compared with the widely recognized resilience benefits of using such a system as a backup energy supply, its operation alongside the electricity grid to support manufacturing in non-emergency mode has been less investigated. In this paper, we propose a joint dynamic decision-making model for the optimal control of both the manufacturing system and the onsite generation system. A Markov Decision Process (MDP) is used to formulate the decision-making model, and a neural-network-integrated reinforcement learning algorithm is proposed to approximate the value function of the MDP under a given policy. A case study based on a manufacturing system and a typical onsite microgrid generation system is conducted to validate both the proposed MDP model and the solution strategy.
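The value-function machinery behind such an MDP can be illustrated on a tiny stand-in: exact policy evaluation versus iterative Bellman evaluation on a 3-state chain. All transition and reward numbers below are made up; the paper's method replaces the small table with a neural-network approximator for realistically large state spaces.

```python
import numpy as np

# Toy 3-state MDP under a fixed policy (hypothetical transitions and rewards).
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])   # P[s, s'] = transition probability
r = np.array([1.0, 0.0, -1.0])   # expected one-step reward per state
gamma = 0.9                      # discount factor

# Exact policy evaluation: solve the linear system V = r + gamma * P @ V.
V_exact = np.linalg.solve(np.eye(3) - gamma * P, r)

# Iterative evaluation: repeatedly apply the Bellman operator until it
# converges; a learned approximator plays this role when states are many.
V = np.zeros(3)
for _ in range(500):
    V = r + gamma * P @ V
print(np.max(np.abs(V - V_exact)))
```

The Bellman operator is a gamma-contraction, so the iteration converges geometrically to the same values as the linear solve.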
Energy Consumption Modeling of Stereolithography-Based Additive Manufacturing Toward Environmental Sustainability
Additive manufacturing (AM), also referred to as three-dimensional printing or rapid prototyping, has been implemented in various areas as one of the most promising new manufacturing technologies of the past three decades. Alongside the growing public interest in developing AM into a potential mainstream manufacturing approach, increasing concerns have been raised about environmental sustainability, especially energy consumption. To date, research efforts have been dedicated to quantitatively measuring and analyzing the energy consumption of AM processes, but they have covered only some types of AM processes and examined an incomplete set of factors that might influence energy consumption. In addition, energy consumption modeling for AM processes has not been comprehensively studied. To fill this research gap, this article presents a mathematical model of the energy consumption of stereolithography (SLA)-based processes. To validate the model, experiments are conducted to measure the actual energy consumption of an SLA-based AM machine. The design-of-experiments method is adopted to examine the impacts of different parameters, and their potential interactions, on overall energy consumption. To minimize total energy consumption, a response optimization method is used to identify the optimal combination of parameters. The surface quality of a product built with the optimal parameter set is compared with that of parts built under different parameter combinations. The comparison shows that the overall energy consumption of SLA-based AM processes can be significantly reduced through optimal parameter settings, without observable decay in product quality.
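A hypothetical first-order version of such an energy model can be sketched as per-layer curing energy (laser power times scan time) plus a fixed per-layer overhead for recoating and platform moves. Every parameter name and value here is an illustrative assumption, not the paper's fitted model or measured data.

```python
import math

def sla_energy_j(part_height_mm, layer_mm, layer_area_mm2,
                 laser_power_w=0.25, scan_speed_mm_s=200.0,
                 hatch_spacing_mm=0.1, overhead_j_per_layer=30.0):
    """Toy SLA energy model (illustrative parameters, not measured values)."""
    n_layers = math.ceil(part_height_mm / layer_mm)
    scan_length_mm = layer_area_mm2 / hatch_spacing_mm   # total hatch path per layer
    cure_j = laser_power_w * (scan_length_mm / scan_speed_mm_s)
    return n_layers * (cure_j + overhead_j_per_layer)

thick = sla_energy_j(50.0, 0.10, 400.0)  # fewer, thicker layers
thin = sla_energy_j(50.0, 0.05, 400.0)   # more, thinner layers
print(thick, thin)
```

Even this crude model shows why parameter choice matters: halving the layer thickness doubles the layer count and, with it, the fixed per-layer overhead, which is the kind of trade-off the design-of-experiments study quantifies rigorously.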
Privacy-Preserving Gradient Boosting Decision Trees
The Gradient Boosting Decision Tree (GBDT) has become a popular machine
learning model for a wide range of tasks in recent years. In this paper, we
study how to improve the model accuracy of GBDT while preserving the strong
guarantees of differential privacy. Sensitivity and privacy budget are two key
design aspects for the effectiveness of differentially private models.
Existing solutions for GBDT with differential privacy suffer from significant
accuracy loss due to overly loose sensitivity bounds and ineffective privacy
budget allocations (especially across the different trees in the GBDT model).
Loose sensitivity bounds require more noise to achieve a fixed privacy level,
and ineffective privacy budget allocations worsen the accuracy loss, especially
when the number of trees is large. Therefore, we propose a new GBDT training
algorithm that achieves tighter sensitivity bounds and more effective noise
allocations. Specifically, by investigating the properties of the gradients and
the contribution of each tree in GBDTs, we adaptively control the gradients of
the training data at each iteration and apply leaf-node clipping to tighten
the sensitivity bounds. Furthermore, we design a novel boosting framework that
allocates the privacy budget across trees so that the accuracy loss is further
reduced. Our experiments show that our approach achieves much better model
accuracy than other baselines.
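The interplay of clipping and noise can be sketched with a single differentially private leaf value. This is a simplified illustration of the general mechanism, not the paper's exact sensitivity bounds or budget-allocation scheme; the 2*clip/n sensitivity below is an assumed simplification.

```python
import numpy as np

def dp_leaf_value(gradients, clip=1.0, epsilon=0.5, rng=None):
    """Toy DP leaf value for squared loss:
    1) clip per-example gradients so one example's influence is bounded,
    2) add Laplace noise scaled to sensitivity / epsilon.
    The sensitivity formula is an illustrative simplification."""
    rng = rng if rng is not None else np.random.default_rng()
    g = np.clip(gradients, -clip, clip)   # clipping tightens the sensitivity bound
    leaf = -np.sum(g) / len(g)            # plain GBDT leaf value (squared loss)
    sensitivity = 2.0 * clip / len(g)     # assumed bound on one example's effect
    return leaf + rng.laplace(scale=sensitivity / epsilon)

grads = np.array([0.3, -0.5, 2.0, -3.0])
# With a huge epsilon the noise is negligible and the clipped leaf shows through.
almost_exact = dp_leaf_value(grads, clip=1.0, epsilon=1e6,
                             rng=np.random.default_rng(0))
```

A tighter clip bound means a smaller sensitivity and hence less noise at the same epsilon, which is exactly why the paper's tighter bounds translate into better accuracy.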
Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
Recently, large language models (LLMs) have made significant advancements in
natural language understanding and generation. However, their potential in
computer vision remains largely unexplored. In this paper, we introduce a new,
exploratory approach that enables LLMs to process images using the Scalable
Vector Graphics (SVG) format. By leveraging the XML-based textual descriptions
of SVG representations instead of raster images, we aim to bridge the gap
between the visual and textual modalities, allowing LLMs to directly understand
and manipulate images without the need for parameterized visual components. Our
method facilitates simple image classification, generation, and in-context
learning using only LLM capabilities. We demonstrate the promise of our
approach across discriminative and generative tasks, highlighting its (i)
robustness against distribution shift, (ii) substantial improvements achieved
by tapping into the in-context learning abilities of LLMs, and (iii) image
understanding and generation capabilities with human guidance. Our code, data,
and models are available at https://github.com/mu-cai/svg-llm.
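The core idea, feeding vector-graphics markup to a text-only model, can be sketched in a few lines. This is a toy illustration; the actual prompts and pipeline are in the linked repository.

```python
def svg_circle(cx, cy, r):
    """Build a tiny SVG image as plain XML text."""
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="64" height="64">'
            f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="black"/></svg>')

def classification_prompt(svg_code, labels):
    """Wrap the SVG markup in a text prompt a language model can consume."""
    return ("The following SVG code draws an image:\n"
            f"{svg_code}\n"
            f"Answer with the single best label from: {', '.join(labels)}.")

prompt = classification_prompt(svg_circle(32, 32, 20), ["circle", "square"])
print(prompt)
```

Because the image is encoded as XML text rather than pixels, the LLM can read, classify, and even edit it with no vision components at all.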
FedEdge AI-TC: A Semi-supervised Traffic Classification Method based on Trusted Federated Deep Learning for Mobile Edge Computing
As a typical entity of MEC (Mobile Edge Computing), 5G CPE (Customer Premise
Equipment)/HGU (Home Gateway Unit) has proven to be a promising alternative to
traditional Smart Home Gateways. Network Traffic Classification (TC) is a vital
method for service-quality assurance and security management in communication
networks and has become a crucial functional entity in 5G CPE/HGU. In recent
years, many researchers have applied Machine Learning or Deep Learning (DL) to
TC, namely AI-TC, to improve its performance. However, AI-TC faces challenges,
including data dependency, resource-intensive traffic labeling, and user
privacy concerns. The limited computing resources of 5G CPE further complicate
efficient classification. Moreover, the "black box" nature of AI-TC models
raises transparency and credibility issues. This paper proposes the FedEdge
AI-TC framework, which leverages Federated Learning (FL) for reliable network
TC in 5G CPE. FL preserves user privacy through local training, iterative
model-parameter exchange, and centralized aggregation. A semi-supervised TC
algorithm based on a Variational Auto-Encoder (VAE) and a convolutional neural
network (CNN) reduces data dependency while maintaining accuracy. To enable
lightweight model deployment, the paper introduces XAI-Pruning, an AI model
compression method combined with DL model interpretability. Experimental
evaluation demonstrates FedEdge AI-TC's superiority over benchmarks in both
accuracy and classification efficiency. The framework enhances user privacy
and model credibility, offering a comprehensive solution for dependable and
transparent network TC in 5G CPE, thus enhancing service quality and security.
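The XAI-Pruning idea, compressing a model using interpretability signals, can be sketched as importance-guided pruning. The importance scores below are supplied directly as assumed inputs; in the paper they would be derived from model explanations, and none of these names come from the paper.

```python
import numpy as np

def prune_by_importance(weights, importance, keep_ratio=0.5):
    """Zero out the least-important weights (illustrative sketch only)."""
    k = max(1, int(len(weights) * keep_ratio))
    keep = np.argsort(importance)[-k:]   # indices of the k most important weights
    pruned = np.zeros_like(weights)
    pruned[keep] = weights[keep]         # everything else is pruned to zero
    return pruned

w = np.array([0.5, -1.2, 0.05, 2.0])
scores = np.array([0.9, 0.1, 0.2, 0.8])  # assumed explanation-derived importances
w_pruned = prune_by_importance(w, scores, keep_ratio=0.5)
print(w_pruned)
```

Pruning by explanation-derived importance rather than raw magnitude can retain small weights that matter for the model's decisions, which is the motivation for coupling compression with interpretability.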
Early Detection of Disease using Electronic Health Records and Fisher's Wishart Discriminant Analysis
Linear Discriminant Analysis (LDA) is a simple and effective technique for pattern classification and is widely used for early detection of diseases from Electronic Health Record (EHR) data. However, the performance of LDA for EHR data classification is frequently affected by two main factors: ill-posed estimation of the LDA parameters (e.g., the covariance matrix) and linear inseparability of the EHR data. To handle these two issues, this paper proposes a novel classifier, FWDA (Fisher's Wishart Discriminant Analysis), developed as a fast and robust nonlinear classifier. Specifically, FWDA first models the distribution of potential inverse covariance matrix estimates with a Wishart distribution estimated from the training data. FWDA then samples a group of inverse covariance matrices from this Wishart distribution, makes predictions with LDA classifiers based on the sampled inverse covariance matrices, and weight-averages the predictions via a Bayesian voting scheme. The voting weights are optimally updated to adapt to each new input, enabling nonlinear classification.
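The sampling-and-voting core of FWDA can be sketched with SciPy's Wishart distribution. The class means and Wishart hyperparameters below are toy assumptions; the paper additionally fits the Wishart to the training data and uses adaptive Bayesian voting weights rather than the uniform majority vote shown here.

```python
import numpy as np
from scipy.stats import wishart

d = 2
mu0, mu1 = np.array([-1.0, 0.0]), np.array([1.0, 0.0])  # assumed class means
# Wishart distribution over inverse-covariance candidates (toy hyperparameters).
prec_dist = wishart(df=d + 2, scale=np.eye(d) / (d + 2))

def fwda_predict(x, n_samples=50, seed=1):
    """Sample inverse covariances, classify with each LDA, majority-vote."""
    votes = 0
    for prec in prec_dist.rvs(n_samples, random_state=seed):
        w_vec = prec @ (mu1 - mu0)       # LDA direction for this sampled precision
        b = -0.5 * (mu0 + mu1) @ w_vec   # threshold at the class midpoint
        votes += int(x @ w_vec + b > 0)
    return int(votes > n_samples / 2)

label_pos = fwda_predict(np.array([2.0, 0.0]))
label_neg = fwda_predict(np.array([-2.0, 0.0]))
```

Averaging over sampled precision matrices instead of committing to one point estimate is what makes the classifier robust when the covariance estimate is ill-posed.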
In Search of the Long-Tail: Systematic Generation of Long-Tail Knowledge via Logical Rule Guided Search
Since large language models have approached human-level performance on many
tasks, it has become increasingly harder for researchers to find tasks that are
still challenging to the models. Failure cases usually come from the long-tail
distribution - data that an oracle language model could assign a probability on
the lower end of its distribution. Current methodologies such as prompt
engineering and crowdsourcing are insufficient for creating long-tail examples
because humans are constrained by cognitive biases. We propose a
Logic-Induced-Knowledge-Search (LINK) framework for systematically generating
long-tail knowledge statements. Grounded by a symbolic rule, we search for
long-tail values for each variable of the rule by first prompting an LLM, then
verifying the correctness of the values with a critic, and lastly pushing for
the long-tail distribution with a reranker. With this framework we construct a
dataset, Logic-Induced-Long-Tail (LINT), consisting of 200 symbolic rules and
50K knowledge statements spanning four domains. Human annotation finds
that 84% of the statements in LINT are factually correct. In contrast, ChatGPT
and GPT4 struggle to directly generate long-tail statements under the
guidance of logic rules, with only 56% and 78% of their statements correct,
respectively. Moreover, their "long-tail" generations in fact fall into the higher
likelihood range, and thus are not really long-tail. Our findings suggest that
LINK is effective for generating data in the long-tail distribution while
enforcing quality. LINT can be useful for systematically evaluating LLMs'
capabilities in the long-tail distribution. We challenge the models with a
simple entailment classification task using samples from LINT. We find that
ChatGPT's and GPT4's capability to identify incorrect knowledge drops by ~3% in
the long-tail distribution compared to the head distribution.
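The propose-verify-rerank loop can be sketched with stub functions standing in for the LLM proposer, the critic, and the likelihood-based reranker. All three stubs are toy stand-ins, not the paper's actual components.

```python
def link_search(propose, verify, rerank, slots, k=2):
    """For each rule variable: propose candidate values, keep only the ones
    the critic verifies, rerank toward the long tail, and keep the top k."""
    filled = {}
    for slot in slots:
        candidates = [v for v in propose(slot) if verify(slot, v)]
        filled[slot] = rerank(candidates)[:k]
    return filled

# Toy stand-ins: LINK uses an LLM proposer, a critic model, and a
# likelihood-based reranker in place of these.
propose = lambda slot: ["paris", "zanzibar city", "not-a-city"]
verify = lambda slot, v: v != "not-a-city"                 # critic rejects bad values
rerank = lambda vals: sorted(vals, key=len, reverse=True)  # crude "rarity" proxy

result = link_search(propose, verify, rerank, ["capital_city"])
print(result)
```

The separation of roles is the point: the proposer supplies breadth, the critic enforces factual correctness, and the reranker pushes the surviving values toward the low-likelihood tail.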