
    On Knowledge Editing in Federated Learning: Perspectives, Challenges, and Future Directions

    As Federated Learning (FL) has gained increasing attention, it has become widely acknowledged that straightforwardly applying stochastic gradient descent (SGD) to the overall framework when learning over a sequence of tasks results in the phenomenon known as "catastrophic forgetting". Consequently, much FL research has centered on devising federated incremental learning methods to alleviate forgetting while augmenting knowledge. On the other hand, forgetting is not always detrimental. Selective amnesia, also known as federated unlearning, which entails the elimination of specific knowledge, can address privacy concerns and create additional "space" for acquiring new knowledge. However, there is a scarcity of extensive surveys that encompass recent advancements and provide a thorough examination of this issue. In this manuscript, we present an extensive survey on the topic of knowledge editing (augmentation/removal) in Federated Learning, with the goal of summarizing the state-of-the-art research and expanding the perspective for various domains. First, we introduce an integrated paradigm, referred to as Federated Editable Learning (FEL), by reevaluating the entire lifecycle of FL. Second, we provide a comprehensive overview of existing methods, evaluate their position within the proposed paradigm, and emphasize the current challenges they face. Lastly, we explore potential avenues for future research and identify unresolved issues. Comment: 7 pages, 1 figure, 2 tables
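
    The forgetting this survey targets is easy to reproduce: under plain FedAvg, clients train only on the current task, so the averaged model drifts toward each new task's optimum. Below is a minimal Python sketch of that setup, assuming a linear least-squares model; all names (local_sgd, fedavg_round, make_client) are invented for illustration and are not code from the paper.

        # Minimal sketch: plain FedAvg applied to a sequence of tasks.
        import numpy as np

        def local_sgd(w, X, y, lr=0.1, steps=20):
            """Plain SGD on one client's least-squares loss for the current task."""
            for _ in range(steps):
                w = w - lr * (X.T @ (X @ w - y) / len(y))
            return w

        def fedavg_round(w_global, clients):
            """One communication round: clients train locally, the server averages."""
            return np.mean([local_sgd(w_global.copy(), X, y) for X, y in clients], axis=0)

        def make_client(rng, w_star, n=32, d=5):
            X = rng.normal(size=(n, d))
            return X, X @ w_star                 # labels come from this task's optimum

        rng = np.random.default_rng(0)
        w = np.zeros(5)
        for task in range(3):                    # tasks arrive one after another
            w_star = rng.normal(size=5)          # each task has a different optimum
            clients = [make_client(rng, w_star) for _ in range(4)]
            for _ in range(10):
                w = fedavg_round(w, clients)
            # w now tracks the latest w_star; nothing preserves the solutions of
            # earlier tasks -- the forgetting this survey sets out to address.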

    Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

    Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark inhibits the design of effective, efficient, and economical LLM-based Text-to-SQL solutions. To address this challenge, in this paper we first conduct a systematic and extensive comparison of existing prompt engineering methods, covering question representation, example selection, and example organization, and with these experimental results we elaborate on their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLMs, we investigate them in various scenarios and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning. Additionally, toward an efficient and economical LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs and inspires further investigation and broad applications. Comment: We have released code at https://github.com/BeachWang/DAIL-SQL
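
    The three prompt-engineering dimensions the paper compares can be made concrete with a short sketch. The Python below is illustrative only: the helper names (represent_question, select_examples, organize_examples, build_prompt) and the crude word-overlap similarity are assumptions, not DAIL-SQL's actual implementation.

        # Illustrative sketch of the three prompt dimensions compared in the paper.

        def represent_question(schema: str, question: str) -> str:
            """Question representation: schema first, then question and cue as SQL comments."""
            return f"{schema}\n-- Question: {question}\n-- SQL:"

        def select_examples(question: str, pool: list, k: int = 3) -> list:
            """Example selection: the k most similar pool questions
            (token overlap stands in for a learned similarity measure)."""
            overlap = lambda ex: len(set(question.split()) & set(ex["question"].split()))
            return sorted(pool, key=overlap, reverse=True)[:k]

        def organize_examples(examples: list) -> str:
            """Example organization: keep the full question-SQL pair for each shot."""
            return "\n\n".join(f"-- Question: {ex['question']}\n{ex['sql']}" for ex in examples)

        def build_prompt(schema: str, question: str, pool: list) -> str:
            shots = organize_examples(select_examples(question, pool))
            return shots + "\n\n" + represent_question(schema, question)

    Under this framing, the token-efficiency question the paper raises reduces to how much schema text and how many shots build_prompt packs into the context for a given accuracy.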

    Unicron: Economizing Self-Healing LLM Training at Scale

    Training large-scale language models is increasingly critical in various domains, but it is hindered by frequent failures, leading to significant time and economic costs. Current failure recovery methods in cloud-based settings inadequately address the diverse and complex scenarios that arise, focusing narrowly on reducing downtime for individual tasks without considering the overall cost impact on a cluster. We introduce Unicron, a workload manager designed for efficient self-healing in large-scale language model training. Unicron optimizes the training process by minimizing failure-related costs across multiple concurrent tasks within a cluster. Its key features include in-band error detection for real-time error identification without extra overhead, a dynamic cost-aware plan generation mechanism for optimal reconfiguration, and an efficient transition strategy that reduces downtime during state changes. Deployed on a 128-GPU distributed cluster, Unicron demonstrates up to a 1.9x improvement in training efficiency over state-of-the-art methods, significantly reducing failure recovery costs and enhancing the reliability of large-scale language model training.
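
    The cost-aware plan generation can be pictured as choosing, at failure time, the reconfiguration with the lowest estimated total cost. The Python sketch below is a toy stand-in under assumed names (Plan, expected_cost) and a single-task cost model; Unicron's actual mechanism reasons over multiple concurrent tasks.

        # Toy cost-aware recovery decision in the spirit of Unicron's plan generation.
        from dataclasses import dataclass

        @dataclass
        class Plan:
            name: str
            transition_s: float   # downtime while switching to this configuration
            throughput: float     # training throughput afterwards, as a fraction of baseline

        def expected_cost(plan: Plan, horizon_s: float) -> float:
            """Work lost = full stall during the transition, plus reduced throughput
            until the failed nodes are expected back (horizon_s)."""
            stall = plan.transition_s
            slowdown = (horizon_s - plan.transition_s) * (1.0 - plan.throughput)
            return stall + max(slowdown, 0.0)

        def choose_plan(plans, horizon_s):
            return min(plans, key=lambda p: expected_cost(p, horizon_s))

        plans = [
            Plan("wait-for-repair", transition_s=0.0, throughput=0.0),    # idle until fixed
            Plan("shrink-to-120-gpus", transition_s=90.0, throughput=0.9),
        ]
        best = choose_plan(plans, horizon_s=1800.0)   # picks the shrink plan here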

    TouchStone: Evaluating Vision-Language Models by Language Models

    Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting a visual receptor with large language models (LLMs). However, current assessments mainly focus on recognition and reasoning abilities, lacking direct evaluation of conversational skills and neglecting visual storytelling abilities. In this paper, we propose an evaluation method that uses strong LLMs as judges to comprehensively evaluate the various abilities of LVLMs. First, we construct TouchStone, a comprehensive visual dialogue dataset consisting of open-world images and questions, covering five major categories of abilities and 27 subtasks. The dataset covers not only fundamental recognition and comprehension but also literary creation. Second, by integrating detailed image annotations, we effectively transform the multimodal input content into a form understandable by LLMs. This enables us to employ advanced LLMs to directly evaluate the quality of multimodal dialogue without human intervention. Through validation, we demonstrate that powerful LLMs, such as GPT-4, can effectively score dialogue quality by leveraging their textual capabilities alone, aligning with human preferences. We hope our work can serve as a touchstone for LVLMs' evaluation and pave the way for building stronger LVLMs. The evaluation code is available at https://github.com/OFA-Sys/TouchStone.
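
    The core evaluation step, substituting detailed human-written image annotations for the image itself so that a text-only judge can grade an answer, fits in a few lines. In the Python sketch below, the prompt wording is invented and ask_llm is any text-completion client the caller supplies (e.g., a GPT-4 wrapper); this is not TouchStone's released code.

        # Sketch of the text-only judging step; prompt wording and ask_llm are assumed.

        def judge(ask_llm, annotation: str, question: str, answer: str) -> int:
            """Grade one LVLM answer with a text-only LLM judge, using the
            human-written annotation as a stand-in for the image."""
            prompt = (
                "You are a strict grader.\n"
                f"Image description (human-written annotation): {annotation}\n"
                f"Question: {question}\n"
                f"Model answer: {answer}\n"
                "Score the answer from 1 to 10 for accuracy and helpfulness. "
                "Reply with only the number."
            )
            return int(ask_llm(prompt).strip())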

    Gelatin-based biomaterials and gelatin as an additive for chronic wound repair

    Disturbing or disrupting the regular healing process of a skin wound may cause it to progress to a chronic state. Chronic wounds are prone to infection because of their long healing time, malnutrition, and insufficient oxygen supply, all of which further impair wound progression. Gelatin, a derivative of natural collagen, is widely used in biomedical fields because of its low cost, wide availability, biocompatibility, and degradability. However, gelatin alone offers weak mechanical strength and poor antibacterial activity, and improving these shortcomings remains a central challenge for gelatin-based biomaterials. In chronic wounds, gelatin-based biomaterials can promote hemostasis, enhance peri-wound antibacterial and anti-inflammatory activity, and promote vascular and epithelial cell regeneration. In this article, we first introduce the natural process of wound healing. Second, we present the role of gelatin-based biomaterials and of gelatin as an additive in wound healing. Finally, we discuss the future implications of gelatin-based biomaterials.

    Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

    We introduce the Qwen-VL series, a set of large-scale vision-language models designed to perceive and understand both text and images. Comprising Qwen-VL and Qwen-VL-Chat, these models exhibit remarkable performance on tasks such as image captioning, question answering, visual localization, and flexible interaction. Our evaluation covers a wide range of tasks, including zero-shot captioning, visual or document visual question answering, and grounding. We demonstrate that Qwen-VL outperforms existing Large Vision-Language Models (LVLMs). We present the models' architecture, training, capabilities, and performance, highlighting their contributions to advancing multimodal artificial intelligence. Code, demo, and models are available at https://github.com/QwenLM/Qwen-VL.
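
    For readers who want to try the released checkpoints, the repository's README documents a chat interface via Hugging Face transformers. The sketch below follows that pattern; methods such as from_list_format and chat are provided by the model's trust_remote_code implementation and may change between versions.

        # Follows the usage pattern in the QwenLM/Qwen-VL README (remote code).
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(
            "Qwen/Qwen-VL-Chat", device_map="auto", trust_remote_code=True
        ).eval()

        query = tokenizer.from_list_format([
            {"image": "demo.jpeg"},   # local path or URL to an image
            {"text": "Describe this image and locate the main object."},
        ])
        response, history = model.chat(tokenizer, query=query, history=None)
        print(response)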

    Bis{4,4′-[oxalylbis(azanediyl)]dipyridinium} octamolybdate

    In the crystal structure of the title compound, (C12H12N4O2)2[Mo8O26], the amino and pyridinium groups of the N1,N2-di(pyridinium-4-yl)oxalamide cations are hydrogen bonded to the O atoms of the centrosymmetric isopolyoxometalate β-[Mo8O26]4− anions, forming a three-dimensional supramolecular architecture.