
    On Knowledge Editing in Federated Learning: Perspectives, Challenges, and Future Directions

    As Federated Learning (FL) has gained increasing attention, it has become widely acknowledged that straightforwardly applying stochastic gradient descent (SGD) to the overall framework when learning over a sequence of tasks results in the phenomenon known as "catastrophic forgetting". Consequently, much FL research has centered on devising federated incremental learning methods to alleviate forgetting while augmenting knowledge. On the other hand, forgetting is not always detrimental. Selective amnesia, also known as federated unlearning, which entails the elimination of specific knowledge, can address privacy concerns and create additional "space" for acquiring new knowledge. However, there is a scarcity of extensive surveys that encompass recent advancements and provide a thorough examination of this issue. In this manuscript, we present an extensive survey on the topic of knowledge editing (augmentation/removal) in Federated Learning, with the goal of summarizing the state-of-the-art research and expanding the perspective for various domains. First, we introduce an integrated paradigm, referred to as Federated Editable Learning (FEL), by reevaluating the entire lifecycle of FL. Second, we provide a comprehensive overview of existing methods, evaluate their position within the proposed paradigm, and emphasize the current challenges they face. Lastly, we explore potential avenues for future research and identify unresolved issues. Comment: 7 pages, 1 figure, 2 tables
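
    The forgetting this survey targets is easy to reproduce: under plain FedAvg, clients train only on the current task, so the averaged model drifts toward each new task's optimum. Below is a minimal Python sketch of that setup, assuming a linear least-squares model; all names (local_sgd, fedavg_round, make_client) are invented for illustration and are not code from the paper.

        # Minimal sketch: plain FedAvg applied to a sequence of tasks.
        import numpy as np

        def local_sgd(w, X, y, lr=0.1, steps=20):
            """Plain SGD on one client's least-squares loss for the current task."""
            for _ in range(steps):
                w = w - lr * (X.T @ (X @ w - y) / len(y))
            return w

        def fedavg_round(w_global, clients):
            """One communication round: clients train locally, the server averages."""
            return np.mean([local_sgd(w_global.copy(), X, y) for X, y in clients], axis=0)

        def make_client(rng, w_star, n=32, d=5):
            X = rng.normal(size=(n, d))
            return X, X @ w_star                 # labels come from this task's optimum

        rng = np.random.default_rng(0)
        w = np.zeros(5)
        for task in range(3):                    # tasks arrive one after another
            w_star = rng.normal(size=5)          # each task has a different optimum
            clients = [make_client(rng, w_star) for _ in range(4)]
            for _ in range(10):
                w = fedavg_round(w, clients)
            # w now tracks the latest w_star; nothing preserves the solutions of
            # earlier tasks -- the forgetting this survey sets out to address.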

    Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

    Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark inhibits the design of effective, efficient, and economical LLM-based Text-to-SQL solutions. To address this challenge, in this paper we first conduct a systematic and extensive comparison of existing prompt engineering methods, covering question representation, example selection, and example organization, and with these experimental results we elaborate on their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLMs, we investigate them in various scenarios and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning. Additionally, toward an efficient and economical LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs and inspires further investigation and broad applications. Comment: We have released code at https://github.com/BeachWang/DAIL-SQL
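
    The three prompt-engineering dimensions the paper compares can be made concrete with a short sketch. The Python below is illustrative only: the helper names (represent_question, select_examples, organize_examples, build_prompt) and the crude word-overlap similarity are assumptions, not DAIL-SQL's actual implementation.

        # Illustrative sketch of the three prompt dimensions compared in the paper.

        def represent_question(schema: str, question: str) -> str:
            """Question representation: schema first, then question and cue as SQL comments."""
            return f"{schema}\n-- Question: {question}\n-- SQL:"

        def select_examples(question: str, pool: list, k: int = 3) -> list:
            """Example selection: the k most similar pool questions
            (token overlap stands in for a learned similarity measure)."""
            overlap = lambda ex: len(set(question.split()) & set(ex["question"].split()))
            return sorted(pool, key=overlap, reverse=True)[:k]

        def organize_examples(examples: list) -> str:
            """Example organization: keep the full question-SQL pair for each shot."""
            return "\n\n".join(f"-- Question: {ex['question']}\n{ex['sql']}" for ex in examples)

        def build_prompt(schema: str, question: str, pool: list) -> str:
            shots = organize_examples(select_examples(question, pool))
            return shots + "\n\n" + represent_question(schema, question)

    Under this framing, the token-efficiency question the paper raises reduces to how much schema text and how many shots build_prompt packs into the context for a given accuracy.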

    Unicron: Economizing Self-Healing LLM Training at Scale

    Training large-scale language models is increasingly critical in various domains, but it is hindered by frequent failures, leading to significant time and economic costs. Current failure recovery methods in cloud-based settings inadequately address the diverse and complex scenarios that arise, focusing narrowly on reducing downtime for individual tasks without considering the overall cost impact on a cluster. We introduce Unicron, a workload manager designed for efficient self-healing in large-scale language model training. Unicron optimizes the training process by minimizing failure-related costs across multiple concurrent tasks within a cluster. Its key features include in-band error detection for real-time error identification without extra overhead, a dynamic cost-aware plan generation mechanism for optimal reconfiguration, and an efficient transition strategy that reduces downtime during state changes. Deployed on a 128-GPU distributed cluster, Unicron demonstrates up to a 1.9x improvement in training efficiency over state-of-the-art methods, significantly reducing failure recovery costs and enhancing the reliability of large-scale language model training.
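
    The cost-aware plan generation can be pictured as choosing, at failure time, the reconfiguration with the lowest estimated total cost. The Python sketch below is a toy stand-in under assumed names (Plan, expected_cost) and a single-task cost model; Unicron's actual mechanism reasons over multiple concurrent tasks.

        # Toy cost-aware recovery decision in the spirit of Unicron's plan generation.
        from dataclasses import dataclass

        @dataclass
        class Plan:
            name: str
            transition_s: float   # downtime while switching to this configuration
            throughput: float     # training throughput afterwards, as a fraction of baseline

        def expected_cost(plan: Plan, horizon_s: float) -> float:
            """Work lost = full stall during the transition, plus reduced throughput
            until the failed nodes are expected back (horizon_s)."""
            stall = plan.transition_s
            slowdown = (horizon_s - plan.transition_s) * (1.0 - plan.throughput)
            return stall + max(slowdown, 0.0)

        def choose_plan(plans, horizon_s):
            return min(plans, key=lambda p: expected_cost(p, horizon_s))

        plans = [
            Plan("wait-for-repair", transition_s=0.0, throughput=0.0),    # idle until fixed
            Plan("shrink-to-120-gpus", transition_s=90.0, throughput=0.9),
        ]
        best = choose_plan(plans, horizon_s=1800.0)   # picks the shrink plan here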

    TouchStone: Evaluating Vision-Language Models by Language Models

    Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting a visual receptor with large language models (LLMs). However, current assessments mainly focus on recognition and reasoning abilities, lacking direct evaluation of conversational skills and neglecting visual storytelling abilities. In this paper, we propose an evaluation method that uses strong LLMs as judges to comprehensively evaluate the various abilities of LVLMs. First, we construct TouchStone, a comprehensive visual dialogue dataset consisting of open-world images and questions, covering five major categories of abilities and 27 subtasks. The dataset covers not only fundamental recognition and comprehension but also literary creation. Second, by integrating detailed image annotations, we effectively transform the multimodal input content into a form understandable by LLMs. This enables us to employ advanced LLMs to directly evaluate the quality of multimodal dialogue without human intervention. Through validation, we demonstrate that powerful LLMs, such as GPT-4, can effectively score dialogue quality by leveraging their textual capabilities alone, aligning with human preferences. We hope our work can serve as a touchstone for LVLMs' evaluation and pave the way for building stronger LVLMs. The evaluation code is available at https://github.com/OFA-Sys/TouchStone.
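
    The core evaluation step, substituting detailed human-written image annotations for the image itself so that a text-only judge can grade an answer, fits in a few lines. In the Python sketch below, the prompt wording is invented and ask_llm is any text-completion client the caller supplies (e.g., a GPT-4 wrapper); this is not TouchStone's released code.

        # Sketch of the text-only judging step; prompt wording and ask_llm are assumed.

        def judge(ask_llm, annotation: str, question: str, answer: str) -> int:
            """Grade one LVLM answer with a text-only LLM judge, using the
            human-written annotation as a stand-in for the image."""
            prompt = (
                "You are a strict grader.\n"
                f"Image description (human-written annotation): {annotation}\n"
                f"Question: {question}\n"
                f"Model answer: {answer}\n"
                "Score the answer from 1 to 10 for accuracy and helpfulness. "
                "Reply with only the number."
            )
            return int(ask_llm(prompt).strip())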

    Gelatin-based biomaterials and gelatin as an additive for chronic wound repair

    Disturbing or disrupting the regular healing process of a skin wound may cause it to progress to a chronic state. Chronic wounds are prone to infection because of their long healing time, malnutrition, and insufficient oxygen supply, all of which further impair wound progression. Gelatin, a derivative of natural collagen, is widely used in biomedical fields because of its low cost, wide availability, biocompatibility, and degradability. However, gelatin alone offers weak mechanical strength and poor antibacterial activity, and improving these shortcomings remains a central challenge for gelatin-based biomaterials. In chronic wounds, gelatin-based biomaterials can promote hemostasis, enhance peri-wound antibacterial and anti-inflammatory activity, and promote vascular and epithelial cell regeneration. In this article, we first introduce the natural process of wound healing. Second, we present the role of gelatin-based biomaterials and of gelatin as an additive in wound healing. Finally, we discuss the future implications of gelatin-based biomaterials.

    Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

    We introduce the Qwen-VL series, a set of large-scale vision-language models designed to perceive and understand both text and images. Comprising Qwen-VL and Qwen-VL-Chat, these models exhibit remarkable performance on tasks such as image captioning, question answering, visual localization, and flexible interaction. Our evaluation covers a wide range of tasks, including zero-shot captioning, visual or document visual question answering, and grounding. We demonstrate that Qwen-VL outperforms existing Large Vision-Language Models (LVLMs). We present the models' architecture, training, capabilities, and performance, highlighting their contributions to advancing multimodal artificial intelligence. Code, demo, and models are available at https://github.com/QwenLM/Qwen-VL.
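
    For readers who want to try the released checkpoints, the repository's README documents a chat interface via Hugging Face transformers. The sketch below follows that pattern; methods such as from_list_format and chat are provided by the model's trust_remote_code implementation and may change between versions.

        # Follows the usage pattern in the QwenLM/Qwen-VL README (remote code).
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)
        model = AutoModelForCausalLM.from_pretrained(
            "Qwen/Qwen-VL-Chat", device_map="auto", trust_remote_code=True
        ).eval()

        query = tokenizer.from_list_format([
            {"image": "demo.jpeg"},   # local path or URL to an image
            {"text": "Describe this image and locate the main object."},
        ])
        response, history = model.chat(tokenizer, query=query, history=None)
        print(response)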

    Bis{4,4′-[oxalylbis(azanediyl)]dipyridinium} octamolybdate

    In the crystal structure of the title compound, (C12H12N4O2)2[Mo8O26], the amino and pyridinium groups of the N1,N2-di(pyridinium-4-yl)oxalamide cations are hydrogen bonded to the O atoms of the centrosymmetric isopolyoxometalate β-[Mo8O26]4− anions, forming a three-dimensional supramolecular architecture.