    Continual Instruction Tuning for Large Multimodal Models

    Instruction tuning is now a widely adopted approach for aligning large multimodal models (LMMs) with human intent. It unifies the data format of vision-language tasks, enabling multi-task joint training. However, new vision-language tasks are constantly being created in practice. Instead of re-training LMMs from scratch whenever new tasks arrive, continual learning offers the flexibility to continually and efficiently exploit the evolving data. This work explores two questions: 1) Do LMMs still suffer from catastrophic forgetting in continual instruction tuning? 2) Are the three existing classes of continual learning methods still applicable to the continual instruction tuning of LMMs? We conduct an extensive study to address these questions. First, we establish the first benchmark in this setting and show that catastrophic forgetting is still observed when LMMs are continually instruction-tuned; however, multi-task joint instruction tuning can strengthen the model's continual learning ability and mitigate forgetting. Second, we integrate and adapt classic continual learning methods to our context, demonstrating the efficacy of data replay and model expansion strategies across diverse scenarios, whereas regularization-based methods perform well only on models that have first been jointly instruction-tuned on multiple tasks. Third, we delve into the correlation and forgetting dynamics between vision-language task pairs and propose task-similarity-informed regularization and model expansion methods for continual instruction tuning of LMMs. Experimental results show that our approach consistently boosts the model's performance.
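
    As a rough illustration of the regularization-based family of methods discussed in this abstract, the sketch below shows an EWC-style quadratic penalty whose strength is scaled by a task-similarity score. The function name, the Fisher-weighted form, and the similarity scaling are illustrative assumptions for this listing, not the paper's exact formulation.

```python
# Hedged sketch (assumptions, not the paper's method): an EWC-style penalty
# for continual instruction tuning in PyTorch, scaled by a hypothetical
# task-similarity score in [0, 1] so that dissimilar tasks are constrained less.
import torch

def similarity_weighted_penalty(model, old_params, fisher, task_similarity, lam=1.0):
    """Quadratic penalty that discourages drifting from the parameters learned
    on the previous task; `fisher` holds per-parameter importance estimates."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if name in old_params:
            penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam * task_similarity * penalty

# Typical use inside a training step (task_loss is the instruction-tuning loss):
# loss = task_loss + similarity_weighted_penalty(model, old_params, fisher, 0.7)
```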

    MathNAS: If Blocks Have a Role in Mathematical Architecture Design

    Neural Architecture Search (NAS) has emerged as a favoured method for discovering effective neural architectures. The recent development of large models has intensified the demand for faster search speeds and more accurate search results. However, designing large models by NAS is challenging due to the dramatic increase in search space size and the associated huge performance evaluation cost. Consider a typical modular search space widely used in NAS, in which a neural architecture consists of m block nodes and each block node has n alternative blocks. Facing a space containing n^m candidate networks, existing NAS methods attempt to find the best one by searching and evaluating candidate networks directly. Different from this general strategy, which treats architecture search as a whole problem, we propose a novel divide-and-conquer strategy that exploits the modular nature of the search space. Here, we introduce MathNAS, a general NAS framework based on mathematical programming. In MathNAS, the performances of the m*n possible building blocks in the search space are calculated first, and the performance of a network is then predicted directly from the performances of its building blocks. Although estimating block performances involves network training, just as network performance evaluation does in existing NAS methods, predicting network performance is completely training-free and thus extremely fast. In contrast to the n^m candidate networks that existing NAS methods must evaluate, which requires training and imposes a formidable computational burden, there are only m*n possible blocks to handle in MathNAS. Therefore, our approach effectively reduces the complexity of network performance evaluation. Our code is available at https://github.com/wangqinsi1/MathNAS. Comment: NeurIPS 202
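
    To make the block-level idea above concrete, here is a minimal sketch under the simplifying assumption that a network's performance can be predicted additively from per-block scores; the function names and toy numbers are illustrative, and the actual MathNAS framework formulates the selection as a mathematical program.

```python
# Minimal sketch (additive assumption, toy numbers): once the m*n block
# scores are known, the best of the n^m candidate networks can be predicted
# without training any full network.

def best_network(block_scores):
    """block_scores[i][j]: estimated contribution of block option j at node i.
    Under an additive predictor, the best choice decomposes per block node."""
    return [max(range(len(options)), key=lambda j: options[j])
            for options in block_scores]

def predicted_performance(block_scores, choice):
    """Predicted network performance under the additive assumption."""
    return sum(block_scores[i][j] for i, j in enumerate(choice))

# Example: m = 3 block nodes, n = 2 alternative blocks per node.
scores = [[0.61, 0.58], [0.40, 0.47], [0.55, 0.52]]
choice = best_network(scores)                         # -> [0, 1, 0]
print(choice, predicted_performance(scores, choice))  # -> [0, 1, 0] 1.63
```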

    JIGSAW: Efficient and Scalable Path Constraints Fuzzing

    Coverage-guided testing has been shown to be an effective way to find bugs. If we model coverage-guided testing as a search problem (i.e., finding inputs that can cover more branches), then its efficiency depends mainly on two factors: (1) the accuracy of the search algorithm and (2) the number of inputs that can be evaluated per unit of time. Improving the search throughput is therefore an effective way to improve the performance of coverage-guided testing. In this work, we present a novel design that improves search throughput: evaluating newly generated inputs with JIT-compiled path constraints. This approach allows us to significantly improve single-thread throughput as well as to scale to multiple cores. We also developed several optimization techniques to eliminate major bottlenecks in this process. Evaluation of our prototype JIGSAW shows that our approach achieves three orders of magnitude higher search throughput than existing fuzzers and scales to multiple cores. We also find that, with such high throughput, a simple gradient-guided search heuristic can solve path constraints collected from a large set of real-world programs faster than SMT solvers with much more sophisticated search heuristics. An evaluation of end-to-end coverage-guided testing also shows that our JIGSAW-powered hybrid fuzzer can outperform state-of-the-art testing tools.
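
    As a hedged illustration of the gradient-guided search mentioned above (not JIGSAW's actual implementation, which operates on JIT-compiled native constraint functions), the sketch below treats a path constraint as a non-negative distance function over input bytes and greedily walks it down to zero.

```python
# Hedged sketch (assumption-laden, not JIGSAW's code): treat a path constraint
# as a distance function that is 0 exactly when the constraint is satisfied,
# probe each byte in both directions, and greedily move downhill.
def solve_constraint(distance_fn, x, max_iters=1000):
    """distance_fn: maps a list of byte values to a non-negative distance,
    e.g. abs(a - b) for the branch condition `a == b`.
    x: initial input bytes (list of ints in [0, 255])."""
    best = distance_fn(x)
    for _ in range(max_iters):
        if best == 0:
            return x  # constraint satisfied
        improved = False
        for i in range(len(x)):
            for delta in (1, -1):
                cand = list(x)
                cand[i] = min(255, max(0, cand[i] + delta))
                d = distance_fn(cand)
                if d < best:
                    x, best, improved = cand, d, True
        if not improved:
            break  # stuck in a local minimum; a real fuzzer would restart
    return x if best == 0 else None

# Toy usage: find two bytes encoding the 16-bit constant 0x1234.
print(solve_constraint(lambda b: abs(b[0] * 256 + b[1] - 0x1234), [0, 0]))
```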

    From Economic Cooperation to Strategic Competition: Understanding the US-China Trade Disputes Through the Transformed Relations

    This article investigates the escalation of US-China trade disputes and the implications for Sino-US relations. Both structural realism and liberal institutionalism have failed to pay sufficient attention to the evolution of the US-China economic relationship, and this article strives to highlight this crucial issue. The article employs a historical perspective to examine the transformation of US-China economic relations in the twenty-first century. It argues that the US-China economic relationship evolved from a symbiotic but asymmetric one between 2001 and 2008 toward an increasingly competitive one after the 2008 global financial crisis, especially in the Trump-Xi era. The changing dynamics of US-China economic relations, as well as the shifting perceptions of each country's top leadership toward the other, create the impetus for the transformation of Sino-US relations. This article suggests that the recent trade tensions are embedded in the growing strategic competition between the two countries.

    The role of short-chain fatty acids produced by gut microbiota in the regulation of pre-eclampsia onset

    Background: Preeclampsia (PE) is a common pregnancy-related disorder characterized by disrupted maternal-fetal immune tolerance, involving diffuse inflammatory responses and vascular endothelial damage. Alterations in the gut microbiota (GM) during pregnancy can affect intestinal barrier function and immune balance. Aims and purpose: This comprehensive review aims to investigate the potential role of short-chain fatty acids (SCFAs), essential metabolites produced by the GM, in the development of PE. The purpose is to examine their impact on colonic peripheral regulatory T (Treg) cells, the pathogenic potential of antigen-specific helper T (Th) cells, and the inflammatory pathways associated with immune homeostasis. Key insights: An increasing body of evidence suggests that dysbiosis in the GM can lead to alterations in SCFA levels, which may significantly contribute to the development of PE. SCFAs enhance the number and function of colonic Treg cells, mitigate the pathogenic potential of GM-specific Th cells, and inhibit inflammatory progression, thereby maintaining immune homeostasis. These insights highlight the potential significance of GM dysregulation and SCFAs produced by the GM in the pathogenesis of PE. While the exact causes of PE remain elusive and definitive clinical treatments are lacking, the GM and SCFAs present promising avenues for future clinical applications related to PE, offering a novel approach for prophylaxis and therapy.