26 research outputs found

    CopyScope: Model-level Copyright Infringement Quantification in the Diffusion Workflow

    With the rapid development of diffusion models, web-based AI image generation has become an innovative art form capable of producing novel artworks. However, this new technique brings potential copyright infringement risks, as it may incorporate existing artworks without the owners' consent. Copyright infringement quantification is the primary and challenging step towards AI-generated image copyright traceability. Previous work focused only on data attribution from the training-data perspective, which is unsuitable for tracing and quantifying copyright infringement in practice for two reasons: (1) the training datasets are not always publicly available; (2) the model provider is the responsible party, not the image. Motivated by this, we propose CopyScope, a new framework to quantify the infringement of AI-generated images at the model level. We first rigorously identify pivotal components within the AI image generation pipeline. Then, we propose to use the Fréchet Inception Distance (FID) to capture image similarity in a way that naturally fits human perception. We further propose an FID-based Shapley algorithm to evaluate the infringement contribution among models. Extensive experiments demonstrate that our work not only reveals the intricacies of infringement quantification but also effectively depicts the infringing models quantitatively, thus promoting accountability in AI image-generation tasks.
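
    As a rough illustration of the Shapley step mentioned in the abstract above, the sketch below computes exact Shapley values for a small set of "players" (here, models in a generation pipeline). The value function is a hypothetical placeholder: the abstract does not say how FID is turned into a coalition score, so the toy numbers and the names `shapley_values`, `toy_scores`, "base" and "lora" are assumptions made purely for this example and are not results or identifiers from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small list of players.

    `value` maps a frozenset of players to a real-valued coalition score.
    The cost grows exponentially with the number of players.
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


# Made-up coalition scores (assumption): higher means "more similar to the
# protected image". A FID-based score would have to be derived separately,
# since a lower FID indicates closer image distributions.
toy_scores = {
    frozenset(): 0.0,
    frozenset({"base"}): 0.5,
    frozenset({"lora"}): 0.25,
    frozenset({"base", "lora"}): 1.0,
}

phi = shapley_values(["base", "lora"], toy_scores.__getitem__)
print(phi)
```

    With these toy scores the two contributions sum to the score of the full pipeline, which is the efficiency property that makes Shapley values a natural way to split a total among components.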

    EFFL: Egalitarian Fairness in Federated Learning for Mitigating Matthew Effect

    Recent advances in federated learning (FL) enable collaborative training of machine learning (ML) models from large-scale and widely dispersed clients while protecting their privacy. However, when different clients' datasets are heterogeneous, traditional FL mechanisms produce a global model that does not adequately represent the poorer clients with limited data resources, resulting in lower accuracy and higher bias on their local data. According to the Matthew effect, which describes how the advantaged gain more advantage and the disadvantaged lose more over time, deploying such a global model in client applications may worsen the resource disparity among the clients and harm the principles of social welfare and fairness. To mitigate the Matthew effect, we propose Egalitarian Fairness Federated Learning (EFFL), where egalitarian fairness means that the global model learned through FL has: (1) equal accuracy among clients; (2) equal decision bias among clients. Besides achieving egalitarian fairness among the clients, EFFL also aims for performance optimality, minimizing the empirical risk loss and the bias for each client; both are essential for any ML model training, whether centralized or decentralized. We formulate EFFL as a multi-constrained multi-objective optimization (MCMOO) problem, with the decision bias and egalitarian fairness as constraints and the minimization of the empirical risk losses on all clients as multiple objectives to be optimized. We propose a gradient-based three-stage algorithm to obtain the Pareto optimal solutions within the constraint space. Extensive experiments demonstrate that EFFL outperforms other state-of-the-art FL algorithms in achieving a high-performance global model with enhanced egalitarian fairness among all clients.

    ChatAnything: Facetime Chat with LLM-Enhanced Personas

    In this technical report, we target generating anthropomorphized personas for LLM-based characters in an online manner, including visual appearance, personality and tones, with only text descriptions. To achieve this, we first leverage the in-context learning capability of LLMs for personality generation by carefully designing a set of system prompts. We then propose two novel concepts: the mixture of voices (MoV) and the mixture of diffusers (MoD) for diverse voice and appearance generation. For MoV, we utilize text-to-speech (TTS) algorithms with a variety of pre-defined tones and automatically select the best match for the user-provided text description. For MoD, we combine recent popular text-to-image generation techniques and talking head algorithms to streamline the process of generating talking objects. We term the whole framework ChatAnything. With it, users can animate anything with any anthropomorphic persona using just a few text inputs. However, we have observed that the anthropomorphic objects produced by current generative models are often undetectable by pre-trained face landmark detectors, leading to failure of the face motion generation, even if these faces possess human-like appearances, because such images are rarely seen during training (i.e., they are out-of-distribution (OOD) samples). To address this issue, we incorporate pixel-level guidance to infuse human face landmarks during the image generation phase. To benchmark these metrics, we have built an evaluation dataset. Based on it, we verify that the detection rate of the face landmark is significantly increased from 57.0% to 92.5%, thus allowing automatic face animation based on generated speech content. The code and more results can be found at https://chatanything.github.io/

    DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

    In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managing computational costs when scaling up model parameters. However, conventional MoE architectures like GShard, which activate the top-$K$ out of $N$ experts, face challenges in ensuring expert specialization, i.e., each expert acquires non-overlapping and focused knowledge. In response, we propose the DeepSeekMoE architecture towards ultimate expert specialization. It involves two principal strategies: (1) finely segmenting the experts into $mN$ ones and activating $mK$ from them, allowing for a more flexible combination of activated experts; (2) isolating $K_s$ experts as shared ones, aiming at capturing common knowledge and mitigating redundancy in routed experts. Starting from a modest scale with 2B parameters, we demonstrate that DeepSeekMoE 2B achieves comparable performance with GShard 2.9B, which has 1.5 times the expert parameters and computation. In addition, DeepSeekMoE 2B nearly approaches the performance of its dense counterpart with the same number of total parameters, which sets the upper bound of MoE models. Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations. Further, our preliminary efforts to scale up DeepSeekMoE to 145B parameters consistently validate its substantial advantages over the GShard architecture, and show its performance comparable with DeepSeek 67B, using only 28.5% (maybe even 18.2%) of computations.
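
    The two strategies in the abstract above (fine-grained segmentation and shared experts) can be sketched for a single token as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the toy sizes (N = 4, K = 2, m = 2, K_s = 1), the router logits, the use of raw softmax probabilities as gate weights, and the choice to count the shared experts inside the mK activated experts are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, num_shared, num_active_routed):
    """Select experts for one token.

    router_logits: scores for the routed (non-shared) experts only.
    Returns (shared_ids, routed), where routed is a list of
    (expert_id, gate_weight) pairs for the top-scoring routed experts.
    Expert ids 0..num_shared-1 are reserved for the shared experts.
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top = sorted(ranked[:num_active_routed])
    shared_ids = list(range(num_shared))  # always active, no routing decision
    routed = [(num_shared + i, probs[i]) for i in top]
    return shared_ids, routed


# Toy configuration (assumed values, not from the paper).
N, K, m, K_s = 4, 2, 2, 1
total_experts = m * N            # 8 fine-grained experts
total_active = m * K             # 4 experts active per token
num_routed = total_experts - K_s # 7 experts compete in the router

# Finer segmentation enlarges the space of possible expert combinations.
coarse_combinations = math.comb(N, K)        # choose 2 of 4
fine_combinations = math.comb(m * N, m * K)  # choose 4 of 8

logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5]  # one score per routed expert
shared, routed = route_token(logits, K_s, total_active - K_s)
print(shared, [expert_id for expert_id, _ in routed])
print(coarse_combinations, fine_combinations)
```

    Even at this toy scale, splitting each expert in two raises the number of selectable expert combinations from 6 to 70, which is the kind of added flexibility the abstract refers to.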

    Comprehensive Analysis of the Relationship Between RAS and RAF Mutations and MSI Status of Colorectal Cancer in Northeastern China

    Background/Aims: Colorectal cancer (CRC) is mainly caused by chromosomal instability (CIN) and microsatellite instability (MSI). The RAS and RAF genes are essential components of the CIN pathway, and several studies have found that RAS and RAF mutations are associated with MSI status in CRC. Here, we examined these three factors in CRC in Northeast China and aimed to reveal new details of the relationship between these mutations and MSI status. Methods: This study involved 290 patients with CRC who had RAS or RAF gene mutation detected using fluorescence-based allele-specific polymerase chain reaction or Sanger sequencing. The majority of the identified patients were found to harbor MSI (MSI status). Accurate molecular detection was carried out using formalin-fixed paraffin-embedded tissue or blood samples. Results: The rates of RAS and RAF mutations were 58.5% and 4.1%, respectively. The prevalence of RAS mutation in CRC was clearly higher and that of RAF mutation was lower in Northeast China compared with previously reported cohorts in other locations. High MSI level (MSI-H status) was more complex, at around 10%. This was consistent with previous data from China. However, compared with data reported from other regions, the MSI-H rate was higher than in Japan or South Korea within Asia, and lower than in Europe or the United States. Conclusion: RAS/RAF mutations and MSI status in CRC are closely associated with tumor location and ethnicity. Further studies investigating the relationship between these three factors can help in the development of treatment strategies for patients with CRC.

    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

    The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling laws described in previous literature reach varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate the scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

    Legendre Cooperative PSO Strategies for Trajectory Optimization

    Particle swarm optimization (PSO) is a population-based stochastic optimization technique in a smooth search space. However, in a category of trajectory optimization problems with arbitrary final time and multiple control variables, the smoothness of the variables cannot be guaranteed because linear interpolation is widely used. In this paper, a novel Legendre cooperative PSO (LCPSO) is proposed by introducing Legendre orthogonal polynomials in place of linear interpolation. An additional control variable is introduced to transcribe the original optimal problem with arbitrary final time into one with fixed final time. Then, a practical fast one-dimensional interval search algorithm is designed to optimize the additional control variable. Furthermore, to improve the convergence and prevent explosion of the LCPSO, a theorem on how to determine the boundaries of the polynomial coefficients is given and proven. Finally, in numerical simulations, compared with the ordinary PSO and other typical intelligent optimization algorithms (GA and DE), the proposed LCPSO has lower dimension, faster convergence, and higher accuracy, while providing smoother control variables.
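
    To make the idea of replacing linear interpolation with Legendre polynomials concrete, the sketch below represents a control signal as a weighted sum of Legendre polynomials on a normalized time axis. It is only an illustration under assumptions: the abstract does not give the paper's exact transcription, so the mapping of time to [-1, 1] via the final time, the function names, and the coefficient values are hypothetical. In a PSO setting, each particle would presumably carry the coefficients (and the final time) as its decision variables.

```python
def legendre_values(order, x):
    """Return [P_0(x), ..., P_order(x)] using Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    values = [1.0]
    if order >= 1:
        values.append(x)
    for n in range(1, order):
        nxt = ((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1)
        values.append(nxt)
    return values


def control(coeffs, t, t_final):
    """Evaluate u(t) = sum_k coeffs[k] * P_k(tau) for 0 <= t <= t_final.

    tau = 2 t / t_final - 1 maps the (possibly unknown) time horizon onto
    the fixed interval [-1, 1], so t_final can be treated as one extra
    decision variable (an assumption made for this sketch).
    """
    tau = 2.0 * t / t_final - 1.0
    basis = legendre_values(len(coeffs) - 1, tau)
    return sum(c * p for c, p in zip(coeffs, basis))


# Hypothetical coefficients for a quadratic control profile.
coeffs = [1.0, 2.0, 4.0]
u_mid = control(coeffs, t=7.5, t_final=10.0)   # tau = 0.5
u_end = control(coeffs, t=10.0, t_final=10.0)  # tau = 1.0
print(u_mid, u_end)
```

    Because every Legendre polynomial satisfies |P_k(tau)| <= 1 on [-1, 1], the control is bounded in magnitude by the sum of the absolute coefficients, which hints at why bounds on the coefficients matter for keeping the search well behaved.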