26 research outputs found

CopyScope: Model-level Copyright Infringement Quantification in the Diffusion Workflow

Author: Gao Jiashi
Wang Ziwei
Wei Xuetao
Zhou Junlei
Publication date: 13/10/2023

Web-based AI image generation has become an innovative art form that can generate novel artworks with the rapid development of the diffusion model. However, this new technique brings potential copyright infringement risks, as it may incorporate existing artworks without the owners' consent. Copyright infringement quantification is the primary and challenging step towards AI-generated image copyright traceability. Previous work focused only on data attribution from the training data perspective, which is unsuitable for tracing and quantifying copyright infringement in practice for the following reasons: (1) the training datasets are not always publicly available; (2) the model provider is the responsible party, not the image. Motivated by this, in this paper, we propose CopyScope, a new framework to quantify the infringement of AI-generated images at the model level. We first rigorously identify pivotal components within the AI image generation pipeline. Then, we propose to take advantage of the Fréchet Inception Distance (FID) to effectively capture image similarity in a way that fits human perception naturally. We further propose the FID-based Shapley algorithm to evaluate the infringement contribution among models. Extensive experiments demonstrate that our work not only reveals the intricacies of infringement quantification but also effectively depicts the infringing models quantitatively, thus promoting accountability in AI image-generation tasks.
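As a rough illustration of the Shapley idea mentioned in this abstract, the sketch below computes exact Shapley values for a toy two-component pipeline. The component names (`model_a`, `model_b`) and the score table are hypothetical placeholders, not values or code from the paper; in the paper's setting the score would be derived from FID, which is not computed here.

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}

# Hypothetical similarity scores for each subset of active components.
# These numbers are made up for illustration only.
TOY_SCORES = {
    frozenset(): 0.0,
    frozenset({"model_a"}): 0.2,
    frozenset({"model_b"}): 0.5,
    frozenset({"model_a", "model_b"}): 0.9,
}

def toy_value(coalition):
    return TOY_SCORES[frozenset(coalition)]

phi = shapley_values(["model_a", "model_b"], toy_value)
print(phi)  # contributions sum to the score of the full pipeline
```

Enumerating all orderings is only practical for a handful of components; larger pipelines would need sampling-based approximations.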

arXiv.org e-Print Archive

EFFL: Egalitarian Fairness in Federated Learning for Mitigating Matthew Effect

Author: Gao Jiashi
Huang Changwu
Tan Shin Hwei
Tang Ming
Wei Xuetao
Yao Xin
Publication date: 28/09/2023

Recent advances in federated learning (FL) enable collaborative training of machine learning (ML) models from large-scale and widely dispersed clients while protecting their privacy. However, when different clients' datasets are heterogeneous, traditional FL mechanisms produce a global model that does not adequately represent the poorer clients with limited data resources, resulting in lower accuracy and higher bias on their local data. According to the Matthew effect, which describes how the advantaged gain more advantage and the disadvantaged lose more over time, deploying such a global model in client applications may worsen the resource disparity among the clients and harm the principles of social welfare and fairness. To mitigate the Matthew effect, we propose Egalitarian Fairness Federated Learning (EFFL), where egalitarian fairness means that the global model learned from FL has: (1) equal accuracy among clients; (2) equal decision bias among clients. Besides achieving egalitarian fairness among the clients, EFFL also aims for performance optimality, minimizing the empirical risk loss and the bias for each client; both are essential for any ML model training, whether centralized or decentralized. We formulate EFFL as a multi-constrained multi-objective optimization (MCMOO) problem, with the decision bias and egalitarian fairness as constraints and the minimization of the empirical risk losses on all clients as multiple objectives to be optimized. We propose a gradient-based three-stage algorithm to obtain the Pareto optimal solutions within the constraint space. Extensive experiments demonstrate that EFFL outperforms other state-of-the-art FL algorithms in achieving a high-performance global model with enhanced egalitarian fairness among all clients.
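One way to picture the two egalitarian conditions in this abstract is as a check that per-client losses and per-client biases each stay close to their mean. The sketch below is an assumption-laden illustration: the function names, the max-deviation-from-mean measure, and the tolerances `eps_loss` and `eps_bias` are hypothetical and are not taken from the paper's formulation.

```python
def egalitarian_gap(values):
    """Largest absolute deviation of any client's value from the mean."""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values)

def satisfies_egalitarian_fairness(client_losses, client_biases,
                                   eps_loss, eps_bias):
    """True when both losses and biases are nearly equal across clients.
    The tolerances are hypothetical knobs chosen for illustration."""
    return (egalitarian_gap(client_losses) <= eps_loss
            and egalitarian_gap(client_biases) <= eps_bias)

# Made-up per-client numbers: one balanced federation, one skewed.
balanced = satisfies_egalitarian_fairness(
    [0.30, 0.32, 0.31], [0.05, 0.04, 0.06], eps_loss=0.05, eps_bias=0.05)
skewed = satisfies_egalitarian_fairness(
    [0.10, 0.90], [0.05, 0.05], eps_loss=0.05, eps_bias=0.05)
print(balanced, skewed)
```

A check like this only tests whether constraints hold for a given model; it says nothing about how to find a model that satisfies them.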

arXiv.org e-Print Archive

ChatAnything: Facetime Chat with LLM-Enhanced Personas

Author: Feng Jiashi
Gao Shanghua
Hou Qibin
Lin Zhijie
Yuan Xinbin
Zhao Yilin
Zhou Daquan
Publication date: 12/11/2023

In this technical report, we target generating anthropomorphized personas for LLM-based characters in an online manner, including visual appearance, personality and tones, with only text descriptions. To achieve this, we first leverage the in-context learning capability of LLMs for personality generation by carefully designing a set of system prompts. We then propose two novel concepts: the mixture of voices (MoV) and the mixture of diffusers (MoD) for diverse voice and appearance generation. For MoV, we utilize text-to-speech (TTS) algorithms with a variety of pre-defined tones and automatically select the best-matching one based on the user-provided text description. For MoD, we combine recent popular text-to-image generation techniques and talking head algorithms to streamline the process of generating talking objects. We term the whole framework ChatAnything. With it, users can animate anything with any anthropomorphic persona using just a few text inputs. However, we have observed that the anthropomorphic objects produced by current generative models are often undetectable by pre-trained face landmark detectors, leading to failure of the face motion generation, even if these faces possess human-like appearances, because those images are rarely seen during training (e.g., OOD samples). To address this issue, we incorporate pixel-level guidance to infuse human face landmarks during the image generation phase. To benchmark these metrics, we have built an evaluation dataset. Based on it, we verify that the detection rate of the face landmark is significantly increased from 57.0% to 92.5%, thus allowing automatic face animation based on generated speech content. The code and more results can be found at https://chatanything.github.io/

arXiv.org e-Print Archive

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Author: Chen Deli
Dai Damai
Deng Chengqi
Gao Huazuo
Huang Panpan
Li Jiashi
Li Y. K.
Liang Wenfeng
Luo Fuli
Ruan Chong
Sui Zhifang
Wu Y.
Xie Zhenda
Xu R. X.
Yu Xingkai
Zeng Wangding
Zhao Chenggang
Publication date: 11/01/2024

In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managing computational costs when scaling up model parameters. However, conventional MoE architectures like GShard, which activate the top-$K$ out of $N$ experts, face challenges in ensuring expert specialization, i.e. each expert acquires non-overlapping and focused knowledge. In response, we propose the DeepSeekMoE architecture towards ultimate expert specialization. It involves two principal strategies: (1) finely segmenting the experts into $mN$ ones and activating $mK$ from them, allowing for a more flexible combination of activated experts; (2) isolating $K_s$ experts as shared ones, aiming at capturing common knowledge and mitigating redundancy in routed experts. Starting from a modest scale with 2B parameters, we demonstrate that DeepSeekMoE 2B achieves comparable performance with GShard 2.9B, which has 1.5 times the expert parameters and computation. In addition, DeepSeekMoE 2B nearly approaches the performance of its dense counterpart with the same number of total parameters, which set the upper bound of MoE models. Subsequently, we scale up DeepSeekMoE to 16B parameters and show that it achieves comparable performance with LLaMA2 7B, with only about 40% of computations. Further, our preliminary efforts to scale up DeepSeekMoE to 145B parameters consistently validate its substantial advantages over the GShard architecture, and show its performance comparable with DeepSeek 67B, using only 28.5% (maybe even 18.2%) of computations.
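The two strategies in this abstract (always-on shared experts plus top-scoring routed experts, and finer segmentation giving more possible expert combinations) can be pictured with the small sketch below. It is not the authors' implementation: the function names, the gate scores, and the example sizes (16 experts, 2 active, segmentation factor 4) are illustrative assumptions.

```python
from math import comb

def select_experts(routed_scores, num_shared, top_k):
    """Pick the experts active for one token.

    routed_scores: gate scores for the routed experts only (made-up input).
    Returns (shared_ids, routed_ids). Shared experts are always active and
    are numbered separately from the routed experts.
    """
    ranked = sorted(range(len(routed_scores)),
                    key=lambda i: routed_scores[i], reverse=True)
    routed_ids = sorted(ranked[:top_k])
    shared_ids = list(range(num_shared))
    return shared_ids, routed_ids

def num_combinations(n_experts, k_active, m=1):
    """Number of ways to choose m*k_active experts out of m*n_experts."""
    return comb(m * n_experts, m * k_active)

shared, routed = select_experts([0.1, 0.7, 0.05, 0.9, 0.3, 0.2],
                                num_shared=1, top_k=2)
coarse = num_combinations(16, 2)       # conventional top-K routing
fine = num_combinations(16, 2, m=4)    # each expert split into 4 smaller ones
print(shared, routed, coarse, fine)
```

Comparing `coarse` and `fine` shows why splitting experts increases routing flexibility: the count of possible active sets grows very quickly with the segmentation factor.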

arXiv.org e-Print Archive

Comprehensive Analysis of the Relationship Between RAS and RAF Mutations and MSI Status of Colorectal Cancer in Northeastern China

Author: Hongxue Meng
Huining Li
Jiashi Geng
Jingshu Geng
Qi You
Ruiqi Liu
Wenqi Li
Xiaoming Jin
Xinxin Yang
Yangyang Niu
Yingwei Xue
Yuhui Gao
Publication venue: S. Karger AG
Publication date: 01/10/2018

Background/Aims: Colorectal cancer (CRC) is mainly caused by chromosomal instability (CIN) and microsatellite instability (MSI). The RAS and RAF genes are essential components of the CIN pathway, and several studies have found that RAS and RAF mutations are associated with MSI status in CRC. Here, we examined these three factors in CRC in Northeast China and aimed to reveal new details of the relationship between these mutations and MSI status. Methods: This study involved 290 patients with CRC who had RAS or RAF gene mutation detected using fluorescence-based allele-specific polymerase chain reaction or Sanger sequencing. The majority of the identified patients were also tested for MSI (MSI status). Accurate molecular detection was carried out using formalin-fixed paraffin-embedded tissue or blood samples. Results: The rates of RAS and RAF mutations were 58.5% and 4.1%, respectively. The prevalence of RAS mutation in CRC was clearly higher and that of RAF mutation was lower in Northeast China compared with previously reported cohorts in other locations. High MSI level (MSI-H status) was more complex, at around 10%. This was consistent with previous data from China. However, compared with data reported from other regions, the MSI-H rate was higher than in Japan or South Korea and lower than in Europe or the United States. Conclusion: RAS/RAF mutations and MSI status in CRC are closely associated with tumor location and ethnicity. Further studies investigating the relationship between these three factors can help in the development of treatment strategies for patients with CRC.

Directory of Open Access Journals

Outcome of infants with acute lymphoblastic leukemia treated with the Chinese Children's Cancer Group Acute Lymphoblastic Leukemia 2015 study protocol

Author: Alex WK Leung
Changda Liang
Chi-kong Li
Ching-Hon Pui
Fen Zhou
Hua Jiang
Hui Zhang
Jiaoyang Cai
Jiashi Zhu
Jiefen Qin
Jingyan Tang
Ju Gao
Liangchun Yang
Lirong Sun
Ningling Wang
Pan Gao
Qun Hu
Shaoyan Hu
Xiaofan Zhu
Xiaowen Zhai
Xin Tian
Xiuli Ju
Xuedong Wu
Yongjun Fang
Zhi Wan
Publication venue: Ferrata Storti Foundation
Publication date: 01/04/2024

Not available

Directory of Open Access Journals

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Author: Bi Xiao
Chen Deli
Chen Guanting
Chen Shanhuang
Dai Damai
DeepSeek-AI
Deng Chengqi
Ding Honghui
Dong Kai
Du Qiushi
Fu Zhe
Gao Huazuo
Gao Kaige
Gao Wenjun
Ge Ruiqi
Guan Kang
Guo Daya
Guo Jianzhong
Hao Guangbo
Hao Zhewen
He Ying
Hu Wenjie
Huang Panpan
Li Erhang
Li Guowei
Li Jiashi
Li Y. K.
Li Yao
Liang Wenfeng
Lin Fangyun
Liu A. X.
Liu Bo
Liu Wen
Liu Xiaodong
Liu Xin
Liu Yiyuan
Lu Haoyu
Lu Shanghao
Luo Fuli
Ma Shirong
Nie Xiaotao
Pei Tian
Piao Yishi
Qiu Junjie
Qu Hui
Ren Tongzheng
Ren Zehui
Ruan Chong
Sha Zhangli
Shao Zhihong
Song Junxiao
Su Xuecheng
Sun Jingxiang
Sun Yaofeng
Tang Minghui
Wang Bingxuan
Wang Peiyi
Wang Shiyu
Wang Yaohui
Wang Yongji
Wu Tong
Wu Y.
Xie Xin
Xie Zhenda
Xie Ziwei
Xiong Yiliang
Xu Hanwei
Xu R. X.
Xu Yanhong
Yang Dejian
You Yuxiang
Yu Shuiping
Yu Xingkai
Zhang B.
Zhang Haowei
Zhang Lecong
Zhang Liyue
Zhang Mingchuan
Zhang Minghua
Zhang Wentao
Zhang Yichao
Zhao Chenggang
Zhao Yao
Zhou Shangyan
Zhou Shunfeng
Zhu Qihao
Zou Yuheng
Publication date: 05/01/2024

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling laws described in previous literature present varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

arXiv.org e-Print Archive

Legendre Cooperative PSO Strategies for Trajectory Optimization

Author: Fuqiang Xie
Jiashi Gao
Lei Liu
Yongji Wang
Publication venue: Hindawi Limited
Publication date: 01/01/2018

Particle swarm optimization (PSO) is a population-based stochastic optimization technique in a smooth search space. However, in a class of trajectory optimization problems with arbitrary final time and multiple control variables, the smoothness of variables cannot be satisfied since linear interpolation is widely used. In this paper, a novel Legendre cooperative PSO (LCPSO) is proposed by introducing Legendre orthogonal polynomials instead of linear interpolation. An additional control variable is introduced to transcribe the original optimization problem with arbitrary final time into one with fixed final time. Then, a practical fast one-dimensional interval search algorithm is designed to optimize the additional control variable. Furthermore, to improve the convergence and prevent explosion of the LCPSO, a theorem on how to determine the boundaries of the polynomial coefficients is given and proven. Finally, in the numerical simulations, compared with the ordinary PSO and other typical intelligent optimization algorithms GA and DE, the proposed LCPSO has traits of lower dimension, faster speed of convergence, and higher accuracy, while providing smoother control variables.
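To make the role of Legendre polynomials in this abstract concrete, the sketch below represents a control signal as a weighted sum of Legendre polynomials evaluated on time rescaled to [-1, 1]. It illustrates only the parameterization, not the LCPSO algorithm itself; the function names and the coefficient values are hypothetical.

```python
def legendre(n, x):
    """Evaluate the degree-n Legendre polynomial P_n(x) using the
    recurrence (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x)."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for k in range(1, n):
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr

def control(coeffs, t, t_final):
    """Smooth control u(t) = sum_k coeffs[k] * P_k(tau), where
    tau = 2 t / t_final - 1 maps [0, t_final] onto [-1, 1]."""
    tau = 2.0 * t / t_final - 1.0
    return sum(c * legendre(k, tau) for k, c in enumerate(coeffs))

# Made-up coefficients: an optimizer would search over these few numbers
# instead of over many interpolation nodes.
coeffs = [1.0, 2.0, 0.5]
samples = [control(coeffs, t, t_final=10.0) for t in (0.0, 5.0, 10.0)]
print(samples)
```

Because the control is a polynomial in time, it is smooth by construction, and the search dimension equals the number of coefficients rather than the number of time nodes.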

Directory of Open Access Journals

Evolving parsimonious circuits through Shapley value-based genetic programming

Author: Gao Jiashi
Minku Leandro
Shi Xinming
Yao Xin
Publication venue: Association for Computing Machinery (ACM)
Publication date: 19/07/2022

University of Birmingham Research Portal

Legendre Cooperative PSO Strategies for Trajectory Optimization

Author: Fuqiang Xie
Jiashi Gao
Lei Liu
Yongji Wang
Publication venue: Hindawi Limited
Publication date: 01/01/2018

Crossref