127 research outputs found

Multi-Job Intelligent Scheduling with Cross-Device Federated Learning

Author: Dai Huaiyu
Dou Dejing
Jia Juncheng
Liu Ji
Ma Beichen
Zhou Chendi
Zhou Jingbo
Zhou Yang
Publication date: 24/11/2022

Recent years have witnessed a large amount of decentralized data accumulating on the (edge) devices of end users, yet aggregating this data for machine learning jobs remains complicated because of regulations and laws. As a practical approach to handling decentralized data, Federated Learning (FL) enables collaborative global machine learning model training without sharing sensitive raw data. Within the FL training process, the servers schedule devices to jobs. However, device scheduling with multiple jobs in FL remains a critical and open problem. In this paper, we propose a novel multi-job FL framework, which enables multiple jobs to be trained in parallel. The multi-job FL framework is composed of a system model and a scheduling method. The system model enables a parallel training process of multiple jobs, with a cost model based on the data fairness and the training time of diverse devices during the parallel training process. We propose a novel intelligent scheduling approach based on multiple scheduling methods, including an original reinforcement learning-based scheduling method and an original Bayesian optimization-based scheduling method, which incurs a small cost when scheduling devices to multiple jobs. We conduct extensive experimentation with diverse jobs and datasets. The experimental results reveal that our proposed approaches significantly outperform baseline approaches in terms of training time (up to 12.73 times faster) and accuracy (up to 46.4% higher).

Comment: To appear in TPDS; 22 pages, 17 figures, 8 tables. arXiv admin note: substantial text overlap with arXiv:2112.0592

arXiv.org e-Print Archive

Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization

Author: Che Tianshi
Dai Huaiyu
Dou Dejing
Liu Ji
Ren Jiaxiang
Sheng Victor S.
Zhou Jiwen
Zhou Yang
Publication date: 29/10/2023

Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. However, training Large Language Models (LLMs) generally requires updating a large number of parameters, which limits the applicability of FL techniques to LLMs in real scenarios. Prompt tuning can significantly reduce the number of parameters to update, but it incurs either performance degradation or low training efficiency. The straightforward use of prompt tuning in FL often raises non-trivial communication costs and dramatically degrades performance. In addition, the decentralized data is generally non-Independent and Identically Distributed (non-IID), which brings client drift problems and thus poor performance. This paper proposes a Parameter-efficient prompt Tuning approach with Adaptive Optimization, i.e., FedPepTAO, to enable efficient and effective FL of LLMs. First, an efficient partial prompt tuning approach is proposed to improve performance and efficiency simultaneously. Second, a novel adaptive optimization method is developed to address the client drift problems on both the device and server sides to enhance performance further. Extensive experiments based on 10 datasets demonstrate the superb performance (up to 60.8% in terms of accuracy) and efficiency (up to 97.59% in terms of training time) of FedPepTAO compared with 9 baseline approaches. Our code is available at https://github.com/llm-eff/FedPepTAO.

Comment: 18 pages, accepted by EMNLP 202

arXiv.org e-Print Archive

Data-Centric Financial Large Language Models

Author: Chen Hong
Chu Zhixuan
Cui Qing
Guo Huaiyu
Li Longfei
Li Sheng
Lu Xin
Wang Yijia
Xu Wanqing
Yu Fei
Zhou Jun
Zhou Xinyuan
Publication date: 13/11/2023

Large language models (LLMs) show promise for natural language tasks but struggle when applied directly to complex domains like finance. LLMs have difficulty reasoning about and integrating all relevant information. We propose a data-centric approach to enable LLMs to better handle financial tasks. Our key insight is that rather than overloading the LLM with everything at once, it is more effective to preprocess and pre-understand the data. We create a financial LLM (FLLM) using multitask prompt-based finetuning to achieve data pre-processing and pre-understanding. However, labeled data is scarce for each task. To overcome manual annotation costs, we employ abductive augmentation reasoning (AAR) to automatically generate training data by modifying the pseudo labels from FLLM's own outputs. Experiments show our data-centric FLLM with AAR substantially outperforms baseline financial LLMs designed for raw text, achieving state-of-the-art results on financial analysis and interpretation tasks. We also open-source a new benchmark for financial analysis and interpretation. Our methodology provides a promising path to unlock LLMs' potential for complex real-world domains.

arXiv.org e-Print Archive

Significant transcriptional changes in mature daughter Varroa destructor mites during infestation of different developmental stages of honeybees

Author: Elsheikha Hany M
Getachew Awraris
Tu Yangyang
Wu Jiangli
Xu Shufa
Zhou Chunxue
Zhou Huaiyu
Publication venue: Wiley
Publication date: 01/08/2020

Background: Varroa destructor is considered a major cause of honeybee (Apis mellifera) colony losses worldwide. Although V. destructor mites exhibit preference behavior for certain honeybee lifecycle stages, the mechanism underlying host finding and preference remains largely unknown. Results: By using a de novo transcriptome assembly strategy, we sequenced the mature daughter V. destructor mite transcriptome during infestation of different stages of honeybees (brood cells, newly emerged bees and adult bees). A total of 132 779 unigenes were obtained with an average length of 2745 bp and N50 of 5706 bp. About 63.1% of the transcriptome could be annotated based on sequence homology to the predatory mite Metaseiulus occidentalis proteins. Expression analysis revealed that mature daughter mites had distinct transcriptome profiles after infestation of different honeybee stages, and that the majority of the differentially expressed genes (DEGs) of mites infesting adult honeybees were down-regulated compared with those of mites infesting the sealed brood cells. Gene ontology and KEGG pathway enrichment analyses showed that a large number of DEGs were involved in cellular process and metabolic process, suggesting that Varroa mites undergo metabolic adjustment to accommodate the cellular, molecular and/or immune response of the honeybees. Interestingly, in mites infesting adult honeybees, some DEGs involved in neurotransmitter biosynthesis and transport were identified and their levels of expression were validated by quantitative polymerase chain reaction (qPCR). Conclusion: These results provide evidence for transcriptional reprogramming in mature daughter Varroa mites during infestation of honeybees, which may be relevant to understanding the mechanism underpinning adaptation and preference behavior of these mites for honeybees. © 2020 Society of Chemical Industry

Crossref

Repository@Nottingham

Toxoplasma gondii cathepsin proteases are undeveloped prominent vaccine antigens against toxoplasmosis

Author: Aihua Zhou
Gang Lv
Guanghui Zhao
Hua Cong
Huaiyu Zhou
Lin Wang
Min Meng
Min Sun
Qunli Zhao
Shenyi He
Xing-Quan Zhu
Yali Han
Yang Bai
Publication venue: Springer Nature
Publication date: 01/01/2013

BACKGROUND: Toxoplasma gondii, an obligate intracellular apicomplexan parasite, infects a wide range of warm-blooded animals including humans. T. gondii expresses five members of the C1 family of cysteine proteases, including cathepsin B-like (TgCPB) and cathepsin L-like (TgCPL) proteins. TgCPB is involved in ROP protein maturation and parasite invasion, whereas TgCPL contributes to proteolytic maturation of proTgM2AP and proTgMIC3. TgCPL is also associated with the residual body in the parasitophorous vacuole after cell division has occurred. Both of these proteases are potential therapeutic targets in T. gondii. The aim of this study was to investigate TgCPB and TgCPL for their potential as DNA vaccines against T. gondii. METHODS: Using bioinformatics approaches, we analyzed TgCPB and TgCPL proteins and identified several linear B-cell epitopes and potential Th-cell epitopes in them. Based on these results, we assembled two single-gene constructs (TgCPB and TgCPL) and a multi-gene construct (pTgCPB/TgCPL) with which to immunize BALB/c mice and test their effectiveness as DNA vaccines. RESULTS: TgCPB and TgCPL vaccines elicited strong humoral and cellular immune responses in mice, both of which were Th1 cell mediated. In addition, all of the vaccines protected the mice against infection with virulent T. gondii RH tachyzoites, with the multi-gene vaccine (pTgCPB/TgCPL) providing the highest level of protection. CONCLUSIONS: T. gondii CPB and CPL proteases are strong candidates for development as novel DNA vaccines.

Springer - Publisher Connector

PubMed Central

Immunogenicity of a Virus-Like-Particle Vaccine Containing Multiple Antigenic Epitopes of Toxoplasma gondii Against Acute and Chronic Toxoplasmosis in Mice

Author: Aihua Zhou
Chunxue Zhou
Ge Pan
Hua Cong
Huaiyu Zhou
Jingjing Guo
Kang Ai
Shenyi He
Wenchao Sha
Xiahui Sun
Publication venue: Frontiers Media SA
Publication date: 01/03/2019

There is no effective protective vaccine against human toxoplasmosis, which is a potential threat to nearly a third of the world population. Vaccines based on virus-like particles (VLPs) have been highly successful in humans for many years, but have rarely been applied against Toxoplasma gondii infection. In this study, we inserted a B cell epitope (SAG1 82–102 or SAG1 301–320), a CD8+ cell epitope (HF10 or ROP7), and a CD4+ cell epitope (AS15) of T. gondii into a truncated HBcΔ (amino acids 1–149) particle to construct four chimeric VLP vaccine formulations, i.e., HBcΔH82, HBcΔH301, HBcΔR82, and HBcΔR301. When these chimeric HBc particles were expressed in Escherichia coli, they showed icosahedral morphology similar to that of the original VLPs and were evaluated as vaccine formulations against acute and chronic toxoplasmosis in a mouse model (BALB/c mice, H-2d). All these chimeric HBc VLPs induced strong humoral and cellular immune responses with high IgG antibody titers and interferon (IFN)-γ production. Only the mice immunized with HBcΔH82 showed prolonged survival time (15.6 ± 3.8 vs. 5.6 ± 0.8 days) against acute infection with RH tachyzoites and decrease in brain parasite load (1,454 ± 239 vs. 2,091 ± 263) against chronic infection with Prugniaud cysts, as compared to the findings for the control group. These findings suggest that HBc VLPs would act as an effective carrier for delivering effective multiple antigenic epitopes and would be beneficial for developing a safe and long-acting vaccine against toxoplasmosis.

Directory of Open Access Journals

FigShare

Early Identification of Hepatocellular Carcinoma Patients at High-Risk of Recurrence Using the Adv Score: A Multicenter Retrospective Study

Author: Cao Shuya
Chen Chaobo
Ji Guwei
Li Wenwen
Liu Jinsong
Wang Ke
Wu Huaiyu
Xu Jiawei
Xu Xiaoliang
Xu Zhenggang
Yuan Yihang
Zhao Chunlong
Zhou Zheyu
Publication venue: DigitalCommons@TMC
Publication date: 07/09/2024

BACKGROUND: Postoperative recurrence is a vital reason for poor 5-year overall survival in hepatocellular carcinoma (HCC) patients. The ADV score is considered a parameter that can quantify HCC aggressiveness. This study aimed to identify HCC patients at high-risk of recurrence early using the ADV score. METHODS: The medical data of consecutive HCC patients undergoing hepatectomy from The First Affiliated Hospital of Nanjing Medical University (TFAHNJMU) and Nanjing Drum Tower Hospital (NJDTH) were retrospectively reviewed. Based on the status of microvascular invasion and the Edmondson-Steiner grade, HCC patients were divided into three groups: low-risk group (group 1: no risk factor exists), medium-risk group (group 2: one risk factor exists), and high-risk group (group 3: coexistence of two risk factors). In the training cohort (TFAHNJMU), the R package nnet was used to establish a multi-categorical unordered logistic regression model based on the ADV score to predict three risk groups. Welch's t-test was used to compare differences in clinical variables in three predicted risk groups. NJDTH served as an external validation center. Finally, the confusion matrix was developed using the R package caret to evaluate the diagnostic performance of the model. RESULTS: 350 and 405 patients from TFAHNJMU and NJDTH were included. HCC patients in different risk groups had significantly different liver function and inflammation levels. Density maps demonstrated that the ADV score could best differentiate between the three risk groups. The probability curve was plotted according to the predicted results of the multi-categorical unordered logistic regression model, and the best cut-off values of the ADV score were as follows: low-risk ≤ 3.4 log, 3.4 log < medium-risk ≤ 5.7 log, and high-risk > 5.7 log. The sensitivities of the ADV score predicting the high-risk group (group 3) were 70.2% (99/141) and 78.8% (63/80) in the training and external validation cohort, respectively. CONCLUSION: The ADV score might become a valuable marker for screening patients at high-risk of HCC recurrence with a cut-off value of 5.7 log, which might help surgeons, pathologists, and HCC patients make appropriate clinical decisions.

DigitalCommons@The Texas Medical Center

Crystal structure of rhodopsin bound to arrestin by femtosecond X-ray laser.

Author: Barty Anton
Basu Shibom
Boutet Sébastien
Caffrey Martin
Caro Lydia N
Carragher Bridget
Chapman Henry N
Cherezov Vadim
Coe Jesse
Conrad Chelsie E
de Waal Parker W
Diederichs Kay
Dong Yuhui
Ernst Oliver P
Fromme Petra
Fromme Raimund
Gao Xiang
Gati Cornelius
Griffin Patrick R
Grotjohann Ingo
Gu Xin
Gurevich Vsevolod V
Han Gye Won
He Yuanzheng
Howe Nicole
Hubbell Wayne L
Ishchenko Andrii
James Daniel
Jiang Hualiang
Jiang Yi
Kang Yanyong
Katritch Vsevolod
Ke Jiyuan
Kupitz Christopher
Lee Regina J
Li Dianfan
Li Jun
Lisova Stella
Liu Haiguang
Liu Wei
Ma Jinming
Melcher Karsten
Messerschmidt Marc
Moeller Arne
Pal Kuntal
Pascal Bruce D
Potter Clinton S
Roy-Chowdhury Shatabdi
Spence John CH
Standfuss Jörg
Stevens Raymond C
Suino-Powell Kelly M
Tan MH Eileen
Tan Minjia
Van Eps Ned
Vishnivetskiy Sergey A
Wang Dingjie
Wang Meitian
Weierstall Uwe
West Graham M
White Thomas A
Williams Garth J
Xu H Eric
Xu Qingping
Yang Huaiyu
Yefanov Oleksandr
Zatsepin Nadia A
Zhang Chenghai
Zhao Yingming
Zheng Zhong
Zhi Xiaoyong
Zhou X Edward
Publication venue: eScholarship, University of California
Publication date: 01/01/2015

G-protein-coupled receptors (GPCRs) signal primarily through G proteins or arrestins. Arrestin binding to GPCRs blocks G protein interaction and redirects signalling to numerous G-protein-independent pathways. Here we report the crystal structure of a constitutively active form of human rhodopsin bound to a pre-activated form of the mouse visual arrestin, determined by serial femtosecond X-ray laser crystallography. Together with extensive biochemical and mutagenesis data, the structure reveals an overall architecture of the rhodopsin-arrestin assembly in which rhodopsin uses distinct structural elements, including transmembrane helix 7 and helix 8, to recruit arrestin. Correspondingly, arrestin adopts the pre-activated conformation, with a ∼20° rotation between the amino and carboxy domains, which opens up a cleft in arrestin to accommodate a short helix formed by the second intracellular loop of rhodopsin. This structure provides a basis for understanding GPCR-mediated arrestin-biased signalling and demonstrates the power of X-ray lasers for advancing the frontiers of structural biology.

DESY Publication Database

PubMed Central

eScholarship - University of California

DESY