33 research outputs found

Fast Prototyping Next-Generation Accelerators for New ML Models using MASE: ML Accelerator System Exploration

Author: Bouganis Christos-Savvas
Cheng Jianyi
Montgomerie-Corcoran Alex
Xiao Can
Yu Zhewen
Zhang Cheng
Zhao Yiren
Publication date: 28/07/2023

Machine learning (ML) accelerators have been studied and used extensively to compute ML models with high performance and low power. However, designing such accelerators normally takes a long time and requires significant effort. Unfortunately, the pace of development of ML software models is much faster than the accelerator design cycle, leading to frequent and drastic modifications in the model architecture, thus rendering many accelerators obsolete. Existing design tools and frameworks can provide quick accelerator prototyping, but only for a limited range of models that can fit into a single hardware device, such as an FPGA. Furthermore, with the emergence of large language models, such as GPT-3, there is an increased need for hardware prototyping of these large models within a many-accelerator system to ensure the hardware can scale with the ever-growing model sizes. In this paper, we propose an efficient and scalable approach for exploring accelerator systems to compute large ML models. We developed a tool named MASE that can directly map large ML models onto an efficient streaming accelerator system. Over a set of ML models, we show that MASE can achieve better energy efficiency than GPUs when computing inference for recent transformer models. Our tool will be open-sourced upon publication.

arXiv.org e-Print Archive

Cancer-associated fibroblast related gene signature in Helicobacter pylori-based subtypes of gastric carcinoma for prognosis and tumor microenvironment estimation in silico analysis

Author: Dawei Zhou
Haoyu Guan
Jiahao Li
Le Yang
Ruofan Xu
Wei Xiao
Yao Yu
Yuxuan Liao
Zhewen Zhang
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2023

Introduction: Gastric cancer (GC) remains a major contributor to cancer-related deaths and a global public health challenge with a high incidence rate. Helicobacter pylori (HP) plays an essential role in promoting the occurrence and progression of GC. Cancer-associated fibroblasts (CAFs) are regarded as a significant component of the tumor microenvironment (TME), which is related to the metastasis of GC. However, the regulation mechanisms of CAFs in HP-related GC have not been thoroughly elucidated. Methods: HP-related genes (HRGs) were downloaded from the GSE84437 and TCGA-GC databases. The two databases were combined into one cohort for training. Consensus unsupervised clustering was then used to sort the training cohort into groups for the identification of differentially expressed genes (DEGs). Weighted correlation network analysis (WGCNA) was performed to verify the correlation between the DEGs and CAFs, which are key components of the TME. The least absolute shrinkage and selection operator (LASSO) was applied to find cancer-associated fibroblast-related differentially expressed genes (CDEGs) for the subsequent establishment of a prognostic model. Results and discussion: In this study, 52 HRGs were screened out based on the GSE84437 and TCGA-GC databases. A total of 804 GC samples were analyzed and clustered into two HP-related subtypes. The DEGs identified from the two subtypes were shown to be related to the TME. After WGCNA and LASSO, the CAFs-related module was identified, from which 21 gene signatures were confirmed. Then, a CDEGs-Score was constructed and its prediction efficiency in GC patients was validated. Overall, a highly precise nomogram was established for enhancing the adaptability of the CDEGs-Score. Furthermore, our findings revealed the applicability of the CDEGs-Score in assessing sensitivity to chemotherapeutic drugs.
In general, our research provided new possibilities for understanding HP-related GC, evaluating survival, and developing more efficient therapeutic strategies.

Directory of Open Access Journals

PGAweb: A Web Server for Bacterial Pan-Genome Analysis

Author: Baohua Zhang
Chen Sun
Jiayan Wu
Jingfa Xiao
Jinyue Wang
Jun Yu
Meili Chen
Ming Yang
Qian Liu
Xinyu Chen
Yadong Zhang
Yongbing Zhao
Zhewen Zhang
Zhong Jin
Publication venue: 'Frontiers Media SA'
Publication date: 01/08/2018

An astronomical increase in microbial genome data in recent years has led to strong demand for bioinformatic tools for pan-genome analysis within and across species. Here, we present PGAweb, a user-friendly, web-based tool for bacterial pan-genome analysis, which is composed of two main pan-genome analysis modules, PGAP and PGAP-X. PGAweb provides key interactive and customizable functions that include orthologous clustering, pan-genome profiling, sequence variation and evolution analysis, and functional classification. PGAweb presents features of genomic structural dynamics and sequence diversity with different visualization methods that are helpful for intuitively understanding the dynamics and evolution of bacterial genomes. PGAweb has an intuitive interface with one-click setting of parameters and is freely available at http://PGAweb.vlcc.cn/

Directory of Open Access Journals

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Author: Bi Xiao
Chen Deli
Chen Guanting
Chen Shanhuang
Dai Damai
DeepSeek-AI
Deng Chengqi
Ding Honghui
Dong Kai
Du Qiushi
Fu Zhe
Gao Huazuo
Gao Kaige
Gao Wenjun
Ge Ruiqi
Guan Kang
Guo Daya
Guo Jianzhong
Hao Guangbo
Hao Zhewen
He Ying
Hu Wenjie
Huang Panpan
Li Erhang
Li Guowei
Li Jiashi
Li Y. K.
Li Yao
Liang Wenfeng
Lin Fangyun
Liu A. X.
Liu Bo
Liu Wen
Liu Xiaodong
Liu Xin
Liu Yiyuan
Lu Haoyu
Lu Shanghao
Luo Fuli
Ma Shirong
Nie Xiaotao
Pei Tian
Piao Yishi
Qiu Junjie
Qu Hui
Ren Tongzheng
Ren Zehui
Ruan Chong
Sha Zhangli
Shao Zhihong
Song Junxiao
Su Xuecheng
Sun Jingxiang
Sun Yaofeng
Tang Minghui
Wang Bingxuan
Wang Peiyi
Wang Shiyu
Wang Yaohui
Wang Yongji
Wu Tong
Wu Y.
Xie Xin
Xie Zhenda
Xie Ziwei
Xiong Yiliang
Xu Hanwei
Xu R. X.
Xu Yanhong
Yang Dejian
You Yuxiang
Yu Shuiping
Yu Xingkai
Zhang B.
Zhang Haowei
Zhang Lecong
Zhang Liyue
Zhang Mingchuan
Zhang Minghua
Zhang Wentao
Zhang Yichao
Zhao Chenggang
Zhao Yao
Zhou Shangyan
Zhou Shunfeng
Zhu Qihao
Zou Yuheng
Publication date: 05/01/2024

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large-scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

arXiv.org e-Print Archive

Recent Advances in the Distribution, Chemical Composition, Health Benefits, and Application of the Fruit of Siraitia grosvenorii

Author: Ke Feng
Minke Shi
Qihan Guo
Sarengaowa
Ying Xiao
Zhewen Xiao
Publication venue: MDPI AG
Publication date: 01/07/2024

The fruits of Siraitia grosvenorii (S. grosvenorii) have attracted a lot of scientific interest as part of the current healthy diet. S. grosvenorii has diverse health-promoting effects, including antioxidant, anti-inflammatory, antimicrobial, respiratory modulation, metabolic modulation, antitumor, and neuroprotective effects, as well as gastrointestinal function modulation. As a plant resource, S. grosvenorii has broad application prospects, which promotes the development of the horticultural industry. Moreover, Mogroside has attracted much attention as an important active ingredient of S. grosvenorii. This review provides an in-depth exploration of the distribution, chemical composition, health benefits, and application of S. grosvenorii, particularly Mogroside. This comprehensive exploration highlights the important therapeutic potential of S. grosvenorii, prompting further research into its applications. As value-added functional ingredients, S. grosvenorii and its constituents have significant potential for disease prevention and are widely used in the development of food and health supplements

Directory of Open Access Journals

Optimization of OpenCV based spot identification method for surface plasmon resonance imaging

Author: Chang Ou
Feiyu Liu
Wenxuan Xiao
Zhewen Fang
Zhiyou Wang
Publication venue: AIP Publishing LLC
Publication date: 01/02/2024

In this work, we focus on the OpenCV based microarray recognition method for Surface Plasmon Resonance Imaging (SPRi), proposing the hit-ratio of global light pixels and coverage of the potential spots in a microarray as the criteria for identification evaluation in SPRi data. We optimized the design of the ellipse fitting strategy by analyzing the impact of different parameters in the method. After optimization of the parameters, the accuracy of microarray recognition was successfully increased to over 90%. This work not only contributes to reducing errors in microarray signal extraction and improving signal processing quality but also has significant implications for applying computer graphic technology in high-throughput biochemical analysis

Directory of Open Access Journals

Combining aggregate and individual-level data to estimate individual-level associations between air pollution and COVID-19 mortality in the United States.

Author: Daniel Mork
Danielle Braun
Francesca Dominici
Sophie M Woodward
Xiao Wu
Zhewen Hou
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2023

Imposing stricter regulations for PM2.5 has the potential to mitigate damaging health and climate change effects. Recent evidence establishing a link between exposure to air pollution and COVID-19 outcomes is one of many arguments for the need to reduce the National Ambient Air Quality Standards (NAAQS) for PM2.5. However, many studies reporting a relationship between COVID-19 outcomes and PM2.5 have been criticized because they are based on ecological regression analyses, where area-level counts of COVID-19 outcomes are regressed on area-level exposure to air pollution and other covariates. It is well known that regression models solely based on area-level data are subject to ecological bias, i.e., they may provide a biased estimate of the association at the individual-level, due to within-area variability of the data. In this paper, we augment county-level COVID-19 mortality data with a nationally representative sample of individual-level covariate information from the American Community Survey along with high-resolution estimates of PM2.5 concentrations obtained from a validated model and aggregated to the census tract for the contiguous United States. We apply a Bayesian hierarchical modeling approach to combine county-, census tract-, and individual-level data to ultimately draw inference about individual-level associations between long-term exposure to PM2.5 and mortality for COVID-19. By analyzing data prior to the Emergency Use Authorization for the COVID-19 vaccines we found that an increase of 1 μg/m³ in long-term PM2.5 exposure, averaged over the 17-year period 2000-2016, is associated with a 3.3% (95% credible interval, 2.8 to 3.8%) increase in an individual's odds of COVID-19 mortality. Code to reproduce our study is publicly available at https://github.com/NSAPH/PM_COVID_ecoinference.
The results confirm previous evidence of an association between long-term exposure to PM2.5 and COVID-19 mortality and strengthen the case for tighter regulations on harmful air pollution and greenhouse gas emissions

Directory of Open Access Journals

Characterization of Spectinomycin Resistance in Streptococcus suis Leads to Two Novel Insights into Drug Resistance Formation and Dissemination Mechanism

Author: Anding Zhang
Jingfa Xiao
Kaisong Huang
Meilin Jin
Qiang Zhang
Yajing Song
Zhewen Zhang
Publication venue: 'American Society for Microbiology'

Crossref

The C-Terminal Repeat Units of SpaA Mediate Adhesion of Erysipelothrix rhusiopathiae to Host Cells and Regulate Its Virulence

Author: Chao Kang
Chao Wu
Hao Zhang
Jingfa Xiao
Meilin Jin
Qiang Zhang
Weifeng Zhu
Yadong Zhang
Zhewen Zhang
Publication venue: MDPI AG
Publication date: 01/07/2022

Erysipelothrix rhusiopathiae is a causative agent of erysipelas in animals and erysipeloid in humans. However, current information regarding E. rhusiopathiae pathogenesis remains limited. Previously, we identified two E. rhusiopathiae strains, SE38 and G4T10, which were virulent and avirulent in pigs, respectively. Here, to further study the pathogenic mechanism of E. rhusiopathiae, we sequenced and assembled the genomes of strains SE38 and G4T10, and performed a comparative genomic analysis to identify differences or mutations in virulence-associated genes. Next, we comparatively analyzed 25 E. rhusiopathiae virulence-associated genes in SE38 and G4T10. Compared with that of SE38, the spaA gene of the G4T10 strain lacked 120 bp, encoding repeat units at the C-terminal of SpaA. To examine whether these deletions or splits influence E. rhusiopathiae virulence, these 120 bp were successfully deleted from the spaA gene in strain SE38 by homologous recombination. The mutant strain ΔspaA displayed attenuated virulence in mice and decreased adhesion to porcine iliac artery endothelial cells, which was also observed using the corresponding mutant protein SpaA’. Our results demonstrate that SpaA-mediated adhesion between E. rhusiopathiae and host cells is dependent on its C-terminal repeat units

Directory of Open Access Journals

PubMed Central