
107 research outputs found

Spatial clustering and common regulatory elements correlate with coordinated gene expression

Author: Bai Fan
Chen Hengyu
Li Ruoyan
Taft David A.
Xing Jianhua
Yao Guang
Zhang Jingyu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 18/01/2019

Many cellular responses to surrounding cues require temporally concerted transcriptional regulation of multiple genes. In prokaryotic cells, a single-input-module motif with one transcription factor regulating multiple target genes can generate coordinated gene expression. In eukaryotic cells, transcriptional activity of a gene is affected not only by transcription factors but also by the epigenetic modifications and three-dimensional chromosome structure of the gene. To examine how local gene environment and transcription factor regulation are coupled, we performed a combined analysis of time-course RNA-seq data of TGF-β treated MCF10A cells and related epigenomic and Hi-C data. Using Dynamic Regulatory Events Miner (DREM), we clustered differentially expressed genes based on gene expression profiles and associated transcription factors. Genes in each class have similar temporal gene expression patterns and share common transcription factors. Next, we defined a set of linear and radial distribution functions, as used in statistical physics, to measure the distributions of genes within a class both spatially and linearly along the genomic sequence. Remarkably, genes within the same class, despite sometimes being separated by tens of million bases (Mb) along the genomic sequence, show a significantly higher tendency to be spatially close than those belonging to different classes do. Analyses extended to the process of mouse nervous system development arrived at similar conclusions. Future studies will be able to test whether this spatial organization of chromosomes contributes to concerted gene expression.
Comment: 30 pages, 9 figures, accepted in PLoS Computational Biology

arXiv.org e-Print Archive

Directory of Open Access Journals

The University of Arizona

FigShare

Endogenous cross-region human mobility and pandemics

Author: Chen Xiao
Huang Hanwei
Ju Jiandong
Sun Ruoyan
Zhang Jialiang
Publication venue: Centre for Economic Performance, London School of Economics and Political Science
Publication date: 13/07/2022

We study infectious diseases using a Susceptible-Infected-Recovered-Deceased model with endogenous cross-region human mobility. Individuals weigh the risk of infection against economic opportunities when moving across regions. The model predicts that the mobility rate of susceptible individuals declines with a higher infection rate at the destination. With cross-region mobility, a decrease in the transmission rate or an increase in the removal rate of the virus in any region reduces the global basic reproduction number (R0). Global R0 falls between the minimum and maximum of local R0s. A new method of Normalized Hat Algebra is developed to solve the model dynamics. Simulations indicate that a decrease in global R0 does not always imply a lower cumulative infection rate. Local and central governments may prefer different mobility control policies.

LSE Research Online

Impact of vaccination on the COVID-19 pandemic in U.S. states

Author: Chen Xiao
Huang Hanwei
Ju Jiandong
Sun Ruoyan
Zhang Jialiang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/01/2022

Governments worldwide are implementing mass vaccination programs in an effort to end the novel coronavirus (COVID-19) pandemic. Here, we evaluated the effectiveness of the COVID-19 vaccination program in its early stage and predicted the path to herd immunity in the U.S. By early March 2021, we estimated that vaccination reduced the total number of new cases by 4.4 million (from 33.0 to 28.6 million), prevented approximately 0.12 million hospitalizations (from 0.89 to 0.78 million), and decreased the population infection rate by 1.34 percentage points (from 10.10 to 8.76%). We built a Susceptible-Infected-Recovered (SIR) model with vaccination to predict herd immunity, following the trends from the early-stage vaccination program. Herd immunity could be achieved earlier with a faster vaccination pace, lower vaccine hesitancy, and higher vaccine effectiveness. The Delta variant has substantially postponed the predicted herd immunity date, through a combination of reduced vaccine effectiveness, lowered recovery rate, and increased infection and death rates. These findings improve our understanding of COVID-19 vaccination and can inform future public health policies.

LSE Research Online

PubMed Central

FedRecAttack: Model Poisoning Attack to Federated Recommendation

Author: Chen Jianhai
He Qinming
Rong Dazhong
Ye Shuai
Yuen Hon Ning
Zhao Ruoyan
Publication date: 01/04/2022

Federated Recommendation (FR) has received considerable popularity and attention in the past few years. In FR, each user's feature vector and interaction data are kept locally on the user's own client and are thus private to others. Without access to this information, most existing poisoning attacks against recommender systems or federated learning lose validity. Benefiting from this characteristic, FR is commonly considered fairly secure. However, we argue that security improvements in FR are still possible and necessary. To support this view, in this paper we present FedRecAttack, a model poisoning attack on FR that aims to raise the exposure ratio of target items. In most recommendation scenarios, apart from private user-item interactions (e.g., clicks, watches and purchases), some interactions are public (e.g., likes, follows and comments). Motivated by this point, in FedRecAttack we make use of the public interactions to approximate users' feature vectors, so that the attacker can generate poisoned gradients accordingly and control malicious users to upload the poisoned gradients in a well-designed way. To evaluate the effectiveness and side effects of FedRecAttack, we conduct extensive experiments on three real-world datasets of different sizes from two completely different scenarios. Experimental results demonstrate that our proposed FedRecAttack achieves state-of-the-art effectiveness while its side effects are negligible. Moreover, even with a small proportion (3%) of malicious users and a small proportion (1%) of public interactions, FedRecAttack remains highly effective, which reveals that FR is more vulnerable to attack than is commonly assumed.
Comment: This paper has been accepted by IEEE International Conference on Data Engineering 2022 (Second Research Round)

arXiv.org e-Print Archive

Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model

Author: Chen Chen
Chen Jianyu
Gao Zeyu
Li Shengbo Eben
Lu Yanfeng
Luo Ping
Mu Yao
Ren Yangang
Shen Ruoyan
Publication date: 08/10/2022

End-to-end autonomous driving provides a feasible way to automatically maximize overall driving system performance by directly mapping the raw pixels from a front-facing camera to control signals. Recent advanced methods construct a latent world model to map the high-dimensional observations into a compact latent space. However, the latent states embedded by the world models proposed in previous works may contain a large amount of task-irrelevant information, resulting in low sampling efficiency and poor robustness to input perturbations. Meanwhile, the training data distribution is usually unbalanced, and the learned policy struggles to cope with corner cases during the driving process. To solve the above challenges, we present a semantic masked recurrent world model (SEM2), which introduces a latent filter to extract key task-relevant features and reconstruct a semantic mask via the filtered features, and is trained with a multi-source data sampler, which aggregates common data and multiple corner case data in a single batch, to balance the data distribution. Extensive experiments on CARLA show that our method outperforms the state-of-the-art approaches in terms of sample efficiency and robustness to input perturbations.
Comment: 11 pages, 7 figures, 1 table, submitted to Deep RL Workshop 2022

arXiv.org e-Print Archive

Getting the Most from Eye-Tracking: User-Interaction Based Reading Region Estimation Dataset and Models

Author: Chen Chen
Gajjela Gayathri
Kong Ruoyan
Konstan Joseph A.
Patri Sneha
Sun Ruixuan
Zhang Charles Chuankai
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/06/2023

A single digital newsletter usually contains many messages (regions). Users' reading time spent on, and read level (skip/skim/read-in-detail) of, each message are important for platforms to understand their users' interests, personalize their contents, and make recommendations. Based on accurate but expensive-to-collect eyetracker-recorded data, we built models that predict per-region reading time based on easy-to-collect JavaScript browser tracking data. With eye-tracking, we collected 200k ground-truth datapoints on participants reading news on browsers. Then we trained machine learning and deep learning models to predict message-level reading time based on user interactions like mouse position, scrolling, and clicking. We reached 27% percentage error in reading time estimation with a two-tower neural network based on user interactions only, against the eye-tracking ground truth data, while the heuristic baselines have around 46% percentage error. We also discovered the benefits of replacing per-session models with per-timestamp models, and adding user pattern features. We concluded with suggestions on developing message-level reading estimation techniques based on available data.
Comment: Ruoyan Kong, Ruixuan Sun, Charles Chuankai Zhang, Chen Chen, Sneha Patri, Gayathri Gajjela, and Joseph A. Konstan. Getting the most from eyetracking: User-interaction based reading region estimation dataset and models. In Proceedings of the 2023 Symposium on Eye Tracking Research and Applications, ETRA 23, New York, NY, USA, 2023. Association for Computing Machinery

arXiv.org e-Print Archive

SmartChoices: Augmenting Software with Learned Implementations

Author: Bartok Gabor
Chen Eric
Donahue Emily
Golovin Daniel
Huang Tzu-Kuo
Kokiopoulou Efi
Qin Ruoyan
Sarda Nikhil
Sybrandt Justin
Tjeng Vincent
Publication date: 12/04/2023

We are living in a golden age of machine learning. Powerful models are being trained to perform many tasks far better than is possible using traditional software engineering approaches alone. However, developing and deploying those models in existing software systems remains difficult. In this paper we present SmartChoices, a novel approach to incorporating machine learning into mature software stacks easily, safely, and effectively. We explain the overall design philosophy and present case studies using SmartChoices within large-scale industrial systems.

arXiv.org e-Print Archive

Comprehensive analysis of lncRNA-associated competing endogenous RNA network in tongue squamous cell carcinoma

Author: Hongbo Zhou
Mianfeng Yao
Qiulan Li
Ruoyan Cao
Shusen Zhang
Yu Chen
Publication venue: 'PeerJ'
Publication date: 01/02/2019

Background: Increasing evidence has demonstrated that long non-coding RNAs (lncRNAs) play an important role in the competitive endogenous RNA (ceRNA) networks in that they regulate protein-coding gene expression by sponging microRNAs (miRNAs). However, the understanding of the ceRNA network in tongue squamous cell carcinoma (TSCC) remains limited. Methods: Expression profile data regarding mRNAs, miRNAs and lncRNAs as well as clinical information on 122 TSCC tissues and 15 normal controls from The Cancer Genome Atlas (TCGA) database were collected. We used the edgeR package to identify differentially expressed mRNAs (DEmRNAs), lncRNAs (DElncRNAs) and miRNAs (DEmiRNAs) between TSCC samples and normal samples. In order to explore the functions of DEmRNAs, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was performed. Subsequently, a ceRNA network was established based on the identified DElncRNAs–DEmiRNAs and DEmiRNAs–DEmRNAs interactions. The RNAs within the ceRNA network were analyzed for their correlation with overall disease survival. Finally, lncRNAs were specifically analyzed for their correlation with clinical features in the included TSCC patient samples. Results: A total of 1867 mRNAs, 828 lncRNAs and 81 miRNAs were identified as differentially expressed in TSCC tissues (|log2 fold change| ≥ 2; adjusted P value < 0.01). The resulting ceRNA network included 16 mRNAs, 56 lncRNAs and 6 miRNAs. Ten out of the 56 lncRNAs were found to be associated with the overall survival in TSCC patients (P < 0.05); 10 lncRNAs were correlated with TSCC progression (P < 0.05). Conclusion: Our study deepens the understanding of ceRNA network regulatory mechanisms in TSCC. Furthermore, we identified ten lncRNAs (PART1, LINC00261, AL163952.1, C2orf48, FAM87A, LINC00052, LINC00472, STEAP3-AS1, TSPEAR-AS1 and ERVH48-1) as novel, potential prognostic biomarkers and therapeutic targets for TSCC.

Directory of Open Access Journals