Spatial clustering and common regulatory elements correlate with coordinated gene expression
Many cellular responses to surrounding cues require temporally concerted
transcriptional regulation of multiple genes. In prokaryotic cells, a
single-input-module motif with one transcription factor regulating multiple
target genes can generate coordinated gene expression. In eukaryotic cells,
transcriptional activity of a gene is affected not only by transcription
factors but also by the epigenetic modifications and three-dimensional chromosome
structure of the gene. To examine how local gene environment and transcription
factor regulation are coupled, we performed a combined analysis of time-course
RNA-seq data of TGF-β treated MCF10A cells and related epigenomic and
Hi-C data. Using Dynamic Regulatory Events Miner (DREM), we clustered
differentially expressed genes based on gene expression profiles and associated
transcription factors. Genes in each class have similar temporal gene
expression patterns and share common transcription factors. Next, we defined a
set of linear and radial distribution functions, as used in statistical
physics, to measure the distributions of genes within a class both spatially
and linearly along the genomic sequence. Remarkably, genes within the same
class, despite sometimes being separated by tens of megabases (Mb) along the
genomic sequence, show a significantly higher tendency to be spatially close
than those belonging to different classes do. Analyses extended to the process of
mouse nervous system development arrived at similar conclusions. Future studies
will be able to test whether this spatial organization of chromosomes
contributes to concerted gene expression.
Comment: 30 pages, 9 figures, accepted in PLoS Computational Biology
Endogenous cross-region human mobility and pandemics
We study infectious diseases using a Susceptible-Infected-Recovered-Deceased model with endogenous cross-region human mobility. Individuals weigh the risk of infection against economic opportunities when moving across regions. The model predicts that the mobility rate of susceptible individuals declines with a higher infection rate at the destination. With cross-region mobility, a decrease in the transmission rate or an increase in the removal rate of the virus in any region reduces the global basic reproduction number (R0). Global R0 falls between the minimum and maximum of local R0s. A new method of Normalized Hat Algebra is developed to solve the model dynamics. Simulations indicate that a decrease in global R0 does not always imply a lower cumulative infection rate. Local and central governments may prefer different mobility control policies
Impact of vaccination on the COVID-19 pandemic in U.S. states
Governments worldwide are implementing mass vaccination programs in an effort to end the novel coronavirus (COVID-19) pandemic. Here, we evaluated the effectiveness of the COVID-19 vaccination program in its early stage and predicted the path to herd immunity in the U.S. By early March 2021, we estimated that vaccination reduced the total number of new cases by 4.4 million (from 33.0 to 28.6 million), prevented approximately 0.12 million hospitalizations (from 0.89 to 0.78 million), and decreased the population infection rate by 1.34 percentage points (from 10.10 to 8.76%). We built a Susceptible-Infected-Recovered (SIR) model with vaccination to predict herd immunity, following the trends from the early-stage vaccination program. Herd immunity could be achieved earlier with a faster vaccination pace, lower vaccine hesitancy, and higher vaccine effectiveness. The Delta variant has substantially postponed the predicted herd immunity date, through a combination of reduced vaccine effectiveness, lowered recovery rate, and increased infection and death rates. These findings improve our understanding of the COVID-19 vaccination and can inform future public health policies
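The abstract above does not give the model equations, so the following is only a minimal sketch of a generic SIR model with vaccination, not the authors' implementation. The function names, the constant daily vaccination rate, the forward-Euler scheme, and all parameter values are assumptions chosen for illustration. The herd immunity threshold uses the textbook expression 1 - 1/R0 with R0 = beta/gamma, which may differ from the definition used in the paper.

```python
def herd_immunity_threshold(beta, gamma):
    """Textbook threshold 1 - 1/R0 for a basic SIR model (assumed definition)."""
    r0 = beta / gamma
    return max(0.0, 1.0 - 1.0 / r0)


def simulate_sir_vaccination(beta, gamma, vacc_rate, effectiveness,
                             s0=0.99, i0=0.01, days=300, dt=0.1):
    """Forward-Euler integration of a hypothetical SIR model with vaccination.

    All compartments are population fractions:
    S (susceptible), I (infected), R (recovered), V (effectively vaccinated).
    vacc_rate is the fraction of the population vaccinated per day, and
    effectiveness scales how many of those doses confer immunity.
    """
    s, i, r, v = s0, i0, 0.0, 0.0
    cumulative = i0  # everyone who has ever been infected
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt
        new_rec = gamma * i * dt
        # Vaccination cannot remove more susceptibles than remain.
        new_vac = max(0.0, min(vacc_rate * effectiveness * dt, s - new_inf))
        s -= new_inf + new_vac
        i += new_inf - new_rec
        r += new_rec
        v += new_vac
        cumulative += new_inf
    return {"S": s, "I": i, "R": r, "V": v, "cumulative_infected": cumulative}


if __name__ == "__main__":
    # Illustrative parameters only; they are not estimates from the study.
    for rate in (0.0, 0.005, 0.01):
        out = simulate_sir_vaccination(beta=0.3, gamma=0.1,
                                       vacc_rate=rate, effectiveness=0.9)
        print(rate, round(out["cumulative_infected"], 3))
    print("threshold:", herd_immunity_threshold(0.3, 0.1))
```

In this sketch a faster vaccination pace or a higher effectiveness removes susceptibles sooner, which lowers cumulative infections, in line with the qualitative finding stated in the abstract.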
FedRecAttack: Model Poisoning Attack to Federated Recommendation
Federated Recommendation (FR) has received considerable popularity and
attention in the past few years. In FR, each user's feature vector and
interaction data are kept locally on the user's own client and are thus private
to others. Without access to this information, most existing poisoning attacks
against recommender systems or federated learning lose validity. Because of
this characteristic, FR is commonly considered fairly secure. However, we
argue that security improvements in FR are still both possible and necessary.
To support this claim, in this paper we present FedRecAttack, a
model poisoning attack to FR aiming to raise the exposure ratio of target
items. In most recommendation scenarios, apart from private user-item
interactions (e.g., clicks, watches and purchases), some interactions are
public (e.g., likes, follows and comments). Motivated by this point, in
FedRecAttack we make use of the public interactions to approximate users'
feature vectors, so that the attacker can generate poisoned gradients accordingly
and control malicious users to upload the poisoned gradients in a well-designed
way. To evaluate the effectiveness and side effects of FedRecAttack, we conduct
extensive experiments on three real-world datasets of different sizes from two
completely different scenarios. Experimental results demonstrate that our
proposed FedRecAttack achieves state-of-the-art effectiveness while its
side effects are negligible. Moreover, even with a small proportion (3%) of
malicious users and a small proportion (1%) of public interactions, FedRecAttack
remains highly effective, which reveals that FR is more vulnerable to attack
than is commonly assumed.
Comment: This paper has been accepted by IEEE International Conference on Data
Engineering 2022 (Second Research Round)
Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model
End-to-end autonomous driving provides a feasible way to automatically
maximize overall driving system performance by directly mapping the raw pixels
from a front-facing camera to control signals. Recent advanced methods
construct a latent world model to map the high dimensional observations into
compact latent space. However, the latent states embedded by the world model
proposed in previous works may contain a large amount of task-irrelevant
information, resulting in low sampling efficiency and poor robustness to input
perturbations. Meanwhile, the training data distribution is usually unbalanced,
and the learned policy struggles to cope with corner cases during the driving
process. To solve the above challenges, we present a semantic masked recurrent
world model (SEM2), which introduces a latent filter to extract key
task-relevant features and reconstruct a semantic mask via the filtered
features, and is trained with a multi-source data sampler, which aggregates
common data and multiple corner case data in a single batch, to balance the
data distribution. Extensive experiments on CARLA show that our method
outperforms the state-of-the-art approaches in terms of sample efficiency and
robustness to input perturbations.
Comment: 11 pages, 7 figures, 1 table, submitted to Deep RL Workshop 202
Getting the Most from Eye-Tracking: User-Interaction Based Reading Region Estimation Dataset and Models
A single digital newsletter usually contains many messages (regions). Users'
reading time on each message and their read level (skip/skim/read-in-detail)
are important for platforms to understand their users' interests,
personalize their content, and make recommendations. Based on accurate but
expensive-to-collect eye-tracker-recorded data, we built models that predict
per-region reading time based on easy-to-collect JavaScript browser tracking
data.
With eye-tracking, we collected 200k ground-truth datapoints on participants
reading news on browsers. Then we trained machine learning and deep learning
models to predict message-level reading time based on user interactions like
mouse position, scrolling, and clicking. We reached 27% percentage error in
reading time estimation with a two-tower neural network based on user
interactions only, against the eye-tracking ground truth data, while the
heuristic baselines have around 46% percentage error. We also discovered the
benefits of replacing per-session models with per-timestamp models, and adding
user pattern features. We concluded with suggestions on developing
message-level reading estimation techniques based on available data.
Comment: Ruoyan Kong, Ruixuan Sun, Charles Chuankai Zhang, Chen Chen, Sneha
Patri, Gayathri Gajjela, and Joseph A. Konstan. Getting the most from
eyetracking: User-interaction based reading region estimation dataset and
models. In Proceedings of the 2023 Symposium on Eye Tracking Research and
Applications, ETRA 23, New York, NY, USA, 2023. Association for Computing
Machinery
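The abstract reports "percentage error" without defining it. One common reading is the mean absolute percentage error between predicted and eye-tracking ground-truth reading times. The sketch below assumes that definition. The function name and the sample values are hypothetical and are not taken from the dataset.

```python
def mean_absolute_percentage_error(true_seconds, predicted_seconds):
    """Mean of |predicted - true| / true, in percent (assumed metric).

    Regions with a ground-truth reading time of zero are skipped because the
    ratio is undefined for them.
    """
    if len(true_seconds) != len(predicted_seconds):
        raise ValueError("inputs must have the same length")
    pairs = [(t, p) for t, p in zip(true_seconds, predicted_seconds) if t > 0]
    if not pairs:
        raise ValueError("no region has a positive ground-truth reading time")
    return 100.0 * sum(abs(p - t) / t for t, p in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical per-region reading times in seconds.
    truth = [10.0, 20.0, 40.0]
    predicted = [12.0, 15.0, 40.0]
    print(mean_absolute_percentage_error(truth, predicted))
```

Other definitions, for example normalizing by total session time, would give different numbers, so the paper itself should be consulted for the exact metric.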
SmartChoices: Augmenting Software with Learned Implementations
We are living in a golden age of machine learning. Powerful models are being
trained to perform many tasks far better than is possible using traditional
software engineering approaches alone. However, developing and deploying those
models in existing software systems remains difficult. In this paper we present
SmartChoices, a novel approach to incorporating machine learning into mature
software stacks easily, safely, and effectively. We explain the overall design
philosophy and present case studies using SmartChoices within large scale
industrial systems
Comprehensive analysis of lncRNA-associated competing endogenous RNA network in tongue squamous cell carcinoma
Background Increasing evidence has demonstrated that long non-coding RNAs (lncRNAs) play an important role in the competitive endogenous RNA (ceRNA) networks in that they regulate protein-coding gene expression by sponging microRNAs (miRNAs). However, the understanding of the ceRNA network in tongue squamous cell carcinoma (TSCC) remains limited. Methods Expression profile data regarding mRNAs, miRNAs and lncRNAs as well as clinical information on 122 TSCC tissues and 15 normal controls from The Cancer Genome Atlas (TCGA) database were collected. We used the edgeR package to identify differentially expressed mRNAs (DEmRNAs), lncRNAs (DElncRNAs) and miRNAs (DEmiRNAs) between TSCC samples and normal samples. In order to explore the functions of DEmRNAs, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was performed. Subsequently, a ceRNA network was established based on the identified DElncRNA–DEmiRNA and DEmiRNA–DEmRNA interactions. The RNAs within the ceRNA network were analyzed for their correlation with overall disease survival. Finally, lncRNAs were specifically analyzed for their correlation with clinical features in the included TSCC patient samples. Results A total of 1867 mRNAs, 828 lncRNAs and 81 miRNAs were identified as differentially expressed in TSCC tissues (|log2 fold change| ≥ 2; adjusted P value < 0.01). The resulting ceRNA network included 16 mRNAs, 56 lncRNAs and 6 miRNAs. Ten out of the 56 lncRNAs were found to be associated with the overall survival in TSCC patients (P < 0.05); 10 lncRNAs were correlated with TSCC progression (P < 0.05). Conclusion Our study deepens the understanding of ceRNA network regulatory mechanisms in TSCC. Furthermore, we identified ten lncRNAs (PART1, LINC00261, AL163952.1, C2orf48, FAM87A, LINC00052, LINC00472, STEAP3-AS1, TSPEAR-AS1 and ERVH48-1) as novel, potential prognostic biomarkers and therapeutic targets for TSCC
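The selection rule stated in the Results (an absolute log2 fold change of at least 2 together with an adjusted P value below 0.01) can be illustrated with a small filter. This is a sketch only: the study itself used an R package, while the function name, record layout, and placeholder gene labels below are hypothetical and contain no real expression data.

```python
def is_differentially_expressed(log2_fold_change, adjusted_p,
                                min_abs_log2fc=2.0, max_adjusted_p=0.01):
    """Apply the two thresholds quoted in the abstract to one RNA."""
    return abs(log2_fold_change) >= min_abs_log2fc and adjusted_p < max_adjusted_p


# Placeholder records: (label, log2 fold change, adjusted P value).
records = [
    ("gene_a", 2.5, 0.001),    # passes both thresholds (up-regulated)
    ("gene_b", -3.1, 0.0005),  # passes both thresholds (down-regulated)
    ("gene_c", 1.2, 0.0001),   # fold change too small
    ("gene_d", -2.0, 0.02),    # adjusted P value too large
]

selected = [name for name, lfc, p in records
            if is_differentially_expressed(lfc, p)]

if __name__ == "__main__":
    print(selected)
```

Taking the absolute value keeps both up-regulated and down-regulated RNAs, which is why the threshold is written with absolute-value bars.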