A statistical normalization method and differential expression analysis for RNA-seq data between different species
Background: High-throughput techniques bring novel tools but also statistical
challenges to genomic research. Identifying genes with differential expression
between different species is an effective way to discover evolutionarily
conserved transcriptional responses. To remove systematic variation between
different species for a fair comparison, the normalization procedure serves as
a crucial pre-processing step that adjusts for the varying sample sequencing
depths and other confounding technical effects.
Results: In this paper, we propose a scale-based normalization (SCBN) method
that takes into account the available knowledge of conserved orthologous genes
and a hypothesis testing framework. Considering the different gene lengths and
unmapped genes between different species, we formulate the problem from the
perspective of hypothesis testing and search for the optimal scaling factor
that minimizes the deviation between the empirical and nominal type I errors.
Conclusions: Simulation studies show that the proposed method performs
significantly better than the existing competitor in a wide range of settings.
An RNA-seq dataset of different species is also analyzed, and the results agree
with the conclusion that the proposed method outperforms the existing method. For
practical applications, we have also developed an R package named "SCBN" and
the software is available at
http://www.bioconductor.org/packages/devel/bioc/html/SCBN.html
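The scaling-factor search described in this abstract can be illustrated with a minimal sketch. It is not the SCBN package's code or API: the binomial null model, the normal-approximation test, the grid search, and all function names below are assumptions made for illustration. Each conserved gene is given as a tuple (x, y, a, b), where x and y are read counts in the two species and a and b are hypothetical exposure terms (for example gene length times sequencing depth).

```python
import math

def two_sided_p(x, n, p):
    """Two-sided p-value for x ~ Binomial(n, p), using a normal approximation."""
    if n == 0:
        return 1.0
    z = (x - n * p) / math.sqrt(n * p * (1.0 - p))
    return math.erfc(abs(z) / math.sqrt(2.0))

def empirical_type1(genes, c, alpha):
    """Fraction of conserved genes rejected at level alpha for scaling factor c."""
    rejected = 0
    for x, y, a, b in genes:
        # Assumed null: x given x + y is binomial with success probability a / (a + c*b).
        p = a / (a + c * b)
        if two_sided_p(x, x + y, p) < alpha:
            rejected += 1
    return rejected / len(genes)

def search_scale(genes, grid, alpha=0.05):
    """Pick the grid value whose empirical type I error is closest to alpha."""
    return min(grid, key=lambda c: abs(empirical_type1(genes, c, alpha) - alpha))

# Toy conserved genes whose counts differ by a factor of two between species.
conserved = [(100, 200, 1.0, 1.0), (50, 100, 1.0, 1.0), (400, 800, 1.0, 1.0)]
best_c = search_scale(conserved, grid=[0.5, 1.0, 2.0, 4.0])
```

Because conserved orthologous genes are assumed not to be differentially expressed, the fraction of them rejected at level alpha serves as an estimate of the type I error for a given scaling factor.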
Bayesian Non-parametric Hidden Markov Model for Agile Radar Pulse Sequences Streaming Analysis
Multi-function radars (MFRs) are sophisticated types of sensors with the
capabilities of complex agile inter-pulse modulation implementation and dynamic
work mode scheduling. The developments in MFRs pose great challenges to modern
electronic reconnaissance systems or radar warning receivers for recognition
and inference of MFR work modes. To address this issue, this paper proposes an
online processing framework for parameter estimation and change point detection
of MFR work modes. First, this paper designs a fully-conjugate Bayesian
non-parametric hidden Markov model with a designed prior distribution (agile
BNP-HMM) to represent the MFR pulse agility characteristics. The proposed model
allows fully-variational Bayesian inference. The proposed framework then
consists of two main parts. The first part is the agile BNP-HMM for
automatically inferring the number of HMM hidden states and emission
distribution of the corresponding hidden states. An estimation error lower
bound on performance is derived and the proposed algorithm is shown to be close
to the bound. The second part uses streaming Bayesian updating to
facilitate computation and provides an online work mode change detection
framework based upon a weighted sequential probability ratio test. We
demonstrate that the proposed framework is consistently effective and
robust compared with baseline methods on diverse simulated datasets.
Comment: 15 pages, 10 figures, submitted to IEEE Transactions on Signal
Processing
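To illustrate the kind of sequential test this abstract refers to, here is a minimal sketch of a plain Wald sequential probability ratio test. It is not the weighted variant proposed in the paper: the Gaussian observation model with known variance, the two fixed hypotheses, and the function name are assumptions made for illustration.

```python
import math

def sprt(samples, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Plain Wald SPRT for a Gaussian mean: H0 mean mu0 versus H1 mean mu1."""
    upper = math.log((1 - beta) / alpha)   # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing this accepts H0
    llr = 0.0
    for i, x in enumerate(samples):
        # Log-likelihood ratio increment for one Gaussian observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "H1", i
        if llr <= lower:
            return "H0", i
    return "undecided", None

decision, stop_index = sprt([1.0] * 20, mu0=0.0, mu1=1.0, sigma=1.0)
```

The test stops as soon as the accumulated evidence crosses either threshold, which is what makes this family of tests suitable for streaming data.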
A Survey on Cross-domain Recommendation: Taxonomies, Methods, and Future Directions
Traditional recommendation systems are faced with two long-standing
obstacles, namely, data sparsity and cold-start problems, which promote the
emergence and development of Cross-Domain Recommendation (CDR). The core idea
of CDR is to leverage information collected from other domains to alleviate the
two problems in one domain. Over the last decade, much effort has been
devoted to cross-domain recommendation. Recently, with the development of deep
learning and neural networks, a large number of methods have emerged. However,
there is a limited number of systematic surveys on CDR, especially regarding
the latest proposed methods as well as the recommendation scenarios and
recommendation tasks they address. In this survey paper, we first propose a
two-level taxonomy of cross-domain recommendation which classifies different
recommendation scenarios and recommendation tasks. We then introduce and
summarize existing cross-domain recommendation approaches under different
recommendation scenarios in a structured manner. We also compile commonly used
datasets. We conclude this survey by providing several potential research
directions in this field.
Incorporating Heterogeneous User Behaviors and Social Influences for Predictive Analysis
Behavior prediction based on historical behavioral data has practical
real-world significance. It has been applied in recommendation, predicting
academic performance, etc. With the refinement of user data description, the
development of new functions, and the fusion of multiple data sources,
heterogeneous behavioral data, which contain multiple types of behaviors, have
become increasingly common. In this paper, we aim to incorporate heterogeneous
user behaviors and social influences for behavior predictions. To this end, this
paper proposes a variant of Long Short-Term Memory (LSTM) which can consider
context information while modeling a behavior sequence, a projection mechanism
which can model multi-faceted relationships among different types of behaviors,
and a multi-faceted attention mechanism which can dynamically identify
informative periods from different facets. Many kinds of behavioral data are
spatio-temporal. An unsupervised way to construct a social behavior
graph based on spatio-temporal data and to model social influences is proposed.
Moreover, a residual learning-based decoder is designed to automatically
construct multiple high-order cross features based on social behavior
representation and other types of behavior representations. Qualitative and
quantitative experiments on real-world datasets have demonstrated the
effectiveness of this model.
Jointly Modeling Heterogeneous Student Behaviors and Interactions Among Multiple Prediction Tasks
Prediction tasks about students have practical significance for both students
and colleges. Making multiple predictions about students is an important part of
a smart campus. For instance, predicting whether a student will fail to
graduate can alert the student affairs office to take predictive measures to
help the student improve his/her academic performance. With the development of
information technology in colleges, we can collect digital footprints which
encode heterogeneous behaviors continuously. In this paper, we focus on
modeling heterogeneous behaviors and making multiple predictions together,
since some prediction tasks are related and learning the model for a specific
task may have the data sparsity problem. To this end, we propose a variant of
LSTM and a soft-attention mechanism. The proposed LSTM is able to learn the
student profile-aware representation from heterogeneous behavior sequences. The
proposed soft-attention mechanism can dynamically learn different importance
degrees of different days for every student. In this way, heterogeneous
behaviors can be well modeled. In order to model interactions among multiple
prediction tasks, we propose a co-attention mechanism based unit. With the help
of the stacked units, we can explicitly control the knowledge transfer among
multiple tasks. We design three motivating behavior prediction tasks based on a
real-world dataset collected from a college. Qualitative and quantitative
experiments on the three prediction tasks have demonstrated the effectiveness
of our model.
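The soft-attention idea in this abstract, learning a different importance for each day, can be sketched as a softmax-weighted sum over per-day representations. This is only an illustration: the dot-product scoring, the `query` vector, and the function names are assumptions, not the mechanism defined in the paper.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(day_vectors, query):
    """Weight each day's vector by its (assumed) dot-product score with query."""
    scores = [sum(h * q for h, q in zip(vec, query)) for vec in day_vectors]
    weights = softmax(scores)
    dim = len(day_vectors[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, day_vectors))
              for d in range(dim)]
    return weights, pooled

# Two days of hypothetical 2-dimensional behavior representations.
weights, pooled = soft_attention([[1.0, 0.0], [0.0, 1.0]], query=[1.0, 0.0])
```

Days whose representation scores higher against the query receive a larger weight in the pooled student representation.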
Modeling Multi-aspect Preferences and Intents for Multi-behavioral Sequential Recommendation
Multi-behavioral sequential recommendation has recently attracted increasing
attention. However, existing methods suffer from two major limitations.
Firstly, user preferences and intents can be described in fine-grained detail
from multiple perspectives; yet, these methods fail to capture their
multi-aspect nature. Secondly, user behaviors may contain noise, and most
existing methods cannot deal with it effectively. In this paper, we
present an attentive recurrent model with multiple projections to capture
Multi-Aspect preferences and INTents (MAINT in short). To extract multi-aspect
preferences from target behaviors, we propose a multi-aspect projection
mechanism for generating multiple preference representations from multiple
aspects. To extract multi-aspect intents from multi-typed behaviors, we propose
a behavior-enhanced LSTM and a multi-aspect refinement attention mechanism. The
attention mechanism can filter out noise and generate multiple intent
representations from different aspects. To adaptively fuse user preferences and
intents, we propose a multi-aspect gated fusion mechanism. Extensive
experiments conducted on real-world datasets have demonstrated the
effectiveness of our model.
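A gated fusion of two representations, as mentioned at the end of this abstract, is commonly written as a sigmoid gate that interpolates between them. The sketch below shows that generic pattern for a single aspect. The element-wise parametrization (`w_pref`, `w_intent`, `bias`) is a hypothetical choice for illustration and is not taken from the MAINT paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(preference, intent, w_pref, w_intent, bias):
    """Element-wise gate: fused = g * preference + (1 - g) * intent."""
    fused = []
    for p, i, wp, wi, b in zip(preference, intent, w_pref, w_intent, bias):
        gate = sigmoid(wp * p + wi * i + b)  # gate value lies in (0, 1)
        fused.append(gate * p + (1.0 - gate) * i)
    return fused

# With zero weights and bias the gate is 0.5, so the result is the average.
fused = gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

In a trained model the gate parameters would be learned, so the balance between preference and intent can differ per user and per dimension.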
Designing Artificial Two-Dimensional Landscapes via Room-Temperature Atomic-Layer Substitution
Manipulating materials with atomic-scale precision is essential for the
development of a next-generation material design toolbox. Tremendous efforts have
been made to advance the compositional, structural, and spatial accuracy of
material deposition and patterning. The family of 2D materials provides an
ideal platform to realize atomic-level material architectures. The wide and
rich physics of these materials has led to the fabrication of heterostructures,
superlattices, and twisted structures with breakthrough discoveries and
applications. Here, we report a novel atomic-scale material design tool that
selectively breaks and forms chemical bonds of 2D materials at room
temperature, called atomic-layer substitution (ALS), through which we can
substitute the top layer chalcogen atoms within the 3-atom-thick
transition-metal dichalcogenides using arbitrary patterns. Flipping the layer
via transfer allows us to perform the same procedure on the other side,
yielding programmable in-plane multi-heterostructures with different
out-of-plane crystal symmetry and electric polarization. First-principles
calculations show that the ALS process is exothermic overall and
has only a small reaction barrier, allowing the reaction to occur at room
temperature. Optical characterizations confirm the fidelity of this design
approach, while TEM provides direct evidence of the Janus structure and suggests
the atomic transition at the interface of the designed heterostructure. Finally,
transport and Kelvin probe measurements on MoXY (X,Y=S,Se; X and Y
corresponding to the bottom and top layers) lateral multi-heterostructures
reveal the surface potential and dipole orientation of each region, and the
barrier height between them. Our approach for designing artificial 2D landscapes
down to a single layer of atoms can lead to unique electronic, photonic and
mechanical properties previously not found in nature.
High performance MoS₂ transistors based on wafer-scale low-temperature MOCVD synthesis
Among all the possible back-end-of-line (BEOL) solutions to improve the integration density and functionality of conventional silicon circuits, two-dimensional (2D) material devices are believed to be very promising, due to their high mobility, relatively large band gaps, atom-level thickness, performance comparable to that of silicon devices, and great potential in realizing 3D integration. However, wafer-scale growth of high-quality, continuous 2D material thin films at BEOL-compatible temperatures (<400°C) with good uniformity has always been difficult to realize. Achieving low contact resistance to these materials is also very challenging, which hinders the development of 2D material devices and circuits.
In this thesis, we will demonstrate a novel 8-inch, BEOL-compatible metal organic chemical vapor deposition (MOCVD) method for the synthesis of 2D transition metal dichalcogenide materials with a growth temperature lower than 400°C. Highly scaled, high-performance MoS₂ transistors will also be investigated with different contact engineering methods. These findings represent crucial steps toward high-performance power electronic circuits as well as ultra-large-scale BEOL integration with silicon circuits. (S.M. thesis)
scMEB: a fast and clustering-independent method for detecting differentially expressed genes in single-cell RNA-seq data
Background: Cell clustering is a prerequisite for identifying differentially expressed genes (DEGs) in single-cell RNA sequencing (scRNA-seq) data. Obtaining a perfect clustering result is of central importance for subsequent analyses, but it is not easy. Additionally, the increase in cell throughput due to the advancement of scRNA-seq protocols exacerbates many computational issues, especially regarding method runtime. To address these difficulties, a new, accurate, and fast method for detecting DEGs in scRNA-seq data is needed.
Results: Here, we propose single-cell minimum enclosing ball (scMEB), a novel and fast method for detecting single-cell DEGs without prior cell clustering results. The proposed method uses a small set of known non-DEGs (stably expressed genes) to build a minimum enclosing ball and defines the DEGs based on the distance of a mapped gene to the center of the hypersphere in a feature space.
Conclusions: We compared scMEB to two different approaches that could be used to identify DEGs without cell clustering. The investigation of 11 real datasets revealed that scMEB outperformed rival methods in terms of cell clustering, predicting genes with biological functions, and identifying marker genes. Moreover, scMEB was much faster than the other methods, making it particularly effective for finding DEGs in high-throughput scRNA-seq data. We have developed a package, scMEB, for the proposed method, which is available at https://github.com/FocusPaka/scMEB
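The enclosing-ball idea in this abstract can be illustrated with a minimal sketch. It is not the scMEB implementation: it works directly in the input space rather than in a mapped feature space, it swaps in the Badoiu-Clarkson iteration as a simple approximate minimum-enclosing-ball solver, and all function names and the toy points are hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def approx_meb(points, iterations=1000):
    """Approximate minimum enclosing ball via the Badoiu-Clarkson iteration."""
    center = list(points[0])
    for k in range(1, iterations + 1):
        # Step toward the farthest point with a shrinking step size 1 / (k + 1).
        far = max(points, key=lambda p: dist(center, p))
        center = [c + (f - c) / (k + 1) for c, f in zip(center, far)]
    radius = max(dist(center, p) for p in points)
    return center, radius

def flag_outside(center, radius, candidates):
    """Mark candidates lying outside the ball (treated here as putative DEGs)."""
    return [dist(center, p) > radius for p in candidates]

# Toy "stably expressed" genes as 2-D points, then two candidate genes.
stable = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
center, radius = approx_meb(stable)
flags = flag_outside(center, radius, [(1.0, 1.0), (10.0, 10.0)])
```

The ball is fitted only on genes assumed to be stably expressed, so a gene that lands far from the center is flagged without any cell clustering step.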
EMP: Exploiting Mobility Patterns for Collaborative Localization in Sparse Mobile Networks
Location awareness plays an indispensable role in a wide variety of application domains such as environment monitoring and vehicle tracking. In this paper, we focus on the localization of mobile users in sparse mobile networks, which exist in many practical scenarios where users are distributed over a vast area. The unique characteristics of sparse mobile networks present several challenges for accurate localization, such as constant movement and little information from anchors. By applying entropy analysis to five large datasets of real user traces from five sites, we make the important observation that there are strong patterns in user mobility. Motivated by this observation, we propose a localization approach called EMP, which exploits the mobility patterns of users for localization in sparse mobile networks. EMP implements a range-free distributed algorithm with which each user collaboratively estimates its current location by fusing two localization sources, namely network connectivity with other nodes and mobility patterns. With trace-driven simulations, we demonstrate that EMP significantly improves localization accuracy compared with other existing localization approaches.
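One simple form of the entropy analysis mentioned in this abstract is the Shannon entropy of a user's visit frequencies: low entropy indicates a strongly patterned, predictable user. The sketch below is an illustration only; the function name and the toy traces are hypothetical, and the paper may use a different entropy measure.

```python
import math
from collections import Counter

def location_entropy(visits):
    """Shannon entropy (in bits) of the empirical distribution of visited locations."""
    counts = Counter(visits)
    total = len(visits)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# A user who always stays in one place versus one who spreads visits evenly.
regular_user = location_entropy(["home", "home", "home", "home"])
roaming_user = location_entropy(["home", "office", "gym", "cafe"])
```

A user with entropy near zero can be located well from history alone, while a high-entropy user needs more help from network connectivity.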