A sparse regulatory network of copy-number driven expression reveals putative breast cancer oncogenes
The influence of DNA cis-regulatory elements on a gene's expression has been
intensively studied. However, little is known about expressions driven by
trans-acting DNA hotspots. DNA hotspots harboring copy number aberrations are
recognized to be important in cancer as they influence multiple genes on a
global scale. Detecting trans-effects is challenging mainly because of the
computational difficulty of recovering weak and sparse trans-acting signals
amidst co-occurring passenger events. We propose an integrative approach to
learn a sparse interaction network of DNA copy-number regions with their
downstream targets in a breast cancer dataset. Information from this network
helps distinguish copy-number driven from copy-number independent expression
changes on a global scale. Our result further delineates cis- and trans-effects
in a breast cancer dataset, for which important oncogenes such as ESR1 and
ERBB2 appear to be highly copy-number dependent. Further, our model is
efficient and, in terms of goodness of fit, no worse than state-of-the-art
predictors and network reconstruction models on both simulated and real data.
Comment: Accepted at IEEE International Conference on Bioinformatics &
Biomedicine (BIBM 2010)
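The sparse copy-number-to-expression network described above can be
caricatured, at a very high level, by one L1-penalized regression per gene of
expression on all copy-number regions. The following Python sketch uses
scikit-learn's Lasso on synthetic arrays; the names (copy_number, expression)
and the per-gene loop are illustrative assumptions, not the authors' actual
model.

# Minimal sketch: learn a sparse map from copy-number regions to expression.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_regions, n_genes = 100, 50, 20
copy_number = rng.normal(size=(n_samples, n_regions))   # toy aCGH-like data
expression = rng.normal(size=(n_samples, n_genes))      # toy expression data

# One L1-penalized regression per gene; nonzero coefficients link a gene to
# the copy-number regions that putatively drive its expression.
network = np.zeros((n_genes, n_regions))
for g in range(n_genes):
    network[g] = Lasso(alpha=0.1).fit(copy_number, expression[:, g]).coef_

# Genes whose expression is well explained by the fitted copy-number terms
# would be called copy-number driven; the rest, copy-number independent.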
A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure
We often seek to estimate the impact of an exposure naturally occurring or
randomly assigned at the cluster-level. For example, the literature on
neighborhood determinants of health continues to grow. Likewise, community
randomized trials are applied to learn about real-world implementation,
sustainability, and population effects of interventions with proven
individual-level efficacy. In these settings, individual-level outcomes are
correlated due to shared cluster-level factors, including the exposure, as well
as social or biological interactions between individuals. To flexibly and
efficiently estimate the effect of a cluster-level exposure, we present two
targeted maximum likelihood estimators (TMLEs). The first TMLE is developed
under a non-parametric causal model, which allows for arbitrary interactions
between individuals within a cluster. These interactions include direct
transmission of the outcome (i.e. contagion) and influence of one individual's
covariates on another's outcome (i.e. covariate interference). The second TMLE
is developed under a causal sub-model assuming the cluster-level and
individual-specific covariates are sufficient to control for confounding.
Simulations compare the alternative estimators and illustrate the potential
gains from pairing individual-level risk factors and outcomes during
estimation, while avoiding unwarranted assumptions. Our results suggest that
estimation under the sub-model can result in bias and misleading inference in
an observational setting. Incorporating working assumptions during estimation
is more robust than assuming they hold in the underlying causal model. We
illustrate our approach with an application to HIV prevention and treatment.
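For orientation, the core TMLE recipe (initial outcome regression, propensity
score, a one-parameter logistic fluctuation along the "clever covariate", then
a plug-in estimate) can be sketched for a simple individual-level
point-treatment setting. This is a generic single-time-point TMLE for the
average treatment effect, assuming binary exposure and outcome and invented
data, not the paper's hierarchical, cluster-level estimators.

# Generic single-time-point TMLE sketch, not the paper's cluster-level TMLEs.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def tmle_ate(W, A, Y):
    """One-step TMLE for E[Y(1)] - E[Y(0)] with binary A and binary Y."""
    # Step 1: initial outcome regression Q(A, W).
    Q_fit = LogisticRegression().fit(np.column_stack([A, W]), Y)
    Q1 = Q_fit.predict_proba(np.column_stack([np.ones_like(A), W]))[:, 1]
    Q0 = Q_fit.predict_proba(np.column_stack([np.zeros_like(A), W]))[:, 1]
    QA = np.where(A == 1, Q1, Q0)

    # Step 2: propensity score g(W) = P(A=1|W) and the clever covariate H.
    g = LogisticRegression().fit(W, A).predict_proba(W)[:, 1]
    H = A / g - (1 - A) / (1 - g)

    # Step 3: targeting -- logistic fluctuation with offset logit(QA).
    eps = sm.GLM(Y, H.reshape(-1, 1), family=sm.families.Binomial(),
                 offset=np.log(QA / (1 - QA))).fit().params[0]

    # Step 4: update the counterfactual predictions and plug in.
    Q1_star = expit(np.log(Q1 / (1 - Q1)) + eps / g)
    Q0_star = expit(np.log(Q0 / (1 - Q0)) - eps / (1 - g))
    return np.mean(Q1_star - Q0_star)

# Toy usage with synthetic data.
rng = np.random.default_rng(0)
n = 500
W = rng.normal(size=(n, 2))
A = rng.binomial(1, expit(0.5 * W[:, 0]))
Y = rng.binomial(1, expit(0.2 + 0.8 * A + 0.3 * W[:, 1]))
print(tmle_ate(W, A, Y))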
Structured Training for Neural Network Transition-Based Parsing
We present structured perceptron training for neural network transition-based
dependency parsing. We learn the neural network representation using a gold
corpus augmented by a large number of automatically parsed sentences. Given
this fixed network representation, we learn a final layer using the structured
perceptron with beam-search decoding. On the Penn Treebank, our parser reaches
94.26% unlabeled and 92.41% labeled attachment accuracy, which to our knowledge
is the best accuracy on Stanford Dependencies to date. We also provide in-depth
ablative analysis to determine which aspects of our model provide the largest
gains in accuracy.
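The final-layer training described above follows the classic structured
perceptron with beam-search decoding: decode with a beam over transition
sequences and, when the prediction differs from the gold sequence, update the
weights toward gold features and away from predicted ones. The sketch below is
a heavily simplified generic version with toy actions and hashed features, not
the authors' parser or its fixed neural representation.

# Generic structured perceptron with beam-search decoding over a toy
# transition system; `features` is a hypothetical stand-in for real
# parser features.
import numpy as np

ACTIONS = ["SHIFT", "LEFT", "RIGHT"]
N_FEATS = 1000

def features(state, action):
    # Hashed indicator feature of the (state, action) pair.
    return [hash((tuple(state), action)) % N_FEATS]

def beam_decode(w, n_steps, beam_size=8):
    beam = [(0.0, [], [0])]  # (score, action sequence, toy state)
    for _ in range(n_steps):
        expanded = []
        for score, seq, state in beam:
            for a in ACTIONS:
                s = score + sum(w[f] for f in features(state, a))
                expanded.append((s, seq + [a], state + [ACTIONS.index(a)]))
        beam = sorted(expanded, key=lambda x: -x[0])[:beam_size]
    return beam[0]

def perceptron_update(w, gold_seq):
    # Full-sequence update; real systems often use early updates when the
    # gold sequence falls out of the beam.
    _, pred_seq, _ = beam_decode(w, len(gold_seq))
    if pred_seq != gold_seq:
        state_g, state_p = [0], [0]
        for a_gold, a_pred in zip(gold_seq, pred_seq):
            for f in features(state_g, a_gold):
                w[f] += 1.0
            for f in features(state_p, a_pred):
                w[f] -= 1.0
            state_g.append(ACTIONS.index(a_gold))
            state_p.append(ACTIONS.index(a_pred))
    return w

w = perceptron_update(np.zeros(N_FEATS), ["SHIFT", "LEFT", "RIGHT"])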
Incorporating Intra-Class Variance to Fine-Grained Visual Recognition
Fine-grained visual recognition aims to capture discriminative
characteristics amongst visually similar categories. State-of-the-art work has
significantly improved fine-grained recognition performance through deep metric
learning with triplet networks. However, the impact
of intra-category variance on the performance of recognition and robust feature
representation has not been well studied. In this paper, we propose to
leverage intra-class variance in the metric learning of triplet networks to
improve fine-grained recognition. By partitioning the training images
within each category into a few groups, we form the triplet samples across
different categories as well as different groups, which is called Group
Sensitive TRiplet Sampling (GS-TRS). Accordingly, the triplet loss function is
strengthened by incorporating intra-class variance through GS-TRS, which may
sharpen the optimization objective of the triplet network. Extensive
experiments on the benchmark datasets CompCar and VehicleID show that the
proposed GS-TRS significantly outperforms state-of-the-art approaches in both
classification and retrieval tasks.
Comment: 6 pages, 5 figures
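To make the sampling idea concrete, the following sketch draws triplets whose
positive comes from a different intra-class group than the anchor, so the loss
is exposed to intra-class variance, and scores them with a standard
margin-based triplet loss. The grouping, margin, and toy embeddings are
illustrative assumptions rather than the paper's exact GS-TRS formulation.

# Sketch of group-sensitive triplet sampling with a margin triplet loss.
import numpy as np

rng = np.random.default_rng(0)

def triplet_loss(anchor, positive, negative, margin=0.2):
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def sample_gs_triplet(embeddings, labels, groups):
    # Positive: same class, preferably a different intra-class group;
    # negative: any other class.
    i = rng.integers(len(labels))
    same_class = (labels == labels[i]) & (np.arange(len(labels)) != i)
    cross_group = same_class & (groups != groups[i])
    pos_pool = np.flatnonzero(cross_group if cross_group.any() else same_class)
    neg_pool = np.flatnonzero(labels != labels[i])
    return (embeddings[i], embeddings[rng.choice(pos_pool)],
            embeddings[rng.choice(neg_pool)])

# Toy example: 10 embeddings, 2 classes, 2 groups per class.
emb = rng.normal(size=(10, 8))
labels = np.array([0] * 5 + [1] * 5)
groups = np.array([0, 0, 1, 1, 1, 0, 0, 0, 1, 1])
a, p, n = sample_gs_triplet(emb, labels, groups)
print(triplet_loss(a, p, n))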
Assessing the effect of advertising expenditures upon sales: a Bayesian structural time series model
We propose a robust implementation of the Nerlove--Arrow model using a
Bayesian structural time series model to explain the relationship between the
advertising expenditures of a country-wide fast-food franchise network and its
weekly sales. Thanks to the flexibility and modularity of the model, it is well
suited to generalization to other markets or situations. Its Bayesian nature
facilitates incorporating a priori information (the manager's views),
which can be updated with relevant data. This aspect of the model will be used
to present a strategy of budget scheduling across time and channels.
Comment: Published in Applied Stochastic Models in Business and Industry,
https://onlinelibrary.wiley.com/doi/full/10.1002/asmb.246
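A minimal way to reproduce the flavor of this setup is a local-level
structural time series with an adstock-transformed advertising regressor
standing in for the Nerlove--Arrow goodwill stock. The sketch below fits such
a model by maximum likelihood with statsmodels' UnobservedComponents; the
fully Bayesian estimation of the paper, the decay rate, and the data are all
replaced by illustrative assumptions.

# Local-level structural time series with a geometric-carryover (adstock)
# advertising regressor, echoing Nerlove--Arrow goodwill. Fit is by maximum
# likelihood here; the paper's model is fully Bayesian.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_weeks = 156
ad_spend = rng.gamma(shape=2.0, scale=10.0, size=n_weeks)

def adstock(x, decay=0.6):
    # goodwill_t = spend_t + decay * goodwill_{t-1}
    out = np.zeros_like(x)
    for t in range(len(x)):
        out[t] = x[t] + (decay * out[t - 1] if t > 0 else 0.0)
    return out

goodwill = adstock(ad_spend)
sales = (100 + 0.8 * goodwill                      # advertising effect
         + np.cumsum(rng.normal(0, 0.5, n_weeks))  # slowly drifting level
         + rng.normal(0, 5, n_weeks))              # observation noise

model = sm.tsa.UnobservedComponents(sales, level="local level", exog=goodwill)
result = model.fit(disp=False)
print(result.summary())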
Value at Risk models with long memory features and their economic performance
We study alternative dynamics for Value at Risk (VaR) that incorporate a
slow-moving component and information on recent aggregate returns into
established quantile (auto)regression models. These models are compared on
their economic performance, as well as on first-order metrics such as
violation ratios. By better economic performance we mean that changes in the
VaR forecasts should have a lower variance, to reduce transaction costs, and
should lead to smaller exceedances without raising the average level of the
VaR. We find that, in combination with a targeted estimation strategy, our
proposed models improve performance in both statistical and economic terms.
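As a baseline illustration of the quantile-regression VaR family these models
extend, the sketch below fits the 5% quantile of returns on a lagged
absolute-return regressor and computes the violation ratio used to judge
calibration. The data and the single regressor are synthetic assumptions; the
paper's models add the slow-moving component and aggregate-return information.

# Baseline quantile-regression VaR: regress the 5% quantile of returns on
# lagged |returns| and check the violation ratio (share of exceedances
# relative to the nominal 5% level).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
returns = rng.standard_t(df=5, size=1000) * 0.01   # synthetic daily returns

lagged_abs = np.abs(returns[:-1])   # crude volatility proxy
y = returns[1:]
X = sm.add_constant(lagged_abs)

fit = sm.QuantReg(y, X).fit(q=0.05)
var_forecast = fit.predict(X)       # in-sample 5% VaR series

violations = y < var_forecast
print("violation ratio:", violations.mean() / 0.05)  # ~1 means well calibrated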