Effective Evaluation using Logged Bandit Feedback from Multiple Loggers
Accurately evaluating new policies (e.g. ad-placement models, ranking
functions, recommendation functions) is one of the key prerequisites for
improving interactive systems. While the conventional approach to evaluation
relies on online A/B tests, recent work has shown that counterfactual
estimators can provide an inexpensive and fast alternative, since they can be
applied offline using log data that was collected from a different policy
fielded in the past. In this paper, we address the question of how to estimate
the performance of a new target policy when we have log data from multiple
historic policies. This question is of great relevance in practice, since
policies get updated frequently in most online systems. We show that naively
combining data from multiple logging policies can be highly suboptimal. In
particular, we find that the standard Inverse Propensity Score (IPS) estimator
suffers especially when logging and target policies diverge -- to a point where
throwing away data improves the variance of the estimator. We therefore propose
two alternative estimators which we characterize theoretically and compare
experimentally. We find that the new estimators can provide substantially
improved estimation accuracy. Comment: KDD 201
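As a rough illustration of the setting described above, the following sketch pools logs from several logging policies and evaluates a target policy with the standard IPS estimator. The abstract does not spell out its two proposed estimators, so the second function (dividing by the mixture of logging propensities) is only one commonly used alternative and is not claimed to be the authors' method. The record layout, `Policy` type, and function names are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a policy maps (context, action) to a probability,
# and a log record is (context, action, reward, index of the logging policy).
Policy = Callable[[int, int], float]
Record = Tuple[int, int, float, int]


def naive_ips(logs: Sequence[Record], target: Policy, loggers: List[Policy]) -> float:
    """Standard IPS: reweight each reward by target / own-logger propensity."""
    total = 0.0
    for x, a, r, j in logs:
        total += r * target(x, a) / loggers[j](x, a)
    return total / len(logs)


def mixture_ips(logs: Sequence[Record], target: Policy, loggers: List[Policy]) -> float:
    """Alternative: reweight by target / data-weighted mixture of all loggers."""
    n = len(logs)
    counts = [0] * len(loggers)
    for _, _, _, j in logs:
        counts[j] += 1
    total = 0.0
    for x, a, r, _ in logs:
        mix = sum(c / n * p(x, a) for c, p in zip(counts, loggers))
        total += r * target(x, a) / mix
    return total / n


# Toy example: one context, two actions, reward 1 for action 0 and 0 otherwise.
target = lambda x, a: 0.8 if a == 0 else 0.2
logger_a = lambda x, a: 0.5
logger_b = lambda x, a: 0.2 if a == 0 else 0.8
loggers = [logger_a, logger_b]

# Logs built in exact proportion to each logger's action probabilities.
logs = (
    [(0, 0, 1.0, 0)] * 5 + [(0, 1, 0.0, 0)] * 5
    + [(0, 0, 1.0, 1)] * 2 + [(0, 1, 0.0, 1)] * 8
)

print(naive_ips(logs, target, loggers), mixture_ips(logs, target, loggers))
```

Both functions target the same quantity (here the true value of the target policy is 0.8); they differ in how strongly records from a poorly matched logger are up-weighted, which is where the variance difference arises.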
Cellular Responses to Cisplatin-Induced DNA Damage
Cisplatin is one of the most effective anticancer agents and is widely used in the treatment of solid tumors. It is generally considered a cytotoxic drug that kills cancer cells by damaging DNA and inhibiting DNA synthesis. How cells respond to cisplatin-induced DNA damage plays a critical role in determining cisplatin sensitivity. Cisplatin-induced DNA damage activates various signaling pathways that prevent or promote cell death. This paper summarizes our current understanding of the mechanisms by which cisplatin induces cell death and the bases of cisplatin resistance. We discuss various steps, including the entry of cisplatin into cells, DNA repair, drug detoxification, the DNA damage response, and the regulation of cisplatin-induced apoptosis by protein kinases. An understanding of how various signaling pathways regulate cisplatin-induced cell death should aid in the development of more effective therapeutic strategies for the treatment of cancer.
Adaptive TTL-Based Caching for Content Delivery
Content Delivery Networks (CDNs) deliver a majority of the user-requested
content on the Internet, including web pages, videos, and software downloads. A
CDN server caches and serves the content requested by users. Designing caching
algorithms that automatically adapt to the heterogeneity, burstiness, and
non-stationary nature of real-world content requests is a major challenge and
is the focus of our work. While there is much work on caching algorithms for
stationary request traffic, the work on non-stationary request traffic is very
limited. Consequently, most prior models are inaccurate for production CDN
traffic that is non-stationary.
We propose two TTL-based caching algorithms and provide provable guarantees
for content request traffic that is bursty and non-stationary. The first
algorithm called d-TTL dynamically adapts a TTL parameter using a stochastic
approximation approach. Given a feasible target hit rate, we show that the hit
rate of d-TTL converges to its target value for a general class of bursty
traffic that allows Markov dependence over time and non-stationary arrivals.
The second algorithm called f-TTL uses two caches, each with its own TTL. The
first-level cache adaptively filters out non-stationary traffic, while the
second-level cache stores frequently-accessed stationary traffic. Given
feasible targets for both the hit rate and the expected cache size, f-TTL
asymptotically achieves both targets. We implement d-TTL and f-TTL and evaluate
both algorithms using an extensive nine-day trace consisting of 500 million
requests from a production CDN server. We show that both d-TTL and f-TTL
converge to their hit rate targets with an error of about 1.3%. However, f-TTL
requires a significantly smaller cache size than d-TTL to achieve the same hit
rate, since it effectively filters out the non-stationary traffic for
rarely-accessed objects.
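The idea of adapting a TTL by stochastic approximation, as described for d-TTL above, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes a single object, Poisson request arrivals, a TTL timer that restarts on every request, and a constant step size, and the names `d_ttl_step` and `simulate` are hypothetical.

```python
import random


def d_ttl_step(ttl: float, hit: bool, target: float, step: float) -> float:
    """One stochastic-approximation update: raise the TTL after a miss,
    lower it after a hit, and keep it non-negative."""
    return max(0.0, ttl + step * (target - (1.0 if hit else 0.0)))


def simulate(target: float = 0.6, rate: float = 1.0, n: int = 50_000,
             step: float = 0.01, seed: int = 0):
    """Simulate requests for one object with exponential inter-arrival times
    (an assumption) and return the empirical hit rate and the final TTL."""
    rng = random.Random(seed)
    ttl = 1.0
    hits = 0
    for _ in range(n):
        gap = rng.expovariate(rate)
        # The object is still cached if it is re-requested within the TTL.
        hit = gap <= ttl
        hits += hit
        ttl = d_ttl_step(ttl, hit, target, step)
    return hits / n, ttl


hit_rate, final_ttl = simulate()
print(f"empirical hit rate: {hit_rate:.3f}, final TTL: {final_ttl:.3f}")
```

Because each update moves the TTL by `step * (target - hit)`, the running hit rate can only drift from the target by the net TTL change divided by `step * n`, so the empirical hit rate settles near the target in this toy setting.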
Robust Clustering with Normal Mixture Models: A Pseudo β-Likelihood Approach
As in other estimation scenarios, likelihood-based estimation in the normal
mixture set-up is highly non-robust against model misspecification and the
presence of outliers (apart from being an ill-posed optimization problem). We propose a
robust alternative to the ordinary likelihood approach for this estimation
problem which performs simultaneous estimation and data clustering and leads to
subsequent anomaly detection. To invoke robustness, we follow, in spirit, the
methodology based on the minimization of the density power divergence (or
alternatively, the maximization of the β-likelihood) under suitable
constraints. An iteratively reweighted least squares approach has been followed
in order to compute our estimators for the component means (or equivalently
cluster centers) and component dispersion matrices which leads to simultaneous
data clustering. Some exploratory techniques are also suggested for anomaly
detection, a problem of great importance in the domain of statistics and
machine learning. Existence and consistency of the estimators are established
under the aforesaid constraints. We validate our method with simulation studies
under different set-ups; it is seen to perform competitively or better compared
to the popular existing methods like K-means and TCLUST, especially when the
mixture components (i.e., the clusters) share regions with significant overlap
or outlying clusters exist with small but non-negligible weights. Two real
datasets are also used to illustrate the performance of our method in
comparison with others along with an application in image processing. It is
observed that our method detects the clusters with lower misclassification
rates and successfully points out the outlying (anomalous) observations from
these datasets. Comment: Pre-print
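To give a feel for the reweighting idea behind density power divergence estimation, the sketch below treats a much simpler case than the abstract: a single univariate normal component with known scale. In that case the estimating equation for the location reduces to a weighted mean with weights proportional to the model density raised to the power β, which can be solved by iterative reweighting. The function name, the choice β = 0.5, and the median starting value are assumptions made for this example; this is not the authors' mixture algorithm.

```python
import math
from statistics import median
from typing import Sequence


def dpd_location(xs: Sequence[float], beta: float = 0.5,
                 sigma: float = 1.0, iters: int = 100) -> float:
    """Iteratively reweighted estimate of a normal location with known scale.

    Each point gets weight exp(-beta * (x - mu)^2 / (2 * sigma^2)), i.e. the
    normal density at x raised to the power beta (up to a constant), so
    points far from the current estimate contribute almost nothing.
    """
    mu = median(xs)  # robust starting value (an assumption of this sketch)
    for _ in range(iters):
        weights = [math.exp(-beta * (x - mu) ** 2 / (2 * sigma ** 2)) for x in xs]
        mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
    return mu


# Five points symmetric around 0 plus one gross outlier.
data = [-1.0, -0.5, 0.0, 0.5, 1.0, 50.0]
print("sample mean:", sum(data) / len(data))
print("reweighted estimate:", dpd_location(data))
```

The sample mean is pulled far toward the outlier, while the reweighted estimate stays near the center of the bulk of the data; points that end up with very small weights are natural candidates for anomaly flags.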