Adaptive Data Depth via Multi-Armed Bandits
Data depth, introduced by Tukey (1975), is an important tool in data science,
robust statistics, and computational geometry. One chief barrier to its broader
practical utility is that many common measures of depth are computationally
intensive, requiring on the order of n^d operations to exactly compute the
depth of a single point within a data set of n points in d-dimensional
space. Often however, we are not directly interested in the absolute depths of
the points, but rather in their relative ordering. For example, we may want to
find the most central point in a data set (a generalized median), or to
identify and remove all outliers (points on the fringe of the data set with low
depth). With this observation, we develop a novel and instance-adaptive
algorithm for adaptive data depth computation by reducing the problem of
exactly computing depths to an n-armed stochastic multi-armed bandit
problem which we can efficiently solve. We focus our exposition on simplicial
depth, developed by Liu (1990), which has emerged as a promising notion of
depth due to its interpretability and asymptotic properties. We provide general
instance-dependent theoretical guarantees for our proposed algorithms, which
readily extend to many other common measures of data depth including majority
depth, Oja depth, and likelihood depth. When specialized to the case where the
gaps in the data follow a power law distribution with parameter \alpha < 2, we
show that we can reduce the complexity of identifying the deepest point in the
data set (the simplicial median) from O(n^d) to
\tilde{O}(n^{d - (d-1)\alpha/2}), where \tilde{O} suppresses logarithmic
factors. We corroborate our theoretical results with numerical experiments on
synthetic data, showing the practical utility of our proposed methods.
Comment: Keywords: multi-armed bandits, data depth, adaptivity, large-scale computation, simplicial depth
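A minimal sketch of the core idea, assuming a Monte Carlo estimate of simplicial depth (each random (d+1)-subset containing a point is one Bernoulli "pull") combined with a standard successive-elimination bandit to find the deepest point; the function names, constants, and elimination schedule are illustrative and not the authors' implementation:

```python
# Illustrative sketch (assumed names, not the paper's code): estimate simplicial
# depth by Monte Carlo and find the simplicial median with successive elimination.
import numpy as np

def contains(simplex, x):
    """Check whether point x lies in the simplex spanned by the d+1 rows of `simplex`."""
    d = x.shape[0]
    # Solve for barycentric coordinates w: simplex.T @ w = x with sum(w) = 1.
    A = np.vstack([simplex.T, np.ones(d + 1)])
    b = np.append(x, 1.0)
    try:
        w = np.linalg.solve(A, b)
    except np.linalg.LinAlgError:
        return False  # degenerate simplex
    return bool(np.all(w >= -1e-12))

def pull(data, i, rng):
    """One bandit pull for arm i: is data[i] inside a random (d+1)-point simplex?"""
    n, d = data.shape
    idx = rng.choice(np.delete(np.arange(n), i), size=d + 1, replace=False)
    return float(contains(data[idx], data[i]))

def simplicial_median(data, delta=0.05, max_pulls=200_000, seed=0):
    """Successive elimination: keep sampling only arms that might still be deepest."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    active = np.arange(n)
    means = np.zeros(n)
    counts = np.zeros(n, dtype=int)
    total = 0
    while len(active) > 1 and total < max_pulls:
        for i in active:
            means[i] = (means[i] * counts[i] + pull(data, i, rng)) / (counts[i] + 1)
            counts[i] += 1
            total += 1
        # Hoeffding-style confidence radius (illustrative choice of constants).
        rad = np.sqrt(np.log(4 * n * counts[active] ** 2 / delta) / (2 * counts[active]))
        best_lower = np.max(means[active] - rad)
        active = active[means[active] + rad >= best_lower]
    return active[np.argmax(means[active])]

if __name__ == "__main__":
    pts = np.random.default_rng(1).normal(size=(200, 2))
    print("estimated simplicial median index:", simplicial_median(pts))
```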
An In-Depth Look at Recent Influenza Seasons and Vaccine Effectiveness
This paper aims to present an in-depth exploration of immunology, the influenza virus, vaccination, and vaccination's effectiveness with respect to influenza. It also delves into the possible causes behind the large increase in early childhood deaths during the 2003-2004 influenza season, which was a turning point in terms of influenza incident reporting. Finally, a data analysis of the relationship between childhood flu vaccine coverage and childhood outpatient ILI (influenza-like illness) visits by region is presented as a measure of vaccine effectiveness and an identifier of trends. Although this relationship was not statistically significant (alpha=0.05) regionally, this points to other factors at work in the relationship between vaccine coverage and outpatient visits in children. The same comparison made over time with national statistics did prove statistically significant (p=0.02); however, other variables are hypothesized to be present in this relationship as well.
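A hedged sketch of the kind of significance test described, assuming toy data and a Pearson correlation (the abstract does not specify the exact test); the arrays below are placeholders, not the paper's data:

```python
# Illustrative sketch (assumed data and test choice): checking whether the
# relationship between childhood flu-vaccine coverage and outpatient ILI visits
# is statistically significant at alpha = 0.05.
import numpy as np
from scipy import stats

coverage = np.array([52.1, 55.4, 58.0, 60.2, 61.7, 63.5, 64.9, 66.3])  # % coverage (toy)
ili_visits = np.array([3.1, 2.9, 3.0, 2.6, 2.5, 2.4, 2.5, 2.2])        # % outpatient ILI (toy)

r, p_value = stats.pearsonr(coverage, ili_visits)
alpha = 0.05
verdict = "significant" if p_value < alpha else "not significant"
print(f"r = {r:.2f}, p = {p_value:.3f}, {verdict} at alpha = {alpha}")
```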
Covariance-Aware Private Mean Estimation Without Private Covariance Estimation
We present two sample-efficient differentially private mean estimators for
d-dimensional (sub)Gaussian distributions with unknown covariance.
Informally, given n \gtrsim d/\alpha^2 samples from such a distribution with
mean \mu and covariance \Sigma, our estimators output \tilde{\mu} such that
\|\tilde{\mu} - \mu\|_\Sigma \leq \alpha, where \|\cdot\|_\Sigma is
the Mahalanobis distance. All previous estimators with the same guarantee
either require strong a priori bounds on the covariance matrix or require
\Omega(d^{3/2}) samples.
Each of our estimators is based on a simple, general approach to designing
differentially private mechanisms, but with novel technical steps to make the
estimator private and sample-efficient. Our first estimator samples a point
with approximately maximum Tukey depth using the exponential mechanism, but
restricted to the set of points of large Tukey depth. Proving that this
mechanism is private requires a novel analysis. Our second estimator perturbs
the empirical mean of the data set with noise calibrated to the empirical
covariance, without releasing the covariance itself. Its sample complexity
guarantees hold more generally for subgaussian distributions, albeit with a
slightly worse dependence on the privacy parameter. For both estimators,
careful preprocessing of the data is required to satisfy differential privacy.
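A minimal sketch of the second estimator's basic shape, assuming only the high-level description above: perturb the empirical mean with Gaussian noise whose covariance is proportional to the empirical covariance. The noise multiplier is a placeholder, and the careful preprocessing and calibration that make the paper's mechanism differentially private are omitted:

```python
# Illustrative sketch (not the paper's calibrated mechanism): covariance-shaped
# Gaussian perturbation of the empirical mean. With noise proportional to the
# empirical covariance, the error is controlled in the Mahalanobis norm ||.||_Sigma
# rather than the Euclidean norm. No clipping or privacy accounting is done here.
import numpy as np

def covariance_aware_mean(X, noise_multiplier=1.0, seed=0):
    """Return empirical mean plus Gaussian noise shaped by the empirical covariance."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu_hat = X.mean(axis=0)                                  # empirical mean
    cov_hat = np.cov(X, rowvar=False) + 1e-9 * np.eye(d)     # empirical covariance (regularized)
    noise = rng.multivariate_normal(np.zeros(d), (noise_multiplier / n) ** 2 * cov_hat)
    return mu_hat + noise

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    Sigma = np.array([[4.0, 1.5], [1.5, 1.0]])
    X = rng.multivariate_normal([2.0, -1.0], Sigma, size=5000)
    print(covariance_aware_mean(X, noise_multiplier=2.0))
```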
Reionization constraints using Principal Component Analysis
Using a semi-analytical model developed by Choudhury & Ferrara (2005) we
study the observational constraints on reionization via a principal component
analysis (PCA). Assuming that reionization at z>6 is primarily driven by
stellar sources, we decompose the unknown function N_{ion}(z), representing the
number of photons in the IGM per baryon in collapsed objects, into its
principal components and constrain the latter using the photoionization rate
obtained from Ly-alpha forest Gunn-Peterson optical depth, the WMAP7 electron
scattering optical depth and the redshift distribution of Lyman-limit systems
at z \sim 3.5. The main findings of our analysis are: (i) It is sufficient to
model N_{ion}(z) over the redshift range 2<z<14 using 5 parameters to extract
the maximum information contained within the data. (ii) All quantities related
to reionization can be severely constrained for z<6 because of a large number
of data points whereas constraints at z>6 are relatively loose. (iii) The weak
constraints on N_{ion}(z) at z>6 do not allow us to disentangle different feedback
models with present data. There is a clear indication that N_{ion}(z) must
increase at z>6, thus ruling out reionization by a single stellar population
with non-evolving IMF, and/or star-forming efficiency, and/or photon escape
fraction. The data allows for non-monotonic N_{ion}(z) which may contain sharp
features around z \sim 7. (iv) The PCA implies that reionization must be 99%
completed between 5.8<z<10.3 (95% confidence level) and is expected to be 50%
complete at z \approx 9.5-12. With future data sets, like those obtained by
Planck, the z>6 constraints will be significantly improved.
Comment: Accepted in MNRAS. Revised to match the accepted version.
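A toy sketch of the generic PCA step described above, assuming an illustrative fiducial N_ion(z) and ensemble of histories (not the paper's semi-analytical model): decompose the histories over redshift bins into principal components and keep the leading five modes as the parameters to constrain:

```python
# Illustrative sketch (toy setup, not the paper's model): decompose N_ion(z) over
# redshift bins into principal components and truncate to the leading modes.
import numpy as np

z = np.linspace(2.0, 14.0, 25)                     # redshift bins (illustrative)
rng = np.random.default_rng(0)
fiducial = 10.0 + 2.0 * np.tanh((z - 7.0) / 2.0)   # toy fiducial N_ion(z)
# Toy ensemble of allowed histories around the fiducial model, with larger
# scatter at high z where the data constrain N_ion(z) only weakly.
samples = fiducial + rng.normal(size=(2000, z.size)) * (1.0 + 0.2 * (z - 2.0))

cov = np.cov(samples, rowvar=False)                # covariance of the histories
eigvals, eigvecs = np.linalg.eigh(cov)             # principal components
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

n_modes = 5                                        # abstract: ~5 modes suffice

def n_ion(amplitudes):
    """Reconstruct N_ion(z) from the leading principal-component amplitudes."""
    return fiducial + eigvecs[:, :n_modes] @ amplitudes

print("variance captured by 5 modes:", eigvals[:n_modes].sum() / eigvals.sum())
```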
A Monte Carlo simulation of the Sudbury Neutrino Observatory proportional counters
The third phase of the Sudbury Neutrino Observatory (SNO) experiment added an
array of 3He proportional counters to the detector. The purpose of this Neutral
Current Detection (NCD) array was to observe neutrons resulting from
neutral-current solar neutrino-deuteron interactions. We have developed a
detailed simulation of the current pulses from the NCD array proportional
counters, from the primary neutron capture on 3He through the NCD array
signal-processing electronics. This NCD array Monte Carlo simulation was used
to model the alpha-decay background in SNO's third-phase 8B solar-neutrino
measurement.
Comment: 38 pages; submitted to the New Journal of Physics.
- …