159,707 research outputs found
On the Power of Conditional Samples in Distribution Testing
In this paper we define and examine the power of the
conditional-sampling oracle in the context of distribution-property testing.
The conditional-sampling oracle for a discrete distribution $\mu$ takes as
input a subset $S$ of the domain, and outputs a random sample drawn according to $\mu$, conditioned on $S$ (and independently of all
prior samples). The conditional-sampling oracle is a natural generalization of
the ordinary sampling oracle, in which $S$ always equals the whole domain.
We show that with the conditional-sampling oracle, testing uniformity,
testing identity to a known distribution, and testing any label-invariant
property of distributions is easier than with the ordinary sampling oracle. On
the other hand, we also show that for some distribution properties the
sample-complexity remains near-maximal even with conditional sampling.
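The oracle described in this abstract can be illustrated with a minimal Python sketch. It assumes the distribution is given explicitly as a list of probabilities over the domain {0, ..., n-1}, which a real tester would not have access to; the names `cond_sample` and `ordinary_sample` are hypothetical, and raising an error on a zero-mass subset is an assumption of this sketch, not necessarily the paper's convention.

```python
import random

def cond_sample(probs, subset, rng=random):
    """Return one element of `subset`, drawn with probability
    proportional to probs[i], i.e. the distribution conditioned on `subset`."""
    items = sorted(subset)
    weights = [probs[i] for i in items]
    if sum(weights) <= 0:
        # Assumption of this sketch: conditioning on a zero-mass set is an error.
        raise ValueError("subset has zero probability mass")
    return rng.choices(items, weights=weights, k=1)[0]

def ordinary_sample(probs, rng=random):
    """Ordinary sampling is the special case where the subset is the whole domain."""
    return cond_sample(probs, range(len(probs)), rng)
```

For example, `cond_sample([0.5, 0.25, 0.125, 0.125], {2, 3})` returns 2 or 3 with equal probability, since the two elements carry equal mass.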
New goodness-of-fit diagnostics for conditional discrete response models
This paper proposes new specification tests for conditional models with
discrete responses, which are key to apply efficient maximum likelihood
methods, to obtain consistent estimates of partial effects and to get
appropriate predictions of the probability of future events. In particular, we
test the static and dynamic ordered choice model specifications and can cover
infinite support distributions for e.g. count data. The traditional approach
for specification testing of discrete response models is based on probability
integral transforms of a jittered discrete data which leads to continuous
uniform iid series under the true conditional distribution. Then, standard
specification testing techniques for continuous variables could be applied to
the transformed series, but the extra randomness from jitters affects the power
properties of these methods. We investigate in this paper an alternative
transformation based only on original discrete data that avoids any
randomization. We analyze the asymptotic properties of goodness-of-fit tests
based on this new transformation and explore the properties in finite samples
of a bootstrap algorithm to approximate the critical values of test statistics
which are model and parameter dependent. We show analytically and in
simulations that our approach dominates the methods based on randomization in
terms of power. We apply the new tests to models of the monetary policy
conducted by the Federal Reserve.
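The "traditional approach" mentioned in this abstract, a probability integral transform of jittered discrete data, is commonly written as u = F(y-1) + V (F(y) - F(y-1)) with V uniform on (0, 1). The sketch below shows that textbook form only; it is not the paper's new non-randomized transform, and the Poisson model and the function names are assumptions chosen for illustration.

```python
import math
import random

def poisson_cdf(k, lam):
    """CDF of a Poisson(lam) variable at integer k (hypothetical example model)."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))

def jittered_pit(y, cdf, rng=random):
    """Randomized PIT: a uniform draw between F(y-1) and F(y).
    Under the true model the result is Uniform(0, 1)."""
    lo = cdf(y - 1)
    hi = cdf(y)
    return lo + rng.random() * (hi - lo)
```

The extra draw `rng.random()` is the jitter whose added randomness the abstract identifies as the source of power loss.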
New Goodness-of-fit Diagnostics for Conditional Discrete Response Models
This paper proposes new specification tests for conditional models with discrete responses, which are key to apply efficient maximum likelihood methods, to obtain consistent estimates of partial effects and to get appropriate predictions of the probability of future events. In particular, we test the static and dynamic ordered choice model specifications and can cover infinite support distributions for e.g. count data. The traditional approach for specification testing of discrete response models is based on probability integral transforms of a jittered discrete data which leads to continuous uniform iid series under the true conditional distribution. Then, standard specification testing techniques for continuous variables could be applied to the transformed series, but the extra randomness from jitters affects the power properties of these methods. We investigate in this paper an alternative transformation based only on original discrete data that avoids any randomization. We analyze the asymptotic properties of goodness-of-fit tests based on this new transformation and explore the properties in finite samples of a bootstrap algorithm to approximate the critical values of test statistics which are model and parameter dependent. We show analytically and in simulations that our approach dominates the methods based on randomization in terms of power. We apply the new tests to models of the monetary policy conducted by the Federal Reserve.
Support Size Estimation: The Power of Conditioning
We consider the problem of estimating the support size of a distribution.
Our investigations are pursued through the lens of distribution testing and
seek to understand the power of conditional sampling (denoted as COND), wherein
one is allowed to query the given distribution conditioned on an arbitrary
subset. The primary contribution of this work is to introduce a new
approach to lower bounds for the COND model that relies on using powerful tools
from information theory and communication complexity.
Our approach allows us to obtain surprisingly strong lower bounds for the
COND model and its extensions.
1) We bridge the longstanding gap between the upper and the lower bound for
the COND model by providing a nearly matching lower bound. Surprisingly, we show
that even if we get to know the actual probabilities along with COND samples,
still queries
are necessary.
2) We obtain the first non-trivial lower bound for COND equipped with an
additional oracle that reveals the conditional probabilities of the samples (to
the best of our knowledge, this subsumes all of the models previously studied):
in particular, we demonstrate that queries are necessary
Comparing the Accuracy of Copula-Based Multivariate Density Forecasts in Selected Regions of Support
This paper develops a testing framework for comparing the predictive accuracy of copula-based multivariate density forecasts, focusing on a specific part of the joint distribution. The test is framed in the context of the Kullback-Leibler Information Criterion, but using (out-of-sample) conditional likelihood and censored likelihood in order to focus the evaluation on the region of interest. Monte Carlo simulations document that the resulting test statistics have satisfactory size and power properties in small samples. In an empirical application to daily exchange rate returns we find evidence that the dependence structure varies with the sign and magnitude of returns, such that different parametric copula models achieve superior forecasting performance in different regions of the support. Our analysis highlights the importance of allowing for lower and upper tail dependence for accurate forecasting of common extreme appreciation and depreciation of different currencies.
Adapting Deep Learning for Underwater Acoustic Communication Channel Modeling
The recent emerging applications of novel underwater systems lead to increasing demand for underwater acoustic (UWA) communication and networking techniques. However, due to the challenging UWA channel characteristics, conventional wireless techniques are rarely applicable to UWA communication and networking. The cognitive and software-defined communication and networking are considered promising architecture of a novel UWA system design. As an essential component of a cognitive communication system, the modeling and prediction of the UWA channel impulse response (CIR) with deep generative models are studied in this work.
Firstly, an underwater acoustic communication and networking testbed is developed for conducting various simulations and field experiments. The proposed test-bed also demonstrated the capabilities of developing and testing SDN protocols for a UWA network in both simulation and field experiments.
Secondly, due to the lack of appropriate UWA CIR data sets for deep learning, a series of field UWA channel experiments have been conducted across a shallow freshwater river. Abundant UWA CIR data under various weather conditions have been collected and studied. The environmental factors that significantly affect the UWA channel state, including the solar radiation rate, the air temperature, the ice cover, the precipitation rate, etc., are analyzed in the case studies. The obtained UWA CIR data set with significant correlations to weather conditions can benefit future deep-learning research on UWA channels.
Thirdly, a Wasserstein conditional generative adversarial network (WCGAN) is proposed to model the observed UWA CIR distribution. A power-weighted Jensen-Shannon divergence (JSD) is proposed to measure the similarity between the generated distribution and the experimental observations. The CIR samples generated by the WCGAN model show a lower power-weighted JSD than conventional estimated stochastic distributions.
Finally, a modified conditional generative adversarial network (CGAN) model is proposed for predicting the UWA CIR distribution in the 15-minute range near future. This prediction model takes a sequence of historical and forecast weather information with a recent CIR observation as the conditional input. The generated CIR sample predictions also show a lower power-weighted JSD than conventional estimated stochastic distributions.
Near-optimal multiple testing in Bayesian linear models with finite-sample FDR control
In high dimensional variable selection problems, statisticians often seek to
design multiple testing procedures controlling the false discovery rate (FDR)
and simultaneously discovering more relevant variables. Model-X methods, such
as Knockoffs and conditional randomization tests, achieve the first goal of
finite-sample FDR control under the assumption of known covariates
distribution. However, it is not clear whether these methods can concurrently
achieve the second goal of maximizing the number of discoveries. In fact,
designing procedures to discover more relevant variables with finite-sample FDR
control is a largely open question, even in the arguably simplest linear
models.
In this paper, we derive near-optimal testing procedures in high dimensional
Bayesian linear models with isotropic covariates. We propose a Model-X multiple
testing procedure, PoEdCe, which provably controls the frequentist FDR from
finite samples even under model misspecification, and conjecturally achieves
near-optimal power when the data follow the Bayesian linear model with a known
prior. PoEdCe has three important ingredients: Posterior Expectation, distilled
Conditional randomization test (dCRT), and the Benjamini-Hochberg procedure
with e-values (eBH). The optimality conjecture of PoEdCe is based on a
heuristic calculation of its asymptotic true positive proportion (TPP) and
false discovery proportion (FDP), which is supported by methods from
statistical physics as well as extensive numerical simulations. Furthermore,
when the prior is unknown, we show that an empirical Bayes variant of PoEdCe
still has finite-sample FDR control and achieves near-optimal power.
Comment: 45 pages, 5 figures
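Of the three ingredients named in this abstract, the last one, the Benjamini-Hochberg procedure with e-values (eBH), has a short standard form: with m hypotheses and level alpha, reject the k* hypotheses with the largest e-values, where k* is the largest k such that the k-th largest e-value is at least m / (alpha k). The sketch below covers only that final step; how PoEdCe constructs its e-values is not shown, and the function name `ebh` is hypothetical.

```python
def ebh(e_values, alpha):
    """Return the sorted indices rejected by the e-BH procedure at level alpha."""
    m = len(e_values)
    # Indices ordered from largest to smallest e-value.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Keep the largest k whose k-th largest e-value clears m / (alpha * k).
        if e_values[idx] >= m / (alpha * k):
            k_star = k
    return sorted(order[:k_star])
```

With `e_values = [50, 1, 30, 0.5]` and `alpha = 0.1`, the thresholds for k = 1, 2 are 40 and 20, so hypotheses 0 and 2 are rejected.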