Moderate Deviations Analysis of Binary Hypothesis Testing
This paper focuses on the moderate-deviations analysis of binary hypothesis testing. The analysis relies on a concentration inequality for discrete-parameter martingales with bounded jumps, which refines the Azuma-Hoeffding inequality. Relations of the analysis to the moderate deviations principle for i.i.d. random variables and to the relative entropy are considered.

Comment: Presented at the 2012 IEEE International Symposium on Information Theory (ISIT 2012) at MIT, Boston, July 2012. It appears in the Proceedings of ISIT 2012 on pages 826-83
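For reference, the Azuma-Hoeffding inequality that the paper refines can be stated as follows (the standard form, not the paper's refinement):

```latex
% Azuma-Hoeffding inequality (standard form).
% Let (X_k) be a martingale with bounded jumps |X_k - X_{k-1}| <= d_k a.s.
% Then for every t > 0,
\Pr\bigl( |X_n - X_0| \ge t \bigr)
  \le 2 \exp\!\left( - \frac{t^2}{2 \sum_{k=1}^{n} d_k^2} \right)
```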
Moderate Deviation Analysis for Classical Communication over Quantum Channels
© 2017, Springer-Verlag GmbH Germany. We analyse families of codes for classical data transmission over quantum channels that have both a vanishing probability of error and a code rate approaching capacity as the code length increases. To characterise the fundamental tradeoff between decoding error, code rate and code length for such codes, we introduce a quantum generalisation of the moderate deviation analysis proposed by Altuğ and Wagner as well as Polyanskiy and Verdú. We derive such a tradeoff for classical-quantum (as well as image-additive) channels in terms of the channel capacity and the channel dispersion, giving further evidence that the latter quantity characterises the necessary backoff from capacity when transmitting finite blocks of classical data. To derive these results we also study asymmetric binary quantum hypothesis testing in the moderate deviations regime. Due to the central importance of the latter task, we expect that our techniques will find further applications in the analysis of other quantum information processing tasks.
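As orientation for the moderate-deviations regime discussed above, the classical scaling established by Altuğ and Wagner and by Polyanskiy and Verdú takes the following form (a sketch; the paper's quantum statement carries its own conditions):

```latex
% Moderate-deviations scaling for channel coding (classical form).
% For a sequence a_n -> 0 with n a_n^2 -> infinity, codes whose error
% probability decays as exp(-n a_n^2) have optimal rates
R_n = C - \sqrt{2V}\, a_n + o(a_n)
% where C is the channel capacity and V the channel dispersion.
```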
Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities
This monograph presents a unified treatment of single- and multi-user
problems in Shannon's information theory where we depart from the requirement
that the error probability decays asymptotically in the blocklength. Instead,
the error probabilities for various problems are bounded above by a
non-vanishing constant and the spotlight is shone on achievable coding rates as
functions of the growing blocklengths. This represents the study of asymptotic
estimates with non-vanishing error probabilities.
In Part I, after reviewing the fundamentals of information theory, we discuss
Strassen's seminal result for binary hypothesis testing where the type-I error
probability is non-vanishing and the rate of decay of the type-II error
probability with growing number of independent observations is characterized.
In Part II, we use this basic hypothesis testing result to develop second- and sometimes even third-order asymptotic expansions for point-to-point communication. Finally, in Part III, we consider network information theory
problems for which the second-order asymptotics are known. These problems
include some classes of channels with random state, the multiple-encoder
distributed lossless source coding (Slepian-Wolf) problem and special cases of
the Gaussian interference and multiple-access channels. We conclude by discussing avenues for further research.

Comment: Further comments welcome
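For reference, the flavor of these expansions is captured by the second-order (normal approximation) result for channel coding at a fixed error probability (stated informally; the monograph gives the precise regularity conditions):

```latex
% Second-order expansion for the maximum code size M*(n, eps) at
% blocklength n and non-vanishing error probability eps:
\log M^*(n, \varepsilon) = nC + \sqrt{nV}\, \Phi^{-1}(\varepsilon) + O(\log n)
% where C is the capacity, V the channel dispersion, and Phi^{-1} the
% inverse of the standard normal CDF.
```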
Discrete Optimization for Interpretable Study Populations and Randomization Inference in an Observational Study of Severe Sepsis Mortality
Motivated by an observational study of the effect of hospital ward versus
intensive care unit admission on severe sepsis mortality, we develop methods to
address two common problems in observational studies: (1) when there is a lack
of covariate overlap between the treated and control groups, how to define an
interpretable study population wherein inference can be conducted without
extrapolating with respect to important variables; and (2) how to use
randomization inference to form confidence intervals for the average treatment
effect with binary outcomes. Our solution to problem (1) incorporates existing suggestions in the literature while yielding a study population that is easily understood in terms of the covariates themselves, and the resulting formulation can be solved using an efficient branch-and-bound algorithm. We address problem (2) by solving a linear integer program to utilize the worst-case variance of the average treatment effect among values of the unobserved potential outcomes that are compatible with the null hypothesis. Our analysis finds no evidence of a difference in sixty-day mortality rates between admitting all individuals to the ICU and admitting all individuals to the hospital ward, either among less severely ill patients or among patients with cryptic septic shock. We implement our methodology in R, providing scripts in the supplementary material.
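As a minimal illustration of randomization inference with binary outcomes, a Monte Carlo permutation test under the sharp null is sketched below in Python (this shows the basic idea only; the paper's worst-case-variance construction via integer programming is not reproduced here, and the data are hypothetical):

```python
import numpy as np

def randomization_test(y, z, n_perm=10_000, seed=0):
    """Monte Carlo randomization test for a difference in binary outcomes.

    y: 0/1 outcomes; z: 0/1 treatment indicators. Tests the sharp null of
    no treatment effect for any unit by re-randomizing treatment labels
    and recomputing the difference in means.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=int)
    observed = y[z == 1].mean() - y[z == 0].mean()
    exceed = 0
    for _ in range(n_perm):
        z_perm = rng.permutation(z)  # re-randomize treatment assignment
        stat = y[z_perm == 1].mean() - y[z_perm == 0].mean()
        if abs(stat) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)  # two-sided p-value

# Hypothetical toy data: binary outcomes under two admission types.
y = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1])
z = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
diff, pval = randomization_test(y, z)
print(f"observed difference: {diff:.3f}, p-value: {pval:.3f}")
```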
Decision trees in epidemiological research
Background:
In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods.
Main text:
We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups that share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression Tree (CART) technique and the newer Conditional Inference Tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion-size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees, one that provides a more scientifically meaningful characterization of them.
Conclusions:
Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.
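As a minimal sketch of CART-style subgroup identification (using scikit-learn's DecisionTreeRegressor on synthetic data; the variable names and thresholds are illustrative, not from the Box Lunch Study, and CTree itself is an R technique not shown here):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 500

# Synthetic covariates: two hypothetical predictors of an outcome.
age = rng.uniform(18, 80, n)
dose = rng.uniform(0, 10, n)

# Outcome with one true subgroup: older participants on a high dose have
# a higher mean outcome; everyone else is homogeneous.
outcome = np.where((age > 50) & (dose > 5), 3.0, 1.0) + rng.normal(0, 0.5, n)

X = np.column_stack([age, dose])

# A shallow tree keeps the subgroups interpretable; min_samples_leaf
# guards against tiny, unstable leaves.
tree = DecisionTreeRegressor(max_depth=2, min_samples_leaf=25).fit(X, outcome)

# Each leaf of the printed tree is a subgroup defined by covariate splits.
print(export_text(tree, feature_names=["age", "dose"]))
```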