Multiclass Learnability Does Not Imply Sample Compression
A hypothesis class admits a sample compression scheme if, for every sample
labeled by a hypothesis from the class, it is possible to retain only a small
subsample from which the labels on the entire sample can be inferred. The
size of the compression scheme is an upper bound on the size of the subsample
produced. Every learnable binary hypothesis class (which must necessarily have
finite VC dimension) admits a sample compression scheme of size only a finite
function of its VC dimension, independent of the sample size. For multiclass
hypothesis classes, the analog of VC dimension is the DS dimension. We show
that the analogous statement pertaining to sample compression is not true for
multiclass hypothesis classes: not every learnable multiclass hypothesis class,
which must necessarily have finite DS dimension, admits a sample
compression scheme of size only a finite function of its DS dimension.
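The definition of a sample compression scheme can be made concrete with a minimal sketch for a simple binary class. The example is an illustration only, not taken from the paper: it assumes the class of one-dimensional thresholds (label 1 exactly when x is at least some threshold t), and the names `compress` and `reconstruct` are hypothetical. For this class a scheme of size 1 suffices, whatever the sample size.

```python
from typing import Callable, List, Optional, Tuple

Sample = List[Tuple[float, int]]  # (point, label) pairs

def compress(sample: Sample) -> Optional[Tuple[float, int]]:
    """Retain at most one labeled point: the smallest point labeled 1."""
    positives = [x for x, y in sample if y == 1]
    if not positives:
        return None  # an all-zero sample needs no retained point
    return (min(positives), 1)

def reconstruct(kept: Optional[Tuple[float, int]]) -> Callable[[float], int]:
    """Turn the retained subsample back into a labeling rule."""
    if kept is None:
        return lambda x: 0
    threshold, _ = kept
    return lambda x: 1 if x >= threshold else 0

# A sample labeled by some threshold between 0.9 and 1.2.
sample = [(0.3, 0), (1.7, 1), (0.9, 0), (2.4, 1), (1.2, 1)]
kept = compress(sample)
h = reconstruct(kept)
recovered = [h(x) for x, _ in sample]
```

Here the single retained point is enough to recover every label in the sample, which is what the definition requires.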
A Characterization of List Learnability
A classical result in learning theory shows the equivalence of PAC
learnability of binary hypothesis classes and the finiteness of VC dimension.
Extending this to the multiclass setting was an open problem, which was settled
in a recent breakthrough result characterizing multiclass PAC learnability via
the DS dimension introduced earlier by Daniely and Shalev-Shwartz. In this work
we consider list PAC learning where the goal is to output a list of $k$
predictions. List learning algorithms have been developed in several settings
before and indeed, list learning played an important role in the recent
characterization of multiclass learnability. In this work we ask: when is it
possible to $k$-list learn a hypothesis class? We completely characterize
$k$-list learnability in terms of a generalization of DS dimension that we call
the $k$-DS dimension. Generalizing the recent characterization of multiclass
learnability, we show that a hypothesis class is $k$-list learnable if and only
if the $k$-DS dimension is finite.
Testing with Non-identically Distributed Samples
We examine the extent to which sublinear-sample property testing and
estimation applies to settings where samples are independently but not
identically distributed. Specifically, we consider the following distributional
property testing framework: Suppose there is a set of distributions over a
discrete support of size $k$, $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_T$,
and we obtain $c$ independent draws from each distribution. Suppose the goal is
to learn or test a property of the average distribution,
$\mathbf{p}_{\mathrm{avg}}$. This setup models a number of important practical
settings where the individual distributions correspond to heterogeneous
entities -- either individuals, chronologically distinct time periods,
spatially separated data sources, etc. From a learning standpoint, even with
$c=1$ samples from each distribution, $\Theta(k/\varepsilon^2)$ samples are
necessary and sufficient to learn $\mathbf{p}_{\mathrm{avg}}$ to within error
$\varepsilon$ in TV distance. To test uniformity or identity -- distinguishing
the case that $\mathbf{p}_{\mathrm{avg}}$ is equal to some reference
distribution, versus has $\ell_1$ distance at least $\varepsilon$ from the
reference distribution, we show that a linear number of samples in $k$ is
necessary given $c=1$ samples from each distribution. In contrast, for
$c \ge 2$, we recover the usual sublinear sample testing of the i.i.d. setting: we
show that $O(\sqrt{k}/\varepsilon^2 + 1/\varepsilon^4)$ samples are sufficient,
matching the optimal sample complexity in the i.i.d. case in the regime where
$\varepsilon \ge k^{-1/4}$. Additionally, we show that in the $c=2$ case, there
is a constant $\rho > 0$ such that even in the linear regime with $\rho k$
samples, no tester that considers the multiset of samples (ignoring which
samples were drawn from the same $\mathbf{p}_i$) can perform uniformity
testing.
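For orientation, the classical i.i.d. baseline that the abstract compares against can be sketched with a collision statistic. This is not the paper's tester for non-identically distributed samples. It is a standard textbook idea, written here under assumptions: `k` denotes the support size, `eps` the distance parameter, and the function names and the threshold `(1 + eps**2 / 2) / k` are illustrative choices. The sketch relies on the fact that the uniform distribution has collision probability 1/k, while any distribution at l1 distance at least eps from uniform has collision probability at least (1 + eps^2)/k.

```python
from collections import Counter
from typing import Sequence

def collision_rate(samples: Sequence[int]) -> float:
    """Fraction of unordered sample pairs that landed on the same element."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to form a pair")
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (n * (n - 1) // 2)

def looks_uniform(samples: Sequence[int], k: int, eps: float) -> bool:
    """Accept when the collision rate is below the midpoint threshold."""
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / k

# Deterministic toy inputs on a support of size 4.
spread_out = [0, 1, 2, 3]
concentrated = [0, 0, 0, 0]
```

The statistic pools all samples into one multiset, so it ignores which samples came from the same source.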
NEPHROPROTECTIVE ROLE OF ZINC AGAINST THE AMMONIUM SULFATE TOXICITY IN MALE ALBINO RATS
Objective: The present study investigates the protective role of zinc against ammonium sulfate (AS) toxicity in renal tissue by evaluating certain biochemical activities in albino rats.
Methods: Rats were divided into four groups, namely control, ammonia, zinc, and ammonia + zinc. Rats received AS (18.3 mg/kg body weight), zinc chloride (4 mg/kg body weight), or both by intraperitoneal injection at 24-h intervals for 7 days.
Results: AS-administered rats showed significantly increased levels of ammonia, urea, glutamine, glutamine synthetase, free amino acids, and lactate dehydrogenase and decreased levels of total proteins, pyruvate, succinate dehydrogenase, malate dehydrogenase, and biochemical activities when compared with control. Supplementation of zinc mitigated AS-induced oxidative stress and restored all the biochemical parameters. Zinc administered to normal rats did not produce significant changes in any of the parameters studied.
Conclusion: The study concludes that zinc cotreatment effectively restored the mitochondrial enzyme activities and the biochemical parameters of ammonia metabolism in the renal tissue of AS-treated rats.
Provable benefits of score matching
Score matching is an alternative to maximum likelihood (ML) for estimating a
probability distribution parametrized up to a constant of proportionality. By
fitting the ''score'' of the distribution, it sidesteps the need to compute
this constant of proportionality (which is often intractable). While score
matching and variants thereof are popular in practice, their benefits and
tradeoffs relative to maximum likelihood -- both computational and
statistical -- are not well understood theoretically. In this work, we give
the first example of a natural exponential family of distributions such that
the score matching loss is computationally efficient to optimize, and has a
comparable statistical efficiency to ML, while the ML loss is intractable to
optimize using a gradient-based method. The family consists of exponentials of
polynomials of fixed degree, and our result can be viewed as a continuous
analogue of recent developments in the discrete setting. Precisely, we show:
(1) Designing a zeroth-order or first-order oracle for optimizing the maximum
likelihood loss is NP-hard. (2) Maximum likelihood has a statistical efficiency
polynomial in the ambient dimension and the radius of the parameters of the
family. (3) Minimizing the score matching loss is both computationally and
statistically efficient, with complexity polynomial in the ambient dimension.
Comment: 25 pages
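To see why score matching avoids the normalizing constant, a minimal worked sketch helps. It is a toy instance chosen for illustration, not the construction analyzed in the paper: it assumes a one-dimensional family p(x) proportional to exp(theta1*x + theta2*x^2), and the function name `score_matching_fit` is hypothetical. For an exponential family the empirical score matching loss is quadratic in the parameters, (1/2) theta^T A theta + b^T theta, with A the average outer product of the first derivatives of the sufficient statistics and b the average of their second derivatives, so the minimizer solves a linear system.

```python
from typing import Sequence, Tuple

def score_matching_fit(xs: Sequence[float]) -> Tuple[float, float]:
    """Closed-form score matching estimate for exp(theta1*x + theta2*x**2).

    Sufficient statistics T(x) = (x, x^2), so T'(x) = (1, 2x), T''(x) = (0, 2).
    The estimate solves A theta = -b, where A = mean(T' T'^T), b = mean(T'').
    """
    n = len(xs)
    m1 = sum(xs) / n                    # empirical first moment
    m2 = sum(x * x for x in xs) / n     # empirical second moment
    a11, a12, a22 = 1.0, 2.0 * m1, 4.0 * m2
    b1, b2 = 0.0, 2.0
    det = a11 * a22 - a12 * a12         # equals 4 * empirical variance
    if det == 0:
        raise ValueError("data has zero variance; estimate is undefined")
    # Cramer's rule for the 2x2 system A theta = (-b1, -b2).
    theta1 = (-b1 * a22 + a12 * b2) / det
    theta2 = (-a11 * b2 + a12 * b1) / det
    return theta1, theta2

theta1, theta2 = score_matching_fit([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this data (mean 3, variance 2) the estimate matches the Gaussian natural parameters mean/variance and -1/(2*variance), and no integral over x was ever computed.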
Harnessing the Power of Choices in Decision Tree Learning
We propose a simple generalization of standard and empirically successful
decision tree learning algorithms such as ID3, C4.5, and CART. These
algorithms, which have been central to machine learning for decades, are greedy
in nature: they grow a decision tree by iteratively splitting on the best
attribute. Our algorithm, Top-$k$, considers the $k$ best attributes as
possible splits instead of just the single best attribute. We demonstrate,
theoretically and empirically, the power of this simple generalization. We
first prove a greediness hierarchy theorem showing that for every
$k \in \mathbb{N}$, Top-$(k+1)$ can be dramatically more powerful than Top-$k$: there
are data distributions for which the former achieves accuracy $1-\varepsilon$,
whereas the latter only achieves accuracy $\frac{1}{2}+\varepsilon$. We then
show, through extensive experiments, that Top-$k$ outperforms the two main
approaches to decision tree learning: classic greedy algorithms and more recent
"optimal decision tree" algorithms. On one hand, Top-$k$ consistently enjoys
significant accuracy gains over greedy algorithms across a wide range of
benchmarks. On the other hand, Top-$k$ is markedly more scalable than optimal
decision tree algorithms and is able to handle dataset and feature set sizes
that remain far beyond the reach of these algorithms.
Comment: NeurIPS 202
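The idea of keeping several candidate splits per node, rather than only the greedy one, can be sketched in a few lines. This is a rough illustration under assumptions, not the authors' implementation: features and labels are binary, candidates are ranked by weighted Gini impurity (the choice of criterion is an assumption), and the hypothetical function `top_k_errors` returns only the number of training errors of the best depth-limited tree it finds. Setting `k=1` recovers the usual greedy procedure.

```python
from typing import List, Tuple

Row = Tuple[Tuple[int, ...], int]  # (binary features, binary label)

def gini(rows: List[Row]) -> float:
    if not rows:
        return 0.0
    p = sum(y for _, y in rows) / len(rows)
    return 2 * p * (1 - p)

def split(rows: List[Row], f: int) -> Tuple[List[Row], List[Row]]:
    return ([r for r in rows if r[0][f] == 0],
            [r for r in rows if r[0][f] == 1])

def leaf_errors(rows: List[Row]) -> int:
    """Training errors made by predicting the majority label."""
    ones = sum(y for _, y in rows)
    return min(ones, len(rows) - ones)

def top_k_errors(rows: List[Row], depth: int, k: int) -> int:
    """Fewest training errors over trees built from the k best splits per node."""
    if depth == 0 or leaf_errors(rows) == 0:
        return leaf_errors(rows)

    def weighted_impurity(f: int) -> float:
        left, right = split(rows, f)
        return len(left) * gini(left) + len(right) * gini(right)

    n_features = len(rows[0][0])
    candidates = sorted(range(n_features), key=weighted_impurity)[:k]
    best = leaf_errors(rows)
    for f in candidates:
        left, right = split(rows, f)
        best = min(best, top_k_errors(left, depth - 1, k)
                         + top_k_errors(right, depth - 1, k))
    return best

# Label is the XOR of features 0 and 1; feature 2 is a tempting distractor.
rows = [((0, 0, 0), 0), ((0, 0, 0), 0), ((0, 1, 1), 1), ((0, 1, 1), 1),
        ((1, 0, 1), 1), ((1, 0, 0), 1), ((1, 1, 0), 0), ((1, 1, 1), 0)]
greedy_errors = top_k_errors(rows, depth=2, k=1)
top2_errors = top_k_errors(rows, depth=2, k=2)
```

On this toy data the greedy search commits to the distractor at the root and cannot recover within depth 2, while the search with two candidates per node also tries feature 0 and fits the XOR exactly.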
A cost effective real-time PCR for the detection of adenovirus from viral swabs
Compared to traditional testing strategies, nucleic acid amplification tests such as real-time PCR offer many advantages for the detection of human adenoviruses. However, commercial assays are expensive and cost-prohibitive for many clinical laboratories. To overcome fiscal challenges, a cost-effective strategy was developed using a combination of homogenization and heat treatment with an “in-house” real-time PCR. In 196 swabs submitted for adenovirus detection, this crude extraction method showed performance characteristics equivalent to viral DNA obtained from a commercial nucleic acid extraction. In addition, the in-house real-time PCR outperformed traditional testing strategies using virus culture, with sensitivities of 100% and 69.2%, respectively. Overall, the combination of homogenization and heat treatment with a sensitive in-house real-time PCR provides accurate results at a cost comparable to viral culture.