The Limits of Post-Selection Generalization
While statistics and machine learning offer numerous methods for ensuring
generalization, these methods often fail in the presence of adaptivity---the
common practice in which the choice of analysis depends on previous
interactions with the same dataset. A recent line of work has introduced
powerful, general-purpose algorithms that ensure post hoc generalization (also
called robust or post-selection generalization), which says that, given the
output of the algorithm, it is hard to find any statistic for which the data
differs significantly from the population it came from.
In this work we show several limitations on the power of algorithms
satisfying post hoc generalization. First, we show a tight lower bound on the
error of any algorithm that satisfies post hoc generalization and answers
adaptively chosen statistical queries, showing a strong barrier to progress in
post selection data analysis. Second, we show that post hoc generalization is
not closed under composition, despite many examples of such algorithms
exhibiting strong composition properties.
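The failure mode this abstract addresses can be illustrated with a minimal sketch (my own toy example, not code from the paper): a naive mechanism answers statistical queries with empirical means, and a single adaptively chosen follow-up query already overfits the sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 1000  # sample size, number of attributes
# Population: each attribute is an independent fair coin, so every
# statistical query q_i(x) = x_i has true mean 0.
data = rng.choice([-1.0, 1.0], size=(n, d))

# Round 1: ask d non-adaptive queries; the naive mechanism answers
# each with its empirical mean on the sample.
answers = data.mean(axis=0)

# Round 2 (adaptive): the analyst builds a new query from the previous
# answers -- the sign pattern of the empirical means.  On the sample
# this query is biased upward, although its population mean is 0.
signs = np.sign(answers)
adaptive_answer = float((data @ signs).mean() / d)

print(adaptive_answer)  # clearly above the true population value of 0
```

The adaptive answer concentrates around sqrt(2/(pi*n)) rather than 0, which is exactly the kind of gap between sample and population that post hoc generalization is meant to control.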
Generalization from correlated sets of patterns in the perceptron
Generalization is a central aspect of learning theory. Here, we propose a
framework that explores an auxiliary task-dependent notion of generalization,
and attempts to quantitatively answer the following question: given two sets of
patterns with a given degree of dissimilarity, how easily will a network be
able to "unify" their interpretation? This is quantified by the volume of the
configurations of synaptic weights that classify the two sets in a similar
manner. To show the applicability of our idea in a concrete setting, we compute
this quantity for the perceptron, a simple binary classifier, using the
classical statistical physics approach in the replica-symmetric ansatz. In this
case, we show how an analytical expression measures the "distance-based
capacity", the maximum load of patterns sustainable by the network, at fixed
dissimilarity between patterns and fixed allowed number of errors. This curve
indicates that generalization is possible at any distance, but with decreasing
capacity. We propose that a distance-based definition of generalization may be
useful in numerical experiments with real-world neural networks, and to explore
computationally sub-dominant sets of synaptic solutions.
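As background for the capacity discussion above, a short sketch (my own illustration, not from the paper) shows the classical fact the abstract builds on: below the critical load, a perceptron can store random binary patterns exactly, which the standard mistake-driven learning rule finds in finite time.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 50, 20  # input dimension and pattern count; load alpha = P/N = 0.4
X = rng.choice([-1.0, 1.0], size=(P, N))  # random binary patterns
y = rng.choice([-1.0, 1.0], size=P)       # random target labels

# Classic perceptron learning rule: update the weights on each mistake.
w = np.zeros(N)
for _ in range(1000):
    errors = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi) <= 0:
            w += yi * xi / N
            errors += 1
    if errors == 0:
        break

print(errors)  # 0: well below the critical capacity alpha_c = 2
```

The replica calculation in the abstract refines this picture by asking not just whether a solution exists, but how large the volume of solutions is when two dissimilar pattern sets must be classified consistently.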
Asymptotic Freedom: From Paradox to Paradigm
Asymptotic freedom was developed as a response to two paradoxes: the
weirdness of quarks, and in particular their failure to radiate copiously when
struck; and the coexistence of special relativity and quantum theory, despite
the apparent singularity of quantum field theory. It resolved these paradoxes,
and catalyzed the development of several modern paradigms: the hard reality of
quarks and gluons, the origin of mass from energy, the simplicity of the early
universe, and the power of symmetry as a guide to physical law.
Comment: 26 pages, 10 figures. Lecture on receipt of the 2004 Nobel Prize. v2: typo (in Ohm's law) corrected.
Superstatistical generalization of the work fluctuation theorem
We derive a generalized version of the work fluctuation theorem for
nonequilibrium systems with spatio-temporal temperature fluctuations. For
chi-square distributed inverse temperature we obtain a generalized fluctuation
theorem based on q-exponentials, whereas for other temperature distributions
more complicated formulae arise. Since q-exponentials have a power law decay,
the decay rate in this generalized fluctuation theorem is much slower than the
conventional exponential decay. This implies that work fluctuations can be of
relevance for the design of micro and nano structures, since the work done on
the system is relatively much larger than in the conventional fluctuation
theorem.
Comment: 13 pages. Contribution to the Proceedings of `Trends and Perspectives in Extensive and Nonextensive Statistical Mechanics', in honour of Constantino Tsallis' 60th birthday (to appear in Physica A).
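For reference, the q-exponential invoked in this abstract is the standard Tsallis deformation of the exponential (a textbook definition, not taken from the paper itself):

```latex
e_q(x) = \bigl[1 + (1-q)\,x\bigr]^{\frac{1}{1-q}}, \qquad \lim_{q \to 1} e_q(x) = e^{x},
```

and in Beck-Cohen superstatistics a chi-square distribution f(\beta) of the inverse temperature turns the ordinary Boltzmann factor into such a q-exponential upon averaging:

```latex
\int_0^{\infty} f(\beta)\, e^{-\beta W}\, d\beta = e_q(-\beta_0 W),
```

with q and the effective inverse temperature \beta_0 fixed by the mean and variance of f. Since e_q decays as a power law for q > 1, the tails of the generalized fluctuation theorem are much heavier than the conventional exponential ones, as the abstract states.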