Analysis of overfitting in the regularized Cox model
The Cox proportional hazards model is ubiquitous in the analysis of
time-to-event data. However, when the data dimension p is comparable to the
sample size N, maximum likelihood estimates for its regression parameters are
known to be biased or break down entirely due to overfitting. This prompted the
introduction of the so-called regularized Cox model. In this paper we use the
replica method from statistical physics to investigate the relationship between
the true and inferred regression parameters in regularized multivariate Cox
regression with L2 regularization, in the regime where both p and N are large
but with p/N ~ O(1). We thereby generalize a recent study from maximum
likelihood to maximum a posteriori inference. We also establish a relationship
between the optimal regularization parameter and p/N, allowing for
straightforward overfitting corrections in time-to-event analysis.
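For orientation, a common textbook form of L2-regularized (ridge) Cox regression is sketched below. The notation (event times t_i, event indicators \Delta_i, covariate vectors x_i, regularization strength \eta) is assumed here and is not taken from the paper, which may use a different parametrization, for example one based on the full likelihood including the base hazard rate.

```latex
% Ridge-penalized log partial likelihood (textbook form; notation assumed)
\hat{\beta} = \operatorname*{arg\,max}_{\beta \in \mathbb{R}^p}
  \left\{ \sum_{i=1}^{N} \Delta_i \left[ \beta \cdot x_i
    - \log \sum_{j \,:\, t_j \ge t_i} e^{\beta \cdot x_j} \right]
  - \frac{\eta}{2} \, \lVert \beta \rVert^2 \right\}
```

Setting \eta = 0 recovers maximum (partial) likelihood estimation; with \eta > 0 the estimator can be read as maximum a posteriori inference under a zero-mean Gaussian prior on \beta.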
Dynamical Probability Distribution Function of the SK Model at High Temperatures
The microscopic probability distribution function of the
Sherrington-Kirkpatrick (SK) model of spin glasses is calculated explicitly as
a function of time by a high-temperature expansion. The resulting formula to
the third order of the inverse temperature shows that an assumption made by
Coolen, Laughton and Sherrington in their recent theory of dynamics is
violated. Deviations of their theory from exact results are estimated
quantitatively. Our formula also yields explicit expressions of the time
dependence of various macroscopic physical quantities when the temperature is
suddenly changed within the high-temperature region.
Comment: LaTeX, 6 pages, Figures upon request (here revised), To be published
in J. Phys. Soc. Jpn. 65 (1996) No.
Nonparametric predictive inference for diagnostic test thresholds
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve and surface are useful tools to assess the ability of diagnostic tests to discriminate between ordered classes or groups. To define these diagnostic tests, one must select the optimal thresholds that maximize their accuracy. One commonly used procedure for finding the optimal thresholds is to maximize Youden’s index. This article presents nonparametric predictive inference (NPI) for selecting the optimal thresholds of a diagnostic test. NPI is a frequentist statistical method that is explicitly aimed at using few modeling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. Based on multiple future observations, the NPI approach is presented for selecting the optimal thresholds for two-group and three-group scenarios. In addition, a pairwise approach is presented for the three-group scenario. The article ends with an example to illustrate the proposed methods and a simulation study of the predictive performance of the proposed methods along with some classical methods such as Youden’s index. The NPI-based methods show some interesting results that overcome some of the issues concerning the predictive performance of Youden’s index.
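The classical Youden's index mentioned above can be illustrated for the two-group case with a minimal sketch. This is the empirical (non-NPI) version only, not the method proposed in the article; the function name `youden_threshold`, the rule "classify as diseased if the test result exceeds the threshold", and the toy data are assumptions made for illustration.

```python
def youden_threshold(healthy, diseased):
    """Return (threshold, J) maximizing the empirical Youden's index
    J = sensitivity + specificity - 1.

    Assumption: larger test results indicate disease, and an individual
    is classified as diseased when the result is strictly above the
    threshold. Candidate thresholds are the observed values.
    """
    candidates = sorted(set(healthy) | set(diseased))
    best_c, best_j = None, -1.0
    for c in candidates:
        # Sensitivity: fraction of diseased individuals correctly flagged.
        sens = sum(x > c for x in diseased) / len(diseased)
        # Specificity: fraction of healthy individuals correctly cleared.
        spec = sum(x <= c for x in healthy) / len(healthy)
        j = sens + spec - 1.0
        if j > best_j:  # keep the first threshold reaching the maximum
            best_c, best_j = c, j
    return best_c, best_j


# Toy data (hypothetical): test results for each group.
threshold, j = youden_threshold([1, 2, 3, 4], [3, 5, 6, 7])
print(threshold, j)
```

Because the search is restricted to observed values, ties are resolved in favour of the smallest maximizing threshold; other conventions (midpoints between observations, for instance) are equally defensible.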
Dynamics of on-line Hebbian learning with structurally unrealizable restricted training sets
We present an exact solution for the dynamics of on-line Hebbian learning in
neural networks, with restricted and unrealizable training sets. In contrast to
other studies on learning with restricted training sets, unrealizability is
here caused by structural mismatch, rather than data noise: the teacher machine
is a perceptron with a reversed wedge-type transfer function, while the student
machine is a perceptron with a sigmoidal transfer function. We calculate the
glassy dynamics of the macroscopic performance measures, training error and
generalization error, and the (non-Gaussian) student field distribution. Our
results, which find excellent confirmation in numerical simulations, provide a
new benchmark test for general formalisms with which to study unrealizable
learning processes with restricted training sets.
Comment: 7 pages including 3 figures, using IOP latex2e preprint class fil
Theory of agent-based market models with controlled levels of greed and anxiety
We use generating functional analysis to study minority-game type market
models with generalized strategy valuation updates that control the psychology
of agents' actions. The agents' choice between trend following and contrarian
trading, and their vigor in each, depends on the overall state of the market.
Even in `fake history' models, the theory now involves an effective overall bid
process (coupled to the effective agent process) which can exhibit profound
remanence effects and new phase transitions. For some models the bid process
can be solved directly, others require Maxwell-construction type
approximations.
Comment: 30 pages, 10 figure
Nonparametric predictive inference for comparison of two diagnostic tests
An important aim in diagnostic medical research is comparison of the accuracy of two diagnostic tests. In this paper, comparison of two diagnostic tests is presented using nonparametric predictive inference (NPI) for future order statistics. The tests are assumed to be applied to the same individuals from two groups, e.g., healthy and diseased individuals, or from three groups with a known ordering, e.g., adding a group of severely diseased individuals to the two-group scenario. Our comparison is explicitly in terms of lower and upper probabilities for proportions of correctly diagnosed future individuals from each group, for a given total number of such individuals. We include in our comparison the possibility that it is more important to get a correct diagnosis for individuals from one group than from another group.
Generating functional analysis of Minority Games with real market histories
It is shown how the generating functional method of De Dominicis can be used
to solve the dynamics of the original version of the minority game (MG), in
which agents observe real as opposed to fake market histories. Here one again
finds exact closed equations for correlation and response functions, but now
these are defined in terms of two connected effective non-Markovian stochastic
processes: a single effective agent equation similar to that of the `fake'
history models, and a second effective equation for the overall market bid
itself (the latter is absent in `fake' history models). The result is an exact
theory, from which one can calculate from first principles both the persistent
observables in the MG and the distribution of history frequencies.
Comment: 39 pages, 5 postscript figures, iop styl