
    Extreme(ly) mean(ingful): Sequential formation of a quality group

    The present paper studies the limiting behavior of the average score of a sequentially selected group of items or individuals, the underlying distribution of which, F, belongs to the Gumbel domain of attraction of extreme value distributions. This class contains the Normal, Lognormal, Gamma, Weibull and many other distributions. The selection rules are the "better than average" (β = 1) rule and the "β-better than average" rule, defined as follows. After the first item is selected, another item is admitted into the group if and only if its score is greater than β times the average score of those already selected. Denote by Ȳ_k the average of the first k selected items, and by T_k the time it takes to amass them. Some of the key results obtained are: under mild conditions, for the better than average rule, Ȳ_k less a suitably chosen function of log k converges almost surely to a finite random variable. When 1 − F(x) = e^{−[x^α + h(x)]}, with α > 0 and h(x)/x^α → 0 as x → ∞, T_k is of approximate order k². When β > 1, the asymptotic results for Ȳ_k are of a completely different order of magnitude. Interestingly, for a class of distributions, T_k, suitably normalized, asymptotically approaches 1: almost surely for relatively small β ≥ 1, in probability for moderate-sized β, and in distribution when β is large.
    Comment: Published in the Annals of Applied Probability (http://dx.doi.org/10.1214/10-AAP684) by the Institute of Mathematical Statistics (http://www.imstat.org/aap/).
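    The selection rule lends itself to direct simulation. The sketch below (an illustration, not taken from the paper; all names are invented) applies the β-better-than-average rule to i.i.d. Normal scores, a member of the Gumbel domain of attraction:

    ```python
    import random

    def beta_better_than_average(scores, beta=1.0):
        """Sequentially form a quality group: after the first item,
        admit an item iff its score exceeds beta times the average
        score of the items already selected. Returns the selected
        scores and the selection times T_k."""
        selected = []
        times = []
        for t, x in enumerate(scores, start=1):
            if not selected or x > beta * (sum(selected) / len(selected)):
                selected.append(x)
                times.append(t)  # T_k: time at which the k-th item was amassed
        return selected, times

    # Standard normal scores as an example distribution in the Gumbel domain.
    random.seed(0)
    scores = [random.gauss(0, 1) for _ in range(10_000)]
    group, times = beta_better_than_average(scores, beta=1.0)
    avg = sum(group) / len(group)  # the running average \bar{Y}_k grows slowly with k
    ```

    By construction every admitted score exceeds the current group average, so the running average Ȳ_k is non-decreasing, and the gaps between admissions widen, consistent with T_k growing roughly like k².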

    The Noisy Secretary Problem and Some Results on Extreme Concomitant Variables

    The classical secretary problem for selecting the best item is studied when the actual values of the items are observed with noise. One of the main appeals of the secretary problem is that the optimal strategy is able to find the best observation with the nontrivial probability of about 0.37, even when the number of observations is arbitrarily large. The results are strikingly different when the quality of the secretaries is observed with noise. If there is no noise, then the only information that is needed is whether an observation is the best among those already observed. Since observations are assumed to be i.i.d., this is distribution free. In the case of noisy data, the results are no longer distribution free. Furthermore, one needs to know the rank of the noisy observation among those already seen. Finally, the probability of finding the best secretary often goes to 0 as the number of observations, n, goes to infinity. The results depend heavily on the behavior of p_n, the probability that the observation that is best among the noisy observations is also best among the noiseless observations. Results involving optimal strategies if all that is available is noisy data are described and examples are given to elucidate the results.
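    The degradation under noise can be seen in a small Monte Carlo sketch. The code below is an illustration of the setting, not the paper's analysis: it applies the classical 1/e stopping rule to scores corrupted by additive Gaussian noise (an assumed noise model) and estimates how often the truly best item is found.

    ```python
    import math
    import random

    def noisy_secretary_trial(n, sigma, rng):
        """One run of the classical 1/e rule applied to noisy scores.
        True values are i.i.d. uniform; we observe them plus Gaussian
        noise. Returns True iff the chosen item is the truly best one."""
        true_vals = [rng.random() for _ in range(n)]
        observed = [x + rng.gauss(0, sigma) for x in true_vals]
        skip = int(n / math.e)                      # observe-only phase
        threshold = max(observed[:skip]) if skip else float("-inf")
        choice = n - 1                              # forced to take the last item
        for i in range(skip, n):
            if observed[i] > threshold:             # first noisy record after the phase
                choice = i
                break
        return choice == max(range(n), key=lambda i: true_vals[i])

    rng = random.Random(1)
    n, trials = 50, 2000
    hits = sum(noisy_secretary_trial(n, sigma=0.0, rng=rng) for _ in range(trials))
    p_noiseless = hits / trials   # close to 1/e ≈ 0.37 without noise
    hits_noisy = sum(noisy_secretary_trial(n, sigma=0.5, rng=rng) for _ in range(trials))
    p_noisy = hits_noisy / trials  # markedly lower once noise enters
    ```

    The noiseless run recovers the familiar ≈ 0.37 success rate, while the noisy run succeeds far less often, mirroring the abstract's point that p_n governs how much the noise hurts.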

    The Optimality of Blocking Designs in Equally and Unequally Allocated Randomized Experiments with General Response

    We consider the performance of the difference-in-means estimator in a two-arm randomized experiment under common experimental endpoints such as continuous (regression), incidence, proportion and survival. We examine performance under both equal and unequal allocation to treatment groups, and we consider both the Neyman randomization model and the population model. We show that in the Neyman model, where the only source of randomness is the treatment manipulation, there is no free lunch: complete randomization is minimax for the estimator's mean squared error. In the population model, where each subject experiences response noise with zero mean, the optimal design is the deterministic perfect-balance allocation. However, this allocation is generally NP-hard to compute and, moreover, depends on unknown response parameters. When considering the tail criterion of Kapelner et al. (2021), we show the optimal design is less random than complete randomization and more random than the deterministic perfect-balance allocation. We prove that Fisher's blocking design provides the asymptotically optimal degree of experimental randomness. Theoretical results are supported by simulations in all considered experimental settings.
    Comment: 33 pages, 1 figure, 2 tables.
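    A small simulation can illustrate the contrast between complete randomization and blocking. The sketch below is an illustration under assumed parameter values, not the paper's construction: it compares the variance of the difference-in-means estimator under complete randomization and under a simple paired blocking design when the response depends on an observed covariate.

    ```python
    import random
    import statistics

    def diff_in_means(y, w):
        """Difference-in-means estimator for a two-arm experiment:
        mean treated response minus mean control response."""
        treated = [yi for yi, wi in zip(y, w) if wi == 1]
        control = [yi for yi, wi in zip(y, w) if wi == 0]
        return statistics.mean(treated) - statistics.mean(control)

    def complete_randomization(n, rng):
        """Assign exactly n/2 subjects to treatment, uniformly at random."""
        w = [1] * (n // 2) + [0] * (n // 2)
        rng.shuffle(w)
        return w

    def paired_blocking(x, rng):
        """Fisher-style blocking: sort subjects by covariate, pair
        adjacent subjects, randomize one of each pair to treatment."""
        order = sorted(range(len(x)), key=lambda i: x[i])
        w = [0] * len(x)
        for j in range(0, len(x), 2):
            w[rng.choice(order[j:j + 2])] = 1
        return w

    rng = random.Random(42)
    n, tau = 100, 1.0                                # tau: assumed true treatment effect
    x = [rng.gauss(0, 1) for _ in range(n)]          # covariate driving the response

    def estimate(design):
        w = design()
        y = [2 * xi + tau * wi + rng.gauss(0, 0.1) for xi, wi in zip(x, w)]
        return diff_in_means(y, w)

    cr = [estimate(lambda: complete_randomization(n, rng)) for _ in range(500)]
    bl = [estimate(lambda: paired_blocking(x, rng)) for _ in range(500)]
    # Both designs are unbiased for tau; blocking shrinks the variance
    # because it nearly balances the covariate within every assignment.
    ```

    Because the blocked design balances the covariate within pairs, the covariate-imbalance term of the estimator's error nearly vanishes, which is the mechanism behind the variance reduction the abstract discusses.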

    R-Estimates vs. GMM: A Theoretical Case Study of Validity and Efficiency

    What role should assumptions play in inference? We present a small theoretical case study of a simple, clean case, namely the nonparametric comparison of two continuous distributions using (essentially) information about quartiles, that is, the central information displayed in a pair of boxplots. In particular, we contrast a suggestion of John Tukey—that the validity of inferences should not depend on assumptions, but assumptions have a role in efficiency—with a competing suggestion that is an aspect of Hansen's generalized method of moments—that methods should achieve maximum asymptotic efficiency with fewer assumptions. In our case study, the practical performance of these two suggestions is strikingly different. An aspect of this comparison concerns the unification or separation of the tasks of estimation assuming a model and testing the fit of that model. We also look at a method (MERT) that aims not at best performance, but rather at achieving reasonable performance across a set of plausible models.

    Early Biometric Lag in the Prediction of Small for Gestational Age Neonates and Preeclampsia

    OBJECTIVE: An early fetal growth lag may be a marker of future complications. We sought to determine the utility of early biometric variables in predicting adverse pregnancy outcomes. METHODS: In this retrospective cohort study, the crown-rump length at 11 to 14 weeks and the head circumference, biparietal diameter, abdominal circumference, femur length, humerus length, transverse cerebellar diameter, and estimated fetal weight at 18 to 24 weeks were converted to an estimated gestational age using published regression formulas. Sonographic fetal growth (difference between each biometric gestational age and the crown-rump length gestational age) minus expected fetal growth (number of days elapsed between the two scans) yielded the biometric growth lag. These lags were tested as predictors of small for gestational age (SGA) neonates (≤10th percentile) and preeclampsia. RESULTS: A total of 245 patients were included. Thirty-two (13.1%) delivered an SGA neonate, and 43 (17.6%) had the composite outcome. The head circumference, biparietal diameter, abdominal circumference, and estimated fetal weight lags were identified as significant predictors of SGA neonates after adjusted analyses (P < .05). The addition of either the estimated fetal weight or abdominal circumference lag to maternal characteristics alone significantly improved the performance of the predictive model, achieving areas under the curve of 0.72 and 0.74, respectively. No significant association was found between the biometric lag variables and the development of preeclampsia. CONCLUSIONS: Routinely available biometric data can be used to improve the prediction of adverse outcomes such as SGA. These biometric lags should be considered in efforts to develop screening algorithms for adverse outcomes.
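    The lag construction described in the methods reduces to simple date arithmetic once each measurement has been converted to a gestational age. A minimal sketch with hypothetical numbers (the published regression formulas themselves are not reproduced here):

    ```python
    def biometric_lag(crl_ga_days, biometric_ga_days, days_elapsed):
        """Biometric growth lag as defined in the abstract:
        sonographic growth (biometric GA at the 18-24 week scan minus
        CRL-based GA at the 11-14 week scan) minus expected growth
        (calendar days elapsed between the two scans).
        Negative values mean the fetus grew less than expected."""
        sonographic_growth = biometric_ga_days - crl_ga_days
        return sonographic_growth - days_elapsed

    # Hypothetical example: CRL dated 12w0d (84 days), estimated fetal
    # weight dated 19w2d (135 days), scans 53 calendar days apart:
    lag = biometric_lag(84, 135, 53)   # 51 days of growth vs 53 expected -> -2
    ```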

    Temporal Changes of Neocortical High-Frequency Oscillations in Epilepsy

    High-frequency (100–500 Hz) oscillations (HFOs) recorded from intracranial electrodes are a potential biomarker for epileptogenic brain. HFOs are commonly categorized as ripples (100–250 Hz) or fast ripples (250–500 Hz), and a third class of mixed frequency events has also been identified. We hypothesize that temporal changes in HFOs may identify periods of increased likelihood of seizure onset. HFOs (86,151 in total) from five patients with neocortical epilepsy implanted with hybrid (micro + macro) intracranial electrodes were detected using a previously validated automated algorithm run over all channels of each patient's entire recording. HFOs were characterized by extracting quantitative morphologic features and divided into four time epochs (interictal, preictal, ictal, and postictal) and three HFO clusters (ripples, fast ripples, and mixed events). We used supervised classification and nonparametric statistical tests to explore quantitative changes in HFO features before, during, and after seizures. We also analyzed temporal changes in the rates and proportions of events from each HFO cluster during these periods. We observed patient-specific changes in HFO morphology linked to fluctuation in the relative rates of ripples, fast ripples, and mixed frequency events. These changes in relative rate occurred in pre- and postictal periods up to 30 minutes before and after seizures. We also found evidence that the distribution of HFOs during these different time periods varied greatly between individual patients. These results suggest that temporal analysis of HFO features has potential for designing custom seizure prediction algorithms and for exploring the relationship between HFOs and seizure generation.
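    The bookkeeping behind the analysis, binning events by frequency band and by epoch relative to a seizure, can be sketched as follows. This is an illustrative reconstruction, not the study's validated detection algorithm; the band boundaries come from the abstract, while the event tuples and 30-minute window handling are assumptions of the example.

    ```python
    def classify_hfo(peak_freq_hz):
        """Categorize a detected HFO by its dominant frequency, using
        the bands from the abstract: ripples 100-250 Hz, fast ripples
        250-500 Hz (250 Hz assigned to fast ripples by convention)."""
        if 100 <= peak_freq_hz < 250:
            return "ripple"
        if 250 <= peak_freq_hz <= 500:
            return "fast ripple"
        return "outside HFO band"

    def epoch_counts(events, seizure_onset_s, seizure_offset_s, window_s=1800):
        """Bin (time_s, peak_freq_hz) events into the four epochs used
        in the study: interictal, preictal (up to 30 min before onset),
        ictal, and postictal (up to 30 min after offset)."""
        counts = {"interictal": 0, "preictal": 0, "ictal": 0, "postictal": 0}
        for t, _freq in events:
            if seizure_onset_s - window_s <= t < seizure_onset_s:
                counts["preictal"] += 1
            elif seizure_onset_s <= t <= seizure_offset_s:
                counts["ictal"] += 1
            elif seizure_offset_s < t <= seizure_offset_s + window_s:
                counts["postictal"] += 1
            else:
                counts["interictal"] += 1
        return counts

    # Hypothetical events around a seizure lasting from t=3600 s to t=3660 s:
    events = [(100.0, 140.0), (3500.0, 300.0), (3650.0, 180.0), (5000.0, 120.0)]
    counts = epoch_counts(events, seizure_onset_s=3600.0, seizure_offset_s=3660.0)
    ```

    Comparing such per-epoch counts across the ripple, fast ripple, and mixed clusters is the kind of rate analysis the abstract describes.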