55 research outputs found

    Nonparametric Estimation via Mixed Gradients

    Traditional nonparametric estimation methods often lead to a slow convergence rate in large dimensions and require unrealistically enormous sizes of datasets for reliable conclusions. We develop an approach based on mixed gradients, either observed or estimated, to effectively estimate the function at near-parametric convergence rates. The novel approach and computational algorithm could lead to methods useful to practitioners in many areas of science and engineering. Our theoretical results reveal a behavior universal to this class of nonparametric estimation problems. We explore a general setting involving tensor product spaces and build upon the smoothing spline analysis of variance (SS-ANOVA) framework. For d-dimensional models under full interaction, the optimal rates with gradient information on p covariates are identical to those for the (d−p)-interaction models without gradients and, therefore, the models are immune to the "curse of interaction". For additive models, the optimal rates using gradient information are root-n, thus achieving the "parametric rate". We demonstrate aspects of the theoretical results through synthetic and real data applications.

    Alzheimer's Disease Prediction Using Longitudinal and Heterogeneous Magnetic Resonance Imaging

    Recent evidence has shown that structural magnetic resonance imaging (MRI) is an effective tool for Alzheimer's disease (AD) prediction and diagnosis. While traditional MRI-based diagnosis uses images acquired at a single time point, a longitudinal study is more sensitive and accurate in detecting early pathological changes of AD. Two main difficulties arise in longitudinal MRI-based diagnosis: (1) the inconsistent longitudinal scans among subjects (i.e., different scanning times and different total numbers of scans); (2) the heterogeneous progressions of high-dimensional regions of interest (ROIs) in MRI. In this work, we propose a novel feature selection and estimation method which can be applied to extract features from the heterogeneous longitudinal MRI. A key ingredient of our method is the combination of smoothing splines and the l1-penalty. We perform experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results corroborate the advantages of the proposed method for AD prediction in longitudinal studies.

    Learning Strategies in Decentralized Matching Markets under Uncertain Preferences

    We study the problem of decision-making when shared resources are scarce and the preferences of agents are unknown a priori and must be learned from data. Taking the two-sided matching market as a running example, we focus on the decentralized setting, where agents do not share their learned preferences with a central authority. Our approach is based on the representation of preferences in a reproducing kernel Hilbert space, and a learning algorithm for preferences that accounts for uncertainty due to the competition among the agents in the market. Under regularity conditions, we show that our estimator of preferences converges at a minimax optimal rate. Given this result, we derive optimal strategies that maximize agents' expected payoffs and we calibrate the uncertain state by taking opportunity costs into account. We also derive an incentive-compatibility property and show that the outcome from the learned strategies has a stability property. Finally, we prove a fairness property that asserts that there exists no justified envy according to the learned strategies.

    Li, Li, and Dai's Contribution to the Discussion of "Estimating Means of Bounded Random Variables by Betting" by Waudby-Smith and Aaditya Ramdas

    We congratulate Waudby-Smith and Ramdas for their interesting paper on generating confidence intervals and time-uniform confidence sequences for mean estimation with bounded observations. Their methodology utilizes composite nonnegative martingales and establishes a connection to game-theoretic probability. Our comments will focus on numerical comparisons with alternative methods. Comment: 3 pages; 2 figures

    Incentive-Aware Recommender Systems in Two-Sided Markets

    Online platforms in the Internet Economy commonly incorporate recommender systems that recommend arms (e.g., products) to agents (e.g., users). In such platforms, a myopic agent has a natural incentive to exploit, by choosing the best product given the current information, rather than to explore various alternatives to collect information that will be used for other agents. We propose a novel recommender system that respects agents' incentives and enjoys asymptotically optimal performance, expressed by the regret in repeated games. We model such an incentive-aware recommender system as a multi-agent bandit problem in a two-sided market which is equipped with an incentive constraint induced by agents' opportunity costs. If the opportunity costs are known to the principal, we show that there exists an incentive-compatible recommendation policy, which pools recommendations across a genuinely good arm and an unknown arm via a randomized and adaptive approach. On the other hand, if the opportunity costs are unknown to the principal, we propose a policy that randomly pools recommendations across all arms and uses each arm's cumulative loss as feedback for exploration. We show that both policies also satisfy an ex-post fairness criterion, which protects agents from over-exploitation.

    A Resampling Approach For Causal Inference On Novel Two-Point Time-Series With Application To Identify Risk Factors For Type-2 Diabetes And Cardiovascular Disease

    Two-point time-series data, characterized by baseline and follow-up observations, are frequently encountered in health research. We study a novel two-point time series structure without a control group, which is driven by an observational routine clinical dataset collected to monitor key risk markers of type-2 diabetes (T2D) and cardiovascular disease (CVD). We propose a resampling approach called 'I-Rand' for independently sampling one of the two time points for each individual and making inference on the estimated causal effects based on matching methods. The proposed method is illustrated with data from a service-based dietary intervention to promote a low-carbohydrate diet (LCD), designed to impact risk of T2D and CVD. Baseline data contain a pre-intervention health record of study participants, and health data after LCD intervention are recorded at the follow-up visit, providing a two-point time-series pattern without a parallel control group. Using this approach we find that obesity is a significant risk factor of T2D and CVD, and an LCD approach can significantly mitigate the risks of T2D and CVD. We provide code that implements our method.