2,517 research outputs found

    Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

    We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically characterize the generalization error of the target task in terms of the salient factors in the transfer learning architecture, i.e., the number of examples available, the number of (free) parameters in each of the tasks, the number of parameters transferred from the source to target task, and the correlation between the two tasks. Our non-asymptotic analysis shows that the generalization error of the target task follows a two-dimensional double descent trend (with respect to the number of free parameters in each of the tasks) that is controlled by the transfer learning factors. Our analysis points to specific cases where the transfer of parameters is beneficial. Specifically, we show that transferring a specific set of parameters that generalizes well on the respective part of the source task can soften the demand on the task correlation level that is required for successful transfer learning. Moreover, we show that the usefulness of a transfer learning setting is fragile and depends on a delicate interplay among the set of transferred parameters, the relation between the tasks, and the true solution.

    Field theoretic calculation of scalar turbulence

    The cascade rate of a passive scalar and Batchelor's constant in scalar turbulence are calculated using the flux formula. The calculation is carried out to first order in the perturbation series. Batchelor's constant in three dimensions is found to be approximately 1.25. In higher dimensions, the constant increases as d^{1/3}. Comment: RevTex4, publ. in Int. J. Mod. Phy. B, v.15, p.3419, 200

    Overfreezing Meets Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks

    We study the generalization behavior of transfer learning of deep neural networks (DNNs). We adopt the overparameterization perspective -- featuring interpolation of the training data (i.e., approximately zero train error) and the double descent phenomenon -- to explain the delicate effect of the transfer learning setting on generalization performance. We study how the generalization behavior of transfer learning is affected by the dataset size in the source and target tasks, the number of transferred layers that are kept frozen in the target DNN training, and the similarity between the source and target tasks. We show that the test error evolution during the target DNN training has a more significant double descent effect when the target training dataset is sufficiently large with some label noise. In addition, a larger source training dataset can delay the arrival at interpolation and the double descent peak in the target DNN training. Moreover, we demonstrate that the number of frozen layers can determine whether the transfer learning is effectively underparameterized or overparameterized and, in turn, this may affect the relative success or failure of learning. Specifically, we show that too many frozen layers may make a transfer from a less related source task better or on par with a transfer from a more related source task; we call this case overfreezing. We establish our results using image classification experiments with the residual network (ResNet) and vision transformer (ViT) architectures.

    TNM cancer staging: can it help develop a novel staging system for type 2 diabetes?

    Abstract: Type 2 diabetes (DM2) constitutes 90%–95% of the diabetes cases and is increasing at an alarming rate in the world. The Centers for Disease Control and Prevention (CDC) estimates that more than 29 million people in the United States have diabetes, which often causes mortality from macrovascular complications and morbidity from microvascular complications. Despite these troubling facts, there is currently no widely accepted staging system for DM2 like there is for cancer. TNM oncologic staging has taken a complex condition like cancer and conveyed likelihood of survival in simple alpha-numeric terms that both patients and providers can understand. Oncology is now entering the era of precision medicine where cancer treatment is increasingly being tailored to each patient’s cancer. In contrast, DM2 lacks a staging system and remains a largely invisible disease even though it kills more Americans and costs more to treat than cancer. Is a comparable staging system for DM2 possible? We propose the Diabetes Staging System for DM2 that utilizes macrovascular events, microvascular complications, estimated glomerular filtration rate (GFR), and hemoglobin A1C to stage DM2.

    Cyclic Testing of Aggregates for Pavement Design

    The two aggregates most commonly used as subbases/bases of roadways in Oklahoma were selected and tested under cyclic loading to evaluate their Resilient Modulus (RM). Following the repeated triaxial RM testing, the specimens were subjected to triaxial compression tests, from which the cohesion (C) and friction angle (Φ) were obtained. A good statistical correlation was established between RM and C and Φ. The repeated triaxial RM testing procedure serves as a “conditioning” prior to the static triaxial compression, and it simulates the loads imposed by a moving vehicle. The effects of conditioning on C and Φ were investigated. The strength increase through conditioning was found to vary from 18 to 85 percent, depending on confining pressure and aggregate type. Also, it was found that C increased and Φ decreased because of conditioning.

    Kerr-Schild type initial data for black holes with angular momenta

    Generalizing previous work, we propose how to superpose spinning black holes in a Kerr-Schild initial slice. This superposition satisfies several physically meaningful limits, including the close and the far ones. Further, we consider the close limit of two black holes with opposite angular momenta and explicitly solve the constraint equations in this case. Evolving the resulting initial data with a linear code, we compute the radiated energy as a function of the masses and the angular momenta of the black holes. Comment: 13 pages, 3 figures. Revised version. To appear in Classical and Quantum Gravit