Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
We study the transfer learning process between two linear regression
problems. An important and timely special case is when the regressors are
overparameterized and perfectly interpolate their training data. We examine a
parameter transfer mechanism whereby a subset of the parameters of the target
task solution are constrained to the values learned for a related source task.
We analytically characterize the generalization error of the target task in
terms of the salient factors in the transfer learning architecture, i.e., the
number of examples available, the number of (free) parameters in each of the
tasks, the number of parameters transferred from the source to target task, and
the correlation between the two tasks. Our non-asymptotic analysis shows that
the generalization error of the target task follows a two-dimensional double
descent trend (with respect to the number of free parameters in each of the
tasks) that is controlled by the transfer learning factors. Our analysis points
to specific cases where the transfer of parameters is beneficial. Specifically,
we show that transferring a specific set of parameters that generalizes well on
the respective part of the source task can soften the demand on the task
correlation level that is required for successful transfer learning. Moreover,
we show that the usefulness of a transfer learning setting is fragile and
depends on a delicate interplay among the set of transferred parameters, the
relation between the tasks, and the true solution.
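The parameter transfer mechanism described in this abstract can be illustrated with a minimal sketch. It assumes NumPy, a hypothetical helper named transfer_min_norm, and synthetic data; the source solution is replaced by a noisy copy of the target parameters, which is an illustrative assumption and not the paper's task model. The transferred coordinates are fixed to the source values, and the remaining free coordinates receive the minimum-norm least-squares fit of the residual.

```python
import numpy as np

def transfer_min_norm(X, y, theta_source, transferred):
    """Fit a linear target task with some coordinates fixed to source values.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    d = X.shape[1]
    transferred = np.asarray(transferred, dtype=int)
    free = np.setdiff1d(np.arange(d), transferred)

    beta = np.zeros(d)
    # Transferred coordinates are constrained to the source-task values.
    beta[transferred] = theta_source[transferred]

    # Part of the target that the constrained coordinates do not explain.
    residual = y - X[:, transferred] @ beta[transferred]

    # pinv yields the least-squares fit, and the minimum-norm interpolator
    # when the number of free parameters exceeds the number of examples.
    beta[free] = np.linalg.pinv(X[:, free]) @ residual
    return beta

# Synthetic setup (all values are assumptions chosen for the sketch).
rng = np.random.default_rng(0)
n, d = 20, 60
beta_true = rng.normal(size=d) / np.sqrt(d)
theta_source = beta_true + 0.1 * rng.normal(size=d)  # stand-in for a related source solution
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.05 * rng.normal(size=n)

beta_hat = transfer_min_norm(X, y, theta_source, transferred=np.arange(10))
train_mse = float(np.mean((X @ beta_hat - y) ** 2))
print("train MSE:", train_mse)
```

With 50 free parameters and 20 examples, the free part is overparameterized, so the fit interpolates the training data. Varying the number of transferred coordinates and the number of free parameters is one way to explore the error trends the abstract describes.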
Field theoretic calculation of scalar turbulence
The cascade rate of a passive scalar and Batchelor's constant in scalar
turbulence are calculated using the flux formula. The calculation is carried
out to first order in the perturbation series. Batchelor's constant in three
dimensions is found to be approximately 1.25. In higher dimensions, the constant increases as
.Comment: RevTex4, publ. in Int. J. Mod. Phy. B, v.15, p.3419, 200
Overfreezing Meets Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks
We study the generalization behavior of transfer learning of deep neural
networks (DNNs). We adopt the overparameterization perspective -- featuring
interpolation of the training data (i.e., approximately zero train error) and
the double descent phenomenon -- to explain the delicate effect of the transfer
learning setting on generalization performance. We study how the generalization
behavior of transfer learning is affected by the dataset size in the source and
target tasks, the number of transferred layers that are kept frozen in the
target DNN training, and the similarity between the source and target tasks. We
show that the test error evolution during the target DNN training has a more
significant double descent effect when the target training dataset is
sufficiently large with some label noise. In addition, a larger source training
dataset can delay the arrival to interpolation and double descent peak in the
target DNN training. Moreover, we demonstrate that the number of frozen layers
can determine whether the transfer learning is effectively underparameterized
or overparameterized and, in turn, this may affect the relative success or
failure of learning. Specifically, we show that too many frozen layers may make
a transfer from a less related source task better or on par with a transfer
from a more related source task; we call this case overfreezing. We establish
our results using image classification experiments with the residual network
(ResNet) and vision transformer (ViT) architectures.
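The link between frozen layers and the effective parameterization regime can be sketched with a toy count of trainable parameters. Everything below is an assumption made for illustration: a small fully connected network instead of ResNet or ViT, hypothetical function names, and a crude rule that compares the trainable parameter count with the number of target training examples. The paper's experiments are not reproduced here.

```python
def trainable_parameters(layer_sizes, n_frozen):
    """Count weights and biases in the dense layers left trainable.

    layer_sizes lists the widths of a toy fully connected network;
    the first n_frozen layers are treated as frozen (transferred).
    """
    layers = list(zip(layer_sizes[:-1], layer_sizes[1:]))
    return sum(fan_in * fan_out + fan_out for fan_in, fan_out in layers[n_frozen:])

def regime(layer_sizes, n_frozen, n_train):
    """Crude heuristic: more trainable parameters than examples counts as overparameterized."""
    p = trainable_parameters(layer_sizes, n_frozen)
    return "overparameterized" if p > n_train else "underparameterized"

# Hypothetical toy network and target dataset size.
sizes = [784, 256, 64, 10]
n_train = 5000
for k in range(3):
    print(k, trainable_parameters(sizes, k), regime(sizes, k, n_train))
```

In this toy setting, freezing two of the three layers leaves 650 trainable parameters for 5000 examples, so the remaining training problem moves to the underparameterized side of the heuristic threshold.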
TNM cancer staging: can it help develop a novel staging system for type 2 diabetes?
Abstract: Type 2 diabetes (DM2) constitutes 90%–95% of diabetes cases and is increasing at an alarming rate in the world. The Centers for Disease Control and Prevention (CDC) estimates that more than 29 million people in the United States have diabetes, which often causes mortality from macrovascular complications and morbidity from microvascular complications. Despite these troubling facts, there is currently no widely accepted staging system for DM2 like there is for cancer. TNM oncologic staging has taken a complex condition like cancer and conveyed likelihood of survival in simple alphanumeric terms that both patients and providers can understand. Oncology is now entering the era of precision medicine, where cancer treatment is increasingly being tailored to each patient's cancer. In contrast, DM2 lacks a staging system and remains a largely invisible disease even though it kills more Americans and costs more to treat than cancer. Is a comparable staging system for DM2 possible? We propose the Diabetes Staging System for DM2 that utilizes macrovascular events, microvascular complications, estimated glomerular filtration rate (GFR), and hemoglobin A1C to stage DM2.
Cyclic Testing of Aggregates for Pavement Design
The two most commonly encountered aggregates used as subbases/bases of roadways in Oklahoma were selected and tested under cyclic loading to evaluate their Resilient Modulus (RM). Following the repeated triaxial RM testing, the specimens were subjected to triaxial compression tests, from which the cohesion (C) and friction angle (Φ) were obtained. A good statistical correlation was established between RM and C and Φ. The repeated triaxial RM testing procedure serves as a “conditioning” prior to the static triaxial compression, and it simulates the loads imposed by moving vehicles. The effects of conditioning on C and Φ were investigated. The strength increase through conditioning was found to vary from 18 to 85 percent, depending on confining pressure and aggregate type. Also, it was found that C increased and Φ decreased because of conditioning.
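One common way to obtain C and Φ from triaxial compression results is a straight-line fit of the Mohr-Coulomb criterion, sigma1 = sigma3 * N + 2 * C * sqrt(N) with N = tan^2(45° + Φ/2). The sketch below assumes that approach; the abstract does not say which procedure the authors used. The function name, the units, and the data are hypothetical, and the data are generated synthetically, not taken from the study.

```python
import math

def fit_mohr_coulomb(sigma3, sigma1):
    """Estimate cohesion and friction angle from triaxial failure stresses.

    Fits sigma1 = slope * sigma3 + intercept by ordinary least squares, then
    inverts slope = tan^2(45 deg + phi/2) and intercept = 2 * c * sqrt(slope).
    Returns (cohesion, friction angle in degrees).
    """
    n = len(sigma3)
    mean3 = sum(sigma3) / n
    mean1 = sum(sigma1) / n
    sxx = sum((s3 - mean3) ** 2 for s3 in sigma3)
    sxy = sum((s3 - mean3) * (s1 - mean1) for s3, s1 in zip(sigma3, sigma1))
    slope = sxy / sxx
    intercept = mean1 - slope * mean3

    phi = math.asin((slope - 1.0) / (slope + 1.0))
    cohesion = intercept / (2.0 * math.sqrt(slope))
    return cohesion, math.degrees(phi)

# Synthetic failure stresses generated from assumed values (c = 50, phi = 40 deg).
true_c, true_phi = 50.0, math.radians(40.0)
N = math.tan(math.pi / 4 + true_phi / 2) ** 2
sigma3 = [35.0, 70.0, 105.0, 140.0]  # hypothetical confining pressures
sigma1 = [s * N + 2.0 * true_c * math.sqrt(N) for s in sigma3]

cohesion, phi_deg = fit_mohr_coulomb(sigma3, sigma1)
print(round(cohesion, 3), round(phi_deg, 3))
```

Running the same fit on specimens before and after conditioning would give the kind of comparison of C and Φ that the abstract reports.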
Kerr-Schild type initial data for black holes with angular momenta
Generalizing previous work, we propose a way to superpose spinning black holes
in a Kerr-Schild initial slice. This superposition satisfies several physically
meaningful limits, including the close and the far ones. Further, we consider
the close limit of two black holes with opposite angular momenta and explicitly
solve the constraint equations in this case. Evolving the resulting initial
data with a linear code, we compute the radiated energy as a function of the
masses and the angular momenta of the black holes. Comment: 13 pages, 3 figures. Revised version. To appear in Classical and
Quantum Gravity