Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence
Incremental learning (IL) has received a lot of attention recently; however,
the literature lacks a precise problem definition, proper evaluation settings,
and metrics tailored specifically for the IL problem. One of the main
objectives of this work is to fill these gaps so as to provide a common ground
for better understanding of IL. The main challenge for an IL algorithm is to
update the classifier whilst preserving existing knowledge. We observe that, in
addition to forgetting, a known issue while preserving knowledge, IL also
suffers from a problem we call intransigence, the inability of a model to update
its knowledge. We introduce two metrics to quantify forgetting and
intransigence that allow us to understand, analyse, and gain better insights
into the behaviour of IL algorithms. We present RWalk, a generalization of
EWC++ (our efficient version of EWC [Kirkpatrick2016EWC]) and Path Integral
[Zenke2017Continual] with a theoretically grounded KL-divergence based
perspective. We provide a thorough analysis of various IL algorithms on MNIST
and CIFAR-100 datasets. In these experiments, RWalk obtains superior results in
terms of accuracy, and also provides a better trade-off between forgetting and
intransigence.
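The abstract does not define its forgetting metric, so the following is only a minimal sketch under an assumed definition: the forgetting of a task is taken to be the gap between the best accuracy the model reached on it earlier and its accuracy after the latest task. The function name `average_forgetting` and the accuracy layout `acc[k][j]` (accuracy on task `j` after training on task `k`) are hypothetical and chosen for illustration.

```python
def average_forgetting(acc):
    """Average forgetting after the last task (assumed definition).

    acc[k][j] is the accuracy on task j measured after training on task k,
    so row k holds k + 1 entries (tasks 0..k).
    """
    k = len(acc) - 1  # index of the most recently learned task
    if k < 1:
        raise ValueError("need at least two tasks to measure forgetting")
    gaps = []
    for j in range(k):
        # Best accuracy on task j at any earlier stage (from step j to k - 1).
        best_past = max(acc[step][j] for step in range(j, k))
        gaps.append(best_past - acc[k][j])
    return sum(gaps) / len(gaps)


# Hypothetical accuracies for three tasks learned in sequence.
history = [
    [0.95],
    [0.90, 0.93],
    [0.80, 0.91, 0.94],
]
print(average_forgetting(history))
```

A larger value means more previously acquired knowledge was lost; the paper's own metric may differ in detail.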
Modeling drying kinetics of thyme (thymus vulgaris l.): theoretical and empirical models, and neural networks
The drying kinetics of thyme was analyzed by considering different conditions: air temperature of between
40 °C and 70 °C, and air velocity of 1 m/s. A theoretical diffusion model and eight different empirical models
were fitted to the experimental data. From the theoretical model application, the effective diffusivity per unit
area of the thyme was estimated (between 3.68 × 10⁻⁵ and 2.12 × 10⁻⁴ s⁻¹). The temperature dependence
of the effective diffusivity was described by the Arrhenius relationship with activation energy of 49.42 kJ/mol.
Eight different empirical models were fitted to the experimental data. Additionally, the dependence of the
parameters of each model on the drying temperature was determined, obtaining equations that allow estimating
the evolution of the moisture content at any temperature in the established range. Furthermore,
artificial neural networks were developed and compared with the theoretical and empirical models using
the percentage of the relative errors and the explained variance. The artificial neural networks were found
to be more accurate predictors of moisture evolution with VAR 99.3% and ER 8.7%.
The authors acknowledge the financial support from the 'Ministerio de Educacion y Ciencia' in Spain, CONSOLIDER INGENIO 2010 (CSD2007-00016).
Rodríguez Cortina, J.; Clemente Polo, G.; Sanjuán Pellicer, MN.; Bon Corbín, J. (2014). Modeling drying kinetics of thyme (thymus vulgaris l.): theoretical and empirical models, and neural networks. Food Science and Technology International. 20(1):13-22. https://doi.org/10.1177/1082013212469614
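As a rough illustration of the Arrhenius relationship mentioned in the abstract, the sketch below assumes the standard form D = D0 · exp(−Ea / (R·T)) and uses the reported activation energy of 49.42 kJ/mol to estimate how much the effective diffusivity would grow between the two ends of the studied temperature range. The function name `arrhenius_ratio` is hypothetical, and the pre-exponential factor D0 cancels out of the ratio, so it is not needed.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)
EA = 49.42e3     # activation energy reported in the abstract, J/mol


def arrhenius_ratio(t1_celsius, t2_celsius, ea=EA):
    """Return D(t2) / D(t1), assuming D = D0 * exp(-ea / (R * T))."""
    t1 = t1_celsius + 273.15  # convert to kelvin
    t2 = t2_celsius + 273.15
    return math.exp(-ea / R * (1.0 / t2 - 1.0 / t1))


ratio = arrhenius_ratio(40.0, 70.0)
print(f"D(70 C) / D(40 C) = {ratio:.2f}")
```

Under these assumptions the diffusivity increases roughly fivefold from 40 °C to 70 °C.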
Greater body mass index is a better predictor of subclinical cardiac damage at long-term follow-up in men than is insulin sensitivity: a prospective, population-based cohort study
To examine whether lower insulin sensitivity as determined by homeostatic model assessment (HOMA-%S) was associated with increased left ventricular mass (LVM) and presence of LV diastolic dysfunction at long-term follow-up, independently of body mass index (BMI), in middle-aged, otherwise healthy males.
Genuinely Distributed Byzantine Machine Learning
Machine Learning (ML) solutions are nowadays distributed, according to the
so-called server/worker architecture. One server holds the model parameters
while several workers train the model. Clearly, such architecture is prone to
various types of component failures, which can be all encompassed within the
spectrum of a Byzantine behavior. Several approaches have been proposed
recently to tolerate Byzantine workers. Yet all require trusting a central
parameter server. We initiate in this paper the study of the "general"
Byzantine-resilient distributed machine learning problem where no individual
component is trusted.
We show that this problem can be solved in an asynchronous system, despite
the presence of Byzantine parameter servers and
Byzantine workers (which is optimal). We present a new algorithm, ByzSGD, which
solves the general Byzantine-resilient distributed machine learning problem by
relying on three major schemes. The first, Scatter/Gather, is a communication
scheme whose goal is to bound the maximum drift among models on correct
servers. The second, Distributed Median Contraction (DMC), leverages the
geometric properties of the median in high dimensional spaces to bring
parameters within the correct servers back close to each other, ensuring
learning convergence. The third, Minimum-Diameter Averaging (MDA), is a
statistically-robust gradient aggregation rule whose goal is to tolerate
Byzantine workers. MDA requires only a loose bound on the variance of non-Byzantine
gradient estimates, compared to existing alternatives (e.g., Krum).
Interestingly, ByzSGD ensures Byzantine resilience without adding communication
rounds (on a normal path), compared to vanilla non-Byzantine alternatives.
ByzSGD requires, however, a larger number of messages which, we show, can be
reduced if we assume synchrony.
Comment: This is a merge of arXiv:1905.03853 and arXiv:1911.07537;
arXiv:1911.07537 will be retracted
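The abstract names Minimum-Diameter Averaging (MDA) without spelling out the rule. A minimal sketch, assuming the usual formulation: among the n received gradients, pick the subset of size n − f with the smallest diameter (largest pairwise Euclidean distance) and average it, where f is the assumed number of Byzantine workers. The function name and the brute-force subset search are illustrative choices, not the authors' implementation; the search is exponential in n and only suitable for small examples.

```python
from itertools import combinations
from math import dist


def minimum_diameter_average(gradients, f):
    """Average the (n - f)-subset of gradients with the smallest diameter.

    gradients: list of equal-length lists of floats, one per worker.
    f: assumed upper bound on the number of Byzantine workers.
    """
    n = len(gradients)
    if not 0 <= f < n:
        raise ValueError("need 0 <= f < number of gradients")

    def diameter(subset):
        # Largest pairwise distance inside the subset (0.0 for a singleton).
        return max(
            (dist(gradients[i], gradients[j]) for i, j in combinations(subset, 2)),
            default=0.0,
        )

    # Brute-force search over all subsets of size n - f.
    best = min(combinations(range(n), n - f), key=diameter)
    dim = len(gradients[0])
    return [sum(gradients[i][k] for i in best) / len(best) for k in range(dim)]


# Three honest gradients near (1, 1) and one outlier; f = 1 drops the outlier.
grads = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [100.0, -100.0]]
print(minimum_diameter_average(grads, f=1))
```

Because the outlier inflates the diameter of every subset that contains it, the selected subset holds only the three nearby gradients.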
Effect of Preinjury Oral Anticoagulants on Outcomes Following Traumatic Brain Injury from Falls in Older Adults
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/156128/2/phar2435_am.pdf
http://deepblue.lib.umich.edu/bitstream/2027.42/156128/1/phar2435.pd
Identification of Important Factors Affecting Use of Digital Individualised Coaching and Treatment of Type 2 Diabetes in General Practice: A Qualitative Feasibility Study
Most type 2 diabetes patients are treated in general practice, and there is a need to develop and implement efficient lifestyle interventions. eHealth interventions have been shown to be effective in promoting a healthy lifestyle. The purpose of this study was to test the feasibility, including the identification of factors of importance, of offering digital lifestyle coaching to type 2 diabetes patients in general practice. We conducted a qualitative feasibility study with focus group interviews in four general practices. We identified two overall themes and four subthemes: (1) the distribution of roles and lifestyle interventions in general practice (subthemes: external and internal distribution of roles) and (2) the pros and cons of digital lifestyle interventions in general practice (subthemes: access to real life data and change in daily routines). We conclude that for digital lifestyle coaching to be feasible in a general practice setting, it was of great importance that the general practitioners and practice nurses knew the role and content of the intervention. In general, there was a positive attitude in the general practice setting towards referring type 2 diabetes patients to digital lifestyle intervention if it was easy to refer the patients and if easily understandable and accessible feedback was implemented into the electronic health record. It was important that the digital lifestyle intervention was flexible and offered healthcare providers in general practice an opportunity to follow the type 2 diabetes patient closely.
Validation of nonlinear PCA
Linear principal component analysis (PCA) can be extended to a nonlinear PCA
by using artificial neural networks. But the benefit of curved components
requires a careful control of the model complexity. Moreover, standard
techniques for model selection, including cross-validation and more generally
the use of an independent test set, fail when applied to nonlinear PCA because
of its inherent unsupervised characteristics. This paper presents a new
approach for validating the complexity of nonlinear PCA models by using the
error in missing data estimation as a criterion for model selection. It is
motivated by the idea that only the model of optimal complexity is able to
predict missing values with the highest accuracy. While standard test set
validation usually favours over-fitted nonlinear PCA models, the proposed model
validation approach correctly selects the optimal model complexity.
Comment: 12 pages, 5 figures
The Relativistic Bound State Problem in QCD: Transverse Lattice Methods
The formalism for describing hadrons using a light-cone Hamiltonian of SU(N)
gauge theory on a coarse transverse lattice is reviewed. Physical gauge degrees
of freedom are represented by disordered flux fields on the links of the
lattice. A renormalised light-cone Hamiltonian is obtained by making a
colour-dielectric expansion for the link-field interactions. Parameters in the
Hamiltonian are renormalised non-perturbatively by seeking regions in parameter
space with enhanced Lorentz symmetry. In the case of pure gauge theories to
lowest non-trivial order of the colour-dielectric expansion, this is sufficient
to determine accurately all parameters in the large-N limit. We summarize
results from applications to glueballs. After quarks are added, the Hamiltonian
and Hilbert space are expanded in both dynamical fermion and link fields.
Lorentz and chiral symmetry are not sufficient to accurately determine all
parameters to lowest non-trivial order of these expansions. However, Lorentz
symmetry and one phenomenological input, a chiral symmetry breaking scale, are
enough to fix all parameters unambiguously. Applications to light-light and
heavy-light mesons are described.
Comment: 55 pp, revised version, to appear in 'Progress in Particle and
Nuclear Physics'
An intense traveling airglow front in the upper mesosphere-lower thermosphere with characteristics of a bore observed over Alice Springs, Australia, during a strong 2 day wave episode
Extent: 13p.
The Aerospace Corporation's Nightglow Imager observed a large step function change in airglow in the form of a traveling front in the OH Meinel (OHM) and O2 atmospheric (O2A) airglow emissions over Alice Springs, Australia, on 2 February 2003. The front exhibited nearly a factor of 2 stepwise increase in the OHM brightness and a stepwise decrease in the O2A brightness. There was significant (∼25 K) cooling behind the airglow fronts. The OHM airglow brightness behind the front was among the brightest for Alice Springs that we have measured in 7 years of observations. The event was associated with a strong phase-locked 2 day wave (PL/TDW). We have analyzed the wave trapping conditions for the upper mesosphere and lower thermosphere using a combination of data and empirical models and found that the airglow layers were located in a region of ducting. The PL/TDW-disturbed wind profile was effective in supporting a high degree of ducting, whereas without the PL/TDW the ducting was minimal or nonexistent. The change in brightness in each layer was associated with a strong leading disturbance followed by a train of weak barely visible waves. In OHM the leading disturbance was an isolated disturbance resembling a solitary wave. The characteristics of the wave train suggest an undular bore with some turbulent dissipation at the leading edge.
R. L. Walterscheid, J. H. Hecht, L. J. Gelinas, M. P. Hickey, and I. M. Rei