
    Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence

    Incremental learning (IL) has received a lot of attention recently; however, the literature lacks a precise problem definition, proper evaluation settings, and metrics tailored specifically for the IL problem. One of the main objectives of this work is to fill these gaps and provide a common ground for a better understanding of IL. The main challenge for an IL algorithm is to update the classifier whilst preserving existing knowledge. We observe that, in addition to forgetting, a known issue when preserving knowledge, IL also suffers from a problem we call intransigence, the inability of a model to update its knowledge. We introduce two metrics to quantify forgetting and intransigence that allow us to understand, analyse, and gain better insights into the behaviour of IL algorithms. We present RWalk, a generalization of EWC++ (our efficient version of EWC [Kirkpatrick2016EWC]) and Path Integral [Zenke2017Continual], with a theoretically grounded KL-divergence-based perspective. We provide a thorough analysis of various IL algorithms on the MNIST and CIFAR-100 datasets. In these experiments, RWalk obtains superior results in terms of accuracy and also provides a better trade-off between forgetting and intransigence.
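    A minimal sketch of the two metrics described above, as they are commonly defined: forgetting on a past task is the gap between the best accuracy ever achieved on that task and the current accuracy, while intransigence is the gap between a jointly trained reference model and the incremental model on the current task. The accuracy tables and reference values below are toy numbers for illustration, not results from the paper.

    ```python
    # acc[k][j] = accuracy on task j after training up to task k (j <= k);
    # ref_acc[j] = accuracy of a reference model jointly trained on tasks 0..j.

    def forgetting(acc, k, j):
        """Forgetting on task j after learning task k: best past accuracy
        on task j minus the current accuracy on task j (j < k)."""
        best_past = max(acc[l][j] for l in range(j, k))
        return best_past - acc[k][j]

    def intransigence(ref_acc, acc, k):
        """Intransigence at task k: gap between the jointly trained
        reference model and the incremental model on the current task."""
        return ref_acc[k] - acc[k][k]

    # Toy accuracy history over three tasks (assumed values):
    acc = [[0.95], [0.80, 0.90], [0.70, 0.85, 0.88]]
    ref_acc = [0.96, 0.93, 0.92]
    print(forgetting(acc, 2, 0))          # best past 0.95 minus current 0.70
    print(intransigence(ref_acc, acc, 2)) # 0.92 reference minus 0.88 incremental
    ```

    A good IL algorithm drives both numbers toward zero; the trade-off the abstract mentions is that aggressively protecting old parameters lowers forgetting but raises intransigence.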

    Modeling drying kinetics of thyme (thymus vulgaris l.): theoretical and empirical models, and neural networks

    [EN] The drying kinetics of thyme were analyzed under different conditions: air temperatures between 40 °C and 70 °C and an air velocity of 1 m/s. A theoretical diffusion model and eight different empirical models were fitted to the experimental data. From the theoretical model, the effective diffusivity per unit area of the thyme was estimated (between 3.68 × 10⁻⁵ and 2.12 × 10⁻⁴ s⁻¹). The temperature dependence of the effective diffusivity was described by the Arrhenius relationship, with an activation energy of 49.42 kJ/mol. Additionally, the dependence of the parameters of each empirical model on the drying temperature was determined, yielding equations that allow the evolution of the moisture content to be estimated at any temperature in the established range. Furthermore, artificial neural networks were developed and compared with the theoretical and empirical models using the percentage of relative errors and the explained variance. The artificial neural networks were found to be more accurate predictors of moisture evolution, with VAR = 99.3% and ER = 8.7%.

    The authors acknowledge the financial support from the 'Ministerio de Educación y Ciencia' in Spain, CONSOLIDER INGENIO 2010 (CSD2007-00016).

    Rodríguez Cortina, J.; Clemente Polo, G.; Sanjuán Pellicer, MN.; Bon Corbín, J. (2014). Modeling drying kinetics of thyme (thymus vulgaris l.): theoretical and empirical models, and neural networks. Food Science and Technology International. 20(1):13-22. https://doi.org/10.1177/1082013212469614
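    The Arrhenius relationship mentioned above can be made concrete with the activation energy reported in the abstract (49.42 kJ/mol). The pre-exponential factor is a fitting parameter not given here, so the sketch below uses a placeholder value of 1.0 and only computes the temperature ratio, in which that factor cancels out:

    ```python
    import math

    R = 8.314    # universal gas constant, J/(mol K)
    EA = 49.42e3 # activation energy from the study, J/mol

    def arrhenius(d0, temp_c):
        """Effective diffusivity per unit area at temp_c (in degrees C),
        D = d0 * exp(-EA / (R * T)), with d0 an assumed fitting parameter."""
        return d0 * math.exp(-EA / (R * (temp_c + 273.15)))

    # Ratio of diffusivities between the extremes of the studied range
    # (40 C to 70 C); the pre-exponential factor d0 cancels out.
    ratio = arrhenius(1.0, 70.0) / arrhenius(1.0, 40.0)
    print(round(ratio, 2))  # roughly a five-fold increase across the range
    ```

    This roughly five-fold increase is consistent in magnitude with the reported diffusivity range of 3.68 × 10⁻⁵ to 2.12 × 10⁻⁴ s⁻¹.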

    Greater body mass index is a better predictor of subclinical cardiac damage at long-term follow-up in men than is insulin sensitivity:a prospective, population-based cohort study

    To examine whether lower insulin sensitivity, as determined by homeostatic model assessment (HOMA-%S), was associated with increased left ventricular mass (LVM) and the presence of LV diastolic dysfunction at long-term follow-up, independently of body mass index (BMI), in middle-aged, otherwise healthy males.
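    For context, the HOMA-%S index used above is conventionally the reciprocal of the HOMA insulin-resistance index, expressed as a percentage; a minimal sketch of the standard HOMA1 formulas (the study may use the updated HOMA2 computer model instead, which differs numerically):

    ```python
    def homa_ir(fasting_glucose_mmol_l, fasting_insulin_mu_l):
        """HOMA1 estimate of insulin resistance (HOMA-IR):
        fasting glucose (mmol/L) times fasting insulin (mU/L) over 22.5."""
        return fasting_glucose_mmol_l * fasting_insulin_mu_l / 22.5

    def homa_s_percent(fasting_glucose_mmol_l, fasting_insulin_mu_l):
        """HOMA insulin sensitivity as a percentage (HOMA-%S),
        the reciprocal of HOMA-IR."""
        return 100.0 / homa_ir(fasting_glucose_mmol_l, fasting_insulin_mu_l)

    # Glucose 5.0 mmol/L and insulin 4.5 mU/L give HOMA-IR = 1.0,
    # i.e. HOMA-%S = 100% (the normalising reference point).
    print(homa_ir(5.0, 4.5), homa_s_percent(5.0, 4.5))
    ```

    Lower HOMA-%S therefore corresponds to lower insulin sensitivity, which is the exposure the study compares against BMI.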

    Genuinely Distributed Byzantine Machine Learning

    Machine Learning (ML) solutions are nowadays distributed, according to the so-called server/worker architecture: one server holds the model parameters while several workers train the model. Clearly, such an architecture is prone to various types of component failures, all of which can be encompassed within the spectrum of Byzantine behavior. Several approaches have been proposed recently to tolerate Byzantine workers, yet all require trusting a central parameter server. We initiate in this paper the study of the "general" Byzantine-resilient distributed machine learning problem, where no individual component is trusted. We show that this problem can be solved in an asynchronous system despite the presence of 1/3 Byzantine parameter servers and 1/3 Byzantine workers (which is optimal). We present a new algorithm, ByzSGD, which solves the general Byzantine-resilient distributed machine learning problem by relying on three major schemes. The first, Scatter/Gather, is a communication scheme whose goal is to bound the maximum drift among models on correct servers. The second, Distributed Median Contraction (DMC), leverages the geometric properties of the median in high-dimensional spaces to bring parameters within the correct servers back close to each other, ensuring learning convergence. The third, Minimum-Diameter Averaging (MDA), is a statistically robust gradient aggregation rule whose goal is to tolerate Byzantine workers. MDA requires a loose bound on the variance of non-Byzantine gradient estimates, compared to existing alternatives (e.g., Krum). Interestingly, ByzSGD ensures Byzantine resilience without adding communication rounds (on a normal path) compared to vanilla non-Byzantine alternatives. ByzSGD requires, however, a larger number of messages, which, we show, can be reduced if we assume synchrony. Comment: This is a merge of arXiv:1905.03853 and arXiv:1911.07537; arXiv:1911.07537 will be retracted.
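    The MDA aggregation rule described above can be sketched in a few lines: among n gradient estimates, at most f of which are Byzantine, average the subset of size n − f whose diameter (largest pairwise distance) is smallest. This brute-force version enumerates all subsets and is only practical for small n; it is an illustration of the rule, not the paper's optimized implementation:

    ```python
    import itertools
    import numpy as np

    def mda(gradients, f):
        """Minimum-Diameter Averaging: average the (n - f)-subset of
        gradient vectors with the smallest diameter, i.e. the smallest
        maximum pairwise Euclidean distance."""
        n = len(gradients)
        best_subset, best_diam = None, float("inf")
        for subset in itertools.combinations(range(n), n - f):
            diam = max(
                np.linalg.norm(gradients[i] - gradients[j])
                for i, j in itertools.combinations(subset, 2)
            )
            if diam < best_diam:
                best_diam, best_subset = diam, subset
        return np.mean([gradients[i] for i in best_subset], axis=0)

    # Three honest gradients near [1, 1] and one Byzantine outlier:
    grads = [np.array([1.0, 1.0]), np.array([1.1, 0.9]),
             np.array([0.9, 1.1]), np.array([100.0, -50.0])]
    print(mda(grads, f=1))  # close to [1, 1]; the outlier is excluded
    ```

    Because any subset containing the outlier has a huge diameter, the tight cluster of honest gradients is selected, which is exactly the robustness property MDA is used for.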

    Effect of Preinjury Oral Anticoagulants on Outcomes Following Traumatic Brain Injury from Falls in Older Adults

    Peer reviewed.

    Identification of Important Factors Affecting Use of Digital Individualised Coaching and Treatment of Type 2 Diabetes in General Practice: A Qualitative Feasibility Study

    Most type 2 diabetes patients are treated in general practice, and there is a need to develop and implement efficient lifestyle interventions. eHealth interventions have been shown to be effective in promoting a healthy lifestyle. The purpose of this study was to test the feasibility, including the identification of factors of importance, of offering digital lifestyle coaching to type 2 diabetes patients in general practice. We conducted a qualitative feasibility study with focus group interviews in four general practices. We identified two overall themes and four subthemes: (1) the distribution of roles and lifestyle interventions in general practice (subthemes: external and internal distribution of roles) and (2) the pros and cons of digital lifestyle interventions in general practice (subthemes: access to real-life data and change in daily routines). We conclude that for digital lifestyle coaching to be feasible in a general practice setting, it was of great importance that the general practitioners and practice nurses knew the role and content of the intervention. In general, there was a positive attitude in the general practice setting towards referring type 2 diabetes patients to a digital lifestyle intervention if referral was easy and if easily understandable and accessible feedback was implemented into the electronic health record. It was important that the digital lifestyle intervention was flexible and offered healthcare providers in general practice an opportunity to follow the type 2 diabetes patient closely.

    Validation of nonlinear PCA

    Linear principal component analysis (PCA) can be extended to a nonlinear PCA by using artificial neural networks. But the benefit of curved components requires careful control of the model complexity. Moreover, standard techniques for model selection, including cross-validation and, more generally, the use of an independent test set, fail when applied to nonlinear PCA because of its inherently unsupervised character. This paper presents a new approach for validating the complexity of nonlinear PCA models by using the error in missing-data estimation as a criterion for model selection. It is motivated by the idea that only the model of optimal complexity is able to predict missing values with the highest accuracy. While standard test-set validation usually favours over-fitted nonlinear PCA models, the proposed model validation approach correctly selects the optimal model complexity. Comment: 12 pages, 5 figures.
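    The missing-data validation idea above can be sketched with a toy experiment: hide some entries, fit models of increasing complexity on the observed entries, and score each model by its error on the hidden entries. For brevity this sketch uses linear PCA of varying rank (fitted by EM-style iterative imputation) as a stand-in for the paper's neural-network nonlinear PCA; "complexity" here means rank rather than curvature:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def pca_impute(X, mask, rank, n_iter=50):
        """Fit a rank-`rank` PCA model under missing data: repeatedly
        take a truncated SVD of the current completion and overwrite the
        missing entries with the low-rank reconstruction."""
        Xc = np.where(mask, X, 0.0)
        for _ in range(n_iter):
            U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
            recon = (U[:, :rank] * s[:rank]) @ Vt[:rank]
            Xc = np.where(mask, X, recon)
        return recon

    # Rank-2 ground truth plus a little noise; hide ~20% of the entries.
    true = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 8))
    X = true + 0.01 * rng.normal(size=true.shape)
    mask = rng.random(X.shape) > 0.2   # True = observed

    # Model selection criterion: error on the *missing* entries only.
    errors = {}
    for rank in (1, 2, 4):
        recon = pca_impute(X, mask, rank)
        errors[rank] = np.mean((recon[~mask] - X[~mask]) ** 2)
    print(errors)  # the under-complex rank-1 model should do clearly worst
    ```

    The rank-1 model underfits and predicts the hidden entries poorly, which is the effect the validation criterion exploits; an ordinary test-set error would not expose over-complex models in the same way.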

    The Relativistic Bound State Problem in QCD: Transverse Lattice Methods

    The formalism for describing hadrons using a light-cone Hamiltonian of SU(N) gauge theory on a coarse transverse lattice is reviewed. Physical gauge degrees of freedom are represented by disordered flux fields on the links of the lattice. A renormalised light-cone Hamiltonian is obtained by making a colour-dielectric expansion for the link-field interactions. Parameters in the Hamiltonian are renormalised non-perturbatively by seeking regions in parameter space with enhanced Lorentz symmetry. For pure gauge theories, to lowest non-trivial order of the colour-dielectric expansion, this is sufficient to determine accurately all parameters in the large-N limit. We summarize results from applications to glueballs. After quarks are added, the Hamiltonian and Hilbert space are expanded in both dynamical fermion and link fields. Lorentz and chiral symmetry are not sufficient to accurately determine all parameters to lowest non-trivial order of these expansions. However, Lorentz symmetry and one phenomenological input, a chiral symmetry breaking scale, are enough to fix all parameters unambiguously. Applications to light-light and heavy-light mesons are described. Comment: 55 pp, revised version, to appear in 'Progress in Particle and Nuclear Physics'.

    An intense traveling airglow front in the upper mesosphere-lower thermosphere with characteristics of a bore observed over Alice Springs, Australia, during a strong 2 day wave episode

    The Aerospace Corporation's Nightglow Imager observed a large step-function change in airglow, in the form of a traveling front, in the OH Meinel (OHM) and O2 atmospheric (O2A) airglow emissions over Alice Springs, Australia, on 2 February 2003. The front exhibited a nearly factor-of-2 stepwise increase in the OHM brightness and a stepwise decrease in the O2A brightness. There was significant (∼25 K) cooling behind the airglow fronts. The OHM airglow brightness behind the front was among the brightest we have measured for Alice Springs in 7 years of observations. The event was associated with a strong phase-locked 2 day wave (PL/TDW). We analyzed the wave-trapping conditions for the upper mesosphere and lower thermosphere using a combination of data and empirical models and found that the airglow layers were located in a region of ducting. The PL/TDW-disturbed wind profile was effective in supporting a high degree of ducting, whereas without the PL/TDW the ducting was minimal or nonexistent. The change in brightness in each layer was associated with a strong leading disturbance followed by a train of weak, barely visible waves. In OHM the leading disturbance was an isolated disturbance resembling a solitary wave. The characteristics of the wave train suggest an undular bore with some turbulent dissipation at the leading edge.

    R. L. Walterscheid, J. H. Hecht, L. J. Gelinas, M. P. Hickey, and I. M. Rei