27 research outputs found

    Implicitly Constrained Semi-Supervised Linear Discriminant Analysis

    Semi-supervised learning is an important and active topic of research in pattern recognition. For classification using linear discriminant analysis specifically, several semi-supervised variants have been proposed. Using any one of these methods is not guaranteed to outperform the supervised classifier, which does not take the additional unlabeled data into account. In this work we compare traditional Expectation Maximization type approaches for semi-supervised linear discriminant analysis with approaches based on intrinsic constraints, and propose a new principled approach for semi-supervised linear discriminant analysis using so-called implicit constraints. We explore the relationships between these methods and consider the question of whether, and in what sense, we can expect improvement in performance over the supervised procedure. The constraint-based approaches are more robust to misspecification of the model, and may outperform alternatives that make more assumptions on the data, in terms of the log-likelihood of unseen objects. Comment: 6 pages, 3 figures and 3 tables. International Conference on Pattern Recognition (ICPR) 2014, Stockholm, Sweden

    Projected Estimators for Robust Semi-supervised Classification

    For semi-supervised techniques to be applied safely in practice we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Using a projection of the supervised estimate onto a set of constraints imposed by the unlabeled data, we find we can safely improve over the supervised solution in terms of this quadratic loss. Unlike other approaches to semi-supervised learning, the procedure does not rely on assumptions that are not intrinsic to the classifier at hand. It is theoretically demonstrated that, measured on the labeled and unlabeled training data, this semi-supervised procedure never gives a higher quadratic loss than the supervised alternative. To our knowledge this is the first approach that offers such strong, albeit conservative, guarantees for improvement over the supervised solution. The characteristics of our approach are explicated using benchmark datasets to further understand the similarities and differences between the quadratic loss criterion used in the theoretical results and the classification accuracy often considered in practice. Comment: 13 pages, 2 figures, 1 table

    A Brief Prehistory of Double Descent

    In their thought-provoking paper [1], Belkin et al. illustrate and discuss the shape of risk curves in the context of modern high-complexity learners. Given a fixed training sample size n, such curves show the risk of a learner as a function of some (approximate) measure of its complexity N. With N the number of features, these curves are also referred to as feature curves. A salient observation in [1] is that these curves can display what they call double descent: with increasing N, the risk initially decreases, attains a minimum, and then increases until N equals n, where the training data is fitted perfectly. Increasing N even further, the risk decreases a second and final time, creating a peak at N = n. This twofold descent may come as a surprise, but as opposed to what [1] reports, it has not been overlooked historically. Our letter draws attention to some original, earlier findings, of interest to contemporary machine learning.
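
The feature curve described in this abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the experiment of [1] or of the letter: features are isotropic Gaussian, the target is linear with coefficients 1/j, and the learner is minimum-norm least squares on the first N features. The function name `feature_curve` and all parameter values are hypothetical.

```python
import numpy as np

def feature_curve(n=20, d=60, sigma=0.5, trials=50, seed=0):
    """Average excess risk and training error for N = 1..d features.

    Assumptions (illustrative only): isotropic Gaussian features,
    true coefficients beta_j = 1/j, Gaussian label noise.
    """
    rng = np.random.default_rng(seed)
    beta = 1.0 / np.arange(1, d + 1)
    risk = np.zeros(d)
    train_error = np.zeros(d)
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        y = X @ beta + sigma * rng.standard_normal(n)
        for N in range(1, d + 1):
            # Minimum-norm least-squares fit on the first N features;
            # the remaining coefficients stay at zero.
            w = np.zeros(d)
            w[:N] = np.linalg.pinv(X[:, :N]) @ y
            # For isotropic unit-variance features the excess risk
            # equals the squared distance to the true coefficients.
            risk[N - 1] += np.sum((w - beta) ** 2)
            train_error[N - 1] += np.mean((X @ w - y) ** 2)
    return risk / trials, train_error / trials

risk, train_error = feature_curve()
n = 20
print("risk at N = 5:  ", risk[4])
print("risk at N = n:  ", risk[n - 1])
print("risk at N = 60: ", risk[-1])
```

Under these assumptions the training error drops to numerical zero once N reaches n, and the averaged risk is far larger at N = n than on either side of it. The exact shape of the curve depends on the assumed data model.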

    Feature-level domain adaptation

    Domain adaptation is the supervised learning setting in which the training and test data are sampled from different distributions: training data is sampled from a source domain, whilst test data is sampled from a target domain. This paper proposes and studies an approach, called feature-level domain adaptation (FLDA), that models the dependence between the two domains by means of a feature-level transfer model that is trained to describe the transfer from source to target domain. Subsequently, we train a domain-adapted classifier by minimizing the expected loss under the resulting transfer model. For linear classifiers and a large family of loss functions and transfer models, this expected loss can be computed or approximated analytically, and minimized efficiently. Our empirical evaluation of FLDA focuses on problems comprising binary and count data in which the transfer can be naturally modeled via a dropout distribution, which allows the classifier to adapt to differences in the marginal probability of features in the source and the target domain. Our experiments on several real-world problems show that FLDA performs on par with state-of-the-art domain-adaptation techniques. Comment: 32 pages, 13 figures, 9 tables
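
To make the claim about analytically computable expected loss concrete, here is a minimal sketch for one special case. It assumes a quadratic loss, a linear predictor, and a transfer model in which each feature is independently zeroed with probability theta_j. This is an illustrative simplification and not necessarily the exact formulation used in the paper; the function names and the numbers are hypothetical.

```python
from itertools import product

def expected_quadratic_loss(w, x, y, theta):
    """Closed-form E[(y - w . x_tilde)^2] under independent dropout.

    x_tilde_j equals x_j with probability 1 - theta_j and 0 otherwise,
    so the prediction has a known mean and variance.
    """
    mean = sum(wj * (1 - tj) * xj for wj, xj, tj in zip(w, x, theta))
    var = sum((wj * xj) ** 2 * tj * (1 - tj) for wj, xj, tj in zip(w, x, theta))
    return (y - mean) ** 2 + var

def enumerated_quadratic_loss(w, x, y, theta):
    """Same expectation, computed by brute force over all dropout masks."""
    total = 0.0
    for mask in product([0, 1], repeat=len(x)):
        prob = 1.0
        for keep, tj in zip(mask, theta):
            prob *= (1 - tj) if keep else tj
        pred = sum(wj * keep * xj for wj, keep, xj in zip(w, mask, x))
        total += prob * (y - pred) ** 2
    return total

# Hypothetical example values: three count features and one target.
w = [0.5, -1.0, 2.0]
x = [1.0, 3.0, 2.0]
y = 1.0
theta = [0.1, 0.5, 0.3]

closed_form = expected_quadratic_loss(w, x, y, theta)
brute_force = enumerated_quadratic_loss(w, x, y, theta)
print(closed_form, brute_force)
```

The closed form needs only the mean and variance of the corrupted prediction, which is why it stays cheap as the number of features grows, whereas the enumeration visits every one of the 2^d masks.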

    Real-Life Gait Performance as a Digital Biomarker for Motor Fluctuations: The Parkinson@Home Validation Study

    Background: Wearable sensors have been used successfully to characterize bradykinetic gait in patients with Parkinson disease (PD), but most studies to date have been conducted in highly controlled laboratory environments. Objective: This paper aims to assess whether sensor-based analysis of real-life gait can be used to objectively and remotely monitor motor fluctuations in PD. Methods: The Parkinson@Home validation study provides a new reference data set for the development of digital biomarkers to monitor persons with PD in daily life. Specifically, a group of 25 patients with PD with motor fluctuations and 25 age-matched controls performed unscripted daily activities in and around their homes for at least one hour while being recorded on video. Patients with PD did this twice: once after overnight withdrawal of dopaminergic medication and again 1 hour after medication intake. Participants wore sensors on both wrists and ankles, on the lower back, and in the front pants pocket, capturing movement and contextual data. Gait segments of 25 seconds were extracted from accelerometer signals based on manual video annotations. The power spectral density of each segment and device was estimated using Welch’s method, from which the total power in the 0.5- to 10-Hz band, width of the dominant frequency, and cadence were derived. The ability to discriminate between before and after medication intake and between patients with PD and controls was evaluated using leave-one-subject-out nested cross-validation. Results: From 18 patients with PD (11 men; median age 65 years) and 24 controls (13 men; median age 68 years), ≥10 gait segments were available. 
    Using logistic LASSO (least absolute shrinkage and selection operator) regression, we classified whether the unscripted gait segments occurred before or after medication intake, with mean area under the receiver operating characteristic curve (AUC) varying between 0.70 (ankle of least affected side, 95% CI 0.60-0.81) and 0.82 (ankle of most affected side, 95% CI 0.72-0.92) across sensor locations. Combining all sensor locations did not significantly improve classification (AUC 0.84, 95% CI 0.75-0.93). Of all signal properties, the total power in the 0.5- to 10-Hz band was most responsive to dopaminergic medication. Discriminating between patients with PD and controls was generally more difficult (AUC of all sensor locations combined: 0.76, 95% CI 0.62-0.90). The video recordings revealed that the positioning of the hands during real-life gait had a substantial impact on the power spectral density of both the wrist and pants pocket sensor. Conclusions: We present a new video-referenced data set that includes unscripted activities in and around the participants’ homes. Using this data set, we show the feasibility of using sensor-based analysis of real-life gait to monitor motor fluctuations with a single sensor location. Future work may assess the value of contextual sensors to control for real-world confounders.
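
The spectral features mentioned in the Methods can be sketched as follows. This is a hedged illustration, not the study's pipeline: the sampling rate, segment length, window, and the synthetic 1.8 Hz test signal are all assumptions, and `welch_psd` and `band_power` are hypothetical helper names implementing a Welch-style averaged periodogram with NumPy only. The study's definitions of dominant-frequency width and cadence are not given here, so they are not reproduced.

```python
import numpy as np

def welch_psd(x, fs, nperseg=512):
    """Welch-style PSD: Hann-windowed segments, 50% overlap, averaged."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    psds = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = (seg - seg.mean()) * window       # remove the mean, then window
        spec = np.abs(np.fft.rfft(seg)) ** 2 / scale
        spec[1:-1] *= 2.0                       # one-sided spectrum (even nperseg)
        psds.append(spec)
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.mean(psds, axis=0)

def band_power(freqs, psd, lo=0.5, hi=10.0):
    """Total power in [lo, hi] Hz, summed over the PSD bins in the band."""
    mask = (freqs >= lo) & (freqs <= hi)
    return np.sum(psd[mask]) * (freqs[1] - freqs[0])

# Synthetic 25-second "gait" segment: assumed 100 Hz sampling rate and a
# 1.8 Hz oscillation plus noise (illustrative values only).
fs = 100.0
t = np.arange(0, 25.0, 1.0 / fs)
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 1.8 * t) + 0.3 * rng.standard_normal(t.size)

freqs, psd = welch_psd(signal, fs)
power = band_power(freqs, psd)
dominant = freqs[np.argmax(psd)]
print("0.5-10 Hz band power:", power)
print("dominant frequency (Hz):", dominant)
```

For a unit-amplitude sinusoid the band power comes out close to 0.5, the signal's mean square, which is a quick sanity check on the PSD scaling.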

    Possible modification of BRSK1 on the risk of alkylating chemotherapy-related reduced ovarian function

    STUDY QUESTION: Do genetic variations in the DNA damage response pathway modify the adverse effect of alkylating agents on ovarian function in female childhood cancer survivors (CCS)? SUMMARY ANSWER: Female CCS carrying a common BR serine/threonine kinase 1 (BRSK1) gene variant appear to be at 2.5-fold increased odds of reduced ovarian function after treatment with high doses of alkylating chemotherapy. WHAT IS KNOWN ALREADY: Female CCS show large inter-individual variability in the impact of DNA-damaging alkylating chemotherapy, given as treatment of childhood cancer, on adult ovarian function. Genetic variants in DNA repair genes affecting ovarian function might explain this variability. STUDY DESIGN, SIZE, DURATION: CCS for the discovery cohort were identified from the Dutch Childhood Oncology Group (DCOG) LATER VEVO-study, a multi-centre retrospective cohort study evaluating fertility, ovarian reserve and risk of premature menopause among adult female 5-year survivors of childhood cancer. Female 5-year CCS, diagnosed with cancer and treated with chemotherapy before the age of 25 years, and aged 18 years or older at time of study were enrolled in the current study. Results from the discovery Dutch DCOG-LATER VEVO cohort (n = 285) were validated in the pan-European PanCareLIFE (n = 465) and the USA-based St. Jude Lifetime Cohort (n = 391). PARTICIPANTS/MATERIALS, SETTING, METHODS: To evaluate ovarian function, anti-Müllerian hormone (AMH) levels were assessed in both the discovery cohort and the replication cohorts. Using additive genetic models in linear and logistic regression, five genetic variants involved in DNA damage response were analysed in relation to cyclophosphamide equivalent dose (CED) score and their impact on ovarian function. Results were then examined using fixed-effect meta-analysis.
    MAIN RESULTS AND THE ROLE OF CHANCE: Meta-analysis across the three independent cohorts showed a significant interaction effect (P = 3.0 × 10⁻⁴) for rs11668344 of BRSK1 (allele frequency = 0.34) among CCS treated with high-dose alkylating agents (CED score ≥ 8000 mg/m²), resulting in a 2.5-fold increased odds of a reduced ovarian function (lowest AMH tertile) for CCS carrying one G allele compared to CCS without this allele (odds ratio genotype AA: 2.01 vs AG: 5.00). LIMITATIONS, REASONS FOR CAUTION: While low AMH levels can also identify poor responders in assisted reproductive technology, it needs to be emphasized that AMH remains a surrogate marker of ovarian function. WIDER IMPLICATIONS OF THE FINDINGS: Further research, validating our findings and identifying additional risk contributing genetic variants, may enable individualized counselling regarding treatment-related risks and necessity of fertility preservation procedures in girls with cancer.
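
The reported 2.5-fold figure appears to correspond to the ratio of the two genotype-specific odds ratios quoted in the abstract. The arithmetic below is a reading of those two numbers only, not a reanalysis; the variable names are illustrative.

```python
# Odds ratios as quoted in the abstract for high-dose alkylating treatment.
odds_ratio_aa = 2.01  # genotype AA (no G allele)
odds_ratio_ag = 5.00  # genotype AG (one G allele)

# Relative increase in odds for carriers of one G allele.
fold_increase = odds_ratio_ag / odds_ratio_aa
print(round(fold_increase, 2))  # 2.49
```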