616 research outputs found

    Private Incremental Regression

    Data is continuously generated by modern data sources, and a recent challenge in machine learning has been to develop techniques that perform well in an incremental (streaming) setting. In this paper, we investigate the problem of private machine learning, where, as is common in practice, the data is not given at once, but rather arrives incrementally over time. We introduce the problems of private incremental ERM and private incremental regression, where the general goal is to always maintain a good empirical risk minimizer for the history observed under differential privacy. Our first contribution is a generic transformation of private batch ERM mechanisms into private incremental ERM mechanisms, based on a simple idea of invoking the private batch ERM procedure at some regular time intervals. We take this construction as a baseline for comparison. We then provide two mechanisms for the private incremental regression problem. Our first mechanism is based on privately constructing a noisy incremental gradient function, which is then used in a modified projected gradient procedure at every timestep. This mechanism has an excess empirical risk of \approx \sqrt{d}, where d is the dimensionality of the data. While from the results of [Bassily et al. 2014] this bound is tight in the worst-case, we show that certain geometric properties of the input and constraint set can be used to derive significantly better results for certain interesting regression problems. Comment: To appear in PODS 201
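
    The baseline transformation described in this abstract (re-invoking a private batch procedure at regular intervals) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the paper's construction: the toy batch mechanism is a Laplace-noised mean of values clipped to [-1, 1] (the minimizer of squared loss in one dimension), the privacy budget is split evenly across invocations using basic sequential composition, and `incremental_from_batch`, `private_batch_mean`, `tau`, and `horizon` are hypothetical names.

```python
import random
from itertools import islice


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_batch_mean(history, epsilon, rng):
    # Toy stand-in for a private batch ERM mechanism (assumption, not the paper's).
    # Values are clipped to [-1, 1], so replacing one record moves the mean
    # by at most 2 / n; Laplace noise is scaled to that sensitivity.
    clipped = [max(-1.0, min(1.0, x)) for x in history]
    n = len(clipped)
    sensitivity = 2.0 / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


def incremental_from_batch(stream, horizon, tau, epsilon_total, batch_mechanism, seed=0):
    # Re-run the batch mechanism on the full history every `tau` steps and
    # repeat the last released estimate in between.
    rng = random.Random(seed)
    calls = max(1, horizon // tau)
    # Basic sequential composition: per-call budgets sum to epsilon_total.
    eps_per_call = epsilon_total / calls
    history, estimate, outputs = [], 0.0, []
    for t, x in enumerate(islice(stream, horizon), start=1):
        history.append(x)
        if t % tau == 0:
            estimate = batch_mechanism(history, eps_per_call, rng)
        outputs.append(estimate)
    return outputs


outputs = incremental_from_batch(
    [0.5] * 20, horizon=20, tau=5, epsilon_total=4.0,
    batch_mechanism=private_batch_mean,
)
```

    A smaller `tau` refreshes the estimate more often but leaves less budget per invocation, so each refresh is noisier; that trade-off is what makes this construction a natural baseline for comparison.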

    Quantity makes quality: learning with partial views

    In many real-world applications, the number of examples to learn from is plentiful, but we can only obtain limited information on each individual example. We study the possibilities of efficient, provably correct, large-scale learning in such settings. The main theme we would like to establish is that large amounts of examples can compensate for the lack of full information on each individual example. The type of partial information we consider can be due to inherent noise or from constraints on the type of interaction with the data source. In particular, we describe and analyze algorithms for budgeted learning, in which the learner can only view a few attributes of each training example (Cesa-Bianchi, Shalev-Shwartz, and Shamir 2010a; 2010c), and algorithms for learning kernel-based predictors, when individual examples are corrupted by random noise (Cesa-Bianchi, Shalev-Shwartz, and Shamir 2010b)

    Primitive Words, Free Factors and Measure Preservation

    Let F_k be the free group on k generators. A word w \in F_k is called primitive if it belongs to some basis of F_k. We investigate two criteria for primitivity, and consider more generally, subgroups of F_k which are free factors. The first criterion is graph-theoretic and uses Stallings core graphs: given subgroups of finite rank H \le J \le F_k we present a simple procedure to determine whether H is a free factor of J. This yields, in particular, a procedure to determine whether a given element in F_k is primitive. Again let w \in F_k and consider the word map w:G x G x ... x G \to G (from the direct product of k copies of G to G), where G is an arbitrary finite group. We call w measure preserving if given uniform measure on G x G x ... x G, w induces uniform measure on G (for every finite G). This is the second criterion we investigate: it is not hard to see that primitivity implies measure preservation and it was conjectured that the two properties are equivalent. Our combinatorial approach to primitivity allows us to make progress on this problem and in particular prove the conjecture for k=2. It was asked whether the primitive elements of F_k form a closed set in the profinite topology of free groups. Our results provide a positive answer for F_2. Comment: This is a unified version of two manuscripts: "On Primitive words I: A New Algorithm", and "On Primitive Words II: Measure Preservation". 42 pages, 14 figures. Some parts of the paper reorganized towards publication in the Israel J. of Mat
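
    The measure-preservation criterion in this abstract can be tested by brute force on a small group. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it takes G to be the symmetric group S_n (n = 3 by default), encodes a word as a list of (generator index, exponent) pairs with exponents restricted to +1 or -1, and uses hypothetical helper names. Because the definition quantifies over every finite G, a single group can only refute measure preservation, never establish it.

```python
from collections import Counter
from itertools import permutations, product


def compose(p, q):
    # Composition of permutations stored as tuples: (p o q)(i) = p[q[i]].
    return tuple(p[i] for i in q)


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def word_image_counts(word, k, n=3):
    # Evaluate the word map on every k-tuple of elements of S_n and count
    # how often each group element occurs as the value.
    group = list(permutations(range(n)))
    identity = tuple(range(n))
    counts = Counter()
    for assignment in product(group, repeat=k):
        value = identity
        for generator, exponent in word:
            g = assignment[generator]
            value = compose(value, g if exponent == 1 else inverse(g))
        counts[value] += 1
    return counts, group


def induces_uniform_measure(word, k, n=3):
    # Uniform on G means each element is hit exactly |G|^(k-1) times.
    counts, group = word_image_counts(word, k, n)
    expected = len(group) ** (k - 1)
    return all(counts[g] == expected for g in group)


xy = [(0, 1), (1, 1)]                              # x y, part of the basis {x, xy}
x_squared = [(0, 1), (0, 1)]                       # x^2
commutator = [(0, 1), (1, 1), (0, -1), (1, -1)]    # x y x^-1 y^-1

print(induces_uniform_measure(xy, k=2))
print(induces_uniform_measure(x_squared, k=2))
print(induces_uniform_measure(commutator, k=2))
```

    On S_3 the primitive word xy passes the check, while x^2 and the commutator fail it: squares over-count the identity, and commutators never produce a transposition.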

    Identification of novel subgroup a variants with enhanced receptor binding and replicative capacity in primary isolates of anaemogenic strains of feline leukaemia virus

    BACKGROUND: The development of anaemia in feline leukaemia virus (FeLV)-infected cats is associated with the emergence of a novel viral subgroup, FeLV-C. FeLV-C arises from the subgroup that is transmitted, FeLV-A, through alterations in the amino acid sequence of the receptor binding domain (RBD) of the envelope glycoprotein that result in a shift in the receptor usage and the cell tropism of the virus. The factors that influence the transition from subgroup A to subgroup C remain unclear; one possibility is that a selective pressure in the host drives the acquisition of mutations in the RBD, creating A/C intermediates with enhanced abilities to interact with the FeLV-C receptor, FLVCR. In order to understand further the emergence of FeLV-C in the infected cat, we examined primary isolates of FeLV-C for evidence of FeLV-A variants that bore mutations consistent with a gradual evolution from FeLV-A to FeLV-C. RESULTS: Within each isolate of FeLV-C, we identified variants that were ostensibly subgroup A by nucleic acid sequence comparisons, but which bore mutations in the RBD. One such mutation, N91D, was present in multiple isolates and when engineered into a molecular clone of the prototypic FeLV-A (Glasgow-1), enhanced replication was noted in feline cells. Expression of the N91D Env on murine leukaemia virus (MLV) pseudotypes enhanced viral entry mediated by the FeLV-A receptor THTR1 while soluble FeLV-A Env bearing the N91D mutation bound more efficiently to mouse or guinea pig cells bearing the FeLV-A and -C receptors. Long-term in vitro culture of variants bearing the N91D substitution in the presence of anti-FeLV gp70 antibodies did not result in the emergence of FeLV-C variants, suggesting that additional selective pressures in the infected cat may drive the subsequent evolution from subgroup A to subgroup C. CONCLUSIONS: Our data support a model in which variants of FeLV-A, bearing subtle differences in the RBD of Env, may be predisposed towards enhanced replication in vivo and subsequent conversion to FeLV-C. The selection pressures in vivo that drive the emergence of FeLV-C in a proportion of infected cats remain to be established

    Invariant Distribution of Promoter Activities in Escherichia coli

    Cells need to allocate their limited resources to express a wide range of genes. To understand how Escherichia coli partitions its transcriptional resources between its different promoters, we employ a robotic assay using a comprehensive reporter strain library for E. coli to measure promoter activity on a genomic scale at high temporal resolution and accuracy. This allows continuous tracking of promoter activity as cells change their growth rate from exponential to stationary phase in different media. We find a heavy-tailed distribution of promoter activities, with promoter activities spanning several orders of magnitude. While the shape of the distribution is almost completely independent of the growth conditions, the identity of the promoters expressed at different levels does depend on them. Translation machinery genes, however, keep the same relative expression levels in the distribution across conditions, and their fractional promoter activity tracks growth rate tightly. We present a simple optimization model for resource allocation which suggests that the observed invariant distributions might maximize growth rate. These invariant features of the distribution of promoter activities may suggest design constraints that shape the allocation of transcriptional resources

    MCM9 is associated with germline predisposition to early-onset cancer-clinical evidence

    Mutated MCM9 has been associated with primary ovarian insufficiency. Although MCM9 plays a role in genome maintenance and has been reported as a candidate gene in a few patients with inherited colorectal cancer (CRC), it has not been clearly established as a cancer predisposition gene. We re-evaluated family members with MCM9-associated fertility problems. The heterozygote parents had a few colonic polyps. Three siblings had early-onset cancer: one had metastatic cervical cancer and two had early-onset CRC. Moreover, a review of the literature on MCM9 carriers revealed that of nine bi-allelic carriers reported, eight had early-onset cancer. We provide clinical evidence for MCM9 as a cancer germline predisposition gene associated with early-onset cancer and polyposis, mainly in a recessive inheritance pattern. These observations, coupled with the phenotype in knockout mice, suggest that diagnostic testing for polyposis, CRC, and infertility should include MCM9 analysis. Early screening protocols may be beneficial for carriers. Hereditary cancer genetic

    Dyscalculia from a developmental and differential perspective

    Developmental dyscalculia (DD) and its treatment are receiving increasing research attention. A PsychInfo search for peer-reviewed articles with dyscalculia as a title word reveals 31 papers published from 1991–2001, versus 74 papers published from 2002–2012. Still, these small counts reflect the paucity of research on DD compared to dyslexia, despite the prevalence of mathematical difficulties. In the UK, 22% of adults have mathematical difficulties sufficient to impose severe practical and occupational restrictions (Bynner and Parsons, 1997; National Center for Education Statistics, 2011). It is unlikely that all of these individuals with mathematical difficulties have DD, but criteria for defining and diagnosing dyscalculia remain ambiguous (Mazzocco and Myers, 2003). What is treated as DD in one study may be conceptualized as another form of mathematical impairment in another study. Furthermore, DD is frequently (but, we believe, mistakenly) considered a largely homogeneous disorder. Here we advocate a differential and developmental perspective on DD focused on identifying behavioral, cognitive, and neural sources of individual differences that contribute to our understanding of what DD is and what it is not

    Robustness and Generalization

    We derive generalization bounds for learning algorithms based on their robustness: the property that if a testing sample is "similar" to a training sample, then the testing error is close to the training error. This provides a novel approach, different from the complexity or stability arguments, to study generalization of learning algorithms. We further show that a weak notion of robustness is both sufficient and necessary for generalizability, which implies that robustness is a fundamental property for learning algorithms to work

    A Prediction Model to Prioritize Individuals for a SARS-CoV-2 Test Built from National Symptom Surveys

    Background: The gold standard for COVID-19 diagnosis is detection of viral RNA through PCR. Due to global limitations in testing capacity, effective prioritization of individuals for testing is essential. Methods: We devised a model estimating the probability of an individual to test positive for COVID-19 based on answers to 9 simple questions that have been associated with SARS-CoV-2 infection. Our model was devised from a subsample of a national symptom survey that was answered over 2 million times in Israel in its first 2 months and a targeted survey distributed to all residents of several cities in Israel. Overall, 43,752 adults were included, from which 498 self-reported as being COVID-19 positive. Findings: Our model was validated on a held-out set of individuals from Israel where it achieved an auROC of 0.737 (CI: 0.712–0.759) and auPR of 0.144 (CI: 0.119–0.177) and demonstrated its applicability outside of Israel in an independently collected symptom survey dataset from the US, UK, and Sweden. Our analyses revealed interactions between several symptoms and age, suggesting variation in the clinical manifestation of the disease in different age groups. Conclusions: Our tool can be used online and without exposure to suspected patients, thus suggesting worldwide utility in combating COVID-19 by better directing the limited testing resources through prioritization of individuals for testing, thereby increasing the rate at which positive individuals can be identified. Moreover, individuals at high risk for a positive test result can be isolated prior to testing. Funding: E.S. is supported by the Crown Human Genome Center, Larson Charitable Foundation New Scientist Fund, Else Kroener Fresenius Foundation, White Rose International Foundation, Ben B. and Joyce E. Eisenberg Foundation, Nissenbaum Family, Marcos Pinheiro de Andrade and Vanessa Buchheim, Lady Michelle Michels, and Aliza Moussaieff and grants funded by the Minerva foundation with funding from the Federal German Ministry for Education and Research and by the European Research Council and the Israel Science Foundation. H.R. is supported by the Israeli Council for Higher Education (CHE) via the Weizmann Data Science Research Center and by a research grant from Madame Olga Klein – Astrachan

    Stress-Induced Reinstatement of Drug Seeking: 20 Years of Progress

    In human addicts, drug relapse and craving are often provoked by stress. Since 1995, this clinical scenario has been studied using a rat model of stress-induced reinstatement of drug seeking. Here, we first discuss the generality of stress-induced reinstatement to different drugs of abuse, different stressors, and different behavioral procedures. We also discuss neuropharmacological mechanisms, and brain areas and circuits controlling stress-induced reinstatement of drug seeking. We conclude by discussing results from translational human laboratory studies and clinical trials that were inspired by results from rat studies on stress-induced reinstatement. Our main conclusions are: (1) The phenomenon of stress-induced reinstatement, first shown with an intermittent footshock stressor in rats trained to self-administer heroin, generalizes to other abused drugs, including cocaine, methamphetamine, nicotine, and alcohol, and is also observed in the conditioned place preference model in rats and mice. This phenomenon, however, is stressor specific and not all stressors induce reinstatement of drug seeking. (2) Neuropharmacological studies indicate the involvement of corticotropin-releasing factor (CRF), noradrenaline, dopamine, glutamate, kappa/dynorphin, and several other peptide and neurotransmitter systems in stress-induced reinstatement. Neuropharmacology and circuitry studies indicate the involvement of CRF and noradrenaline transmission in bed nucleus of stria terminalis and central amygdala, and dopamine, CRF, kappa/dynorphin, and glutamate transmission in other components of the mesocorticolimbic dopamine system (ventral tegmental area, medial prefrontal cortex, orbitofrontal cortex, and nucleus accumbens). (3) Translational human laboratory studies and a recent clinical trial study show the efficacy of alpha-2 adrenoceptor agonists in decreasing stress-induced drug craving and stress-induced initial heroin lapse