Private Incremental Regression
Data is continuously generated by modern data sources, and a recent challenge
in machine learning has been to develop techniques that perform well in an
incremental (streaming) setting. In this paper, we investigate the problem of
private machine learning, where as common in practice, the data is not given at
once, but rather arrives incrementally over time.
We introduce the problems of private incremental ERM and private incremental
regression where the general goal is to always maintain a good empirical risk
minimizer for the history observed under differential privacy. Our first
contribution is a generic transformation of private batch ERM mechanisms into
private incremental ERM mechanisms, based on a simple idea of invoking the
private batch ERM procedure at some regular time intervals. We take this
construction as a baseline for comparison. We then provide two mechanisms for
the private incremental regression problem. Our first mechanism is based on
privately constructing a noisy incremental gradient function, which is then
used in a modified projected gradient procedure at every timestep. This
mechanism has an excess empirical risk of , where is the
dimensionality of the data. While from the results of [Bassily et al. 2014]
this bound is tight in the worst-case, we show that certain geometric
properties of the input and constraint set can be used to derive significantly
better results for certain interesting regression problems. Comment: To appear in PODS 201
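The baseline transformation described above (re-invoking a private batch mechanism at regular intervals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: the stream length is assumed known in advance, the privacy budget is split evenly across invocations using basic composition, and `noisy_mean` is a hypothetical toy stand-in for a private batch ERM procedure. All names here are illustrative.

```python
import random


def noisy_mean(data, epsilon, rng):
    """Toy stand-in for a private batch ERM (assumption): Laplace-noised mean.

    Values are clipped to [0, 1], so replacing one point changes the mean
    by at most 1/n.
    """
    clipped = [min(1.0, max(0.0, x)) for x in data]
    scale = (1.0 / len(clipped)) / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / len(clipped) + noise


def incremental_from_batch(stream, batch_mechanism, total_epsilon, tau, rng):
    """Re-run `batch_mechanism` on the full history every `tau` timesteps.

    Between invocations the last private estimate is reused, which costs no
    extra privacy because it is post-processing of an already released value.
    """
    num_calls = len(stream) // tau
    eps_per_call = total_epsilon / max(1, num_calls)  # basic composition
    history, outputs = [], []
    current, calls = 0.0, 0  # 0.0 is an arbitrary data-independent start
    for t, x in enumerate(stream, start=1):
        history.append(x)
        if t % tau == 0:
            current = batch_mechanism(history, eps_per_call, rng)
            calls += 1
        outputs.append(current)
    return outputs, calls
```

A larger `tau` means fewer invocations and therefore less noise per invocation, at the price of staler estimates between invocations.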
Quantity makes quality: learning with partial views
In many real-world applications, examples to learn from are plentiful, but we can only obtain limited information on each individual example. We study the possibilities of efficient, provably correct, large-scale learning in such settings. The main theme we would like to establish is that large amounts of examples can compensate for the lack of full information on each individual example. The type of partial information we consider can be due to inherent noise or to constraints on the type of interaction with the data source. In particular, we describe and analyze algorithms for budgeted learning, in which the learner can only view a few attributes of each training example (Cesa-Bianchi, Shalev-Shwartz, and Shamir 2010a; 2010c), and algorithms for learning kernel-based predictors, when individual examples are corrupted by random noise (Cesa-Bianchi, Shalev-Shwartz, and Shamir 2010b).
Primitive Words, Free Factors and Measure Preservation
Let F_k be the free group on k generators. A word w \in F_k is called
primitive if it belongs to some basis of F_k. We investigate two criteria for
primitivity, and consider more generally, subgroups of F_k which are free
factors.
The first criterion is graph-theoretic and uses Stallings core graphs: given
subgroups of finite rank H \le J \le F_k we present a simple procedure to
determine whether H is a free factor of J. This yields, in particular, a
procedure to determine whether a given element in F_k is primitive.
Again let w \in F_k and consider the word map w:G x G x ... x G \to G (from
the direct product of k copies of G to G), where G is an arbitrary finite
group. We call w measure preserving if given uniform measure on G x G x ... x
G, w induces uniform measure on G (for every finite G). This is the second
criterion we investigate: it is not hard to see that primitivity implies
measure preservation and it was conjectured that the two properties are
equivalent. Our combinatorial approach to primitivity allows us to make
progress on this problem and in particular prove the conjecture for k=2.
It was asked whether the primitive elements of F_k form a closed set in the
profinite topology of free groups. Our results provide a positive answer for
F_2. Comment: This is a unified version of two manuscripts: "On Primitive words I:
A New Algorithm", and "On Primitive Words II: Measure Preservation". 42
pages, 14 figures. Some parts of the paper reorganized towards publication in
the Israel J. of Mat
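The measure-preservation criterion for k=2 can be checked by brute force on a small group. The sketch below is illustrative only: it tests a single symmetric group S_n, whereas the definition quantifies over every finite group, so passing is a necessary condition and not a proof. The word encoding (a list of `(generator_index, exponent)` pairs with exponent +1 or -1) and all function names are assumptions made for this example.

```python
from collections import Counter
from itertools import permutations, product


def compose(p, q):
    """Return the permutation that applies q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Return the inverse of permutation p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def evaluate(word, assignment):
    """Evaluate a word on a tuple of group elements, one per generator."""
    result = tuple(range(len(assignment[0])))  # identity permutation
    for gen, exp in word:
        g = assignment[gen] if exp == 1 else inverse(assignment[gen])
        result = compose(result, g)
    return result


def induces_uniform_measure(word, k, n):
    """Check whether the word map S_n^k -> S_n hits every element equally often."""
    group = list(permutations(range(n)))
    counts = Counter(evaluate(word, a) for a in product(group, repeat=k))
    expected = len(group) ** (k - 1)
    return all(counts[g] == expected for g in group)
```

On S_3, the primitive word ab passes this check, while the non-primitive words a^2 and the commutator a b a^-1 b^-1 fail it.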
Identification of novel subgroup A variants with enhanced receptor binding and replicative capacity in primary isolates of anaemogenic strains of feline leukaemia virus
BACKGROUND:
The development of anaemia in feline leukaemia virus (FeLV)-infected cats is associated with the emergence of a novel viral subgroup, FeLV-C. FeLV-C arises from the subgroup that is transmitted, FeLV-A, through alterations in the amino acid sequence of the receptor binding domain (RBD) of the envelope glycoprotein that result in a shift in the receptor usage and the cell tropism of the virus. The factors that influence the transition from subgroup A to subgroup C remain unclear; one possibility is that a selective pressure in the host drives the acquisition of mutations in the RBD, creating A/C intermediates with enhanced abilities to interact with the FeLV-C receptor, FLVCR. In order to understand further the emergence of FeLV-C in the infected cat, we examined primary isolates of FeLV-C for evidence of FeLV-A variants that bore mutations consistent with a gradual evolution from FeLV-A to FeLV-C.
RESULTS:
Within each isolate of FeLV-C, we identified variants that were ostensibly subgroup A by nucleic acid sequence comparisons, but which bore mutations in the RBD. One such mutation, N91D, was present in multiple isolates and, when engineered into a molecular clone of the prototypic FeLV-A (Glasgow-1), enhanced replication in feline cells. Expression of the N91D Env on murine leukaemia virus (MLV) pseudotypes enhanced viral entry mediated by the FeLV-A receptor THTR1, while soluble FeLV-A Env bearing the N91D mutation bound more efficiently to mouse or guinea pig cells bearing the FeLV-A and -C receptors. Long-term in vitro culture of variants bearing the N91D substitution in the presence of anti-FeLV gp70 antibodies did not result in the emergence of FeLV-C variants, suggesting that additional selective pressures in the infected cat may drive the subsequent evolution from subgroup A to subgroup C.
CONCLUSIONS:
Our data support a model in which variants of FeLV-A, bearing subtle differences in the RBD of Env, may be predisposed towards enhanced replication in vivo and subsequent conversion to FeLV-C. The selection pressures in vivo that drive the emergence of FeLV-C in a proportion of infected cats remain to be established.
Invariant Distribution of Promoter Activities in Escherichia coli
Cells need to allocate their limited resources to express a wide range of genes. To understand how Escherichia coli partitions its transcriptional resources between its different promoters, we employ a robotic assay using a comprehensive reporter strain library for E. coli to measure promoter activity on a genomic scale at high temporal resolution and accuracy. This allows continuous tracking of promoter activity as cells change their growth rate from exponential to stationary phase in different media. We find a heavy-tailed distribution of promoter activities, with promoter activities spanning several orders of magnitude. While the shape of the distribution is almost completely independent of the growth conditions, the identity of the promoters expressed at different levels does depend on them. Translation machinery genes, however, keep the same relative expression levels in the distribution across conditions, and their fractional promoter activity tracks growth rate tightly. We present a simple optimization model for resource allocation which suggests that the observed invariant distributions might maximize growth rate. These invariant features of the distribution of promoter activities may suggest design constraints that shape the allocation of transcriptional resources.
MCM9 is associated with germline predisposition to early-onset cancer-clinical evidence
Mutated MCM9 has been associated with primary ovarian insufficiency. Although MCM9 plays a role in genome maintenance and has been reported as a candidate gene in a few patients with inherited colorectal cancer (CRC), it has not been clearly established as a cancer predisposition gene. We re-evaluated family members with MCM9-associated fertility problems. The heterozygote parents had a few colonic polyps. Three siblings had early-onset cancer: one had metastatic cervical cancer and two had early-onset CRC. Moreover, a review of the literature on MCM9 carriers revealed that of nine bi-allelic carriers reported, eight had early-onset cancer. We provide clinical evidence for MCM9 as a cancer germline predisposition gene associated with early-onset cancer and polyposis, mainly in a recessive inheritance pattern. These observations, coupled with the phenotype in knockout mice, suggest that diagnostic testing for polyposis, CRC, and infertility should include MCM9 analysis. Early screening protocols may be beneficial for carriers. Hereditary cancer genetic
Dyscalculia from a developmental and differential perspective
Developmental dyscalculia (DD) and its treatment are receiving increasing research attention. A PsychInfo search for peer-reviewed articles with dyscalculia as a title word reveals 31 papers published from 1991–2001, versus 74 papers published from 2002–2012. Still, these small counts reflect the paucity of research on DD compared to dyslexia, despite the prevalence of mathematical difficulties. In the UK, 22% of adults have mathematical difficulties sufficient to impose severe practical and occupational restrictions (Bynner and Parsons, 1997; National Center for Education Statistics, 2011). It is unlikely that all of these individuals with mathematical difficulties have DD, but criteria for defining and diagnosing dyscalculia remain ambiguous (Mazzocco and Myers, 2003). What is treated as DD in one study may be conceptualized as another form of mathematical impairment in another study. Furthermore, DD is frequently, but we believe mistakenly, considered a largely homogeneous disorder. Here we advocate a differential and developmental perspective on DD focused on identifying behavioral, cognitive, and neural sources of individual differences that contribute to our understanding of what DD is and what it is not.
Robustness and Generalization
We derive generalization bounds for learning algorithms based on their
robustness: the property that if a testing sample is "similar" to a training
sample, then the testing error is close to the training error. This provides a
novel approach, different from the complexity or stability arguments, to study
generalization of learning algorithms. We further show that a weak notion of
robustness is both sufficient and necessary for generalizability, which implies
that robustness is a fundamental property for learning algorithms to work.
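The informal robustness property described above can be written down as follows. This is one possible formalization, offered as a sketch and not quoted from the abstract: the notation (sample space \mathcal{Z}, training set s, learned hypothesis A_s, loss \ell, a partition of \mathcal{Z} into K cells C_1, ..., C_K, and a tolerance \epsilon(s)) is assumed for illustration, with "similar" taken to mean "falls in the same cell".

```latex
% Assumed notation (not from the abstract): \mathcal{Z} is the sample space,
% s the training set, A_s the learned hypothesis, \ell the loss, and
% C_1, \dots, C_K a partition of \mathcal{Z}.
\[
\forall\, s_i \in s,\ \forall\, z \in \mathcal{Z}:\quad
s_i, z \in C_j \ \text{for some } j \in \{1, \dots, K\}
\;\Longrightarrow\;
\bigl|\ell(A_s, s_i) - \ell(A_s, z)\bigr| \le \epsilon(s).
\]
```

Under this reading, a finer partition (larger K) typically allows a smaller \epsilon(s), so any bound built on it trades the two quantities against each other.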
A Prediction Model to Prioritize Individuals for a SARS-CoV-2 Test Built from National Symptom Surveys
Background: The gold standard for COVID-19 diagnosis is detection of viral RNA through PCR. Due to global limitations in testing capacity, effective prioritization of individuals for testing is essential. Methods: We devised a model estimating the probability that an individual tests positive for COVID-19 based on answers to 9 simple questions that have been associated with SARS-CoV-2 infection. Our model was devised from a subsample of a national symptom survey that was answered over 2 million times in Israel in its first 2 months and a targeted survey distributed to all residents of several cities in Israel. Overall, 43,752 adults were included, of whom 498 self-reported as being COVID-19 positive. Findings: Our model was validated on a held-out set of individuals from Israel, where it achieved an auROC of 0.737 (CI: 0.712–0.759) and auPR of 0.144 (CI: 0.119–0.177). It also demonstrated its applicability outside of Israel in an independently collected symptom survey dataset from the US, UK, and Sweden. Our analyses revealed interactions between several symptoms and age, suggesting variation in the clinical manifestation of the disease in different age groups. Conclusions: Our tool can be used online and without exposure to suspected patients, thus suggesting worldwide utility in combating COVID-19 by better directing the limited testing resources through prioritization of individuals for testing, thereby increasing the rate at which positive individuals can be identified. Moreover, individuals at high risk for a positive test result can be isolated prior to testing.
Funding: E.S. is supported by the Crown Human Genome Center, Larson Charitable Foundation New Scientist Fund, Else Kroener Fresenius Foundation, White Rose International Foundation, Ben B. and Joyce E. Eisenberg Foundation, Nissenbaum Family, Marcos Pinheiro de Andrade and Vanessa Buchheim, Lady Michelle Michels, and Aliza Moussaieff and grants funded by the Minerva foundation with funding from the Federal German Ministry for Education and Research and by the European Research Council and the Israel Science Foundation. H.R. is supported by the Israeli Council for Higher Education (CHE) via the Weizmann Data Science Research Center and by a research grant from Madame Olga Klein – Astrachan
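For readers unfamiliar with the auROC metric reported in the findings, the sketch below shows one standard way to compute it: as the fraction of (positive, negative) pairs in which the positive individual receives the higher score, with ties counted as half. The function name and the toy scores are illustrative assumptions; they are not the study's model or data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `scores` are predicted probabilities; `labels` are 1 (positive) or 0
    (negative). Runs in O(P * N) time, which is fine for a small example.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Synthetic toy example (not study data): 3 of the 4 pairs are ranked correctly.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auroc = auroc(toy_scores, toy_labels)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives.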
Stress-Induced Reinstatement of Drug Seeking: 20 Years of Progress
In human addicts, drug relapse and craving are often provoked by stress. Since 1995, this clinical scenario has been studied using a rat model of stress-induced reinstatement of drug seeking. Here, we first discuss the generality of stress-induced reinstatement to different drugs of abuse, different stressors, and different behavioral procedures. We also discuss neuropharmacological mechanisms, and brain areas and circuits controlling stress-induced reinstatement of drug seeking. We conclude by discussing results from translational human laboratory studies and clinical trials that were inspired by results from rat studies on stress-induced reinstatement. Our main conclusions are (1) The phenomenon of stress-induced reinstatement, first shown with an intermittent footshock stressor in rats trained to self-administer heroin, generalizes to other abused drugs, including cocaine, methamphetamine, nicotine, and alcohol, and is also observed in the conditioned place preference model in rats and mice. This phenomenon, however, is stressor specific and not all stressors induce reinstatement of drug seeking. (2) Neuropharmacological studies indicate the involvement of corticotropin-releasing factor (CRF), noradrenaline, dopamine, glutamate, kappa/dynorphin, and several other peptide and neurotransmitter systems in stress-induced reinstatement. Neuropharmacology and circuitry studies indicate the involvement of CRF and noradrenaline transmission in bed nucleus of stria terminalis and central amygdala, and dopamine, CRF, kappa/dynorphin, and glutamate transmission in other components of the mesocorticolimbic dopamine system (ventral tegmental area, medial prefrontal cortex, orbitofrontal cortex, and nucleus accumbens). (3) Translational human laboratory studies and a recent clinical trial study show the efficacy of alpha-2 adrenoceptor agonists in decreasing stress-induced drug craving and stress-induced initial heroin lapse