    Reparameterizing the Birkhoff Polytope for Variational Permutation Inference

    Many matching, tracking, sorting, and ranking problems require probabilistic reasoning about possible permutations, a set that grows factorially with dimension. Combinatorial optimization algorithms may enable efficient point estimation, but fully Bayesian inference poses a severe challenge in this high-dimensional, discrete space. To surmount this challenge, we start with the usual step of relaxing a discrete set (here, of permutation matrices) to its convex hull, which here is the Birkhoff polytope: the set of all doubly-stochastic matrices. We then introduce two novel transformations: first, an invertible and differentiable stick-breaking procedure that maps unconstrained space to the Birkhoff polytope; second, a map that rounds points toward the vertices of the polytope. Both transformations include a temperature parameter that, in the limit, concentrates the densities on permutation matrices. We then exploit these transformations and reparameterization gradients to introduce variational inference over permutation matrices, and we demonstrate its utility in a series of experiments.

    Risk Factors for Ebola Virus Persistence in Semen of Survivors in Liberia

    BACKGROUND: Long-term persistence of Ebola virus (EBOV) in immunologically privileged sites has been implicated in recent outbreaks of Ebola virus disease (EVD) in Guinea and the Democratic Republic of Congo. This study was designed to understand how the acute course of EVD, convalescence, and host immune and genetic factors may play a role in prolonged viral persistence in semen. METHODS: A cohort of 131 male EVD survivors in Liberia were enrolled in a case-case study. Early clearers were defined as those with 2 consecutive negative EBOV semen test results by real-time reverse-transcription polymerase chain reaction (rRT-PCR) ≥2 weeks apart within 1 year after discharge from the Ebola treatment unit or acute EVD. Late clearers had detectable EBOV RNA by rRT-PCR >1 year after discharge from the Ebola treatment unit or acute EVD. Retrospective histories of their EVD clinical course were collected by questionnaire, followed by complete physical examinations and blood work. RESULTS: Compared with early clearers, late clearers were older (median, 42.5 years; P < .001) and experienced fewer severe clinical symptoms (median 2, P = .006). Late clearers had more lens opacifications (odds ratio, 3.9 [95% confidence interval, 1.1-13.3]; P = .03), after accounting for age, higher total serum immunoglobulin G3 (IgG3) titers (P = .005), and increased expression of the HLA-C*03:04 allele (0.14 [.02-.70]; P = .007). CONCLUSIONS: Older age, decreased illness severity, elevated total serum IgG3 and HLA-C*03:04 allele expression may be risk factors for the persistence of EBOV in the semen of EVD survivors. EBOV persistence in semen may also be associated with its persistence in other immunologically protected sites, such as the eye.

    The Allelic Landscape of Human Blood Cell Trait Variation and Links to Common Complex Disease

    Many common variants have been associated with hematological traits, but identification of causal genes and pathways has proven challenging. We performed a genome-wide association analysis in the UK Biobank and INTERVAL studies, testing 29.5 million genetic variants for association with 36 red cell, white cell, and platelet properties in 173,480 European-ancestry participants. This effort yielded hundreds of low frequency (<5%) and rare (<1%) variants with a strong impact on blood cell phenotypes. Our data highlight general properties of the allelic architecture of complex traits, including the proportion of the heritable component of each blood trait explained by the polygenic signal across different genome regulatory domains. Finally, through Mendelian randomization, we provide evidence of shared genetic pathways linking blood cell indices with complex pathologies, including autoimmune diseases, schizophrenia, and coronary heart disease and evidence suggesting previously reported population associations between blood cell indices and cardiovascular disease may be non-causal.

    We thank members of the Cambridge BioResource Scientific Advisory Board and Management Committee for their support of our study and the National Institute for Health Research Cambridge Biomedical Research Centre for funding. K.D. is funded as an HSST trainee by NHS Health Education England. M.F. is funded from the BLUEPRINT Grant Code HEALTH-F5-2011-282510 and the BHF Cambridge Centre of Excellence [RE/13/6/30180]. J.R.S. is funded by an MRC CASE Industrial studentship, co-funded by Pfizer. J.D. is a British Heart Foundation Professor, European Research Council Senior Investigator, and National Institute for Health Research (NIHR) Senior Investigator. S.M., S.T., M.H., K.M. and L.D. are supported by the NIHR BioResource-Rare Diseases, which is funded by NIHR. Research in the Ouwehand laboratory is supported by program grants from the NIHR to W.H.O., the European Commission (HEALTH-F2-2012-279233), the British Heart Foundation (BHF) to W.J.A. and D.R. under numbers RP-PG-0310-1002 and RG/09/12/28096 and Bristol Myers-Squibb; the laboratory also receives funding from NHSBT. W.H.O. is an NIHR Senior Investigator. The INTERVAL academic coordinating centre receives core support from the UK Medical Research Council (G0800270), the BHF (SP/09/002), the NIHR and Cambridge Biomedical Research Centre, as well as grants from the European Research Council (268834), the European Commission Framework Programme 7 (HEALTH-F2-2012-279233), Merck and Pfizer. DJR and DA were supported by the NIHR Programme ‘Erythropoiesis in Health and Disease’ (Ref. NIHR-RP-PG-0310-1004). N.S. is supported by the Wellcome Trust (Grant Codes WT098051 and WT091310), the EU FP7 (EPIGENESYS Grant Code 257082 and BLUEPRINT Grant Code HEALTH-F5-2011-282510). The INTERVAL study is funded by NHSBT and has been supported by the NIHR-BTRU in Donor Health and Genomics at the University of Cambridge in partnership with NHSBT. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health of England or NHSBT. D.G. is supported by a “la Caixa”-Severo Ochoa pre-doctoral fellowship.

    Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci.

    We performed fine mapping of 39 established type 2 diabetes (T2D) loci in 27,206 cases and 57,574 controls of European ancestry. We identified 49 distinct association signals at these loci, including five mapping in or near KCNQ1. 'Credible sets' of the variants most likely to drive each distinct signal mapped predominantly to noncoding sequence, implying that association with T2D is mediated through gene regulation. Credible set variants were enriched for overlap with FOXA2 chromatin immunoprecipitation binding sites in human islet and liver cells, including at MTNR1B, where fine mapping implicated rs10830963 as driving T2D association. We confirmed that the T2D risk allele for this SNP increases FOXA2-bound enhancer activity in islet- and liver-derived cells. We observed allele-specific differences in NEUROD1 binding in islet-derived cells, consistent with evidence that the T2D risk allele increases islet MTNR1B expression. Our study demonstrates how integration of genetic and genomic information can define molecular mechanisms through which variants underlying association signals exert their effects on disease.

    May Measurement Month 2018: a pragmatic global screening campaign to raise awareness of blood pressure by the International Society of Hypertension

    AIMS: Raised blood pressure (BP) is the biggest contributor to mortality and disease burden worldwide and fewer than half of those with hypertension are aware of it. May Measurement Month (MMM) is a global campaign set up in 2017, to raise awareness of high BP and as a pragmatic solution to a lack of formal screening worldwide. The 2018 campaign was expanded, aiming to include more participants and countries. METHODS AND RESULTS: Eighty-nine countries participated in MMM 2018. Volunteers (≥18 years) were recruited through opportunistic sampling at a variety of screening sites. Each participant had three BP measurements and completed a questionnaire on demographic, lifestyle, and environmental factors. Hypertension was defined as a systolic BP ≥140 mmHg or diastolic BP ≥90 mmHg, or taking antihypertensive medication. In total, 74.9% of screenees provided three BP readings. Multiple imputation using chained equations was used to impute missing readings. 1 504 963 individuals (mean age 45.3 years; 52.4% female) were screened. After multiple imputation, 502 079 (33.4%) individuals had hypertension, of whom 59.5% were aware of their diagnosis and 55.3% were taking antihypertensive medication. Of those on medication, 60.0% were controlled and of all hypertensives, 33.2% were controlled. We detected 224 285 individuals with untreated hypertension and 111 214 individuals with inadequately treated (systolic BP ≥140 mmHg or diastolic BP ≥90 mmHg) hypertension. CONCLUSION: May Measurement Month expanded significantly compared with 2017, including more participants in more countries. The campaign identified over 335 000 adults with untreated or inadequately treated hypertension. In the absence of systematic screening programmes, MMM was effective at raising awareness at least among these individuals at risk.

    On Curling Numbers of Integer Sequences

    Given a finite nonempty sequence S of integers, write it as XY^k, where Y^k is a power of greatest exponent that is a suffix of S: this k is the curling number of S. The curling number conjecture is that if one starts with any initial sequence S, and extends it by repeatedly appending the curling number of the current sequence, the sequence will eventually reach 1. The conjecture remains open. In this paper we discuss the special case when S consists just of 2’s and 3’s. Even this case remains open, but we determine how far a sequence consisting of n 2’s and 3’s can extend before reaching a 1, conjecturally for n ≤ 80. We investigate several related combinatorial problems, such as finding c(n,k), the number of binary sequences of length n and curling number k, and t(n,i), the number of sequences of length n which extend for i steps before reaching a 1. A number of interesting combinatorial problems remain unsolved.
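
    The definition above can be made concrete with a small brute-force sketch. It is only an illustration of the definition, not the authors' method: the function names curling_number and extend_until_one and the max_steps cap are assumptions introduced here. The cap exists because the conjecture is open, so termination is not guaranteed in general.

```python
from typing import List, Sequence


def curling_number(seq: Sequence[int]) -> int:
    """Largest k such that seq = X + Y*k for some nonempty block Y."""
    if not seq:
        raise ValueError("the curling number is defined for nonempty sequences only")
    n = len(seq)
    best = 1  # any nonempty sequence is trivially X + Y*1
    for period in range(1, n // 2 + 1):
        block = list(seq[n - period:])  # candidate Y: the last `period` terms
        k = 1
        # Count how many consecutive copies of `block` end the sequence.
        while (k + 1) * period <= n and \
                list(seq[n - (k + 1) * period: n - k * period]) == block:
            k += 1
        best = max(best, k)
    return best


def extend_until_one(start: Sequence[int], max_steps: int = 10_000) -> List[int]:
    """Append curling numbers to `start` until a 1 is appended.

    `max_steps` is a hypothetical safety cap, not part of the definition.
    """
    seq = list(start)
    for _ in range(max_steps):
        k = curling_number(seq)
        seq.append(k)
        if k == 1:
            return seq
    raise RuntimeError("no 1 reached within max_steps")


print(curling_number([2, 3, 2, 3]))    # 2, since the suffix is (2, 3) repeated twice
print(extend_until_one([2, 3, 2, 3]))  # [2, 3, 2, 3, 2, 2, 2, 3, 1]
```

    Starting from 2, 3, 2, 3 the sketch appends 2, 2, 2, 3 and then 1. Each call scans every candidate block length, which is adequate for short sequences but far too slow for the kind of exhaustive search the abstract describes.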

    A Slow-Growing Sequence Defined by an Unusual Recurrence

    The sequence starts with a(1) = 1; to extend it one writes the sequence so far as XY^k, where X and Y are strings of integers, Y is nonempty and k is as large as possible: then the next term is k. The sequence begins 1, 1, 2, 1, 1, 2, 2, 2, 3, 1, 1, 2, 1, 1, 2, 2, 2, 3, 2, ... A 4 appears for the first time at position 220, but a 5 does not appear until about position 10^(10^23). The main result of the paper is a proof that the sequence is unbounded. We also present results from extensive numerical investigations of the sequence and of certain derived sequences, culminating with a heuristic argument that t (for t = 5, 6, ...) appears for the first time at about position 2 ↑ (2 ↑ (3 ↑ (4 ↑ (5 ↑ ... ↑ ((t − 2) ↑ (t − 1)))))), where ↑ denotes exponentiation. The final section discusses generalizations.
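
    The recurrence can be checked against the listed terms with a short brute-force sketch. The helper names largest_suffix_power and sequence_prefix are hypothetical, and the quadratic scan is an assumption made for clarity rather than the method used in the paper's numerical investigations.

```python
from typing import List


def largest_suffix_power(seq: List[int]) -> int:
    """Largest k such that seq = X + Y*k with Y nonempty."""
    n = len(seq)
    best = 1
    for period in range(1, n // 2 + 1):
        block = seq[n - period:]  # candidate Y: the last `period` terms
        k = 1
        # Count consecutive copies of `block` at the end of the sequence.
        while (k + 1) * period <= n and \
                seq[n - (k + 1) * period: n - k * period] == block:
            k += 1
        best = max(best, k)
    return best


def sequence_prefix(n_terms: int) -> List[int]:
    """First `n_terms` terms of the sequence, starting from a(1) = 1."""
    if n_terms < 1:
        return []
    seq = [1]
    while len(seq) < n_terms:
        seq.append(largest_suffix_power(seq))
    return seq


terms = sequence_prefix(220)
print(terms[:19])          # the 19 terms listed in the abstract
print(terms.index(4) + 1)  # 1-based position of the first 4
```

    This reproduces the opening terms and the first 4 at position 220. It cannot reach the first 5, which lies far beyond anything a direct computation could enumerate.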