138 research outputs found

    Evolution of communities of software: using tensor decompositions to compare software ecosystems

    © 2019 The Authors. Published by Springer. This is an open access article available under a Creative Commons licence. The published version can be accessed at the following link on the publisher’s website: https://doi.org/10.1007/s41109-019-0193-5 Modern software development is often a collaborative effort involving many authors through the re-use and sharing of code through software libraries. Modern software “ecosystems” are complex socio-technical systems which can be represented as a multilayer dynamic network. Many of these libraries and software packages are open-source and developed in the open on sites such as , so there is a large amount of data available about these networks. Studying these networks could be of interest to anyone choosing or designing a programming language. In this work, we use tensor factorisation to explore the dynamics of communities of software, and then compare these dynamics between languages on a dataset of approximately 1 million software projects. We hope to be able to inform the debate on software dependencies that has been recently re-ignited by the malicious takeover of the npm package and other incidents through giving a clearer picture of the structure of software dependency networks, and by exploring how the choices of language designers (for example, in the size of standard libraries, or the standards to which packages are held before admission to a language ecosystem is granted) may have shaped their language ecosystems. We establish that adjusted mutual information is a valid metric by which to assess the number of communities in a tensor decomposition and find that there are striking differences between the communities found across different software ecosystems and that communities do experience large and interpretable changes in activity over time.
The differences between the elm and R software ecosystems, which see some communities decline over time, and the more conventional software ecosystems of Python, Java and JavaScript, which do not see many declining communities, are particularly marked. OAB’s work was supported as part of an Engineering and Physical Sciences Research Council (EPSRC) grant, project reference EP/I028099/1. Published version

    Sequencing three crocodilian genomes to illuminate the evolution of archosaurs and amniotes

    The International Crocodilian Genomes Working Group (ICGWG) will sequence and assemble the American alligator (Alligator mississippiensis), saltwater crocodile (Crocodylus porosus) and Indian gharial (Gavialis gangeticus) genomes. The status of these projects and our planned analyses are described.

    Human leukocyte antigen alleles associate with COVID-19 vaccine immunogenicity and risk of breakthrough infection.

    Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) vaccine immunogenicity varies between individuals, and immune responses correlate with vaccine efficacy. Using data from 1,076 participants enrolled in ChAdOx1 nCov-19 vaccine efficacy trials in the United Kingdom, we found that inter-individual variation in normalized antibody responses against SARS-CoV-2 spike and its receptor-binding domain (RBD) at 28 days after first vaccination shows genome-wide significant association with major histocompatibility complex (MHC) class II alleles. The most statistically significant association with higher levels of anti-RBD antibody was HLA-DQB1*06 (P = 3.2 × 10⁻⁹), which we replicated in 1,677 additional vaccinees. Individuals carrying HLA-DQB1*06 alleles were less likely to experience PCR-confirmed breakthrough infection during the ancestral SARS-CoV-2 virus and subsequent Alpha variant waves compared to non-carriers (hazard ratio = 0.63, 0.42-0.93, P = 0.02). We identified a distinct spike-derived peptide that is predicted to bind differentially to HLA-DQB1*06 compared to other similar alleles, and we found evidence of increased spike-specific memory B cell responses in HLA-DQB1*06 carriers at 84 days after first vaccination. Our results demonstrate association of HLA type with Coronavirus Disease 2019 (COVID-19) vaccine antibody response and risk of breakthrough infection, with implications for future vaccine design and implementation.

    Multiple novel prostate cancer susceptibility signals identified by fine-mapping of known risk loci among Europeans

    Genome-wide association studies (GWAS) have identified numerous common prostate cancer (PrCa) susceptibility loci. We have fine-mapped 64 GWAS regions known at the conclusion of the iCOGS study using large-scale genotyping and imputation in 25 723 PrCa cases and 26 274 controls of European ancestry. We detected evidence for multiple independent signals at 16 regions, 12 of which contained additional newly identified significant associations. A single signal comprising a spectrum of correlated variation was observed at 39 regions, 35 of which are now described by a novel more significantly associated lead SNP, while the originally reported variant remained as the lead SNP only in 4 regions. We also confirmed two association signals in Europeans that had been previously reported only in East-Asian GWAS. Based on statistical evidence and linkage disequilibrium (LD) structure, we have curated and narrowed down the list of the most likely candidate causal variants for each region. Functional annotation using data from ENCODE filtered for PrCa cell lines and eQTL analysis demonstrated significant enrichment for overlap with bio-features within this set. By incorporating the novel risk variants identified here alongside the refined data for existing association signals, we estimate that these loci now explain ∼38.9% of the familial relative risk of PrCa, an 8.9% improvement over the previously reported GWAS tag SNPs. This suggests that a significant fraction of the heritability of PrCa may have been hidden during the discovery phase of GWAS, in particular due to the presence of multiple independent signals within the same region.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed

    Analysis of shared heritability in common disorders of the brain

    Science, this issue p. eaap8757. Structured Abstract. INTRODUCTION Brain disorders may exhibit shared symptoms and substantial epidemiological comorbidity, inciting debate about their etiologic overlap. However, detailed study of phenotypes with different ages of onset, severity, and presentation poses a considerable challenge. Recently developed heritability methods allow us to accurately measure correlation of genome-wide common variant risk between two phenotypes from pools of different individuals and assess how connected they, or at least their genetic risks, are on the genomic level. We used genome-wide association data for 265,218 patients and 784,643 control participants, as well as 17 phenotypes from a total of 1,191,588 individuals, to quantify the degree of overlap for genetic risk factors of 25 common brain disorders. RATIONALE Over the past century, the classification of brain disorders has evolved to reflect the medical and scientific communities' assessments of the presumed root causes of clinical phenomena such as behavioral change, loss of motor function, or alterations of consciousness. Directly observable phenomena (such as the presence of emboli, protein tangles, or unusual electrical activity patterns) generally define and separate neurological disorders from psychiatric disorders. Understanding the genetic underpinnings and categorical distinctions for brain disorders and related phenotypes may inform the search for their biological mechanisms. RESULTS Common variant risk for psychiatric disorders was shown to correlate significantly, especially among attention deficit hyperactivity disorder (ADHD), bipolar disorder, major depressive disorder (MDD), and schizophrenia. By contrast, neurological disorders appear more distinct from one another and from the psychiatric disorders, except for migraine, which was significantly correlated to ADHD, MDD, and Tourette syndrome.
We demonstrate that, in the general population, the personality trait neuroticism is significantly correlated with almost every psychiatric disorder and migraine. We also identify significant genetic sharing between disorders and early life cognitive measures (e.g., years of education and college attainment) in the general population, demonstrating positive correlation with several psychiatric disorders (e.g., anorexia nervosa and bipolar disorder) and negative correlation with several neurological phenotypes (e.g., Alzheimer's disease and ischemic stroke), even though the latter are considered to result from specific processes that occur later in life. Extensive simulations were also performed to inform how statistical power, diagnostic misclassification, and phenotypic heterogeneity influence genetic correlations. CONCLUSION The high degree of genetic correlation among many of the psychiatric disorders adds further evidence that their current clinical boundaries do not reflect distinct underlying pathogenic processes, at least on the genetic level. This suggests a deeply interconnected nature for psychiatric disorders, in contrast to neurological disorders, and underscores the need to refine psychiatric diagnostics. Genetically informed analyses may provide important "scaffolding" to support such restructuring of psychiatric nosology, which likely requires incorporating many levels of information. By contrast, we find limited evidence for widespread common genetic risk sharing among neurological disorders or across neurological and psychiatric disorders. We show that both psychiatric and neurological disorders have robust correlations with cognitive and personality measures. Further study is needed to evaluate whether overlapping genetic contributions to psychiatric pathology may influence treatment choices. Ultimately, such developments may pave the way toward reduced heterogeneity and improved diagnosis and treatment of psychiatric disorders

    Mortality and pulmonary complications in patients undergoing surgery with perioperative SARS-CoV-2 infection: an international cohort study

    Background: The impact of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) on postoperative recovery needs to be understood to inform clinical decision making during and after the COVID-19 pandemic. This study reports 30-day mortality and pulmonary complication rates in patients with perioperative SARS-CoV-2 infection. Methods: This international, multicentre, cohort study at 235 hospitals in 24 countries included all patients undergoing surgery who had SARS-CoV-2 infection confirmed within 7 days before or 30 days after surgery. The primary outcome measure was 30-day postoperative mortality and was assessed in all enrolled patients. The main secondary outcome measure was pulmonary complications, defined as pneumonia, acute respiratory distress syndrome, or unexpected postoperative ventilation. Findings: This analysis includes 1128 patients who had surgery between Jan 1 and March 31, 2020, of whom 835 (74·0%) had emergency surgery and 280 (24·8%) had elective surgery. SARS-CoV-2 infection was confirmed preoperatively in 294 (26·1%) patients. 30-day mortality was 23·8% (268 of 1128). Pulmonary complications occurred in 577 (51·2%) of 1128 patients; 30-day mortality in these patients was 38·0% (219 of 577), accounting for 81·7% (219 of 268) of all deaths. In adjusted analyses, 30-day mortality was associated with male sex (odds ratio 1·75 [95% CI 1·28–2·40], p<0·0001), age 70 years or older versus younger than 70 years (2·30 [1·65–3·22], p<0·0001), American Society of Anesthesiologists grades 3–5 versus grades 1–2 (2·35 [1·57–3·53], p<0·0001), malignant versus benign or obstetric diagnosis (1·55 [1·01–2·39], p=0·046), emergency versus elective surgery (1·67 [1·06–2·63], p=0·026), and major versus minor surgery (1·52 [1·01–2·31], p=0·047). Interpretation: Postoperative pulmonary complications occur in half of patients with perioperative SARS-CoV-2 infection and are associated with high mortality.
Thresholds for surgery during the COVID-19 pandemic should be higher than during normal practice, particularly in men aged 70 years and older. Consideration should be given for postponing non-urgent procedures and promoting non-operative treatment to delay or avoid the need for surgery. Funding: National Institute for Health Research (NIHR), Association of Coloproctology of Great Britain and Ireland, Bowel and Cancer Research, Bowel Disease Research Foundation, Association of Upper Gastrointestinal Surgeons, British Association of Surgical Oncology, British Gynaecological Cancer Society, European Society of Coloproctology, NIHR Academy, Sarcoma UK, Vascular Society for Great Britain and Ireland, and Yorkshire Cancer Research

    Models of classroom assessment for course-based research experiences

    Course-based research pedagogy involves positioning students as contributors to authentic research projects as part of an engaging educational experience that promotes their learning and persistence in science. To develop a model for assessing and grading students engaged in this type of learning experience, the assessment aims and practices of a community of experienced course-based research instructors were collected and analyzed. This approach defines four aims of course-based research assessment—(1) Assessing Laboratory Work and Scientific Thinking; (2) Evaluating Mastery of Concepts, Quantitative Thinking and Skills; (3) Appraising Forms of Scientific Communication; and (4) Metacognition of Learning—along with a set of practices for each aim. These aims and practices of assessment were then integrated with previously developed models of course-based research instruction to reveal an assessment program in which instructors provide extensive feedback to support productive student engagement in research while grading those aspects of research that are necessary for the student to succeed. Assessment conducted in this way delicately balances the need to facilitate students’ ongoing research with the requirement of a final grade without undercutting the important aims of a CRE education