An open dataset of Plasmodium falciparum genome variation in 7,000 worldwide samples.
MalariaGEN is a data-sharing network that enables groups around the world to work together on the genomic epidemiology of malaria. Here we describe a new release of curated genome variation data on 7,000 Plasmodium falciparum samples from MalariaGEN partner studies in 28 malaria-endemic countries. High-quality genotype calls on 3 million single nucleotide polymorphisms (SNPs) and short indels were produced using a standardised analysis pipeline. Copy number variants associated with drug resistance and structural variants that cause failure of rapid diagnostic tests were also analysed. Almost all samples showed genetic evidence of resistance to at least one antimalarial drug, and some samples from Southeast Asia carried markers of resistance to six commonly-used drugs. Genes expressed during the mosquito stage of the parasite life-cycle are prominent among loci that show strong geographic differentiation. By continuing to enlarge this open data resource we aim to facilitate research into the evolutionary processes affecting malaria control and to accelerate development of the surveillance toolkit required for malaria elimination.
Magpie: Online modelling and performance-aware systems
Understanding the performance of distributed systems requires correlation of thousands of interactions between numerous components, a task best left to a computer. Today’s systems provide voluminous traces from each component but do not synthesise the data into concise models of system performance. We argue that online performance modelling should be a ubiquitous operating system service and outline several uses including performance debugging, capacity planning, system tuning and anomaly detection. We describe the Magpie modelling service which collates detailed traces from multiple machines in an e-commerce site, extracts request-specific audit trails, and constructs probabilistic models of request behaviour. A feasibility study evaluates the approach using an offline demonstrator. Results show that the approach is promising, but that there are many challenges to building a truly ubiquitous, online modelling infrastructure.
USENIX Association
Proteomic and genomic analysis reveals novel Campylobacter jejuni outer membrane proteins and potential heterogeneity
Gram-negative bacterial outer membrane proteins play important roles in the interaction of bacteria with their environment including nutrient acquisition, adhesion and invasion, and antibiotic resistance. In this study we identified 47 proteins within the Sarkosyl-insoluble fraction of Campylobacter jejuni 81-176, using LC–ESI-MS/MS. Comparative analysis of outer membrane protein sequences was visualised to reveal protein distribution within a panel of Campylobacter spp., identifying several C. jejuni-specific proteins. Smith–Waterman analyses of C. jejuni homologues revealed high sequence conservation amongst a number of hypothetical proteins, sequence heterogeneity of other proteins and several proteins which are absent in a proportion of strains.
Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing.
Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for the large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short-term culture. Analysis of 86,158 exonic single nucleotide polymorphisms that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for the exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome.
The Quest for Orthologs orthology benchmark service in 2022.
The Orthology Benchmark Service (https://orthology.benchmarkservice.org) is the gold standard for orthology inference evaluation, supported and maintained by the Quest for Orthologs consortium. It is an essential resource to compare existing and new methods of orthology inference (the bedrock for many comparative genomics and phylogenetic analysis) over a standard dataset and through common procedures. The Quest for Orthologs Consortium is dedicated to maintaining the resource up to date, through regular updates of the Reference Proteomes and increasingly accessible data through the OpenEBench platform. For this update, we have added a new benchmark based on curated orthology assertion from the Vertebrate Gene Nomenclature Committee, and provided an example meta-analysis of the public predictions present on the platform.