    Declarative visitors to ease fine-grained source code mining with full history on billions of AST nodes

    Software repositories contain a vast wealth of information about software development. Mining these repositories has proven useful for detecting patterns in software development, testing hypotheses for new software engineering approaches, etc. Specifically, mining source code has yielded significant insights into software development artifacts and processes. Unfortunately, mining source code at a large scale remains a difficult task. Previous approaches had to either limit the scope of the projects studied, limit the scope of the mining task to be more coarse-grained, or sacrifice studying the history of the code due to both human and computational scalability issues. In this paper, we address the substantial challenges of mining source code: a) at a very large scale; b) at a fine-grained level of detail; and c) with full history information. To address these challenges, we present domain-specific language features for source code mining. Our language features are inspired by object-oriented visitors and provide a default depth-first traversal strategy along with two expressions for defining custom traversals. We provide an implementation of these features in the Boa infrastructure for software repository mining and describe a code generation strategy into Java code. To show the usability of our domain-specific language features, we reproduced over 40 source code mining tasks from two large-scale previous studies in just 2 person-weeks. The resulting code for these tasks shows a 2.0x to 4.8x reduction in code size. Finally, we perform a small controlled experiment to gain insights into how easily mining tasks written using our language features can be understood, with no prior training. We show that a substantial number of tasks (77%) were understood by study participants, in about 3 minutes per task.
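
The visitor idea in the abstract above (a default depth-first traversal that individual visit methods can override) can be illustrated with a minimal sketch. It uses Python's standard `ast.NodeVisitor` as a stand-in and is not Boa syntax; the class name `FunctionCollector`, the helper `collect_functions`, and the sample source are hypothetical, chosen only for illustration.

```python
import ast

# Hypothetical sample input: one function nested inside another.
SOURCE = """
def outer(x):
    def inner(y):
        return y + 1
    return inner(x)

def other():
    return 0
"""


class FunctionCollector(ast.NodeVisitor):
    """Collects function names while walking the AST depth-first."""

    def __init__(self, descend):
        self.descend = descend  # whether to look inside function bodies
        self.names = []

    def visit_FunctionDef(self, node):
        self.names.append(node.name)
        if self.descend:
            # Continue the default depth-first traversal into children.
            self.generic_visit(node)
        # Otherwise stop here: a custom traversal that skips nested functions.


def collect_functions(source, descend):
    collector = FunctionCollector(descend)
    collector.visit(ast.parse(source))
    return collector.names


all_functions = collect_functions(SOURCE, descend=True)
top_level_only = collect_functions(SOURCE, descend=False)
```

Node types without a dedicated visit method fall through to `generic_visit`, which is what gives the default depth-first behaviour; overriding one method is enough to change the traversal for that node type only.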

    COVID-19 vaccine strategies for Aotearoa New Zealand: a mathematical modelling study

    Summary: Background: COVID-19 elimination measures, including border closures, have been applied in New Zealand. We have modelled the potential effect of vaccination programmes for opening borders. Methods: We used a deterministic age-stratified Susceptible, Exposed, Infectious, Recovered (SEIR) model. We minimised spread by varying the age-stratified vaccine allocation to find the minimum herd immunity requirements (the effective reproduction number Reff<1 with closed borders) under various vaccine effectiveness (VE) scenarios and R0 values. We ran two-year open-border simulations for two vaccine strategies: minimising Reff and targeting high-risk groups. Findings: Targeting of high-risk groups will result in lower hospitalisations and deaths in most scenarios. Reaching the herd immunity threshold (HIT) with a vaccine of 90% VE against disease and 80% VE against infection requires at least 86.5% total population uptake for R0=4.5 (with high vaccination coverage for 30–49-year-olds) and 98.1% uptake for R0=6. In a two-year open-border scenario with 10 overseas cases daily and 90% total population vaccine uptake (including 0–15 year olds) with the same vaccine, the strategy of targeting high-risk groups is close to achieving HIT, with an estimated 11,400 total hospitalisations (peak 324 active and 36 new daily cases in hospitals), and 1,030 total deaths. Interpretation: Targeting high-risk groups for vaccination will result in fewer hospitalisations and deaths with open borders compared to targeting reduced transmission. With a highly effective vaccine and a high total uptake, opening borders will result in increasing cases, hospitalisations, and deaths. Other public health and social measures will still be required as part of an effective pandemic response. Funding: This project was funded by the Health Research Council [20/1018].
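
For readers unfamiliar with the model class named above, the following is a minimal sketch of a deterministic SEIR model. It is deliberately simplified and is not the study's model: it has a single age group, no vaccination and no imported cases, and the latent period, infectious period, initial infectious fraction and step size are hypothetical values chosen only for illustration.

```python
def simulate_seir(r0, days, latent_days=3.0, infectious_days=5.0,
                  initial_infectious=1e-4, dt=0.1):
    """Integrate a single-population SEIR model with forward Euler steps.

    Compartments are population fractions. All default parameter values
    are illustrative assumptions, not values taken from the study.
    """
    sigma = 1.0 / latent_days       # rate of leaving the exposed state
    gamma = 1.0 / infectious_days   # recovery rate
    beta = r0 * gamma               # transmission rate implied by R0

    s, e, i, r = 1.0 - initial_infectious, 0.0, initial_infectious, 0.0
    history = [(s, e, i, r)]
    for _ in range(int(round(days / dt))):
        new_exposed = beta * s * i * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


# Compare an outbreak that grows (R0 above 1) with one that dies out.
growing = simulate_seir(r0=4.5, days=365)[-1]
fading = simulate_seir(r0=0.8, days=365)[-1]
```

The contrast between the two runs shows why the abstract frames herd immunity as pushing the effective reproduction number below 1: above that threshold most of the population is eventually infected, below it the outbreak stays small.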

    Partially Annealed Disorder and Collapse of Like-Charged Macroions

    Charged systems with partially annealed charge disorder are investigated using field-theoretic and replica methods. Charge disorder is assumed to be confined to macroion surfaces surrounded by a cloud of mobile neutralizing counterions in an aqueous solvent. A general formalism is developed by assuming that the disorder is partially annealed (with purely annealed and purely quenched disorder included as special cases), i.e., we assume in general that the disorder undergoes a slow dynamics relative to fast-relaxing counterions, thus making it possible to study the stationary-state properties of the system using methods similar to those available in equilibrium statistical mechanics. By focusing on the specific case of two planar surfaces of equal mean surface charge and disorder variance, it is shown that partial annealing of the quenched disorder leads to renormalization of the mean surface charge density and thus a reduction of the inter-plate repulsion on the mean-field or weak-coupling level. In the strong-coupling limit, charge disorder induces a long-range attraction resulting in a continuous disorder-driven collapse transition for the two surfaces as the disorder variance exceeds a threshold value. Disorder annealing further enhances the attraction and, in the limit of low screening, leads to a global attractive instability in the system. Comment: 21 pages, 2 figures

    Design of Experiments for Screening

    The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered, including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed, including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently used exemplars, and their performances are compared.
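
Of the sampling plans listed above, Latin hypercube sampling is the simplest to show in code. The sketch below is a plain standard-library version on the unit hypercube, written for illustration rather than taken from the paper; the function name `latin_hypercube` and its parameters are hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in [0, 1)^n_dims as a list of lists.

    Each dimension is split into n_samples equal-width strata and every
    stratum is hit exactly once; strata are paired across dimensions by
    independent random shuffles.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniformly placed point inside each stratum of this dimension.
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(points)
        columns.append(points)
    # Transpose the per-dimension columns into per-sample rows.
    return [list(row) for row in zip(*columns)]


design = latin_hypercube(n_samples=5, n_dims=3, seed=1)
# Stratum index hit by each sample, per dimension.
strata = [sorted(int(point[d] * 5) for point in design) for d in range(3)]
```

Compared with independent uniform sampling, this guarantees that each input variable is covered evenly across its range even for a small number of runs, which is the property that makes it attractive for screening numerical models.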

    PPCAS: Implementation of a Probabilistic Pairwise Model for Consistency-Based Multiple Alignment in Apache Spark

    Large-scale data processing techniques, currently known as Big Data, are used to manage the huge amount of data generated by sequencers. Although these techniques have significant advantages, few biological applications have adopted them. In bioinformatics, Multiple Sequence Alignment (MSA) tools are widely applied for evolution and phylogenetic analysis, homology and domain structure prediction. Highly rated MSA tools, such as MAFFT, ProbCons and T-Coffee (TC), use probabilistic consistency as a step prior to the progressive alignment stage in order to improve the final accuracy. In this paper, a novel approach named PPCAS (Probabilistic Pairwise model for Consistency-based multiple alignment in Apache Spark) is presented. PPCAS is based on the MapReduce processing paradigm in order to enable large datasets to be processed, with the aim of improving the performance and scalability of the original algorithm. This work was supported by the MEyC-Spain [contract TIN2014-53234-C2-2-R].

    Length-Independent Charge Transport in Chimeric Molecular Wires

    Advanced molecular electronic components remain vital for the next generation of miniaturized integrated circuits. Thus, much research effort has been devoted to the discovery of lossless molecular wires, for which the charge transport rate or conductivity is not attenuated with length in the tunneling regime. Herein, we report the synthesis and electrochemical interrogation of DNA-like molecular wires. We determine that the rate of electron transfer through these constructs is independent of their length and propose a plausible mechanism to explain our findings. The reported approach holds relevance for the development of high-performance molecular electronic components and the fundamental study of charge transport phenomena in organic semiconductors.

    Multifragmentation of a very heavy nuclear system (II): bulk properties and spinodal decomposition

    The properties of fragments and light charged particles emitted in multifragmentation of single sources formed in central 36AMeV Gd+U collisions are reviewed. Most of the products are isotropically distributed in the reaction c.m. Fragment kinetic energies reveal the onset of radial collective energy. A bulk effect is experimentally evidenced from the similarity of the charge distribution with that from the lighter 32AMeV Xe+Sn system. Spinodal decomposition of finite nuclear matter exhibits the same property in simulated central collisions for the two systems, and appears therefore as a possible mechanism at the origin of multifragmentation in this incident energy domain. Comment: 28 pages including 14 figures; submitted to Nucl. Phys.

    The Apache Point Observatory Galactic Evolution Experiment (APOGEE) Spectrographs

    We describe the design and performance of the near-infrared (1.51--1.70 micron), fiber-fed, multi-object (300 fibers), high resolution (R = lambda/delta lambda ~ 22,500) spectrograph built for the Apache Point Observatory Galactic Evolution Experiment (APOGEE). APOGEE is a survey of ~ 10^5 red giant stars that systematically sampled all Milky Way populations (bulge, disk, and halo) to study the Galaxy's chemical and kinematical history. It was part of the Sloan Digital Sky Survey III (SDSS-III) from 2011 -- 2014 using the 2.5 m Sloan Foundation Telescope at Apache Point Observatory, New Mexico. The APOGEE-2 survey is now using the spectrograph as part of SDSS-IV, as well as a second spectrograph, a close copy of the first, operating at the 2.5 m du Pont Telescope at Las Campanas Observatory in Chile. Although several fiber-fed, multi-object, high resolution spectrographs have been built for visual wavelength spectroscopy, the APOGEE spectrograph is one of the first such instruments built for observations in the near-infrared. The instrument's successful development was enabled by several key innovations, including a "gang connector" to allow simultaneous connections of 300 fibers; hermetically sealed feedthroughs to allow fibers to pass through the cryostat wall continuously; the first cryogenically deployed mosaic volume phase holographic grating; and a large refractive camera that includes mono-crystalline silicon and fused silica elements with diameters as large as ~ 400 mm. This paper contains a comprehensive description of all aspects of the instrument including the fiber system, optics and opto-mechanics, detector arrays, mechanics and cryogenics, instrument control, calibration system, optical performance and stability, lessons learned, and design changes for the second instrument. Comment: 81 pages, 67 figures, PASP, accepted

    Recent advances in Leishmania reverse genetics: Manipulating a manipulative parasite

    In this review, we describe the expanding repertoire of molecular tools with which to study gene function in Leishmania. Specifically, we review the tools available for studying functions of essential genes, such as plasmid shuffle and DiCre, as well as the rapidly expanding portfolio of available CRISPR/Cas9 approaches for large-scale gene knockout and endogenous tagging. We include detail on approaches that allow the direct manipulation of RNA using RNAi and of protein levels via Tet- or DiCre-induced overexpression and destabilization-domain-mediated degradation. The utilisation of current methods and the development of more advanced molecular tools will lead to greater understanding of the role of essential genes in the parasite and to more robust drug target validation, paving the way for the development of novel therapeutics to treat this important disease.

    Global burden of disease due to rifampicin-resistant tuberculosis in 2020: a mathematical modelling analysis

    In 2020, almost half a million individuals developed rifampicin-resistant tuberculosis (RR-TB). We estimated the global burden of RR-TB over the lifetime of affected individuals. We synthesized data on incidence, case detection, and treatment outcomes in 192 countries (99.99% of global tuberculosis). Using a mathematical model, we projected disability-adjusted life years (DALYs) over the lifetime for individuals developing tuberculosis in 2020 stratified by country, age, sex, HIV, and rifampicin resistance. Here we show that incident RR-TB in 2020 was responsible for an estimated 6.9 (95% uncertainty interval: 5.5, 8.5) million DALYs, 44% (31, 54) of which accrued among TB survivors. We estimated an average of 17 (14, 21) DALYs per person developing RR-TB, 34% (12, 56) greater than for rifampicin-susceptible tuberculosis. RR-TB burden per 100,000 was highest in former Soviet Union countries and southern African countries. While RR-TB causes substantial short-term morbidity and mortality, nearly half of the overall disease burden of RR-TB accrues among tuberculosis survivors. The substantial long-term health impacts among those surviving RR-TB disease suggest the need for improved post-treatment care and further justify increased health expenditures to prevent RR-TB transmission.