    Slicing: A sustainable approach to structuring samples for analysis in long-term studies

    The longitudinal study of populations is a core tool for understanding ecological and evolutionary processes. Long‐term studies typically collect samples repeatedly over individual lifetimes and across generations. These samples are then analysed in batches (e.g. qPCR plates) and clusters (i.e. groups of batches) over time in the laboratory. However, these analyses are constrained by cross‐classified data structures introduced biologically or through experimental design. The separation of biological variation from the confounding among‐batch and among‐cluster variation is crucial, yet often ignored. The commonly used approaches to structuring samples for analysis, sequential and randomized allocation, generate bias due to the non‐independence between time of collection and the batch and cluster they are analysed in. We propose a new sample structuring strategy, called slicing, designed to separate confounding among‐batch and among‐cluster variation from biological variation. Through simulations, we tested the statistical power and precision of this novel approach to detect within‐individual, between‐individual, year and cohort effects. Our slicing approach, whereby recently and previously collected samples are sequentially analysed in clusters together, enables the statistical separation of collection time and cluster effects by bridging clusters together, for which we provide a case study. Our simulations show, with reasonable slicing width and angle, similar precision and similar or greater statistical power to detect year, cohort, within‐ and between‐individual effects when samples are sliced across batches, compared with strategies that aggregate longitudinal samples or use randomized allocation. While the best approach to analysing long‐term datasets depends on the structure of the data and questions of interest, it is vital to account for confounding among‐cluster and batch variation. Our slicing approach is simple to apply and creates the necessary statistical independence of batch and cluster from environmental or biological variables of interest. Crucially, it allows sequential analysis of samples and flexible inclusion of current data in later analyses without completely confounding the analysis. Our approach maximizes the scientific value of every sample, as each will optimally contribute to unbiased statistical inference from the data. Slicing thereby maximizes the power of growing biobanks to address important ecological, epidemiological and evolutionary questions.
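
    A minimal sketch of the core idea, assuming hypothetical sample identifiers, batch size and a 50/50 split (this is an illustration of the allocation principle, not the authors' implementation): each analytical batch combines recently collected samples with archived samples from earlier collection periods, so that collection year is not completely confounded with batch or cluster.

        # Hypothetical "slicing" allocation sketch: each qPCR batch mixes newly
        # collected samples with archived samples from earlier years, so that
        # collection year and batch remain statistically separable.
        import random

        def slice_into_batches(archived, recent, batch_size=96, recent_fraction=0.5, seed=1):
            """Allocate samples to batches, each bridging old and new collections."""
            rng = random.Random(seed)
            archived, recent = list(archived), list(recent)
            rng.shuffle(archived)
            rng.shuffle(recent)
            n_recent = int(batch_size * recent_fraction)
            n_old = batch_size - n_recent
            batches = []
            while recent and archived:
                batch = recent[:n_recent] + archived[:n_old]
                recent, archived = recent[n_recent:], archived[n_old:]
                rng.shuffle(batch)  # randomise positions within the batch
                batches.append(batch)
            return batches

        # Example with purely illustrative sample IDs tagged by collection year
        archived = [f"2015_{i}" for i in range(200)] + [f"2018_{i}" for i in range(200)]
        recent = [f"2023_{i}" for i in range(200)]
        for b, batch in enumerate(slice_into_batches(archived, recent)):
            print(f"batch {b}: {len(batch)} samples, years present:",
                  sorted({s.split('_')[0] for s in batch}))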

    Extra-pair parentage and personality in a cooperatively breeding bird

    Why so much variation in extra-pair parentage occurs within and among populations remains unclear. Often the fitness costs and benefits of extra-pair parentage are hypothesised to explain its occurrence; therefore, linking extra-pair parentage with traits such as personality (behavioural traits that can be heritable and affect reproductive behaviour) may help our understanding. Here, we investigate whether reproductive outcomes and success are associated with exploratory behaviour in a natural population of cooperatively breeding Seychelles warblers (Acrocephalus sechellensis) on Cousin Island. Exploratory behaviour correlates positively with traits such as risk-taking behaviour and activity in other wild bird species and might promote extra-pair mating by increasing the rate at which potential extra-pair partners are encountered. We therefore predicted that fast-exploring individuals would have more extra-pair offspring. There is also a potential trade-off between pursuing extra-pair parentage and mate guarding in males. We therefore also predicted that fast-exploring males would be more likely to pursue extra-pair parentage and that this would increase the propensity of their mate to gain extra-pair parentage. We found that neither the total number of offspring nor the number of extra-pair offspring was associated with a male’s or female’s exploratory behaviour. However, there was a small but significant propensity for females to have extra-pair fertilisations in pairs that were behaviourally disassortative. Overall, we conclude that, due to the small effect size, the association between exploratory behaviour and extra-pair paternity is unlikely to be biologically relevant. Significance statement: True genetic monogamy is rare, even in socially monogamous systems, and multiple factors determined by the biological system, such as behaviour, social structure, morphology and physiology, can cause variation in extra-pair parentage (EPP). Therefore, investigating the inherent differences in these factors among individuals could be informative. We investigated whether reproductive outcomes/success are associated with differences in the propensity to explore novel environments/objects in a promiscuous, island-dwelling, cooperatively breeding bird, the Seychelles warbler. Our results showed that exploratory behaviour was not associated with the number of offspring produced by an individual, and thus the long-term fitness consequences of different exploratory tendencies did not differ. We also found that the propensity to engage in EPP in females was higher in dissimilar behavioural pairs, but due to the small effect size, we hesitate to conclude that there are personality-dependent mating outcomes in the population.

    Provision and use of GPU resources for distributed workloads via the Grid

    The Queen Mary University of London WLCG Tier-2 Grid site has been providing GPU resources on the Grid since 2016. GPUs are an important modern tool to assist in data analysis. They have historically been used to accelerate computationally expensive but parallelisable workloads using frameworks such as OpenCL and CUDA. However, more recently their power in accelerating machine learning, using libraries such as TensorFlow and Caffe, has come to the fore and the demand for GPU resources has increased. Significant effort is being spent in high energy physics to investigate and use machine learning to enhance the analysis of data. GPUs may also provide part of the solution to the compute challenge of the High Luminosity LHC. The motivation for providing GPU resources via the Grid is presented. The installation and configuration of the SLURM batch system together with Compute Elements (CREAM and ARC) for use with GPUs is shown. Real-world use cases are presented and the successes and issues discovered are discussed.
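
    As an illustration of what a Grid payload might do once it lands on a GPU-enabled SLURM worker node, the sketch below checks which devices the batch system has exposed to the job. The environment variable usage and the TensorFlow call are assumptions for illustration, not the site's actual configuration.

        # Hypothetical check a Grid job might run on a SLURM-allocated GPU node.
        # Variable names and the TensorFlow probe are illustrative only.
        import os

        def report_gpu_allocation():
            # SLURM's GRES/cgroup plugins typically restrict the devices a job
            # can see via CUDA_VISIBLE_DEVICES.
            visible = os.environ.get("CUDA_VISIBLE_DEVICES", "<not set>")
            print("CUDA_VISIBLE_DEVICES:", visible)
            try:
                import tensorflow as tf  # only present if the payload provides it
                gpus = tf.config.list_physical_devices("GPU")
                print(f"TensorFlow sees {len(gpus)} GPU(s):", [g.name for g in gpus])
            except ImportError:
                print("TensorFlow not available in this job environment")

        if __name__ == "__main__":
            report_gpu_allocation()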

    Using Lustre and Slurm to process Hadoop workloads and extending to the WLCG

    The Queen Mary University of London Grid site has investigated the use of its Lustre file system to support Hadoop workflows. Lustre is an open-source, POSIX-compatible, clustered file system often used in high performance computing clusters and is often paired with the Slurm batch system. Hadoop is an open-source software framework for distributed storage and processing of data, normally run on dedicated hardware utilising the HDFS file system and the YARN batch system. Hadoop is an important modern tool for data analytics used by a wide range of organisations, including CERN. By using our existing Lustre file system and Slurm batch system, the need for dedicated hardware is removed and only a single platform has to be maintained for data storage and processing. The motivation and benefits of using Hadoop with Lustre and Slurm are presented. The installation, benchmarks, limitations and future plans are discussed. We also investigate using the standard WLCG Grid middleware CREAM-CE service to provide a Grid-enabled Hadoop service.
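
    Because Lustre is POSIX compatible, a MapReduce-style job can read its input directly from the shared file system rather than from HDFS. The minimal, hypothetical Hadoop Streaming word-count script below illustrates the idea; the file:// paths on a Lustre mount and the jar invocation in the comment are assumptions, not the site's actual setup.

        # Hypothetical Hadoop Streaming word count (mapper and reducer in one
        # script, selected by the first argument). With Lustre mounted on every
        # node, input/output can be plain file:// paths instead of hdfs:// ones,
        # e.g. (illustrative only):
        #   hadoop jar hadoop-streaming.jar \
        #     -input file:///lustre/data/logs -output file:///lustre/data/wc \
        #     -mapper "wordcount.py map" -reducer "wordcount.py reduce" \
        #     -file wordcount.py
        import sys

        def run_mapper():
            for line in sys.stdin:
                for word in line.split():
                    print(f"{word}\t1")

        def run_reducer():
            # Hadoop sorts by key before the reducer, so identical keys arrive together.
            current, count = None, 0
            for line in sys.stdin:
                word, n = line.rsplit("\t", 1)
                if word != current:
                    if current is not None:
                        print(f"{current}\t{count}")
                    current, count = word, 0
                count += int(n)
            if current is not None:
                print(f"{current}\t{count}")

        if __name__ == "__main__":
            run_mapper() if sys.argv[1] == "map" else run_reducer()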

    IPv6-only networking on WLCG

    The use of IPv6 on the general Internet continues to grow. The transition of the Worldwide LHC Computing Grid (WLCG) central and storage services to dual-stack IPv6/IPv4 is progressing well, thus enabling the use of IPv6-only CPU resources as agreed by the WLCG Management Board and presented by us at earlier CHEP conferences. During the last year, the HEPiX IPv6 Working Group has continued to chase and support the transition to dual-stack services. We present the status of the transition and some tests made of IPv6-only CPU resources, showing the successful use of IPv6 protocols in accessing WLCG services. The dual-stack deployment does, however, result in a networking environment which is more complex than when using just IPv6. The group is investigating the removal of the IPv4 protocol in places. We present the areas where this could be useful, together with our future plans.
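
    A hedged sketch of the kind of probe such IPv6-only tests rely on: resolve a service name to AAAA records only and attempt a TCP connection over IPv6. The host name and port below are placeholders, not actual WLCG endpoints.

        # Hypothetical IPv6-only reachability probe: restrict resolution to AAAA
        # records and open a TCP connection over IPv6. Host/port are placeholders.
        import socket

        def reachable_over_ipv6(host, port, timeout=5.0):
            try:
                # AF_INET6 limits resolution to IPv6 (AAAA) addresses only.
                infos = socket.getaddrinfo(host, port, socket.AF_INET6, socket.SOCK_STREAM)
            except socket.gaierror:
                return False  # no AAAA record published for this service
            for family, socktype, proto, _, addr in infos:
                try:
                    with socket.socket(family, socktype, proto) as s:
                        s.settimeout(timeout)
                        s.connect(addr)
                        return True
                except OSError:
                    continue
            return False

        if __name__ == "__main__":
            # Placeholder endpoint; 1094 is the conventional xrootd port.
            print(reachable_over_ipv6("storage.example.org", 1094))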

    IPv6 in production: its deployment and usage in WLCG

    The fraction of general internet traffic carried over IPv6 continues to grow rapidly. The transition of WLCG central and storage services to dual-stack IPv4/IPv6 is progressing well, thus enabling the use of IPv6-only CPU resources as agreed by the WLCG Management Board and presented by us at CHEP2016. By April 2018, all WLCG Tier-1 data centres should have provided access to their services over IPv6. The LHC experiments have requested all WLCG Tier-2 centres to provide dual-stack access to their storage by the end of LHC Run 2. This paper reviews the status of IPv6 deployment in WLCG.