
    Statistical Inference for Valued-Edge Networks: Generalized Exponential Random Graph Models

    Across the sciences, the statistical analysis of networks is central to the production of knowledge on relational phenomena. Because of their ability to model the structural generation of networks, exponential random graph models are a ubiquitous means of analysis. However, they are limited by an inability to model networks with valued edges. We solve this problem by introducing a class of generalized exponential random graph models capable of modeling networks whose edges are valued, thus greatly expanding the scope of networks that applied researchers can subject to statistical analysis.
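
    As a rough illustration of the exponential-family machinery these models build on, the sketch below enumerates every undirected graph on three nodes and computes P(G) ∝ exp(θ·s(G)) exactly. The chosen statistics (edge and triangle counts) and θ values are arbitrary toy choices, not the paper's specification; the generalized models replace this sum over binary graphs with an integral over valued edge weights, which the sketch does not attempt.

```python
# Toy exact computation of an exponential random graph model:
#   P(G) = exp(theta . s(G)) / Z,  Z = sum over all graphs G' of exp(theta . s(G'))
# Enumerated for the 2^3 undirected graphs on 3 nodes; theta and the chosen
# statistics are illustrative, not the paper's specification.
import itertools
import math

possible_edges = list(itertools.combinations(range(3), 2))  # the 3 dyads

def stats(edge_set):
    """Sufficient statistics: (edge count, triangle count)."""
    n_edges = len(edge_set)
    n_triangles = 1 if n_edges == 3 else 0  # 3 nodes admit a single triangle
    return (n_edges, n_triangles)

theta = (-0.5, 1.0)  # toy parameters: sparsity penalty, triangle bonus

graphs = []
for k in range(len(possible_edges) + 1):
    for edge_set in itertools.combinations(possible_edges, k):
        s = stats(edge_set)
        graphs.append((edge_set, math.exp(theta[0] * s[0] + theta[1] * s[1])))

Z = sum(w for _, w in graphs)  # normalising constant; intractable for large graphs
for edge_set, w in graphs:
    print(f"edges={edge_set}: P={w / Z:.3f}")
```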

    Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants

    Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
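
    A minimal simulation can make the masking effect concrete. The sketch below, with entirely illustrative allele frequencies, effect sizes, and sample size, contrasts a burden test (pooling minor-allele counts) with Bonferroni-corrected single-variant Armitage trend tests when one rare variant is deleterious and one is protective; the trend statistic is computed via the standard N·r² identity rather than any method-specific weighting from the paper.

```python
# Toy contrast of a pooling (burden) test with Bonferroni-corrected
# single-variant Armitage trend tests. One rare risk and one rare protective
# variant, so pooling partially cancels the signal. All parameter values
# (MAFs, effect sizes, n) are illustrative, not from the paper.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 5000
mafs = np.array([0.01, 0.01])
betas = np.array([1.0, -1.0])            # log-odds: risk (+) and protective (-)

G = rng.binomial(2, mafs, size=(n, 2)).astype(float)   # additive genotype coding
logit = -2.0 + G @ betas                               # baseline prevalence ~12%
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

def trend_p(score, y):
    """Armitage trend test via the identity: statistic = N * corr(score, y)^2."""
    r = np.corrcoef(score, y)[0, 1]
    return chi2.sf(len(y) * r**2, df=1)

p_burden = trend_p(G.sum(axis=1), y)                   # pooled minor-allele count
p_single = [trend_p(G[:, j], y) for j in range(G.shape[1])]
p_bonferroni = min(1.0, min(p_single) * G.shape[1])

print(f"burden test:                 p = {p_burden:.4f}")
print(f"best single variant (Bonf.): p = {p_bonferroni:.4f}")
```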

    How Does Spatial Study Design Influence Density Estimates from Spatial Capture-Recapture Models?

    When estimating population density from data collected on non-invasive detector arrays, recently developed spatial capture-recapture (SCR) models present an advance over non-spatial models by accounting for individual movement. While these models should be more robust to changes in trapping designs, they have not been well tested. Here we investigate how the spatial arrangement and size of the trapping array influence parameter estimates for SCR models. We analysed black bear data collected with 123 hair snares with an SCR model accounting for differences in detection and movement between the sexes and across the trapping occasions. To see how the size of the trap array and trap dispersion influence parameter estimates, we repeated the analysis for data from subsets of traps: 50% chosen at random, 50% in the centre of the array, and 20% in the south of the array. Additionally, we simulated and analysed data under a suite of trap designs and home range sizes. In the black bear study, we found that results were similar across trap arrays, except when only 20% of the array was used. Black bear density was approximately 10 individuals per 100 km². Our simulation study showed that SCR models performed well as long as the extent of the trap array was similar to or larger than the extent of individual movement during the study period, and movement was at least half the distance between traps. SCR models performed well across a range of spatial trap setups and animal movements. Contrary to non-spatial capture-recapture models, they do not require the trapping grid to cover an area several times the average home range of the studied species. This renders SCR models more appropriate for the study of wide-ranging mammals and more flexible for designing studies targeting multiple species.
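
    The encounter model underlying most SCR analyses is a half-normal detection function, p_ij = g0·exp(−d_ij²/(2σ²)), where d_ij is the distance between an animal's activity centre and a trap. The sketch below simulates detections on a trap grid under this model; all parameter values (g0, σ, abundance, grid spacing) are illustrative stand-ins, not the black bear estimates.

```python
# Simulation under the standard SCR half-normal encounter model:
#   p_ij = g0 * exp(-d_ij**2 / (2 * sigma**2))
# for animal i at trap j. Parameter values (g0, sigma, N, grid) are
# illustrative stand-ins, not the black bear estimates.
import numpy as np

rng = np.random.default_rng(7)
g0, sigma, K = 0.1, 1.0, 5           # baseline detection, movement scale, occasions

# 7 x 7 trap grid with unit spacing; the paper's simulations suggest movement
# (sigma) should be at least half the trap spacing for reliable estimates
gx, gy = np.meshgrid(np.arange(7.0), np.arange(7.0))
traps = np.column_stack([gx.ravel(), gy.ravel()])

# Activity centres uniform on a state space buffering the grid by 3 * sigma
N = 40
centres = rng.uniform(-3.0, 9.0, size=(N, 2))

d = np.linalg.norm(centres[:, None, :] - traps[None, :, :], axis=2)  # N x traps
p = g0 * np.exp(-d**2 / (2 * sigma**2))   # per-occasion detection probability
counts = rng.binomial(K, p)               # detection histories over K occasions

detected = int((counts.sum(axis=1) > 0).sum())
print(f"{detected} of {N} simulated animals detected at least once")
```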

    Using combined diagnostic test results to hindcast trends of infection from cross-sectional data

    Infectious disease surveillance is key to limiting the consequences of infectious pathogens and to maintaining animal and public health. Following the detection of a disease outbreak, a response in proportion to the severity of the outbreak is required. It is thus critical to obtain accurate information concerning the origin of the outbreak and its forward trajectory. However, there is often a lack of situational awareness that may lead to over- or under-reaction. There is a widening range of tests available for detecting pathogens, with typically different temporal characteristics, e.g. in terms of when the peak test response occurs relative to the time of exposure. We have developed a statistical framework that combines response-level data from multiple diagnostic tests and is able to ‘hindcast’ (infer the historical trend of) an infectious disease epidemic. Assuming diagnostic test data from a cross-sectional sample of individuals infected with a pathogen during an outbreak, we use a Bayesian Markov chain Monte Carlo (MCMC) approach to estimate each individual's time of exposure and the overall epidemic trend in the population prior to the time of sampling. We evaluate the performance of this statistical framework on data simulated from epidemic trend curves and show that we can recover the parameter values of those trends. We also apply the framework to epidemic trend curves taken from two historical outbreaks: a bluetongue outbreak in cattle and a whooping cough outbreak in humans. Together, these results show that hindcasting can estimate the time since infection for individuals, provide accurate estimates of epidemic trends, and distinguish whether an outbreak is increasing or past its peak. We conclude that if the temporal characteristics of the diagnostics are known, it is possible to recover the epidemic trends of both human and animal pathogens from cross-sectional data collected at a single point in time.
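
    To convey the core idea in miniature: under exponential epidemic growth at rate r, the times since infection of individuals sampled in a single cross-section follow an approximately exponential density with rate r, so a posterior over r recovers the trend. The sketch below runs a random-walk Metropolis sampler on r from noisy time-since-infection estimates; the paper's framework combines response levels from multiple diagnostic tests and handles declining epidemics, which this single-test, growth-only stand-in deliberately omits.

```python
# Simplified hindcasting sketch: under exponential epidemic growth at rate r,
# times since infection tau in a single cross-section have density
# r * exp(-r * tau) / (1 - exp(-r * T)) on [0, T]. A random-walk Metropolis
# sampler recovers r from noisy test-derived tau estimates. The paper combines
# response-level data from multiple diagnostics; this single-test stand-in,
# restricted to growing epidemics (r > 0), is purely illustrative.
import numpy as np

rng = np.random.default_rng(3)
r_true, T = 0.15, 60.0                      # growth rate per day, sampling window

tau = rng.exponential(1 / r_true, size=300)
tau = tau[tau < T]                          # infections within the window
tau_hat = np.maximum(tau + rng.normal(0.0, 2.0, size=tau.size), 0.01)  # test noise

def log_lik(r):
    if r <= 0:                              # growing-epidemic restriction
        return -np.inf
    return np.sum(np.log(r) - r * tau_hat - np.log1p(-np.exp(-r * T)))

r, ll, samples = 0.05, log_lik(0.05), []
for _ in range(20000):                      # random-walk Metropolis
    r_prop = r + rng.normal(0.0, 0.01)
    ll_prop = log_lik(r_prop)
    if np.log(rng.random()) < ll_prop - ll:
        r, ll = r_prop, ll_prop
    samples.append(r)

post = np.array(samples[5000:])             # drop burn-in
print(f"true r = {r_true}, posterior mean = {post.mean():.3f}, "
      f"95% interval = ({np.quantile(post, 0.025):.3f}, {np.quantile(post, 0.975):.3f})")
```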

    Epistasis: Obstacle or Advantage for Mapping Complex Traits?

    Identification of genetic loci in complex traits has focused largely on one-dimensional genome scans that search for associations between single markers and the phenotype. There is mounting evidence that locus interactions, or epistasis, are a crucial component of the genetic architecture of biologically relevant traits. However, epistasis is often viewed as a nuisance factor that reduces power for locus detection. Counter to expectations, recent work shows that fitting full models in exhaustive multi-locus genome scans, instead of testing marker main-effect and interaction components separately, can have higher power to detect loci when epistasis is present than single-locus scans, an improvement that comes despite the much larger multiple-testing alpha adjustment such searches require. We demonstrate, both theoretically and via simulation, that the expected power to detect loci when fitting full models is often larger when these loci act epistatically than when they act additively. Additionally, we show that the power for single-locus detection may be improved in cases of epistasis compared to the additive model. Our exploration of a two-step model selection procedure shows that identifying the true model is difficult. However, this difficulty is certainly not exacerbated by the presence of epistasis; on the contrary, in some cases the presence of epistasis can aid model selection. The impact of allele frequencies on both power and model selection is dramatic.
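
    To see why a 3-df full-model test can win despite a harsher multiplicity adjustment, the sketch below simulates a purely epistatic pair of loci (genotypes centred so marginal effects vanish) and compares the full two-locus F-test against single-locus tests. Effect size, allele frequency, and sample size are arbitrary toy values, and ordinary least-squares nested F-tests stand in for the paper's exact procedure.

```python
# Toy power comparison: a purely epistatic locus pair tested with a 3-df full
# two-locus model vs. 1-df single-locus scans. Effect size, MAF and n are
# arbitrary illustrative values.
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(11)
n, maf, beta = 500, 0.3, 0.4
g1 = rng.binomial(2, maf, n).astype(float)
g2 = rng.binomial(2, maf, n).astype(float)
# Centring removes marginal effects: the signal is pure interaction
y = beta * (g1 - g1.mean()) * (g2 - g2.mean()) + rng.normal(size=n)

def rss(X):
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ coef) ** 2)

def f_test_p(X_full, X_null):
    """p-value of the nested-model F-test."""
    df1 = X_full.shape[1] - X_null.shape[1]
    df2 = len(y) - X_full.shape[1]
    rss_full = rss(X_full)
    F = ((rss(X_null) - rss_full) / df1) / (rss_full / df2)
    return f_dist.sf(F, df1, df2)

ones = np.ones((n, 1))
p_full = f_test_p(np.column_stack([ones, g1, g2, g1 * g2]), ones)  # 3-df full model
p_loc1 = f_test_p(np.column_stack([ones, g1]), ones)               # single-locus
p_loc2 = f_test_p(np.column_stack([ones, g2]), ones)

print(f"full two-locus model: p = {p_full:.2e}")
print(f"single-locus scans:   p = {p_loc1:.3f}, {p_loc2:.3f}")
```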

    Black hole spin: theory and observation

    In the standard paradigm, astrophysical black holes can be described solely by their mass and angular momentum - commonly referred to as `spin' - resulting from the process of their birth and subsequent growth via accretion. Whilst the mass has a standard Newtonian interpretation, the spin does not, with the effect of non-zero spin leaving an indelible imprint on the space-time closest to the black hole. As a consequence of relativistic frame-dragging, particle orbits are affected both in terms of stability and precession, which impacts on the emission characteristics of accreting black holes, both stellar-mass black holes in black hole binaries (BHBs) and supermassive black holes in active galactic nuclei (AGN). Over the last 30 years, techniques have been developed that take these changes into account to estimate the spin, which can then be used to understand the birth and growth of black holes and potentially the launching of powerful jets. In this chapter we provide a broad overview of the theoretical effects of spin, the means by which it can be estimated, and the results of ongoing campaigns.
    Comment: 55 pages, 5 figures. Published in: "Astrophysics of Black Holes - From fundamental aspects to latest developments", Ed. Cosimo Bambi, Springer: Astrophysics and Space Science Library. Additional corrections made.
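
    One concrete example of spin's imprint on the nearby space-time is the innermost stable circular orbit (ISCO), whose radius follows the closed-form expression of Bardeen, Press & Teukolsky (1972) and shrinks from 6 GM/c² for a non-rotating hole to GM/c² at maximal prograde spin, dragging the inner edge of the accretion disc inward. The short sketch below evaluates that formula; it is a standard textbook result, not code from the chapter.

```python
# ISCO radius of a Kerr black hole as a function of dimensionless spin a,
# from Bardeen, Press & Teukolsky (1972). Radii are in units of GM/c^2:
# 6 for a Schwarzschild hole, down to 1 for maximal prograde spin.
import math

def r_isco(a, prograde=True):
    """ISCO radius in GM/c^2 for dimensionless spin 0 <= a <= 1."""
    z1 = 1 + (1 - a**2) ** (1 / 3) * ((1 + a) ** (1 / 3) + (1 - a) ** (1 / 3))
    z2 = math.sqrt(3 * a**2 + z1**2)
    sign = -1.0 if prograde else 1.0
    return 3 + z2 + sign * math.sqrt((3 - z1) * (3 + z1 + 2 * z2))

for a in (0.0, 0.5, 0.9, 0.998):
    print(f"a = {a:5.3f}: r_isco = {r_isco(a):6.3f} GM/c^2 (prograde)")
```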

    High-frequency variability in neutron-star low-mass X-ray binaries

    Binary systems in which a neutron-star primary accretes from a companion star display variability in the X-ray band on time scales ranging from years to milliseconds. With frequencies of up to ~1300 Hz, the kilohertz quasi-periodic oscillations (kHz QPOs) represent the fastest variability observed from any astronomical object. The sub-millisecond time scale of this variability implies that the kHz QPOs are produced in the accretion flow very close to the surface of the neutron star, providing a unique view of the dynamics of matter under the influence of some of the strongest gravitational fields in the Universe. This offers the possibility to probe some of the most extreme predictions of General Relativity, such as the dragging of inertial frames and periastron precession at rates that are sixteen orders of magnitude faster than those observed in the solar system and, ultimately, the existence of a minimum distance at which a stable orbit around a compact object is possible. Here we review the last twenty years of research on kHz QPOs, and we discuss the prospects for future developments in this field.
    Comment: 66 pages, 37 figures, 190 references. Review to appear in T. Belloni, M. Mendez, C. Zhang, editors, "Timing Neutron Stars: Pulsations, Oscillations and Explosions", ASSL, Springer.
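
    A back-of-envelope calculation shows why orbits close to a neutron star naturally yield kilohertz frequencies: for a prograde circular geodesic in the Kerr metric, ν = c³ / [2πGM (r^(3/2) + a)], with r in units of GM/c² and a the dimensionless spin. The sketch below evaluates this standard test-particle formula for a 1.4-solar-mass star at a few radii near the innermost stable orbit (values illustrative, a = 0), landing in the observed kHz QPO range.

```python
# Orbital frequency of a circular test-particle orbit in the Kerr metric:
#   nu = c^3 / (2 * pi * G * M * (r**1.5 + a)),  r in units of GM/c^2.
# For a 1.4 solar-mass neutron star, orbits near the innermost stable radius
# give frequencies in the observed kHz QPO range. Values are illustrative (a = 0).
import math

G, c, M_SUN = 6.674e-11, 2.998e8, 1.989e30

def orbital_freq_hz(m_solar, r_gm, a=0.0):
    """Prograde circular-orbit frequency in Hz; r_gm in GM/c^2 units."""
    return c**3 / (2 * math.pi * G * m_solar * M_SUN * (r_gm**1.5 + a))

for r in (6.0, 8.0, 10.0):   # ISCO for a = 0, and two slightly larger radii
    print(f"r = {r:4.1f} GM/c^2: nu = {orbital_freq_hz(1.4, r):7.1f} Hz")
```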

    Why is the Winner the Best?

    International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multicenter study of all 80 competitions conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses based on comprehensive descriptions of the submitted algorithms, linked to their ranks as well as the underlying participation strategies, revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and post-processing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly ranked teams: reflecting the metrics in the method design, and focusing on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art, but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.