Statistical Inference for Valued-Edge Networks: Generalized Exponential Random Graph Models
Across the sciences, the statistical analysis of networks is central to the
production of knowledge on relational phenomena. Because of their ability to
model the structural generation of networks, exponential random graph models
are a ubiquitous means of analysis. However, they are limited by an inability
to model networks with valued edges. We solve this problem by introducing a
class of generalized exponential random graph models capable of modeling
networks whose edges are valued, thus greatly expanding the scope of networks
applied researchers can subject to statistical analysis.
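The exponential-family form underlying these models can be sketched in a few lines. Below is a minimal, illustrative computation of an unnormalised GERGM weight, assuming a generic specification P(y) proportional to exp(theta . h(y)); the two network statistics used here (total edge value and a simple reciprocity term) are hypothetical stand-ins, not the paper's actual specification:

```python
import math

def gergm_weight(y, theta):
    """Unnormalised GERGM weight exp(theta . h(y)) for a valued
    adjacency matrix y (nested list, y[i][j] = edge value i -> j).
    Statistics are illustrative: total edge value and a reciprocity
    term sum over i < j of y_ij * y_ji."""
    n = len(y)
    edge_sum = sum(y[i][j] for i in range(n) for j in range(n) if i != j)
    recip = sum(y[i][j] * y[j][i] for i in range(n) for j in range(i + 1, n))
    return math.exp(theta[0] * edge_sum + theta[1] * recip)
```

Inference then requires the intractable normalising constant over all valued networks, which is why fitting relies on simulation-based methods rather than direct evaluation of this weight.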
Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants
Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. 
We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
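The masking problem that motivates this comparison can be illustrated with a toy calculation. The statistics below are simplified sketches (centred minor-allele counts as score contributions), not the exact tests evaluated in the study, but they show how a protective variant cancels a risk variant in a pooled statistic while a sum of squares is robust to the sign:

```python
def single_variant_scores(case_mac, ctrl_mac, n_case, n_ctrl):
    """Per-variant score contributions U_j: case minor-allele count
    minus its expectation when allele counts are pooled across
    cases and controls (no-association null)."""
    scores = []
    for cj, kj in zip(case_mac, ctrl_mac):
        total = cj + kj
        expected = total * n_case / (n_case + n_ctrl)
        scores.append(cj - expected)
    return scores

def burden_stat(scores):
    # Pooling (burden) test: square of the summed scores.
    # Protective variants (negative scores) cancel risk variants.
    return sum(scores) ** 2

def sum_sq_stat(scores, weights=None):
    # Weighted sum of squared single-variant scores: sign-robust.
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s * s for w, s in zip(weights, scores))
```

With one risk variant enriched in cases and one protective variant enriched in controls, the burden statistic collapses to zero while the sum-of-squares statistic retains the full signal.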
How Does Spatial Study Design Influence Density Estimates from Spatial Capture-Recapture Models?
When estimating population density from data collected on non-invasive detector arrays, recently developed spatial capture-recapture (SCR) models present an advance over non-spatial models by accounting for individual movement. While these models should be more robust to changes in trapping designs, they have not been well tested. Here we investigate how the spatial arrangement and size of the trapping array influence parameter estimates for SCR models. We analysed black bear data collected with 123 hair snares with an SCR model accounting for differences in detection and movement between sexes and across trapping occasions. To see how the size of the trap array and trap dispersion influence parameter estimates, we repeated the analysis for data from subsets of traps: 50% chosen at random, 50% in the centre of the array and 20% in the south of the array. Additionally, we simulated and analysed data under a suite of trap designs and home range sizes. In the black bear study, we found that results were similar across trap arrays, except when only 20% of the array was used. Black bear density was approximately 10 individuals per 100 km². Our simulation study showed that SCR models performed well as long as the extent of the trap array was similar to or larger than the extent of individual movement during the study period, and movement was at least half the distance between traps. SCR models performed well across a range of spatial trap setups and animal movements. Contrary to non-spatial capture-recapture models, they do not require the trapping grid to cover an area several times the average home range of the studied species. This renders SCR models more appropriate for the study of wide-ranging mammals and more flexible for designing studies targeting multiple species.
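The encounter model at the heart of most SCR analyses can be sketched directly. The half-normal form below is a standard choice in this literature, but the parameter names (g0, sigma) and any values are illustrative rather than taken from the black bear analysis:

```python
import math

def detection_prob(trap_xy, activity_center, g0, sigma):
    """Half-normal SCR encounter model: probability that an individual
    with the given activity centre is detected at a trap,
    p = g0 * exp(-d^2 / (2 * sigma^2)),
    where d is the trap-to-centre distance and sigma scales with
    home-range size (the movement extent discussed in the abstract)."""
    dx = trap_xy[0] - activity_center[0]
    dy = trap_xy[1] - activity_center[1]
    d2 = dx * dx + dy * dy
    return g0 * math.exp(-d2 / (2.0 * sigma * sigma))
```

Because detection decays with distance on the scale of sigma, a trap array whose extent is small relative to sigma gives little information about movement, which is the design sensitivity the simulations above probe.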
Using combined diagnostic test results to hindcast trends of infection from cross-sectional data
Infectious disease surveillance is key to limiting the consequences from infectious pathogens and maintaining animal and public health. Following the detection of a disease outbreak, a response in proportion to the severity of the outbreak is required. It is thus critical to obtain accurate information concerning the origin of the outbreak and its forward trajectory. However, there is often a lack of situational awareness that may lead to over- or under-reaction. There is a widening range of tests available for detecting pathogens, with typically different temporal characteristics, e.g. in terms of when peak test response occurs relative to time of exposure. We have developed a statistical framework that combines response level data from multiple diagnostic tests and is able to ‘hindcast’ (infer the historical trend of) an infectious disease epidemic. Assuming diagnostic test data from a cross-sectional sample of individuals infected with a pathogen during an outbreak, we use a Bayesian Markov chain Monte Carlo (MCMC) approach to estimate time of exposure, and the overall epidemic trend in the population prior to the time of sampling. We evaluate the performance of this statistical framework on simulated data from epidemic trend curves and show that we can recover the parameter values of those trends. We also apply the framework to epidemic trend curves taken from two historical outbreaks: a bluetongue outbreak in cattle, and a whooping cough outbreak in humans. Together, these results show that hindcasting can estimate the time since infection for individuals and provide accurate estimates of epidemic trends, and can be used to distinguish whether an outbreak is increasing or past its peak. We conclude that if temporal characteristics of diagnostics are known, it is possible to recover epidemic trends of both human and animal pathogens from cross-sectional data collected at a single point in time.
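A stripped-down version of the inference idea, scoring candidate times since infection against known test-response curves, can be sketched as a grid likelihood. The Gaussian error model and the example curves are our simplifying assumptions, standing in for the full Bayesian MCMC over the whole epidemic trend:

```python
import math

def time_since_infection_posterior(observed, response_curves, sd, times):
    """Normalised grid likelihood for an individual's time since
    infection, tau, given responses from multiple diagnostic tests.
    response_curves maps test name -> function of tau giving the
    expected response level; observed maps test name -> measured
    level. An independent Gaussian error model with common sd is
    assumed (illustrative only)."""
    liks = []
    for tau in times:
        log_lik = 0.0
        for test, value in observed.items():
            mu = response_curves[test](tau)
            log_lik += -0.5 * ((value - mu) / sd) ** 2
        liks.append(math.exp(log_lik))
    total = sum(liks)
    return [l / total for l in liks]
```

Combining tests with different temporal profiles is what makes tau identifiable here: a single monotone test response would often be consistent with several exposure times, while the intersection of two curves pins it down.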
Epistasis: Obstacle or Advantage for Mapping Complex Traits?
Identification of genetic loci in complex traits has focused largely on one-dimensional genome scans to search for associations between single markers and the phenotype. There is mounting evidence that locus interactions, or epistasis, are a crucial component of the genetic architecture of biologically relevant traits. However, epistasis is often viewed as a nuisance factor that reduces power for locus detection. Counter to expectations, recent work shows that fitting full models, instead of testing marker main effect and interaction components separately, in exhaustive multi-locus genome scans can have higher power to detect loci when epistasis is present than single-locus scans, an improvement that comes despite a much larger multiple-testing alpha-adjustment in such searches. We demonstrate, both theoretically and via simulation, that the expected power to detect loci when fitting full models is often larger when these loci act epistatically than when they act additively. Additionally, we show that the power for single-locus detection may be improved in cases of epistasis compared to the additive model. Our exploration of a two-step model selection procedure shows that identifying the true model is difficult. However, this difficulty is certainly not exacerbated by the presence of epistasis; on the contrary, in some cases the presence of epistasis can aid in model selection. The impact of allele frequencies on both power and model selection is dramatic.
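Why a purely epistatic pair of loci can be invisible to single-locus scans, and why allele frequencies matter so much, can be shown with a toy two-locus penetrance model. The specific penetrance table and allele frequency used below are illustrative assumptions, not models from the study:

```python
def marginal_penetrance(pen, p2):
    """Marginal penetrance at locus 1 of a two-locus disease model.
    pen[g1][g2] is the penetrance for genotypes g1, g2 in {0, 1, 2}
    (minor-allele counts at each locus); p2 is the minor-allele
    frequency at locus 2, with Hardy-Weinberg genotype frequencies
    assumed. A flat result means locus 1 has no single-locus signal."""
    q = 1.0 - p2
    geno_freq = [q * q, 2.0 * q * p2, p2 * p2]
    return [sum(pen[g1][g2] * geno_freq[g2] for g2 in range(3))
            for g1 in range(3)]
```

For a "heterozygote XOR" model (disease risk only when exactly one locus is heterozygous) with a minor-allele frequency of 0.5 at the second locus, the marginal penetrance at the first locus is constant, so a single-locus scan sees nothing even though the joint (full) model carries a strong signal; at other allele frequencies the marginals are no longer flat, illustrating the frequency dependence noted above.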
Black hole spin: theory and observation
In the standard paradigm, astrophysical black holes can be described solely
by their mass and angular momentum - commonly referred to as `spin' - resulting
from the process of their birth and subsequent growth via accretion. Whilst the
mass has a standard Newtonian interpretation, the spin does not, with the
effect of non-zero spin leaving an indelible imprint on the space-time closest
to the black hole. As a consequence of relativistic frame-dragging, particle
orbits are affected in terms of both stability and precession, which impacts
the emission characteristics of accreting black holes, both stellar-mass in
black hole binaries (BHBs) and supermassive in active galactic nuclei (AGN).
Over the last 30 years, techniques have been developed that take these
changes into account to estimate the spin, which can then be used to
understand the birth and growth of black holes and potentially the powering
of their jets. In this chapter we provide a broad overview of the theoretical
effects of spin, the means by which it can be estimated, and the results of
ongoing observational campaigns.

Comment: 55 pages, 5 figures. Published in: "Astrophysics of Black Holes -
From fundamental aspects to latest developments", Ed. Cosimo Bambi, Springer:
Astrophysics and Space Science Library. Additional corrections made.
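The dependence of the innermost stable circular orbit (ISCO) on spin, which underlies several of the spin-estimation techniques referred to above, follows the standard Bardeen, Press & Teukolsky (1972) expressions. The sketch below assumes a dimensionless spin a in [-1, 1], with a < 0 denoting retrograde orbits:

```python
def isco_radius(a):
    """ISCO radius in gravitational radii (units of GM/c^2) for a
    black hole with dimensionless spin a in [-1, 1]; a >= 0 means
    prograde orbits. Bardeen, Press & Teukolsky (1972):
      Z1 = 1 + (1 - a^2)^(1/3) [(1 + a)^(1/3) + (1 - a)^(1/3)]
      Z2 = sqrt(3 a^2 + Z1^2)
      r  = 3 + Z2 -/+ sqrt((3 - Z1)(3 + Z1 + 2 Z2))
    (minus for prograde, plus for retrograde)."""
    z1 = 1.0 + (1.0 - a * a) ** (1.0 / 3.0) * (
        (1.0 + a) ** (1.0 / 3.0) + (1.0 - a) ** (1.0 / 3.0))
    z2 = (3.0 * a * a + z1 * z1) ** 0.5
    sign = 1.0 if a >= 0 else -1.0
    return 3.0 + z2 - sign * ((3.0 - z1) * (3.0 + z1 + 2.0 * z2)) ** 0.5
```

The ISCO shrinks from 6 GM/c^2 for a non-rotating hole to 1 GM/c^2 for maximal prograde spin, and grows to 9 GM/c^2 for maximal retrograde spin; it is this spin dependence of the inner accretion-disc edge that the observational techniques exploit.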
High-frequency variability in neutron-star low-mass X-ray binaries
Binary systems with a neutron-star primary accreting from a companion star
display variability in the X-ray band on time scales ranging from years to
milliseconds. With frequencies of up to ~1300 Hz, the kilohertz quasi-periodic
oscillations (kHz QPOs) represent the fastest variability observed from any
astronomical object. The sub-millisecond time scale of this variability implies
that the kHz QPOs are produced in the accretion flow very close to the surface
of the neutron star, providing a unique view of the dynamics of matter under
the influence of some of the strongest gravitational fields in the Universe.
This offers the possibility to probe some of the most extreme predictions of
General Relativity, such as dragging of inertial frames and periastron
precession at rates that are sixteen orders of magnitude faster than those
observed in the solar system and, ultimately, the existence of a minimum
distance at which a stable orbit around a compact object is possible. Here we
review the last twenty years of research on kHz QPOs, and we discuss the
prospects for future developments in this field.Comment: 66 pages, 37 figures, 190 references. Review to appear in T. Belloni,
M. Mendez, C. Zhang, editors, "Timing Neutron Stars: Pulsations, Oscillations
and Explosions", ASSL, Springe
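The claim that kHz QPOs probe the region near the innermost stable orbit can be checked with a one-line estimate: for a non-rotating (Schwarzschild) star the orbital frequency at the ISCO is nu = c^3 / (2 pi 6^(3/2) G M), roughly 2200 Hz divided by the mass in solar units. The sketch below uses approximate constant values for illustration:

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (approx.)
C = 2.998e8        # speed of light, m s^-1 (approx.)
M_SUN = 1.989e30   # solar mass, kg (approx.)

def isco_frequency_hz(mass_msun):
    """Orbital frequency at the innermost stable circular orbit of a
    non-rotating (Schwarzschild) compact object of the given mass:
    nu = c^3 / (2 pi 6^(3/2) G M)."""
    m = mass_msun * M_SUN
    return C ** 3 / (2.0 * math.pi * 6.0 ** 1.5 * G * m)
```

For a 1.4 solar-mass neutron star this gives about 1570 Hz, comparable to the highest observed QPO frequencies (~1300 Hz), which is why frequencies in this range are taken to originate within a few gravitational radii of the stellar surface.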
Why is the Winner the Best?
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multicenter study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and post-processing (66%). The “typical” lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.