286 research outputs found

    Accuracy of Genome-Enabled Prediction in a Dairy Cattle Population using Different Cross-Validation Layouts

    The impact of extent of genetic relatedness on accuracy of genome-enabled predictions was assessed using a dairy cattle population and alternative cross-validation (CV) strategies were compared. The CV layouts consisted of training and testing sets obtained from either random allocation of individuals (RAN) or from a kernel-based clustering of individuals using the additive relationship matrix, to obtain two subsets that were as unrelated as possible (UNREL), as well as a layout based on stratification by generation (GEN). The UNREL layout decreased the average genetic relationships between training and testing animals but produced similar accuracies to the RAN design, which were about 15% higher than in the GEN setting. Results indicate that the CV structure can have an important effect on the accuracy of whole-genome predictions. However, the connection between average genetic relationships across training and testing sets and the estimated predictive ability is not straightforward, and may depend also on the kind of relatedness that exists between the two subsets and on the heritability of the trait. For high heritability traits, close relatives such as parents and full-sibs make the greatest contributions to accuracy, which can be compensated by half-sibs or grandsires in the case of lack of close relatives. However, for the low heritability traits the inclusion of close relatives is crucial and including more relatives of various types in the training set tends to lead to greater accuracy. In practice, CV designs should resemble the intended use of the predictive models, e.g., within or between family predictions, or within or across generation predictions, such that estimation of predictive ability is consistent with the actual application to be considered
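The RAN and GEN layouts described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy generation labels, and the nested-list relationship matrix `A` are all assumptions made for illustration, and the kernel-based UNREL clustering is not reproduced here.

```python
import random


def random_split(ids, test_fraction=0.2, seed=0):
    """RAN layout: allocate individuals to training/testing at random."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def generation_split(generation_by_id, first_test_generation):
    """GEN layout: train on older generations, test on younger ones."""
    train = [i for i, g in generation_by_id.items() if g < first_test_generation]
    test = [i for i, g in generation_by_id.items() if g >= first_test_generation]
    return train, test


def mean_cross_relationship(A, train, test):
    """Average additive relationship between training and testing animals.

    A is a (hypothetical) square relationship matrix indexed by animal id.
    """
    total = sum(A[i][j] for i in train for j in test)
    return total / (len(train) * len(test))
```

Comparing `mean_cross_relationship` across layouts gives a rough view of how related the two subsets are, which the abstract notes is only one of several factors affecting estimated predictive ability.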

    A Primer on High-Throughput Computing for Genomic Selection

    High-throughput computing (HTC) uses computer clusters to solve advanced computational problems, with the goal of accomplishing high throughput over relatively long periods of time. In genomic selection, for example, a set of markers covering the entire genome is used to train a model based on known data, and the resulting model is used to predict the genetic merit of selection candidates. Sophisticated models are very computationally demanding and, with several traits to be evaluated sequentially, computing time is long and output is low. In this paper, we present scenarios and basic principles of how HTC can be used in genomic selection, implemented using various techniques from simple batch processing to pipelining in distributed computer clusters. Various scripting languages, such as shell scripting, Perl, and R, are also very useful to devise pipelines. By pipelining, we can reduce total computing time and consequently increase throughput. In comparison to the traditional data processing pipeline residing on the central processors, performing general-purpose computation on a graphics processing unit provides a new-generation approach to massive parallel computing in genomic selection. While the concept of HTC may still be new to many researchers in animal breeding, plant breeding, and genetics, HTC infrastructures have already been built in many institutions, such as the University of Wisconsin–Madison, which can be leveraged for genomic selection, in terms of central processing unit capacity, network connectivity, storage availability, and middleware connectivity. Exploring existing HTC infrastructures as well as general-purpose computing environments will further expand our capability to meet increasing computing demands posed by unprecedented genomic data that we have today. We anticipate that HTC will impact genomic selection via better statistical models, faster solutions, and more competitive products (e.g., from design of marker panels to realized genetic gain). Eventually, HTC may change our view of data analysis as well as decision-making in the post-genomic era of selection programs in animals and plants, or in the study of complex diseases in humans

    Effects of anisotropy on the geometry of tracer particle trajectories in turbulent flows

    Using curvature and torsion to describe Lagrangian trajectories gives a full description of these as well as an insight into small and large time scales as temporal derivatives up to order 3 are involved. One might expect that the statistics of these properties depend on the geometry of the flow. Therefore, we calculated curvature and torsion probability density functions (PDFs) of experimental Lagrangian trajectories processed using the Shake-the-Box algorithm of turbulent von Kármán flow, Rayleigh-Bénard convection and a zero-pressure-gradient boundary layer over a flat plate. The results for the von Kármán flow compare well with experimental results for the curvature PDF and numerical simulation of homogeneous and isotropic turbulence for the torsion PDF. For the experimental Rayleigh-Bénard convection, the power law tails found agree with those measured for von Kármán flow. Results for the logarithmic layer within the boundary layer differ slightly, and we give some potential explanation. To detect and quantify the effect of anisotropy either resulting from a mean flow or large-scale coherent motions on the geometry of tracer particle trajectories, we introduce the curvature vector. We connect its statistics with those of velocity fluctuations and demonstrate that strong large-scale motion in a given spatial direction results in meandering rather than helical trajectories
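As a rough illustration of why derivatives up to order 3 are involved, the textbook differential-geometry formulas for a trajectory with velocity v, acceleration a and jerk j are κ = |v × a| / |v|³ and τ = (v × a) · j / |v × a|². The sketch below applies them at a single point; the function names are hypothetical, and this is not the processing chain used in the paper.

```python
import math


def cross(u, w):
    """Cross product of two 3-vectors."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])


def dot(u, w):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, w))


def curvature(v, a):
    """kappa = |v x a| / |v|^3, from first and second time derivatives."""
    return math.sqrt(dot(cross(v, a), cross(v, a))) / math.sqrt(dot(v, v)) ** 3


def torsion(v, a, j):
    """tau = (v x a) . j / |v x a|^2, which also needs the third derivative."""
    c = cross(v, a)
    return dot(c, j) / dot(c, c)


# Helix r(t) = (cos t, sin t, t) evaluated at t = 0 (analytic derivatives).
v, a, j = (0.0, 1.0, 1.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)
kappa, tau = curvature(v, a), torsion(v, a, j)
```

For a helix of radius R and pitch parameter c, both quantities are constant, κ = R/(R² + c²) and τ = c/(R² + c²), which makes it a convenient check case.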

    Effects of anisotropy on the geometry of tracer particle trajectories in turbulent flows

    Using curvature and torsion to describe Lagrangian trajectories gives a full description of these as well as an insight into small and large time scales as temporal derivatives up to order 3 are involved. One might expect that the statistics of these properties depend on the geometry of the flow. Therefore, we calculated curvature and torsion probability density functions (PDFs) of experimental Lagrangian trajectories processed using the Shake-the-Box algorithm of turbulent von Kármán flow, Rayleigh-Bénard convection and a zero-pressure-gradient turbulent boundary layer over a flat plate. The results for the von Kármán flow compare well with previous experimental results for the curvature PDF and numerical simulation of homogeneous and isotropic turbulence for the torsion PDF. Results for Rayleigh-Bénard convection agree with those obtained for von Kármán flow, while results for the logarithmic layer within the boundary layer differ slightly, and we provide a potential explanation. To detect and quantify the effect of anisotropy either resulting from a mean flow or large-scale coherent motions on the geometry of tracer particle trajectories, we introduce the curvature vector. We connect its statistics with those of velocity fluctuations and demonstrate that strong large-scale motion in a given spatial direction results in meandering rather than helical trajectories

    Nitric oxide sensing in plants is mediated by proteolytic control of group VII ERF transcription factors

    Nitric oxide (NO) is an important signaling compound in prokaryotes and eukaryotes. In plants, NO regulates critical developmental transitions and stress responses. Here, we identify a mechanism for NO sensing that coordinates responses throughout development based on targeted degradation of plant-specific transcriptional regulators, the group VII ethylene response factors (ERFs). We show that the N-end rule pathway of targeted proteolysis targets these proteins for destruction in the presence of NO, and we establish them as critical regulators of diverse NO-regulated processes, including seed germination, stomatal closure, and hypocotyl elongation. Furthermore, we define the molecular mechanism for NO control of germination and crosstalk with abscisic acid (ABA) signaling through ERF-regulated expression of ABSCISIC ACID INSENSITIVE5 (ABI5). Our work demonstrates how NO sensing is integrated across multiple physiological processes by direct modulation of transcription factor stability and identifies group VII ERFs as central hubs for the perception of gaseous signals in plants

    BAT AGN Spectroscopic Survey – XIII. The nature of the most luminous obscured AGN in the low-redshift universe

    We present a multiwavelength analysis of 28 of the most luminous low-redshift narrow-line, ultra-hard X-ray-selected active galactic nuclei (AGN) drawn from the 70-month Swift/BAT all-sky survey, with bolometric luminosities of log(L_bol / erg s^-1) ≳ 45.25. The broad goal of our study is to determine whether these objects have any distinctive properties, potentially setting them aside from lower luminosity obscured AGN in the local Universe. Our analysis relies on the first data release of the BAT AGN Spectroscopic Survey (BASS/DR1) and on dedicated observations with the VLT, Palomar, and Keck observatories. We find that the vast majority of our sources agree with commonly used AGN selection criteria which are based on emission line ratios and on mid-infrared colours. Our AGN are predominantly hosted in massive galaxies (9.8 ≲ log(M*/M⊙) ≲ 11.7); based on visual inspection of archival optical images, they appear to be mostly ellipticals. Otherwise, they do not have distinctive properties. Their radio luminosities, determined from publicly available survey data, show a large spread of almost four orders of magnitude, much broader than what is found for lower X-ray luminosity obscured AGN in BASS. Moreover, our sample shows no preferred combination of black hole masses (M_BH) and/or Eddington ratio (λ_Edd), covering 7.5 ≲ log(M_BH/M⊙) ≲ 10.3 and 0.01 ≲ λ_Edd ≲ 1. Based on the distribution of our sources in the λ_Edd−N_H plane, we conclude that our sample is consistent with a scenario where the amount of obscuring material along the line of sight is determined by radiation pressure exerted by the AGN on the dusty circumnuclear gas

    Model Evaluation Guidelines for Geomagnetic Index Predictions

    Geomagnetic indices are convenient quantities that distill the complicated physics of some region or aspect of near-Earth space into a single parameter. Most of the best-known indices are calculated from ground-based magnetometer data sets, such as Dst, SYM-H, Kp, AE, AL, and PC. Many models have been created that predict the values of these indices, often using solar wind measurements upstream from Earth as the input variables to the calculation. This document reviews the current state of models that predict geomagnetic indices and the methods used to assess their ability to reproduce the target index time series. These existing methods are synthesized into a baseline collection of metrics for benchmarking a new or updated geomagnetic index prediction model. These methods fall into two categories: (1) fit performance metrics such as root-mean-square error and mean absolute error that are applied to a time series comparison of model output and observations and (2) event detection performance metrics such as Heidke Skill Score and probability of detection that are derived from a contingency table that compares model and observation values exceeding (or not) a threshold value. A few examples of codes being used with this set of metrics are presented, and other aspects of metrics assessment best practices, limitations, and uncertainties are discussed, including several caveats to consider when using geomagnetic indices.
    Plain Language Summary: One aspect of space weather is a magnetic signature across the surface of the Earth. The creation of this signal involves nonlinear interactions of electromagnetic forces on charged particles and can therefore be difficult to predict. The perturbations that space storms and other activity cause in some observation sets, however, are fairly regular in their pattern. Some of these measurements have been compiled together into a single value, a geomagnetic index. Several such indices exist, providing a global estimate of the activity in different parts of geospace. Models have been developed to predict the time series of these indices, and various statistical methods are used to assess their performance at reproducing the original index. Existing studies of geomagnetic indices, however, use different approaches to quantify the performance of the model. This document defines a standardized set of statistical analyses as a baseline set of comparison tools that are recommended to assess geomagnetic index prediction models. It also discusses best practices, limitations, uncertainties, and caveats to consider when conducting a model assessment.
    Key Points: We review existing practices for assessing geomagnetic index prediction models and recommend a "standard set" of metrics. Along with fit performance metrics that use all data-model pairs in their formulas, event detection performance metrics are recommended. Other aspects of metrics assessment best practices, limitations, uncertainties, and geomagnetic index caveats are also discussed.
    Peer Reviewed
    https://deepblue.lib.umich.edu/bitstream/2027.42/147764/1/swe20790_am.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/147764/2/swe20790.pdf
    https://deepblue.lib.umich.edu/bitstream/2027.42/147764/3/swe20790-sup-0001-2018SW002067-SI.pd
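The two metric categories named in the abstract can be sketched in a few lines. This is an illustrative sketch using the standard textbook definitions, not the codes presented in the paper; the function names are hypothetical, and an "event" is assumed here to mean a value at or above the threshold (for an index such as Dst, where storms are negative excursions, the comparison would be flipped).

```python
import math


def rmse(model, obs):
    """Root-mean-square error over all data-model pairs."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def mae(model, obs):
    """Mean absolute error over all data-model pairs."""
    return sum(abs(m - o) for m, o in zip(model, obs)) / len(obs)


def contingency(model, obs, threshold):
    """Return (hits, false_alarms, misses, correct_negatives)."""
    h = f = m = n = 0
    for mod, ob in zip(model, obs):
        predicted, observed = mod >= threshold, ob >= threshold
        if predicted and observed:
            h += 1
        elif predicted:
            f += 1
        elif observed:
            m += 1
        else:
            n += 1
    return h, f, m, n


def probability_of_detection(h, f, m, n):
    """POD = hits / (hits + misses); NaN when no events were observed."""
    return h / (h + m) if (h + m) else float("nan")


def heidke_skill_score(h, f, m, n):
    """HSS = 2(hn - fm) / [(h + m)(m + n) + (h + f)(f + n)]."""
    denom = (h + m) * (m + n) + (h + f) * (f + n)
    return 2 * (h * n - f * m) / denom if denom else float("nan")
```

The fit metrics use every data-model pair, whereas the event metrics depend on the chosen threshold, so both the threshold and the event definition should be reported alongside the scores.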

    Probing the Structure and Evolution of BASS AGN through Eddington Ratios

    We constrain the intrinsic Eddington ratio (λ_Edd) distribution function for local AGN in bins of low and high obscuration (log N_H <= 22 and 22 < log N_H < 25), using the Swift-BAT 70-month/BASS DR2 survey. We interpret the fraction of obscured AGN in terms of circum-nuclear geometry and temporal evolution. Specifically, at low Eddington ratios (log λ_Edd < -2), obscured AGN outnumber unobscured ones by a factor of ~4, reflecting the covering factor of the circum-nuclear material (0.8, or a torus opening angle of ~34 degrees). At high Eddington ratios (log λ_Edd > -1), the trend is reversed, with < 30% of AGN having log N_H > 22, which we suggest is mainly due to the small fraction of time spent in a highly obscured state. Considering the Eddington ratio distribution function of narrow-line and broad-line AGN from our prior work, we see a qualitatively similar picture. To disentangle temporal and geometric effects at high λ_Edd, we explore plausible clearing scenarios such that the time-weighted covering factors agree with the observed population ratio. We find that the low fraction of obscured AGN at high λ_Edd is primarily due to the fact that the covering factor drops very rapidly, with more than half the time spent with < 10% covering factor. We also find that nearly all obscured AGN at high λ_Edd exhibit some broad lines. We suggest that this is because the height of the depleted torus falls below the height of the broad-line region, making the latter visible from all lines of sight.
    Comment: Accepted by ApJ

    Discovery of SMAD4 promoters, transcription factor binding sites and deletions in juvenile polyposis patients

    Inactivation of SMAD4 has been linked to several cancers, and germline mutations cause juvenile polyposis (JP). We set out to identify the promoter(s) of SMAD4, evaluate their activity in cell lines and define possible transcription factor binding sites (TFBS). 5′-rapid amplification of cDNA ends (5′-RACE) and computational analyses were used to identify candidate promoters and corresponding TFBS, and the activity of each was assessed by luciferase vectors in different cell lines. TFBS were disrupted by site-directed mutagenesis (SDM) to evaluate the effect on promoter activity. Four promoters were identified, two of which had significant activity in several cell lines, while two others had minimal activity. In silico analysis revealed multiple potentially important TFBS for each promoter. One promoter was deleted in the germline of two JP patients, and SDM of several sites led to significant reduction in promoter activity. No mutations were found by sequencing this promoter in 65 JP probands. The predicted TFBS profiles for each of the four promoters shared few transcription factors in common, but were conserved across several species. The elucidation of these promoters and identification of TFBS has important implications for future studies in sporadic tumors from multiple sites, and in JP patients