96 research outputs found

    Computation and approximation of the inverse of relationship matrices between genotyped animals: Algorithms and Applications

    The recent developments in molecular biology have made available thousands of genetic markers, allowing livestock genotyping at a reasonable cost and the subsequent development of genomic prediction. The single-step procedure, a unified approach of genomic prediction, requires inversion of two matrices gathering additive relationships between genotyped animals: the genomic relationship matrix (G) and a part of the additive relationship matrix (A22). The inverse of A22 may also be interesting for other applications. Matrix inverse can be constructed successively by, first, computing, for each animal, the vector containing contributions of other animals to its relationship and, secondly, adding the product of each vector of contributions by itself to a zeroed matrix. The objectives of this thesis were (1) to propose algorithms to compute or to approximate the vector of contributions and (2) to test the numerical efficiency of these algorithms (computing speed, memory use and, if needed, approximation accuracy). Computing contributions covered two points: (1) finding or approximating which contributions are different from zero, and (2) computing the value of contributions considered as non-zero. In the first approach, we considered that animals closely related have non-zero contributions and approximated their values by linear regression. This approach was extended in a recursive way. In the second approach, we empirically determined the set of non-zero contributions by a heuristic algorithm of pedigree exploration (only for the case of A22). Values were then computed either by linear regression, or using the already computed inverse. We also tested an approximation strategy: limiting the number of extracted generations of non-genotyped ancestors to reduce pedigree complexity. In a third approach, we followed the same heuristic algorithm as before but restricted the pedigree exploration to find out which animals have a non-zero contribution. 
Their values were approximated by linear regression. The presentation of the different approaches is followed by a general discussion in which the approaches are compared. It was found that the best compromise between speed, memory and approximation accuracy was achieved by the last approach for the case of A22. Use of this last approach simplified computations and therefore made predictions more feasible. However, for the case of G, no sufficient approximation could be reached in a reasonable time. Perspectives on other uses of the algorithms developed and on future research were drawn, as well as practical perspectives for animal breeding. NextGenGE

    A recursive algorithm for decomposition and creation of the inverse of the genomic relationship matrix

    Peer reviewed. Some genomic evaluation models require creation and inversion of a genomic relationship matrix (G). As the number of genotyped animals increases, G becomes larger and thus requires more time for inversion. A single-step genomic evaluation also requires inversion of the part of the pedigree relationship matrix for genotyped animals (A22). A strategy was developed to provide an approximation of the inverse of G (G^-1) that may also be applied to the inverse of A22 (A22^-1). The algorithm proceeds by creation of an incomplete Cholesky factorization (T^-1) of G^-1. For this purpose, a genomic relationship threshold determines whether 2 animals are closely related. For any animal, the sparsity pattern of the corresponding line in T^-1 will thus gather elements corresponding to all close relatives of that animal. Any line of T^-1 is filled in with resulting estimators of the least-squares regression of genomic relationships between close relatives on genomic relationship between the animal considered and those close relatives. The G^-1 was computed as the matrix product (T^-1)' D^-1 T^-1, where D^-1 is a diagonal matrix. Then, T^-1 G (T^-1)' resulted in a new matrix that is close to diagonal and also needs to be inverted. The inverse of that matrix was approximated with the same decomposition as for approximation of the inverse of G (G^-1), and the procedure was repeated in successive rounds of recursion until a matrix was obtained that was close enough to diagonal to be inverted element by element. Two applications of the approximation algorithm were tested in a single-step genomic evaluation of US Holstein final score, and correlation coefficients between estimated breeding values based on either real or approximated G^-1 were compared. Approximations came closer to G^-1 as the number of recursion rounds increased. Approximations were even more accurate and expected to be faster for A22. Timesaving strategies are needed to reduce the computing time required for the algorithm.
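    The decomposition described in this abstract can be illustrated with a minimal sketch of a single round (no recursion). It is not the authors' implementation. The function names `solve` and `approx_inverse` are hypothetical, the choice to look for close relatives only among animals earlier in the ordering (so that T^-1 is lower triangular) is an assumption, and so is taking each diagonal element of D^-1 as the reciprocal of the regression residual. Pure Python is used only to keep the example self-contained.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def approx_inverse(g, threshold):
    """One round of the approximation G^-1 ~ (T^-1)' D^-1 T^-1.

    Assumption: animal j is a close relative of animal i when j comes
    before i in the ordering and g[i][j] >= threshold.
    """
    n = len(g)
    t_inv = [[0.0] * n for _ in range(n)]
    d_inv = [0.0] * n
    for i in range(n):
        close = [j for j in range(i) if g[i][j] >= threshold]
        # Least-squares regression of animal i on its close relatives.
        sub = [[g[a][b] for b in close] for a in close]
        rhs = [g[j][i] for j in close]
        coef = solve(sub, rhs)
        t_inv[i][i] = 1.0
        for j, bj in zip(close, coef):
            t_inv[i][j] = -bj
        # Residual of the regression (assumed diagonal element of D).
        resid = g[i][i] - sum(bj * g[i][j] for j, bj in zip(close, coef))
        d_inv[i] = 1.0 / resid
    # Assemble (T^-1)' D^-1 T^-1.
    return [
        [sum(t_inv[k][r] * d_inv[k] * t_inv[k][c] for k in range(n))
         for c in range(n)]
        for r in range(n)
    ]


# Toy relationship matrix for three animals (illustrative values only).
g = [[1.0, 0.5, 0.25],
     [0.5, 1.0, 0.5],
     [0.25, 0.5, 1.0]]
exact = approx_inverse(g, -1.0)   # every earlier animal counts as close
sparse = approx_inverse(g, 0.4)   # only strongly related pairs are kept
```

    When the threshold admits every earlier animal, the sketch reduces to an exact factorization of the inverse. For this toy matrix the sparser pattern obtained with a threshold of 0.4 gives the same result, because the dropped regression coefficient is zero. In general a higher threshold trades accuracy for sparsity.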

    M-Theory on S^1/Z_2 : New Facts from a Careful Analysis

    We carefully re-examine the issues of solving the modified Bianchi identity, anomaly cancellations and flux quantization in the S^1/Z_2 orbifold of M-theory using the boundary-free "upstairs" formalism, avoiding several misconceptions present in earlier literature. While the solution for the four-form G to the modified Bianchi identity appears to depend on an arbitrary parameter b, we show that requiring G to be globally well-defined, i.e. invariant under small and large gauge and local Lorentz transformations, fixes b=1. This value also is necessary for a consistent reduction to the heterotic string in the small-radius limit. Insisting on properly defining all fields on the circle, we find that there is a previously unnoticed additional contribution to the anomaly inflow from the eleven-dimensional topological term. Anomaly cancellation then requires a quadratic relation between b and the combination lambda^6/kappa^4 of the gauge and gravitational coupling constants lambda and kappa. This contrasts with previous beliefs that anomaly cancellation would give a cubic equation for b. We observe that our solution for G automatically satisfies integer or half-integer flux quantization for the appropriate cycles. We explicitly write out the anomaly cancelling terms of the heterotic string as inherited from the M-theory approach. They differ from the usual ones by the addition of a well-defined local counterterm. We also show how five-branes enter our analysis. Comment: 32 pages, version to appear in Nucl. Phys. B, no figures, uses PHYZZ

    An evaluation of inbreeding measures using a whole-genome sequenced cattle pedigree.

    Peer reviewed. The estimation of the inbreeding coefficient (F) is essential for the study of inbreeding depression (ID) or for the management of populations under conservation. Several methods have been proposed to estimate the realized F using genetic markers, but it remains unclear which one should be used. Here we used whole-genome sequence data for 245 individuals from a Holstein cattle pedigree to empirically evaluate which estimators best capture homozygosity at variants causing ID, such as rare deleterious alleles or loci presenting heterozygote advantage and segregating at intermediate frequency. Estimators relying on the correlation between uniting gametes (F(UNI)) or on the genomic relationships (F(GRM)) presented the highest correlations with these variants. However, homozygosity at rare alleles remained poorly captured. A second group of estimators relying on excess homozygosity (F(HOM)), homozygous-by-descent segments (F(HBD)), runs-of-homozygosity (F(ROH)) or on the known genealogy (F(PED)) was better at capturing whole-genome homozygosity, reflecting the consequences of inbreeding on all variants, and for young alleles with low to moderate frequencies (0.10 < . < 0.25). The results indicate that F(UNI) and F(GRM) might present a stronger association with ID. However, the situation might be different when recessive deleterious alleles reach higher frequencies, such as in populations with a small effective population size. For locus-specific inbreeding measures or at low marker density, the ranking of the methods can also change as F(HBD) makes better use of the information from neighboring markers. Finally, we confirmed that genomic measures are in general superior to pedigree-based estimates. In particular, F(PED) was uncorrelated with locus-specific homozygosity.

    Automatic landmarking identifies new loci associated with face morphology and implicates Neanderthal introgression in human nasal shape

    We report a genome-wide association study of facial features in >6000 Latin Americans based on automatic landmarking of 2D portraits and testing for association with inter-landmark distances. We detected significant associations (P-value < 5 × 10^-8) at 42 genome regions, nine of which have been previously reported. In follow-up analyses, 26 of the 33 novel regions replicate in East Asians, Europeans, or Africans, and one mouse homologous region influences craniofacial morphology in mice. The novel region in 1q32.3 shows introgression from Neanderthals and we find that the introgressed tract increases nasal height (consistent with the differentiation between Neanderthals and modern humans). Novel regions include candidate genes and genome regulatory elements previously implicated in craniofacial development, and show preferential transcription in cranial neural crest cells. The automated approach used here should simplify the collection of large study samples from across the world, facilitating a cosmopolitan characterization of the genetics of facial features.

    Different genes interact with particulate matter and tobacco smoke exposure in affecting lung function decline in the general population

    BACKGROUND: Oxidative stress related genes modify the effects of ambient air pollution or tobacco smoking on lung function decline. The impact of interactions might be substantial, but previous studies mostly focused on main effects of single genes. OBJECTIVES: We studied the interaction of both exposures with a broad set of oxidative-stress related candidate genes and pathways on lung function decline and contrasted interactions between exposures. METHODS: For 12679 single nucleotide polymorphisms (SNPs), change in forced expiratory volume in one second (FEV1), FEV1 over forced vital capacity (FEV1/FVC), and mean forced expiratory flow between 25 and 75% of the FVC (FEF25-75) was regressed on interval exposure to particulate matter <10 µm in diameter (PM10) or packyears smoked (a), additive SNP effects (b), and interaction terms between (a) and (b) in 669 adults with GWAS data. Interaction p-values for 152 genes and 14 pathways were calculated by the adaptive rank truncation product (ARTP) method, and compared between exposures. Interaction effect sizes were contrasted for the strongest SNPs of nominally significant genes (p(interaction) < 0.05). Replication was attempted for SNPs with MAF<10% in 3320 SAPALDIA participants without GWAS. RESULTS: On the SNP-level, rs2035268 in gene SNCA accelerated FEV1/FVC decline by 3.8% (p(interaction) = 2.5 × 10^-6), and rs12190800 in PARK2 attenuated FEV1 decline by 95.1 ml (p(interaction) = 9.7 × 10^-8) over 11 years, while interacting with PM10. Genes and pathways nominally interacting with PM10 and packyears exposure differed substantially. Gene CRISP2 presented a significant interaction with PM10 (p(interaction) = 3.0 × 10^-4) on FEV1/FVC decline. Pathway interactions were weak. Replications for the strongest SNPs in PARK2 and CRISP2 were not successful. CONCLUSIONS: Consistent with a stratified response to increasing oxidative stress, different genes and pathways potentially mediate PM10 and tobacco smoke effects on lung function decline. Ignoring environmental exposures would miss these patterns, but achieving sufficient sample size and comparability across study samples is challenging.

    Fully automatic landmarking of 2D photographs identifies novel genetic loci influencing facial features

    We report a genome-wide association study for facial features in > 6,000 Latin Americans. We placed 106 landmarks on 2D frontal photographs using the cloud service platform Face++. After Procrustes superposition, genome-wide association testing was performed for 301 inter-landmark distances. We detected nominally significant association (P-value < 5 × 10^-8) for 42 genome regions. Of these, 9 regions have been previously reported in GWAS of facial features. In follow-up analyses, we replicated 26 of the 33 novel regions (in East Asians or Europeans). The replicated regions include 1q32.3, 3q21.1, 8p11.21, 10p11.1, and 22q12.1, all comprising strong candidate genes involved in craniofacial development. Furthermore, the 1q32.3 region shows evidence of introgression from archaic humans. These results provide novel biological insights into facial variation and establish that automatic landmarking of standard 2D photographs is a simple and informative approach for the genetic analysis of facial variation, suitable for the rapid analysis of large population samples.
    Contents: Introduction; Results and Discussion (Study sample and phenotyping; Trait/covariate correlation and heritability; Overview of GWAS results and integration with the literature; Follow-up of genomic regions newly associated with facial features: replication in two human cohorts; Follow-up of genomic regions newly associated with facial features: effects in the mouse; Genome annotations at associated loci); Conclusion; Methods (Study subjects; Genotype data; Phenotyping; Statistical genetic analysis; Interaction of EDAR with other genes; Expression analysis for significant SNPs; Detection of archaic introgression near ATF3 and association with facial features; Annotation of SNPs in FUMA; Shape GWAS in outbred mice)

    A GWAS in Latin Americans identifies novel face shape loci, implicating VPS13B and a Denisovan introgressed region in facial variation

    To characterize the genetic basis of facial features in Latin Americans, we performed a genome-wide association study (GWAS) of more than 6000 individuals using 59 landmark-based measurements from two-dimensional profile photographs and ~9,000,000 genotyped or imputed single-nucleotide polymorphisms. We detected significant association of 32 traits with at least 1 (and up to 6) of 32 different genomic regions, more than doubling the number of robustly associated face morphology loci reported until now (from 11 to 23). These GWAS hits are strongly enriched in regulatory sequences active specifically during craniofacial development. The associated region in 1p12 includes a tract of archaic adaptive introgression, with a Denisovan haplotype common in Native Americans affecting particularly lip thickness. Among the nine previously unidentified face morphology loci we identified is the VPS13B gene region, and we show that variants in this region also affect midfacial morphology in mice.