41 research outputs found
Short communication: Imputing genotypes using PedImpute fast algorithm combining pedigree and population information
Routine genomic evaluations frequently include a preliminary imputation step, requiring high accuracy and reduced computing time. A new algorithm, PedImpute (http://dekoppel.eu/pedimpute/), was developed and compared with findhap (http://aipl.arsusda.gov/software/findhap/) and BEAGLE (http://faculty.washington.edu/browning/beagle/beagle.html), using 19,904 Holstein genotypes from a 4-country international collaboration (United States, Canada, UK, and Italy). Different scenarios were evaluated on a sample subset that included only single nucleotide polymorphisms from the Bovine low-density (LD) Illumina BeadChip (Illumina Inc., San Diego, CA). Comparative criteria were computing time, percentage of missing alleles, percentage of wrongly imputed alleles, and the allelic squared correlation. Imputation accuracy on ungenotyped animals was also analyzed. The algorithm PedImpute was slightly more accurate and faster than findhap and BEAGLE when sire, dam, and maternal grandsire were genotyped at high density. On the other hand, BEAGLE performed better than both PedImpute and findhap for animals with at least one close relative not genotyped or genotyped at low density. However, computing time and resources using BEAGLE were incompatible with routine genomic evaluations in Italy. Error rate and allelic squared correlation attained by PedImpute ranged from 0.2 to 1.1% and from 96.6 to 99.3%, respectively. When complete genomic information on sire, dam, and maternal grandsire is available, as is expected to be the case in the near future in (at least) dairy cattle, and considering the accuracies obtained and computation time required, PedImpute represents a valuable choice in routine evaluations among the algorithms tested.
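The accuracy criteria named in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: genotypes are coded as allele dosages (0, 1 or 2), a wrongly imputed allele is counted as one unit of dosage difference, and the allelic squared correlation is taken as the squared Pearson correlation between true and imputed dosages. The function name `imputation_metrics` is hypothetical.

```python
from statistics import mean


def imputation_metrics(true_dosages, imputed_dosages):
    """Return (% missing genotypes, % wrongly imputed alleles, allelic r^2).

    Dosages count copies of one allele (0, 1 or 2); None marks a genotype
    that the imputation left missing. All definitions here are assumptions,
    not the formulas used in the paper.
    """
    # Keep only the genotypes that were actually imputed.
    pairs = [(t, i) for t, i in zip(true_dosages, imputed_dosages) if i is not None]
    pct_missing = 100.0 * (len(true_dosages) - len(pairs)) / len(true_dosages)

    # Each genotype carries two alleles; a dosage difference of 1 is one wrong allele.
    wrong_alleles = sum(abs(t - i) for t, i in pairs)
    pct_wrong = 100.0 * wrong_alleles / (2 * len(pairs))

    # Squared Pearson correlation between true and imputed dosages.
    mean_t = mean(t for t, _ in pairs)
    mean_i = mean(i for _, i in pairs)
    cov = sum((t - mean_t) * (i - mean_i) for t, i in pairs)
    var_t = sum((t - mean_t) ** 2 for t, _ in pairs)
    var_i = sum((i - mean_i) ** 2 for _, i in pairs)
    r2 = cov * cov / (var_t * var_i) if var_t > 0 and var_i > 0 else float("nan")
    return pct_missing, pct_wrong, r2


if __name__ == "__main__":
    true_calls = [0, 1, 2, 2, 1, 0]
    imputed_calls = [0, 1, 2, 1, None, 0]
    print(imputation_metrics(true_calls, imputed_calls))
```

In this sketch the error percentage is computed over imputed alleles only, so a method that leaves many alleles missing is not penalised twice; other conventions are possible.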
Genome wide scan for somatic cell counts in Holstein bulls
Background: Mastitis is the most costly disease for dairy production, and control of the disease is often difficult,
due to its multi-factorial nature. Susceptibility to mastitis is under partial genetic control and the industry uses
indirect selection for decreased concentrations of somatic cells in milk to reduce mastitis.
Methods: A genome-wide scan was performed to identify genomic regions associated with deregressed estimated
breeding values (EBVs) for somatic cell counts (SCC) in Holstein bulls. In total, 1183 proven bulls of the Italian
Holstein population were genotyped with the BovineSNP50 BeadChip (Illumina, San Diego, CA) and a whole
genome association analysis was performed using the R package GenABEL.
Results: Two chromosomal regions showed association with SCC: a region on chromosome 14 with high
significance (P < 5 × 10⁻⁶) and a region on chromosome 6 with moderate significance (P < 5 × 10⁻⁵).
Conclusions: Two regions with effects on SCC have been identified with good statistical support. A further study
of these candidate regions will be performed to verify the results and identify the causal mutations.
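The idea of testing one SNP at a time against a phenotype can be sketched as follows. This is a simplified illustration in Python, not the GenABEL analysis used in the study: it fits an ordinary least-squares regression of the phenotype (for example a deregressed EBV) on allele dosage and, as a stated assumption, uses a normal approximation for the two-sided P-value, which is reasonable only for large samples. The function name `snp_association` and the toy data are hypothetical.

```python
from statistics import NormalDist, mean


def snp_association(dosages, phenotypes):
    """Regress phenotype on allele dosage (0/1/2); return (slope, two-sided P).

    Uses a normal approximation to the test statistic (an assumption that
    suits large samples) and ignores relatedness and population structure.
    """
    n = len(dosages)
    mean_x, mean_y = mean(dosages), mean(phenotypes)
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and standard error of the slope.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, phenotypes))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    z = slope / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return slope, p_value


if __name__ == "__main__":
    # Toy data: 60 animals, true allele substitution effect of 0.5 plus noise.
    dosages = [k % 3 for k in range(60)]
    noise = [((k * 7) % 5 - 2) / 10 for k in range(60)]
    phenotypes = [0.5 * g + e for g, e in zip(dosages, noise)]
    slope, p_value = snp_association(dosages, phenotypes)
    print(slope, p_value, p_value < 5e-6)
```

A genome-wide scan repeats such a test for every SNP and compares each P-value with a threshold such as the 5 × 10⁻⁶ and 5 × 10⁻⁵ levels reported above.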
Genome-wide patterns of homozygosity provide clues about the population history and adaptation of goats
Background: Patterns of homozygosity can be influenced by several factors, such as demography, recombination, and selection. Using the goat SNP50 BeadChip, we genotyped 3171 goats belonging to 117 populations with a worldwide distribution. Our objectives were to characterize the number and length of runs of homozygosity (ROH) and to detect ROH hotspots in order to gain new insights into the consequences of neutral and selection processes on the genome-wide homozygosity patterns of goats.
Results: The proportion of the goat genome covered by ROH is, in general, less than 15% with an inverse relationship between ROH length and frequency i.e. short ROH ( 0.20) F_ROH values. For populations from Asia, the average number of ROH is smaller and their coverage is lower in goats from the Near East than in goats from Central Asia, which is consistent with the role of the Fertile Crescent as the primary centre of goat domestication. We also observed that local breeds with small population sizes tend to have a larger fraction of the genome covered by ROH compared to breeds with tens or hundreds of thousands of individuals. Five regions on three goat chromosomes i.e. 11, 12 and 18, contain ROH hotspots that overlap with signatures of selection.
Conclusions: Patterns of homozygosity (average number of ROH of 77 and genome coverage of 248 Mb; F_ROH < 0.15) are similar in goats from different geographic areas. The increased homozygosity in local breeds is the consequence of their small population size and geographic isolation as well as of founder effects and recent inbreeding. The existence of three ROH hotspots that co-localize with signatures of selection demonstrates that selection has also played an important role in increasing the homozygosity of specific regions in the goat genome. Finally, most of the goat breeds analysed in this work display low levels of homozygosity, which is favourable for their genetic management and viability.
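The ROH-based inbreeding coefficient mentioned in this abstract is commonly defined as the fraction of the autosomal genome covered by ROH. The Python sketch below illustrates that definition together with a deliberately simple ROH scan. The thresholds, the function names `find_roh` and `f_roh`, and the toy coordinates are assumptions for illustration; the study's actual detection settings are not given in the abstract.

```python
def find_roh(positions_bp, dosages, min_snps, min_length_bp):
    """Return (start_bp, end_bp) for runs of consecutive homozygous SNPs.

    dosages: 0 or 2 = homozygous, 1 = heterozygous. No heterozygous or
    missing calls are tolerated inside a run (a simplifying assumption).
    """
    runs, start = [], None
    # A heterozygous sentinel at the end closes any run still open.
    for idx, dosage in enumerate(list(dosages) + [1]):
        if dosage != 1 and start is None:
            start = idx
        elif dosage == 1 and start is not None:
            end = idx - 1
            length_bp = positions_bp[end] - positions_bp[start]
            if end - start + 1 >= min_snps and length_bp >= min_length_bp:
                runs.append((positions_bp[start], positions_bp[end]))
            start = None
    return runs


def f_roh(runs, autosome_length_bp):
    """Fraction of the autosomal genome covered by ROH."""
    return sum(end - start for start, end in runs) / autosome_length_bp


if __name__ == "__main__":
    positions = [0, 1_000_000, 2_000_000, 3_000_000, 4_000_000, 5_000_000, 6_000_000]
    dosages = [0, 2, 2, 1, 0, 0, 2]
    runs = find_roh(positions, dosages, min_snps=3, min_length_bp=1_000_000)
    print(runs, f_roh(runs, autosome_length_bp=40_000_000))
```

Under this definition, an F_ROH below 0.15 corresponds to less than 15% of the genome lying in ROH, which matches the coverage figure quoted in the results.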
Domestication of cattle: two or three events?
Cattle have been invaluable for the transition of human society from nomadic hunter-gatherers
to sedentary farming communities throughout much of Europe, Asia and
Africa since the earliest domestication of cattle more than 10,000 years ago.
Although current understanding of relationships among ancestral populations remains
limited, domestication of cattle is thought to have occurred on two or three
occasions, giving rise to the taurine (Bos taurus) and indicine (Bos indicus) species that
share the aurochs (Bos primigenius) as common ancestor ~250,000 years ago. Indicine
and taurine cattle were domesticated in the Indus Valley and Fertile Crescent, respectively;
however, an additional domestication event for taurine in the Western
Desert of Egypt has also been proposed. We analysed medium density Illumina
Bovine SNP array (~54,000 loci) data across 3,196 individuals, representing 180 taurine
and indicine populations to investigate population structure within and between
populations, and domestication and demographic dynamics using approximate
Bayesian computation (ABC). Comparative analyses between scenarios modelling
two and three domestication events consistently favour a model with only two episodes
and suggest that the additional genetic variation component usually detected
in African taurine cattle may be explained by hybridization with local aurochs in
Africa after the domestication of taurine cattle in the Fertile Crescent. African indicine
cattle exhibit high levels of shared genetic variation with Asian indicine cattle
due to their recent divergence and with African taurine cattle through relatively recent
gene flow. Scenarios with unidirectional or bidirectional migratory events between
European taurine and Asian indicine cattle are also plausible, although further
studies are needed to disentangle the complex human-mediated dispersion patterns
of domestic cattle. This study therefore helps to clarify the effect of past demographic
history on the genetic variation of modern cattle, providing a basis for further
analyses exploring alternative migratory routes for early domestic populations.
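Comparing demographic scenarios with approximate Bayesian computation can be illustrated with a toy rejection sampler. The Python sketch below is not the study's pipeline: the two simulators, the single summary statistic, the tolerance and the name `abc_model_choice` are all hypothetical. It draws a scenario from equal prior probabilities, simulates a summary statistic, keeps the draw if the statistic falls close to the observed value, and estimates each scenario's posterior probability from its share of accepted draws.

```python
import random


def abc_model_choice(observed, simulators, n_sims, tolerance, seed=0):
    """Rejection-ABC model comparison with equal prior model probabilities.

    simulators maps a scenario name to a function rng -> simulated summary
    statistic. Returns the share of accepted simulations per scenario.
    """
    rng = random.Random(seed)
    names = list(simulators)
    accepted = {name: 0 for name in names}
    for _ in range(n_sims):
        name = rng.choice(names)                  # draw a scenario from the prior
        simulated = simulators[name](rng)         # simulate its summary statistic
        if abs(simulated - observed) <= tolerance:
            accepted[name] += 1                   # keep draws close to the data
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulation accepted; increase tolerance or n_sims")
    return {name: count / total for name, count in accepted.items()}


if __name__ == "__main__":
    # Hypothetical scenarios that predict different values of one summary statistic.
    scenarios = {
        "two_events": lambda rng: rng.gauss(0.0, 1.0),
        "three_events": lambda rng: rng.gauss(3.0, 1.0),
    }
    print(abc_model_choice(observed=0.2, simulators=scenarios,
                           n_sims=20_000, tolerance=0.5))
```

Real applications use many summary statistics and coalescent simulations of the full demographic model, but the accept-and-count logic is the same.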
Exome sequences and multi-environment field trials elucidate the genetic basis of adaptation in barley
Broadening the genetic base of crops is crucial for developing varieties to respond to global agricultural challenges such as climate change. Here, we analysed a diverse panel of 371 domesticated lines of the model crop of barley to explore the genetics of crop adaptation. We first collected exome sequence data and phenotypes of key life history traits from contrasting multi-environment common garden trials. Then we applied refined statistical methods, including some based on exomic haplotype states, for genotype-by-environment (G×E) modelling. Sub-populations defined from exomic profiles were coincident with barley's biology, geography and history, and explained a high proportion of trial phenotypic variance. Clear G×E interactions indicated adaptation profiles that varied for landraces and cultivars. Exploration of circadian clock-related genes, associated with the environmentally-adaptive days to heading trait (crucial for the crop's spread from the Fertile Crescent), illustrated complexities in G×E effect directions, and the importance of latitudinally-based genic context in the expression of large effect alleles. Our analysis supports a gene-level scientific understanding of crop adaptation and leads to practical opportunities for crop improvement, allowing the prioritisation of genomic regions and particular sets of lines for breeding efforts seeking to cope with climate change and other stresses.
Use of SNP genotypes to identify carriers of harmful recessive mutations in cattle populations
Background
Single nucleotide polymorphism (SNP) genotype data are increasingly available in cattle populations and, among other things, can be used to predict carriers of specific mutations. It is therefore convenient to have a practical statistical method for the accurate classification of individuals into carriers and non-carriers. In this paper, we compared, through cross-validation, five classification models for the identification of carriers of the TUBD1 recessive mutation on BTA19 (Bos taurus autosome 19), known to be associated with high calf mortality: Lasso-penalized logistic regression (Lasso), Support Vector Machines with either a linear or a radial kernel (SVML and SVMR), k-nearest neighbors (KNN), and multi-allelic gene prediction (MAG). A population of 3116 Fleckvieh and 392 Brown Swiss animals genotyped with the 54K SNP-chip was available for the analysis.
Results
In general, the use of SNP genotypes proved to be very effective for the identification of mutation carriers. The best predictive models were Lasso, SVML and MAG, with an average error rate, respectively, of 0.2 %, 0.4 % and 0.6 % in Fleckvieh, and 1.2 %, 0.9 % and 1.7 % in Brown Swiss. For the three models, the false positive rate was, respectively, 0.1 %, 0.1 % and 0.2 % in Fleckvieh, and 3.0 %, 2.4 % and 1.6 % in Brown Swiss; the false negative rate was 4.4 %, 7.6 % and 1.0 % in Fleckvieh, and 0.0 %, 0.1 % and 0.8 % in Brown Swiss. MAG appeared to be more robust to sample size reduction: with 25 % of the data, the average error rate was 0.7 % and 2.2 % in Fleckvieh and Brown Swiss, compared to 2.1 % and 5.5 % with Lasso, and 2.6 % and 12.0 % with SVML.
Conclusions
The use of SNP genotypes is a very effective and efficient technique for the identification of mutation carriers in cattle populations. Very few misclassifications were observed, both overall and within the carrier and non-carrier classes. This indicates that this is a very reliable approach for potential applications in cattle breeding.
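The three error measures reported in this abstract can be computed from true and predicted carrier status as in the following Python sketch. It assumes the usual definitions (false positive rate over true non-carriers, false negative rate over true carriers); the function name `classification_rates` and the toy labels are hypothetical.

```python
def classification_rates(true_labels, predicted_labels):
    """Return overall error, false positive and false negative rates.

    Labels: 1 = carrier, 0 = non-carrier. Rates use the standard
    definitions, assumed here to match those of the paper.
    """
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 1 and predicted == 0:
            fn += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "error_rate": (fp + fn) / (tp + fp + tn + fn),
        "false_positive_rate": fp / (fp + tn),   # share of non-carriers called carriers
        "false_negative_rate": fn / (fn + tp),   # share of carriers that were missed
    }


if __name__ == "__main__":
    truth = [1, 1, 0, 0, 0, 0, 1, 0]
    predicted = [1, 0, 0, 0, 1, 0, 1, 0]
    print(classification_rates(truth, predicted))
```

Because carriers are usually a small minority, the overall error rate is dominated by the non-carrier class. This is why a model can show a 0.2 % error rate together with a 4.4 % false negative rate, as Lasso does in Fleckvieh.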
Effect of prior distributions on accuracy of genomic breeding values for two dairy traits
The ideal method to estimate direct genomic values (DGV) would calculate the conditional mean of the breeding value given the genotype of individuals at each quantitative trait locus (QTL). In this study we compare accuracies of DGV obtained using three different prior distributions of the single-nucleotide polymorphism (SNP) effects (normal, Student's t and double-exponential) in simulated data, to understand the extent of reduction in DGV accuracy when the prior distribution does not match the true distribution of QTL effects. We then apply the methods in a real dataset of 1149 Australian Holstein-Friesian bulls, both to find the prior distribution that is most robust across traits and to make interpretations about the true distribution of QTL effects. Methods using normal and Student's t prior distributions had fixed hyper-parameters, whereas hyper-parameters for the double-exponential prior distribution were conditional on the data. Using the Student's t distribution for the prior distribution of SNP effects gave the largest estimates of SNP effects in the presence of QTL with large effects in both simulated and real data, and achieved the best accuracies of DGV in both datasets. The double-exponential distribution resulted in higher shrinkage of SNP effect estimates, even when a large true effect was present. The normal distribution resulted in the greatest degree of shrinkage of estimated effects, and gave the lowest accuracies. The amount of information in the data analyzed might still be inadequate to estimate these hyper-parameters accurately. A Student's t distribution with fixed hyper-parameters was the best approximation of the QTL distribution for the two dairy traits analyzed.
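One way to see why the three priors shrink large effects differently is to compare their log-densities far from zero: the lower the prior density at a large effect, the more strongly that effect is penalised. The Python sketch below evaluates the normal, double-exponential (Laplace) and Student's t log-densities. The hyper-parameters (unit scale, 4 degrees of freedom) and the function names are assumptions chosen for illustration, not the values used in the study.

```python
import math


def log_normal(x, sd=1.0):
    """Log-density of a zero-mean normal prior."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - x ** 2 / (2 * sd ** 2)


def log_double_exponential(x, scale=1.0):
    """Log-density of a zero-mean double-exponential (Laplace) prior."""
    return -math.log(2 * scale) - abs(x) / scale


def log_student_t(x, df=4.0, scale=1.0):
    """Log-density of a zero-mean scaled Student's t prior."""
    z = x / scale
    return (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
            - 0.5 * math.log(df * math.pi) - math.log(scale)
            - (df + 1) / 2 * math.log1p(z * z / df))


if __name__ == "__main__":
    for effect in (0.5, 2.0, 10.0):
        print(effect,
              round(log_normal(effect), 3),
              round(log_double_exponential(effect), 3),
              round(log_student_t(effect), 3))
```

For sufficiently large effects the normal tail decays fastest, the double-exponential tail next, and the Student's t tail slowest. That ordering is consistent with the pattern reported above, where the normal prior shrank estimates most and the Student's t prior gave the largest estimates for large-effect QTL.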