Short communication: Imputing genotypes using PedImpute fast algorithm combining pedigree and population information
Routine genomic evaluations frequently include a preliminary imputation step, requiring high accuracy and reduced computing time. A new algorithm, PedImpute (http://dekoppel.eu/pedimpute/), was developed and compared with findhap (http://aipl.arsusda.gov/software/findhap/) and BEAGLE (http://faculty.washington.edu/browning/beagle/beagle.html), using 19,904 Holstein genotypes from a 4-country international collaboration (United States, Canada, UK, and Italy). Different scenarios were evaluated on a sample subset that included only single nucleotide polymorphisms from the Bovine low-density (LD) Illumina BeadChip (Illumina Inc., San Diego, CA). Comparative criteria were computing time, percentage of missing alleles, percentage of wrongly imputed alleles, and the allelic squared correlation. Imputation accuracy on ungenotyped animals was also analyzed. PedImpute was slightly more accurate and faster than findhap and BEAGLE when sire, dam, and maternal grandsire were genotyped at high density. On the other hand, BEAGLE performed better than both PedImpute and findhap for animals with at least one close relative not genotyped or genotyped at low density. However, computing time and resources using BEAGLE were incompatible with routine genomic evaluations in Italy. Error rate and allelic squared correlation attained by PedImpute ranged from 0.2 to 1.1% and from 96.6 to 99.3%, respectively. When complete genomic information on sire, dam, and maternal grandsire is available, as is expected to be the case in the near future in (at least) dairy cattle, and considering the accuracies obtained and computation time required, PedImpute represents a valuable choice in routine evaluations among the algorithms tested.
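The comparison criteria named in this abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, 2), that `None` marks a genotype the imputation left missing, and that wrongly imputed alleles are counted as the absolute difference in allele count. The function name `imputation_metrics` and these definitions are illustrative assumptions and may differ from the ones used in the paper.

```python
from math import nan


def imputation_metrics(true_geno, imputed_geno):
    """Compare imputed genotypes with true ones (allele counts 0/1/2).

    Assumed conventions (not taken from the paper): None in imputed_geno
    means "left missing", and allele errors are |true - imputed|.
    """
    if len(true_geno) != len(imputed_geno) or not true_geno:
        raise ValueError("genotype lists must be non-empty and equally long")

    called = [(t, g) for t, g in zip(true_geno, imputed_geno) if g is not None]
    n_total, n_called = len(true_geno), len(called)

    # Each genotype carries two alleles, so the ratio is the same per genotype.
    missing_pct = 100.0 * (n_total - n_called) / n_total
    if n_called == 0:
        return {"missing_pct": missing_pct, "error_pct": nan, "r2": nan}

    wrong_alleles = sum(abs(t - g) for t, g in called)
    error_pct = 100.0 * wrong_alleles / (2 * n_called)

    # Squared Pearson correlation between true and imputed allele counts.
    mean_t = sum(t for t, _ in called) / n_called
    mean_g = sum(g for _, g in called) / n_called
    cov = sum((t - mean_t) * (g - mean_g) for t, g in called)
    var_t = sum((t - mean_t) ** 2 for t, _ in called)
    var_g = sum((g - mean_g) ** 2 for _, g in called)
    r2 = cov * cov / (var_t * var_g) if var_t > 0 and var_g > 0 else nan

    return {"missing_pct": missing_pct, "error_pct": error_pct, "r2": r2}


# Made-up genotypes for one animal at six markers, for illustration only.
print(imputation_metrics([0, 1, 2, 2, 1, 0], [0, 1, 2, 1, None, 0]))
```

The squared correlation is less sensitive to allele frequency than the raw error rate, which is one reason both are often reported together.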
Haplotype Inference on Pedigrees with Recombinations, Errors, and Missing Genotypes via SAT solvers
The Minimum-Recombinant Haplotype Configuration problem (MRHC) has been
highly successful in providing a sound combinatorial formulation for the
important problem of genotype phasing on pedigrees. Despite several algorithmic
advances and refinements that led to some efficient algorithms, its
applicability to real datasets has been limited by the absence of some
important characteristics of these data in its formulation, such as mutations,
genotyping errors, and missing data.
In this work, we propose the Haplotype Configuration with Recombinations and
Errors problem (HCRE), which generalizes the original MRHC formulation by
incorporating the two most common characteristics of real data: errors and
missing genotypes (including untyped individuals). Although HCRE is
computationally hard, we propose an exact algorithm for the problem based on a
reduction to the well-known Satisfiability problem. Our reduction exploits
recent progress in the constraint programming literature and, combined with
the use of state-of-the-art SAT solvers, provides a practical solution for the
HCRE problem. Biological soundness of the phasing model and effectiveness (on
both accuracy and performance) of the algorithm are experimentally demonstrated
under several simulated scenarios and on a real dairy cattle population.
Comment: 14 pages, 1 figure, 4 tables, the associated software reHCstar is
available at http://www.algolab.eu/reHCsta
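To make the trade-off between recombinations and errors concrete, here is a toy sketch for a single parent-to-child transmission. It is not the HCRE formulation or the SAT reduction described above: it assumes the parent's two haplotypes are already phased (which the real problem does not), and it prefers fewer errors over fewer recombinations, which is an arbitrary choice made for illustration. The function name `explain_child_haplotype` is hypothetical.

```python
def explain_child_haplotype(parent_haps, child_hap):
    """Explain a transmitted haplotype as a mosaic of a parent's two haplotypes.

    parent_haps: (h0, h1), two lists of 0/1 alleles (assumed already phased).
    child_hap:   list of 0/1 alleles, with None for a missing allele.
    Returns (errors, recombinations), minimised lexicographically.
    """
    h0, h1 = parent_haps
    if not (len(h0) == len(h1) == len(child_hap)):
        raise ValueError("all haplotypes must have the same number of loci")
    if not child_hap:
        return (0, 0)

    def mismatch(source, locus):
        allele = child_hap[locus]
        if allele is None:  # a missing allele is compatible with any source
            return 0
        return int(parent_haps[source][locus] != allele)

    # cost[s] = best (errors, recombinations) so far, ending on source s.
    cost = [(mismatch(s, 0), 0) for s in (0, 1)]
    for locus in range(1, len(child_hap)):
        new_cost = []
        for s in (0, 1):
            stay = cost[s]
            switch = (cost[1 - s][0], cost[1 - s][1] + 1)  # one recombination
            best = min(stay, switch)  # tuples compare lexicographically
            new_cost.append((best[0] + mismatch(s, locus), best[1]))
        cost = new_cost
    return min(cost)


# Made-up haplotypes, for illustration only.
print(explain_child_haplotype(([0, 0, 0, 0], [1, 1, 1, 1]), [0, 0, 1, 1]))
```

On a full pedigree the haplotypes of every individual are unknown at once, which is what makes the problem computationally hard and motivates a reduction to SAT rather than a simple scan like this one.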
Body measurements from selective hunting: biometric features of red deer (Cervus elaphus) from Northern Apennine, Italy
Morphometric studies on European red deer (Cervus elaphus L.) living in sub-Mediterranean areas are rare. In this paper, we provide the first morphometric description of Apennine red deer living in Prato Province, as well as a description of their skeletal growth pattern. We analysed 18 body, cranial and antler measurements from 905 deer carcasses, collected during 12 hunting seasons (2000–2012). The body size of red deer from Prato appeared comparable to that of other populations from the Northern Apennines and Central Alps. A significant variation in weight during the hunting season was detected only in adult stags: they were estimated to lose 23% of their eviscerated body weight from the beginning of the rutting season until the end of winter. The relationship between eviscerated body weight (EW) and whole body weight (WW) was highly significant in both sexes within every age class (R² always higher than 0.75), so linear regressions were fitted to estimate EW from WW, allowing datasets to be completed when such information is missing. Growth equations were used to describe the development of a subset of skeletal measures (height at shoulder, hind foot length, mandible length, head–trunk length) commonly collected on hunted cervids. Hind foot length was the first measure to cease growing and had the highest growth constant; although the relationship between cohort hind foot length and environmental, climatic and demographic variables has yet to be tested for Apennine red deer, this measure appeared to be a suitable biological indicator for long-term monitoring of the species.
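The idea of filling in a missing eviscerated weight from a whole weight can be sketched with an ordinary least-squares line. The weights below are invented for illustration and are not data from the study; the function names `fit_line` and `estimate_ew` are hypothetical, and in practice a separate line would be fitted for each sex and age class, as the abstract describes.

```python
from math import nan


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else nan
    return slope, intercept, r_squared


def estimate_ew(whole_weight, slope, intercept):
    """Predict eviscerated weight (EW) from whole weight (WW)."""
    return intercept + slope * whole_weight


# Invented weights in kg for one hypothetical sex and age class.
ww = [98.0, 121.0, 139.0, 162.0]
ew = [71.0, 86.0, 99.0, 116.0]
slope, intercept, r2 = fit_line(ww, ew)
print(slope, intercept, r2, estimate_ew(150.0, slope, intercept))
```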
Relationship among Milk Conductivity, Production Traits, and Somatic Cell Score in the Italian Mediterranean Buffalo
The measurement of milk electrical conductivity (EC) is a relatively simple and inexpensive technique that has been evaluated as a routine method for the diagnosis of mastitis on dairy farms. The aim of this study was to obtain further knowledge on relationships between EC, production traits and somatic cell count (SCC) in Italian Mediterranean Buffalo. The original dataset included 5411 records collected from 808 buffalo cows. Two mixed models were used to evaluate both the effect of EC on MY, PP and FP and EC at test-day, and the effect of EC on somatic cell score (SCS) by using five different parameters (EC_param), namely: EC collected at the official milk recording test day (EC_day0), EC collected 3 days before official milk recording (EC_day3), and three statistics calculated from EC collected 1, 3 and 5 days before each test-day, respectively. All effects included in the model were significant for all traits, with the only exception of the effect of EC nested within parity for FP. The relationship between EC and SCS was always positive, but of different magnitude according to parity. The regression of EC on SCS at test-day using different EC parameters was always significant except when the regression parameter was the slope obtained from a linear regression of EC collected over the 5-day period. Moreover, to evaluate how well the different models fit the data, three parameters were used: the Akaike Information Criterion (AIC), the marginal R² and the conditional R². According to the AIC and to both the marginal and conditional R², the best results were obtained when the regression parameter was the mean EC estimated over the 5-day period.
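Two of the EC summaries mentioned above, the mean and the regression slope over a window of daily readings, can be sketched as follows, together with the standard AIC formula (2k minus twice the log-likelihood, lower is better). The readings are invented, the function names `ec_window_summary` and `aic` are hypothetical, and the sketch does not reproduce the mixed models fitted in the study.

```python
def ec_window_summary(ec_values):
    """Summarise daily EC readings, ordered from oldest to most recent.

    Returns the window mean and the least-squares slope of EC on day index
    (day 0 = oldest reading), i.e. the average change in EC per day.
    """
    n = len(ec_values)
    if n < 2:
        raise ValueError("need at least two daily readings")
    mean_day = (n - 1) / 2
    mean_ec = sum(ec_values) / n
    sxx = sum((d - mean_day) ** 2 for d in range(n))
    sxy = sum((d - mean_day) * (v - mean_ec) for d, v in enumerate(ec_values))
    return {"mean": mean_ec, "slope": sxy / sxx}


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower means a better fit."""
    return 2 * n_params - 2 * log_likelihood


# Invented 5-day EC readings (mS/cm) before a hypothetical test day.
print(ec_window_summary([5.1, 5.0, 5.3, 5.6, 5.9]))
```

A window mean smooths day-to-day noise, whereas a slope estimated from only five points is itself noisy, which is consistent with the abstract's finding that the 5-day mean gave the best fit and the 5-day slope was not significant.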