    Comparison of Computational Models for Assessing Conservation of Gene Expression across Species

    Assessing conservation and divergence of gene expression across species is important for understanding the evolution of gene regulation. Although advances in microarray technology have provided massive high-dimensional gene expression data, the analysis of such data remains challenging. To date, assessing cross-species conservation of gene expression using microarray data has been based mainly on comparing expression patterns across corresponding tissues, or on comparing the co-expression of a gene with a reference set of genes. Because direct and reliable high-throughput experimental data on conservation of gene expression are often unavailable, assessing these two computational models is very challenging and has not yet been reported. In this study, we compared one corresponding-tissue-based method and three co-expression-based methods for assessing conservation of gene expression, in terms of their pairwise agreement, using a frequently used human-mouse tissue expression dataset. We find that 1) the co-expression-based methods are only moderately correlated with the corresponding-tissue-based methods, 2) the reliability of co-expression-based methods is affected by the size of the reference ortholog set, and 3) the corresponding-tissue-based methods may lose some information for assessing conservation of gene expression. We suggest that using either of these two computational models alone to study the evolution of a gene's expression may be subject to great uncertainty, and that it is necessary to investigate changes both in gene expression patterns over corresponding tissues and in the co-expression of the gene with other genes.
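
    The two families of methods compared above can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure and uses hypothetical function names and toy data; the abstract does not state which statistics the four compared methods actually use, so this is an illustration of the general idea rather than the authors' procedure.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def tissue_based_score(gene, human, mouse):
    """Correlate the ortholog pair's profiles over corresponding tissues.

    Requires the tissues to be in the same order in both species.
    """
    return pearson(human[gene], mouse[gene])


def coexpression_based_score(gene, human, mouse, reference):
    """Correlate the gene's co-expression with a reference ortholog set.

    Within each species, the gene is correlated with every reference
    gene; the two resulting vectors are then compared across species.
    Tissues do not need to correspond between species for this score.
    """
    human_vec = [pearson(human[gene], human[r]) for r in reference]
    mouse_vec = [pearson(mouse[gene], mouse[r]) for r in reference]
    return pearson(human_vec, mouse_vec)


# Toy expression profiles (hypothetical values, four tissues per species).
human = {"g": [1, 2, 3, 4], "r1": [1, 2, 3, 5], "r2": [4, 3, 2, 1], "r3": [1, 3, 2, 4]}
mouse = {"g": [2, 3, 5, 4], "r1": [1, 3, 4, 5], "r2": [5, 3, 2, 2], "r3": [2, 3, 1, 4]}
reference = ["r1", "r2", "r3"]

print(tissue_based_score("g", human, mouse))
print(coexpression_based_score("g", human, mouse, reference))
```

    In this sketch the co-expression score is a correlation over as many points as there are reference genes, so a small reference set gives a noisy estimate, which is in line with the second finding above.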

    Comparison of Two Output-Coding Strategies for Multi-Class Tumor Classification Using Gene Expression Data and Latent Variable Model as Binary Classifier

    Multi-class cancer classification based on microarray data is described. A generalized output-coding scheme based on One Versus One (OVO) combined with a Latent Variable Model (LVM) is used. Results from the proposed OVO output-coding strategy are compared with those obtained from the generalized One Versus All (OVA) method, and the efficiency of both for multi-class tumor classification is studied. This comparative study used two microarray gene expression datasets: the Global Cancer Map (GCM) dataset and a brain cancer (BC) dataset. Primary feature selection was based on fold change and penalized t-statistics. Evaluation was conducted with varying numbers of features. The OVO coding strategy worked quite well with the BC data, while OVO and OVA results appeared similar for the GCM data. The choice of output-coding method for combining binary classifiers in multi-class tumor classification depends on the number of tumor types considered, the discrepancies between the tumor samples used for training, and the heterogeneity of expression within the cancer subtypes used as training data.
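
    The difference between the two output-coding strategies can be sketched as follows. The class labels, the function names, and the majority-vote decoding rule are assumptions made for illustration; the abstract does not describe how the LVM outputs are combined, and no classifier is trained here.

```python
from collections import Counter
from itertools import combinations


def ova_tasks(classes):
    """One Versus All: one binary problem per class (class vs. the rest)."""
    return [(c, "rest") for c in classes]


def ovo_tasks(classes):
    """One Versus One: one binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def ovo_decode(pairwise_winners):
    """Combine OVO outputs by majority vote (an assumed decoding rule).

    pairwise_winners maps each class pair to the class its binary
    classifier predicted. Ties go to the class that was counted first.
    """
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


classes = ["A", "B", "C", "D"]  # hypothetical tumor types
print(len(ova_tasks(classes)), "OVA problems")  # k problems
print(len(ovo_tasks(classes)), "OVO problems")  # k * (k - 1) / 2 problems

# Hypothetical outputs of the six pairwise classifiers for one sample.
winners = {
    ("A", "B"): "A", ("A", "C"): "C", ("A", "D"): "A",
    ("B", "C"): "C", ("B", "D"): "B", ("C", "D"): "C",
}
print(ovo_decode(winners))
```

    OVO trains more classifiers than OVA as the number of tumor types grows, but each one sees only the samples of two classes, which is one reason the preferred strategy can depend on the number of tumor types.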

    Comparison of Methods for Handling Censored Records in Beef Fertility Data: Field Data

    The purpose of this study was to compare methods for handling censored days to calving records in beef cattle data, and verify results of an earlier simulation study. Data were records from natural-service matings of 33,176 first-calf females in Australian Angus herds. Three methods for handling censored records were evaluated. Censored records (records on noncalving females) were assigned penalty values on a within-contemporary group basis under the first method (DCPEN). Under the second method (DCSIM), censored records were drawn from their respective predictive truncated normal distributions, whereas censored records were deleted under the third method (DCMISS). Data were analyzed using a mixed linear model that included the fixed effects of contemporary group and sex of calf, linear and quadratic covariates for age at mating, and random effects of animal and residual error. A Bayesian approach via Gibbs sampling was used to estimate variance components and predict breeding values. Posterior means (PM) (SD) of additive genetic variance for DCPEN, DCSIM, and DCMISS were 22.6 d² (4.2 d²), 26.1 d² (3.6 d²), and 13.5 d² (2.9 d²), respectively. The PM (SD) of residual variance for DCPEN, DCSIM, and DCMISS were 431.4 d² (5.0 d²), 371.4 d² (4.5 d²), and 262.2 d² (3.4 d²), respectively. The PM (SD) of heritability for DCPEN, DCSIM, and DCMISS were 0.05 (0.01), 0.07 (0.01), and 0.05 (0.01), respectively. Simulating trait records for noncalving females resulted in similar heritability to the penalty method but lower residual variance. Pearson correlations between posterior means of animal effects for sires with more than 20 daughters with records were 0.99 between DCPEN and DCSIM, 0.77 between DCPEN and DCMISS, and 0.81 between DCSIM and DCMISS. Of the 424 sires ranked in the top 10% and bottom 10% of sires in DCPEN, 91% and 89%, respectively, were also ranked in the top 10% and bottom 10% in DCSIM. Little difference was observed between DCPEN and DCSIM for correlations between posterior means of animal effects for sires, indicating that no major reranking of sires would be expected. This finding suggests little difference between these two censored data handling techniques for use in genetic evaluation of days to calving.
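
    As a rough consistency check on the reported values, heritability can be approximated as the ratio of additive genetic variance to the sum of additive and residual variance, since animal and residual are the only random effects in the model. The sketch below plugs in the posterior means quoted above. It is an approximation: the posterior mean of heritability is computed per Gibbs sample and need not equal the ratio of posterior means exactly. The function name is hypothetical.

```python
def heritability(additive_var, residual_var):
    """Narrow-sense heritability for a model with animal and residual effects."""
    return additive_var / (additive_var + residual_var)


# Posterior means of (additive, residual) variance in d², from the abstract.
posterior_means = {
    "DCPEN": (22.6, 431.4),
    "DCSIM": (26.1, 371.4),
    "DCMISS": (13.5, 262.2),
}

for method, (var_a, var_e) in posterior_means.items():
    print(method, round(heritability(var_a, var_e), 2))
```

    The ratios round to 0.05, 0.07, and 0.05, matching the reported posterior means of heritability.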

    Effect of Diet Supplementation on the Expression of Bovine Genes Associated with Fatty Acid Synthesis and Metabolism

    Conjugated linoleic acids (CLA) offer important nutritional and health benefits to humans. Food products of animal origin are their major dietary source and their concentration increases with high concentrate diets fed to animals. To examine the effects of diet supplementation on the expression of genes related to lipid metabolism, 28 Angus steers were fed either pasture only, pasture with soybean hulls and corn oil, pasture with corn grain, or a high concentrate diet. At slaughter, samples of subcutaneous adipose tissue were collected, from which RNA was extracted. Relative abundance of gene expression was measured using the Affymetrix GeneChip Bovine Genome array. An ANOVA model nested within gene was used to analyze the background-adjusted, normalized average difference of probe-level intensities. To control the experiment-wise error rate, a false discovery rate of 0.01 was imposed on all contrasts. Several genes involved in the synthesis of enzymes related to fatty acid metabolism and lipogenesis, such as stearoyl-CoA desaturase (SCD), fatty acid synthetase (FASN), lipoprotein lipase (LPL), and fatty-acyl elongase (LCE), along with several transcription factors and co-activators involved in lipogenesis, were found to be differentially expressed. Confirmatory RT-qPCR was done to validate the microarray results and showed satisfactory correspondence between the two platforms. Results show that increasing dietary energy intake by supplementing a high concentrate diet affects the transcription of genes encoding enzymes involved in fat metabolism, which in turn affects fatty acid content in the carcass tissue as well as carcass quality. Corn supplementation, either as oil or as grain, appeared to significantly alter the expression of genes directly associated with fatty acid synthesis.
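
    The abstract does not say which procedure was used to impose the false discovery rate of 0.01. A common choice is the Benjamini-Hochberg step-up procedure, sketched below purely as an assumption; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.01):
    """Return a list of booleans: True where the contrast is declared significant.

    Benjamini-Hochberg step-up rule: find the largest rank k such that
    p_(k) <= fdr * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four contrasts.
print(benjamini_hochberg([0.0001, 0.004, 0.03, 0.5], fdr=0.01))
```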

    Model for fitting longitudinal traits subject to threshold response applied to genetic evaluation for heat tolerance

    A semi-parametric non-linear longitudinal hierarchical model is presented. The model assumes that individual variation exists both in the degree of the linear change of performance (slope) beyond a particular threshold of the independent variable scale and in the magnitude of the threshold itself; these individual variations are attributed to genetic and environmental components. During implementation via a Bayesian MCMC approach, threshold levels were sampled using a Metropolis step because their fully conditional posterior distributions do not have a closed form. The model was tested by simulation following designs similar to previous studies on genetics of heat stress. Posterior means of parameters of interest, under all simulation scenarios, were close to their true values, with the latter always being included in the uncertainty regions, indicating an absence of bias. The proposed models provide flexible tools for studying genotype by environment interaction as well as for fitting other longitudinal traits subject to abrupt changes in performance at particular points on the independent variable scale.
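
    The response shape described above (no change below a threshold, linear change beyond it, with both the threshold and the slope varying between individuals) can be written as a broken-line function. The sketch below shows only this deterministic mean function with hypothetical parameter names and values; it is not the hierarchical model or its MCMC implementation.

```python
def threshold_response(x, baseline, slope, threshold):
    """Expected performance at covariate value x (e.g. a heat index).

    Performance stays at baseline up to the threshold and then changes
    linearly with the given slope beyond it.
    """
    return baseline + slope * max(0.0, x - threshold)


# Two hypothetical animals with individual thresholds and slopes.
animals = {
    "tolerant": {"baseline": 30.0, "slope": -0.25, "threshold": 76.0},
    "sensitive": {"baseline": 30.0, "slope": -0.5, "threshold": 72.0},
}

for name, params in animals.items():
    profile = [threshold_response(x, **params) for x in (70, 74, 78, 82)]
    print(name, profile)
```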

    AntEpiSeeker: detecting epistatic interactions for case-control studies using a two-stage ant colony optimization algorithm

    Background: Epistatic interactions of multiple single nucleotide polymorphisms (SNPs) are now believed to affect individual susceptibility to common diseases. The detection of such interactions, however, is a challenging task in large scale association studies. Ant colony optimization (ACO) algorithms have been shown to be useful in detecting epistatic interactions. Findings: AntEpiSeeker, a new two-stage ant colony optimization algorithm, has been developed for detecting epistasis in a case-control design. Based on some practical epistatic models, AntEpiSeeker has performed very well. Conclusions: AntEpiSeeker is a powerful and efficient tool for large-scale association studies and can be downloaded from http://nce.ads.uga.edu/~romdhane/AntEpiSeeker/index.html.

    Principal component approach in variance component estimation for international sire evaluation

    Background: The dairy cattle breeding industry is a highly globalized business, which needs internationally comparable and reliable breeding values of sires. The international Bull Evaluation Service, Interbull, was established in 1983 to respond to this need. Currently, Interbull performs multiple-trait across country evaluations (MACE) for several traits and breeds in dairy cattle and provides international breeding values to its member countries. Estimating parameters for MACE is challenging since the structure of datasets and conventional use of multiple-trait models easily result in over-parameterized genetic covariance matrices. The number of parameters to be estimated can be reduced by taking into account only the leading principal components of the traits considered. For MACE, this is readily implemented in a random regression model. Methods: This article compares two principal component approaches to estimate variance components for MACE using real datasets. The methods tested were a REML approach that directly estimates the genetic principal components (direct PC) and the so-called bottom-up REML approach (bottom-up PC), in which traits are sequentially added to the analysis and the statistically significant genetic principal components are retained. Furthermore, this article evaluates the utility of the bottom-up PC approach to determine the appropriate rank of the (co)variance matrix. Results: Our study demonstrates the usefulness of both approaches and shows that they can be applied to large multi-country models considering all concerned countries simultaneously. These strategies can thus replace the current practice of estimating the covariance components required through a series of analyses involving selected subsets of traits. Our results support the importance of using the appropriate rank in the genetic (co)variance matrix. Using too low a rank resulted in biased parameter estimates, whereas too high a rank did not result in bias, but increased standard errors of the estimates and notably the computing time. Conclusions: In terms of estimation accuracy, both principal component approaches performed equally well and permitted the use of more parsimonious models through random regression MACE. The advantage of the bottom-up PC approach is that it does not need any previous knowledge of the rank. However, with a predetermined rank, the direct PC approach needs less computing time than the bottom-up PC.
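
    The saving from keeping only the leading principal components can be made concrete by counting free parameters. The sketch below uses the standard counting result for a covariance matrix of reduced rank; the formula is general linear algebra rather than something stated in the abstract, and the number of traits (countries) and the rank are hypothetical.

```python
def full_rank_parameters(n_traits):
    """Free parameters in an unstructured n_traits x n_traits covariance matrix."""
    return n_traits * (n_traits + 1) // 2


def reduced_rank_parameters(n_traits, rank):
    """Free parameters when only `rank` principal components are fitted.

    Counting result r * (2t - r + 1) / 2 for a rank-r covariance matrix
    of t traits; it equals the full count when rank == n_traits.
    """
    if not 1 <= rank <= n_traits:
        raise ValueError("rank must be between 1 and n_traits")
    return rank * (2 * n_traits - rank + 1) // 2


n_countries = 25  # hypothetical: each country is treated as a trait in MACE
print(full_rank_parameters(n_countries))
print(reduced_rank_parameters(n_countries, rank=5))
```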

    Higher heritabilities for gait components than for overall gait scores may improve mobility in ducks

    Background: Genetic progress in selection for greater body mass and meat yield in poultry has been associated with an increase in gait problems which are detrimental to productivity and welfare. The incidence of suboptimal gait in breeding flocks is controlled through the use of a visual gait score, which is a subjective assessment of walking ability of each bird. The subjective nature of the visual gait score has led to concerns over its effectiveness in reducing the incidence of suboptimal gait in poultry through breeding. The aims of this study were to assess the reliability of the current visual gait scoring system in ducks and to develop a more objective method to select for better gait. Results: Experienced gait scorers assessed short video clips of walking ducks to estimate the reliability of the current visual gait scoring system. Kendall’s coefficients of concordance between and within observers were estimated at 0.49 and 0.75, respectively. In order to develop a more objective scoring system, gait components were visually scored on more than 4000 pedigreed Pekin ducks and genetic parameters were estimated for these components. Gait components, which are a more objective measure, had heritabilities that were as good as, or better than, those of the overall visual gait score. Conclusions: Measurement of gait components is simpler and therefore more objective than the standard visual gait score. The recording of gait components can potentially be automated, which may increase accuracy further and may improve heritability estimates. Genetic correlations were generally low, which suggests that it is possible to use gait components to select for an overall improvement in both economic traits and gait as part of a balanced breeding programme.
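
    Kendall's coefficient of concordance (W), used above to measure agreement between and within observers, can be computed from rank sums. The sketch below is the basic form without a correction for tied ranks; gait scores on a short ordinal scale would usually contain ties, and the abstract does not say how they were handled, so this is an illustration with hypothetical data rather than the study's exact estimator.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters who each rank the same n items (no ties).

    rankings is a list of m lists, each holding the ranks 1..n that one
    rater gave to the n items. W is 1 for perfect agreement and 0 when
    the rank sums of all items are equal.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_rank_sum = sum(rank_sums) / n
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Three hypothetical scorers ranking four video clips.
scores = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
print(kendalls_w(scores))
```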
