Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems
Background: Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. Results: A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. Conclusions: sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets
Genome-Wide Analysis of the World's Sheep Breeds Reveals High Levels of Historic Mixture and Strong Recent Selection
Genomic structure in a global collection of domesticated sheep reveals a history of artificial selection for horn loss and traits relating to pigmentation, reproduction, and body size
Epilepsy Caused by an Abnormal Alternative Splicing with Dosage Effect of the SV2A Gene in a Chicken Model
Photosensitive reflex epilepsy is caused by the combination of an individual's enhanced sensitivity with relevant light stimuli, such as stroboscopic lights or video games. This is the most common reflex epilepsy in humans; it is characterized by the photoparoxysmal response, which is an abnormal electroencephalographic reaction, and seizures triggered by intermittent light stimulation. Here, by using genetic mapping, sequencing and functional analyses, we report that a mutation in the acceptor site of the second intron of SV2A (the gene encoding synaptic vesicle glycoprotein 2A) is causing photosensitive reflex epilepsy in a unique vertebrate model, the Fepi chicken strain, a spontaneous model where the neurological disorder is inherited as an autosomal recessive mutation. This mutation causes an aberrant splicing event and significantly reduces the level of SV2A mRNA in homozygous carriers. Levetiracetam, a second generation antiepileptic drug, is known to bind SV2A, and SV2A knock-out mice develop seizures soon after birth and usually die within three weeks. The Fepi chicken survives to adulthood and responds to levetiracetam, suggesting that the low-level expression of SV2A in these animals is sufficient to allow survival, but does not protect against seizures. Thus, the Fepi chicken model shows that the role of the SV2A pathway in the brain is conserved between birds and mammals, in spite of a large phylogenetic distance. The Fepi model appears particularly useful for further studies of physiopathology of reflex epilepsy, in comparison with induced models of epilepsy in rodents. Consequently, SV2A is a very attractive candidate gene for analysis in the context of both mono- and polygenic generalized epilepsies in humans
Selection Signatures in Worldwide Sheep Populations
The diversity of populations in domestic species offers great opportunities to study genome response to selection. The recently published Sheep HapMap dataset is a great example of characterization of the worldwide genetic diversity in sheep. In this study, we re-analyzed the Sheep HapMap dataset to identify selection signatures in worldwide sheep populations. Compared to previous analyses, we made use of statistical methods that (i) take account of the hierarchical structure of sheep populations, (ii) make use of linkage disequilibrium information and (iii) focus specifically on either recent or older selection signatures. We show that this allows pinpointing several new selection signatures in the sheep genome and distinguishing those related to modern breeding objectives from those related to earlier post-domestication constraints. The newly identified regions, together with the ones previously identified, reveal the extensive genome response to selection on morphology, color and adaptation to new environments
Meta-analysis of genome-wide association studies for cattle stature identifies common genes that regulate body size in mammals
peer-reviewed
H.D.D., A.J.C., P.J.B. and B.J.H. would like to acknowledge the Dairy Futures Cooperative Research Centre for funding. H.P. and R.F. acknowledge funding from the German Federal Ministry of Education and Research (BMBF) within the AgroClustEr “Synbreed – Synergistic Plant and Animal Breeding” (grant 0315527B). H.P., R.F., R.E. and K.-U.G. acknowledge the Arbeitsgemeinschaft Süddeutscher Rinderzüchter, the Arbeitsgemeinschaft Österreichischer Fleckviehzüchter and ZuchtData EDV Dienstleistungen for providing genotype data. A. Bagnato acknowledges the European Union (EU) Collaborative Project LowInputBreeds (grant agreement 222623) for providing Brown Swiss genotypes. Braunvieh Schweiz is acknowledged for providing Brown Swiss phenotypes. H.P. and R.F. acknowledge the German Holstein Association (DHV) and the Confederación de Asociaciones de Frisona Española (CONCAFE) for sharing genotype data. H.P. was financially supported by a postdoctoral fellowship from the Deutsche Forschungsgemeinschaft (DFG) (grant PA 2789/1-1). D.B. and D.C.P. acknowledge funding from the Research Stimulus Fund (11/S/112) and Science Foundation Ireland (14/IA/2576). M.S. and F.S.S. acknowledge the Canadian Dairy Network (CDN) for providing the Holstein genotypes. P.S. acknowledges funding from the Genome Canada project entitled “Whole Genome Selection through Genome Wide Imputation in Beef Cattle” and acknowledges WestGrid and Compute/Calcul Canada for providing computing resources. J.F.T. was supported by the National Institute of Food and Agriculture, US Department of Agriculture, under awards 2013-68004-20364 and 2015-67015-23183. A. Bagnato, F.P., M.D. and J.W. acknowledge EU Collaborative Project Quantomics (grant agreement 222664) for providing Brown Swiss and Finnish Ayrshire sequences and genotypes. A.C.B. and R.F.V. acknowledge funding from the public–private partnership “Breed4Food” (code BO-22.04-011-001-ASG-LR) and EU FP7 IRSES SEQSEL (grant 317697). A.C.B. and R.F.V. acknowledge CRV (Arnhem, the Netherlands) for providing data on Dutch and New Zealand Holstein and Jersey bulls.
Stature is affected by many polymorphisms of small effect in humans [1]. In contrast, variation in dogs, even within breeds, has been suggested to be largely due to variants in a small number of genes [2,3]. Here we use data from cattle to compare the genetic architecture of stature to those in humans and dogs. We conducted a meta-analysis for stature using 58,265 cattle from 17 populations with 25.4 million imputed whole-genome sequence variants. Results showed that the genetic architecture of stature in cattle is similar to that in humans, as the lead variants in 163 significantly associated genomic regions (P < 5 × 10⁻⁸) explained at most 13.8% of the phenotypic variance. Most of these variants were noncoding, including variants that were also expression quantitative trait loci (eQTLs) and in ChIP-seq peaks. There was significant overlap in loci for stature with humans and dogs, suggesting that a set of common genes regulates body size in mammals