1,534 research outputs found
Solving Inverse Problems with Piecewise Linear Estimators: From Gaussian Mixture Models to Structured Sparsity
A general framework for solving image inverse problems is introduced in this
paper. The approach is based on Gaussian mixture models, estimated via a
computationally efficient MAP-EM algorithm. A dual mathematical interpretation
of the proposed framework with structured sparse estimation is described, which
shows that the resulting piecewise linear estimate stabilizes the estimation
when compared to traditional sparse inverse problem techniques. This
interpretation also suggests an effective dictionary motivated initialization
for the MAP-EM algorithm. We demonstrate that in a number of image inverse
problems, including inpainting, zooming, and deblurring, the same algorithm
produces results that are equal to, often significantly better than, or only
marginally worse than the best published ones, at a lower computational cost.
Comment: 30 pages
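The piecewise linear estimate described in this abstract can be illustrated with a minimal sketch. It makes strong assumptions that are not in the abstract: pure denoising (identity degradation operator), zero-mean Gaussian components with diagonal covariances, equal component weights, and a selection rule that minimises the joint negative log-posterior. Only the estimation step is shown; the covariance update of a MAP-EM iteration is omitted, and all function names are hypothetical.

```python
import math


def map_estimate(y, variances, noise_var):
    """Linear (Wiener-type) MAP estimate under one zero-mean Gaussian component.

    Assumes y = x + noise, with a diagonal prior covariance `variances`.
    """
    return [lam / (lam + noise_var) * yi for lam, yi in zip(variances, y)]


def map_cost(y, x_hat, variances, noise_var):
    """Twice the negative log-posterior of (x_hat, component), up to a constant."""
    residual = sum((xi - yi) ** 2 for xi, yi in zip(x_hat, y)) / noise_var
    prior = sum(xi * xi / lam for xi, lam in zip(x_hat, variances))
    log_det = sum(math.log(lam) for lam in variances)
    return residual + prior + log_det


def piecewise_linear_estimate(y, components, noise_var):
    """Compute one linear estimate per component and keep the lowest-cost one."""
    best = None
    for k, variances in enumerate(components):
        x_hat = map_estimate(y, variances, noise_var)
        cost = map_cost(y, x_hat, variances, noise_var)
        if best is None or cost < best[0]:
            best = (cost, k, x_hat)
    return best[1], best[2]


# Two illustrative components: energy concentrated on the first or second axis.
COMPONENTS = [[4.0, 0.01], [0.01, 4.0]]
k_hat, x_hat = piecewise_linear_estimate([2.0, 0.1], COMPONENTS, noise_var=0.25)
```

Each component yields a linear filter, and the nonlinearity comes only from choosing the component, which is why the overall estimator is piecewise linear.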
On some limitations of probabilistic models for dimension-reduction: illustration in the case of one particular probabilistic formulation of PLS
Partial Least Squares (PLS) refers to a class of dimension-reduction
techniques aiming at the identification of two sets of components with maximal
covariance, in order to model the relationship between two sets of observed
variables, $x$ and $y$.
El Bouhaddani et al. (2017) have recently proposed a probabilistic formulation
of PLS. Under the constraints they consider for the parameters of their model,
the latter can be seen as a probabilistic formulation of one version of PLS,
namely PLS-SVD. However, we establish that these constraints are too
restrictive, as they define a very particular subset of distributions for
$(x, y)$ under which, roughly speaking, components with maximal covariance
(solutions of PLS-SVD) are also necessarily of respective maximal variances
(solutions of the principal component analyses of $x$ and $y$, respectively).
Then, we propose a simple extension of el Bouhaddani et al.'s model, which
corresponds to a more general probabilistic formulation of PLS-SVD, and which
is no longer restricted to these particular distributions. We present numerical
examples to illustrate the limitations of the original model of el Bouhaddani
et al. (2017).
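To make the distinction between maximal covariance and maximal variance concrete, here is a minimal sketch of the first PLS-SVD component pair. It is computed as the leading singular vectors of the sample cross-covariance matrix, using plain power iteration. The toy data, the function names, and the choice of power iteration are illustrative assumptions, not the authors' model or code.

```python
import math


def centered_columns(rows):
    """Return the columns of `rows` with each column mean removed."""
    n = len(rows)
    cols = [list(c) for c in zip(*rows)]
    return [[v - sum(c) / n for v in c] for c in cols]


def cross_covariance(x_rows, y_rows):
    """Sample cross-covariance matrix (p x q) between the x and y variables."""
    n = len(x_rows)
    xc, yc = centered_columns(x_rows), centered_columns(y_rows)
    return [[sum(a * b for a, b in zip(xi, yj)) / (n - 1) for yj in yc]
            for xi in xc]


def normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        raise ValueError("zero vector: try another starting point")
    return [a / norm for a in v]


def first_pls_svd_pair(x_rows, y_rows, iters=200):
    """Unit weight vectors (u, v) maximising cov(x.u, y.v), via power iteration."""
    c = cross_covariance(x_rows, y_rows)
    p, q = len(c), len(c[0])
    v = normalize([1.0] * q)
    for _ in range(iters):
        u = normalize([sum(c[i][j] * v[j] for j in range(q)) for i in range(p)])
        v = normalize([sum(c[i][j] * u[i] for i in range(p)) for j in range(q)])
    cov = sum(u[i] * c[i][j] * v[j] for i in range(p) for j in range(q))
    return u, v, cov


# Toy data: the first x variable has a large variance but is uncorrelated with y;
# the second x variable has a small variance and equals the first y variable.
X = [[10, 1], [-10, 1], [10, -1], [-10, -1]]
Y = [[1, 1], [1, -1], [-1, -1], [-1, 1]]
u, v, cov = first_pls_svd_pair(X, Y)
```

In this toy data set the PLS-SVD weight vector for x points along the second variable, whereas the first principal component of x would point along the first. The two solutions therefore differ in general, which is the situation the constrained model excludes according to the abstract.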
Parametric and Semi-parametric Estimations of the Return to Schooling in South Africa
This paper estimates the return to schooling for African and Coloured women in South Africa. It compares parametric and semiparametric estimates of the sample selection model for the case of return to schooling. The parametric estimator is the one proposed by Heckman (1979); the semiparametric estimators are those proposed by Newey (1991) and Klein and Spady (1993). It also attempts to correct for endogeneity and measurement error by using instruments for schooling. Following recent literature, the paper uses community variables (primary and secondary school proximity and availability) as instruments. Using instrumental variables increases the return to schooling substantially. Parametric corrections do not change the results, but semiparametric corrections increase the return even more. Keywords: return to schooling, sample selection bias, semiparametric regression, instrumental variables, South Africa
A geometric relationship of F2, F3 and F4-statistics with principal component analysis
Principal component analysis (PCA) and F-statistics sensu Patterson are two of the most widely used population genetic tools to study human genetic variation. Here, I derive explicit connections between the two approaches and show that these two methods are closely related. F-statistics have a simple geometrical interpretation in the context of PCA, and orthogonal projections are a key concept to establish this link. I show that for any pair of populations, any population that is admixed as determined by an F3-statistic will lie inside a circle on a PCA plot. Furthermore, the F4-statistic is closely related to an angle measurement, and will be zero if the differences between pairs of populations intersect at a right angle in PCA space. I illustrate my results on two examples, one of Western Eurasian, and one of global human diversity. In both examples, I find that the first few PCs are sufficient to approximate most F-statistics, and that PCA plots are effective at predicting F-statistics. Thus, while F-statistics are commonly understood in terms of discrete populations, the geometric perspective illustrates that they can be viewed in a framework of populations that vary in a more continuous manner. This article is part of the theme issue ‘Celebrating 50 years since Lewontin's apportionment of human diversity’.
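The geometric statements in this abstract can be checked numerically with a small sketch. It assumes the usual inner-product definitions of F2, F3 and F4 on vectors of population allele frequencies, without normalisation by the number of SNPs and without sampling-bias corrections. The function names and the toy frequencies are hypothetical.

```python
def diff(u, v):
    return [a - b for a, b in zip(u, v)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def f2(a, b):
    """Squared Euclidean distance between two populations."""
    d = diff(a, b)
    return dot(d, d)


def f3(x, a, b):
    """Inner product of (x - a) and (x - b); negative values suggest admixture."""
    return dot(diff(x, a), diff(x, b))


def f4(a, b, c, d):
    """Inner product of (a - b) and (c - d); zero when they are orthogonal."""
    return dot(diff(a, b), diff(c, d))


def inside_circle(x, a, b):
    """True if x lies strictly inside the sphere whose diameter is the segment a-b."""
    midpoint = [(ai + bi) / 2 for ai, bi in zip(a, b)]
    return f2(x, midpoint) < f2(a, b) / 4


# Toy allele frequencies at three SNPs (illustrative values only).
A = [0.1, 0.5, 0.9]
B = [0.9, 0.5, 0.1]
MIXED = [0.5, 0.5, 0.5]      # midpoint of A and B
OUTSIDE = [0.1, 0.9, 0.9]
C = [0.2, 0.6, 0.2]
D = [0.2, 0.3, 0.2]          # C - D is orthogonal to A - B
```

The link between the two tests is the identity F3(x; a, b) = |x - m|^2 - |a - b|^2 / 4, where m is the midpoint of a and b. F3 is therefore negative exactly when `inside_circle` returns true.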
Improving polygenic prediction with genetically inferred ancestry.
Genome-wide association studies (GWASs) have demonstrated that most common diseases have a strong genetic component from many genetic variants each with a small effect size. GWAS summary statistics have allowed the construction of polygenic scores (PGSs) estimating part of the individual risk for common diseases. Here, we propose to improve PGS-based risk estimation by incorporating genetic ancestry derived from genome-wide genotyping data. Our method involves three cohorts: a base (or discovery) for association studies, a target for phenotype/risk prediction, and a map for ancestry mapping. Successively, (1) it generates for each individual in the base and target cohorts a set of principal components based on the map cohort, called mapped PCs, (2) it associates in the base cohort the phenotype with the mapped PCs, and (3) it uses the mapped PCs in the target cohort to generate a phenotypic predictor called the ancestry score. We evaluated the ancestry score by comparing a predictive model using a PGS with one combining a PGS and an ancestry score. First, we performed simulations and found that the ancestry score has a greater impact on traits that correlate with ancestry-specific variants. Second, we showed, using UK Biobank data, that the ancestry score improves genetic prediction for our nine phenotypes to very different degrees. Third, we performed simulations and found that the more heterogeneous the base and target cohorts, the more beneficial the ancestry score is. Finally, we validated our approach under realistic conditions with UK Biobank as the base cohort and Swiss individuals from the CoLaus|PsyCoLaus study as the target cohort.
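The three steps listed in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: a single mapped PC is used, its loading vector and the per-SNP means are taken as already estimated from the map cohort, and the phenotype is quantitative and fitted by ordinary least squares. All names and numbers are hypothetical.

```python
def mapped_pc(genotypes, map_mean, loading):
    """Step 1: project individuals onto a PC defined from the map cohort."""
    return [sum((g - m) * w for g, m, w in zip(row, map_mean, loading))
            for row in genotypes]


def fit_line(pcs, phenotypes):
    """Step 2: least-squares fit of phenotype on the mapped PC (base cohort)."""
    n = len(pcs)
    mean_pc = sum(pcs) / n
    mean_ph = sum(phenotypes) / n
    sxx = sum((x - mean_pc) ** 2 for x in pcs)
    sxy = sum((x - mean_pc) * (y - mean_ph) for x, y in zip(pcs, phenotypes))
    slope = sxy / sxx
    return mean_ph - slope * mean_pc, slope


def ancestry_score(target_pcs, intercept, slope):
    """Step 3: apply the base-cohort fit to the target cohort's mapped PCs."""
    return [intercept + slope * pc for pc in target_pcs]


# Hypothetical map-cohort summary: per-SNP means and one PC loading vector.
MAP_MEAN = [1.0, 1.0]
LOADING = [0.6, 0.8]

BASE_GENOTYPES = [[0, 0], [1, 1], [2, 2]]
BASE_PHENOTYPES = [-1.8, 1.0, 3.8]
TARGET_GENOTYPES = [[2, 0]]

base_pcs = mapped_pc(BASE_GENOTYPES, MAP_MEAN, LOADING)
intercept, slope = fit_line(base_pcs, BASE_PHENOTYPES)
scores = ancestry_score(mapped_pc(TARGET_GENOTYPES, MAP_MEAN, LOADING),
                        intercept, slope)
```

Projecting both cohorts with the same map-derived means and loadings is what puts the base and target individuals in a common ancestry space, so that coefficients fitted in one cohort can be applied in the other.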
Complex population structure and haplotype patterns in the Western European honey bee from sequencing a large panel of haploid drones
Honey bee subspecies originate from specific geographical areas in Africa, Europe
and the Middle East, and beekeepers interested in specific phenotypes have imported
genetic material to regions outside of the bees' original range for use either in pure
lines or controlled crosses. Moreover, imported drones are present in the environment
and mate naturally with queens from the local subspecies. The resulting admixture
complicates population genetics analyses, and population stratification can
be a major problem for association studies. To better understand Western European
honey bee populations, we produced a whole genome sequence and single nucleotide
polymorphism (SNP) genotype data set from 870 haploid drones and demonstrate
its utility for the identification of nine genetic backgrounds and various degrees of
admixture in a subset of 629 samples. Five backgrounds identified correspond to subspecies,
two to isolated populations on islands and two to managed populations. We
also highlight several large haplotype blocks, some of which coincide with the position
of centromeres. The largest is 3.6 Mb long and represents 21% of chromosome 11, with two major haplotypes corresponding to the two dominant genetic backgrounds
identified. This large naturally phased data set is available as a single VCF file that can
now serve as a reference for subsequent population genomics studies in the honey
bee, such as (i) selecting individuals of verified homogeneous genetic backgrounds
as references, (ii) imputing genotypes from a lower-density data set generated by an
SNP chip or by low-pass sequencing, or (iii) selecting SNPs compatible with the requirements
of genotyping chips.
This work was performed in collaboration with the GeT platform,
Toulouse (France), a partner of the National Infrastructure France
Génomique, thanks to support by the Commissariat aux Grands
Investissements (ANR-10-INBS-0009).
Bioinformatics analyses were
performed on the GenoToul Bioinfo computer cluster. This work
was funded by a grant from the INRA Département de Génétique
Animale (INRA Animal Genetics division) and by the SeqApiPop programme,
funded by the FranceAgriMer grant 14-21-AT.
We thank John Kefuss for helpful discussions. We thank Andrew Abrahams
for providing honey bee samples from Colonsay (Scotland), the
Association Conservatoire de l'Abeille Noire Bretonne (ACANB) for
samples from Ouessant (France), CETA de Savoie for samples from
Savoie, ADAPI for samples from Porquerolles and all beekeepers and
bee breeders who kindly participated in this study by providing samples
from their colonies.