Distributed estimation from relative measurements of heterogeneous and uncertain quality
This paper studies the problem of estimation from relative measurements in a
graph, in which a vector indexed over the nodes has to be reconstructed from
pairwise measurements of differences between its components associated with nodes
connected by an edge. In order to model heterogeneity and uncertainty of the
measurements, we assume them to be affected by additive noise distributed
according to a Gaussian mixture. In this original setup, we formulate the
problem of computing the Maximum-Likelihood (ML) estimates and we design two
novel algorithms, based on Least Squares regression and
Expectation-Maximization (EM). The first algorithm (LS-EM) is centralized and
performs the estimation from relative measurements, the soft classification of
the measurements, and the estimation of the noise parameters. The second
algorithm (Distributed LS-EM) is distributed and performs estimation and soft
classification of the measurements, but requires the knowledge of the noise
parameters. We provide rigorous proofs of convergence of both algorithms and we
present numerical experiments to evaluate and compare their performance with
classical solutions. The experiments show the robustness of the proposed
methods against different kinds of noise and, for the Distributed LS-EM,
against errors in the knowledge of noise parameters.
Comment: Submitted to IEEE transaction
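The least-squares backbone of estimation from relative measurements can be sketched in a few lines. This is a toy illustration of plain least squares on a small graph, not the paper's LS-EM algorithm (which additionally soft-classifies measurements under Gaussian-mixture noise); the graph, noise level, and the choice to anchor node 0 are all illustrative assumptions.

```python
import numpy as np

# Toy graph: each edge (i, j) yields a noisy measurement of x[i] - x[j].
edges = [(0, 1), (1, 2), (2, 3), (0, 3), (1, 3)]
x_true = np.array([0.0, 1.0, 3.0, 2.0])

rng = np.random.default_rng(0)
b = np.array([x_true[i] - x_true[j] for i, j in edges]) \
    + 0.01 * rng.normal(size=len(edges))

# Incidence matrix A: one row per edge, +1 at i and -1 at j, so b ≈ A x.
A = np.zeros((len(edges), len(x_true)))
for k, (i, j) in enumerate(edges):
    A[k, i], A[k, j] = 1.0, -1.0

# x is identifiable only up to an additive constant: solve the rank-deficient
# least-squares problem (lstsq returns the minimum-norm solution), then shift
# so that node 0 is pinned at zero.
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
x_hat -= x_hat[0]
print(np.round(x_hat, 2))
```

With well-connected graphs the noise averages out across redundant edges, which is why the extra edge (1, 3) improves the estimate over a spanning tree alone.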
TrAp: a Tree Approach for Fingerprinting Subclonal Tumor Composition
Revealing the clonal composition of a single tumor is essential for
identifying cell subpopulations with metastatic potential in primary tumors or
with resistance to therapies in metastatic tumors. Sequencing technologies
provide an overview of an aggregate of numerous cells, rather than
subclonal-specific quantification of aberrations such as single nucleotide
variants (SNVs). Computational approaches to de-mix a single collective signal
from the mixed cell population of a tumor sample into its individual components
are currently not available. Herein we propose a framework for deconvolving
data from a single genome-wide experiment to infer the composition, abundance
and evolutionary paths of the underlying cell subpopulations of a tumor. The
method is based on the plausible biological assumption that tumor progression
is an evolutionary process where each individual aberration event stems from a
unique subclone and is present in all its descendant subclones. We have
developed an efficient algorithm (TrAp) for solving this mixture problem. In
silico analyses show that TrAp correctly deconvolves mixed subpopulations when
the number of subpopulations and the measurement errors are moderate. We
demonstrate the applicability of the method using tumor karyotypes and somatic
hypermutation datasets. We applied TrAp to the SNV frequency profile from an
Exome-Seq experiment on a renal cell carcinoma tumor sample and compared the mutational
profile of the inferred subpopulations to the mutational profiles of twenty
single cells of the same tumor. Despite the large experimental noise, specific
co-occurring mutations found in clones inferred by TrAp are also present in
some of these single cells. Finally, we deconvolve Exome-Seq data from three
distinct metastases from different body compartments of one melanoma patient
and exhibit the evolutionary relationships of their subpopulations.
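The mixture model behind this kind of tree deconvolution can be made concrete in a small sketch. This is an illustration of the inheritance assumption, not the TrAp algorithm itself; the three-clone tree, genotype matrix, and abundances below are hypothetical.

```python
import numpy as np

# Assumption from the abstract: each aberration arises once, in a unique
# subclone, and is inherited by all of that subclone's descendants. The
# aggregate frequency of aberration a is then the sum of the abundances of
# all subclones carrying a.
# Toy linear tree: root -> clone1 -> clone2, with aberrations A (root),
# B (clone1), C (clone2). Genotype matrix G[a, s] = 1 iff subclone s
# carries aberration a.
G = np.array([
    [1, 1, 1],   # aberration A: in the root and both descendants
    [0, 1, 1],   # aberration B: in clone1 and its descendant
    [0, 0, 1],   # aberration C: only in clone2
], dtype=float)

abundances_true = np.array([0.5, 0.3, 0.2])  # hypothetical clonal mixture
f = G @ abundances_true                      # aggregate (bulk) frequencies

# For a fixed candidate tree, the genotype matrix is triangular and the
# mixture is recovered by solving the linear system f = G a.
abundances = np.linalg.solve(G, f)
print(abundances)   # recovers [0.5, 0.3, 0.2]
```

The hard part, which TrAp addresses, is that the tree (and hence G) is unknown and must be inferred jointly with the abundances from noisy frequencies; the sketch only shows the forward model a candidate tree must satisfy.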
A Semi-Blind Source Separation Method for Differential Optical Absorption Spectroscopy of Atmospheric Gas Mixtures
Differential optical absorption spectroscopy (DOAS) is a powerful tool for
detecting and quantifying trace gases in atmospheric chemistry
\cite{Platt_Stutz08}. DOAS spectra consist of a linear combination of complex
multi-peak multi-scale structures. Most DOAS analysis routines in use today are
based on least squares techniques, for example, the approach developed in the
1970s uses polynomial fits to remove a slowly varying background, and known
reference spectra to retrieve the identity and concentrations of reference
gases. An open problem is to identify unknown gases in the fitting residuals
for complex atmospheric mixtures.
In this work, we develop a novel three-step semi-blind source separation
method. The first step uses a multi-resolution analysis to remove the
slow-varying and fast-varying components in the DOAS spectral data matrix $X$.
The second step decomposes the preprocessed data $X_1$ from the first step
into a linear combination of the reference spectra plus a remainder, or
$X_1 = A\,S + R$, where the columns of the matrix $A$ are known reference spectra,
and the matrix $S$ contains the unknown non-negative coefficients that are
proportional to concentration. The second step is realized by a convex
minimization problem $\min_{S} \| X_1 - A\,S \|$,
where the norm is a hybrid $\ell_2$/$\ell_1$ norm (Huber estimator) that helps to
maintain the non-negativity of $S$. The third step performs a blind independent
component analysis of the remainder matrix $R$ to extract remnant gas
components. We first illustrate the proposed method by processing a set of DOAS
experimental data, with a satisfactory blind extraction of an a-priori unknown
trace gas (ozone) from the remainder matrix. Numerical results also show that
the method can identify multiple trace gases from the residuals.
Comment: submitted to Journal of Scientific Computing
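The fitting idea in the second step can be sketched with synthetic data. A minimal projected-gradient non-negative least-squares fit stands in here for the paper's Huber-norm convex program, and the Gaussian "spectra", peak positions, and concentrations are invented for illustration; the point is only that the remainder retains a gas absent from the reference set.

```python
import numpy as np

wavelengths = np.linspace(300, 360, 200)

def peak(center, width):
    # synthetic absorption feature (stand-in for a real cross section)
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

A = np.column_stack([peak(315, 4), peak(330, 6), peak(345, 3)])  # references
coeffs_true = np.array([2.0, 0.0, 1.5])  # hypothetical concentrations
unknown = 0.3 * peak(352, 2)             # remnant gas NOT among the references
y = A @ coeffs_true + unknown            # one measured spectrum

def nnls_pg(A, y, iters=5000):
    # minimal projected-gradient non-negative least squares:
    # gradient step on ||y - A x||^2, then project onto x >= 0
    step = 1.0 / np.linalg.norm(A.T @ A, 2)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = np.maximum(0.0, x - step * (A.T @ (A @ x - y)))
    return x

coeffs = nnls_pg(A, y)
remainder = y - A @ coeffs   # the unknown gas survives here for step three
print(np.round(coeffs, 2))
```

The remainder spectrum is exactly what the third step's blind independent component analysis would then decompose to reveal the unmodeled gas.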
Partial-volume Bayesian classification of material mixtures in MR volume data using voxel histograms
The authors present a new algorithm for identifying the distribution of different material types in volumetric datasets such as those produced with magnetic resonance imaging (MRI) or computed tomography (CT). Because the authors allow for mixtures of materials and treat voxels as regions, their technique reduces errors that other classification techniques can create along boundaries between materials, and it is particularly useful for creating accurate geometric models and renderings from volume data. It also has the potential to make volume measurements more accurate, and it classifies noisy, low-resolution data well.

There are two unusual aspects to the authors' approach. First, they assume that, due to partial-volume effects, or blurring, voxels can contain more than one material, e.g., both muscle and fat; the authors compute the relative proportion of each material in the voxels. Second, they incorporate information from neighboring voxels into the classification process by reconstructing a continuous function, ρ(x), from the samples and then looking at the distribution of values that ρ(x) takes on within the region of a voxel. This distribution of values is represented by a histogram taken over the region of the voxel; the mixture of materials that those values measure is identified within the voxel using a probabilistic Bayesian approach that finds the mixture of materials within each voxel most likely to have created the histogram. The size of the regions that the authors classify is chosen to match the spacing of the samples, because the spacing is intrinsically related to the minimum feature size that the reconstructed continuous function can represent.
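The core likelihood-matching idea can be sketched in one dimension. This is an illustration, not the authors' full Bayesian classifier over neighboring voxels: the pure-material intensity values, noise level, and grid search below are invented, and a two-Gaussian mixture stands in for the histogram model of ρ(x) values within a single partial-volume voxel.

```python
import numpy as np

rng = np.random.default_rng(1)
mu_a, mu_b, sigma = 1.0, 3.0, 0.2   # hypothetical pure-material intensities
frac_true = 0.7                     # the voxel is 70% material A, 30% B

# Samples of the reconstructed function rho(x) taken within the voxel.
n = 5000
is_a = rng.random(n) < frac_true
samples = np.where(is_a, mu_a, mu_b) + sigma * rng.normal(size=n)

def normal_pdf(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

# Model the within-voxel value distribution as a two-material mixture and
# grid-search the material fraction that maximizes the sample likelihood.
fracs = np.linspace(0.0, 1.0, 101)
loglik = [np.sum(np.log(f * normal_pdf(samples, mu_a, sigma)
                        + (1 - f) * normal_pdf(samples, mu_b, sigma)
                        + 1e-300))
          for f in fracs]
frac_hat = fracs[int(np.argmax(loglik))]
print(frac_hat)
```

A per-voxel mean would give the same answer for two well-separated materials; matching the whole histogram is what lets the approach scale to more materials and to boundary voxels where several mixtures could produce the same mean.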