Incorporating peak grouping information for alignment of multiple liquid chromatography-mass spectrometry datasets
Motivation: The combination of liquid chromatography and mass spectrometry (LC/MS) has been widely used for large-scale comparative studies in systems biology, including proteomics, glycomics and metabolomics. In almost all experimental designs, it is necessary to compare chromatograms across biological or technical replicates and across sample groups. Central to this is the peak alignment step, which is one of the most important but challenging preprocessing steps. Existing alignment tools do not take into account the structural dependencies between related peaks that co-elute and are derived from the same metabolite or peptide. We propose a direct matching peak alignment method for LC/MS data that incorporates related-peak information (within each LC/MS run) and investigate its effect on alignment performance (across runs). The groupings of related peaks necessary for our method can be obtained from any peak clustering method and are built into a pairwise peak similarity score function. The resulting similarity score matrix is used by an approximation algorithm for the weighted matching problem to produce the actual alignment result.
Results:
We demonstrate that related peak information can improve alignment performance. The performance is evaluated on a set of benchmark datasets, where our method performs competitively compared to other popular alignment tools.
Availability: The proposed alignment method has been implemented
as a stand-alone application in Python, available for download at
http://github.com/joewandy/peak-grouping-alignment.
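The matching step described in the abstract above can be illustrated with a minimal sketch. This is not the implementation in the linked repository. The score function, the tolerances `mz_tol` and `rt_tol`, the weight `alpha`, and the way group support is mixed in are all assumptions made for illustration. The matching itself is the standard greedy heuristic for weighted matching (take the highest-scoring pair whose peaks are both still free), which is known to reach at least half of the optimal total weight.

```python
import math


def similarity(p, q, mz_tol=0.01, rt_tol=30.0):
    """Hypothetical peak similarity: closeness in m/z and retention time.

    Returns 0.0 when either difference exceeds its tolerance.
    """
    dmz = abs(p["mz"] - q["mz"])
    drt = abs(p["rt"] - q["rt"])
    if dmz > mz_tol or drt > rt_tol:
        return 0.0
    return math.exp(-((dmz / mz_tol) ** 2) - (drt / rt_tol) ** 2)


def group_aware_scores(run_a, run_b, alpha=0.5):
    """Blend each pair's own score with support from related peaks.

    Assumption: "support" is the mean base score over pairs (k, l) where
    k shares a group with peak i and l shares a group with peak j.
    """
    base = [[similarity(p, q) for q in run_b] for p in run_a]
    scores = []
    for i, p in enumerate(run_a):
        row = []
        for j, q in enumerate(run_b):
            if base[i][j] == 0.0:
                row.append(0.0)
                continue
            partners = [
                base[k][l]
                for k, pk in enumerate(run_a)
                for l, ql in enumerate(run_b)
                if k != i and l != j
                and pk["group"] == p["group"]
                and ql["group"] == q["group"]
            ]
            support = sum(partners) / len(partners) if partners else 0.0
            row.append(alpha * base[i][j] + (1 - alpha) * support)
        scores.append(row)
    return scores


def greedy_matching(scores, threshold=0.0):
    """Greedy approximation to maximum weighted bipartite matching."""
    edges = [
        (s, i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s > threshold
    ]
    # Highest score first; ties broken by index for reproducibility.
    edges.sort(key=lambda e: (-e[0], e[1], e[2]))
    used_rows, used_cols, matches = set(), set(), []
    for _, i, j in edges:
        if i not in used_rows and j not in used_cols:
            used_rows.add(i)
            used_cols.add(j)
            matches.append((i, j))
    return sorted(matches)


# Two runs, each with two co-eluting peaks assigned to the same group.
run_a = [{"mz": 100.0, "rt": 50.0, "group": 0},
         {"mz": 101.0, "rt": 50.0, "group": 0}]
run_b = [{"mz": 100.001, "rt": 55.0, "group": 0},
         {"mz": 101.001, "rt": 55.0, "group": 0}]
print(greedy_matching(group_aware_scores(run_a, run_b)))  # [(0, 0), (1, 1)]
```

The greedy rule is fast but not optimal: on the score matrix `[[0.9, 0.8], [0.85, 0.1]]` it returns the pairs (0, 0) and (1, 1) with total weight 1.0, whereas the optimal matching has weight 1.65.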
Unsupervised Feature Selection with Adaptive Structure Learning
The problem of feature selection has attracted considerable interest in the
past decade. Traditional unsupervised methods select the features which can
faithfully preserve the intrinsic structures of data, where the intrinsic
structures are estimated using all the input features of data. However, the
estimated intrinsic structures are unreliable/inaccurate when the redundant and
noisy features are not removed. Therefore, we face a dilemma here: one needs the
true structures of data to identify the informative features, and one needs the
informative features to accurately estimate the true structures of data. To
address this, we propose a unified learning framework which performs structure
learning and feature selection simultaneously. The structures are adaptively
learned from the results of feature selection, and the informative features are
reselected to preserve the refined structures of data. By leveraging the
interactions between these two essential tasks, we are able to capture accurate
structures and select more informative features. Experimental results on many
benchmark data sets demonstrate that the proposed method outperforms many
state-of-the-art unsupervised feature selection methods.
Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals associated with quantitative phenotypic
traits, including disease states, are usually found in non-coding genomic
regions. These genetic variants are often also associated with differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and software to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2 figures
Sloshing in the LNG shipping industry: risk modelling through multivariate heavy-tail analysis
In the liquefied natural gas (LNG) shipping industry, the phenomenon of
sloshing can lead to the occurrence of very high pressures in the tanks of the
vessel. The issue of modelling or estimating the probability of the
simultaneous occurrence of such extremal pressures is now crucial from the risk
assessment point of view. In this paper, heavy-tail modelling, widely used as a
conservative approach to risk assessment and corresponding to a worst-case risk
analysis, is applied to the study of sloshing. Multivariate heavy-tailed
distributions are considered, with sloshing pressures investigated by means of
small-scale replica tanks instrumented with d > 1 sensors. When attempting to
fit such nonparametric statistical models, one naturally faces computational
issues inherent in the phenomenon of dimensionality. The primary purpose of
this article is to overcome this barrier by introducing a novel methodology.
For d-dimensional heavy-tailed distributions, the structure of extremal
dependence is entirely characterised by the angular measure, a positive measure
on the intersection of a sphere with the positive orthant in R^d. As d
increases, the mutual extremal dependence between variables becomes difficult
to assess. Based on a spectral clustering approach, we show here how a low
dimensional approximation to the angular measure may be found. The
nonparametric method proposed for modelling sloshing has been successfully applied
to pressure data. The parsimonious representation thus obtained proves to be
very convenient for the simulation of multivariate heavy-tailed distributions,
allowing for the implementation of Monte-Carlo simulation schemes in estimating
the probability of failure. Besides confirming its performance on artificial
data, the methodology has been implemented on a real data set specifically
collected for risk assessment of sloshing in the LNG shipping industry.
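To make the angular measure mentioned above more concrete, here is a minimal sketch of how its empirical counterpart is commonly approximated: keep the k observations with the largest norm and project each onto the unit sphere. It assumes the margins have already been standardised to a common heavy-tailed scale, uses the Euclidean norm, and does not attempt the spectral clustering step of the paper. The function name and the choice of k are hypothetical.

```python
import math


def extreme_angles(points, k):
    """Return the angular parts x / ||x|| of the k largest-norm points.

    Assumes each point lies in the positive orthant and that margins
    are already on a comparable heavy-tailed scale.
    """
    with_radius = [(math.sqrt(sum(c * c for c in x)), x) for x in points]
    # Drop the origin, whose direction is undefined.
    with_radius = [(r, x) for r, x in with_radius if r > 0]
    with_radius.sort(key=lambda t: t[0], reverse=True)
    return [tuple(c / r for c in x) for r, x in with_radius[:k]]


# Three d = 2 pressure readings; keep the two most extreme.
readings = [(3.0, 4.0), (0.1, 0.1), (10.0, 0.0)]
angles = extreme_angles(readings, k=2)
```

An angle close to a coordinate axis, such as the one obtained from `(10.0, 0.0)`, indicates an extreme driven by a single sensor, while an angle in the interior of the orthant indicates sensors that are large simultaneously.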