Principal manifolds and graphs in practice: from molecular biology to dynamical systems
We present several applications of non-linear data modeling, using principal
manifolds and principal graphs constructed using the metaphor of elasticity
(elastic principal graph approach). These approaches generalize Kohonen's
self-organizing maps, a class of artificial neural networks. Using several
examples, we show the advantages of non-linear objects over linear ones for
data approximation. We propose four numerical
criteria for comparing linear and non-linear mappings of datasets into the
spaces of lower dimension. The examples are taken from comparative political
science, from analysis of high-throughput data in molecular biology, and from
analysis of dynamical systems.
Comment: 12 pages, 9 figures
Entropy Balance and Dispersive Oscillations in Lattice Boltzmann Models
We conduct an investigation into the dispersive post-shock oscillations in
the entropic lattice-Boltzmann method (ELBM). To this end we implement the ELBM
with a root-finding algorithm that displays fast cubic convergence and
guarantees the proper sign of dissipation. The resulting simulation on the
one-dimensional shock tube shows no benefit in terms of regularization from
using the ELBM over the standard LBGK method. We also conduct an experiment
investigating the LBGK method with median filtering applied at a single point
per time step. Here we observe that significant regularization can be achieved.
Comment: 18 pages, 4 figures; 13/07/2009 Matlab code added to appendix
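The single-point median filtering mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which quantity is filtered, how the point is chosen, or how wide the stencil is, so the generic 1D list `field`, the caller-supplied index `site`, and the three-point window below are all assumptions made for illustration only.

```python
from statistics import median


def median_filter_at(field, site):
    """Return a copy of `field` with the value at `site` replaced by the
    median of its nearest-neighbour window (assumed 3-point stencil,
    truncated at the domain boundaries)."""
    lo = max(site - 1, 0)
    hi = min(site + 1, len(field) - 1)
    out = list(field)  # leave the input untouched
    out[site] = median(field[lo:hi + 1])
    return out


# Hypothetical profile with one spurious spike at index 2.
profile = [1.0, 1.0, 5.0, 1.0, 0.0]
smoothed = median_filter_at(profile, 2)
```

Only one site is modified per call, which mirrors the "single point per time step" idea; all other values pass through unchanged.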
PCA Beyond The Concept of Manifolds: Principal Trees, Metro Maps, and Elastic Cubic Complexes
Multidimensional data distributions can have complex topologies and variable
local dimensions. To approximate complex data, we propose a new type of
low-dimensional ``principal object'': a principal cubic complex. This complex
is a generalization of linear and non-linear principal manifolds and includes
them as a particular case. To construct such an object, we combine a method of
topological grammars with the minimization of an elastic energy defined for its
embedment into multidimensional data space. The whole complex is presented as a
system of nodes and springs and as a product of one-dimensional continua
(represented by graphs), and the grammars describe how these continua transform
during the process of optimal complex construction. The simplest case of a
topological grammar (``add a node'', ``bisect an edge'') is equivalent to the
construction of ``principal trees'', an object useful in many practical
applications. We demonstrate how it can be applied to the analysis of bacterial
genomes and for visualization of cDNA microarray data using the ``metro map''
representation. The preprint is supplemented by animation: ``How the
topological grammar constructs branching principal components
(AnimatedBranchingPCA.gif)''.
Comment: 19 pages, 8 figures
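The two grammar operations named in this abstract, ``add a node'' and ``bisect an edge'', can be sketched at the purely topological level. The edge-list representation and the function names `add_node` and `bisect_edge` are hypothetical; node positions and the elastic-energy minimization that selects which operation to apply are not modelled here.

```python
def add_node(edges, n_nodes, attach_to):
    """Attach a new leaf node to the existing node `attach_to`."""
    new = n_nodes
    return edges + [(attach_to, new)], n_nodes + 1


def bisect_edge(edges, n_nodes, edge):
    """Replace `edge` = (a, b) by two edges (a, new) and (new, b)."""
    a, b = edge
    new = n_nodes
    rest = [e for e in edges if e != edge]
    return rest + [(a, new), (new, b)], n_nodes + 1


# Start from a single edge and grow a small tree.
edges, n = [(0, 1)], 2
edges, n = add_node(edges, n, attach_to=1)
edges, n = bisect_edge(edges, n, edge=(0, 1))
```

Both operations add exactly one node and one edge, so a graph that starts as a tree remains a tree after any sequence of them.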
Elastic Maps and Nets for Approximating Principal Manifolds and Their Application to Microarray Data Visualization
Principal manifolds are defined as lines or surfaces passing through ``the
middle'' of the data distribution. Linear principal manifolds (Principal Components
Analysis) are routinely used for dimension reduction, noise filtering and data
visualization. Recently, methods for constructing non-linear principal
manifolds were proposed, including our elastic maps approach which is based on
a physical analogy with elastic membranes. We have developed a general
geometric framework for constructing ``principal objects'' of various
dimensions and topologies with the simplest quadratic form of the smoothness
penalty which allows very effective parallel implementations. Our approach is
implemented in three programming languages (C++, Java and Delphi) with two
graphical user interfaces (VidaExpert
http://bioinfo.curie.fr/projects/vidaexpert and ViMiDa
http://bioinfo-out.curie.fr/projects/vimida applications). In this paper we
overview the method of elastic maps and present in detail one of its major
applications: the visualization of microarray data in bioinformatics. We show
that the method of elastic maps outperforms linear PCA in terms of data
approximation, representation of between-point distance structure, preservation
of local point neighborhood and representing point classes in low-dimensional
spaces.
Comment: 35 pages, 10 figures
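A quadratic smoothness penalty of the kind this abstract mentions can be sketched as an energy with three terms: an approximation term, an edge-stretching term, and a rib-bending term. The exact functional and weights are not given in the abstract, so the form below, the weights `lam` and `mu`, and the name `elastic_energy` are assumptions for illustration, not the authors' implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def elastic_energy(data, nodes, edges, ribs, lam=0.1, mu=0.1):
    """Assumed energy: approximation + lam * stretching + mu * bending."""
    # Mean squared distance from each data point to its nearest node.
    approx = sum(min(sq_dist(x, y) for y in nodes) for x in data) / len(data)
    # Stretching: squared length of every edge (i, j).
    stretch = sum(sq_dist(nodes[i], nodes[j]) for i, j in edges)
    # Bending: for every rib (i, j, k) with middle node j,
    # penalise |y_i + y_k - 2 * y_j|^2.
    bend = sum(
        sq_dist(
            [a + c for a, c in zip(nodes[i], nodes[k])],
            [2 * b for b in nodes[j]],
        )
        for i, j, k in ribs
    )
    return approx + lam * stretch + mu * bend


# Toy configuration: three collinear nodes sitting exactly on the data.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
energy = elastic_energy(points, points, edges=[(0, 1), (1, 2)], ribs=[(0, 1, 2)])
```

Because every term is quadratic in the node positions once the point-to-node assignment is fixed, minimizing such an energy reduces to solving a linear system at each step.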
Thermodynamic Tree: The Space of Admissible Paths
Is a spontaneous transition from a state x to a state y allowed by
thermodynamics? Such a question arises often in chemical thermodynamics and
kinetics. We ask the more formal question: is there a continuous path between
these states, along which the conservation laws hold, the concentrations remain
non-negative and the relevant thermodynamic potential G (Gibbs energy, for
example) monotonically decreases? The obvious necessary condition, G(x)\geq
G(y), is not sufficient, and we construct the necessary and sufficient
conditions. For example, it is impossible to overstep the equilibrium in
1-dimensional (1D) systems (with n components and n-1 conservation laws). The
system cannot come from a state x to a state y if they are on the opposite
sides of the equilibrium even if G(x) > G(y). We find the general
multidimensional analogue of this 1D rule and constructively solve the problem
of the thermodynamically admissible transitions.
We study dynamical systems that are given in a positively invariant convex
polyhedron D and have a convex Lyapunov function G. An admissible path is a
continuous curve along which G does not increase. For x,y from D, x\geq y (x
precedes y) if there exists an admissible path from x to y and x \sim y if
x\geq y and y\geq x. The tree of G in D is a quotient space D/~. We provide an
algorithm for the construction of this tree. In this algorithm, the restriction
of G onto the 1-skeleton of D (the union of edges) is used. The problem of
existence of admissible paths between states is solved constructively. The
regions attainable by the admissible paths are described.
Comment: Extended version, 31 pages, 9 figures, 69 cited references, many minor
corrections
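The 1D rule stated in this abstract (no overstepping of the equilibrium) can be written as a small check. This sketch assumes a strictly convex G on an interval with a known minimizer `x_eq`; the function name `admissible_1d` and the quadratic example potential are hypothetical, and the general tree-construction algorithm is not reproduced.

```python
def admissible_1d(G, x_eq, x, y):
    """True if a transition x -> y is allowed in a 1D system:
    x and y lie on the same side of the equilibrium x_eq
    and G does not increase, i.e. G(x) >= G(y)."""
    same_side = (x - x_eq) * (y - x_eq) >= 0
    return same_side and G(x) >= G(y)


def G(c):
    """Toy strictly convex potential with its minimum at c = 1."""
    return (c - 1.0) ** 2


toward_equilibrium = admissible_1d(G, 1.0, 0.0, 0.5)   # same side, G decreases
across_equilibrium = admissible_1d(G, 1.0, 0.0, 1.5)   # G(x) > G(y), opposite sides
```

The second call shows why G(x) >= G(y) alone is not sufficient: the path would have to pass through the minimum of G and then climb again.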
Computational diagnosis and risk evaluation for canine lymphoma
The canine lymphoma blood test detects the levels of two biomarkers, the
acute phase proteins (C-Reactive Protein and Haptoglobin). This test can be
used for diagnostics, for screening, and for remission monitoring as well. We
analyze clinical data, test various machine learning methods and select the
best approach to these problems. Three families of methods, decision trees, kNN
(including advanced and adaptive kNN) and probability density evaluation with
radial basis functions, are used for classification and risk estimation.
Several pre-processing approaches were implemented and compared. The best of
them are used to create the diagnostic system. For the differential diagnosis
the best solution gives the sensitivity and specificity of 83.5% and 77%,
respectively (using three input features, CRP, Haptoglobin and standard
clinical symptom). For the screening task, the decision tree method provides
the best result, with sensitivity and specificity of 81.4% and >99%,
respectively (using the same input features). If the clinical symptoms
(Lymphadenopathy) are considered as unknown then a decision tree with CRP and
Hapt only provides sensitivity 69% and specificity 83.5%. The lymphoma risk
evaluation problem is formulated and solved. The best models are selected as
the system for computational lymphoma diagnosis and for evaluation of the risk
of lymphoma. These methods are implemented in dedicated web-accessible
software and are applied to the problem of monitoring dogs with lymphoma after
treatment. The system detects recurrence of lymphoma up to two months prior to the
appearance of clinical signs. The risk map visualisation provides a friendly
tool for explanatory data analysis.
Comment: 24 pages, 86 references in the bibliography. Significantly extended
version with review of lymphoma biomarkers and data mining methods (three new
sections are added: 1.1. Biomarkers for canine lymphoma, 1.2. Acute phase
proteins as lymphoma biomarkers and 3.1. Data mining methods for biomarker
cancer diagnosis). Flowcharts of data analysis are included as supplementary
material (20 pages)
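The sensitivity and specificity figures quoted in this abstract follow the standard definitions TP / (TP + FN) and TN / (TN + FP). A minimal sketch of that calculation is shown here; the function name and the toy labels are hypothetical and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels,
    where 1 marks a positive (diseased) case and 0 a negative one."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 4 positive and 6 negative cases.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, preds)
```

For a screening task, where negatives vastly outnumber positives, specificity matters most because even a small false-positive rate produces many false alarms.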
Decay and coherence of two-photon excited yellow ortho-excitons in Cu2O
Photoluminescence excitation spectroscopy has revealed a novel, highly
efficient two-photon excitation method to produce a cold, uniformly distributed
high density excitonic gas in bulk cuprous oxide. A study of the time evolution
of the density, temperature and chemical potential of the exciton gas shows
that the so-called quantum saturation effect that prevents Bose-Einstein
condensation of the ortho-exciton gas originates from an unfavorable ratio
between the cooling and recombination rates. Oscillations observed in the
temporal decay of the ortho-excitonic luminescence intensity are discussed in
terms of polaritonic beating. We present the semiclassical description of
polaritonic oscillations in linear and non-linear optical processes.
Comment: 14 pages, 12 figures
Raman and Infrared-Active Phonons in Hexagonal HoMnO3 Single Crystals: Magnetic Ordering Effects
Polarized Raman scattering and infrared reflection spectra of hexagonal
HoMnO3 single crystals in the temperature range 10-300 K are reported.
Group-theoretical analysis is performed and scattering selection rules for the
second-order scattering processes are presented. Based on the results of
lattice dynamics calculations, performed within the shell model, the observed
lines in the spectra are assigned to definite lattice vibrations. The magnetic
ordering of Mn ions, which occurs below T=76 K, is shown to affect both
Raman- and infrared-active phonons, which modulate Mn-O-Mn bonds and,
consequently, Mn exchange interaction.
Comment: 8 pages, 6 figures