Modelling High-frequency Economic Time Series
Minute-by-minute moves of the Hang Seng Index (HSI) over a four-year
period are analysed and shown to possess statistical features similar to those
of other markets. Based on a mathematical theorem [S. B. Pope and E. S. C.
Ching, Phys. Fluids A {\bf 5}, 1529 (1993)], we derive an analytic form for the
probability distribution function (PDF) of index moves from fitted functional
forms of certain conditional averages of the time series. Furthermore,
following a recent work by Stolovitzky and Ching, we show that the observed PDF
can be reproduced by a Langevin process with a move-dependent noise amplitude.
The form of the Langevin equation can be determined directly from the market
data.
Comment: To appear in Proceedings of the Dynamics Days Asia Pacific
Conference, 13-16 July, 1999, Hong Kong (Physica A, 2000)
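A minimal sketch of the kind of process this abstract describes: an Euler-Maruyama simulation of a Langevin equation whose noise amplitude depends on the current move x. The function name `simulate_langevin` and the drift and noise forms in the usage lines are hypothetical placeholders, not the forms fitted to the HSI data in the paper.

```python
import math
import random


def simulate_langevin(drift, noise_amp, x0=0.0, dt=0.01, n_steps=1000, seed=0):
    """Euler-Maruyama integration of dx = drift(x) dt + noise_amp(x) dW.

    drift and noise_amp are user-supplied callables (assumptions here);
    the paper determines their form from market data.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Wiener increment: Gaussian with mean 0 and standard deviation sqrt(dt)
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + noise_amp(x) * dw
        path.append(x)
    return path


# Hypothetical example: linear restoring drift and a noise amplitude
# that grows with the size of the move.
path = simulate_langevin(
    drift=lambda x: -x,
    noise_amp=lambda x: math.sqrt(0.5 + 0.5 * x * x),
)
```

A histogram of `path` gives an empirical PDF that can be compared with the observed distribution of index moves; a noise amplitude that grows with |x| tends to produce heavier tails than a constant one.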
Mean-Field Approximation and Extended Self-Similarity in Turbulence
The recent experimental discovery of extended self-similarity (ESS) was one of
the most interesting developments, enabling precise determination of the
scaling exponents of fully developed turbulence. Here we show that the ESS is
consistent with the Navier-Stokes equations, provided the pressure-gradient
contributions are expressed in terms of velocity differences in the mean-field
approximation (Yakhot, Phys. Rev. E {\bf 63}, 026307 (2001)). A sufficient
condition for extended self-similarity in a general dynamical system
Comment: 8 pages, no figure
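As a rough illustration of how ESS is used in practice, the sketch below estimates a relative scaling exponent by regressing log S_p against log S_ref (typically the third-order structure function) instead of against log r. The function name `ess_relative_exponent` and the synthetic power-law data are assumptions made for illustration; they do not come from the paper.

```python
import math


def ess_relative_exponent(s_p, s_ref):
    """Least-squares slope of log(s_p) versus log(s_ref).

    s_p and s_ref are structure-function values sampled at the same scales.
    Under ESS the slope estimates the ratio of their scaling exponents.
    """
    xs = [math.log(v) for v in s_ref]
    ys = [math.log(v) for v in s_p]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic check with pure power laws S_3 ~ r^1 and S_6 ~ r^2
# (the non-intermittent Kolmogorov values, used only as test data).
scales = [1.0, 2.0, 4.0, 8.0, 16.0]
s3 = [r ** 1.0 for r in scales]
s6 = [r ** 2.0 for r in scales]
ratio = ess_relative_exponent(s6, s3)
```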
Local properties of extended self-similarity in 3D turbulence
Using a generalization of extended self-similarity we have studied local
scaling properties of 3D turbulence in a direct numerical simulation. We have
found that these properties are consistent with lognormal-like behavior of
energy dissipation fluctuations with moderate amplitudes for space scales
beginning from Kolmogorov length up to the largest scales, and in the
whole range of the Reynolds numbers: . The
locally determined intermittency exponent varies with ; it has a
maximum at scale , independent of .
Comment: 4 pages, 5 figures
Accurate and Reliable Cancer Classification Based on Probabilistic Inference of Pathway Activity
With the advent of high-throughput technologies for measuring genome-wide expression profiles, a large number of methods have been proposed for discovering diagnostic markers that can accurately discriminate between different classes of a disease. However, factors such as the small sample size of typical clinical data, the inherent noise in high-throughput measurements, and the heterogeneity across different samples often make it difficult to find reliable gene markers. To overcome this problem, several studies have proposed the use of pathway-based markers, instead of individual gene markers, for building the classifier. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes, and use the pathway activities for classification. It has been shown that pathway-based classifiers typically yield more reliable results than traditional gene-based classifiers. In this paper, we propose a new classification method based on probabilistic inference of pathway activities. For a given sample, we compute the log-likelihood ratio between different disease phenotypes based on the expression level of each gene. The activity of a given pathway is then inferred by combining the log-likelihood ratios of the constituent genes. We apply the proposed method to the classification of breast cancer metastasis, and show that it achieves higher accuracy and identifies more reproducible pathway markers than several existing pathway activity inference methods.
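The inference step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: each gene's expression is modelled as Gaussian within each phenotype, and the per-gene log-likelihood ratios are combined by a plain sum. All function names and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def fit_gene_model(values_by_class):
    """Fit (mean, std) per phenotype for one gene (Gaussian assumption)."""
    return {label: (mean(v), stdev(v)) for label, v in values_by_class.items()}


def gene_llr(x, model, pos=1, neg=0):
    """Log-likelihood ratio of phenotype pos versus neg for one expression value."""
    return gaussian_logpdf(x, *model[pos]) - gaussian_logpdf(x, *model[neg])


def pathway_activity(sample, pathway_genes, gene_models):
    """Combine member-gene log-likelihood ratios (here: a plain sum, an assumption)."""
    return sum(gene_llr(sample[g], gene_models[g]) for g in pathway_genes)


# Toy training data: expression values per phenotype (0 or 1) for two genes.
gene_models = {
    "A": fit_gene_model({0: [1.0, 1.2, 0.8], 1: [2.0, 2.2, 1.8]}),
    "B": fit_gene_model({0: [5.0, 5.5, 4.5], 1: [3.0, 3.5, 2.5]}),
}
like_class_1 = pathway_activity({"A": 2.1, "B": 3.2}, ["A", "B"], gene_models)
like_class_0 = pathway_activity({"A": 0.9, "B": 5.1}, ["A", "B"], gene_models)
```

A positive activity means the pathway's member genes jointly favour phenotype 1, a negative one favours phenotype 0; the resulting pathway activities can then be passed to any standard classifier as features.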
Stability and aggregation of ranked gene lists
Ranked gene lists are highly unstable in the sense that similar measures of differential gene expression may yield very different rankings, and that a small change in the data set usually affects the resulting gene list considerably. Stability issues have long been under-considered in the literature, but they have become a hot topic in the last few years, perhaps as a consequence of the increasing skepticism about the reproducibility and clinical applicability of molecular research findings. In this article, we review existing approaches for assessing the stability of ranked gene lists and the related problem of aggregation, give some practical recommendations, and warn against potential misuse of these methods. This overview is illustrated through an application to a recent leukemia data set using the freely available Bioconductor package GeneSelector.