    Adaptive multiscale detection of filamentary structures in a background of uniform random points

    We are given a set of $n$ points that might be uniformly distributed in the unit square $[0,1]^2$. We wish to test whether the set, although mostly consisting of uniformly scattered points, also contains a small fraction of points sampled from some (a priori unknown) curve with $C^{\alpha}$-norm bounded by $\beta$. An asymptotic detection threshold exists in this problem; for a constant $T_-(\alpha,\beta)>0$, if the number of points sampled from the curve is smaller than $T_-(\alpha,\beta)n^{1/(1+\alpha)}$, reliable detection is not possible for large $n$. We describe a multiscale significant-runs algorithm that can reliably detect concentration of data near a smooth curve, without knowing the smoothness information $\alpha$ or $\beta$ in advance, provided that the number of points on the curve exceeds $T_*(\alpha,\beta)n^{1/(1+\alpha)}$. This algorithm therefore has an optimal detection threshold, up to a factor $T_*/T_-$. At the heart of our approach is an analysis of the data by counting membership in multiscale multianisotropic strips. The strips will have area $2/n$ and exhibit a variety of lengths, orientations and anisotropies. The strips are partitioned into anisotropy classes; each class is organized as a directed graph whose vertices all are strips of the same anisotropy and whose edges link such strips to their "good continuations." The point-cloud data are reduced to counts that measure membership in strips. Each anisotropy graph is reduced to a subgraph that consists of strips with significant counts. The algorithm rejects $\mathbf{H}_0$ whenever some such subgraph contains a path that connects many consecutive significant counts. Comment: Published at http://dx.doi.org/10.1214/009053605000000787 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Hyperspectral colon tissue cell classification

    A novel algorithm to discriminate between normal and malignant tissue cells of the human colon is presented. The microscopic-level images of human colon tissue cells were acquired using hyperspectral imaging technology at contiguous wavelength intervals of visible light. While hyperspectral imagery data provides a wealth of information, its large size normally means high computational processing complexity. Several methods exist to avoid the so-called curse of dimensionality and hence reduce the computational complexity. In this study, we experimented with Principal Component Analysis (PCA) and two modifications of Independent Component Analysis (ICA). In the first stage of the algorithm, the extracted components are used to separate four constituent parts of the colon tissue: nuclei, cytoplasm, lamina propria, and lumen. The segmentation is performed in an unsupervised fashion using the nearest centroid clustering algorithm. The segmented image is further used, in the second stage of the classification algorithm, to exploit the spatial relationship between the labeled constituent parts. Experimental results using supervised Support Vector Machines (SVM) classification based on multiscale morphological features reveal the discrimination between normal and malignant tissue cells with a reasonable degree of accuracy.
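
The unsupervised segmentation step can be pictured with a small sketch. The abstract gives no implementation details, so this is only an illustration under stated assumptions: the function name `nearest_centroid_clustering`, the Lloyd-style assign-and-update loop, the fixed iteration count, and the toy two-dimensional feature vectors are all hypothetical, standing in for the extracted PCA/ICA components of each pixel.

```python
def nearest_centroid_clustering(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute centroids.

    Hypothetical helper: a plain Lloyd-style loop, not the authors' code.
    `points` and `centroids` are sequences of equal-length numeric tuples.
    """
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid (squared Euclidean).
        for i, p in enumerate(points):
            labels[i] = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
        # Update step: each centroid becomes the mean of its members;
        # a centroid with no members is left where it is.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy feature vectors (assumed data): two well-separated groups.
pixels = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
labels, centers = nearest_centroid_clustering(pixels, [(0.0, 0.0), (1.0, 1.0)])
```

In the setting of the abstract there would be four centroids, one per tissue constituent, and the resulting label image would feed the second, supervised stage.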

    Detection of an anomalous cluster in a network

    We consider the problem of detecting whether or not, in a given sensor network, there is a cluster of sensors which exhibit an "unusual behavior." Formally, suppose we are given a set of nodes and attach a random variable to each node. We observe a realization of this process and want to decide between the following two hypotheses: under the null, the variables are i.i.d. standard normal; under the alternative, there is a cluster of variables that are i.i.d. normal with positive mean and unit variance, while the rest are i.i.d. standard normal. We also address surveillance settings where each sensor in the network collects information over time. The resulting model is similar, now with a time series attached to each node. We again observe the process over time and want to decide between the null, where all the variables are i.i.d. standard normal, and the alternative, where there is an emerging cluster of i.i.d. normal variables with positive mean and unit variance. The growth models used to represent the emerging cluster are quite general and, in particular, include cellular automata used in modeling epidemics. In both settings, we consider classes of clusters that are quite general, for which we obtain a lower bound on their respective minimax detection rate and show that some form of scan statistic, by far the most popular method in practice, achieves that same rate to within a logarithmic factor. Our results are not limited to the normal location model, but generalize to any one-parameter exponential family when the anomalous clusters are large enough. Comment: Published at http://dx.doi.org/10.1214/10-AOS839 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
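
To make the scan statistic concrete, here is a minimal sketch under a strong simplifying assumption: the nodes lie on a line and the candidate clusters are contiguous intervals, whereas the abstract covers far more general cluster classes. The function name `scan_statistic`, the simulated data, and the shift of 2.0 are illustrative choices, not taken from the paper.

```python
import math
import random


def scan_statistic(values, min_len=1):
    """Return the largest normalized window sum and the window achieving it.

    For each contiguous window [start, end) the score is
    sum(values[start:end]) / sqrt(end - start), which is standard normal
    under the null of i.i.d. N(0, 1) observations.
    """
    n = len(values)
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)  # prefix sums give O(1) window sums
    best, best_window = -math.inf, None
    for start in range(n):
        for end in range(start + min_len, n + 1):
            score = (prefix[end] - prefix[start]) / math.sqrt(end - start)
            if score > best:
                best, best_window = score, (start, end)
    return best, best_window


# Simulated data (assumption): 200 nodes, anomalous interval 80..109.
rng = random.Random(0)
null_sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
alt_sample = list(null_sample)
for i in range(80, 110):
    alt_sample[i] += 2.0  # positive mean shift inside the cluster

null_score, _ = scan_statistic(null_sample)
alt_score, alt_window = scan_statistic(alt_sample)
```

A test would reject the null when the score exceeds a threshold calibrated under the null, for instance by simulation; that calibration is left out of the sketch.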

    Coronal Mass Ejection Detection using Wavelets, Curvelets and Ridgelets: Applications for Space Weather Monitoring

    Coronal mass ejections (CMEs) are large-scale eruptions of plasma and magnetic field that can produce adverse space weather at Earth and other locations in the Heliosphere. Due to the intrinsic multiscale nature of features in coronagraph images, wavelet and multiscale image processing techniques are well suited to enhancing the visibility of CMEs and suppressing noise. However, wavelets are better suited to identifying point-like features, such as noise or background stars, than to enhancing the visibility of the curved form of a typical CME front. Higher order multiscale techniques, such as ridgelets and curvelets, were therefore explored to characterise the morphology (width, curvature) and kinematics (position, velocity, acceleration) of CMEs. Curvelets in particular were found to be well suited to characterising CME properties in a self-consistent manner. Curvelets are thus likely to be of benefit to autonomous monitoring of CME properties for space weather applications. Comment: Accepted for publication in Advances in Space Research (3 April 2010)

    Randomized hybrid linear modeling by local best-fit flats

    The hybrid linear modeling problem is to identify a set of d-dimensional affine sets in a D-dimensional Euclidean space. It arises, for example, in object tracking and structure from motion. The hybrid linear model can be considered as the second simplest (behind linear) manifold model of data. In this paper we will present a very simple geometric method for hybrid linear modeling based on selecting a set of local best-fit flats that minimize a global l1 error measure. The size of the local neighborhoods is determined automatically by the Jones' l2 beta numbers; it is proven under certain geometric conditions that good local neighborhoods exist and are found by our method. We also demonstrate how to use this algorithm for fast determination of the number of affine subspaces. We give extensive experimental evidence demonstrating the state-of-the-art accuracy and speed of the algorithm on synthetic and real hybrid linear data. Comment: To appear in the proceedings of CVPR 201

    Multiscale Markov Decision Problems: Compression, Solution, and Transfer Learning

    Many problems in sequential decision making and stochastic control often have natural multiscale structure: sub-tasks are assembled together to accomplish complex goals. Systematically inferring and leveraging hierarchical structure, particularly beyond a single level of abstraction, has remained a longstanding challenge. We describe a fast multiscale procedure for repeatedly compressing, or homogenizing, Markov decision processes (MDPs), wherein a hierarchy of sub-problems at different scales is automatically determined. Coarsened MDPs are themselves independent, deterministic MDPs, and may be solved using existing algorithms. The multiscale representation delivered by this procedure decouples sub-tasks from each other and can lead to substantial improvements in convergence rates both locally within sub-problems and globally across sub-problems, yielding significant computational savings. A second fundamental aspect of this work is that these multiscale decompositions yield new transfer opportunities across different problems, where solutions of sub-tasks at different levels of the hierarchy may be amenable to transfer to new problems. Localized transfer of policies and potential operators at arbitrary scales is emphasized. Finally, we demonstrate compression and transfer in a collection of illustrative domains, including examples involving discrete and continuous state spaces. Comment: 86 pages, 15 figures