
    $\lambda_\infty$ & Maximum Variance Embedding: Measuring and Optimizing Connectivity of A Graph Metric

    Bobkov, Houdré, and the last author [2000] introduced a Poincaré-type functional parameter, $\lambda_\infty$, of a graph and related it to connectivity of the graph via Cheeger-type inequalities. A work by the second author, Raghavendra, and Vempala [2013] related the complexity of $\lambda_\infty$ to the so-called small-set expansion (SSE) problem and further set forth the desiderata for NP-hardness of this optimization problem. We confirm the conjecture that computing $\lambda_\infty$ is NP-hard for weighted trees. Beyond measuring connectivity, in many applications we want to optimize it. This, via convex duality, leads to a problem in machine learning known as the Maximum Variance Embedding (MVE). The output is a function from vertices to a low dim Euclidean space, subject to bounds on Euclidean distances between neighbors. The objective is to maximize output variance. Special cases of MVE into $n$ and $1$ dims lead to absolute algebraic connectivity [1990] and spread constant [1998], which measure connectivity of the graph and its Cartesian $n$-power, respectively. MVE has other applications in measuring diffusion speed and robustness of networks, clustering, and dimension reduction. We show that computing MVE in tree-width dims is NP-hard, while only one additional dim beyond the width of a given tree-decomposition makes the problem in P. We show that MVE of a tree in 2 dims defines a non-convex yet benign optimization landscape, i.e., local=global optima. We further develop a linear time combinatorial algorithm for this case. Finally, we show that approximate Maximum Variance Embedding is tractable in significantly lower dims. For trees and general graphs, for which Maximum Variance Embedding cannot be solved in fewer than $2$ and $\Omega(n)$ dims, we provide $1+\varepsilon$ approximation algorithms for embedding into $1$ and $O(\log n/\varepsilon^2)$ dims, respectively.
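
As a rough sketch of the MVE problem described in this abstract, one plausible way to write it down is given here. The symbols $x_v$ (embedded position of vertex $v$), $k$ (target dimension), $d_{uv}$ (distance bound on edge $\{u,v\}$), and the uniform weighting over the $n$ vertices are assumptions for illustration; the abstract itself does not fix this notation.

```latex
\max_{x : V \to \mathbb{R}^k} \;
  \frac{1}{n} \sum_{v \in V} \Bigl\| x_v - \frac{1}{n} \sum_{u \in V} x_u \Bigr\|^2
\quad \text{subject to} \quad
  \| x_u - x_v \| \le d_{uv} \;\; \text{for all } \{u,v\} \in E .
```

The objective is the variance of the embedded points, and the constraints are the bounds on Euclidean distances between neighbors.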

    Masking Strategies for Image Manifolds

    We consider the problem of selecting an optimal mask for an image manifold, i.e., choosing a subset of the pixels of the image that preserves the manifold's geometric structure present in the original data. Such masking implements a form of compressive sensing through emerging imaging sensor platforms for which the power expense grows with the number of pixels acquired. Our goal is for the manifold learned from masked images to resemble its full image counterpart as closely as possible. More precisely, we show that one can indeed accurately learn an image manifold without having to consider a large majority of the image pixels. In doing so, we consider two masking methods that preserve the local and global geometric structure of the manifold, respectively. In each case, the process of finding the optimal masking pattern can be cast as a binary integer program, which is computationally expensive but can be approximated by a fast greedy algorithm. Numerical experiments show that the relevant manifold structure is preserved through the data-dependent masking process, even for modest mask sizes.
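
To illustrate the idea of approximating a binary mask-selection program greedily, here is a minimal Python sketch. It is not the authors' algorithm: the function name `greedy_mask`, the objective (squared error between full and masked pairwise squared distances), and the tie-breaking rule are all assumptions chosen for simplicity.

```python
import numpy as np

def greedy_mask(X, m):
    """Greedily pick m pixel indices from images X (n_images x n_pixels).

    Hypothetical objective: make the pairwise squared distances computed on
    the selected pixels as close as possible to those on the full images.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if not 0 <= m <= p:
        raise ValueError("mask size m must be between 0 and the pixel count")
    # Per-pixel squared differences for every image pair: shape (n_pairs, p).
    i, j = np.triu_indices(n, k=1)
    D = (X[i] - X[j]) ** 2
    target = D.sum(axis=1)            # full-image squared distances
    partial = np.zeros_like(target)   # distances on the pixels chosen so far
    chosen, remaining = [], list(range(p))
    for _ in range(m):
        best, best_err = None, np.inf
        for c in remaining:
            err = np.sum((target - partial - D[:, c]) ** 2)
            if err < best_err:        # strict '<' keeps the lowest index on ties
                best, best_err = c, err
        chosen.append(best)
        remaining.remove(best)
        partial += D[:, best]
    return chosen
```

Each step costs one pass over the remaining pixels, so selecting `m` of `p` pixels takes on the order of `m * p` objective evaluations rather than a search over all subsets.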

    Structural Variability from Noisy Tomographic Projections

    In cryo-electron microscopy, the 3D electric potentials of an ensemble of molecules are projected along arbitrary viewing directions to yield noisy 2D images. The volume maps representing these potentials typically exhibit a great deal of structural variability, which is described by their 3D covariance matrix. Typically, this covariance matrix is approximately low-rank and can be used to cluster the volumes or estimate the intrinsic geometry of the conformation space. We formulate the estimation of this covariance matrix as a linear inverse problem, yielding a consistent least-squares estimator. For $n$ images of size $N$-by-$N$ pixels, we propose an algorithm for calculating this covariance estimator with computational complexity $\mathcal{O}(nN^4+\sqrt{\kappa}N^6 \log N)$, where the condition number $\kappa$ is empirically in the range $10$--$200$. Its efficiency relies on the observation that the normal equations are equivalent to a deconvolution problem in 6D. This is then solved by the conjugate gradient method with an appropriate circulant preconditioner. The result is the first computationally efficient algorithm for consistent estimation of 3D covariance from noisy projections. It also compares favorably in runtime with respect to previously proposed non-consistent estimators. Motivated by the recent success of eigenvalue shrinkage procedures for high-dimensional covariance matrices, we introduce a shrinkage procedure that improves accuracy at lower signal-to-noise ratios. We evaluate our methods on simulated datasets and achieve classification results comparable to state-of-the-art methods in shorter running time. We also present results on clustering volumes in an experimental dataset, illustrating the power of the proposed algorithm for practical determination of structural variability. Comment: 52 pages, 11 figures
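
The combination of conjugate gradient with a circulant preconditioner can be illustrated on a toy problem. The sketch below is a 1D analogue only, not the paper's 6D solver: it solves a symmetric positive-definite Toeplitz system, and the function name `pcg_circulant`, the Strang-type choice of circulant, and the dense matrix construction are assumptions made for readability.

```python
import numpy as np

def pcg_circulant(t, b, tol=1e-10, max_iter=200):
    """Solve T x = b by preconditioned CG, where T is the symmetric Toeplitz
    matrix with first column t. Assumes T and the circulant are both
    positive definite."""
    t = np.asarray(t, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(t)
    # Dense Toeplitz matrix, built explicitly for clarity (toy sizes only).
    idx = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    T = t[idx]
    # Strang-type circulant: keep the central diagonals and wrap them around.
    c = t.copy()
    for k in range(n // 2 + 1, n):
        c[k] = t[n - k]
    eig = np.fft.fft(c).real  # circulant eigenvalues (real, since c is symmetric)

    def apply_prec(r):
        # Applying the inverse circulant is a pointwise division in Fourier space.
        return np.fft.ifft(np.fft.fft(r) / eig).real

    x = np.zeros(n)
    r = b - T @ x
    z = apply_prec(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Tp = T @ p
        alpha = rz / (p @ Tp)
        x += alpha * p
        r -= alpha * Tp
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_prec(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

The preconditioner is cheap because a circulant matrix is diagonalized by the FFT, which is consistent with the $\log N$ factor in the complexity quoted in the abstract.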

    Topological data analysis of contagion maps for examining spreading processes on networks

    Social and biological contagions are influenced by the spatial embeddedness of networks. Historically, many epidemics spread as a wave across part of the Earth's surface; however, in modern contagions long-range edges -- for example, due to airline transportation or communication media -- allow clusters of a contagion to appear in distant locations. Here we study the spread of contagions on networks through a methodology grounded in topological data analysis and nonlinear dimension reduction. We construct "contagion maps" that use multiple contagions on a network to map the nodes as a point cloud. By analyzing the topology, geometry, and dimensionality of manifold structure in such point clouds, we reveal insights to aid in the modeling, forecast, and control of spreading processes. Our approach highlights contagion maps also as a viable tool for inferring low-dimensional structure in networks. Comment: Main Text and Supplementary Information

    Conditional t-SNE: Complementary t-SNE embeddings through factoring out prior information

    Dimensionality reduction and manifold learning methods such as t-Distributed Stochastic Neighbor Embedding (t-SNE) are routinely used to map high-dimensional data into a 2-dimensional space to visualize and explore the data. However, two dimensions are typically insufficient to capture all structure in the data, the salient structure is often already known, and it is not obvious how to extract the remaining information in a similarly effective manner. To fill this gap, we introduce conditional t-SNE (ct-SNE), a generalization of t-SNE that discounts prior information from the embedding in the form of labels. To achieve this, we propose a conditioned version of the t-SNE objective, obtaining a single, integrated, and elegant method. ct-SNE has one extra parameter over t-SNE; we investigate its effects and show how to efficiently optimize the objective. Factoring out prior knowledge allows complementary structure to be captured in the embedding, providing new insights. Qualitative and quantitative empirical results on synthetic and (large) real data show ct-SNE is effective and achieves its goal.

    Spread spectrum-based video watermarking algorithms for copyright protection

    Merged with duplicate record 10026.1/2263 on 14.03.2017 by CS (TIS). Digital technologies have seen unprecedented expansion in recent years. The consumer can now benefit from hardware and software that were considered state-of-the-art several years ago. The advantages offered by digital technologies are major, but the same technology opens the door to unlimited piracy. Copying an analogue VCR tape was certainly possible and relatively easy, in spite of various forms of protection, but due to the analogue environment, subsequent copies had an inherent loss in quality. This was a natural way of limiting the multiple copying of video material. With digital technology, this barrier disappears: it is possible to make as many copies as desired, without any loss in quality whatsoever. Digital watermarking is one of the best available tools for fighting this threat. The aim of the present work was to develop a digital watermarking system compliant with the recommendations drawn up by the EBU for video broadcast monitoring. Since the watermark can be inserted in either the spatial domain or the transform domain, this aspect was investigated and led to the conclusion that the wavelet transform is one of the best solutions available. Since watermarking is not an easy task, especially considering robustness under various attacks, several techniques were employed in order to increase the capacity/robustness of the system: spread-spectrum and modulation techniques to cast the watermark, powerful error correction to protect the mark, and human visual models to insert a robust mark and ensure its invisibility. The combination of these methods led to a major improvement, yet the system was still not robust to several important geometrical attacks. In order to achieve this last milestone, the system uses two distinct watermarks: a spatial domain reference watermark and the main watermark embedded in the wavelet domain. By using this reference watermark and techniques specific to image registration, the system is able to determine the parameters of the attack and revert it. Once the attack is reverted, the main watermark is recovered. The final result is a high capacity, blind DWT-based video watermarking system, robust to a wide range of attacks. BBC Research & Development
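
The spread-spectrum idea mentioned in this abstract can be sketched in a few lines of Python. This is a generic textbook-style illustration, not the thesis's system: it has no error correction, visual model, or reference watermark, and the plain array `coeffs` merely stands in for transform-domain (e.g. wavelet) coefficients. The names `embed`, `detect`, `key`, `strength`, and `chips` are hypothetical.

```python
import numpy as np

def _pn_sequences(key, n_bits, chips):
    # Key-seeded pseudo-noise sequences of +/-1 values, one row per payload bit.
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=(n_bits, chips))

def embed(coeffs, bits, key, strength=2.0, chips=64):
    """Spread each payload bit over `chips` coefficients using a PN sequence."""
    out = np.asarray(coeffs, dtype=float).copy()
    if out.size < len(bits) * chips:
        raise ValueError("not enough coefficients for the payload")
    pn = _pn_sequences(key, len(bits), chips)
    for k, bit in enumerate(bits):
        sign = 1.0 if bit else -1.0
        out[k * chips:(k + 1) * chips] += strength * sign * pn[k]
    return out

def detect(coeffs, n_bits, key, chips=64):
    """Blind detection: correlate with the same PN sequences and take the sign.
    The original, unmarked coefficients are not needed."""
    c = np.asarray(coeffs, dtype=float)
    pn = _pn_sequences(key, n_bits, chips)
    return [int(np.dot(c[k * chips:(k + 1) * chips], pn[k]) > 0)
            for k in range(n_bits)]
```

For example, `detect(embed(host, [1, 0, 1, 1], key=7), 4, key=7)` is intended to recover the payload when `host` holds at least 256 coefficients. Detection with a different `key` correlates against unrelated sequences, which is what ties the watermark to its owner.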