Exploiting geometry, topology, and optimization for knowledge discovery in big data
2013 Summer. Includes bibliographical references. In this dissertation, we consider several topics that are united by the theme of topological and geometric data analysis. First, we consider an application in landscape ecology using a well-known vector quantization algorithm to characterize and segment the color content of natural imagery. Color information in an image may be viewed naturally as clusters of pixels with similar attributes. The inherent structure and distribution of these clusters serve to quantize the information in the image and provide a basis for classification. A friendly graphical user interface called Biological Landscape Organizer and Semi-supervised Segmenting Machine (BLOSSM) was developed to aid in this classification. We consider four different choices for color space and five different metrics in which to analyze our data, and we compare the results. Second, we present a novel topologically driven clustering algorithm that blends Locally Linear Embedding (LLE) and vector quantization by mapping color information to a lower-dimensional space, identifying distinct color regions, and classifying pixels together based on both a proximity measure and color content. These techniques permit a significant reduction in color resolution while maintaining the visually important features of images. Third, we develop a novel algorithm, which we call Sparse LLE, that leads to sparse representations in local reconstructions by using a data-weighted 1-norm regularization term in the objective function of an optimization problem. This new formulation proves effective at automatically determining an appropriate number of nearest neighbors for each data point. We explore various optimization techniques, namely primal-dual interior point algorithms, to solve this problem, comparing the computational complexity of each.
Fourth, we present a novel algorithm that can be used to determine the boundary of a data set, or the vertices of a convex hull encasing a point cloud of data, in any dimension by solving a quadratic optimization problem. In this problem, each point is written as a linear combination of its nearest neighbors, and the coefficients of this linear combination are penalized if they do not form a convex combination. This reveals the points that cannot be represented in this way, namely the vertices of the convex hull containing the data. Finally, we exploit persistent homology, a relatively new tool from topological data analysis, and consider the use of vector bundles to re-embed data in order to improve the topological signal of a data set by embedding points sampled from a projective variety into successive Grassmannians.
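The abstract does not name the vector quantization algorithm it uses. As a rough illustration of the general idea only, the sketch below quantizes pixel colors with Lloyd's algorithm (k-means). The RGB color space, the Euclidean metric, the deterministic initialization, and the function name quantize_colors are all assumptions made for this example, not details taken from the dissertation.

```python
import math


def quantize_colors(pixels, k, iterations=20):
    """Cluster RGB tuples with Lloyd's algorithm (k-means).

    Returns (centroids, labels). The initial centroids are the first k
    distinct pixels: a deterministic choice made for illustration only.
    """
    centroids = []
    for p in pixels:
        candidate = tuple(float(c) for c in p)
        if candidate not in centroids:
            centroids.append(candidate)
        if len(centroids) == k:
            break

    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid
        # (Euclidean distance in RGB, an assumed metric).
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in pixels
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(pixels, labels) if label == j]
            if members:
                mean = tuple(sum(ch) / len(members) for ch in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Two reddish and two bluish pixels collapse to two representative colors.
pixels = [(250, 0, 0), (0, 0, 250), (255, 5, 0), (5, 0, 255)]
centroids, labels = quantize_colors(pixels, k=2)
```

Replacing each pixel by its centroid reduces the color resolution to k colors. A different color space or metric would change only the distance computation in the assignment step.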
Reflections on Math Students’ Circles: Two Personal Stories from Colorado
Math Students’ Circles provide an opportunity for mathematicians to work in their communities to engage young students in mathematics as a human, aesthetic, and social endeavor. Sometimes referred to simply as Math Circles, these venues give mathematicians experience in introducing children to topics not typically seen in school curricula in an exciting, hands-on format. This article explores two Math Students’ Circles (MSCs) in the state of Colorado from the point of view of two pre-tenure faculty members. One participated in MSCs for four years while working on her Ph.D. in mathematics; the other started an MSC as an offshoot of a successful professional development program for middle school mathematics teachers. We discuss how and why MSCs have influenced our professional lives.
Topological Data Analysis of Biological Aggregation Models
We apply tools from topological data analysis to two mathematical models
inspired by biological aggregations such as bird flocks, fish schools, and
insect swarms. Our data consists of numerical simulation output from the models
of Vicsek and D'Orsogna. These models are dynamical systems describing the
movement of agents who interact via alignment, attraction, and/or repulsion.
Each simulation time frame is a point cloud in position-velocity space. We
analyze the topological structure of these point clouds, interpreting the
persistent homology by calculating the first few Betti numbers. These Betti
numbers count connected components, topological circles, and trapped volumes
present in the data. To interpret our results, we introduce a visualization
that displays Betti numbers over simulation time and topological persistence
scale. We compare our topological results to order parameters typically used to
quantify the global behavior of aggregations, such as polarization and angular
momentum. The topological calculations reveal events and structure not captured
by the order parameters.
Comment: 25 pages, 12 figures; second version contains typo corrections, minor textual additions, and a brief discussion of computational complexity; third version fixes one typo and adds a small paragraph about topological stability.
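To make two of the quantities in this abstract concrete, here is a minimal sketch of one order parameter (polarization) and the zeroth Betti number of a point cloud at a fixed scale. It is not the authors' code. The function names, the restriction to two-dimensional velocities with nonzero speed, and the convention that two points are linked when their distance is at most the scale are assumptions made for this example.

```python
import math


def polarization(velocities):
    """Norm of the average unit velocity (2D velocities, nonzero speeds assumed).

    Equals 1 when all agents share a heading and 0 when headings cancel.
    """
    sx = sy = 0.0
    for vx, vy in velocities:
        speed = math.hypot(vx, vy)
        sx += vx / speed
        sy += vy / speed
    return math.hypot(sx, sy) / len(velocities)


def betti_0(points, scale):
    """Count connected components when points within `scale` are linked.

    Uses a union-find structure over all pairs of points.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= scale:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})


# Two well-separated pairs of agents form two components at scale 1.5.
cloud = [(0, 0), (1, 0), (10, 0), (11, 0)]
components = betti_0(cloud, scale=1.5)
aligned = polarization([(1, 0), (2, 0), (0.5, 0)])
```

Evaluating betti_0 over a range of scales for each simulation frame would give the kind of time-versus-scale picture the abstract describes, restricted to connected components. Higher Betti numbers require a full persistent homology computation, which this sketch does not attempt.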
Vietoris-Rips and Cech Complexes of Metric Gluings
We study Vietoris-Rips and Cech complexes of metric wedge sums and metric gluings. We show that the Vietoris-Rips (resp. Cech) complex of a wedge sum, equipped with a natural metric, is homotopy equivalent to the wedge sum of the Vietoris-Rips (resp. Cech) complexes. We also provide generalizations for certain metric gluings, i.e., when two metric spaces are glued together along a common isometric subset. As our main example, we deduce the homotopy type of the Vietoris-Rips complex of two metric graphs glued together along a sufficiently short path. As a result, we can describe the persistent homology, in all homological dimensions, of the Vietoris-Rips complexes of a wide class of metric graphs.
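For readers unfamiliar with the construction, the following sketch builds a Vietoris-Rips complex of a finite point set by brute force. It assumes the convention that a simplex is included when all pairwise distances between its vertices are at most the scale (other sources use a strict inequality or a radius parameter). The function name vietoris_rips and the max_dim cutoff are choices made for this example.

```python
import math
from itertools import combinations


def vietoris_rips(points, scale, max_dim=2):
    """Return simplices, as tuples of point indices, up to dimension max_dim.

    A set of vertices spans a simplex when every pair of its vertices
    lies within `scale` of each other.
    """
    n = len(points)
    close = [
        [math.dist(points[i], points[j]) <= scale for j in range(n)]
        for i in range(n)
    ]
    simplices = []
    for size in range(1, max_dim + 2):  # a d-simplex has d + 1 vertices
        for candidate in combinations(range(n), size):
            if all(close[i][j] for i, j in combinations(candidate, 2)):
                simplices.append(candidate)
    return simplices


# Corners of a unit square: at scale 1 only the four sides appear, so the
# complex is a cycle; at scale 1.5 the diagonals and all triangles fill in.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
cycle = vietoris_rips(square, scale=1.0)
filled = vietoris_rips(square, scale=1.5)
```

The brute-force enumeration is exponential in max_dim and is meant only to show the definition on small inputs.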