28 research outputs found

    Avoiding the Global Sort: A Faster Contour Tree Algorithm

    We revisit the classical problem of computing the contour tree of a scalar field $f:\mathbb{M} \to \mathbb{R}$, where $\mathbb{M}$ is a triangulated simplicial mesh in $\mathbb{R}^d$. The contour tree is a fundamental topological structure that tracks the evolution of level sets of $f$ and has numerous applications in data analysis and visualization. All existing algorithms begin with a global sort of at least all critical values of $f$, which can require (roughly) $\Omega(n\log n)$ time. Existing lower bounds show that there are pathological instances where this sort is required. We present the first algorithm whose time complexity depends on the contour tree structure, and avoids the global sort for non-pathological inputs. If $C$ denotes the set of critical points in $\mathbb{M}$, the running time is roughly $O(\sum_{v \in C} \log \ell_v)$, where $\ell_v$ is the depth of $v$ in the contour tree. This matches all existing upper bounds, but is a significant improvement when the contour tree is short and fat. Specifically, our approach ensures that any comparison made is between nodes in the same descending path in the contour tree, allowing us to argue strong optimality properties of our algorithm. Our algorithm requires several novel ideas: partitioning $\mathbb{M}$ in well-behaved portions, a local growing procedure to iteratively build contour trees, and the use of heavy path decompositions for the time complexity analysis
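    For context, the global sort that this abstract sets out to avoid can be illustrated with a minimal sketch of the classical sweep for the (augmented) join tree, one half of the usual contour tree construction. This is not the paper's algorithm: it is the sort-plus-union-find baseline, written under the assumption that the mesh is given only by its vertex values and edge list. The function name `augmented_join_tree` and its parameters are hypothetical.

```python
def augmented_join_tree(values, edges):
    """Sweep vertices from high to low and record where superlevel-set
    components appear and merge. Returns arcs (upper_vertex, lower_vertex).

    Illustrative baseline only: it starts with the global sort that the
    abstract's algorithm is designed to avoid.
    """
    n = len(values)
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    parent = list(range(n))   # union-find forest
    lowest = list(range(n))   # lowest vertex swept so far, stored per root

    def find(x):
        # Find the component root, halving paths along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # The global sort: O(n log n) regardless of the shape of the tree.
    order = sorted(range(n), key=lambda i: (values[i], i), reverse=True)

    seen = [False] * n
    arcs = []
    for v in order:
        seen[v] = True
        for u in adj[v]:
            if not seen[u]:
                continue  # u is lower than v; it is handled later
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # already in the same component
            # The component containing u reaches down to v.
            arcs.append((lowest[ru], v))
            parent[ru] = rv
            lowest[rv] = v
    return arcs
```

    On a five-vertex path with values 3, 1, 4, 0, 2, the call `augmented_join_tree([3, 1, 4, 0, 2], [(0, 1), (1, 2), (2, 3), (3, 4)])` yields the arcs `[(0, 1), (2, 1), (1, 3), (4, 3)]`: the maxima at vertices 0 and 2 merge at vertex 1, and that component merges with the maximum at vertex 4 at the global minimum, vertex 3.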

    In pursuit of linear complexity in discrete and computational geometry

    Many computational problems arise naturally from geometric data. In this thesis, we consider three such problems: (i) distance optimization problems over point sets, (ii) computing contour trees over simplicial meshes, and (iii) bounding the expected complexity of weighted Voronoi diagrams. While these topics are broad, here the focus is on identifying structure which implies linear (or near linear) algorithmic and descriptive complexity. The first topic we consider is in geometric optimization. More specifically, we define a large class of distance problems, for which we provide linear time exact or approximate solutions. Roughly speaking, the class of problems facilitates either clustering together close points (i.e., netting) or throwing out outliers (i.e., pruning), allowing for successively smaller summaries of the relevant information in the input. A surprising number of classical geometric optimization problems are unified under this framework, including finding the optimal k-center clustering, the kth ranked distance, the kth heaviest edge of the MST, the minimum radius ball enclosing k points, and many others. In several cases we get the first known linear time approximation algorithm for a given problem, where our approximation ratio matches that of previous work. The second topic we investigate is contour trees, a fundamental structure in computational topology. Contour trees give a compact summary of the evolution of level sets on a mesh, and are typically used on massive data sets. Previous algorithms for computing contour trees took Θ(n log n) time and were worst-case optimal. Here we provide an algorithm whose running time lies between Θ(nα(n)) and Θ(n log n), and varies depending on the shape of the tree, where α(n) is the inverse Ackermann function. In particular, this is the first algorithm with O(nα(n)) running time on instances with balanced contour trees. 
Our algorithmic results are complemented by lower bounds indicating that, up to a factor of α(n), on all instance types our algorithm performs optimally. For the final topic, we consider the descriptive complexity of weighted Voronoi diagrams. Such diagrams have quadratic (or higher) worst-case complexity; however, as was the case for contour trees, here we push beyond worst-case analysis. A new diagram, called the candidate diagram, is introduced, which allows us to bound the complexity of weighted Voronoi diagrams arising from a particular probabilistic input model. Specifically, we assume weights are randomly permuted among fixed Voronoi sites, an assumption which is weaker than the more typical sampled locations assumption. Under this assumption, the expected complexity is shown to be near linear

    Analyze Large Multidimensional Datasets Using Algebraic Topology

    This paper presents an efficient algorithm to extract knowledge from high-dimensionality, high-complexity datasets using algebraic topology, namely simplicial complexes. Based on the concept of isomorphism of relations, our method turns a relational table into a geometric object (a simplicial complex is a polyhedron). Conceptually, association rule searching thus becomes a geometric traversal problem. By leveraging the core concepts behind simplicial complexes, we use a technique (new to computer science) that improves performance over existing methods and uses far less memory. It was designed and developed with a strong emphasis on scalability, reliability, and extensibility. This paper also investigates the possibility of Hadoop integration and the challenges that come with the framework
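    The abstract does not spell out its construction, so the following is only a hedged sketch of one common way to read "a relational table as a simplicial complex": each (attribute, value) pair is treated as a vertex, and a set of vertices forms a simplex when the values co-occur in at least `min_support` rows. Because support can only shrink as a set grows, every face of a frequent simplex is also frequent, which is what makes the result a simplicial complex. The function name `frequent_simplices` and the parameters `min_support` and `max_dim` are assumptions made for illustration, not names from the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_simplices(rows, min_support, max_dim=2):
    """Return the simplices (tuples of (attribute, value) vertices) that
    occur in at least `min_support` rows, up to dimension `max_dim`.

    Hypothetical illustration; the paper's actual method is not given
    in the abstract.
    """
    counts = Counter()
    for row in rows:
        # Attribute names are unique within a row, so sorting the
        # (attribute, value) pairs gives a canonical vertex order.
        vertices = sorted(row.items())
        # A k-vertex subset is a simplex of dimension k - 1.
        for k in range(1, max_dim + 2):
            for simplex in combinations(vertices, k):
                counts[simplex] += 1
    return {s for s, c in counts.items() if c >= min_support}
```

    Under this reading, searching for association rules among frequent value combinations amounts to walking from a simplex to its faces and cofaces. Enumerating all subsets per row is exponential in `max_dim`, so this sketch is only meant for small illustrative tables.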

    Hypersweeps, Convective Clouds and Reeb Spaces

    Isosurfaces are one of the most prominent tools in scientific data visualisation. An isosurface is a surface that defines the boundary of a feature of interest in space for a given threshold. This is integral in analysing data from the physical sciences which observe and simulate three- or four-dimensional phenomena. However, it is time consuming and impractical to discover surfaces of interest by manually selecting different thresholds. The systematic way to discover significant isosurfaces in data is with a topological data structure called the contour tree. The contour tree encodes the connectivity and shape of each isosurface at all possible thresholds. The first part of this work has been devoted to developing algorithms that use the contour tree to discover significant features in data using high performance computing systems. Those algorithms provided a clear speedup over previous methods and were used to visualise physical plasma simulations. A major limitation of isosurfaces and contour trees is that they are only applicable when a single property is associated with data points. However, scientific data sets often take multiple properties into account. A recent breakthrough generalised isosurfaces to fiber surfaces. Fiber surfaces define the boundary of a feature where the threshold is defined in terms of multiple parameters, instead of just one. In this work we used fiber surfaces together with isosurfaces and the contour tree to create a novel application that helps atmosphere scientists visualise convective cloud formation. Using this application, they were able to, for the first time, visualise the physical properties of certain structures that trigger cloud formation. Contour trees can also be generalised to handle multiple parameters. The natural extension of the contour tree is called the Reeb space and it comes from the pure mathematical field of fiber topology. 
The Reeb space is not yet fully understood mathematically and algorithms for computing it have significant practical limitations. A key difficulty is that while the contour tree is a traditional one dimensional data structure made up of points and lines between them, the Reeb space is far more complex. The Reeb space is made up of two dimensional sheets, attached to each other in intricate ways. The last part of this work focuses on understanding the structure of Reeb spaces and the rules that are followed when sheets are combined. This theory builds towards developing robust combinatorial algorithms to compute and use Reeb spaces for practical data analysis

    Multilevel Skeletonization Using Local Separators


    Persistent Homology in Multivariate Data Visualization

    Technological advances of recent years have changed the way research is done. When describing complex phenomena, it is now possible to measure and model a myriad of different aspects pertaining to them. This increasing number of variables, however, poses significant challenges for the visual analysis and interpretation of such multivariate data. Yet, the effective visualization of structures in multivariate data is of paramount importance for building models, forming hypotheses, and understanding intrinsic properties of the underlying phenomena. This thesis provides novel visualization techniques that advance the field of multivariate visual data analysis by helping represent and comprehend the structure of high-dimensional data. In contrast to approaches that focus on visualizing multivariate data directly or by means of their geometrical features, the methods developed in this thesis focus on their topological properties. More precisely, these methods provide structural descriptions that are driven by persistent homology, a technique from the emerging field of computational topology. Such descriptions are developed in two separate parts of this thesis. The first part deals with the qualitative visualization of topological features in multivariate data. It presents novel visualization methods that directly depict topological information, thus permitting the comparison of structural features in a qualitative manner. The techniques described in this part serve as low-dimensional representations that make the otherwise high-dimensional topological features accessible. We show how to integrate them into data analysis workflows based on clustering in order to obtain more information about the underlying data. The efficacy of such combined workflows is demonstrated by analysing complex multivariate data sets from cultural heritage and political science, for example, whose structures are hidden to common visualization techniques. 
The second part of this thesis is concerned with the quantitative visualization of topological features. It describes novel methods that measure different aspects of multivariate data in order to provide quantifiable information about them. Here, the topological characteristics serve as a feature descriptor. Using these descriptors, the visualization techniques in this part focus on augmenting and improving existing data analysis processes. Among others, they deal with the visualization of high-dimensional regression models, the visualization of errors in embeddings of multivariate data, as well as the assessment and visualization of the results of different clustering algorithms. All the methods presented in this thesis are evaluated and analysed on different data sets in order to show their robustness. This thesis demonstrates that the combination of geometrical and topological methods may support, complement, and surpass existing approaches for multivariate visual data analysis