
    A density-based statistical analysis of graph clustering algorithm performance

    This is a pre-copyedited, author-produced version of an article accepted for publication in Journal of Complex Networks following peer review. The version of record: Pierre Miasnikof, Alexander Y Shestopaloff, Anthony J Bonner, Yuri Lawryshyn, Panos M Pardalos, A density-based statistical analysis of graph clustering algorithm performance, Journal of Complex Networks, Volume 8, Issue 3, June 2020, cnaa012, https://doi.org/10.1093/comnet/cnaa012 is available online at: https://doi.org/10.1093/comnet/cnaa012. © 2020 The authors. Published by Oxford University Press. All rights reserved. We introduce graph clustering quality measures based on comparisons of global, intra- and inter-cluster densities, an accompanying statistical significance test and a step-by-step routine for clustering quality assessment. Our work is centred on the idea that well-clustered graphs will display a mean intra-cluster density that is higher than both the global density and the mean inter-cluster density. We do not rely on any generative model for the null model graph. Our measures are shown to meet the axioms of a good clustering quality function. They have an intuitive graph-theoretic interpretation, a formal statistical interpretation and can be tested for significance. Empirical tests also show they are more responsive to graph structure, less likely to break down during numerical implementation and less sensitive to uncertainty in connectivity than the commonly used measures.
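The density comparison described in this abstract can be illustrated with a minimal sketch. The definitions used here are assumptions chosen for illustration (edges present divided by edges possible, for an undirected simple graph) and may differ from the exact measures in the paper; the function name `cluster_densities` and the toy graph are hypothetical, and no significance test is included.

```python
from itertools import combinations


def cluster_densities(n_nodes, edges, labels):
    """Return (global, mean intra-cluster, mean inter-cluster) edge densities.

    Assumes an undirected simple graph; labels[v] is the cluster of node v.
    Density is taken as edges present / edges possible (an assumption).
    """
    clusters = sorted(set(labels))
    size = {c: labels.count(c) for c in clusters}
    intra = {c: 0 for c in clusters}
    inter = {pair: 0 for pair in combinations(clusters, 2)}

    for u, v in edges:
        cu, cv = labels[u], labels[v]
        if cu == cv:
            intra[cu] += 1
        else:
            inter[tuple(sorted((cu, cv)))] += 1

    global_density = len(edges) / (n_nodes * (n_nodes - 1) / 2)
    # Single-node clusters have no possible internal edge, so they are skipped.
    intra_vals = [intra[c] / (size[c] * (size[c] - 1) / 2)
                  for c in clusters if size[c] > 1]
    inter_vals = [inter[(a, b)] / (size[a] * size[b]) for a, b in inter]
    mean_intra = sum(intra_vals) / len(intra_vals) if intra_vals else 0.0
    mean_inter = sum(inter_vals) / len(inter_vals) if inter_vals else 0.0
    return global_density, mean_intra, mean_inter


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = [0, 0, 0, 1, 1, 1]
g, intra, inter = cluster_densities(6, edges, labels)
```

On this toy graph the mean intra-cluster density exceeds the global density, which in turn exceeds the mean inter-cluster density, the ordering the abstract associates with a well-clustered graph.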

    HIPAD - A Hybrid Interior-Point Alternating Direction algorithm for knowledge-based SVM and feature selection

    We consider classification tasks in the regime of scarce labeled training data in high-dimensional feature space, where specific expert knowledge is also available. We propose a new hybrid optimization algorithm that solves the elastic-net support vector machine (SVM) through an alternating direction method of multipliers in the first phase, followed by an interior-point method for the classical SVM in the second phase. Both SVM formulations are adapted to knowledge incorporation. Our proposed algorithm addresses the challenges of automatic feature selection, high optimization accuracy, and algorithmic flexibility for taking advantage of prior knowledge. We demonstrate the effectiveness and efficiency of our algorithm and compare it with existing methods on a collection of synthetic and real-world data. Comment: Proceedings of the 8th Learning and Intelligent OptimizatioN (LION8) Conference, 201

    Dense subgraph maintenance under streaming edge weight updates for real-time story identification

    Recent years have witnessed an unprecedented proliferation of social media. Every day, people around the globe author millions of blog posts, social network status updates, etc. This rich stream of information can be used to identify, on an ongoing basis, emerging stories and events that capture popular attention. Stories can be identified via groups of tightly coupled real-world entities, namely the people, locations, products, etc., that are involved in the story. The sheer scale and rapid evolution of the data involved necessitate highly efficient techniques for identifying important stories at every point in time. The main challenge in real-time story identification is the maintenance of dense subgraphs (corresponding to groups of tightly coupled entities) under streaming edge weight updates (resulting from a stream of user-generated content). This is the first work to study the efficient maintenance of dense subgraphs under such streaming edge weight updates. For a wide range of definitions of density, we derive theoretical results regarding the magnitude of change that a single edge weight update can cause. Based on these, we propose a novel algorithm, DynDens, which outperforms adaptations of existing techniques to this setting and yields meaningful, intuitive results. Our approach is validated by a thorough experimental evaluation on large-scale real and synthetic datasets.
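To make the effect of a single edge weight update concrete, here is a small sketch for one possible density definition: average edge weight over all node pairs in a subgraph. This definition, the helper `avg_weight_density` and the toy weights are assumptions for illustration only; the sketch is not DynDens and does not reproduce the paper's theoretical results.

```python
from itertools import combinations


def avg_weight_density(weights, nodes):
    """Total edge weight inside `nodes` divided by the number of node pairs.

    `weights` maps sorted node pairs (u, v) with u < v to edge weights;
    missing pairs count as weight 0.
    """
    pairs = list(combinations(sorted(nodes), 2))
    return sum(weights.get(p, 0.0) for p in pairs) / len(pairs)


# Hypothetical entity graph; nodes 0, 1 and 2 form a candidate story.
weights = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.7, (2, 3): 0.1}
story = {0, 1, 2}
before = avg_weight_density(weights, story)

# Streaming update: the weight of edge (1, 2) rises by delta.
delta = 0.3
weights[(1, 2)] += delta
after = avg_weight_density(weights, story)
# Under this definition the density changes by delta / (number of pairs).
```

Under this particular definition, an update of size `delta` on an edge inside a subgraph of k nodes shifts that subgraph's density by `delta / (k * (k - 1) / 2)`, so the impact of one update is bounded and shrinks as the subgraph grows.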

    Atomic super-resolution tomography

    We consider the problem of reconstructing a nanocrystal at atomic resolution from electron microscopy images taken at a few tilt angles. A popular reconstruction approach called discrete tomography confines the atom locations to a coarse spatial grid, which is inspired by the physical a priori knowledge that atoms in a crystalline solid tend to form regular lattices. Although this constraint has proven to be powerful for solving this very under-determined inverse problem in many cases, its key limitation is that, in practice, defects may occur that cause atoms to deviate from regular lattice positions. Here we propose a grid-free discrete tomography algorithm that allows for continuous deviations of the atom locations, similar to super-resolution approaches for microscopy. The new formulation allows us to define atomic interaction potentials explicitly, which incorporates the available physical a priori knowledge about the crystal's properties in a way that is both meaningful and powerful. In computational experiments, we compare the proposed grid-free method to established grid-based approaches and show that our approach can indeed recover the atom positions more accurately for common lattice defects.

    The Functional Consequences of Mutualistic Network Architecture

    The architecture and properties of many complex networks play a significant role in the functioning of the systems they describe. Recently, complex network theory has been applied to ecological entities, like food webs or mutualistic plant-animal interactions. Unfortunately, we still lack an accurate view of the relationship between the architecture and functioning of ecological networks. In this study, we explore this link by building individual-based pollination networks from eight Erysimum mediohispanicum (Brassicaceae) populations. In these individual-based networks, each individual plant in a population was considered a node, and was connected by means of undirected links to conspecifics sharing pollinators. The architecture of these unipartite networks was described by means of nestedness, connectivity and transitivity. Network functioning was estimated by quantifying the performance of the population described by each network as the per-capita number of juvenile plants produced in each population. We found a consistent relationship between the topology of the networks and their functioning, since variation across populations in the average per-capita production of juvenile plants was positively and significantly related to network nestedness, connectivity and clustering. Subtle changes in the composition of diverse pollinator assemblages can drive major consequences for plant population performance and local persistence through modifications in the structure of the inter-plant pollination networks.
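The network construction described in this abstract (plants as nodes, undirected links between plants that share pollinators) can be sketched as follows. The visit data, the names `plant_network` and `transitivity`, and the rule of linking on at least one shared pollinator are assumptions for illustration; the study's own linking rule and metric definitions may differ.

```python
from itertools import combinations


def plant_network(visits):
    """Link two plants whenever they share at least one pollinator."""
    adj = {p: set() for p in visits}
    for a, b in combinations(visits, 2):
        if visits[a] & visits[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def transitivity(adj):
    """Closed two-paths divided by all two-paths (global clustering)."""
    closed = total = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            total += 1
            closed += b in adj[a]
    return closed / total if total else 0.0


# Hypothetical pollinator visits for four individual plants.
visits = {
    "plant1": {"bee", "fly"},
    "plant2": {"bee", "fly"},
    "plant3": {"fly", "beetle"},
    "plant4": {"beetle"},
}
adj = plant_network(visits)
t = transitivity(adj)
```

In this toy population, plants 1 to 3 form a triangle through shared bee and fly visits, while plant 4 attaches only to plant 3 through the beetle, so three of the five two-paths are closed.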

    Guaranteed Error Bounds on Approximate Model Abstractions Through Reachability Analysis

    It is well known that exact notions of model abstraction and reduction for dynamical systems may not be robust enough in practice because they are highly sensitive to the specific choice of parameters. In this paper, we consider this problem for nonlinear ordinary differential equations (ODEs) with polynomial derivatives. We introduce approximate differential equivalence as a more permissive variant of a recently developed exact counterpart, allowing ODE variables to be related even when they are governed by nearby derivatives. We develop algorithms to (i) compute the largest approximate differential equivalence; (ii) construct an approximate quotient model from the original one via an appropriate parameter perturbation; and (iii) provide a formal certificate on the quality of the approximation as an error bound, computed as an over-approximation of the reachable set of the perturbed model. Finally, we apply approximate differential equivalences to study the effect of parametric tolerances in models of symmetric electric circuits.