Axiomatic Construction of Hierarchical Clustering in Asymmetric Networks
This paper considers networks where relationships between nodes are
represented by directed dissimilarities. The goal is to study methods for the
determination of hierarchical clusters, i.e., a family of nested partitions
indexed by a connectivity parameter, induced by the given dissimilarity
structures. Our construction of hierarchical clustering methods is based on
defining admissible methods to be those methods that abide by the axioms of
value - nodes in a network with two nodes are clustered together at the maximum
of the two dissimilarities between them - and transformation - when
dissimilarities are reduced, the network may become more clustered but not
less. Several admissible methods are constructed and two particular methods,
termed reciprocal and nonreciprocal clustering, are shown to provide upper and
lower bounds in the space of admissible methods. Alternative clustering
methodologies and axioms are further considered. Allowing the outcome of
hierarchical clustering to be asymmetric, so that it matches the asymmetry of
the original data, leads to the inception of quasi-clustering methods. The
existence of a unique quasi-clustering method is shown. Allowing clustering in
a two-node network to proceed at the minimum of the two dissimilarities
generates an alternative axiomatic construction. There is a unique clustering
method in this case too. The paper also develops algorithms for the computation
of hierarchical clusters using matrix powers on a min-max dioid algebra and
studies the stability of the methods proposed. We prove that most of the
methods introduced in this paper are such that similar networks yield similar
hierarchical clustering results. Algorithms are exemplified through their
application to networks describing internal migration within states of the
United States (U.S.) and the interrelation between sectors of the U.S. economy.
Comment: This is a largely extended version of the previous conference
submission under the same title. The current version contains the material in
the previous version (published in ICASSP 2013) as well as material presented
at the Asilomar Conference on Signal, Systems, and Computers 2013, GlobalSIP
2013, and ICML 2014. Also, unpublished material is included in the current
versio
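The abstract above mentions computing hierarchical clusters with matrix powers in a min-max dioid algebra. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that the reciprocal ultrametric is the (n-1)-th min-max power of the symmetrized matrix max(A, A^T), and that the nonreciprocal ultrametric is the entrywise maximum of the (n-1)-th min-max power of A and its transpose. All function names are hypothetical.

```python
def minmax_product(A, B):
    """Product in the min-max dioid: (A*B)[i][j] = min over k of max(A[i][k], B[k][j])."""
    n = len(A)
    return [[min(max(A[i][k], B[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def minmax_power(A, p):
    """p-th power of A under minmax_product, for p >= 1."""
    result = A
    for _ in range(p - 1):
        result = minmax_product(result, A)
    return result


def reciprocal_ultrametric(A):
    """Assumed form: symmetrize each pair with max first, then take min-max chains."""
    n = len(A)
    sym = [[max(A[i][j], A[j][i]) for j in range(n)] for i in range(n)]
    return minmax_power(sym, max(n - 1, 1))


def nonreciprocal_ultrametric(A):
    """Assumed form: take directed min-max chains first, then the max over both directions."""
    n = len(A)
    P = minmax_power(A, max(n - 1, 1))
    return [[max(P[i][j], P[j][i]) for j in range(n)] for i in range(n)]


# Two-node network: both methods merge at the maximum of the two dissimilarities,
# as the axiom of value requires.
two_nodes = [[0, 2], [3, 0]]

# Directed 3-cycle: cheap in one direction (1), expensive in the other (5).
cycle = [[0, 1, 5],
         [5, 0, 1],
         [1, 5, 0]]
```

On the directed cycle, the nonreciprocal sketch merges all nodes at resolution 1, since each direction can follow its own cheap chain, while the reciprocal sketch waits until resolution 5. This is consistent with the abstract's statement that the two methods act as lower and upper bounds.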
Metric Representations Of Networks
The goal of this thesis is to analyze networks by first projecting them onto structured metric-like spaces -- governed by a generalized triangle inequality -- and then leveraging this structure to facilitate the analysis. Networks encode relationships between pairs of nodes; however, the relationship between two nodes can be independent of the other ones and need not be defined for every pair. This is not true for metric spaces, where the triangle inequality imposes conditions that must be satisfied by triads of distances, and these distances must be defined for every pair of nodes. In general terms, this additional structure facilitates analysis and algorithm design in metric spaces. In deriving metric projections for networks, an axiomatic approach is pursued in which we encode intuitively desirable properties as axioms and then seek admissible projections satisfying these axioms. Although small variations are introduced throughout the thesis, the axioms of projection -- a network that already has the desired metric structure must remain unchanged -- and transformation -- when reducing dissimilarities in a network the projected distances cannot increase -- shape all of the axiomatic constructions considered. Notwithstanding their apparent weakness, the aforementioned axioms serve as a solid foundation for the theory of metric representations of networks.
We begin by focusing on hierarchical clustering of asymmetric networks, which can be framed as a network projection problem onto ultrametric spaces. We show that the set of admissible methods is infinite but bounded in a well-defined sense and state additional desirable properties to further winnow the admissibility landscape. Algorithms for the clustering methods developed are also derived and implemented. We then shift focus to projections onto generalized q-metric spaces, a parametric family containing among others the (regular) metric and ultrametric spaces. A uniqueness result is shown for the projection of symmetric networks whereas for asymmetric networks we prove that all admissible projections are contained between two extreme methods. Furthermore, projections are illustrated via their implementation for efficient search and data visualization. Lastly, our analysis is extended to encompass projections of dioid spaces, natural algebraic generalizations of weighted networks
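To make the generalized triangle inequality concrete, here is a hedged sketch of one way to project a symmetric network onto a q-metric space. It assumes that the projected distance between two nodes is the smallest q-norm of the edge weights over any path joining them. The abstract does not state this formula, so treat it as an illustration rather than the thesis's method. The function names are hypothetical, and the loop is a Floyd-Warshall-style relaxation.

```python
import math


def q_compose(a, b, q):
    """Combine two path costs under the q-norm; q = math.inf gives the max (ultrametric case)."""
    if math.isinf(q):
        return max(a, b)
    return (a ** q + b ** q) ** (1.0 / q)


def q_metric_projection(D, q):
    """Smallest q-norm path cost between every pair of nodes.

    D is a symmetric matrix of nonnegative dissimilarities with a zero diagonal.
    The result satisfies d[i][j]**q <= d[i][k]**q + d[k][j]**q for finite q,
    and d[i][j] <= max(d[i][k], d[k][j]) for q = math.inf.
    """
    n = len(D)
    d = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                candidate = q_compose(d[i][k], d[k][j], q)
                if candidate < d[i][j]:
                    d[i][j] = candidate
    return d


# Symmetric 3-node network whose direct 0-2 dissimilarity violates every triangle inequality.
D = [[0, 3, 10],
     [3, 0, 4],
     [10, 4, 0]]
```

With q = 1 this reduces to ordinary shortest paths (a regular metric), and with q = math.inf it reduces to the min-max chain cost (an ultrametric), which matches the abstract's remark that the q-metric family contains both.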
Monotone independence, comb graphs and Bose-Einstein condensation
The adjacency matrix of a comb graph is decomposed into a sum of monotone independent random variables with respect to the vacuum state. The vacuum spectral distribution is shown to be asymptotically the arcsine law as a consequence of the monotone central limit theorem. As an example the comb lattice is studied with explicit calculation
Incremental Non-Greedy Clustering at Scale
Clustering is the task of organizing data into meaningful groups. Modern clustering applications such as entity resolution place several demands on clustering algorithms: (1) scalability to massive numbers of points as well as clusters, (2) incremental additions of data, and (3) support for any user-specified similarity function.
Hierarchical clusterings are often desired as they represent multiple alternative flat clusterings (e.g., at different granularity levels). These tree-structured clusterings provide both fine-grained clusters and a representation of uncertainty in the presence of newly arriving data. Previous work on hierarchical clustering does not fully address all three of the aforementioned desiderata. Work on incremental hierarchical clustering often makes greedy, irrevocable clustering decisions that are regretted in the presence of future data. Work on scalable hierarchical clustering does not support incremental additions or deletions. These methods often impose requirements on the similarity functions used and/or empirically tend to over-merge clusters, which can lead to inaccurate clusterings.
In this thesis, we present incremental and scalable methods for hierarchical clustering to empirically satisfy the above desiderata. Our work aims to represent uncertainty and meaningful alternative clusterings, to efficiently reconsider past decisions in the incremental case, and to use parallelism to scale to massive datasets. Our method, Grinch, handles incrementally arriving data in a non-greedy fashion, by reconsidering past decisions using tree structure re-arrangements (e.g., rotations and grafts) invoked in accordance with the user’s specified similarity function. To achieve scalability to massive datasets, our method, SCC, builds a hierarchical clustering in a level-wise bottom-up manner. Certain clustering decisions are made independently in parallel within each level, and a global similarity threshold schedule prevents greedy over-merging. We show how SCC can be combined with the tree-structure re-arrangements in Grinch to form a mini-batch algorithm achieving both scalable and incremental performance. Lastly, we generalize our hierarchical clustering approaches to DAG-structured ones, which can better represent uncertainty in clustering by representing overlapping clusters. We introduce an efficient bottom-up method for DAG-structured clustering, Llama. For each of the proposed methods, we provide both a theoretical and an empirical analysis. Empirically, our methods achieve state-of-the-art results on clustering benchmarks in both the batch and the incremental settings, including multiple point improvements in dendrogram purity and scalability to billions of points
Formal Model-Driven Analysis of Resilience of GossipSub to Attacks from Misbehaving Peers
GossipSub is a new peer-to-peer communication protocol designed to counter
attacks from misbehaving peers by carefully controlling what information is
disseminated and to whom, via a score function computed by each peer that
captures positive and negative behaviors of its neighbors. The score function
depends on several parameters (weights, caps, thresholds, etc.) that can be
configured by applications using GossipSub. The specification for GossipSub is
written in English and its resilience to attacks from misbehaving peers is
supported empirically by emulation testing using an implementation in Golang.
In this work we take a foundational approach to understanding the resilience
of GossipSub to attacks from misbehaving peers. We build the first formal model
of GossipSub, using the ACL2s theorem prover. Our model is officially endorsed
by GossipSub developers. It can simulate GossipSub networks of arbitrary size
and topology, with arbitrarily configured peers, and can be used to prove and
disprove theorems about the protocol. We formalize fundamental security
properties stating that the score function is fair, penalizes bad behavior and
rewards good behavior. We prove that the score function is always fair, but can
be configured in ways that either penalize good behavior or ignore bad
behavior. Using our model, we run GossipSub with the specific configurations
for two popular real-world applications: the FileCoin and Eth2.0 blockchains.
We show that all properties hold for FileCoin. However, given any Eth2.0
network (of any topology and size) with any number of potentially misbehaving
peers, we can synthesize attacks where these peers are able to continuously
misbehave by never forwarding topic messages, while maintaining positive scores
so that they are never pruned from the network by GossipSub.
Comment: In revie
Algorithmic Modelling of Folded Surfaces. Analysis and Design of Folded Surfaces in Architecture and Manufacturing.
In both design and architecture, origami is often taken as a reference for its kinetic properties and its elegant appearance. Dynamic facades, fast-deployment structures, temporary shelters, portable furniture, and retractable roofs are some examples that can take advantage of the kinetic properties of origami. While designing with origami, the designer needs to control shape and motion at the same time, which increases the complexity of the design process. This complexity may lead designers to choose solutions in which the patterns are mere copies of well-known patterns, or to refer to origami only for ornamental purposes. The origami-inspired projects that we gathered and studied in the fields of architecture, manufacturing and fashion confirmed this trend. We observed that the cause of this lack of variety could also be attributed to insufficient knowledge or to the inefficiency of the design tools. Many researchers have studied the mathematical implications of origami in order to design specific patterns for precise applications. However, this theoretical knowledge is hard to apply directly to different practical projects without a deep understanding of these theorems. Thus, in this thesis, we aim to narrow the gap between the potential of this discipline and the limits of the available design tools by proposing a simplified synthetic constructive approach, applied with a parametric modeller, which allows designers to bypass scripting and algebraic formulations while increasing design freedom. Among the case studies, we propose some fabrication-oriented examples, which introduce the subjects of thick origami, the distribution of stresses and the analysis of deformations of the folded models.
Bers’ simultaneous uniformization and the intersection of Poincaré holonomy varieties
We consider the space of ordered pairs of distinct ℂP¹-structures on Riemann surfaces (of any orientations) which have identical holonomy, so that the quasi-Fuchsian space is identified with a connected component of this space. This space holomorphically maps to the product of the Teichmüller spaces minus its diagonal. In this paper, we prove that this mapping is a complete local branched covering map. As a corollary, we reprove Bers’ simultaneous uniformization theorem without any quasi-conformal deformation theory. Our main theorem is that the intersection of any two Poincaré holonomy varieties (SL₂ℂ-opers) is a non-empty discrete set, which is closely related to the mapping.
The version of record of this article, first published in Geometric and Functional Analysis, is available online at Publisher’s website: https://doi.org/10.1007/s00039-023-00653-