    On the number of types in sparse graphs

    We prove that for every class of graphs $\mathcal{C}$ which is nowhere dense, as defined by Nesetril and Ossona de Mendez, and for every first order formula $\phi(\bar x,\bar y)$, whenever one draws a graph $G\in \mathcal{C}$ and a subset of its nodes $A$, the number of subsets of $A^{|\bar y|}$ which are of the form $\{\bar v\in A^{|\bar y|}\, \colon\, G\models\phi(\bar u,\bar v)\}$ for some valuation $\bar u$ of $\bar x$ in $G$ is bounded by $\mathcal{O}(|A|^{|\bar x|+\epsilon})$, for every $\epsilon>0$. This provides optimal bounds on the VC-density of first-order definable set systems in nowhere dense graph classes. We also give two new proofs of upper bounds on quantities in nowhere dense classes which are relevant for their logical treatment. Firstly, we provide a new proof of the fact that nowhere dense classes are uniformly quasi-wide, implying explicit, polynomial upper bounds on the functions relating the two notions. Secondly, we give a new combinatorial proof of the result of Adler and Adler stating that every nowhere dense class of graphs is stable. In contrast to the previous proofs of the above results, our proofs are completely finitistic and constructive, and yield explicit and computable upper bounds on quantities related to uniform quasi-wideness (margins) and stability (ladder indices).

    Experimental analysis of the accessibility of drawings with few segments

    The visual complexity of a graph drawing is defined as the number of geometric objects needed to represent all its edges. In particular, one object may represent multiple edges, e.g., one needs only one line segment to draw two collinear incident edges. We study whether drawings with few segments have better aesthetic appeal and help the user to assess the underlying graph. We design an experiment that investigates two different graph types (trees and sparse graphs), three different layout algorithms for trees, and two different layout algorithms for sparse graphs. We asked the users to give an aesthetic ranking of the layouts and to perform a furthest-pair or shortest-path task on the drawings. Comment: Appears in the Proceedings of the 25th International Symposium on Graph Drawing and Network Visualization (GD 2017)

    Spectral and Dynamical Properties in Classes of Sparse Networks with Mesoscopic Inhomogeneities

    We study structure, eigenvalue spectra and diffusion dynamics in a wide class of networks with subgraphs (modules) at mesoscopic scale. The networks are grown within a model with three parameters controlling the number of modules, their internal structure as scale-free and correlated subgraphs, and the topology of the connecting network. Through an exhaustive spectral analysis of both the adjacency matrix and the normalized Laplacian matrix we identify the spectral properties which characterize the mesoscopic structure of sparse cyclic graphs and trees. The minimally connected nodes, clustering, and the average connectivity affect the central part of the spectrum. The number of distinct modules leads to an extra peak at the lower part of the Laplacian spectrum in cyclic graphs. Such a peak does not occur in the case of topologically distinct tree-subgraphs connected on a tree, although the associated eigenvectors remain localized on the subgraphs in both trees and cyclic graphs. We also find a characteristic pattern of periodic localization along the chains on the tree for the eigenvector components associated with the largest eigenvalue of the Laplacian, which equals 2. We corroborate the results with simulations of the random walk on several types of networks. Our results for the distribution of return times of the walk to the origin (autocorrelator) agree well with the recent analytical solution for trees, and appear to be independent of their mesoscopic and global structure. For the cyclic graphs we find new results with a stretching exponent of the tail of the distribution that is twice as large and virtually independent of the size of cycles. The modularity and clustering contribute to a power-law decay at short return times.

    The split-and-drift random graph, a null model for speciation

    We introduce a new random graph model motivated by biological questions relating to speciation. This random graph is defined as the stationary distribution of a Markov chain on the space of graphs on $\{1, \ldots, n\}$. The dynamics of this Markov chain is governed by two types of events: vertex duplication, where at constant rate a pair of vertices is sampled uniformly and one of these vertices loses its incident edges and is rewired to the other vertex and its neighbors; and edge removal, where each edge disappears at constant rate. Besides the number of vertices $n$, the model has a single parameter $r_n$. Using a coalescent approach, we obtain explicit formulas for the first moments of several graph invariants such as the number of edges or the number of complete subgraphs of order $k$. These are then used to identify five non-trivial regimes depending on the asymptotics of the parameter $r_n$. We derive an explicit expression for the degree distribution, and show that under appropriate rescaling it converges to classical distributions when the number of vertices goes to infinity. Finally, we give asymptotic bounds for the number of connected components, and show that in the sparse regime the number of edges is Poissonian. Comment: added Proposition 2.4 and formal proofs of Proposition 2.3 and 2.
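The two event types described in this abstract can be simulated with a standard Gillespie-style loop. The following is a minimal sketch, not the authors' code: the abstract does not state the exact rates, so the total duplication rate `dup_rate`, the per-edge removal rate `r`, the empty starting graph, and the names `apply_duplication` and `simulate` are all assumptions made for illustration.

```python
import random


def apply_duplication(adj, u, v):
    """Make v a copy of u: v drops its edges, then links to u and u's neighbours."""
    for w in adj[v]:
        adj[w].discard(v)
    # After the loop above, v no longer appears in adj[u].
    adj[v] = set(adj[u]) | {u}
    for w in adj[v]:
        adj[w].add(v)


def simulate(n, r, dup_rate, t_end, seed=0):
    """Run the chain on vertices 1..n (n >= 2, dup_rate > 0) until time t_end.

    Assumed rates (hypothetical parametrisation): duplications occur at total
    rate dup_rate, and each existing edge is removed at rate r.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(1, n + 1)}
    t = 0.0
    while True:
        edges = [(u, v) for u in adj for v in adj[u] if u < v]
        total = dup_rate + r * len(edges)
        t += rng.expovariate(total)  # exponential holding time
        if t > t_end:
            return adj
        if rng.random() * total < dup_rate:
            # Duplication: sample an ordered pair of distinct vertices.
            u, v = rng.sample(sorted(adj), 2)
            apply_duplication(adj, u, v)
        else:
            # Edge removal: every edge is equally likely to disappear.
            u, v = rng.choice(edges)
            adj[u].discard(v)
            adj[v].discard(u)
```

Running `simulate` for a long `t_end` only approximates a draw from the stationary distribution; how long is long enough depends on the rates and is not addressed by this sketch.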

    Binding and Normalization of Binary Sparse Distributed Representations by Context-Dependent Thinning

    Distributed representations were often criticized as inappropriate for encoding data with a complex structure. However, Plate's Holographic Reduced Representations and Kanerva's Binary Spatter Codes are recent schemes that allow on-the-fly encoding of nested compositional structures by real-valued or dense binary vectors of fixed dimensionality. In this paper we consider the Context-Dependent Thinning procedures which were developed for the representation of complex hierarchical items in the architecture of Associative-Projective Neural Networks. These procedures provide binding of items represented by sparse binary codevectors (with low probability of 1s). Such an encoding is biologically plausible and allows a high storage capacity of the distributed associative memory where the codevectors may be stored. In contrast to known binding procedures, Context-Dependent Thinning preserves the same low density (or sparseness) of the bound codevector for a varying number of component codevectors. Moreover, a bound codevector is not only similar to another one with similar component codevectors (as in other schemes), but it is also similar to the component codevectors themselves. This allows the similarity of structures to be estimated just by the overlap of their codevectors, without retrieval of the component codevectors. This also allows an easy retrieval of the component codevectors. Examples of algorithmic and neural-network implementations of the thinning procedures are considered. We also present representation examples for various types of nested structured data (propositions using role-filler and predicate-arguments representation schemes, trees, directed acyclic graphs) using sparse codevectors of fixed dimension. Such representations may provide a fruitful alternative to the symbolic representations of traditional AI, as well as to the localist and microfeature-based connectionist representations.
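The abstract does not spell out the thinning procedure itself, so the following is only an illustrative sketch of one possible thinning scheme, assumed here for concreteness: the component codevectors are superimposed (bitwise OR), and the result is then intersected with the union of a few permuted copies of itself. Codevectors are represented as sets of active indices, and the names `make_permutations` and `thin` are hypothetical.

```python
import random


def make_permutations(dim, count, seed=0):
    """Draw `count` fixed random permutations of range(dim)."""
    rng = random.Random(seed)
    perms = []
    for _ in range(count):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def thin(components, perms):
    """Bind sparse codevectors given as sets of active indices.

    z is the superposition (OR) of the components. The mask is the OR of
    permuted copies of z, so which 1s survive depends on every component.
    The result is a subset of z and therefore overlaps with the components.
    """
    z = set().union(*components)
    mask = set()
    for p in perms:
        mask.update(p[i] for i in z)
    return z & mask
```

In this sketch the number of permutations controls the density of the result: more permuted copies make the mask denser, so fewer 1s are removed from the superposition.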

    Computing Top-k Closeness Centrality Faster in Unweighted Graphs. (Technical Report)

    Centrality indices are widely used analytic measures for the importance of nodes in a network. Closeness centrality is very popular among these measures. For a single node v, it takes the sum of the distances of v to all other nodes into account. The currently best algorithms in practical applications for computing the closeness for all nodes exactly in unweighted graphs are based on breadth-first search (BFS) from every node. Thus, even for sparse graphs, these algorithms require quadratic running time in the worst case, which is prohibitive for large networks. In many relevant applications, however, it is unnecessary to compute closeness values for all nodes. Instead, one requires only the k nodes with the highest closeness values in descending order. Thus, we present a new algorithm for computing this top-k ranking in unweighted graphs. Following the rationale of previous work, our algorithm significantly reduces the number of traversed edges. It does so by computing upper bounds on the closeness and stopping the current BFS search when k nodes already have higher closeness than the bounds computed for the other nodes. In our experiments with real-world and synthetic instances of various types, one of these new bounds works well for small-world graphs with low diameter (such as social networks), while the other one excels for graphs with high diameter (such as road networks). Combining them yields an algorithm that is faster than the state of the art for top-k computations for all test instances, by a wide margin for high-diameter graphs.
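The general idea of stopping a BFS early once a node can no longer enter the top k can be sketched as follows. This is not the paper's algorithm and does not use its bounds: it is a simplified level-based cutoff, assuming a connected undirected graph with at least two nodes, k >= 1, and closeness normalised as (n - 1) divided by the sum of distances. The function name `top_k_closeness` is hypothetical.

```python
from collections import deque
import heapq


def top_k_closeness(adj, k):
    """Return the k nodes of highest closeness as (node, closeness) pairs.

    adj maps each node to an iterable of neighbours (connected, undirected).
    """
    n = len(adj)
    best = []  # heap of (-farness, order, node); best[0] is the worst kept node
    for order, s in enumerate(adj):
        threshold = -best[0][0] if len(best) == k else float("inf")
        dist = {s: 0}
        queue = deque([s])
        farness = 0  # sum of distances to the nodes discovered so far
        pruned = False
        while queue:
            u = queue.popleft()
            # Every undiscovered node lies at distance >= dist[u] + 1, which
            # gives a lower bound on farness (an upper bound on closeness).
            lower_bound = farness + (dist[u] + 1) * (n - len(dist))
            if lower_bound >= threshold:
                pruned = True
                break
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    farness += dist[w]
                    queue.append(w)
        if pruned:
            continue
        if len(best) < k:
            heapq.heappush(best, (-farness, order, s))
        else:
            heapq.heapreplace(best, (-farness, order, s))
    ranked = sorted(best, key=lambda e: (-e[0], e[1]))
    return [(node, (n - 1) / -neg_far) for neg_far, _, node in ranked]
```

Ties are resolved in favour of the node processed first, because a later node is pruned as soon as its bound reaches the current k-th best farness.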

    Chromatic and structural properties of sparse graph classes

    A graph is a mathematical structure consisting of a set of objects, which we call vertices, and links between pairs of objects, which we call edges. Graphs are used to model many problems arising in areas such as physics, sociology, and computer science. It is partially because of the simplicity of the definition of a graph that the concept can be so widely used. Nevertheless, when applied to a particular task, it is not always necessary to study graphs in all their generality, and it can be convenient to study them from a restricted point of view. Restrictions can come from requiring graphs to be embeddable in a particular surface, to admit certain types of decompositions, or from forbidding some substructure. A collection of graphs satisfying a fixed restriction forms a class of graphs. Many important classes of graphs have the property that their members cannot have many edges in comparison with the number of vertices. This is the case for classes with an upper bound on the maximum degree, and for classes excluding a fixed minor. Recently, the notion of classes with bounded expansion was introduced by Nešetřil and Ossona de Mendez [62], as a generalisation of many important types of sparse classes. In this thesis we study chromatic and structural properties of classes with bounded expansion. We say a graph is k-degenerate if each of its subgraphs has a vertex of degree at most k. The degeneracy is thus a measure of the density of a graph. This notion has been generalised with the introduction, by Kierstead and Yang [47], of the generalised colouring numbers. These parameters have found applications in many areas of Graph Theory, including a characterisation of classes with bounded expansion. One of the main results of this thesis is a series of upper bounds on the generalised colouring numbers, for different sparse classes of graphs, such as classes excluding a fixed complete minor, classes with bounded genus and classes with bounded tree-width. We also study the following problem: for a fixed positive integer p, how many colours do we need to colour a given graph in such a way that vertices at distance exactly p get different colours? When considering classes with bounded expansion, we improve dramatically on the previously known upper bounds for the number of colours needed. Finally, we introduce a notion of addition of graph classes, and show various cases in which sparse classes can be summed so as to obtain another sparse class.
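The notion of k-degeneracy defined in this abstract can be made concrete with a short computation: the smallest k for which a graph is k-degenerate is found by repeatedly deleting a vertex of minimum degree and recording the largest degree seen at deletion time. The sketch below is a simple quadratic-time illustration; the function name `degeneracy` and the adjacency-dictionary input format are assumptions.

```python
def degeneracy(adj):
    """Return the smallest k such that the graph is k-degenerate.

    adj maps each vertex to an iterable of its neighbours (undirected graph).
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    k = 0
    while remaining:
        # Delete a vertex of minimum degree in the current subgraph.
        v = min(remaining, key=lambda x: len(remaining[x]))
        k = max(k, len(remaining[v]))
        for w in remaining.pop(v):
            remaining[w].discard(v)
    return k
```

For example, every tree with at least one edge is 1-degenerate, every cycle is 2-degenerate, and the complete graph on four vertices has degeneracy 3.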

    A Monte Carlo Evaluation of Weighted Community Detection Algorithms

    The past decade has been marked by a proliferation of community detection algorithms that aim to organize nodes (e.g., individuals, brain regions, variables) into modular structures that indicate subgroups, clusters, or communities. Motivated by the emergence of big data across many fields of inquiry, these methodological developments have primarily focused on the detection of communities of nodes from matrices that are very large. However, it remains unknown whether the algorithms can reliably detect communities in smaller graph sizes (i.e., 1000 nodes and fewer) which are commonly used in brain research. More importantly, these algorithms have predominantly been tested only on binary or sparse count matrices and it remains unclear to what degree the algorithms can recover community structure for different types of matrices, such as the often-used cross-correlation matrices representing functional connectivity across predefined brain regions. Of the publicly available approaches for weighted graphs that can detect communities in graph sizes of at least 1000, prior research has demonstrated that Newman's spectral approach (i.e., Leading Eigenvalue), Walktrap, Fast Modularity, the Louvain method (i.e., multilevel community method), Label Propagation, and Infomap all recover communities exceptionally well in certain circumstances. The purpose of the present Monte Carlo simulation study is to test these methods across a large number of conditions, including varied graph sizes and types of matrix (sparse count, correlation, and reflected Euclidean distance), to identify which algorithm is optimal for specific types of data matrices. The results indicate that when the data are in the form of sparse count networks (such as those seen in diffusion tensor imaging), Label Propagation and Walktrap emerged as the most reliable methods for community detection. For dense, weighted networks such as correlation matrices capturing functional connectivity, Walktrap consistently outperformed the other approaches for recovering communities.