1,140 research outputs found

    Multitask Learning on Graph Neural Networks: Learning Multiple Graph Centrality Measures with a Unified Network

    The application of deep learning to symbolic domains remains an active research endeavour. Graph neural networks (GNN), consisting of trained neural modules which can be arranged in different topologies at run time, are sound alternatives to tackle relational problems which lend themselves to graph representations. In this paper, we show that GNNs are capable of multitask learning, which can be naturally enforced by training the model to refine a single set of multidimensional embeddings $\in \mathbb{R}^d$ and decode them into multiple outputs by connecting MLPs at the end of the pipeline. We demonstrate the multitask learning capability of the model in the relevant relational problem of estimating network centrality measures, focusing primarily on producing rankings based on these measures, i.e. is vertex $v_1$ more central than vertex $v_2$ given centrality $c$? We then show that a GNN can be trained to develop a lingua franca of vertex embeddings from which all relevant information about any of the trained centrality measures can be decoded. The proposed model achieves 89% accuracy on a test dataset of random instances with up to 128 vertices and is shown to generalise to larger problem sizes. The model is also shown to obtain reasonable accuracy on a dataset of real world instances with up to 4k vertices, vastly surpassing the sizes of the largest instances with which the model was trained ($n=128$). Finally, we believe that our contributions attest to the potential of GNNs in symbolic domains in general and in relational learning in particular. Comment: Published at ICANN2019. 10 pages, 3 Figures
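    The pairwise ranking task described in this abstract (is vertex $v_1$ more central than vertex $v_2$ under centrality $c$?) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names are hypothetical, the toy path graph is not from the paper, and degree centrality merely stands in for a model's predicted scores. It is not the authors' GNN or their evaluation code.

```python
from collections import deque
from itertools import permutations


def degree_centrality(adj):
    """Number of neighbours of each vertex."""
    return {v: len(neighbours) for v, neighbours in adj.items()}


def closeness_centrality(adj):
    """(n - 1) divided by the sum of BFS distances; assumes a connected graph."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result[source] = (len(adj) - 1) / sum(dist.values())
    return result


def pairwise_labels(scores):
    """For every ordered pair (v1, v2): is v1 strictly more central than v2?"""
    return {(a, b): scores[a] > scores[b] for a, b in permutations(scores, 2)}


def ranking_accuracy(true_scores, predicted_scores):
    """Fraction of ordered pairs on which the two score sets agree."""
    truth = pairwise_labels(true_scores)
    guess = pairwise_labels(predicted_scores)
    return sum(truth[pair] == guess[pair] for pair in truth) / len(truth)


# Hypothetical toy instance: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

# Degree centrality plays the role of "predicted" scores (an assumption);
# closeness centrality is treated as the ground truth.
accuracy = ranking_accuracy(closeness_centrality(path), degree_centrality(path))
print(f"pairwise ranking accuracy: {accuracy:.2f}")
```

    On this toy graph the two measures disagree only on whether the middle vertex outranks its two neighbours, so most ordered pairs are labelled identically.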

    Consistency and differences between centrality measures across distinct classes of networks

    The roles of different nodes within a network are often understood through centrality analysis, which aims to quantify the capacity of a node to influence, or be influenced by, other nodes via its connection topology. Many different centrality measures have been proposed, but the degree to which they offer unique information, and thus whether it is advantageous to use multiple centrality measures to define node roles, is unclear. Here we calculate correlations between 17 different centrality measures across 212 diverse real-world networks, examine how these correlations relate to variations in network density and global topology, and investigate whether nodes can be clustered into distinct classes according to their centrality profiles. We find that centrality measures are generally positively correlated to each other, the strength of these correlations varies across networks, and network modularity plays a key role in driving these cross-network variations. Data-driven clustering of nodes based on centrality profiles can distinguish different roles, including topological cores of highly central nodes and peripheries of less central nodes. Our findings illustrate how network topology shapes the pattern of correlations between centrality measures and demonstrate how a comparative approach to network centrality can inform the interpretation of nodal roles in complex networks. Comment: Main text (25 pages, 8 figures, 1 table), supplementary information (16 pages, 2 tables) and supplementary figures (17 figures)
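    As a rough illustration of correlating two centrality measures on one network, the sketch below computes a Spearman rank correlation between degree and closeness centrality. The choice of Spearman correlation, the two measures, the toy path graph, and all function names are assumptions made here; the abstract does not state which correlation coefficient or implementation the authors used.

```python
from collections import deque
from math import sqrt


def degree_values(adj, nodes):
    """Degree of each node, in the order given by `nodes`."""
    return [len(adj[v]) for v in nodes]


def closeness_values(adj, nodes):
    """(n - 1) / sum of BFS distances per node; assumes a connected graph."""
    values = []
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        values.append((len(adj) - 1) / sum(dist.values()))
    return values


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy network: the path graph 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes = sorted(path)
rho = spearman(degree_values(path, nodes), closeness_values(path, nodes))
print(f"Spearman correlation (degree vs closeness): {rho:.3f}")
```

    The correlation is strongly positive but below 1, because degree cannot distinguish the middle vertex from its neighbours while closeness can.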

    Uplifting edges in higher order networks: spectral centralities for non-uniform hypergraphs

    Spectral analysis of networks states that many structural properties of graphs, such as centrality of their nodes, are given in terms of their adjacency matrices. The natural extension of such spectral analysis to higher order networks is strongly limited by the fact that a given hypergraph could have several different adjacency hypermatrices, hence the results obtained so far are mainly restricted to the class of uniform hypergraphs, which leaves many real systems unattended. A new method for analysing non-linear eigenvector-like centrality measures of non-uniform hypergraphs is presented in this paper that could be useful for studying properties of $\mathcal{H}$-eigenvectors and $\mathcal{Z}$-eigenvectors in the non-uniform case. In order to do so, a new operation - the uplift - is introduced, incorporating auxiliary nodes in the hypergraph to allow for a uniform-like analysis. We later argue why this is a mathematically sound operation, and we furthermore use it to classify a whole family of hypergraphs with unique Perron-like $\mathcal{Z}$-eigenvectors. We supplement the theoretical analysis with several examples and numerical simulations on synthetic and real datasets. Comment: 28 pages, 6 figures

    Applications of Multidimensional Space of Mathematical Molecular Descriptors in Large-Scale Bioactivity and Toxicity Prediction- Applications to Prediction of Mutagenicity and Blood-Brain Barrier Entry of Chemicals

    In this chapter, we review our QSAR research in the prediction of toxicities, bioactivities and properties of chemicals using computed mathematical descriptors. Robust statistical methods have been used to develop high quality predictive quantitative structure-activity relationship (QSAR) models for the prediction of mutagenicity and BBB (blood-brain barrier) entry of two large and diverse sets of chemicals. This work is licensed under a Creative Commons Attribution 4.0 International License.

    Essays on the Network Analysis of Culture

    In economic relations, international agreements and institutional dialogue, the word distance is among the most frequently used. There are exogenous distances to be bridged in order to create ties; sometimes closures are necessary and at other times breaks are unavoidable. This may depend largely on the cultural status of groups of individuals, as well as on geographical and physical distances and implicit interests. The quantitative evaluation of the distance between two entities is a dyadic property and, as such, the presence, intensity, direction and sign of a tie between them is one way to capture it. Since entities can be individuals, objects, companies, countries, planets, or networks referring to specific contexts, and since similarity between them can be measured in various ways, a peculiar feature of distances is their changeable nature. While physical distances are almost objectively computable, in the case of culture (and of other more or less broad concepts) using one method rather than another could radically change the proximity relationship between entities, especially if they have a high degree of complexity. Cultural background plays an important role in determining the socio-economic status of a country and its characterization in terms of similarity to other countries. Chapter 1, using data from the WVS/EVS Joint 2017, operationalizes a definition of culture that takes into account the interdependencies between cultural traits at country level and calculates a new measure of cultural distance.
Taking advantage of a recent Bayesian algorithm for Gaussian copula graphical models, this Chapter estimates, for each of the 76 countries included in the WVS/EVS Joint 2017, the cultural network of interdependencies between cultural traits, considering different sets of traits: the 6 from the first battery of questions; the 10 of the Inglehart-Welzel Cultural Map; the 14 of the Inglehart-Welzel Cultural Map, where the “Post-materialism” and “Autonomy” indices are replaced by the variables from which they are derived; and 60 cultural traits, of which 14 are as previously defined, 6 come from the first battery of questions, and the remaining 40 are selected to balance the trade-off between the algorithm's processing time and the minimum number of missing values per country. After defining the distances between countries considering both cultural networks and distributions of cultural traits, this Chapter observes via DISTATIS how adding the network component to the classic distributional one substantially modifies the measure of cultural distance, both with few cultural traits (6, 10 and 14) and with many (60). Finally, it affirms that the network structure of national culture matters for the definition of cultural distance among the countries of the world, and it arrives at two final distance measures: Compromise_Large (from 60 variables) and Compromise_IW (from the Inglehart-Welzel cultural map variables). The effect of cultural variables on the economic situation of a country, or more generally of a geographically defined area, has been examined extensively in recent years by the economic literature. Cultural, genetic, geographical, climatic, semantic, ethnic, linguistic and political distances have often been included in econometric models as independent or control variables.
Chapter 2 follows this literature, first by individually comparing three measurements of cultural distance calculated in Chapter 1 with other distances used in the literature together with cultural distance or as a proxy for it, and then by comparing them jointly (the measurements of cultural distance and those from the literature) via DISTATIS. The three cultural distances are the two new measures mentioned above (Compromise_Large and Compromise_IW) and the IW index, obtained as the Euclidean distance between countries in the Inglehart-Welzel cultural map, while the other distances take into consideration climatic conditions, ethnicity and language, genetics, and the recent phenomenon of Facebook. Finally, this Chapter includes all the distance measures in a Social Relations Regression Model (SRRM) which estimates the distance between countries in GDP per capita (year 2017). The final result shows that cultural distances are poorly correlated with the distances from the literature, and when a compromise is found between them, Compromise_Large usually carries a slightly higher weight. The main conclusion concerns the substantial explanatory power of the Compromise_Large distance on the distance in GDP per capita compared to that of the IW index and of Compromise_IW, the latter of which sits in an intermediate position between the two. This confirms the importance of considering the national cultural network of interdependencies between cultural traits in the overall definition of cultural distance, and also that adding more cultural traits may influence its specification, although the cultural traits considered by Inglehart and Welzel in the construction of their cultural map already seem to capture a good part of the cultural information of the countries.
The enormous production of data in our time has allowed the observation of large collections of networks within a specific field of analysis, which may also differ in size from one another (for example, the trade network between countries for each product). A network is a complex object, so a common way to jointly analyze and compare a set of networks is to reduce their complexity by mapping them into a reduced space through the descriptors that characterize them. This is where the problem analyzed in Chapter 3 arises: which subset of descriptors keeps the characteristics of the networks as unchanged as possible in the mapping process, that is, projects non-isomorphic networks onto different points of the space, and places structurally similar networks close together and dissimilar networks far apart? Through a simulation of networks from four generative models (Random, Scale-free, Small-world and Stochastic block model) and the selection of a wide set of descriptors covering the micro, meso and macro levels of network analysis, this Chapter identifies a small subset of descriptors via Subgroup Discovery. This subset is composed of 5 descriptors: the first moment of the Local Clustering Coefficient, 3 motif configurations, and the Smallworldness descriptor. The effectiveness of the descriptors is evaluated by applying them to the set of binary cultural networks with 60 cultural traits estimated in Chapter 1 and comparing the distances between these network points in the descriptor space with popular network distances used in the literature. The main innovations are two: the construction of a new index of cultural distance among countries, which includes the cultural network of interdependencies among cultural traits; and the selection of a small, efficient subset of descriptors for mapping into a space sets of binary networks that may differ in size from one another.
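The idea of projecting networks of different sizes into a common descriptor space and comparing them there can be sketched as follows. This is only an illustration under stated assumptions: the mean local clustering coefficient corresponds to the first descriptor named in the abstract, but edge density is a stand-in chosen here for brevity (it is not among the five selected descriptors), and the function names and toy graphs are hypothetical.

```python
from itertools import combinations
from math import dist


def mean_local_clustering(adj):
    """First moment (mean) of the local clustering coefficient.

    Vertices with fewer than two neighbours contribute 0 (an assumption).
    """
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count edges that exist among the neighbours of this vertex.
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    edges = sum(len(neighbours) for neighbours in adj.values()) / 2
    return 2 * edges / (n * (n - 1))


def describe(adj):
    """Map a network to a point in a two-dimensional descriptor space."""
    return (mean_local_clustering(adj), density(adj))


# Hypothetical toy networks of different sizes (adjacency sets).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
complete4 = {v: {u for u in range(4) if u != v} for v in range(4)}

d_triangle_square = dist(describe(triangle), describe(square))
d_triangle_complete4 = dist(describe(triangle), describe(complete4))
print(d_triangle_square, d_triangle_complete4)
```

With only these two descriptors, the triangle and the complete graph on four vertices land on the same point even though they are not isomorphic, which is the kind of collision a well-chosen descriptor subset is meant to avoid.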