    A model for the generation of social network graphs

    In this paper we present and evaluate a social network model that exploits fundamental results from the social anthropology literature. Specifically, our model focuses on ego networks, i.e., the set of active social relationships of a given individual. The model is based on a function that correlates the level of emotional closeness of a social relationship with the time invested in it. The size of the social network is limited by the time budget a person invests in socializing. We exploit the model to define a constructive algorithm for generating synthetic social networks. Experimental results show that our model satisfies, on average, known properties of ego networks such as their size, composition and hierarchical structure.

    EvoCut: A new Generalization of Albert-Barabási Model for Evolution of Complex Networks

    With the evolution of social networks, the network structure becomes dynamic: nodes and edges appear and disappear for various reasons. The role of a node in the network is represented by the number of interactions it has with other nodes. For this purpose, a network is modeled as a graph in which nodes represent network members and edges represent relationships among them. Several models for the evolution of social networks have been proposed to date, the most widely accepted being the Barabási-Albert model [Network science], which is based on preferential attachment of nodes according to the degree distribution. This model generates graphs that are called scale-free, whose degree distribution follows a power law. Several generalizations of this model have also been proposed. In this paper we present a new generalization of the model and attempt to bring out its implications in real life.
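
    The preferential attachment rule mentioned in this abstract can be illustrated with a minimal sketch of the classic Barabási-Albert growth process. This is not the EvoCut generalization, whose details the abstract does not give. The function name `preferential_attachment_graph` and the parameters `n`, `m` and `seed` are chosen here for illustration only.

```python
import random
from collections import Counter


def preferential_attachment_graph(n, m, seed=None):
    """Grow a graph to n nodes using classic preferential attachment.

    Each new node attaches to m distinct existing nodes, chosen with
    probability proportional to their current degree. Returns a list of
    (new_node, existing_node) edges. Illustrative sketch only.
    """
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # The first new node connects to all m seed nodes.
    targets = list(range(m))
    # Each node appears in this list once per unit of degree, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    degree_pool = []
    for new_node in range(m, n):
        edges.extend((new_node, t) for t in targets)
        degree_pool.extend(targets)
        degree_pool.extend([new_node] * m)
        # Pick m distinct targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(degree_pool))
        targets = sorted(chosen)
    return edges


if __name__ == "__main__":
    edges = preferential_attachment_graph(n=1000, m=2, seed=0)
    degree = Counter(node for edge in edges for node in edge)
    # A few hubs with very high degree are expected in a scale-free graph.
    print(degree.most_common(5))
```

    Plotting the degree histogram of the result on log-log axes is the usual way to check for the heavy tail that the abstract refers to as the power law.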

    GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?

    Large-scale graphs with node attributes are fundamental in real-world scenarios, such as social and financial networks. The generation of synthetic graphs that emulate real-world ones is pivotal in graph machine learning, aiding the understanding of network evolution and preserving data utility when the original data cannot be shared. Traditional models for graph generation suffer from limited model capacity. Recent developments in diffusion models have shown promise only in generating graph structure alone or small molecular graphs with attributes. However, their applicability to large attributed graphs remains unaddressed due to challenges in capturing intricate patterns and in scalability. This paper introduces GraphMaker, a novel diffusion model tailored for generating large attributed graphs. We study diffusion models that either couple or decouple graph structure and node attribute generation to address their complex correlation. We also employ node-level conditioning and adopt a minibatch strategy for scalability. We further propose a new evaluation pipeline that uses models trained on generated synthetic graphs and tested on original graphs to evaluate the quality of synthetic data. Empirical evaluations on real-world datasets showcase GraphMaker's superiority in generating realistic and diverse large attributed graphs that are beneficial for downstream tasks. Comment: Code available at https://github.com/Graph-COM/GraphMake

    Outlier Edge Detection Using Random Graph Generation Models and Applications

    Outliers are samples that are generated by different mechanisms from other normal data samples. Graphs, in particular social network graphs, may contain nodes and edges that are created by scammers, by malicious programs or mistakenly by normal users. Detecting outlier nodes and edges is important for data mining and graph analytics. However, previous research in the field has focused only on detecting outlier nodes. In this article, we study the properties of edges and propose outlier edge detection algorithms using two random graph generation models. We found that the edge-ego-network, which can be defined as the induced graph that contains the two end nodes of an edge, their neighboring nodes and the edges that link these nodes, contains critical information for detecting outlier edges. We evaluated the proposed algorithms by injecting outlier edges into real-world graph data. Experimental results show that the proposed algorithms can effectively detect outlier edges. In particular, the algorithm based on the Preferential Attachment Random Graph Generation model consistently gives good performance regardless of the test graph data. Furthermore, the proposed algorithms are not limited to the area of outlier edge detection. We demonstrate three different applications that benefit from the proposed algorithms: 1) a preprocessing tool that improves the performance of graph clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel noisy data clustering algorithm. These applications show the great potential of the proposed outlier edge detection techniques. Comment: 14 pages, 5 figures, journal pape
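
    The edge-ego-network defined in this abstract can be sketched in a few lines. The example below only extracts the induced subgraph. It does not implement the paper's outlier scoring, which the abstract does not describe. The adjacency-dictionary representation and the name `edge_ego_network` are assumptions made for illustration.

```python
def edge_ego_network(adj, u, v):
    """Return (nodes, edges) of the edge-ego-network of the edge (u, v).

    adj maps each node to the set of its neighbours in an undirected graph.
    The result is the subgraph induced by u, v and all of their neighbours.
    Edges are returned as frozensets so that (a, b) and (b, a) coincide.
    """
    if v not in adj[u]:
        raise ValueError("(u, v) is not an edge of the graph")
    nodes = {u, v} | adj[u] | adj[v]
    edges = {
        frozenset((a, b))
        for a in nodes
        for b in adj[a]
        if b in nodes
    }
    return nodes, edges


if __name__ == "__main__":
    # Small undirected example graph: a triangle 1-2-3 with a tail 2-4-5.
    adj = {
        1: {2, 3},
        2: {1, 3, 4},
        3: {1, 2},
        4: {2, 5},
        5: {4},
    }
    nodes, edges = edge_ego_network(adj, 1, 2)
    print(sorted(nodes))
    print(sorted(tuple(sorted(e)) for e in edges))
```

    In this example node 5 is left out because it is adjacent to neither endpoint of the edge (1, 2).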

    Generative Graph Convolutional Network for Growing Graphs

    Modeling the generative process of growing graphs has wide applications in social networks and recommendation systems, where the cold-start problem leaves new nodes isolated from the existing graph. Despite the emerging literature on graph representation learning and graph generation, most existing methods cannot handle isolated new nodes without nontrivial modifications. The challenge arises because learning to generate representations for nodes in the observed graph relies heavily on topological features, whereas for new nodes only node attributes are available. Here we propose a unified generative graph convolutional network that learns node representations for all nodes adaptively in a generative model framework, by sampling graph generation sequences constructed from observed graph data. We optimize a variational lower bound that consists of a graph reconstruction term and an adaptive Kullback-Leibler divergence regularization term. We demonstrate the superior performance of our approach on several benchmark citation network datasets.
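
    For orientation, a variational lower bound with a reconstruction term and a Kullback-Leibler regularizer generally takes the form below. This is the generic graph auto-encoder form, not the paper's exact objective. The symbols are assumptions: A is the adjacency matrix, X the node attributes, Z the latent node representations, and β a placeholder weight standing in for the adaptive regularization, whose definition the abstract does not give.

```latex
\mathcal{L}
  = \mathbb{E}_{q(Z \mid X, A)}\bigl[\log p(A \mid Z)\bigr]
  - \beta \, \mathrm{KL}\bigl(q(Z \mid X, A) \,\|\, p(Z)\bigr)
```

    The first term rewards latent representations from which the observed edges can be reconstructed. The second keeps the approximate posterior close to the prior p(Z).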

    FairGen: Towards Fair Graph Generation

    There have been tremendous efforts over the past decades dedicated to the generation of realistic graphs in a variety of domains, ranging from social networks to computer networks, and from gene regulatory networks to online transaction networks. Despite the remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which results in a representation disparity issue in the generated graphs, i.e., the protected groups (often minorities) contribute less to the objective and thus suffer from systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and a user-preferred parity constraint. In particular, we start by investigating representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named FairGen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to be capable of fairly capturing the contextual information of each group with high probability. Experimental results on seven real-world data sets, including web-based graphs, demonstrate that FairGen (1) obtains performance on par with state-of-the-art graph generative models across six network properties, (2) mitigates the representation disparity issues in the generated graphs, and (3) substantially boosts model performance by up to 17% in downstream tasks via data augmentation.

    Forecasting the Missing Links in Heterogeneous Social Networks

    Social network analysis has gained attention from several researchers in recent years because of its wide application in capturing social interactions. One of the aims of social network analysis is to recover missing links between users, links which may exist in the future but have not yet appeared due to incomplete data. The prediction of hidden or missing links in criminal networks is also a significant problem. The data collected from these networks is often incomplete and inconsistent, which is reflected in the structure in the form of missing nodes and links. Many supervised machine learning algorithms have been applied to this detection task. However, supervised machine learning algorithms require large datasets to train a link prediction model that achieves optimum results. In this research, we have used a Facebook dataset to solve the problem of link prediction in a network. The two machine learning classifiers applied are logistic regression (LR) and k-nearest neighbours (KNN), where KNN achieves higher accuracy than LR. In this article, we have proposed an algorithm, Graph Sample Aggregator with Low Reciprocity (GraphSALR), for generating node embeddings in larger graphs, which uses node feature information.