253 research outputs found

    Fairness-aware predictive graph learning in social networks

    Get PDF
    Predictive graph learning approaches bring significant advantages to many real-world applications, such as social networks, recommender systems, and other social-related downstream tasks. For these applications, learning models should produce accurate predictions to maximize usability. However, current graph learning methods generally neglect differences in link strength, which leads to discriminative predictive results and uneven performance across tasks. A fairness-aware predictive learning model is therefore needed to balance these link strength differences, which raises the question of how to formulate such a model. To address this problem, we first formally define two biases (i.e., Preference and Favoritism) that widely exist in previous representation learning models. Then, we employ modularity maximization to distinguish strong and weak links from a quantitative perspective. Finally, we propose a novel predictive learning framework, entitled ACE, that first implements a link-strength-differentiated learning process and then integrates it with a dual propagation process. The effectiveness and fairness of ACE have been verified on four real-world social networks. Compared to nine state-of-the-art methods, ACE and its variants show better performance. The ACE framework also reconstructs networks more faithfully, offering a promising route to resolving misinformation in graph-structured data.
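    As a hedged illustration of the preprocessing step this abstract names, the sketch below uses modularity maximization to separate strong (intra-community) from weak (inter-community) links. It is not the authors' ACE implementation; the toy graph and networkx's greedy modularity heuristic are stand-in assumptions.

```python
# A minimal sketch: label links as strong or weak via modularity maximization.
# Assumptions: networkx's greedy heuristic as the optimizer, karate club graph
# as a stand-in social network (the ACE paper may differ on both counts).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()

# Partition the graph by (approximately) maximizing modularity.
communities = greedy_modularity_communities(G)
node_to_comm = {v: i for i, comm in enumerate(communities) for v in comm}

# Edges inside a community are treated as strong, edges across as weak.
strong = [(u, v) for u, v in G.edges() if node_to_comm[u] == node_to_comm[v]]
weak = [(u, v) for u, v in G.edges() if node_to_comm[u] != node_to_comm[v]]

print(f"{len(strong)} strong links, {len(weak)} weak links")
```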

    Location Analytics for Location-Based Social Networks

    Get PDF

    Exploration of a large database of French notarial acts with social network methods

    Get PDF
    This article illustrates how mathematical and statistical tools designed to handle relational data can help decipher the most important features and defects of a large historical database and extract knowledge from a corpus of several thousand documents. Such a relational model is general enough to address a wide variety of problems, including most databases built from relational tables. In mathematics, it is referred to as a 'network' or a 'graph'. The article's purpose is to emphasize how a relevant relational model of a historical corpus can serve as a theoretical framework that makes automatic data mining methods designed for graphs available. With such methods, for one thing, consistency checking can be performed to automatically flag possible transcription or interpretation errors introduced during transcription. Moreover, when a database is so large that even exhaustive manual exploration yields little knowledge, relational data mining can help elucidate its main features. First, the macroscopic structure of the relations between entities can be highlighted with network summaries automatically produced by classification methods. A complementary point of view is obtained via local summaries of the relational structure: a set of network-related indicators can be calculated for each entity, singling out, for instance, highly connected entities. Finally, visualisation methods dedicated to graphs can give the user an intuitive understanding of the database. Additional information can be superimposed on such network visualisations, making it possible to intuitively relate the connections between entities to the attributes that describe each entity. This overall approach is illustrated with a large corpus of medieval notarial acts, containing several thousand transactions and involving a comparable number of persons.
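    A minimal sketch of the kind of relational exploration the article describes: a corpus reduced to (person, transaction) pairs is modeled as a bipartite graph, and local network indicators single out highly connected entities. The rows and column meanings below are hypothetical, not taken from the actual database.

```python
# A minimal sketch, assuming the corpus has been reduced to a table of
# hypothetical (person_id, transaction_id) pairs extracted from notarial acts.
import networkx as nx

rows = [("p1", "t1"), ("p2", "t1"), ("p2", "t2"), ("p3", "t2"), ("p3", "t3")]

# Bipartite graph linking persons to the transactions they appear in.
G = nx.Graph()
for person, transaction in rows:
    G.add_node(person, kind="person")
    G.add_node(transaction, kind="transaction")
    G.add_edge(person, transaction)

# Local summaries: degree and betweenness single out central entities.
degree = dict(G.degree())
betweenness = nx.betweenness_centrality(G)
persons = [n for n, d in G.nodes(data=True) if d["kind"] == "person"]
top = sorted(persons, key=lambda p: (degree[p], betweenness[p]), reverse=True)
print("Most connected persons:", top[:3])
```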

    Attractability and Virality: The Role of Message Features and Social Influence in Health News Diffusion

    Get PDF
    What makes health news articles attractable and viral? Why do some articles diffuse widely by prompting audience selections (attractability) and subsequent social retransmissions (virality), while others do not? Identifying what drives social epidemics of health news coverage is crucial to our understanding of its impact on the public, especially in the emerging media environment where news consumption has become increasingly selective and social. This dissertation examines how message features and social influence affect the volume and persistence of attractability and virality within the context of the online diffusion of New York Times (NYT) health news articles. The dissertation analyzes (1) behavioral data of audience selections and retransmissions of the NYT articles and (2) associated article content and context data that are collected using computational social science approaches (automated data mining; computer-assisted content analysis) along with more traditional methods (manual content analysis; message evaluation survey). Analyses of message effects on the total volume of attractability and virality show that articles with high informational utility and positive sentiment invite more frequent selections and retransmissions, and that articles are also more attractable when presenting controversial, emotionally evocative, and familiar content. Furthermore, these analyses reveal that informational utility and novelty have stronger positive associations with email-specific virality, while emotion-related message features, content familiarity, and exemplification play a larger role in triggering social media-based retransmissions. Temporal dynamics analyses demonstrate social influence-driven cumulative advantage effects, such that articles that stay on popular-news lists longer invite more frequent subsequent selections and retransmissions. These analyses further show that the social influence effects are stronger for articles containing message features found to enhance the total volume of attractability and virality, suggesting that such synergistic interactions might underlie the observed message effects on total selections and retransmissions. Exploratory analyses reveal that the effects of social influence and message features tend to be similar for both (1) the volume of audience news selections and retransmissions and (2) the persistence of those behaviors. However, some message features, such as expressed emotionality, are relatively unique predictors of persistence outcomes. Results are discussed in light of their implications for communication research and practice.
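    The message-effects analyses described here amount to regressing selection and retransmission counts on coded message features. A minimal sketch of that style of analysis, with entirely hypothetical data and feature names (the dissertation's actual models, features, and coding scheme are its own), might look like this:

```python
# A minimal sketch: a count regression of retransmissions on message features.
# The data and feature names are hypothetical. A Poisson model is used for
# simplicity; real share counts are often overdispersed, in which case a
# negative binomial fit would be preferred.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "retransmissions":       [120, 15, 300, 42, 8, 95, 60, 11],
    "informational_utility": [0.8, 0.2, 0.9, 0.5, 0.1, 0.6, 0.7, 0.3],
    "positive_sentiment":    [0.6, 0.3, 0.8, 0.4, 0.2, 0.7, 0.5, 0.1],
    "emotionality":          [0.5, 0.1, 0.9, 0.3, 0.2, 0.4, 0.6, 0.2],
})

model = smf.poisson(
    "retransmissions ~ informational_utility + positive_sentiment + emotionality",
    data=df,
).fit()
print(model.summary())
```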

    A Neighborhood-preserving Graph Summarization

    Full text link
    We introduce in this paper a new summarization method for large graphs. Our approach retains only a user-specified proportion of the neighbors of each node in the graph. Our main aim is to simplify large graphs so that they can be analyzed and processed effectively while preserving as many of the node neighborhood properties as possible. Since many graph algorithms rely on the neighborhood information available for each node, the idea is to produce a smaller graph that allows these algorithms to handle large graphs and run faster while providing good approximations. Moreover, our compression allows users to control the size of the compressed graph by adjusting the amount of information loss that can be tolerated. Experiments conducted on various real and synthetic graphs show that our compression considerably reduces the size of the graphs. We also ran several experiments on the obtained summaries using various graph algorithms and applications, such as node embedding, graph classification, and shortest path approximation. The results show interesting trade-offs between algorithm runtime speed-up and precision loss.
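    A minimal sketch of the core idea, under one simplifying assumption: each node keeps a user-specified proportion of its neighbors, here chosen as its highest-degree ones. The paper's actual selection criterion and guarantees may differ.

```python
# A minimal sketch of neighborhood-proportional graph summarization.
# Assumption: neighbors are ranked by degree; the paper's rule may differ.
import math
import networkx as nx

def summarize(G: nx.Graph, keep: float) -> nx.Graph:
    """Return a subgraph retaining roughly `keep` of each node's neighbors."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes(data=True))
    for v in G.nodes():
        neighbors = sorted(G.neighbors(v), key=G.degree, reverse=True)
        if not neighbors:
            continue
        k = max(1, math.ceil(keep * len(neighbors)))
        for u in neighbors[:k]:
            # Edges are added from both endpoints, so the retained edge count
            # can exceed `keep` times the original; this is a rough sketch.
            H.add_edge(v, u)
    return H

G = nx.erdos_renyi_graph(200, 0.1, seed=42)
H = summarize(G, keep=0.3)
print(f"edges: {G.number_of_edges()} -> {H.number_of_edges()}")
```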

    Efficient Identification of TOP-K Heavy Hitters over Sliding Windows

    Get PDF
    This is the author accepted manuscript; the final version is available from Springer Verlag via the DOI in this record. Due to the increasing volume of network traffic and the growing complexity of network environments, rapid identification of heavy hitters is quite challenging. To deal with massive data streams in real time, an accurate and scalable solution is required. The traditional method of keeping an individual counter for each host in the whole data stream is very resource-consuming. This paper presents a new data structure called FCM and its associated algorithms. FCM combines the count-min sketch with the stream-summary structure for efficient TOP-K heavy hitter identification in one pass. The key point of this algorithm is that it introduces a novel filter-and-jump mechanism. Given that Internet traffic is heavy-tailed and hosts of low frequencies account for the majority of IP addresses, FCM periodically filters the mice from input streams to improve the accuracy of TOP-K heavy hitter identification. On the other hand, considering that abnormal events are always time sensitive, our algorithm automatically adjusts its measurement window to the newly arrived elements in the data streams. Our experimental results demonstrate that the performance of FCM is superior to previous related algorithms. Additionally, this solution has good prospects for application in advanced network environments. Funding: Chinese Academy of Sciences; National Natural Science Foundation of China.
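    A minimal sketch of the two ingredients FCM combines, a count-min sketch for approximate frequencies plus a top-k summary; the filter-and-jump mechanism and the sliding-window adjustment are omitted, and the hashing scheme is an assumption.

```python
# A minimal sketch of count-min sketch + top-k summary (not the FCM code).
# For simplicity the candidate set keeps every seen item; a real
# stream-summary structure would bound its memory.
import heapq
import random

class CountMinSketch:
    def __init__(self, width=1024, depth=4, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.salts = [rng.getrandbits(32) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def add(self, item):
        for row, salt in zip(self.table, self.salts):
            row[hash((salt, item)) % self.width] += 1

    def estimate(self, item):
        # Minimum across rows limits hash-collision overestimation.
        return min(row[hash((salt, item)) % self.width]
                   for row, salt in zip(self.table, self.salts))

def top_k(stream, k=3):
    cms, candidates = CountMinSketch(), set()
    for item in stream:
        cms.add(item)
        candidates.add(item)
    return heapq.nlargest(k, candidates, key=cms.estimate)

stream = ["10.0.0.1"] * 50 + ["10.0.0.2"] * 30 + ["10.0.0.3"] * 5
random.shuffle(stream)
print(top_k(stream))
```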

CenGCN: Centralized Convolutional Networks with Vertex Imbalance for Scale-Free Graphs

    Get PDF
    Graph Convolutional Networks (GCNs) have achieved impressive performance in a wide variety of areas, attracting considerable attention. The core step of GCNs is the information-passing framework that treats all information from neighbors of the central vertex as equally important. Such equal importance, however, is inadequate for scale-free networks, where hub vertices propagate more dominant information due to vertex imbalance. In this paper, we propose a novel centrality-based framework named CenGCN to address this inequality of information. The framework first quantifies the similarity between hub vertices and their neighbors by label propagation with hub vertices. Based on this similarity and on centrality indices, it transforms the graph by increasing or decreasing the weights of edges connecting hub vertices and by adding self-connections to vertices. In each non-output layer of the GCN, the framework uses a hub attention mechanism to assign new weights to connected non-hub vertices based on their common information with hub vertices. We present two variants, CenGCN_D and CenGCN_E, based on degree centrality and eigenvector centrality, respectively. Comprehensive experiments, including vertex classification, link prediction, vertex clustering, and network visualization, demonstrate that the two variants significantly outperform state-of-the-art baselines.
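    A hedged sketch of the reweighting idea, not the CenGCN code: edges incident to hub vertices are rescaled by a centrality ratio before the standard normalized GCN adjacency is formed. Hub detection via a degree-centrality percentile and the exact weighting rule are simplified assumptions.

```python
# A minimal sketch of centrality-based edge reweighting for a GCN adjacency.
# Assumptions: hubs = top 10% by degree centrality; edges touching a hub are
# scaled by the ratio of endpoint centralities (the paper's rule may differ).
import networkx as nx
import numpy as np

G = nx.barabasi_albert_graph(100, 2, seed=0)  # scale-free test graph
centrality = nx.degree_centrality(G)
hub_cut = np.percentile(list(centrality.values()), 90)

A = nx.to_numpy_array(G)
nodes = list(G.nodes())
for i, u in enumerate(nodes):
    for j, v in enumerate(nodes):
        if A[i, j] and (centrality[u] > hub_cut or centrality[v] > hub_cut):
            # Down-weight hub edges so hubs do not dominate message passing.
            A[i, j] *= min(centrality[u], centrality[v]) / max(centrality[u], centrality[v])

# Standard GCN normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}.
A_hat = A + np.eye(len(nodes))
d = A_hat.sum(axis=1)
A_norm = A_hat / np.sqrt(np.outer(d, d))
print(A_norm.shape)
```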