378 research outputs found

    Exploring the Evolution of Node Neighborhoods in Dynamic Networks

    Dynamic networks are a popular way of modeling and studying the behavior of evolving systems. However, their analysis constitutes a relatively recent subfield of Network Science, and the number of available tools is consequently much smaller than for static networks. In this work, we propose a method specifically designed to take advantage of the longitudinal nature of dynamic networks. It characterizes each individual node by studying the evolution of its direct neighborhood, based on the assumption that the way this neighborhood changes reflects the role and position of the node in the whole network. For this purpose, we define the concept of a neighborhood event, which corresponds to the various transformations such groups of nodes can undergo, and describe an algorithm for detecting such events. We demonstrate the value of our method on three real-world networks: DBLP, LastFM and Enron. We apply frequent pattern mining to extract meaningful information from temporal sequences of neighborhood events. This results in the identification of behavioral trends emerging in the whole network, as well as the individual characterization of specific nodes. We also perform a cluster analysis, which reveals that, in all three networks, one can distinguish two types of nodes exhibiting different behaviors: a very small group of active nodes, whose neighborhoods undergo diverse and frequent events, and a very large group of stable nodes.

    Geographical trends in research: a preliminary analysis on authors' affiliations

    In the last decade, research literature reached an enormous volume with an unprecedented current annual increase of 1.5 million new publications. As research gets ever more global and new countries and institutions, either from academia or the corporate environment, start to contribute their share, it is important to monitor this complex scenario and understand its dynamics. We present a study on a conference proceedings dataset extracted from Springer Nature Scigraph that illustrates insightful geographical trends and highlights the unbalanced growth of competitive research institutions worldwide. Results from our micro and macro analyses show that the distributions of institutions and papers among countries follow a power law, and thus very few countries keep producing most of the papers accepted by high-tier conferences. In addition, we found that the annual and overall turnover rate of the top 5, 10 and 25 countries is extremely low, suggesting a very static landscape in which new entries struggle to emerge. Finally, we highlight the presence of an increasing gap between the number of institutions initiating and overseeing research endeavours (i.e. first and last authors' affiliations) and the total number of institutions participating in research. As a consequence of our analysis, the paper also discusses our experience in working with affiliations: a seemingly simple matter at first glance that turns out to be a complex research and technical challenge, still far from being solved.

    Harnessing Historical Corrections to build Test Collections for Named Entity Disambiguation

    Matching mentions of persons to the actual persons (the name disambiguation problem) is central for several digital library applications. Scientists have been working on algorithms to create this matching for decades without finding a universal solution. One problem is that test collections for this problem are often small and specific to a certain collection. In this work, we present an approach that can create large test collections from historical metadata with minimal extra cost. We apply this approach to the DBLP collection to generate two freely available test collections. One collection focuses on the properties of defects and one on the evaluation of disambiguation algorithms. Comment: Preprint of a paper accepted at TPDL 201

    Rank-aware, Approximate Query Processing on the Semantic Web

    Search over the Semantic Web corpus frequently leads to queries with large result sets. To discover relevant data elements, users must therefore rely on ranking techniques that sort results according to their relevance. At the same time, applications often deal with information needs that do not require complete and exact results. In this thesis, we address the problem of how to process queries over Web data in an approximate and rank-aware fashion.

    Unveiling the Sentinels: Assessing AI Performance in Cybersecurity Peer Review

    Peer review is the method employed by the scientific community for evaluating research advancements. In the field of cybersecurity, the practice of double-blind peer review is the de facto standard. This paper touches on the holy grail of peer reviewing and aims to shed light on the performance of AI in reviewing for academic security conferences. Specifically, we investigate the predictability of reviewing outcomes by comparing the results obtained from human reviewers and machine-learning models. To facilitate our study, we construct a comprehensive dataset by collecting thousands of papers from renowned computer science conferences and the arXiv preprint website. Based on the collected data, we evaluate the prediction capabilities of ChatGPT and a two-stage classification approach based on the Doc2Vec model with various classifiers. Our experimental evaluation shows that, for review outcome prediction, the Doc2Vec-based approach performs significantly better than ChatGPT and achieves an accuracy of over 90%. While analyzing the experimental results, we identify the potential advantages and limitations of the tested ML models. We explore areas within the paper-reviewing process that can benefit from automated support approaches, while also recognizing the irreplaceable role of human intellect in certain aspects that cannot be matched by state-of-the-art AI techniques.

    Publication Culture in Computing Research

    The dissemination of research results is an integral part of research and hence a crucial component for any scientific discipline. In the area of computing research, concerns have recently been raised about its publication culture, most notably by highlighting the high priority of conferences (compared to journals in other disciplines) and, from an economic viewpoint, the costs of preparing and accessing research results. The Dagstuhl Perspectives Workshop 12452 “Publication Culture in Computing Research” aimed at discussing the main problems with a selected group of researchers and practitioners. The goal was to identify and classify the current problems and to suggest potential remedies. The group of participants was selected such that a wide spectrum of opinions would be presented. This led to intensive discussions. The workshop is seen as an important step in the ongoing discussion. As a main result, the root causes of the main problems were identified and potential solutions were discussed. The insights will be part of an upcoming manifesto on Publication Culture in Computing Research.

    Postmortem Analysis of Decayed Online Social Communities: Cascade Pattern Analysis and Prediction

    Recently, many online social networks, such as MySpace, Orkut, and Friendster, have faced inactivity decay of their members, which contributed to the collapse of these networks. The reasons, mechanics, and prevention mechanisms of such inactivity decay are not fully understood. In this work, we analyze decayed and alive sub-websites from the StackExchange platform. The analysis mainly focuses on the inactivity cascades that occur among the members of these communities. We provide measures to understand the decay process and statistical analysis to extract the patterns that accompany the inactivity decay. Additionally, we predict cascade size and cascade virality using machine learning. The results of this work include a statistically significant difference in the decay patterns between the decayed and the alive sub-websites. These patterns are mainly cascade size, cascade virality, cascade duration, and cascade similarity. Additionally, the contributed prediction framework showed satisfactory prediction results compared to a baseline predictor. Supported by empirical evidence, the main findings of this work are: (1) the decay process is not governed by only one network measure; it is better described using multiple measures; (2) the expert members of the StackExchange sub-websites were mainly responsible for the activity or inactivity of the StackExchange sub-websites; (3) the Statistics sub-website is going through decay dynamics that may lead to it becoming fully decayed; and (4) decayed sub-websites were originally less resilient to inactivity decay, unlike the alive sub-websites.