Exploring the Evolution of Node Neighborhoods in Dynamic Networks
Dynamic Networks are a popular way of modeling and studying the behavior of
evolving systems. However, their analysis constitutes a relatively recent
subfield of Network Science, and the number of available tools is consequently
much smaller than for static networks. In this work, we propose a method
specifically designed to take advantage of the longitudinal nature of dynamic
networks. It characterizes each individual node by studying the evolution of
its direct neighborhood, based on the assumption that the way this neighborhood
changes reflects the role and position of the node in the whole network. For
this purpose, we define the concept of "neighborhood event", which
corresponds to the various transformations such groups of nodes can undergo,
and describe an algorithm for detecting such events. We demonstrate the
interest of our method on three real-world networks: DBLP, LastFM and Enron. We
apply frequent pattern mining to extract meaningful information from temporal
sequences of neighborhood events. This results in the identification of
behavioral trends emerging in the whole network, as well as the individual
characterization of specific nodes. We also perform a cluster analysis, which
reveals that, in all three networks, one can distinguish two types of nodes
exhibiting different behaviors: a very small group of active nodes, whose
neighborhoods undergo diverse and frequent events, and a very large group of
stable nodes.
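To make the idea of tracking a node's direct neighborhood over time concrete, here is a minimal sketch in Python. It is not the event-detection algorithm from the paper, whose event definitions are not given in this abstract. It only compares one node's neighbor set across consecutive snapshots and reports what was gained, what was lost, and the Jaccard similarity. The function name `neighborhood_changes`, the snapshot representation, and the toy data are assumptions made for illustration.

```python
from typing import Dict, List, Set

# Assumed representation: one snapshot maps each node to its set of direct neighbors.
Snapshot = Dict[str, Set[str]]


def neighborhood_changes(snapshots: List[Snapshot], node: str) -> List[dict]:
    """Compare `node`'s direct neighborhood between consecutive snapshots."""
    changes = []
    for before, after in zip(snapshots, snapshots[1:]):
        old = before.get(node, set())
        new = after.get(node, set())
        union = old | new
        changes.append({
            "gained": new - old,   # neighbors that appeared
            "lost": old - new,     # neighbors that disappeared
            # Jaccard similarity; two empty neighborhoods count as identical.
            "jaccard": len(old & new) / len(union) if union else 1.0,
        })
    return changes


# Toy snapshots (made up for illustration only).
snapshots = [
    {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}},
    {"a": {"b", "d"}, "b": {"a"}, "d": {"a"}},
    {"a": {"b", "d"}, "b": {"a"}, "d": {"a"}},
]
changes = neighborhood_changes(snapshots, "a")
```

A per-node sequence of such changes is the kind of temporal input that a sequence-mining or clustering step could then consume. A node whose similarity stays near 1 across all steps would fall on the "stable" side of the distinction described above.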
Geographical trends in research: a preliminary analysis on authors' affiliations
In the last decade, research literature reached an enormous volume with an unprecedented current annual increase of 1.5 million new publications. As research gets ever more global and new countries and institutions, whether from academia or the corporate environment, start to contribute their share, it is important to monitor this complex scenario and understand its dynamics.
We present a study on a conference proceedings dataset extracted from Springer Nature Scigraph that illustrates insightful geographical trends and highlights the unbalanced growth of competitive research institutions worldwide. Results emerging from our micro and macro analysis show that the distributions of institutions and papers among countries follow a power law, and thus very few countries keep producing most of the papers accepted by high-tier conferences. In addition, we found that the annual and overall turnover rate of the top 5, 10 and 25 countries is extremely low, suggesting a very static landscape in which new entries struggle to emerge. Finally, we highlight the presence of an increasing gap between the number of institutions initiating and overseeing research endeavours (i.e. first and last authors' affiliations) and the total number of institutions participating in research. As a consequence of our analysis, the paper also discusses our experience in working with affiliations: an utterly simple matter at first glance that is instead revealed to be a complex research and technical challenge still far from being solved.
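The turnover rate of the top-k countries mentioned above can be illustrated with a short Python sketch. The abstract does not say how the authors compute turnover, so the definition used here (the fraction of the current top k that was absent from the previous top k) is an assumption. The helper names `top_k` and `turnover_rate` are hypothetical, and the paper counts are made-up placeholders rather than data from the study.

```python
from typing import Dict, Set


def top_k(counts: Dict[str, int], k: int) -> Set[str]:
    """Return the k countries with the most papers (ties broken alphabetically)."""
    ranked = sorted(counts, key=lambda country: (-counts[country], country))
    return set(ranked[:k])


def turnover_rate(previous: Dict[str, int], current: Dict[str, int], k: int) -> float:
    """Fraction of the current top k that was not in the previous top k.

    Assumed definition; both years are expected to contain at least k countries.
    """
    newcomers = top_k(current, k) - top_k(previous, k)
    return len(newcomers) / k


# Placeholder counts for two consecutive years (not real data).
year_1 = {"A": 50, "B": 30, "C": 10, "D": 5}
year_2 = {"A": 55, "B": 28, "C": 6, "D": 12}

rate_top2 = turnover_rate(year_1, year_2, k=2)  # top 2 is {A, B} in both years
rate_top3 = turnover_rate(year_1, year_2, k=3)  # D replaces C in the top 3
```

Under this definition, a value close to 0 for every pair of consecutive years would correspond to the static landscape the abstract describes.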
Harnessing Historical Corrections to build Test Collections for Named Entity Disambiguation
Matching mentions of persons to the actual persons (the name disambiguation
problem) is central for several digital library applications. Scientists have
been working on algorithms to create this matching for decades without finding
a universal solution. One problem is that test collections for this problem are
often small and specific to a certain collection. In this work, we present an
approach that can create large test collections from historical metadata with
minimal extra cost. We apply this approach to the DBLP collection to generate
two freely available test collections. One collection focuses on the properties
of defects and one on the evaluation of disambiguation algorithms. Comment: Preprint of a paper accepted at TPDL 201
Rank-aware, Approximate Query Processing on the Semantic Web
Search over the Semantic Web corpus frequently leads to queries with large result sets. To discover relevant data elements, users must therefore rely on ranking techniques that sort results according to their relevance. At the same time, applications often deal with information needs that do not require complete and exact results. In this thesis, we address the problem of how to process queries over Web data in an approximate and rank-aware fashion.
Unveiling the Sentinels: Assessing AI Performance in Cybersecurity Peer Review
Peer review is the method employed by the scientific community for evaluating
research advancements. In the field of cybersecurity, the practice of
double-blind peer review is the de-facto standard. This paper touches on the
holy grail of peer reviewing and aims to shed light on the performance of AI in
reviewing for academic security conferences. Specifically, we investigate the
predictability of reviewing outcomes by comparing the results obtained from
human reviewers and machine-learning models. To facilitate our study, we
construct a comprehensive dataset by collecting thousands of papers from
renowned computer science conferences and the arXiv preprint website. Based on
the collected data, we evaluate the prediction capabilities of ChatGPT and a
two-stage classification approach based on the Doc2Vec model with various
classifiers. Our experimental evaluation of review outcome prediction shows that
the Doc2Vec-based approach performs significantly better than ChatGPT and
achieves an accuracy of over 90%. While analyzing the experimental results, we
identify the potential advantages and limitations of the tested ML models. We
explore areas within the paper-reviewing process that can benefit from
automated support approaches, while also recognizing the irreplaceable role of
human intellect in certain aspects that cannot be matched by state-of-the-art
AI techniques
Publication Culture in Computing Research
The dissemination of research results is an integral part of research and hence a crucial component
for any scientific discipline. In the area of computing research, concerns have recently been
raised about its publication culture, most notably by highlighting the high priority of
conferences (compared to journals in other disciplines) and -- from an economic viewpoint -- the
costs of preparing and accessing research results.
The Dagstuhl Perspectives Workshop 12452 “Publication Culture in Computing Research”
aimed at discussing the main problems with a selected group of researchers and practitioners.
The goal was to identify and classify the current problems and to suggest potential remedies.
The group of participants was selected in a way such that a wide spectrum of opinions would be
presented. This led to intensive discussions.
The workshop is seen as an important step in the ongoing discussion. As its main result, the
root causes of the problems were identified and potential solutions were discussed. The insights will be
part of an upcoming manifesto on Publication Culture in Computing Research
Postmortem Analysis of Decayed Online Social Communities: Cascade Pattern Analysis and Prediction
Recently, many online social networks, such as MySpace, Orkut, and
Friendster, have faced inactivity decay of their members, which contributed to
the collapse of these networks. The reasons, mechanics, and prevention
mechanisms of such inactivity decay are not fully understood. In this work, we
analyze decayed and alive sub-websites from the StackExchange platform. The
analysis mainly focuses on the inactivity cascades that occur among the members
of these communities. We provide measures to understand the decay process and
statistical analysis to extract the patterns that accompany the inactivity
decay. Additionally, we predict cascade size and cascade virality using machine
learning. The results of this work include a statistically significant
difference in the decay patterns between the decayed and the alive
sub-websites. These patterns are mainly: cascade size, cascade virality,
cascade duration, and cascade similarity. Additionally, the contributed
prediction framework showed satisfactory prediction results compared to a
baseline predictor. Supported by empirical evidence, the main findings of this
work are: (1) the decay process is not governed by only one network measure; it
is better described using multiple measures; (2) the expert members of the
StackExchange sub-websites were mainly responsible for the activity or
inactivity of the StackExchange sub-websites; (3) the Statistics sub-website is
going through decay dynamics that may lead to it becoming fully-decayed; and
(4) decayed sub-websites were originally less resilient to inactivity decay,
unlike the alive sub-websites