From Text to Knowledge with Graphs: modelling, querying and exploiting textual content
This paper highlights the challenges, current trends, and open issues related
to the representation, querying and analytics of content extracted from texts.
The internet contains vast text-based information on various subjects,
including commercial documents, medical records, scientific experiments,
engineering tests, and events that impact urban and natural environments.
Extracting knowledge from this text involves understanding the nuances of
natural language and accurately representing the content without losing
information. This allows knowledge to be accessed, inferred, or discovered. To
achieve this, combining results from various fields, such as linguistics,
natural language processing, knowledge representation, data storage, querying,
and analytics, is necessary. The vision in this paper is that graphs can be a
well-suited representation of text content once they are annotated and the right
querying and analytics techniques are applied. This paper discusses this hypothesis
from the perspectives of linguistics, natural language processing, graph models and
databases, and artificial intelligence, as provided by the panellists of the DOING
session at the MADICS Symposium 2022.
A Network Topology Approach to Bot Classification
Automated social agents, or bots, are increasingly becoming a problem on
social media platforms. There is a growing body of literature and multiple
tools to aid in the detection of such agents on online social networking
platforms. We propose that the social network topology of a user would be
sufficient to determine whether the user is an automated agent or a human. To
test this, we use a publicly available dataset containing users on Twitter
labelled as either automated social agents or humans. Using an unsupervised
machine learning approach, we obtain a detection accuracy rate of 70%.
Semantically Enhanced Software Documentation Processes
High-quality software documentation is a substantial issue for
understanding software systems. Shorter time-to-market software cycles increase
the importance of automation for keeping the documentation up to
date. In this paper, we describe the automatic support of the software documentation process using semantic technologies. We introduce a software documentation ontology as an underlying knowledge base. The defined ontology is populated automatically by analysing source code, software documentation and code execution. Through selected results we demonstrate that the use of such semantic systems can support software documentation processes efficiently.
Investigating the cross-platform behaviours of online hate groups
The past few decades have established how digital technologies and platforms have provided an effective medium for spreading hateful content. Despite efforts from law-enforcement agencies and platform developers to remove or limit such content, online hate ideologies and extremist narratives are still being linked to several catastrophic consequences around the world. The concept of online hate is still considered a complex phenomenon, with its definition evolving across several theoretical paradigms and disciplines, and spanning multiple forms of victimisation. Due to this complexity, research into online hate is fragmented throughout numerous disciplines, including computational social science. Previous research has demonstrated how online hate thrives globally through self-organised, scalable clusters that interconnect to form robust networks spread across multiple social-media platforms, countries, and languages. Although several extensive approaches and methods have been proposed in previous studies for the analysis of online hate, limited research has investigated how hateful behaviours and content compare and relate across different online platforms.
This thesis aimed to address these limitations by developing a cross-platform analysis framework for online-hate researchers to gain a clearer understanding of the dynamics of the global hate ecosystem. More specifically, designing this framework involved examining the main functionalities of existing online-hate analysis frameworks, and the extent to which they address cross-platform hate. The strengths and limitations of these approaches then informed the functional requirements of the cross-platform analysis framework. To demonstrate how the framework can provide novel insights into online-hate research, this thesis also details its application to various case studies, including online hate from white-supremacy-supporting users and environments spread during the 2020 US election and the COVID-19 pandemic.
This application comprises a comparative analysis of hateful content in terms of the major topics of discussion and psycho-linguistic properties across different types of online platforms using natural language processing techniques. Additionally, the framework is used to explore networks of shared content, particularly through the posting of URLs, by harnessing social-network analysis methods. Finally, the cross-platform analysis framework is validated using a list of validation criteria to evaluate its practicality in investigating hateful content and providing novel insights into the field of online hate. The findings from this work can be used to develop more effective analysis tools for online-hate researchers and law-enforcement agencies.
Online Social Networks: Measurements, Analysis and Solutions for Mining Challenges
In the last decade, online social networks have shown enormous growth. With the rise
of these networks and the consequent availability of a wealth of social network data, Social
Network Analysis (SNA) has given researchers the opportunity to access, analyse and
mine the social behaviour of millions of people and to explore the way they communicate and
exchange information.
Despite the growing interest in analysing social networks, the analysis and mining of
these networks come with several challenges and implications. For example,
dealing with large-scale and evolving networks is not yet an easy task and still requires
new mining solutions. In addition, finding communities within these networks is a
challenging task and could open opportunities to see how people behave in groups on a
large scale. Moreover, validating and optimizing communities without knowing the structure
of the network in advance, owing to the lack of ground truth, is yet another
barrier to establishing the meaningfulness of the resulting communities.
In this thesis, we start by providing an overview of the necessary background and key
concepts in the area of social network analysis. Our main focus is to provide
solutions that tackle the key challenges in this area. To do so, we first introduce a
predictive modeling technique for estimating the execution time of analysis tasks on
large-scale, evolving networks. Second, we study the performance of existing community
detection approaches in deriving high-quality community structure from a real email
network by analysing the exchange of emails and exploring community dynamics.
The aim is to study the community behavioral patterns and evaluate their quality within
an actual network. Finally, we propose an ensemble technique for deriving communities
using a rich, real internal enterprise network at IBM that reflects real collaborations
and communications between employees. The technique aims to improve the community
detection process through the fusion of different algorithms.