3,568 research outputs found
Toxicity in Evolving Twitter Topics - Employing a novel Dynamic Topic Evolution Model (DyTEM) on Twitter data
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. This thesis presents an extensive investigation into the evolution of topics and their association with
speech toxicity on Twitter, based on a large corpus of tweets, providing crucial insights for monitoring
online discourse and potentially informing interventions to combat toxic behavior in digital
communities. A Dynamic Topic Evolution Model (DyTEM) is introduced, constructed by combining
static Topic Modelling techniques and sentence embeddings through the state-of-the-art sentence
transformer, sBERT. The DyTEM, tested and validated on a substantial sample of tweets, is represented
as a directed graph, encapsulating the inherent dynamism of Twitter discussions. For validating the
consistency of DyTEM and providing guidance for hyperparameter selection, a novel, hashtag-based
validation method is proposed. The analysis identifies and scrutinizes five distinct Topic Transition
Types: Topic Stagnation, Topic Merge, Topic Split, Topic Disappearance, and Topic Emergence. A
speech toxicity classification model is employed to delve into the toxicity dynamics within topic
evolution. A standout finding of this study is the positive correlation between topic popularity and its
toxicity, implying that trending or viral topics tend to contain more inflammatory speech. This insight,
along with the methodologies introduced in this study, contributes significantly to the broader
understanding of digital discourse dynamics and could guide future strategies aimed at fostering
healthier and more constructive online spaces
Airborne Directional Networking: Topology Control Protocol Design
This research identifies and evaluates the impact of several architectural design choices for autonomous topology control in airborne networks operating in contested environments. Using simulation, we evaluate topology reconfiguration effectiveness using classical performance metrics for different point-to-point communication architectures. Our attention is focused on the design choices which have the greatest impact on reliability, scalability, and performance. In this work, we discuss the impact of several practical considerations of airborne networking in contested environments related to autonomous topology control modeling. Using simulation, we derive multiple classical performance metrics to evaluate topology reconfiguration effectiveness for different point-to-point communication architecture attributes for the purpose of qualifying protocol design elements
A review of clustering techniques and developments
© 2017 Elsevier B.V. This paper presents a comprehensive study on clustering: existing methods and developments made at various times. Clustering is defined as unsupervised learning in which objects are grouped on the basis of some similarity inherent among them. There are different methods for clustering objects, such as hierarchical, partitional, grid, density-based, and model-based. The approaches used in these methods are discussed with their respective state of the art and applicability. The measures of similarity as well as the evaluation criteria, which are the central components of clustering, are also presented in the paper. The applications of clustering in some fields like image segmentation, object and character recognition, and data mining are highlighted
A Label-based Edge Partitioning for Multi-Layer Graphs
Social network systems rely on very large underlying graphs. Consequently, to achieve scalability, most data analytics and data mining algorithms are distributed and graphs are partitioned over a set of servers. In most real-world graphs, the edges and/or vertices have different semantics, and queries largely take these semantics into account. But while several works focus on efficient graph computations on these “multi-semantic” graphs, few are dedicated to their partitioning. In this work, we propose a novel approach to edge partitioning for multi-layer graphs, which considers both structural locality and edge-type (label) locality. Our experiments on real-life datasets with benchmark graph applications confirm that the execution time and the inter-partition communication can be significantly reduced with our approach