423 research outputs found
DynED: Dynamic Ensemble Diversification in Data Stream Classification
Ensemble methods are commonly used in classification due to their remarkable
performance. Achieving high accuracy in a data stream environment is a
challenging task considering disruptive changes in the data distribution, also
known as concept drift. A greater diversity of ensemble components is known to
enhance prediction accuracy in such settings. Despite the diversity of
components within an ensemble, not all contribute as expected to its overall
performance. This necessitates a method for selecting components that exhibit
high performance and diversity. We present a novel ensemble construction and
maintenance approach based on MMR (Maximal Marginal Relevance) that dynamically
combines the diversity and prediction accuracy of components during the process
of structuring an ensemble. Experimental results on four real and 11
synthetic datasets demonstrate that the proposed approach (DynED) provides a
higher average mean accuracy than the five state-of-the-art baselines.
Comment: Proceedings of the 32nd ACM International Conference on Information
and Knowledge Management (CIKM '23), October 21--25, 2023, Birmingham, United
Kingdo
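The MMR-based selection described in the DynED abstract can be illustrated with a minimal greedy sketch. This is not the authors' implementation: the function name `mmr_select`, the trade-off weight `lam`, and the use of a precomputed pairwise similarity matrix between components are all assumptions made for illustration.

```python
import numpy as np

def mmr_select(accuracy, similarity, k, lam=0.5):
    """Greedily pick k components, balancing accuracy against redundancy.

    accuracy:   per-component accuracy estimates (hypothetical input).
    similarity: square matrix of pairwise similarity between components,
                e.g. agreement of their predictions (hypothetical input).
    lam:        assumed trade-off weight; 1.0 ignores diversity entirely.
    """
    accuracy = np.asarray(accuracy, dtype=float)
    similarity = np.asarray(similarity, dtype=float)

    # Seed the ensemble with the single most accurate component.
    selected = [int(np.argmax(accuracy))]
    candidates = set(range(len(accuracy))) - set(selected)

    while candidates and len(selected) < k:
        best, best_score = None, -np.inf
        for c in sorted(candidates):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max(similarity[c, s] for s in selected)
            score = lam * accuracy[c] - (1.0 - lam) * redundancy
            if score > best_score:
                best, best_score = c, score
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=0.5`, a slightly less accurate but dissimilar component is preferred over a near-duplicate of one already selected; with `lam=1.0` the selection reduces to ranking by accuracy alone.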
Attention-based Multi-modal Sentiment Analysis and Emotion Detection in Conversation using RNN
With the availability of an enormous quantity of multimodal data and its widespread applications, automatic sentiment analysis and emotion classification in conversation have become an active research topic. The interlocutor state, the context state between neighboring utterances, and multimodal fusion play an important role in multimodal sentiment analysis and emotion detection in conversation. In this article, a recurrent neural network (RNN) based method is developed to capture the interlocutor state and the contextual state between utterances. A pair-wise attention mechanism is used to understand the relationship between the modalities and their importance before fusion. First, the modalities are fused two at a time; finally, all the modalities are fused to form the trimodal representation feature vector. The experiments are conducted on three standard datasets: IEMOCAP, CMU-MOSEI, and CMU-MOSI. The proposed model is evaluated using two metrics, accuracy and F1-score, and the results demonstrate that it performs better than the standard baselines.
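The pair-wise fusion step can be sketched roughly as follows. The abstract does not give the exact formulation, so this is only one plausible reading: plain dot-product attention between two modalities, followed by concatenation of the three pair-wise results with the unimodal features. The function names, the feature shapes, and the omission of the RNN stage are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pairwise_attention(a, b):
    """Fuse two modalities with dot-product cross-attention (assumed form).

    a, b: arrays of shape (T, d), one row per utterance.
    Returns an array of shape (T, 2 * d).
    """
    a_attends_b = softmax(a @ b.T) @ b  # each utterance in a attends over b
    b_attends_a = softmax(b @ a.T) @ a  # and the reverse direction
    # Gate each modality by what it attended to, then concatenate.
    return np.concatenate([a_attends_b * a, b_attends_a * b], axis=1)

def trimodal_fusion(text, audio, video):
    """Fuse the modalities two at a time, then join everything (assumed form)."""
    pairs = [
        pairwise_attention(text, audio),
        pairwise_attention(text, video),
        pairwise_attention(audio, video),
    ]
    return np.concatenate(pairs + [text, audio, video], axis=1)
```

For `T` utterances with `d`-dimensional features per modality, the fused representation here has shape `(T, 9 * d)`: three pair-wise blocks of width `2 * d` plus the three unimodal blocks.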
Matrix profile data mining for BGP anomaly detection
The Border Gateway Protocol (BGP), acting as the communication protocol that binds the Internet, remains vulnerable despite Internet security advancements. This is not surprising: the Internet was not designed to be resilient to cyber-attacks, so the detection of anomalous activity was not of prime importance to its creators. Detection of BGP anomalies can potentially provide network operators with an early warning system to focus on protecting networks, systems, and infrastructure from significant impact, and to improve security posture and resilience, ultimately contributing to a secure global Internet environment. In this paper, we present a novel technique for the detection of BGP anomalies in different events. This research uses publicly available datasets of BGP messages collected from the Route Views and Réseaux IP Européens (RIPE) repositories. Our contribution is the application of a time series data mining approach, Matrix Profile (MP), to detect BGP anomalies in all categories of BGP events. Advantages of the MP detection technique compared to extant approaches include that it is domain agnostic, is assumption-free, requires few parameters, does not require training data, and is scalable and storage efficient. The single hyper-parameter analyzed in MP shows it is robust to change. Our results indicate the MP detection scheme is competitive against existing detection schemes. A novel BGP anomaly detection scheme is also proposed for further research and validation.
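To make the Matrix Profile idea concrete, here is a naive quadratic-time sketch that is not taken from the paper. For every window of length `m` it records the z-normalized Euclidean distance to the nearest other window; the window with the largest value (the discord) is the most anomalous. The window length `m`, the exclusion zone of `m // 2`, and the synthetic sine series standing in for a BGP feature are assumptions; practical work would use an optimized library.

```python
import numpy as np

def znorm(x):
    """Z-normalize a window; constant windows map to all zeros."""
    std = x.std()
    return (x - x.mean()) / std if std > 0 else np.zeros_like(x)

def matrix_profile(series, m):
    """Naive O(n^2) matrix profile with an exclusion zone of m // 2."""
    series = np.asarray(series, dtype=float)
    n = len(series) - m + 1
    windows = np.array([znorm(series[i:i + m]) for i in range(n)])
    excl = m // 2
    profile = np.full(n, np.inf)
    for i in range(n):
        dist = np.linalg.norm(windows - windows[i], axis=1)
        # Ignore trivial matches: the window itself and close overlaps.
        dist[max(0, i - excl):min(n, i + excl + 1)] = np.inf
        profile[i] = dist.min()
    return profile

# Synthetic stand-in for a BGP feature: a periodic signal with a short burst.
t = np.arange(200)
signal = np.sin(2 * np.pi * t / 20)
signal[100:105] += 3.0

profile = matrix_profile(signal, m=20)
discord_start = int(np.argmax(profile))  # start of the most anomalous window
```

Because the clean signal repeats every 20 samples, every normal window has a near-identical neighbor and a profile value near zero, so the discord falls on a window that overlaps the injected burst.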
A Cognitive Framework to Secure Smart Cities
The advancement in technology has transformed Cyber Physical Systems and their interface with IoT into a more sophisticated and challenging paradigm. As a result, vulnerabilities and potential attacks manifest themselves considerably more than before, forcing researchers to rethink the conventional strategies that are currently in place to secure such physical systems. This manuscript studies the complex interweaving of sensor networks and physical systems and suggests a foundational innovation in the field. In sharp contrast with the existing IDS and IPS solutions, in this paper, a preventive and proactive method is employed to stay ahead of attacks by constantly monitoring network data patterns and identifying threats that are imminent. Here, by capitalizing on the significant progress in processing power (e.g. petascale computing) and storage capacity of computer systems, we propose a deep learning approach to predict and identify various security breaches that are about to occur. The learning process takes place by collecting a large number of files of different types and running tests on them to classify them as benign or malicious. The prediction model thus obtained can then be used to identify attacks. Our project articulates a new framework for interactions between physical systems and sensor networks, where malicious packets are repeatedly learned over time while the system continually operates with respect to imperfect security mechanisms.
Learning Disentangled Representations in Signed Directed Graphs without Social Assumptions
Signed graphs are complex systems that represent trust relationships or
preferences in various domains. Learning node representations in such graphs is
crucial for many mining tasks. Although real-world signed relationships can be
influenced by multiple latent factors, most existing methods often oversimplify
the modeling of signed relationships by relying on social theories and treating
them as simplistic factors. This limits their expressiveness and their ability
to capture the diverse factors that shape these relationships. In this paper,
we propose DINES, a novel method for learning disentangled node representations
in signed directed graphs without social assumptions. We adopt a disentangled
framework that separates each embedding into distinct factors, allowing for
capturing multiple latent factors. We also explore lightweight graph
convolutions that focus solely on sign and direction, without depending on
social theories. Additionally, we propose a decoder that effectively classifies
an edge's sign by considering correlations between the factors. To further
enhance disentanglement, we jointly train a self-supervised factor
discriminator with our encoder and decoder. Through extensive experiments on
real-world signed directed graphs, we show that DINES effectively learns
disentangled node representations, and significantly outperforms its
competitors in the sign prediction task.
Comment: 26 pages, 11 figure
What is Normal, What is Strange, and What is Missing in a Knowledge Graph: Unified Characterization via Inductive Summarization
Knowledge graphs (KGs) store highly heterogeneous information about the world
in the structure of a graph, and are useful for tasks such as question
answering and reasoning. However, they often contain errors and are missing
information. Vibrant research in KG refinement has worked to resolve these
issues, tailoring techniques to either detect specific types of errors or
complete a KG.
In this work, we introduce a unified solution to KG characterization by
formulating the problem as unsupervised KG summarization with a set of
inductive, soft rules, which describe what is normal in a KG, and thus can be
used to identify what is abnormal, whether it be strange or missing. Unlike
first-order logic rules, our rules are labeled, rooted graphs, i.e., patterns
that describe the expected neighborhood around a (seen or unseen) node, based
on its type, and information in the KG. Stepping away from the traditional
support/confidence-based rule mining techniques, we propose KGist, Knowledge
Graph Inductive SummarizaTion, which learns a summary of inductive rules that
best compress the KG according to the Minimum Description Length principle---a
formulation that we are the first to use in the context of KG rule mining. We
apply our rules to three large KGs (NELL, DBpedia, and Yago), and tasks such as
compression, various types of error detection, and identification of incomplete
information. We show that KGist outperforms task-specific, supervised and
unsupervised baselines in error detection and incompleteness identification
(identifying the location of up to 93% of missing entities---over 10% more than
baselines), while also being efficient for large knowledge graphs.
Comment: 10 pages, plus 2 pages of references. 5 figures. Accepted at The Web
Conference 202