2,333 research outputs found

    Systematic literature review for malware visualization techniques

    Analyzing the activities or behaviors of malicious scripts depends heavily on the extracted features. It is also important to know which features are more effective for certain visualization types. Similarly, selecting an appropriate visualization technique plays a key role in descriptive, diagnostic, predictive and prescriptive analytics. Thus, the visualization technique should provide understandable information about the malicious code's activities. This paper follows a systematic literature review method to review the extracted features that are used to identify malware, the different types of visualization techniques, and guidelines for selecting the right visualization technique. An advanced search was performed in the most relevant digital libraries to obtain potentially relevant articles. The results identify the significant resources and types of features that are important for analyzing malware activities, the common visualization techniques currently in use, and methods for choosing the right visualization technique in order to analyze security events effectively.

    Analysis of Feature Categories for Malware Visualization

    It is important to know which features are more effective for certain visualization types. Furthermore, selecting an appropriate visualization tool plays a key role in descriptive, diagnostic, predictive and prescriptive analytics. Moreover, analyzing the activities of malicious scripts or code depends on the extracted features. In this paper, the authors review and classify the most common extracted features that have been used for malware visualization, based on specified categories. The study examines the feature categories and their usefulness for effective malware visualization. Additionally, it focuses on the common extracted features used in the malware visualization domain. The literature review findings reveal that the features can be categorized into four main categories, namely static, dynamic, hybrid, and application metadata. The contribution of this paper concerns feature selection: it illustrates which features are effective with which visualization tools for malware visualization.

    FLAGS : a methodology for adaptive anomaly detection and root cause analysis on sensor data streams by fusing expert knowledge with machine learning

    Anomalies and faults can be detected, and their causes verified, using both data-driven and knowledge-driven techniques. Data-driven techniques can adapt their internal functioning based on the raw input data but fail to explain the manifestation of any detection. Knowledge-driven techniques inherently deliver the cause of the faults that were detected but require too much human effort to set up. In this paper, we introduce FLAGS, the Fused-AI interpretabLe Anomaly Generation System, and combine both techniques in one methodology to overcome their limitations and optimize them based on limited user feedback. Semantic knowledge is incorporated in a machine learning technique to enhance expressivity. At the same time, feedback about the faults and anomalies that occurred is provided as input to increase adaptiveness using semantic rule mining methods. This new methodology is evaluated on a predictive maintenance case for trains. We show that our method reduces their downtime and provides more insight into frequently occurring problems. (C) 2020 The Authors. Published by Elsevier B.V

    The Cybernetics Thought Collective: A History of Science and Technology Portal Project White Paper

    This White Paper discusses the Cybernetics Thought Collective (CTC) team’s specific work to digitize a select portion of archival materials; investigate and experiment with natural language processing, named entity extraction, and machine learning software; begin investigating access interfaces for the portal; and ingest the digitized materials and machine-extracted metadata into the University of Illinois Library’s preservation repository and digital library. The pilot grant project enabled us to explore emerging methods for creating access to archival materials, which resulted in promising outcomes. In May 2018, the CTC team launched the prototype portal: https://archives.library.illinois.edu/thought-collective/. National Endowment for the Humanities PW-253912-17

    STREAM-EVOLVING BOT DETECTION FRAMEWORK USING GRAPH-BASED AND FEATURE-BASED APPROACHES FOR IDENTIFYING SOCIAL BOTS ON TWITTER

    This dissertation focuses on the problem of evolving social bots in online social networks, particularly Twitter. Such accounts spread misinformation and inflate social network content to mislead the masses. The main objective of this dissertation is to propose a stream-based evolving bot detection framework (SEBD), which was constructed using both graph- and feature-based models. It was built using Python, a real-time streaming engine (Apache Kafka version 3.2), and our pretrained model, the bot multi-view graph attention network (Bot-MGAT). The feature-based model was used to identify predictive features for bot detection and to evaluate the SEBD predictions. The graph-based model was used to facilitate multiview graph attention networks (GATs) with fellowship links to build our framework for predicting account labels from streams. A probably approximately correct learning framework was applied to confirm the accuracy and confidence levels of SEBD. The results showed that SEBD can effectively identify bots from streams and that profile features are sufficient for detecting social bots. The pretrained Bot-MGAT model uses fellowship links to reveal hidden information that can aid in identifying bot accounts. The significant contributions of this study are the development of a stream-based bot detection framework for detecting social bots based on a given hashtag and the proposal of a hybrid approach to feature selection for identifying predictive features of bot accounts. Our findings indicate that Twitter has a higher percentage of active bots than humans in hashtags. The results indicated that stream-based detection is more effective than offline detection, achieving an accuracy score of 96.9%. Finally, semi-supervised learning (SSL) can solve the issue of labeled data in bot detection tasks.

    Facilitating and Enhancing the Performance of Model Selection for Energy Time Series Forecasting in Cluster Computing Environments

    Applying Machine Learning (ML) manually to a given problem setting is a tedious and time-consuming process which brings many challenges with it, especially in the context of Big Data. In such a context, gaining insightful information, finding patterns, and extracting knowledge from large datasets are quite complex tasks. Additionally, the configurations of the underlying Big Data infrastructure introduce more complexity for configuring and running ML tasks. With the growing interest in ML over the last few years, people without extensive ML expertise in particular have a high demand for frameworks that assist them in applying the right ML algorithm to their problem setting. This is especially true in the field of smart energy system applications, where more and more ML algorithms are used, e.g., for time series forecasting. Generally, two groups of non-expert users who perform energy time series forecasting are distinguished. The first includes users who are familiar with statistics and ML but are not able to write the programming code needed for training and evaluating ML models using the well-known trial-and-error approach. Such an approach is time-consuming and wastes resources on constructing multiple models. The second group is even less experienced in programming and not knowledgeable in statistics and ML, but wants to apply given ML solutions to their problem settings. The goal of this thesis is to scientifically explore, in the context of more concrete use cases in the energy domain, how such non-expert users can be optimally supported in creating and performing ML tasks in practice on cluster computing environments. To support the first group of non-expert users, an easy-to-use, modular, extendable, microservice-based ML solution for instrumenting and evaluating ML algorithms on top of a Big Data technology stack is conceptualized and evaluated. Our proposed solution facilitates applying the trial-and-error approach by hiding the low-level complexities from the users, and it introduces the best conditions for efficiently performing ML tasks in cluster computing environments. To support the second group of non-expert users, the first solution is extended to realize meta learning approaches for automated model selection. We evaluate how meta learning technology can be efficiently applied to the problem space of data analytics for smart energy systems to assist energy system experts who are not data analytics experts in applying the right ML algorithms to their data analytics problems. To enhance the predictive performance of meta learning, an efficient characterization of energy time series datasets is required. To this end, Descriptive Statistics Time based Meta Features (DSTMF), a new kind of meta features, is designed to accurately capture the deep characteristics of energy time series datasets. We find that DSTMF outperforms the other state-of-the-art meta feature sets introduced in the literature to characterize energy time series datasets, in terms of both the accuracy of meta learning models and the time needed to extract the features. Further enhancement in the predictive performance of the meta learning classification model is achieved by training the meta learner on new efficient meta examples. To this end, we propose two new approaches to generate new energy time series datasets to be used as training meta examples by the meta learner, depending on the type of time series dataset (i.e. generation or energy consumption time series). We find that extending the original training sets with new meta examples generated by our approaches outperforms the case in which the original is extended by new simulated energy time series datasets.

    The 1992 Goddard Conference on Space Applications of Artificial Intelligence

    The purpose of this conference is to provide a forum in which current research and development directed at space applications of artificial intelligence can be presented and discussed. The papers fall into the following areas: planning and scheduling, control, fault monitoring/diagnosis and recovery, information management, tools, neural networks, and miscellaneous applications

    Neural Networks for Building Semantic Models and Knowledge Graphs

    The abstract is in the attachment. INGEGNERIA INFORMATI. Futia, Giusepp