4 research outputs found

    Clustering Arabic Tweets for Sentiment Analysis

    The focus of this study is to evaluate the impact of linguistic preprocessing and similarity functions on clustering Arabic tweets. The experiments apply an optimized version of the standard K-Means algorithm to assign tweets to positive and negative categories. The results show that root-based stemming has a significant advantage over light stemming in all settings. The Averaged Kullback-Leibler Divergence similarity function clearly outperforms the Cosine, Pearson Correlation, Jaccard Coefficient and Euclidean functions. The combination of the Averaged Kullback-Leibler Divergence and root-based stemming achieved the highest purity of 0.764, while the second-best purity was 0.719. These results are important because they run contrary to findings for normal-sized documents, where, in many information retrieval applications, light stemming performs better than root-based stemming and the Cosine function is commonly used.
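    The abstract reports clustering quality as purity but does not define it. The sketch below assumes the standard definition: each cluster is credited with its most frequent true label, and purity is the fraction of items covered by those majority labels. The function name `purity` and the toy data are illustrative and do not come from the paper.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, true_labels):
    """Return clustering purity under the standard majority-label definition (assumed here)."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")

    # Count how often each true label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Each cluster contributes the size of its majority label.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: eight tweets split into two clusters.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]
score = purity(clusters, labels)  # 3 majority items per cluster, so 6 / 8
```

    With two sentiment classes, a purity of 0.5 is the floor for a two-cluster solution on balanced data, which gives some context for the reported 0.764 and 0.719.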

    AI-based intrusion detection systems for in-vehicle networks: a survey.

    The Controller Area Network (CAN) is the most widely used in-vehicle communication protocol, yet it still lacks suitable security mechanisms such as message authentication and encryption. This makes the CAN bus vulnerable to numerous cyber attacks. Various Intrusion Detection Systems (IDSs) have been developed to detect these attacks. In particular, the high generalization capabilities of Artificial Intelligence (AI) make AI-based IDS an excellent countermeasure against automotive cyber attacks. This article surveys AI-based in-vehicle IDS from 2016 to 2022 (August) with a novel taxonomy. It reviews the detection techniques, attack types, features, and benchmark datasets. Furthermore, the article discusses the security of AI models and the steps necessary to develop AI-based IDSs for the CAN bus, identifies the limitations of existing proposals, and gives recommendations for future research directions.

    Artificial Intelligence and Cybersecurity: Building an Automotive Cybersecurity Framework Using Machine Learning Algorithms

    Automotive technology has continued to advance in many aspects. As an outcome of such advancements, autonomous vehicles are closer to commercialization and have brought to life a complex automotive technology ecosystem [1]. Like every other technology, these developments bring benefits but also introduce a variety of risks. One of these risks in the automotive space is cybersecurity threats. In the case of cars, these security challenges can produce devastating results and tremendous costs, including loss of life. Therefore, conducting a clear analysis, assessment and detection of threats solves some of the cybersecurity challenges in the automotive ecosystem. This dissertation does just that by building a three-step framework to analyze, assess, and detect threats using machine learning algorithms. First, it analyzes connected vehicle threats while leveraging the STRIDE framework [2]. Second, it presents an innovative fuzzy-based threat assessment model (FTAM). FTAM leverages threat characterizations from established threat assessment models while focusing on improving its assessment capabilities by using fuzzy logic. Through this methodology, FTAM can improve the efficiency and accuracy of the threat assessment process over other existing methods by using fuzzy logic to determine the "degree" of the threat. This differs from current threat assessment models, which use subjective assessment processes based on table look-ups or scoring. Third, this dissertation proposes an intrusion detection system (IDS) to detect malicious threats while taking into consideration results from the previous assessment stage. This IDS uses the dataset provided by the Wyoming Connected Vehicle Deployment program [3] and consists of a two-stage intrusion detection system based on supervised and unsupervised machine learning algorithms. The first stage uses unsupervised learning to detect whether an attack is present, and the second stage classifies these attacks in a supervised learning fashion. The second stage also addresses data bias and reduces the number of false positives. The simulation of this approach results in an IDS able to detect and classify attacks with 99.965% accuracy and lowers the false positive rate to 0%.
    Ph.D., College of Engineering & Computer Science, University of Michigan-Dearborn. https://deepblue.lib.umich.edu/bitstream/2027.42/149467/1/Nevrus Kaja PhD Dissertation V24.pdf