604 research outputs found

    Automatic generation of meta classifiers with large levels for distributed computing and networking

    Full text link
    This paper is devoted to a case study of a new construction of classifiers. These classifiers are called automatically generated multi-level meta classifiers, AGMLMC. The construction combines diverse meta classifiers in a new way to create a unified system. This original construction can be generated automatically producing classifiers with large levels. Different meta classifiers are incorporated as low-level integral parts of another meta classifier at the top level. It is intended for the distributed computing and networking. The AGMLMC classifiers are unified classifiers with many parts that can operate in parallel. This make it easy to adopt them in distributed applications. This paper introduces new construction of classifiers and undertakes an experimental study of their performance. We look at a case study of their effectiveness in the special case of the detection and filtering of phishing emails. This is a possible important application area for such large and distributed classification systems. Our experiments investigate the effectiveness of combining diverse meta classifiers into one AGMLMC classifier in the case study of detection and filtering of phishing emails. The results show that new classifiers with large levels achieved better performance compared to the base classifiers and simple meta classifiers classifiers. This demonstrates that the new technique can be applied to increase the performance if diverse meta classifiers are included in the system

    Analyzing Social and Stylometric Features to Identify Spear phishing Emails

    Full text link
    Spear phishing is a complex targeted attack in which, an attacker harvests information about the victim prior to the attack. This information is then used to create sophisticated, genuine-looking attack vectors, drawing the victim to compromise confidential information. What makes spear phishing different, and more powerful than normal phishing, is this contextual information about the victim. Online social media services can be one such source for gathering vital information about an individual. In this paper, we characterize and examine a true positive dataset of spear phishing, spam, and normal phishing emails from Symantec's enterprise email scanning service. We then present a model to detect spear phishing emails sent to employees of 14 international organizations, by using social features extracted from LinkedIn. Our dataset consists of 4,742 targeted attack emails sent to 2,434 victims, and 9,353 non targeted attack emails sent to 5,912 non victims; and publicly available information from their LinkedIn profiles. We applied various machine learning algorithms to this labeled data, and achieved an overall maximum accuracy of 97.76% in identifying spear phishing emails. We used a combination of social features from LinkedIn profiles, and stylometric features extracted from email subjects, bodies, and attachments. However, we achieved a slightly better accuracy of 98.28% without the social features. Our analysis revealed that social features extracted from LinkedIn do not help in identifying spear phishing emails. To the best of our knowledge, this is one of the first attempts to make use of a combination of stylometric features extracted from emails, and social features extracted from an online social network to detect targeted spear phishing emails.Comment: Detection of spear phishing using social media feature

    Performance evaluation of multi-tier ensemble classifiers for phishing websites

    Get PDF
    This article is devoted to large multi-tier ensemble classifiers generated as ensembles of ensembles and applied to phishing websites. Our new ensemble construction is a special case of the general and productive multi-tier approach well known in information security. Many efficient multi-tier classifiers have been considered in the literature. Our new contribution is in generating new large systems as ensembles of ensembles by linking a top-tier ensemble to another middletier ensemble instead of a base classifier so that the top~ tier ensemble can generate the whole system. This automatic generation capability includes many large ensemble classifiers in two tiers simultaneously and automatically combines them into one hierarchical unified system so that one ensemble is an integral part of another one. This new construction makes it easy to set up and run such large systems. The present article concentrates on the investigation of performance of these new multi~tier ensembles for the example of detection of phishing websites. We carried out systematic experiments evaluating several essential ensemble techniques as well as more recent approaches and studying their performance as parts of multi~level ensembles with three tiers. The results presented here demonstrate that new three-tier ensemble classifiers performed better than the base classifiers and standard ensembles included in the system. This example of application to the classification of phishing websites shows that the new method of combining diverse ensemble techniques into a unified hierarchical three-tier ensemble can be applied to increase the performance of classifiers in situations where data can be processed on a large computer

    Detecting Abnormal Social Robot Behavior through Emotion Recognition

    Get PDF
    Sharing characteristics with both the Internet of Things and the Cyber Physical Systems categories, a new type of device has arrived to claim a third category and raise its very own privacy concerns. Social robots are in the market asking consumers to become part of their daily routine and interactions. Ranging in the level and method of communication with the users, all social robots are able to collect, share and analyze a great variety and large volume of personal data.In this thesis, we focus the community’s attention to this emerging area of interest for privacy and security research. We discuss the likely privacy issues, comment on current defense mechanisms that are applicable to this new category of devices, outline new forms of attack that are made possible through social robots, highlight paths that research on consumer perceptions could follow, and propose a system for detecting abnormal social robot behavior based on emotion detection

    Character and Word Embeddings for Phishing Email Detection

    Get PDF
    Phishing attacks are among the most common malicious activities on the Internet. During a phishing attack, cybercriminals present themselves as a trusted organization or individual. Their goal is to lure people to enter their private information, such as passwords and bank card numbers, while believing that nothing malicious is happening. The attack often starts with a phishing email, which is an email that is very similar to a legitimate email, but usually contains links to malicious websites or uses some other techniques to mislead victims. To prevent phishing attacks, it is crucial to detect phishing emails and remove them from email inbox folders. In this paper, a neural network based phishing email detection model is proposed. In comparison to some earlier approaches, our model does not use manually engineered input features. It learns character and word embeddings directly from email texts, and uses them to extract local and global features using convolutional and recurrent layers, respectively. Our model is tested on the two commonly used datasets for phishing email detection, the SpamAssassin Public Corpus and Nazario Phishing Corpus, and it achieves an accuracy of 99.81 % and F_1-score of 99.74 %, which is on par or better than the current state-of-the-art approaches

    Using Four Learning Algorithms for Evaluating Questionable Uniform Resource Locators (URLs)

    Get PDF
    Malicious Uniform Resource Locator (URL) is a common and serious threat to cyber security. Malicious URLs host unsolicited contents (spam, phishing, drive-by exploits, etc.) and lure unsuspecting internet users to become victims of scams such as monetary loss, theft, loss of information privacy and unexpected malware installation. This phenomenon has resulted in the increase of cybercrime on social media via transfer of malicious URLs. This situation prompted an efficient and reliable classification of a web-page based on the information contained in the URL to have a clear understanding of the nature and status of the site to be accessed. It is imperative to detect and act on URLs shared on social media platform in a timely manner. Though researchers have carried out similar researches in the past, there are however conflicting results regarding the conclusions drawn at the end of their experimentations. Against this backdrop, four machine learning algorithms:Naïve Bayes Algorithm, K-means Algorithm, Decision Tree Algorithm and Logistic Regression Algorithm were selected for classification of fake and vulnerable URLs. The implementation of algorithms was implemented with Java programming language. Through statistical analysis and comparison made on the four algorithms, Naïve Bayes algorithm is the most efficient and effective based on the metrics used

    A maximum entropy classification scheme for phishing detection using parsimonious features

    Get PDF
    Over the years, electronic mail (e-mail) has been the target of several malicious attacks. Phishing is one of the most recognizable forms of manipulation aimed at e-mail users and usually, employs social engineering to trick innocent users into supplying sensitive information into an imposter website. Attacks from phishing emails can result in the exposure of confidential information, financial loss, data misuse, and others. This paper presents the implementation of a maximum entropy (ME) classification method for an efficient approach to the identification of phishing emails. Our result showed that maximum entropy with parsimonious feature space gives a better classification precision than both the Naïve Bayes and support vector machine (SVM)

    Optimizing cybersecurity incident response decisions using deep reinforcement learning

    Get PDF
    The main purpose of this paper is to explore and investigate the role of deep reinforcement learning (DRL) in optimizing the post-alert incident response process in security incident and event management (SIEM) systems. Although machine learning is used at multiple levels of SIEM systems, the last mile decision process is often ignored. Few papers reported efforts regarding the use of DRL to improve the post-alert decision and incident response processes. All the reported efforts applied only shallow (traditional) machine learning approaches to solve the problem. This paper explores the possibility of solving the problem using DRL approaches. The main attraction of DRL models is their ability to make accurate decisions based on live streams of data without the need for prior training, and they proved to be very successful in other fields of applications. Using standard datasets, a number of experiments have been conducted using different DRL configurations The results showed that DRL models can provide highly accurate decisions without the need for prior training

    Hybrid features-based prediction for novel phish websites

    Get PDF
    Phishers frequently craft novel deceptions on their websites and circumvent existing anti-phishing techniques for insecure intrusions, users’ digital identity theft, and then illegal profits. This raises the needs to incorporate new features for detecting novel phish websites and optimizing the existing anti-phishing techniques. In this light, 58 new hybrid features were proposed in this paper and their prediction susceptibilities were evaluated by using feature co-occurrence criterion and a baseline machine learning algorithm. Empirical test and analysis showed the significant outcomes of the proposed features on detection performance. As a result, the most influential features are identified, and new insights are offered for further detection improvement
    corecore