244 research outputs found

    Poisson factorization for peer-based anomaly detection

    Get PDF
    Anomaly detection systems are a promising tool to identify compromised user credentials and malicious insiders in enterprise networks. Most existing approaches for modelling user behaviour rely on either independent observations for each user or on pre-defined user peer groups. A method is proposed based on recommender system algorithms to learn overlapping user peer groups and to use this learned structure to detect anomalous activity. Results analysing the authentication and process-running activities of thousands of users show that the proposed method can detect compromised user accounts during a red team exercise

    Graph based Anomaly Detection and Description: A Survey

    Get PDF
    Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, with graph data becoming ubiquitous, techniques for structured graph data have been of focus recently. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field

    Graph-based, systems approach for detecting violent extremist radicalization trajectories and other latent behaviors, A

    Get PDF
    2017 Summer.Includes bibliographical references.The number and lethality of violent extremist plots motivated by the Salafi-jihadist ideology have been growing for nearly the last decade in both the U.S and Western Europe. While detecting the radicalization of violent extremists is a key component in preventing future terrorist attacks, it remains a significant challenge to law enforcement due to the issues of both scale and dynamics. Recent terrorist attack successes highlight the real possibility of missed signals from, or continued radicalization by, individuals whom the authorities had formerly investigated and even interviewed. Additionally, beyond considering just the behavioral dynamics of a person of interest is the need for investigators to consider the behaviors and activities of social ties vis-à-vis the person of interest. We undertake a fundamentally systems approach in addressing these challenges by investigating the need and feasibility of a radicalization detection system, a risk assessment assistance technology for law enforcement and intelligence agencies. The proposed system first mines public data and government databases for individuals who exhibit risk indicators for extremist violence, and then enables law enforcement to monitor those individuals at the scope and scale that is lawful, and account for the dynamic indicative behaviors of the individuals and their associates rigorously and automatically. In this thesis, we first identify the operational deficiencies of current law enforcement and intelligence agency efforts, investigate the environmental conditions and stakeholders most salient to the development and operation of the proposed system, and address both programmatic and technical risks with several initial mitigating strategies. We codify this large effort into a radicalization detection system framework. The main thrust of this effort is the investigation of the technological opportunities for the identification of individuals matching a radicalization pattern of behaviors in the proposed radicalization detection system. We frame our technical approach as a unique dynamic graph pattern matching problem, and develop a technology called INSiGHT (Investigative Search for Graph Trajectories) to help identify individuals or small groups with conforming subgraphs to a radicalization query pattern, and follow the match trajectories over time. INSiGHT is aimed at assisting law enforcement and intelligence agencies in monitoring and screening for those individuals whose behaviors indicate a significant risk for violence, and allow for the better prioritization of limited investigative resources. We demonstrated the performance of INSiGHT on a variety of datasets, to include small synthetic radicalization-specific data sets, a real behavioral dataset of time-stamped radicalization indicators of recent U.S. violent extremists, and a large, real-world BlogCatalog dataset serving as a proxy for the type of intelligence or law enforcement data networks that could be utilized to track the radicalization of violent extremists. We also extended INSiGHT by developing a non-combinatorial neighbor matching technique to enable analysts to maintain visibility of potential collective threats and conspiracies and account for the role close social ties have in an individual's radicalization. This enhancement was validated on small, synthetic radicalization-specific datasets as well as the large BlogCatalog dataset with real social network connections and tagging behaviors for over 80K accounts. The results showed that our algorithm returned whole and partial subgraph matches that enabled analysts to gain and maintain visibility on neighbors' activities. Overall, INSiGHT led to consistent, informed, and reliable assessments about those who pose a significant risk for some latent behavior in a variety of settings. Based upon these results, we maintain that INSiGHT is a feasible and useful supporting technology with the potential to optimize law enforcement investigative efforts and ultimately enable the prevention of individuals from carrying out extremist violence. Although the prime motivation of this research is the detection of violent extremist radicalization, we found that INSiGHT is applicable in detecting latent behaviors in other domains such as on-line student assessment and consumer analytics. This utility was demonstrated through experiments with real data. For on-line student assessment, we tested INSiGHT on a MOOC dataset of students and time-stamped on-line course activities to predict those students who persisted in the course. For consumer analytics, we tested the performance on a real, large proprietary consumer activities dataset from a home improvement retailer. Lastly, motivated by the desire to validate INSiGHT as a screening technology when ground truth is known, we developed a synthetic data generator of large population, time-stamped, individual-level consumer activities data consistent with an a priori project set designation (latent behavior). This contribution also sets the stage for future work in developing an analogous synthetic data generator for radicalization indicators to serve as a testbed for INSiGHT and other data mining algorithms

    The Impact of Mindfulness on Non-Malicious Spillage within Images on Social Networking Sites

    Get PDF
    Insider threat by employees in organizations is a problematic issue in today’s fast-paced, internet-driven society. Gone are the days when securing the perimeter of one’s network protected their business. Security threats are now mobile, and employees have the ability to share sensitive business data with hundreds of people instantaneously from mobile devices. While prior research has addressed social networking topics such as trust in relation to information systems, the use of social networking sites, social networking security, and social networking sharing, there is a lack of research in the mindfulness of users who spill sensitive data contained within images posted on social networking sites (SNS). The author seeks to provide an understanding of how non-malicious spillage through images relates to the mindfulness of employees, who are also deemed insiders. Specifically, it explores the relationships between the following variables: mindfulness, proprietary information spillage, and spillage of personally identifiable information (PII). A quasi-experimental study was designed, which was correlational in nature. Individuals were the unit of analysis. A sample population of business managers with SNS accounts were studied. A series of video vignettes were used to measure mindfulness. Surveys were used as a tool to collect and analyze data. There was a positive correlation between non-malicious spillage of sensitive business, both personally identifiable information and proprietary data, and a lack of mindfulness

    AI-enabled modeling and monitoring of data-rich advanced manufacturing systems

    Get PDF
    The infrastructure of cyber-physical systems (CPS) is based on a meta-concept of cybermanufacturing systems (CMS) that synchronizes the Industrial Internet of Things (IIoTs), Cloud Computing, Industrial Control Systems (ICSs), and Big Data analytics in manufacturing operations. Artificial Intelligence (AI) can be incorporated to make intelligent decisions in the day-to-day operations of CMS. Cyberattack spaces in AI-based cybermanufacturing operations pose significant challenges, including unauthorized modification of systems, loss of historical data, destructive malware, software malfunctioning, etc. However, a cybersecurity framework can be implemented to prevent unauthorized access, theft, damage, or other harmful attacks on electronic equipment, networks, and sensitive data. The five main cybersecurity framework steps are divided into procedures and countermeasure efforts, including identifying, protecting, detecting, responding, and recovering. Given the major challenges in AI-enabled cybermanufacturing systems, three research objectives are proposed in this dissertation by incorporating cybersecurity frameworks. The first research aims to detect the in-situ additive manufacturing (AM) process authentication problem using high-volume video streaming data. A side-channel monitoring approach based on an in-situ optical imaging system is established, and a tensor-based layer-wise texture descriptor is constructed to describe the observed printing path. Subsequently, multilinear principal component analysis (MPCA) is leveraged to reduce the dimension of the tensor-based texture descriptor, and low-dimensional features can be extracted for detecting attack-induced alterations. The second research work seeks to address the high-volume data stream problems in multi-channel sensor fusion for diverse bearing fault diagnosis. This second approach proposes a new multi-channel sensor fusion method by integrating acoustics and vibration signals with different sampling rates and limited training data. The frequency-domain tensor is decomposed by MPCA, resulting in low-dimensional process features for diverse bearing fault diagnosis by incorporating a Neural Network classifier. By linking the second proposed method, the third research endeavor is aligned to recovery systems of multi-channel sensing signals when a substantial amount of missing data exists due to sensor malfunction or transmission issues. This study has leveraged a fully Bayesian CANDECOMP/PARAFAC (FBCP) factorization method that enables to capture of multi-linear interaction (channels × signals) among latent factors of sensor signals and imputes missing entries based on observed signals

    Secure and Authenticated Message Dissemination in Vehicular ad hoc Networks and an Incentive-Based Architecture for Vehicular Cloud

    Get PDF
    Vehicular ad hoc Networks (VANETs) allow vehicles to form a self-organized network. VANETs are likely to be widely deployed in the future, given the interest shown by industry in self-driving cars and satisfying their customers various interests. Problems related to Mobile ad hoc Networks (MANETs) such as routing, security, etc.have been extensively studied. Even though VANETs are special type of MANETs, solutions proposed for MANETs cannot be directly applied to VANETs because all problems related to MANETs have been studied for small networks. Moreover, in MANETs, nodes can move randomly. On the other hand, movement of nodes in VANETs are constrained to roads and the number of nodes in VANETs is large and covers typically large area. The following are the contributions of the thesis. Secure, authenticated, privacy preserving message dissemination in VANETs: When vehicles in VANET observe phenomena such as accidents, icy road condition, etc., they need to disseminate this information to vehicles in appropriate areas so the drivers of those vehicles can take appropriate action. When such messages are disseminated, the authenticity of the vehicles disseminating such messages should be verified while at the same time the anonymity of the vehicles should be preserved. Moreover, to punish the vehicles spreading malicious messages, authorities should be able to trace such messages to their senders when necessary. For this, we present an efficient protocol for the dissemination of authenticated messages. Incentive-based architecture for vehicular cloud: Due to the advantages such as exibility and availability, interest in cloud computing has gained lot of attention in recent years. Allowing vehicles in VANETs to store the collected information in the cloud would facilitate other vehicles to retrieve this information when they need. In this thesis, we present a secure incentive-based architecture for vehicular cloud. Our architecture allows vehicles to collect and store information in the cloud; it also provides a mechanism for rewarding vehicles that contributing to the cloud. Privacy preserving message dissemination in VANETs: Sometimes, it is sufficient to ensure the anonymity of the vehicles disseminating messages in VANETs. We present a privacy preserving message dissemination protocol for VANETs

    Feature Space Modeling for Accurate and Efficient Learning From Non-Stationary Data

    Get PDF
    A non-stationary dataset is one whose statistical properties such as the mean, variance, correlation, probability distribution, etc. change over a specific interval of time. On the contrary, a stationary dataset is one whose statistical properties remain constant over time. Apart from the volatile statistical properties, non-stationary data poses other challenges such as time and memory management due to the limitation of computational resources mostly caused by the recent advancements in data collection technologies which generate a variety of data at an alarming pace and volume. Additionally, when the collected data is complex, managing data complexity, emerging from its dimensionality and heterogeneity, can pose another challenge for effective computational learning. The problem is to enable accurate and efficient learning from non-stationary data in a continuous fashion over time while facing and managing the critical challenges of time, memory, concept change, and complexity simultaneously. Feature space modeling is one of the most effective solutions to address this problem. For non-stationary data, selecting relevant features is even more critical than stationary data due to the reduction of feature dimension which can ensure the best use a computational resource to produce higher accuracy and efficiency by data mining algorithms. In this dissertation, we investigated a variety of feature space modeling techniques to improve the overall performance of data mining algorithms. In particular, we built Relief based feature sub selection method in combination with data complexity iv analysis to improve the classification performance using ovarian cancer image data collected in a non-stationary batch mode. We also collected time series health sensor data in a streaming environment and deployed feature space transformation using Singular Value Decomposition (SVD). This led to reduced dimensionality of feature space resulting in better accuracy and efficiency produced by Density Ration Estimation Method in identifying potential change points in data over time. We have also built an unsupervised feature space modeling using matrix factorization and Lasso Regression which was successfully deployed in conjugate with Relative Density Ratio Estimation to address the botnet attacks in a non-stationary environment. Relief based feature model improved 16% accuracy of Fuzzy Forest classifier. For change detection framework, we observed 9% improvement in accuracy for PCA feature transformation. Due to the unsupervised feature selection model, for 2% and 5% malicious traffic ratio, the proposed botnet detection framework exhibited average 20% better accuracy than One Class Support Vector Machine (OSVM) and average 25% better accuracy than Autoencoder. All these results successfully demonstrate the effectives of these feature space models. The fundamental theme that repeats itself in this dissertation is about modeling efficient feature space to improve both accuracy and efficiency of selected data mining models. Every contribution in this dissertation has been subsequently and successfully employed to capitalize on those advantages to solve real-world problems. Our work bridges the concepts from multiple disciplines ineffective and surprising ways, leading to new insights, new frameworks, and ultimately to a cross-production of diverse fields like mathematics, statistics, and data mining

    Adversarial Item Promotion: Vulnerabilities at the Core of Top-N Recommenders that Use Images to Address Cold Start

    Get PDF
    E-commerce platforms provide their customers with ranked lists of recommended items matching the customers' preferences. Merchants on e-commerce platforms would like their items to appear as high as possible in the top-N of these ranked lists. In this paper, we demonstrate how unscrupulous merchants can create item images that artificially promote their products, improving their rankings. Recommender systems that use images to address the cold start problem are vulnerable to this security risk. We describe a new type of attack, Adversarial Item Promotion (AIP), that strikes directly at the core of Top-N recommenders: the ranking mechanism itself. Existing work on adversarial images in recommender systems investigates the implications of conventional attacks, which target deep learning classifiers. In contrast, our AIP attacks are embedding attacks that seek to push features representations in a way that fools the ranker (not a classifier) and directly lead to item promotion. We introduce three AIP attacks insider attack, expert attack, and semantic attack, which are defined with respect to three successively more realistic attack models. Our experiments evaluate the danger of these attacks when mounted against three representative visually-aware recommender algorithms in a framework that uses images to address cold start. We also evaluate two common defenses against adversarial images in the classification scenario and show that these simple defenses do not eliminate the danger of AIP attacks. In sum, we show that using images to address cold start opens recommender systems to potential threats with clear practical implications. To facilitate future research, we release an implementation of our attacks and defenses, which allows reproduction and extension.Comment: Our code is available at https://github.com/liuzrcc/AI
    • …
    corecore