3,284 research outputs found

    Detecting Flow Anomalies in Distributed Systems

    Get PDF
    Deep within the networks of distributed systems, one often finds anomalies that affect their efficiency and performance. These anomalies are difficult to detect because the distributed systems may not have sufficient sensors to monitor the flow of traffic within the interconnected nodes of the networks. Without early detection and making corrections, these anomalies may aggravate over time and could possibly cause disastrous outcomes in the system in the unforeseeable future. Using only coarse-grained information from the two end points of network flows, we propose a network transmission model and a localization algorithm, to detect the location of anomalies and rank them using a proposed metric within distributed systems. We evaluate our approach on passengers' records of an urbanized city's public transportation system and correlate our findings with passengers' postings on social media microblogs. Our experiments show that the metric derived using our localization algorithm gives a better ranking of anomalies as compared to standard deviation measures from statistical models. Our case studies also demonstrate that transportation events reported in social media microblogs matches the locations of our detect anomalies, suggesting that our algorithm performs well in locating the anomalies within distributed systems

    Learning structure and schemas from heterogeneous domains in networked systems: a survey

    Get PDF
    The rapidly growing amount of available digital documents of various formats and the possibility to access these through internet-based technologies in distributed environments, have led to the necessity to develop solid methods to properly organize and structure documents in large digital libraries and repositories. Specifically, the extremely large size of document collections make it impossible to manually organize such documents. Additionally, most of the document sexist in an unstructured form and do not follow any schemas. Therefore, research efforts in this direction are being dedicated to automatically infer structure and schemas. This is essential in order to better organize huge collections as well as to effectively and efficiently retrieve documents in heterogeneous domains in networked system. This paper presents a survey of the state-of-the-art methods for inferring structure from documents and schemas in networked environments. The survey is organized around the most important application domains, namely, bio-informatics, sensor networks, social networks, P2Psystems, automation and control, transportation and privacy preserving for which we analyze the recent developments on dealing with unstructured data in such domains.Peer ReviewedPostprint (published version

    Community-based Outlier Detection for Edge-attributed Graphs

    Get PDF
    The study of networks has emerged in diverse disciplines as a means of analyzing complex relationship data. Beyond graph analysis tasks like graph query processing, link analysis, influence propagation, there has recently been some work in the area of outlier detection for information network data. Although various kinds of outliers have been studied for graph data, there is not much work on anomaly detection from edge-attributed graphs. In this paper, we introduce a method that detects novel outlier graph nodes by taking into account the node data and edge data simultaneously to detect anomalies. We model the problem as a community detection task, where outliers form a separate community. We propose a method that uses a probabilistic graph model (Hidden Markov Random Field) for joint modeling of nodes and edges in the network to compute Holistic Community Outliers (HCOutliers). Thus, our model presents a natural setting for heterogeneous graphs that have multiple edges/relationships between two nodes. EM (Expectation Maximization) is used to learn model parameters, and infer hidden community labels. Experimental results on synthetic datasets and the DBLP dataset show the effectiveness of our approach for finding novel outliers from networks

    Compromised user credentials detection in a digital enterprise using behavioral analytics

    Get PDF
    © 2018 In today\u27s digital age, the digital transformation is necessary for almost every competitive enterprise in terms of having access to the best resources and ensuring customer satisfaction. However, due to such rewards, these enterprises are facing key concerns around the risk of next-generation data security or cybercrime which is continually increasing issue due to the digital transformation four essential pillars—cloud computing, big data analytics, social and mobile computing. Data transformation-driven enterprises should ready to handle this next-generation data security problem, in particular, the compromised user credential (CUC). When an intruder or cybercriminal develops trust relationships as a legitimate account holder and then gain privileged access to the system for misuse. Many state-of-the-art risk mitigation tools are being developed, such as encrypted and secure password policy, authentication, and authorization mechanism. However, the CUC has become more complex and increasingly critical to the digital transformation process of the enterprise\u27s database by a cybercriminal, we propose a novel technique that effectively detects CUC at the enterprise-level. The proposed technique is learning from the user\u27s behavior and builds a knowledge base system (KBS) which observe changes in the user\u27s operational behavior. For that reason, a series of experiments were carried out on the dataset that collected from a sensitive database. All empirical results are validated through well-known evaluation measures, such as (i) accuracy, (ii) sensitivity, (iii) specificity, (iv) prudence accuracy, (v) precision, (vi) f-measure, and (vii) error rate. The experiments show that the proposed approach obtained weighted accuracy up to 99% and overall error of about 1%. The results clearly demonstrate that the proposed model efficiently can detect CUC which may keep an organization safe from major damage in data through cyber-attacks

    Laplacian Change Point Detection for Dynamic Graphs

    Full text link
    Dynamic and temporal graphs are rich data structures that are used to model complex relationships between entities over time. In particular, anomaly detection in temporal graphs is crucial for many real world applications such as intrusion identification in network systems, detection of ecosystem disturbances and detection of epidemic outbreaks. In this paper, we focus on change point detection in dynamic graphs and address two main challenges associated with this problem: I) how to compare graph snapshots across time, II) how to capture temporal dependencies. To solve the above challenges, we propose Laplacian Anomaly Detection (LAD) which uses the spectrum of the Laplacian matrix of the graph structure at each snapshot to obtain low dimensional embeddings. LAD explicitly models short term and long term dependencies by applying two sliding windows. In synthetic experiments, LAD outperforms the state-of-the-art method. We also evaluate our method on three real dynamic networks: UCI message network, US senate co-sponsorship network and Canadian bill voting network. In all three datasets, we demonstrate that our method can more effectively identify anomalous time points according to significant real world events.Comment: in KDD 2020, 10 page
    corecore