6,102 research outputs found

    Properties of Healthcare Teaming Networks as a Function of Network Construction Algorithms

    Full text link
    Network models of healthcare systems can be used to examine how providers collaborate, communicate, refer patients to each other. Most healthcare service network models have been constructed from patient claims data, using billing claims to link patients with providers. The data sets can be quite large, making standard methods for network construction computationally challenging and thus requiring the use of alternate construction algorithms. While these alternate methods have seen increasing use in generating healthcare networks, there is little to no literature comparing the differences in the structural properties of the generated networks. To address this issue, we compared the properties of healthcare networks constructed using different algorithms and the 2013 Medicare Part B outpatient claims data. Three different algorithms were compared: binning, sliding frame, and trace-route. Unipartite networks linking either providers or healthcare organizations by shared patients were built using each method. We found that each algorithm produced networks with substantially different topological properties. Provider networks adhered to a power law, and organization networks to a power law with exponential cutoff. Censoring networks to exclude edges with less than 11 shared patients, a common de-identification practice for healthcare network data, markedly reduced edge numbers and greatly altered measures of vertex prominence such as the betweenness centrality. We identified patterns in the distance patients travel between network providers, and most strikingly between providers in the Northeast United States and Florida. We conclude that the choice of network construction algorithm is critical for healthcare network analysis, and discuss the implications for selecting the algorithm best suited to the type of analysis to be performed.Comment: With links to comprehensive, high resolution figures and networks via figshare.co

    Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

    Full text link
    Malware analysis and detection techniques have been evolving during the last decade as a reflection to development of different malware techniques to evade network-based and host-based security protections. The fast growth in variety and number of malware species made it very difficult for forensics investigators to provide an on time response. Therefore, Machine Learning (ML) aided malware analysis became a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware analysis that has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in extraction and analysis of a variety of static characteristic of PE binaries and evaluate accuracy and practical generalization of these techniques. Finally, the results of experimental study of all the method using common data was given to demonstrate the accuracy and complexity. This paper may serve as a stepping stone for future researchers in cross-disciplinary field of machine learning aided malware forensics.Comment: 37 Page

    Symmetry degree measurement and its applications to anomaly detection

    Get PDF
    IEEE Anomaly detection is an important technique used to identify patterns of unusual network behavior and keep the network under control. Today, network attacks are increasing in terms of both their number and sophistication. To avoid causing significant traffic patterns and being detected by existing techniques, many new attacks tend to involve gradual adjustment of behaviors, which always generate incomplete sessions due to their running mechanisms. Accordingly, in this work, we employ the behavior symmetry degree to profile the anomalies and further identify unusual behaviors. We first proposed a symmetry degree to identify the incomplete sessions generated by unusual behaviors; we then employ a sketch to calculate the symmetry degree of internal hosts to improve the identification efficiency for online applications. To reduce the memory cost and probability of collision, we divide the IP addresses into four segments that can be used as keys of the hash functions in the sketch. Moreover, to further improve detection accuracy, a threshold selection method is proposed for dynamic traffic pattern analysis. The hash functions in the sketch are then designed using Chinese remainder theory, which can analytically trace the IP addresses associated with the anomalies. We tested the proposed techniques based on traffic data collected from the northwest center of CERNET (China Education and Research Network); the results show that the proposed methods can effectively detect anomalies in large-scale networks

    Dynamic Circular Network-Based Federated Dual-View Learning for Multivariate Time Series Anomaly Detection

    Get PDF
    Multivariate time-series data exhibit intricate correlations in both temporal and spatial dimensions. However, existing network architectures often overlook dependencies in the spatial dimension and struggle to strike a balance between long-term and short-term patterns when extracting features from the data. Furthermore, industries within the business community are hesitant to share their raw data, which hinders anomaly prediction accuracy and detection performance. To address these challenges, the authors propose a dynamic circular network-based federated dual-view learning approach. Experimental results from four open-source datasets demonstrate that the method outperforms existing methods in terms of accuracy, recall, and F1_score for anomaly detection

    Network Sampling: From Static to Streaming Graphs

    Full text link
    Network sampling is integral to the analysis of social, information, and biological networks. Since many real-world networks are massive in size, continuously evolving, and/or distributed in nature, the network structure is often sampled in order to facilitate study. For these reasons, a more thorough and complete understanding of network sampling is critical to support the field of network science. In this paper, we outline a framework for the general problem of network sampling, by highlighting the different objectives, population and units of interest, and classes of network sampling methods. In addition, we propose a spectrum of computational models for network sampling methods, ranging from the traditionally studied model based on the assumption of a static domain to a more challenging model that is appropriate for streaming domains. We design a family of sampling methods based on the concept of graph induction that generalize across the full spectrum of computational models (from static to streaming) while efficiently preserving many of the topological properties of the input graphs. Furthermore, we demonstrate how traditional static sampling algorithms can be modified for graph streams for each of the three main classes of sampling methods: node, edge, and topology-based sampling. Our experimental results indicate that our proposed family of sampling methods more accurately preserves the underlying properties of the graph for both static and streaming graphs. Finally, we study the impact of network sampling algorithms on the parameter estimation and performance evaluation of relational classification algorithms

    Control of transport dynamics in overlay networks

    Get PDF
    Transport control is an important factor in the performance of Internet protocols, particularly in the next generation network applications involving computational steering, interactive visualization, instrument control, and transfer of large data sets. The widely deployed Transport Control Protocol is inadequate for these tasks due to its performance drawbacks. The purpose of this dissertation is to conduct a rigorous analytical study on the design and performance of transport protocols, and systematically develop a new class of protocols to overcome the limitations of current methods. Various sources of randomness exist in network performance measurements due to the stochastic nature of network traffic. We propose a new class of transport protocols that explicitly accounts for the randomness based on dynamic stochastic approximation methods. These protocols use congestion window and idle time to dynamically control the source rate to achieve transport objectives. We conduct statistical analyses to determine the main effects of these two control parameters and their interaction effects. The application of stochastic approximation methods enables us to show the analytical stability of the transport protocols and avoid pre-selecting the flow and congestion control parameters. These new protocols are successfully applied to transport control for both goodput stabilization and maximization. The experimental results show the superior performance compared to current methods particularly for Internet applications. To effectively deploy these protocols over the Internet, we develop an overlay network, which resides at the application level to provide data transmission service using User Datagram Protocol. The overlay network, together with the new protocols based on User Datagram Protocol, provides an effective environment for implementing transport control using application-level modules. We also study problems in overlay networks such as path bandwidth estimation and multiple quickest path computation. In wireless networks, most packet losses are caused by physical signal losses and do not necessarily indicate network congestion. Furthermore, the physical link connectivity in ad-hoc networks deployed in unstructured areas is unpredictable. We develop the Connectivity-Through-Time protocols that exploit the node movements to deliver data under dynamic connectivity. We integrate this protocol into overlay networks and present experimental results using network to support a team of mobile robots
    • …
    corecore