Properties of Healthcare Teaming Networks as a Function of Network Construction Algorithms
Network models of healthcare systems can be used to examine how providers
collaborate, communicate, and refer patients to each other. Most healthcare service
network models have been constructed from patient claims data, using billing
claims to link patients with providers. The data sets can be quite large,
making standard methods for network construction computationally challenging
and thus requiring the use of alternate construction algorithms. While these
alternate methods have seen increasing use in generating healthcare networks,
there is little to no literature comparing the differences in the structural
properties of the generated networks. To address this issue, we compared the
properties of healthcare networks constructed using different algorithms and
the 2013 Medicare Part B outpatient claims data. Three different algorithms
were compared: binning, sliding frame, and trace-route. Unipartite networks
linking either providers or healthcare organizations by shared patients were
built using each method. We found that each algorithm produced networks with
substantially different topological properties. Provider networks adhered to a
power law, and organization networks to a power law with exponential cutoff.
Censoring networks to exclude edges with fewer than 11 shared patients, a common
de-identification practice for healthcare network data, markedly reduced edge
numbers and greatly altered measures of vertex prominence such as the
betweenness centrality. We identified patterns in the distance patients travel
between network providers, and most strikingly between providers in the
Northeast United States and Florida. We conclude that the choice of network
construction algorithm is critical for healthcare network analysis, and discuss
the implications for selecting the algorithm best suited to the type of
analysis to be performed.
Comment: With links to comprehensive, high-resolution figures and networks via
figshare.co
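
As a rough illustration of the shared-patient linkage and the censoring step described in this abstract, the Python sketch below projects a toy list of (patient, provider) claims onto a provider-provider network and drops edges with fewer than 11 shared patients. It is a plain co-occurrence projection, not the binning, sliding frame, or trace-route algorithms compared in the paper. The function names, the input format, and the toy data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_patient_edges(claims):
    """Count shared patients for every provider pair.

    claims: iterable of (patient_id, provider_id) pairs.
            This input format is an assumption made for illustration.
    Returns a dict mapping a sorted provider pair to its shared-patient count.
    """
    providers_by_patient = defaultdict(set)
    for patient, provider in claims:
        providers_by_patient[patient].add(provider)

    weights = defaultdict(int)
    for providers in providers_by_patient.values():
        # Every pair of providers seen by the same patient shares that patient.
        for a, b in combinations(sorted(providers), 2):
            weights[(a, b)] += 1
    return dict(weights)


def censor(edges, min_shared=11):
    """Keep only edges with at least min_shared shared patients."""
    return {pair: w for pair, w in edges.items() if w >= min_shared}


# Toy data: providers A and B share 12 patients; C shares 3 with each of them.
claims = (
    [(p, "A") for p in range(12)]
    + [(p, "B") for p in range(12)]
    + [(p, "C") for p in range(3)]
)
edges = shared_patient_edges(claims)
censored = censor(edges)
```

In this toy case censoring removes two of the three edges, which mirrors in miniature how the threshold can reduce edge counts and change which vertices appear prominent.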
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have evolved over the last
decade in response to the development of malware techniques that evade
network-based and host-based security protections. The fast growth in the variety
and number of malware species has made it very difficult for forensic
investigators to provide a timely response. Therefore, Machine Learning (ML)
aided malware analysis has become a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classification of static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop a taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be utilized in the extraction and
analysis of a variety of static characteristics of PE binaries and evaluate
the accuracy and practical generalization of these techniques. Finally, the results
of an experimental study of all the methods using common data are given to
demonstrate their accuracy and complexity. This paper may serve as a stepping
stone for future researchers in the cross-disciplinary field of machine learning
aided malware forensics.
Comment: 37 Page
Symmetry degree measurement and its applications to anomaly detection
Anomaly detection is an important technique used to identify patterns of unusual network behavior and keep the network under control. Today, network attacks are increasing in terms of both their number and sophistication. To avoid causing significant traffic patterns and being detected by existing techniques, many new attacks tend to involve gradual adjustment of behaviors, which always generate incomplete sessions due to their running mechanisms. Accordingly, in this work, we employ the behavior symmetry degree to profile the anomalies and further identify unusual behaviors. We first propose a symmetry degree to identify the incomplete sessions generated by unusual behaviors; we then employ a sketch to calculate the symmetry degree of internal hosts to improve the identification efficiency for online applications. To reduce the memory cost and probability of collision, we divide the IP addresses into four segments that can be used as keys of the hash functions in the sketch. Moreover, to further improve detection accuracy, a threshold selection method is proposed for dynamic traffic pattern analysis. The hash functions in the sketch are then designed using Chinese remainder theory, which can analytically trace the IP addresses associated with the anomalies. We tested the proposed techniques on traffic data collected from the northwest center of CERNET (China Education and Research Network); the results show that the proposed methods can effectively detect anomalies in large-scale networks.
Dynamic Circular Network-Based Federated Dual-View Learning for Multivariate Time Series Anomaly Detection
Multivariate time-series data exhibit intricate correlations in both temporal and spatial dimensions. However, existing network architectures often overlook dependencies in the spatial dimension and struggle to strike a balance between long-term and short-term patterns when extracting features from the data. Furthermore, industries within the business community are hesitant to share their raw data, which hinders anomaly prediction accuracy and detection performance. To address these challenges, the authors propose a dynamic circular network-based federated dual-view learning approach. Experimental results from four open-source datasets demonstrate that the method outperforms existing methods in terms of accuracy, recall, and F1 score for anomaly detection.
Network Sampling: From Static to Streaming Graphs
Network sampling is integral to the analysis of social, information, and
biological networks. Since many real-world networks are massive in size,
continuously evolving, and/or distributed in nature, the network structure is
often sampled in order to facilitate study. For these reasons, a more thorough
and complete understanding of network sampling is critical to support the field
of network science. In this paper, we outline a framework for the general
problem of network sampling, by highlighting the different objectives,
population and units of interest, and classes of network sampling methods. In
addition, we propose a spectrum of computational models for network sampling
methods, ranging from the traditionally studied model based on the assumption
of a static domain to a more challenging model that is appropriate for
streaming domains. We design a family of sampling methods based on the concept
of graph induction that generalize across the full spectrum of computational
models (from static to streaming) while efficiently preserving many of the
topological properties of the input graphs. Furthermore, we demonstrate how
traditional static sampling algorithms can be modified for graph streams for
each of the three main classes of sampling methods: node, edge, and
topology-based sampling. Our experimental results indicate that our proposed
family of sampling methods more accurately preserves the underlying properties
of the graph for both static and streaming graphs. Finally, we study the impact
of network sampling algorithms on the parameter estimation and performance
evaluation of relational classification algorithms.
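
The idea of pairing edge sampling with graph induction can be sketched for the static case as follows. This is an assumption about the general shape of such a method, not the authors' algorithm: edges are drawn uniformly at random, and then every edge of the input graph whose two endpoints were both sampled is added back. The function name, parameters, and toy graph are hypothetical.

```python
import random


def edge_sample_with_induction(edges, num_sampled_edges, seed=0):
    """Sample edges uniformly, then induce the subgraph on their endpoints.

    edges: iterable of (u, v) tuples describing an undirected graph.
    Returns (sampled_nodes, induced_edges).
    """
    rng = random.Random(seed)
    edges = list(edges)

    # Step 1: pick edges uniformly at random without replacement.
    picked = rng.sample(edges, min(num_sampled_edges, len(edges)))

    # Step 2: collect the endpoints of the picked edges.
    nodes = {endpoint for edge in picked for endpoint in edge}

    # Step 3: graph induction, keeping every original edge between sampled nodes.
    induced = [(u, v) for (u, v) in edges if u in nodes and v in nodes]
    return nodes, induced


# Toy graph: a triangle (1, 2, 3) with a tail 3-4-5-6.
graph = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5), (5, 6)]
nodes, induced = edge_sample_with_induction(graph, 2)
```

The induction step is what can recover edges that uniform edge sampling alone would miss. A streaming variant would have to approximate it in a single pass, which this static sketch does not attempt.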
Control of transport dynamics in overlay networks
Transport control is an important factor in the performance of Internet protocols, particularly in next-generation network applications involving computational steering, interactive visualization, instrument control, and transfer of large data sets. The widely deployed Transmission Control Protocol is inadequate for these tasks due to its performance drawbacks. The purpose of this dissertation is to conduct a rigorous analytical study on the design and performance of transport protocols, and systematically develop a new class of protocols to overcome the limitations of current methods. Various sources of randomness exist in network performance measurements due to the stochastic nature of network traffic. We propose a new class of transport protocols that explicitly accounts for the randomness based on dynamic stochastic approximation methods. These protocols use congestion window and idle time to dynamically control the source rate to achieve transport objectives. We conduct statistical analyses to determine the main effects of these two control parameters and their interaction effects. The application of stochastic approximation methods enables us to show the analytical stability of the transport protocols and avoid pre-selecting the flow and congestion control parameters. These new protocols are successfully applied to transport control for both goodput stabilization and maximization. The experimental results show superior performance compared to current methods, particularly for Internet applications. To effectively deploy these protocols over the Internet, we develop an overlay network, which resides at the application level to provide data transmission service using User Datagram Protocol. The overlay network, together with the new protocols based on User Datagram Protocol, provides an effective environment for implementing transport control using application-level modules.
We also study problems in overlay networks such as path bandwidth estimation and multiple quickest path computation. In wireless networks, most packet losses are caused by physical signal losses and do not necessarily indicate network congestion. Furthermore, the physical link connectivity in ad-hoc networks deployed in unstructured areas is unpredictable. We develop the Connectivity-Through-Time protocols that exploit the node movements to deliver data under dynamic connectivity. We integrate these protocols into overlay networks and present experimental results using the network to support a team of mobile robots.
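
Goodput stabilization by stochastic approximation can be illustrated with a Robbins-Monro style update. This is only a sketch under strong assumptions and does not reproduce the dissertation's protocols: it adjusts a single sending rate directly rather than a congestion window and idle time, the simulated channel (goodput equals the sending rate plus Gaussian noise) is invented, and the step-size schedule and all names are hypothetical.

```python
import random


def stabilize_goodput(target, measure, rate=1.0, steps=2000, gain=1.0):
    """Drive noisy goodput measurements toward a target level.

    target:  desired goodput.
    measure: callable taking the current rate and returning a noisy goodput sample.
    Uses decreasing step sizes gain / n, as in Robbins-Monro stochastic approximation.
    """
    for n in range(1, steps + 1):
        observed = measure(rate)
        # Move against the observed error; shrinking steps average out the noise.
        rate -= (gain / n) * (observed - target)
        rate = max(rate, 0.0)  # a sending rate cannot be negative
    return rate


# Hypothetical channel: goodput tracks the sending rate with additive noise.
rng = random.Random(42)


def noisy_channel(rate):
    return rate + rng.gauss(0.0, 0.5)


final_rate = stabilize_goodput(target=10.0, measure=noisy_channel)
```

Because the step size decays, no fixed control gain has to be tuned in advance for the noise level, which is in the spirit of the abstract's remark about avoiding pre-selected flow and congestion control parameters.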