459 research outputs found

    Sparse Inverse Covariance Estimation for Chordal Structures

    Full text link
    In this paper, we consider the Graphical Lasso (GL), a popular optimization problem for learning the sparse representations of high-dimensional datasets, which is well-known to be computationally expensive for large-scale problems. Recently, we have shown that the sparsity pattern of the optimal solution of GL is equivalent to the one obtained from simply thresholding the sample covariance matrix, for sparse graphs under different conditions. We have also derived a closed-form solution that is optimal when the thresholded sample covariance matrix has an acyclic structure. As a major generalization of the previous result, in this paper we derive a closed-form solution for the GL for graphs with chordal structures. We show that the GL and thresholding equivalence conditions can significantly be simplified and are expected to hold for high-dimensional problems if the thresholded sample covariance matrix has a chordal structure. We then show that the GL and thresholding equivalence is enough to reduce the GL to a maximum determinant matrix completion problem and drive a recursive closed-form solution for the GL when the thresholded sample covariance matrix has a chordal structure. For large-scale problems with up to 450 million variables, the proposed method can solve the GL problem in less than 2 minutes, while the state-of-the-art methods converge in more than 2 hours

    Multitask Learning for Network Traffic Classification

    Full text link
    Traffic classification has various applications in today's Internet, from resource allocation, billing and QoS purposes in ISPs to firewall and malware detection in clients. Classical machine learning algorithms and deep learning models have been widely used to solve the traffic classification task. However, training such models requires a large amount of labeled data. Labeling data is often the most difficult and time-consuming process in building a classifier. To solve this challenge, we reformulate the traffic classification into a multi-task learning framework where bandwidth requirement and duration of a flow are predicted along with the traffic class. The motivation of this approach is twofold: First, bandwidth requirement and duration are useful in many applications, including routing, resource allocation, and QoS provisioning. Second, these two values can be obtained from each flow easily without the need for human labeling or capturing flows in a controlled and isolated environment. We show that with a large amount of easily obtainable data samples for bandwidth and duration prediction tasks, and only a few data samples for the traffic classification task, one can achieve high accuracy. We conduct two experiment with ISCX and QUIC public datasets and show the efficacy of our approach

    A path layer for the internet : enabling network operations on encrypted protocols

    Get PDF
    The deployment of encrypted transport protocols imposes new challenges for network operations. Key in-network functions such as those implemented by firewalls and passive measurement devices currently rely on information exposed by the transport layer. Encryption, in addition to improving privacy, helps to address ossification of network protocols caused by middleboxes that assume certain information to be present in the clear. However, “encrypting it all” risks diminishing the utility of these middleboxes for the traffic management tasks for which they were designed. A middlebox cannot use what it cannot see. We propose an architectural solution to this issue, by introducing a new “path layer” for transport-independent, in-band signaling between Internet endpoints and network elements on the paths between them, and using this layer to reinforce the boundary between the hop-by-hop network layer and the end-to- end transport layer. We define a path layer header on top of UDP to provide a common wire image for new, encrypted transports. This path layer header provides information to a transport- independent on-path state machine that replaces stateful handling currently based on exposed header flags and fields in TCP; it enables explicit measurability of transport layer performance; and offers extensibility by sender-to-path and path-to-receiver communications for diagnostics and management. This provides not only a replacement for signals that are not available with encrypted traffic, but also allows integrity-protected, enhanced signaling under endpoint control. We present an implementation of this wire image integrated with the QUIC protocol, as well as a basic stateful middlebox built on Vector Packet Processing (VPP) provided by FD.io

    Learning Sparse Gaussian Graphical Model with l0-regularization

    Get PDF
    For the problem of learning sparse Gaussian graphical models, it is desirable to obtain both sparse structures as well as good parameter estimates. Classical techniques, such as optimizing the l1-regularized maximum likelihood or Chow-Liu algorithm, either focus on parameter estimation or constrain to speci c structure. This paper proposes an alternative that is based on l0-regularized maximum likelihood and employs a greedy algorithm to solve the optimization problem. We show that, when the graph is acyclic, the greedy solution finds the optimal acyclic graph. We also show it can update the parameters in constant time when connecting two sub-components, thus work efficiently on sparse graphs. Empirical results are provided to demonstrate this new algorithm can learn sparse structures with cycles efficiently and that it dominates l1-regularized approach on graph likelihood.ARO MURI grant W911NF-11-1-0391
    corecore