Sparse Inverse Covariance Estimation for Chordal Structures
In this paper, we consider the Graphical Lasso (GL), a popular optimization
problem for learning the sparse representations of high-dimensional datasets,
which is well-known to be computationally expensive for large-scale problems.
Recently, we have shown that the sparsity pattern of the optimal solution of GL
is equivalent to the one obtained from simply thresholding the sample
covariance matrix, for sparse graphs under different conditions. We have also
derived a closed-form solution that is optimal when the thresholded sample
covariance matrix has an acyclic structure. As a major generalization of the
previous result, in this paper we derive a closed-form solution for the GL for
graphs with chordal structures. We show that the GL and thresholding
equivalence conditions can be significantly simplified and are expected to hold
for high-dimensional problems if the thresholded sample covariance matrix has a
chordal structure. We then show that the GL and thresholding equivalence is
enough to reduce the GL to a maximum determinant matrix completion problem and
derive a recursive closed-form solution for the GL when the thresholded sample
covariance matrix has a chordal structure. For large-scale problems with up to
450 million variables, the proposed method can solve the GL problem in less
than 2 minutes, while the state-of-the-art methods converge in more than 2
hours.
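The two preconditions the abstract relies on, thresholding the sample covariance matrix and checking that the resulting graph is chordal, can be sketched as follows. This is only an illustrative sketch, not the paper's method: the function names `threshold_support` and `is_chordal`, the threshold `lam`, and the naive simplicial-vertex elimination test are all assumptions made here for clarity, and the closed-form solution itself is not reproduced.

```python
from itertools import combinations


def threshold_support(S, lam):
    """Return edges (i, j), i < j, whose off-diagonal entry exceeds lam in magnitude.

    S is a square sample covariance matrix given as a list of lists
    (hypothetical input format chosen for this sketch).
    """
    n = len(S)
    return {(i, j) for i in range(n) for j in range(i + 1, n) if abs(S[i][j]) > lam}


def is_chordal(n, edges):
    """Naive chordality test: repeatedly eliminate a simplicial vertex.

    A vertex is simplicial if its remaining neighbours form a clique.
    A graph is chordal exactly when every vertex can be eliminated this way.
    """
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    remaining = set(range(n))
    while remaining:
        simplicial = next(
            (
                v
                for v in remaining
                if all(b in adj[a] for a, b in combinations(adj[v] & remaining, 2))
            ),
            None,
        )
        if simplicial is None:
            return False  # no simplicial vertex left, so a chordless cycle exists
        remaining.remove(simplicial)
    return True


# Toy 4x4 covariance whose large entries form a 4-cycle (made-up numbers).
S = [
    [1.00, 0.50, 0.05, 0.50],
    [0.50, 1.00, 0.50, 0.02],
    [0.05, 0.50, 1.00, 0.50],
    [0.50, 0.02, 0.50, 1.00],
]
cycle_edges = threshold_support(S, 0.1)
dense_edges = threshold_support(S, 0.01)
```

With the larger threshold the support is a chordless 4-cycle, so the chordal case does not apply; with the smaller threshold the support is a complete graph, which is trivially chordal.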
Multitask Learning for Network Traffic Classification
Traffic classification has various applications in today's Internet, from
resource allocation, billing and QoS purposes in ISPs to firewall and malware
detection in clients. Classical machine learning algorithms and deep learning
models have been widely used to solve the traffic classification task. However,
training such models requires a large amount of labeled data. Labeling data is
often the most difficult and time-consuming process in building a classifier.
To address this challenge, we reformulate traffic classification as a
multi-task learning framework where bandwidth requirement and duration of a
flow are predicted along with the traffic class. The motivation of this
approach is twofold: First, bandwidth requirement and duration are useful in
many applications, including routing, resource allocation, and QoS
provisioning. Second, these two values can be obtained from each flow easily
without the need for human labeling or capturing flows in a controlled and
isolated environment. We show that with a large amount of easily obtainable
data samples for bandwidth and duration prediction tasks, and only a few data
samples for the traffic classification task, one can achieve high accuracy. We
conduct two experiments with the ISCX and QUIC public datasets and show the efficacy
of our approach.
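To illustrate why the two auxiliary targets need no human labeling, the sketch below derives duration and average bandwidth directly from a flow's packet records. The input format (a list of `(timestamp_seconds, size_bytes)` pairs) and the function name `flow_labels` are assumptions made for this example; the abstract does not specify how the authors compute these values.

```python
def flow_labels(packets):
    """Derive (duration_seconds, bandwidth_bytes_per_second) from one flow.

    packets: list of (timestamp_seconds, size_bytes) tuples, a hypothetical
    representation chosen for this sketch.
    Bandwidth is None when the flow has zero duration (e.g. a single packet).
    """
    if not packets:
        raise ValueError("a flow needs at least one packet")
    times = [t for t, _ in packets]
    duration = max(times) - min(times)
    total_bytes = sum(size for _, size in packets)
    bandwidth = total_bytes / duration if duration > 0 else None
    return duration, bandwidth


# Three packets over two seconds, 3000 bytes in total (made-up values).
duration, bandwidth = flow_labels([(0.0, 500), (1.0, 1500), (2.0, 1000)])
```

Both values come straight from the capture, so every observed flow yields training targets for the auxiliary tasks even when its traffic class is unknown.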
A path layer for the internet : enabling network operations on encrypted protocols
The deployment of encrypted transport protocols imposes new challenges for network operations. Key in-network functions such as those implemented by firewalls and passive measurement devices currently rely on information exposed by the transport layer. Encryption, in addition to improving privacy, helps to address ossification of network protocols caused by middleboxes that assume certain information to be present in the clear. However, “encrypting it all” risks diminishing the utility of these middleboxes for the traffic management tasks for which they were designed. A middlebox cannot use what it cannot see.
We propose an architectural solution to this issue by introducing a new “path layer” for transport-independent, in-band signaling between Internet endpoints and network elements on the paths between them, and by using this layer to reinforce the boundary between the hop-by-hop network layer and the end-to-end transport layer. We define a path layer header on top of UDP to provide a common wire image for new, encrypted transports. This path layer header provides information to a transport-independent on-path state machine that replaces stateful handling currently based on exposed header flags and fields in TCP; it enables explicit measurability of transport layer performance; and it offers extensibility through sender-to-path and path-to-receiver communications for diagnostics and management. This not only provides a replacement for signals that are not available with encrypted traffic, but also allows integrity-protected, enhanced signaling under endpoint control. We present an implementation of this wire image integrated with the QUIC protocol, as well as a basic stateful middlebox built on Vector Packet Processing (VPP) provided by FD.io.
Learning Sparse Gaussian Graphical Model with l0-regularization
For the problem of learning sparse Gaussian graphical models, it is desirable to obtain both sparse structures and good parameter estimates. Classical techniques, such as optimizing the l1-regularized maximum likelihood or the Chow-Liu algorithm, either focus on parameter estimation or are constrained to a specific structure. This paper proposes an alternative that is based on l0-regularized maximum likelihood and employs a greedy algorithm to solve the optimization problem. We show that, when the graph is acyclic, the greedy solution finds the optimal acyclic graph. We also show that it can update the parameters in constant time when connecting two sub-components, and thus works efficiently on sparse graphs. Empirical results are provided to demonstrate that this new algorithm can learn sparse structures with cycles efficiently and that it dominates the l1-regularized approach on graph likelihood. ARO MURI grant W911NF-11-1-0391
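The acyclic case can be illustrated with a Chow-Liu-style greedy sketch. It is not the paper's algorithm: it assumes a Gaussian model in which adding edge (i, j) between two unconnected components raises the maximized log-likelihood by -(n/2) log(1 - r_ij^2), where r_ij is the sample correlation, and charges a fixed l0 penalty per edge. The function name `greedy_forest` and the parameters `n_samples` and `penalty` are hypothetical.

```python
import math


def greedy_forest(corr, n_samples, penalty):
    """Greedily pick edges with positive penalized likelihood gain, avoiding cycles.

    corr: sample correlation matrix as a list of lists, with |corr[i][j]| < 1
    off the diagonal. Returns the chosen edges in the order they were added.
    """
    p = len(corr)
    parent = list(range(p))  # union-find forest over the variables

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    candidates = []
    for i in range(p):
        for j in range(i + 1, p):
            gain = -0.5 * n_samples * math.log(1.0 - corr[i][j] ** 2) - penalty
            if gain > 0:  # an edge must pay for its l0 penalty
                candidates.append((gain, i, j))
    candidates.sort(reverse=True)

    edges = []
    for _, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # only connect two separate sub-components
            parent[ri] = rj
            edges.append((i, j))
    return edges


# Made-up 3-variable correlation matrix.
corr = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]
forest = greedy_forest(corr, n_samples=100, penalty=5.0)
```

Because only edges with positive penalized gain are considered and they are taken in decreasing order, this is Kruskal's procedure for a maximum-weight forest; extending the idea to graphs with cycles, as the abstract describes, requires parameter updates that this sketch does not attempt.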