Networkmetrics unraveled: MBDA in Action
We propose networkmetrics, a new data-driven approach for monitoring, troubleshooting and understanding communication networks using multivariate analysis. Networkmetric models are powerful machine-learning tools to interpret and interact with data collected from a network. In this paper, we illustrate the application of Multivariate Big Data Analysis (MBDA), a recently proposed networkmetric method with application to Big Data sets. We use MBDA for the detection and troubleshooting of network problems in a campus-wide Wi-Fi network. Data includes a seven-year trace (from 2012 to 2018) of the network’s most recent activity, with approximately 3,000 distinct access points, 40,000 authenticated users, and 600,000 distinct Wi-Fi stations. This is the longest and largest Wi-Fi trace known to date. To analyze this data, we propose learning and visualization procedures that extend MBDA. These procedures result in a methodology that allows network analysts to identify, diagnose and troubleshoot problems, optimizing network performance. In the paper, we go through the entire workflow of the approach, illustrating its application in detail and discussing processing times for parallel hardware.
Interpretable Learning in Multivariate Big Data Analysis for Network Monitoring
There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows us to detect and diagnose disparate network anomalies, with a data-analysis workflow that combines the advantages of interpretable and interactive models with the power of parallel processing. We apply the extended MBDA to two case studies: UGR’16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth’18, the longest and largest Wi-Fi trace known to date.
Joint Tensor Factorization and Outlying Slab Suppression with Applications
We consider factoring low-rank tensors in the presence of outlying slabs.
This problem is important in practice, because data collected in many
real-world applications, such as speech, fluorescence, and some social network
data, fit this paradigm. Prior work tackles this problem by iteratively
selecting a fixed number of slabs and fitting, a procedure which may not
converge. We formulate this problem from a group-sparsity promoting point of
view, and propose an alternating optimization framework to handle the
corresponding $\ell_p$ ($0 < p \leq 1$) minimization-based low-rank tensor
factorization problem. The proposed algorithm features a similar per-iteration
complexity as the plain trilinear alternating least squares (TALS) algorithm.
Convergence of the proposed algorithm is also easy to analyze under the
framework of alternating optimization and its variants. In addition,
regularization and constraints can be easily incorporated to make use of
a priori information on the latent loading factors. Simulations and real
data experiments on blind speech separation, fluorescence data analysis, and
social network mining are used to showcase the effectiveness of the proposed
algorithm.
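The group-sparsity idea in this abstract can be illustrated with a small sketch. It is not the authors' algorithm. It shows only the reweighting step of a standard iteratively reweighted least-squares scheme, under the assumption that the penalty is the sum over slabs of (||R_i||_F^2 + eps)^(p/2) with 0 < p <= 1, where R_i is the residual of slab i under the current low-rank fit. The function name `slab_weights`, the smoothing constant `eps`, and the toy residuals are all hypothetical.

```python
def slab_weights(residual_slabs, p=0.5, eps=1e-8):
    """Return one nonnegative weight per slab.

    Each weight is the derivative of (s + eps)**(p/2) with respect to the
    squared Frobenius norm s of the slab residual, which is the usual
    majorization weight for a smoothed l_p-type group penalty (assumption).
    Slabs with large residuals receive small weights.
    """
    weights = []
    for slab in residual_slabs:
        sq_norm = sum(v * v for row in slab for v in row)
        weights.append((p / 2.0) * (sq_norm + eps) ** ((p - 2.0) / 2.0))
    return weights


# Toy residuals for three 2x2 slabs; the third one is an outlier.
residuals = [
    [[0.1, -0.1], [0.0, 0.1]],
    [[0.1, 0.0], [-0.1, 0.1]],
    [[5.0, -4.0], [3.0, 6.0]],
]
w = slab_weights(residuals)
```

In a full alternating scheme, these weights would multiply the per-slab least-squares terms in the next factor update, so a slab that the low-rank model fits poorly contributes little to the fit.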
Fused Adjacency Matrices to enhance information extraction: the beer benchmark
Multivariate exploratory data analysis allows revealing patterns and extracting information from
complex multivariate data sets. However, highly complex data may not show evident groupings or
trends in the principal component space, e.g. because the variation of the variables is not grouped
but rather continuous. In these cases, classical exploratory methods may not provide satisfactory
results when the aim is to find distinct groupings in the data.
To enhance information extraction in such situations, we propose a novel approach inspired by the
concept of combining weak classifiers, but in the unsupervised context. The approach is based on
the fusion of several adjacency matrices obtained by different distance measures on data from
different analytical platforms. This paper is intended to present and discuss the potential of the
approach through a benchmark data set of beer samples. The beer data were acquired using three
spectroscopic techniques: Visible, near-Infrared and Nuclear Magnetic Resonance.
The results of fusing the three data sets via the proposed approach are compared with those from the
single data blocks (Visible, NIR and NMR) and from a standard mid-level data fusion methodology.
It is shown that, with the suggested approach, groupings related to beer style and other features are
efficiently recovered, and generally more evident.
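A minimal sketch of fusing adjacency matrices across data blocks and distance measures follows. The abstract does not say how the adjacency matrices are built or combined, so the k-nearest-neighbour construction, the element-wise sum used for fusion, the two distance functions, and the toy data are all assumptions made for illustration.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_adjacency(block, dist, k):
    """Symmetric 0/1 adjacency: i and j are linked if either is among
    the other's k nearest neighbours under the given distance."""
    n = len(block)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(block[i], block[j]))
        for j in others[:k]:
            adj[i][j] = 1
            adj[j][i] = 1
    return adj


def fuse(matrices):
    """Element-wise sum: each entry counts how many (block, distance)
    combinations consider the two samples to be neighbours."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)]
            for i in range(n)]


# Two hypothetical data blocks measured on the same four samples.
block_a = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
block_b = [[1.0], [1.2], [9.0], [9.1]]

matrices = [knn_adjacency(block, dist, k=1)
            for block in (block_a, block_b)
            for dist in (euclidean, manhattan)]
fused = fuse(matrices)
```

High entries in `fused` mark sample pairs that are neighbours under most block and distance combinations. A grouping can then be read from the fused matrix, for example by thresholding it or by passing it to a graph-clustering method.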