SeLINA: a Self-Learning Insightful Network Analyzer
Understanding the behavior of a network from a large-scale traffic dataset is a challenging problem. Big data frameworks offer scalable algorithms to extract information from raw data, but often require sophisticated fine-tuning and detailed knowledge of machine learning algorithms. To streamline this process, we propose SeLINA (Self-Learning Insightful Network Analyzer), a generic, self-tuning, simple tool to extract knowledge from network traffic measurements. SeLINA includes different data analytics techniques that add self-learning capabilities to state-of-the-art scalable approaches, together with parameter auto-selection to relieve the network expert of parameter tuning. We combine unsupervised and supervised approaches to mine data in a scalable way. SeLINA embeds mechanisms to check whether new data fits the model, to detect possible changes in the traffic, and to trigger model rebuilding, possibly automatically. The result is a system that offers human-readable models of the data with minimal user intervention, supporting domain experts in extracting actionable knowledge and highlighting possibly meaningful interpretations. SeLINA's current implementation runs on Apache Spark. We tested it on large collections of real-world passive network measurements from a nationwide ISP, investigating YouTube and P2P traffic. The experimental results confirmed the ability of SeLINA to provide insights and detect changes in the data that suggest further analyses.
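The abstract does not say how SeLINA checks whether new data still fits the model, so the following is only an illustrative sketch under stated assumptions, not SeLINA's method. It assumes a clustering model summarized by centroids and flags a rebuild when too many new points lie far from every centroid. The names `needs_rebuild`, `radius`, and `max_outlier_fraction` are hypothetical.

```python
import math

def nearest_centroid_distance(point, centroids):
    """Euclidean distance from a point to its closest centroid."""
    return min(math.dist(point, c) for c in centroids)

def needs_rebuild(new_points, centroids, radius, max_outlier_fraction=0.2):
    """Return True when the share of new points farther than `radius`
    from every centroid exceeds `max_outlier_fraction`.
    Both thresholds are hypothetical tuning knobs."""
    if not new_points:
        return False
    outliers = sum(
        1 for p in new_points
        if nearest_centroid_distance(p, centroids) > radius
    )
    return outliers / len(new_points) > max_outlier_fraction

# Toy model: two cluster centroids learned from earlier traffic features.
centroids = [(0.0, 0.0), (10.0, 10.0)]

# A batch that still resembles the old clusters, and one that has shifted.
stable_batch = [(0.5, -0.2), (9.8, 10.1), (0.1, 0.3), (10.4, 9.7)]
shifted_batch = [(5.0, 5.0), (4.8, 5.3), (0.2, 0.1), (5.1, 4.9)]

stable_flag = needs_rebuild(stable_batch, centroids, radius=1.0)
shifted_flag = needs_rebuild(shifted_batch, centroids, radius=1.0)
```

In this toy setup every point of the stable batch lies within the radius of a centroid, while three of the four shifted points do not, so only the shifted batch would trigger a rebuild.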
Towards meta-learning for multi-target regression problems
Several multi-target regression methods were developed in the last years
aiming at improving predictive performance by exploring inter-target correlation
within the problem. However, none of these methods outperforms the others for
all problems. This motivates the development of automatic approaches to
recommend the most suitable multi-target regression method. In this paper, we
propose a meta-learning system to recommend the best predictive method for a
given multi-target regression problem. We performed experiments with a
meta-dataset generated from a total of 648 synthetic datasets. These datasets
were created to explore distinct inter-target characteristics toward
recommending the most promising method. In experiments, we evaluated four
different algorithms with different biases as meta-learners. Our meta-dataset
is composed of 58 meta-features based on statistical information, correlation
characteristics, linear landmarking, and the distribution and smoothness of
the data, and has four different meta-labels. Results showed that the induced
meta-models were able to recommend the best method for different base-level
datasets with a balanced accuracy above 70% using a Random Forest
meta-model, which statistically outperformed the meta-learning baselines.
Comment: To appear on the 8th Brazilian Conference on Intelligent Systems
(BRACIS
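The result above is reported as balanced accuracy, which is the mean of the per-class recalls and so does not reward a meta-model for always recommending the most frequent method. A minimal sketch of the metric follows. The meta-labels "A" and "B" and the predictions are made up for illustration and are not data from the paper.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

def accuracy(y_true, y_pred):
    """Plain fraction of correct predictions, for comparison."""
    hits = sum(1 for truth, pred in zip(y_true, y_pred) if truth == pred)
    return hits / len(y_true)

# Hypothetical imbalanced meta-labels: method "A" is best on 8 datasets,
# method "B" on 2. The meta-model gets all "A" cases and one "B" case right.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]

plain = accuracy(y_true, y_pred)              # 9 of 10 correct
balanced = balanced_accuracy(y_true, y_pred)  # mean of recalls 1.0 and 0.5
```

Here plain accuracy is 0.9, while balanced accuracy is 0.75, because the recall on the rarer label is only 0.5.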
Hyperparameter optimization in deep multi-target prediction
As a result of the ever-increasing complexity of configuring and fine-tuning
machine learning models, the field of automated machine learning (AutoML) has
emerged over the past decade. However, software implementations like Auto-WEKA
and Auto-sklearn typically focus on classical machine learning (ML) tasks such
as classification and regression. Our work can be seen as the first attempt at
offering a single AutoML framework for most problem settings that fall under
the umbrella of multi-target prediction, which includes popular ML settings
such as multi-label classification, multivariate regression, multi-task
learning, dyadic prediction, matrix completion, and zero-shot learning.
Automated problem selection and model configuration are achieved by extending
DeepMTP, a general deep learning framework for MTP problem settings, with
popular hyperparameter optimization (HPO) methods. Our extensive benchmarking
across different datasets and MTP problem settings identifies cases where
specific HPO methods outperform others.
Comment: 17 pages, 4 figures, 1 table
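The abstract does not list which HPO methods were used, so as a generic illustration here is a sketch of random search, one widely used HPO baseline. This is not DeepMTP's API: the search space, the `toy_validation_loss` objective, and all parameter names are hypothetical stand-ins for training and validating a real model.

```python
import math
import random

def random_search(objective, space, n_trials, seed=0):
    """Sample configurations at random and keep the one with the lowest score."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = {
            # Learning rate is sampled log-uniformly between the two exponents.
            "learning_rate": 10 ** rng.uniform(*space["log10_learning_rate"]),
            "hidden_units": rng.choice(space["hidden_units"]),
        }
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_validation_loss(config):
    """Hypothetical stand-in for a validation loss; it is smallest at
    learning_rate = 1e-3 and hidden_units = 64."""
    lr_term = (math.log10(config["learning_rate"]) + 3.0) ** 2
    size_term = (config["hidden_units"] - 64) ** 2 / 1000.0
    return lr_term + size_term

space = {
    "log10_learning_rate": (-5.0, -1.0),
    "hidden_units": [16, 32, 64, 128, 256],
}

best_config, best_score = random_search(toy_validation_loss, space, n_trials=50)
```

Because the generator is seeded, a longer run replays the shorter run's trials first, so adding trials can only keep or improve the best score found.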