9,993 research outputs found

Algorithm Selection Framework for Cyber Attack Detection

Author: Ajmera Aman
Arel-Bundock Vincent
Brazdil Pavel
Cui Can
Jacob Sunil
Janosi Andras
Maxwell Paul
Paliwal Swati
Revathi S
Rice John
Simpson Timothy W
Smith Michael R.
Sobirey Michael
Tavallaee Mahbod
Utgoff Paul E
Wolberg William H.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/03/2020

The number of cyber threats against both wired and wireless computer systems and other components of the Internet of Things continues to increase annually. In this work, an algorithm selection framework is employed on the NSL-KDD data set and a novel paradigm of machine learning taxonomy is presented. The framework uses a combination of user input and meta-features to select the best algorithm to detect cyber attacks on a network. Performance is compared between a rule-of-thumb strategy and a meta-learning strategy. The framework removes the conjecture of the common trial-and-error algorithm selection method. The framework recommends five algorithms from the taxonomy. Both strategies recommend a high-performing algorithm, though not the best performing. The work demonstrates the close connectedness between algorithm selection and the taxonomy on which it is premised.

Comment: 6 pages, 7 figures, 1 table, accepted to WiseML '2
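
A meta-learning strategy of the kind this abstract mentions can be sketched as a nearest-neighbour lookup over data set meta-features. This is a minimal illustration only: the meta-feature values, the algorithm names, and the `recommend` helper are hypothetical and are not taken from the paper, whose actual selection procedure is not described in this listing.

```python
import math

# Hypothetical knowledge base: each entry pairs the meta-features of a
# previously seen data set (assumed already scaled to [0, 1]) with the
# algorithm that performed best on it. All values are illustrative only.
KNOWLEDGE_BASE = [
    ((0.9, 0.2, 0.1), "random_forest"),
    ((0.1, 0.8, 0.5), "naive_bayes"),
    ((0.5, 0.5, 0.9), "k_nearest_neighbours"),
]


def recommend(meta_features, knowledge_base=KNOWLEDGE_BASE):
    """Return the algorithm stored with the closest known data set."""
    _, algorithm = min(
        knowledge_base,
        key=lambda entry: math.dist(entry[0], meta_features),
    )
    return algorithm


# Recommend an algorithm for a new, unseen data set.
print(recommend((0.8, 0.3, 0.2)))
```

A rule-of-thumb strategy would replace the distance lookup with fixed if/else rules over the same inputs.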

arXiv.org e-Print Archive

AFIT Scholar (Air Force Institute of Technology)

Crossref

USMA Digital Commons (United States Military Academy, West Point)

Hierarchical meta-rules for scalable meta-learning

Author: Pfahringer Bernhard
Sun Quan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The Pairwise Meta-Rules (PMR) method proposed in [18] has been shown to improve the predictive performance of several meta-learning algorithms for the algorithm ranking problem. Given m target objects (e.g., algorithms), the training complexity of the PMR method with respect to m is quadratic: (formula presented). This is usually not a problem when m is moderate, such as when ranking 20 different learning algorithms. However, for problems with a much larger m, such as the meta-learning-based parameter ranking problem, where m can be 100+, the PMR method is less efficient. In this paper, we propose a novel method named Hierarchical Meta-Rules (HMR), which is based on the theory of orthogonal contrasts. The proposed HMR method has a linear training complexity with respect to m, providing a way of dealing with a large number of objects that the PMR method cannot handle efficiently. Our experimental results demonstrate the benefit of the new method in the context of meta-learning.
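
The quadratic versus linear growth described in this abstract can be illustrated by counting comparisons. The sketch below counts one-against-one pairs among m objects and builds a set of m - 1 mutually orthogonal contrasts. It assumes Helmert contrasts, one standard family of orthogonal contrasts; the abstract does not say which contrasts HMR uses, and the function names are hypothetical.

```python
def pairwise_count(m):
    """Number of one-against-one comparisons among m objects."""
    return m * (m - 1) // 2


def helmert_contrasts(m):
    """Build m - 1 Helmert contrast vectors over m objects.

    Row k compares object k (weight k) against the k objects before it
    (weight -1 each). Every row sums to zero and any two rows are orthogonal.
    """
    return [[-1] * k + [k] + [0] * (m - k - 1) for k in range(1, m)]


for m in (20, 100):
    print(m, pairwise_count(m), len(helmert_contrasts(m)))
```

For m = 20 this gives 190 pairs against 19 contrasts, and for m = 100 it gives 4950 pairs against 99 contrasts.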

Research Commons@Waikato

Engineering Crowdsourced Stream Processing Systems

Author: Carlos Castillo
Crp Henri Tudor
Ioanna Lykourentzou
Muhammad Imran
Yannick Naudet
Publication date: 04/08/2014

A crowdsourced stream processing system (CSP) is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied on a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires the combination of human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems by performing a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that compared to a pure stream processing system, AIDR can achieve a higher data classification accuracy, while compared to a pure crowdsourcing solution, the system makes better use of human workers by requiring much less manual work effort

arXiv.org e-Print Archive

CiteSeerX

Facilitating and Enhancing the Performance of Model Selection for Energy Time Series Forecasting in Cluster Computing Environments

Author: Shahoud Shadi
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 14/01/2023

Applying Machine Learning (ML) manually to a given problem setting is a tedious and time-consuming process which brings many challenges with it, especially in the context of Big Data. In such a context, gaining insightful information, finding patterns, and extracting knowledge from large datasets are quite complex tasks. Additionally, the configurations of the underlying Big Data infrastructure introduce more complexity for configuring and running ML tasks. With the growing interest in ML over the last few years, people without extensive ML expertise in particular have a high demand for frameworks that assist them in applying the right ML algorithm to their problem setting. This is especially true in the field of smart energy system applications, where more and more ML algorithms are used, e.g., for time series forecasting. Generally, two groups of non-expert users who perform energy time series forecasting are distinguished. The first one includes the users who are familiar with statistics and ML but are not able to write the necessary programming code for training and evaluating ML models using the well-known trial-and-error approach. Such an approach is time-consuming and wastes resources for constructing multiple models. The second group is even more inexperienced in programming and not knowledgeable in statistics and ML but wants to apply given ML solutions to their problem settings. The goal of this thesis is to scientifically explore, in the context of more concrete use cases in the energy domain, how such non-expert users can be optimally supported in creating and performing ML tasks in practice on cluster computing environments. To support the first group of non-expert users, an easy-to-use, modular, extendable microservice-based ML solution for instrumenting and evaluating ML algorithms on top of a Big Data technology stack is conceptualized and evaluated.
Our proposed solution facilitates applying the trial-and-error approach by hiding the low-level complexities from the users and introduces the best conditions to efficiently perform ML tasks in cluster computing environments. To support the second group of non-expert users, the first solution is extended to realize meta learning approaches for automated model selection. We evaluate how meta learning technology can be efficiently applied to the problem space of data analytics for smart energy systems to assist energy system experts who are not data analytics experts in applying the right ML algorithms to their data analytics problems. To enhance the predictive performance of meta learning, an efficient characterization of energy time series datasets is required. To this end, Descriptive Statistics Time based Meta Features (DSTMF), a new kind of meta features, is designed to accurately capture the deep characteristics of energy time series datasets. We find that DSTMF outperforms the other state-of-the-art meta feature sets introduced in the literature to characterize energy time series datasets in terms of the accuracy of meta learning models and the time needed to extract them. Further enhancement in the predictive performance of the meta learning classification model is achieved by training the meta learner on new efficient meta examples. To this end, we propose two new approaches to generate new energy time series datasets to be used as training meta examples by the meta learner depending on the type of time series dataset (i.e., generation or energy consumption time series). We find that extending the original training sets with new meta examples generated by our approaches outperforms the case in which the original sets are extended by new simulated energy time series datasets.

KITopen

Pairwise meta-rules for better meta-learning-based algorithm ranking

Author: Pfahringer Bernhard
Sun Quan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/07/2013

In this paper, we present a novel meta-feature generation method in the context of meta-learning, which is based on rules that compare the performance of individual base learners in a one-against-one manner. In addition to these new meta-features, we also introduce a new meta-learner called Approximate Ranking Tree Forests (ART Forests) that performs very competitively when compared with several state-of-the-art meta-learners. Our experimental results are based on a large collection of datasets and show that the proposed new techniques can improve the overall performance of meta-learning for algorithm ranking significantly. A key point in our approach is that each performance figure of any base learner for any specific dataset is generated by optimising the parameters of the base learner separately for each dataset

Research Commons@Waikato

A Recommendation System for Meta-modeling: A Meta-learning Based Approach

Author: Acar
Baker
Banks
Barker
Barton
Bashiri
Brazdil
Brazdil
Brazdil
Chakroborty
Chang
Clarke
Cui
De Souto
Draper
Drucker
Dyn
Eckart
Efroymson
Fang
Fonseca
Friedman
Gergonne
Giraud-Carrier
Goodarzi
Greenland
Grubbs
Hocking
Jin
Kira
Kleijnen
Kohavi
Kononenko
Kristensen
Kuba
Köpf
Lan
Liang
Liang
Matala
Matijaš
McCulloch
Nasereddin
Neave
Packianather
Phillips
Prudencio
Rendell
Rosenblatt
Shaw
Simek
Simek
Simpson
Smith
Smith
Smith-Miles
Souza
Sun
Utgoff
Vilalta
Wang
Wolpert
Wolpert
Yin
Zhang
Zhou
Publication venue: AFIT Scholar
Publication date: 01/01/2016

Various meta-modeling techniques have been developed to replace computationally expensive simulation models. The performance of these meta-modeling techniques varies across models, which makes existing model selection/recommendation approaches (e.g., trial-and-error, ensemble) problematic. To address these research gaps, we propose a general meta-modeling recommendation system using meta-learning which can automate the meta-modeling recommendation process by intelligently adapting the learning bias to problem characterizations. The proposed intelligent recommendation system includes four modules: (1) problem module, (2) meta-feature module which includes a comprehensive set of meta-features to characterize the geometrical properties of problems, (3) meta-learner module which compares the performance of instance-based and model-based learning approaches for optimal framework design, and (4) performance evaluation module which introduces two criteria, Spearman's ranking correlation coefficient and hit ratio, to evaluate the system on the accuracy of model ranking prediction and the precision of the best model recommendation, respectively. To further improve the performance of meta-learning for meta-modeling recommendation, different types of feature reduction techniques, including singular value decomposition, stepwise regression and ReliefF, are studied. Experiments show that our proposed framework is able to achieve 94% correlation on model rankings, and a 91% hit ratio on best model recommendation. Moreover, the computational cost of meta-modeling recommendation is significantly reduced from an order of minutes to seconds compared to traditional trial-and-error and ensemble processes. The proposed framework can significantly advance the research in meta-modeling recommendation, and can be applied for data-driven system modeling.
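
The two evaluation criteria named in this abstract can be sketched as follows. The code assumes rankings without ties (so the closed-form Spearman formula applies) and assumes that hit ratio means the fraction of problems on which the predicted top-ranked model is the true top-ranked model; the abstract does not spell out either definition, and the function names and example rankings are hypothetical.

```python
def spearman_rho(true_ranks, pred_ranks):
    """Spearman's rank correlation for two tie-free rankings of equal length."""
    n = len(true_ranks)
    d_squared = sum((t - p) ** 2 for t, p in zip(true_ranks, pred_ranks))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def hit_ratio(true_rankings, pred_rankings):
    """Fraction of problems whose predicted best model (rank 1) is truly best."""
    hits = sum(
        1
        for true_ranks, pred_ranks in zip(true_rankings, pred_rankings)
        if true_ranks.index(1) == pred_ranks.index(1)
    )
    return hits / len(true_rankings)


# Illustrative rankings only: position i holds the rank of model i (1 = best).
true_rankings = [[1, 2, 3, 4], [1, 2, 3, 4]]
pred_rankings = [[2, 1, 3, 4], [1, 3, 2, 4]]

print([spearman_rho(t, p) for t, p in zip(true_rankings, pred_rankings)])
print(hit_ratio(true_rankings, pred_rankings))
```

With tied ranks, a library routine such as `scipy.stats.spearmanr` would be the safer choice than the closed-form formula.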

AFIT Scholar (Air Force Institute of Technology)

Crossref

Artificial intelligence for MRI diagnosis of joints: a scoping review of the current state-of-the-art of deep learning-based approaches

Author: Fritz Benjamin
Fritz Jan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

Deep learning-based MRI diagnosis of internal joint derangement is an emerging field of artificial intelligence, which offers many exciting possibilities for musculoskeletal radiology. A variety of investigational deep learning algorithms have been developed to detect anterior cruciate ligament tears, meniscus tears, and rotator cuff disorders. Additional deep learning-based MRI algorithms have been investigated to detect Achilles tendon tears, recurrence prediction of musculoskeletal neoplasms, and complex segmentation of nerves, bones, and muscles. Proof-of-concept studies suggest that deep learning algorithms may achieve similar diagnostic performances when compared to human readers in meta-analyses; however, musculoskeletal radiologists outperformed most deep learning algorithms in studies including a direct comparison. Earlier investigations and developments of deep learning algorithms focused on the binary classification of the presence or absence of an abnormality, whereas more advanced deep learning algorithms start to include features for characterization and severity grading. While many studies have focused on comparing deep learning algorithms against human readers, there is a paucity of data on the performance differences of radiologists interpreting musculoskeletal MRI studies without and with artificial intelligence support. Similarly, studies demonstrating the generalizability and clinical applicability of deep learning algorithms using realistic clinical settings with workflow-integrated deep learning algorithms are sparse. Contingent upon future studies showing the clinical utility of deep learning algorithms, artificial intelligence may eventually translate into clinical practice to assist detection and characterization of various conditions on musculoskeletal MRI exams

PubMed Central

ZORA