A Survey of Methods for Encrypted Traffic Classification and Analysis
With the widespread use of encrypted data transport, network traffic encryption is becoming a standard nowadays. This presents a challenge for traffic measurement, especially for analysis and anomaly detection methods that depend on the type of network traffic. In this paper, we survey existing approaches to the classification and analysis of encrypted traffic. First, we describe the most widespread encryption protocols used throughout the Internet and show that the initiation of an encrypted connection and the protocol structure give away a great deal of information for encrypted traffic classification and analysis. Then, we survey payload- and feature-based classification methods for encrypted traffic and categorize them using an established taxonomy. An advantage of some of the described classification methods is the ability to recognize the encrypted application protocol in addition to the encryption protocol. Finally, we make a comprehensive comparison of the surveyed feature-based classification methods and present their strengths and weaknesses.
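To make the feature-based approach concrete, here is a minimal sketch of flow-feature extraction followed by rule-based classification. The flow representation, feature names, thresholds, and class labels are invented for illustration and are not taken from any surveyed method.

```python
# Hypothetical sketch: classify an encrypted flow from statistical features
# only (no payload inspection). Features and thresholds are illustrative.
from statistics import mean, pstdev

def flow_features(packets):
    """Compute simple statistical features from a list of
    (packet_size, direction) tuples, direction being "up" or "down"."""
    sizes = [size for size, _ in packets]
    upstream = [size for size, d in packets if d == "up"]
    return {
        "mean_size": mean(sizes),
        "std_size": pstdev(sizes),
        "up_ratio": len(upstream) / len(packets),
    }

def classify(features):
    """Toy rule: large, roughly symmetric flows look like bulk transfer;
    everything else is treated as interactive traffic."""
    if features["mean_size"] > 800 and 0.3 < features["up_ratio"] < 0.7:
        return "bulk"
    return "interactive"
```

Real feature-based classifiers replace the hand-written rule with a trained model (decision trees, SVMs, etc.) over many more flow statistics.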
A comprehensive literature classification of simulation optimisation methods
Simulation Optimisation (SO) provides a structured approach to system design and configuration when analytical expressions for input/output relationships are unavailable. Several excellent surveys have been written on this topic, but each concentrates on only a few classification criteria. This paper presents a literature survey of SO techniques organised by the full set of classification criteria, according to problem characteristics such as the shape of the response surface (global as compared to local optimisation), the objective function (single or multiple objectives) and the parameter space (discrete or continuous parameters). The survey focuses specifically on SO problems that involve a single performance measure. Keywords: Simulation Optimisation, classification methods, literature survey
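As a toy illustration of the SO setting the abstract describes, the sketch below searches a discrete parameter space of a noisy stand-in simulator, averaging replications and reusing the same random seeds across candidates (common random numbers). The simulator and its cost terms are invented for illustration.

```python
# Toy simulation optimisation: discrete parameter space, noisy objective.
import random

def simulate(servers, seed):
    """Stand-in stochastic simulation: response time plus a staffing cost
    for a hypothetical queueing system with `servers` servers."""
    noise = random.Random(seed).gauss(0, 0.1)
    return 10.0 / servers + 0.6 * servers + noise

def optimise(candidates, replications=20):
    """Average each candidate over replications that reuse the same seeds
    (common random numbers), then return the best candidate."""
    def avg(c):
        return sum(simulate(c, r) for r in range(replications)) / replications
    return min(candidates, key=avg)
```

Sharing seeds across candidates makes their noise terms cancel in comparisons, a standard variance-reduction trick in simulation optimisation; real methods layer response-surface models or metaheuristics on top of this basic loop.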
A Bayesian approach to star-galaxy classification
Star-galaxy classification is one of the most fundamental data-processing tasks in survey astronomy, and a critical starting point for the scientific exploitation of survey data. For bright sources this classification can be done with almost complete reliability, but for the numerous sources close to a survey's detection limit each image encodes only limited morphological information. In this regime, from which many of the new scientific discoveries are likely to come, it is vital to utilise all the available information about a source, both from multiple measurements and also from prior knowledge about the star and galaxy populations. It is also more useful and realistic to provide classification probabilities than decisive classifications. All these desiderata can be met by adopting a Bayesian approach to star-galaxy classification, and we develop a very general formalism for doing so. An immediate implication of applying Bayes's theorem to this problem is that it is formally impossible to combine morphological measurements in different bands without using colour information as well; however, we develop several approximations that disregard colour information as much as possible. The resultant scheme is applied to data from the UKIRT Infrared Deep Sky Survey (UKIDSS), and tested by comparing the results to deep Sloan Digital Sky Survey (SDSS) Stripe 82 measurements of the same sources. The Bayesian classification probabilities obtained from the UKIDSS data agree well with the deep SDSS classifications both overall (a mismatch rate of 0.022, compared to 0.044 for the UKIDSS pipeline classifier) and close to the UKIDSS detection limit (a mismatch rate of 0.068, compared to 0.075 for the UKIDSS pipeline classifier). The Bayesian formalism developed here can be applied to improve the reliability of any star-galaxy classification scheme based on the measured values of morphology statistics alone.
Comment: Accepted 22 November 2010, 19 pages, 17 figures
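The core of such a Bayesian scheme can be sketched in a few lines: class-conditional likelihoods of a morphology statistic combined with a class prior via Bayes's theorem. The Gaussian likelihoods, their parameters, and the prior below are invented for illustration and are not the paper's actual model.

```python
# Sketch: posterior probability of "star" from one morphology statistic.
# Likelihood shapes, parameters and the prior are illustrative assumptions.
import math

def gaussian(x, mu, sigma):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_star(stat, prior_star=0.5):
    """Posterior P(star | stat) via Bayes's theorem, assuming point-like
    sources cluster near stat = 0 and extended sources near stat = 3."""
    like_star = gaussian(stat, mu=0.0, sigma=1.0)   # point-like (star) model
    like_gal = gaussian(stat, mu=3.0, sigma=1.5)    # extended (galaxy) model
    numerator = like_star * prior_star
    return numerator / (numerator + like_gal * (1 - prior_star))
```

Multiple independent measurements would be combined by multiplying their likelihoods before normalising, which is where the paper's point about colour information entering multi-band combinations arises.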
Automated Protein Structure Classification: A Survey
Classification of proteins based on their structure provides a valuable resource for studying protein structure, function and evolutionary relationships. With the rapidly increasing number of known protein structures, manual and semi-automatic classification is becoming ever more difficult and prohibitively slow. Therefore, there is a growing need for automated, accurate and efficient classification methods to generate classification databases or increase the speed and accuracy of semi-automatic techniques. Recognizing this need, several automated classification methods have been developed. In this survey, we overview recent developments in this area. We classify different methods based on their characteristics and compare their methodology, accuracy and efficiency. We then present a few open problems and explain future directions.
Comment: 14 pages, Technical Report CSRG-589, University of Toronto
An intelligent assistant for exploratory data analysis
In this paper we present an account of the main features of SNOUT, an intelligent assistant for exploratory data analysis (EDA) of social science survey data that incorporates a range of data mining techniques. EDA has much in common with existing data mining techniques: its main objective is to help an investigator reach an understanding of the important relationships in a data set, rather than simply to develop predictive models for selected variables. Brief descriptions of a number of novel techniques developed for use in SNOUT are presented. These include heuristic variable level inference and classification, automatic category formation, the use of similarity trees to identify groups of related variables, interactive decision tree construction, and model selection using a genetic algorithm.
Using machine learning techniques to automate sky survey catalog generation
We describe the application of machine classification techniques to the development of an automated tool for the reduction of a large scientific data set. The 2nd Palomar Observatory Sky Survey provides comprehensive photographic coverage of the northern celestial hemisphere. The photographic plates are being digitized into images containing on the order of 10^7 galaxies and 10^8 stars. Since the size of this data set precludes manual analysis and classification of objects, our approach is to develop a software system which integrates independently developed techniques for image processing and data classification. Image processing routines are applied to identify and measure features of sky objects. Selected features are used to determine the classification of each object. GID3* and O-BTree, two inductive learning techniques, are used to automatically learn classification decision trees from examples. We describe the techniques used, the details of our specific application, and the initial encouraging results which indicate that our approach is well-suited to the problem. The benefits of the approach are increased data reduction throughput, consistency of classification, and the automated derivation of classification rules that will form an objective, examinable basis for classifying sky objects. Furthermore, astronomers will be freed from the tedium of an intensely visual task to pursue more challenging analysis and interpretation problems given automatically cataloged data.
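GID3* and O-BTree are far richer learners than can be shown here; as a minimal stand-in for the idea of inducing a classifier from labelled feature examples, the sketch below fits a one-level decision stump by minimising training error. The feature values and labels are invented for illustration.

```python
# Minimal inductive learning sketch: a one-level decision stump fitted
# from labelled (feature_value, label) examples. Illustrative only.

def fit_stump(examples):
    """Try every midpoint between sorted feature values and both label
    assignments; return the (threshold, low_label, high_label) with the
    fewest training errors."""
    examples = sorted(examples)
    best = None
    for i in range(len(examples) - 1):
        t = (examples[i][0] + examples[i + 1][0]) / 2
        for low, high in (("star", "galaxy"), ("galaxy", "star")):
            errors = sum(1 for x, y in examples
                         if (low if x <= t else high) != y)
            if best is None or errors < best[0]:
                best = (errors, t, low, high)
    _, t, low, high = best
    return t, low, high

def predict(stump, x):
    """Classify a new feature value with a fitted stump."""
    t, low, high = stump
    return low if x <= t else high
```

A full decision-tree learner applies this kind of split search recursively over many features; the stump is the single-split base case.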
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order in which to apply optimizations). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grain classification among different approaches and, finally, the influential papers of the field.
Comment: version 5.0 (updated September 2018), preprint version of our accepted journal article at ACM CSUR 2018 (42 pages). This survey will be updated quarterly. History: received November 2016; revised August 2017; revised February 2018; accepted March 2018
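The phase-ordering problem can be illustrated with a toy local search over pass orderings. The pass names and the cost model below are invented for illustration; real autotuners evaluate candidate orderings by actually compiling and measuring the program.

```python
# Toy phase-ordering search: find a cheap ordering of compiler passes
# under a made-up cost model (lower is better). Illustrative only.
from itertools import combinations

PASSES = ["inline", "dce", "licm", "vectorize"]

def cost(order):
    """Made-up cost model: certain pass orders are assumed cheaper."""
    c = 10.0
    if order.index("inline") < order.index("dce"):
        c -= 2.0  # assumption: inlining first exposes dead code
    if order.index("licm") < order.index("vectorize"):
        c -= 1.0  # assumption: hoisting invariants first helps vectorization
    return c

def local_search(order):
    """Greedy local search: repeatedly apply the first pairwise swap of
    passes that lowers the cost, until no swap improves it."""
    best = list(order)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(best)), 2):
            cand = best[:]
            cand[i], cand[j] = cand[j], cand[i]
            if cost(cand) < cost(best):
                best, improved = cand, True
                break
    return best
```

Machine-learning approaches surveyed in the paper effectively replace such blind search with learned models that predict good orderings or prune the search space.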