14 research outputs found

    Radar Emitter Classification based on Deep Ensemble

    Get PDF
    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business IntelligenceElectronic Support Measures (ESM) systems are designed to classify radar signals, providing information about the presence of threats. This function aids in battlefield situational awareness and the commander's decision on which countermeasures to employ. This dissertation aims to develop a deep ensemble model, recognizing the importance of a fast and precise classification based on a deep forest as an alternative to the parameter matching method. Four deep ensemble models and six of its base learners were built and evaluated to classify 52 emitters, using seven train/test datasets and two test datasets with noise, totalling 420 measurements of accuracy and classification speed. After analyzing these results, two deep ensemble models and their base learners were optimized, each for a different dataset, achieving 100% accuracy in a feature-engineered dataset and up to 98.358% in the original dataset. Regarding classification speed, the fastest models can classify 1000 records in 64ms, which may be acceptable in the real world. The experimental results of this approach reveal several advantages, making it a feasible alternative, including reduced dependency on ESM experts, ease of maintenance, quick to update, and high accuracy

    Определение обфускации JavaScript-программ с помощью раскрасок на абстрактных синтаксических деревьях

    Get PDF
    The paper deals with a problem of the obfuscated JavaScript code detection and classification based on Abstract Syntax Trees (AST) coloring. Colors of the AST vertexes and edges are assigned with regard to the types of the AST vertexes specified by the program lexical and syntax structure and the programming language standard. Research involved a few stages. First of the all, a non-obfuscated JavaScript programs dataset was collected by the public repositories evaluation. Secondly, obfuscated samples were created using eight open-source obfuscators. Classifier models were built using an algorithm of gradient boosting on the decision trees (GBDT). We built two types of the classifiers. The first one is the model that classifies the program according to the type of the obfuscator used, i.e. based on what obfuscator created the sample. The second one tries to detect samples obfuscated by the obfuscator whose samples are not observed during training. The quality of the obtained models is on par with the known published results. The feature engineering method proposed in the paper does not require a preliminary analysis of the obfuscators and obfuscating transformations. In the final part of the paper we analyze a quality of models estimated, discussing the certain statistical properties of the obfuscated and non-obfuscated samples obtained and corresponding colored ASTs. Analysis of generated samples of obfuscated programs has shown that the method proposed in the paper has some limitations. In particular, it is difficult to recognize minifiers or other obfuscating programs, which change the lexical structure to a greater extent and the syntax to a lesser extent. To improve the quality of detection of this kind of obscuring transformations, one can built combined classifiers using both the method based on the AST coloring and the additional information about lexemes and punctuation, for example, entropy of identifiers and strings, proportion of characters in upper and lower case, usage frequency of certain characters etc.В работе анализируется способ определения обфускации и вида используемого обфускатора методами машинного обучения при помощи данных о раскраске по типу вершин абстрактного синтаксического дерева (АСД) программы на языке JavaScript. Цвета вершин и рёбер назначаются в соответствии с типами вершин АСД, которые в свою очередь определяются лексической и синтаксической структурой программы и стандартом языка программирования. Исследование состояло из нескольких этапов. В начале был собран набор необфусцированных программ. После это создан набор обфусцированных программ при помощи восьми программ-обфускаторов с открытым исходным кодом. Классификаторы строились на основе алгоритма градиентного бустинга на решающих деревьях. Были построены модели, которые классифицировали программы по типу используемого обфускатора и по признаку обфусцированности. Модели, классифицирующие по признаку обфусцированности, детектировали образцы, обфусцированные в т.ч. теми обфускаторами, образцы для которых не входили в обучающую выборку. Качество полученных моделей находится на одном уровне с известными в литературе результатами. Предлагаемый в работе метод выделения признаков, подаваемых на вход классификатору, не требует предварительного анализа самих обфускаторов и знания обфусцирующих преобразований. В конце работы приводится анализ качества полученных моделей и рассматриваются некоторые статистические свойства полученного набора образцов обфусцированного кода. Анализ сгенерированных образцов обфусцированных программ показал, что предложенный в статье метод имеет некоторые ограничения, в частности, затруднено распознавание минификаторов и прочих обфусцирующих программ, в большей степени изменяющих лексическую структуру, и в меньшей — синтаксическую. Для улучшения качества детектирования запутывающих преобразований такого рода можно строить комбинированные классификаторы, использующие как метод, основанный на данных о раскраске АСД, так и дополнительную информацию о лексемах и пунктуации, например, данные об энтропии, пропорции символов в верхнем и нижнем регистре, частоте употребления определённых символов и т.д

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 18th China Annual Conference on Cyber Security, CNCERT 2022, held in Beijing, China, in August 2022. The 17 papers presented were carefully reviewed and selected from 64 submissions. The papers are organized according to the following topical sections: ​​data security; anomaly detection; cryptocurrency; information security; vulnerabilities; mobile internet; threat intelligence; text recognition

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 18th China Annual Conference on Cyber Security, CNCERT 2022, held in Beijing, China, in August 2022. The 17 papers presented were carefully reviewed and selected from 64 submissions. The papers are organized according to the following topical sections: ​​data security; anomaly detection; cryptocurrency; information security; vulnerabilities; mobile internet; threat intelligence; text recognition

    Anomaly Detection in Sequential Data: A Deep Learning-Based Approach

    Get PDF
    Anomaly Detection has been researched in various domains with several applications in intrusion detection, fraud detection, system health management, and bio-informatics. Conventional anomaly detection methods analyze each data instance independently (univariate or multivariate) and ignore the sequential characteristics of the data. Anomalies in the data can be detected by grouping the individual data instances into sequential data and hence conventional way of analyzing independent data instances cannot detect anomalies. Currently: (1) Deep learning-based algorithms are widely used for anomaly detection purposes. However, significant computational overhead time is incurred during the training process due to static constant batch size and learning rate parameters for each epoch, (2) the threshold to decide whether an event is normal or malicious is often set as static. This can drastically increase the false alarm rate if the threshold is set low or decrease the True Alarm rate if it is set to a remarkably high value, (3) Real-life data is messy. It is impossible to learn the data features by training just one algorithm. Therefore, several one-class-based algorithms need to be trained. The final output is the ensemble of the output from all the algorithms. The prediction accuracy can be increased by giving a proper weight to each algorithm\u27s output. By extending the state-of-the-art techniques in learning-based algorithms, this dissertation provides the following solutions: (i) To address (1), we propose a hybrid, dynamic batch size and learning rate tuning algorithm that reduces the overall training time of the neural network. (ii) As a solution for (2), we present an adaptive thresholding algorithm that reduces high false alarm rates. (iii) To overcome (3), we propose a multilevel hybrid ensemble anomaly detection framework that increases the anomaly detection rate of the high dimensional dataset

    TOWARDS A HOLISTIC EFFICIENT STACKING ENSEMBLE INTRUSION DETECTION SYSTEM USING NEWLY GENERATED HETEROGENEOUS DATASETS

    Get PDF
    With the exponential growth of network-based applications globally, there has been a transformation in organizations\u27 business models. Furthermore, cost reduction of both computational devices and the internet have led people to become more technology dependent. Consequently, due to inordinate use of computer networks, new risks have emerged. Therefore, the process of improving the speed and accuracy of security mechanisms has become crucial.Although abundant new security tools have been developed, the rapid-growth of malicious activities continues to be a pressing issue, as their ever-evolving attacks continue to create severe threats to network security. Classical security techniquesfor instance, firewallsare used as a first line of defense against security problems but remain unable to detect internal intrusions or adequately provide security countermeasures. Thus, network administrators tend to rely predominantly on Intrusion Detection Systems to detect such network intrusive activities. Machine Learning is one of the practical approaches to intrusion detection that learns from data to differentiate between normal and malicious traffic. Although Machine Learning approaches are used frequently, an in-depth analysis of Machine Learning algorithms in the context of intrusion detection has received less attention in the literature.Moreover, adequate datasets are necessary to train and evaluate anomaly-based network intrusion detection systems. There exist a number of such datasetsas DARPA, KDDCUP, and NSL-KDDthat have been widely adopted by researchers to train and evaluate the performance of their proposed intrusion detection approaches. Based on several studies, many such datasets are outworn and unreliable to use. Furthermore, some of these datasets suffer from a lack of traffic diversity and volumes, do not cover the variety of attacks, have anonymized packet information and payload that cannot reflect the current trends, or lack feature set and metadata.This thesis provides a comprehensive analysis of some of the existing Machine Learning approaches for identifying network intrusions. Specifically, it analyzes the algorithms along various dimensionsnamely, feature selection, sensitivity to the hyper-parameter selection, and class imbalance problemsthat are inherent to intrusion detection. It also produces a new reliable dataset labeled Game Theory and Cyber Security (GTCS) that matches real-world criteria, contains normal and different classes of attacks, and reflects the current network traffic trends. The GTCS dataset is used to evaluate the performance of the different approaches, and a detailed experimental evaluation to summarize the effectiveness of each approach is presented. Finally, the thesis proposes an ensemble classifier model composed of multiple classifiers with different learning paradigms to address the issue of detection accuracy and false alarm rate in intrusion detection systems

    Cyber Security

    Get PDF
    This open access book constitutes the refereed proceedings of the 16th International Annual Conference on Cyber Security, CNCERT 2020, held in Beijing, China, in August 2020. The 17 papers presented were carefully reviewed and selected from 58 submissions. The papers are organized according to the following topical sections: access control; cryptography; denial-of-service attacks; hardware security implementation; intrusion/anomaly detection and malware mitigation; social network security and privacy; systems security
    corecore