10 research outputs found

Comparison of Deep Learning and the Classical Machine Learning Algorithm for the Malware Detection

Author: Rathore Hemant
Sahay Sanjay K.
Sewak Mohit
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/09/2018

Recently, Deep Learning has shown promising results in various Artificial Intelligence applications such as image recognition, natural language processing, language modeling, and neural machine translation. Although it is generally more computationally expensive than classical machine learning techniques, its results are found to be more effective in some cases. Therefore, in this paper, we investigated and compared one Deep Learning architecture, the Deep Neural Network (DNN), with the classical Random Forest (RF) machine learning algorithm for malware classification. We studied the performance of the classical RF and of DNNs with 2-, 4-, and 7-layer architectures on four different feature sets, and found that, irrespective of the input features, the classical RF outperforms the DNN in accuracy. Comment: 11 pages, 1 figure

arXiv.org e-Print Archive

Crossref

DRLDO: A Novel DRL-based De-obfuscation System for Defence Against Metamorphic Malware

Author: Rathore Hemant
Sahay Sanjay K.
Sewak Mohit
Publication venue: Defence Scientific Information and Documentation Centre
Publication date: 01/02/2021

In this paper, we propose a novel mechanism to normalise metamorphic and obfuscated malware down to the opcode level and hence create an advanced metamorphic malware de-obfuscation and defence system. We name this system DRLDO, for deep reinforcement learning based de-obfuscator. With DRLDO included as a sub-component, an existing Intrusion Detection System (IDS) could be augmented with defensive capabilities against ‘zero-day’ attacks from obfuscated and metamorphic variants of existing malware. This is important not only because no system to date uses advanced DRL to intelligently and automatically normalise obfuscation down even to the opcode level, but also because the DRLDO system does not mandate any changes to the existing IDS. The DRLDO system does not even require the IDS’ classifier to be retrained with any new dataset containing obfuscated samples. Hence DRLDO could be easily retrofitted into any existing IDS deployment. We designed, developed, and conducted experiments on the system to evaluate it against multiple simultaneous attacks from obfuscations generated from malware samples in a standardised dataset that contains multiple generations of malware. Experimental results show that DRLDO was able to make the otherwise undetectable obfuscated variants of the malware detectable by an existing pre-trained malware classifier. The detection probability was raised well above the cut-off mark to 0.6 for the classifier to detect the obfuscated malware unambiguously. Further, the de-obfuscated variants generated by DRLDO achieved a very high correlation (of ≈ 0.99) with the base malware. This observation validates that the DRLDO system is actually learning to de-obfuscate and not exploiting a trivial trick.

arXiv.org e-Print Archive

Defence Science Journal

Making Large Language Models Better Data Creators

Author: Jauhar Sujay Kumar
Lee Dong-Ho
Pujara Jay
Sewak Mohit
White Ryen W.
Publication date: 30/10/2023

Although large language models (LLMs) have advanced the state of the art in NLP significantly, deploying them for downstream applications is still challenging due to cost, responsiveness, control, or concerns around privacy and security. As such, trainable models are still the preferred option in some cases. However, these models still require human-labeled data for optimal performance, which is expensive and time-consuming to obtain. To address this issue, several techniques to reduce human effort involve labeling or generating data using LLMs. Although these methods are effective for certain applications, in practice they encounter difficulties in real-world scenarios. Labeling data requires careful data selection, while generating data necessitates task-specific prompt engineering. In this paper, we propose a unified data creation pipeline that requires only a single formatting example and is applicable to a broad range of tasks, including traditionally problematic ones with semantically devoid label spaces. In our experiments we demonstrate that instruction-following LLMs are highly cost-effective data creators, and that models trained with these data perform better than those trained with human-labeled data (by up to 17.5%) on out-of-distribution evaluation, while maintaining comparable performance on in-distribution tasks. These results have important implications for the robustness of NLP systems deployed in the real world. Comment: Accepted to EMNLP 2023 main conference. 12 pages, 5 figures, 6 tables. Code is available at https://github.com/microsoft/llm-data-creatio

arXiv.org e-Print Archive

Deep reinforcement learning: frontiers of artificial intelligence

Author: Sewak Mohit
Publication venue: Springer Singapore Pte Limited
Publication date: 01/01/2019

CERN Document Server

Practical convolutional neural networks: implement advanced deep learning models using Python

Author: Karim Rezaul
Pujari Pradeep
Sewak Mohit
Publication venue: Packt Publishing
Publication date: 01/01/2018

CERN Document Server

AI in Finance: A Review

Author: A Hakman
A Hussein
A I Marqués
A Kim
Ahmet Murat Özbayoglu
Aijaz Shaikh
Alejandro Baldominos
Ali Abdallah Alalwan
Amiangshu Bosu
Angelos Filos
Anna Maria
Antonin Ponsich
Arash Bahrammirzaee
Benjamin Munro
Bernardo Nicoletti
Bhaskar Gowri Sankar Ramachandran
Bin Li
Blake Lebaron
Bob Mather
Bonnie G Buchanan
Boris A Galitsky
Boris Kovalerchuk
Bruno Remillard
Burhan Khan
Carlos Luis Casaló Ariño
Cataldo Musto
Cesare Fracassi
Chen Liu
Colm Kearney
Cris Doloc
D Ferhat
Dadabada Pradeepkumar
Daniel E O'
Delei Sheng
Dimitrios Bisias
Dirk Helbing
Douglas W Arner
Douglas Wood
Du Mingxiao
Efstathios Kirkos
Elias Karam
Eric Séverin
Fang Chen
Firoozaeh Hajialiakbari
Fran Casino
G Henry
Georgios Sermpinis
Gérard Cornuéjols
Heiko Hesse
Henri Arslanian
Henry Hexmoor
Herbert Dawid
Huaizhi Wang
Huashan Chen
Ian Goodfellow
Ila Dutta
J B Heaton
Jakub Nowotarski
James E Gentle
Jaya Gutha
Jennifer Conway Viriato
Jesse Yli-Huumo
Jia Xu
Jianguo Xu
Jiawei Zhang
Jimmy Risk
Jing Tang
Johannes Paefgen
Johannes Ruf
Jonas Rothfuss
Jorge Maldonado-Correa
Jun Chen
Justin A Sirignano
K William Goetzmann
Kazutoshi Umemoto
Kem Zhang
L Cao
L Christian
Lawrence Trautman
Lean Yu
Leigh Tesfatsion
Leonid Hurwicz
Liying Mu
Longbing Cao
Lorella Fatone
Lovro Subelj
Luigi Troiano
Lukas Ryll
Lynne Hamill
M Mitchell
Manfred Gilli
Maozhu Jin
Marco Fagiani
Marcus Edwards
Marian H Amin
Mariana Rosa Montenegro
Matloob Terry Lingze Meng
Matthew Dixon
Maurice Peat
Mauricio García-Galicia
Michael Nwogugu
Michael Pinedo
Michal Tkác
Mieko Tanaka-Yamawaki
Min-Yuh Day
Mohit Sewak
Mouad Bahij
Mousa Albashrawi
Nadire Cavus
Neelam Shinde
Nikita Kozodoi
Norman Ehrentreich
O'
OECD
Pamela P Alvarez
Paolo Neirotti
Paul Embrechts
Peter Duchessi
Piotr Ladyzynski
Pål Roe
Qiang Wang
R Y Goh
Renato P Santos
Robert J Kauffman
Rodolfo C Cavalcante
Roger B Myerson
Roy Rada
S Charles
S Fatonah
Saeed Reza Arman Khadjeh Nassirtoussi
Saman Arash Negahdari Kia
Sameer Sharyn O'
Sang Il Lee
Sarah Perrin
Sepideh Kaffash
Shahriar Akter
Shane Underwood
Sima Siami-Namini
Souradeep Chakraborty
Springer
Stefano Moret
Steve Phelps
Steven Kou
Susan Athey
Sérgio Moro
Tania Ziegler
Theo Lynn
Thomas G Fischer
Tomi Dahlberg
Vasant Dhar
Wei Cao
Wei Huang
Wei Wei
Wenbo Wang
Wendland Marcelle Von
Wojciech Chmiel
Xiaohui Hou
Xiaolin Zheng
Xiaolong Liu
Xiaoqi Li
Yang Shirley Ou
Yong Hu
Yoshiharu Sato
Youfang Jin
Yuan Qi
Yumo Xu
Yves Hilpisch
Z Frank
Zeping Tong
Zhuoshu Li
Publication venue: Elsevier BV
Publication date: 01/01/2020

Crossref