XGV-BERT: Leveraging Contextualized Language Model and Graph Neural Network for Efficient Software Vulnerability Detection
With the advancement of deep learning (DL) in various fields, there have been
many attempts to reveal software vulnerabilities through data-driven
approaches. Nonetheless, existing works lack an effective representation that
retains the non-sequential semantic characteristics and contextual
relationships of source code attributes. Hence, in this work, we propose
XGV-BERT, a framework that combines the pre-trained CodeBERT model with a
Graph Convolutional Network (GCN) to detect software vulnerabilities. By
jointly training the CodeBERT and
GCN modules within XGV-BERT, the proposed model leverages the advantages of
large-scale pre-training, harnessing vast raw data, and transfer learning by
learning representations for training data through graph convolution. The
research results demonstrate that XGV-BERT significantly improves
vulnerability detection accuracy compared to two existing methods,
VulDeePecker and SySeVR. On the VulDeePecker dataset, XGV-BERT achieves an
F1-score of 97.5%, substantially outperforming VulDeePecker's 78.3%.
Likewise, on the SySeVR dataset, XGV-BERT achieves an F1-score of 95.5%,
surpassing SySeVR's 83.5%.
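The graph-convolution idea behind XGV-BERT can be sketched in miniature. The snippet below is a hypothetical illustration, not the authors' implementation: each node of a code graph (e.g., a statement carrying an embedding) averages its features with its neighbors', which is how non-sequential relations between source-code attributes propagate.

```python
# Minimal sketch of one graph-convolution step (hypothetical, not the
# XGV-BERT implementation). Each node mixes its features with its
# neighbors' features; the learned weight matrix and nonlinearity that
# follow each propagation step are omitted for brevity.

def gcn_layer(features, edges):
    """Average each node's features with its neighbors' (adjacency with
    self-loops, mean-normalized)."""
    n = len(features)
    neighbors = {i: [i] for i in range(n)}  # self-loops
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    dim = len(features[0])
    out = []
    for i in range(n):
        acc = [0.0] * dim
        for j in neighbors[i]:
            for d in range(dim):
                acc[d] += features[j][d]
        out.append([x / len(neighbors[i]) for x in acc])
    return out

# Toy code graph: 3 statements, edges for control/data flow.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
H1 = gcn_layer(H, [(0, 1), (1, 2)])
```

In the full system, node features would come from CodeBERT embeddings and the propagation layers would be trained jointly with the language model, as the abstract describes.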
Fed-LSAE: Thwarting Poisoning Attacks against Federated Cyber Threat Detection System via Autoencoder-based Latent Space Inspection
The significant rise of security concerns in conventional centralized
learning has promoted federated learning (FL) adoption in building intelligent
applications without privacy breaches. In cybersecurity, the sensitive data
along with the contextual information and high-quality labeling in each
enterprise organization play an essential role in constructing high-performance
machine learning (ML) models for detecting cyber threats. Nonetheless, the
risks posed by internal poisoning adversaries against FL systems have prompted
discussions about designing robust anti-poisoning frameworks. Whereas defensive
mechanisms in the past were based on outlier detection, recent approaches tend
to be more concerned with latent space representation. In this paper, we
investigate a novel robust aggregation method for FL, namely Fed-LSAE, which
takes advantage of latent space representation via the penultimate layer and
an Autoencoder to exclude malicious clients from the training process. The
experimental results on the CIC-ToN-IoT and N-BaIoT datasets confirm the
feasibility of our defensive mechanism against cutting-edge poisoning attacks
for developing a robust FL-based threat detector in the context of IoT. More
specifically, FL evaluation metrics reach approximately 98% across the board
when our Fed-LSAE defense is integrated.
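The filtering step described above can be sketched as follows. This is a simplified illustration, not the Fed-LSAE implementation: each client's penultimate-layer representation is scored by a reconstruction-error function, and only low-error (likely benign) clients enter the aggregate. The stand-in scoring against a reference latent profile is an assumption; the real system uses a trained Autoencoder.

```python
# Hypothetical sketch of latent-space filtering before aggregation.
# The "autoencoder" is a stand-in: squared distance to a reference
# latent profile learned from clean updates, playing the role of
# ||x - decode(encode(x))||^2.

def reconstruction_error(vec, reference):
    return sum((a - b) ** 2 for a, b in zip(vec, reference))

def filter_and_aggregate(client_reprs, client_updates, reference, threshold):
    # Keep only clients whose latent representation reconstructs well.
    kept = [u for r, u in zip(client_reprs, client_updates)
            if reconstruction_error(r, reference) <= threshold]
    dim = len(client_updates[0])
    # Average the surviving updates (plain FedAvg over benign clients).
    return [sum(u[d] for u in kept) / len(kept) for d in range(dim)]

# Two benign clients near the reference profile, one poisoned outlier.
reprs = [[0.1, 0.0], [0.0, 0.1], [5.0, 5.0]]
updates = [[1.0], [3.0], [100.0]]
agg = filter_and_aggregate(reprs, updates, reference=[0.0, 0.0], threshold=1.0)
```

The poisoned client's update never reaches the aggregate, which is the mechanism by which the defense keeps the global model intact.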
Some algorithms to solve a bi-objectives problem for team selection
In real life, many problems are instances of combinatorial optimization, and cross-functional team selection is one typical example. The decision-maker has to select a solution among C(k, h) possibilities in the decision space, where k is the number of candidates and h is the number of members in the selected team. This paper continues our work since 2018; here, we introduce the completed version of the Min Distance to the Boundary model (MDSB), which captures both the "deep" and "wide" aspects of the selected team. The compromise programming approach enables decision-makers to ignore the parameters in the decision-making process; instead, they specify the one scenario they expect. The model construction aims to find the solution that best matches this expectation. We develop two algorithms to find the optimal solution: a genetic algorithm and one based on the philosophy of DC (Difference of Convex functions) programming and its algorithm (DCA). We also compare the introduced algorithms with the MIQP-CPLEX search algorithm to show their effectiveness.
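As a toy illustration of the selection problem (not the paper's MDSB formulation), a compromise-programming-style search over the C(k, h) subsets can pick the team whose profile lies closest to the decision-maker's expected scenario. The skill scores and scenario below are invented.

```python
# Brute-force compromise-programming sketch: enumerate all C(k, h)
# teams and return the one whose averaged skill profile minimizes the
# squared distance to the expected scenario. Illustrative only; the
# paper's algorithms (GA, DCA) avoid full enumeration.
from itertools import combinations

def team_profile(team, skills):
    dim = len(next(iter(skills.values())))
    return [sum(skills[m][d] for m in team) / len(team) for d in range(dim)]

def best_team(candidates, skills, scenario, h):
    def dist(team):
        p = team_profile(team, skills)
        return sum((a - b) ** 2 for a, b in zip(p, scenario))
    return min(combinations(candidates, h), key=dist)

# Four candidates scored on two skill axes; the decision-maker expects
# a balanced [2, 2] profile.
skills = {"a": [3, 1], "b": [1, 3], "c": [2, 2], "d": [0, 0]}
team = best_team(["a", "b", "c", "d"], skills, scenario=[2, 2], h=2)
```

Because the number of subsets grows combinatorially in k, metaheuristics such as the paper's GA become necessary once k is large.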
On the Effectiveness of Adversarial Samples against Ensemble Learning-based Windows PE Malware Detectors
Recently, there has been growing interest in applying machine
learning (ML) to the field of cybersecurity, particularly in malware detection
and prevention. Several research works on malware analysis have been proposed,
offering promising results for both academic and practical applications. In
these works, the use of Generative Adversarial Networks (GANs) or Reinforcement
Learning (RL) can aid malware creators in crafting metamorphic malware that
evades antivirus software. In this study, we propose a mutation system to
counteract ensemble learning-based detectors by combining GANs and an RL model,
overcoming the limitations of the MalGAN model. Our proposed FeaGAN model is
built based on MalGAN by incorporating an RL model called the Deep Q-network
anti-malware Engines Attacking Framework (DQEAF). The RL model addresses three
key challenges in performing adversarial attacks on Windows Portable Executable
malware, including format preservation, executability preservation, and
maliciousness preservation. In the FeaGAN model, the target malware detector
is built with ensemble learning, and the generated adversarial patterns are
used to enhance evasion against it. The experimental results demonstrate that
100% of the selected mutant samples preserve the format of executable files,
while both executability preservation and maliciousness preservation achieve
certain successes, reaching a stable success rate.
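The core evasion idea can be illustrated with a toy loop (not the FeaGAN/DQEAF system itself): benign-looking features are added to a malware feature vector, additions only so that format and behavior are preserved, until the majority vote of an ensemble of detectors flips. The detectors and features below are invented.

```python
# Toy feature-addition evasion against a majority-vote ensemble.
# Features are binary; the attack only switches features on, mirroring
# the format/executability/maliciousness-preservation constraints.

def ensemble_says_malware(x, detectors):
    votes = sum(1 for d in detectors if d(x))
    return votes > len(detectors) / 2

def evade(x, addable, detectors):
    x = list(x)
    for idx in addable:                  # candidate benign features
        if not ensemble_says_malware(x, detectors):
            break                        # already evaded the majority
        x[idx] = 1                       # add a feature, never remove one
    return x

# Toy ensemble keyed on four binary features; feature 0 marks the
# malicious payload and is never touched.
detectors = [
    lambda x: x[0] == 1 and x[2] == 0,
    lambda x: x[0] == 1 and x[3] == 0,
    lambda x: x[1] == 0,
]
mutant = evade([1, 0, 0, 0], addable=[2, 3], detectors=detectors)
```

In the real systems, the choice of which feature to add is what the GAN generator and the RL agent learn, rather than a fixed scan as here.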
VNHSGE: VietNamese High School Graduation Examination Dataset for Large Language Models
The VNHSGE (VietNamese High School Graduation Examination) dataset, developed
exclusively for evaluating large language models (LLMs), is introduced in this
article. The dataset, which covers nine subjects, was generated from the
Vietnamese National High School Graduation Examination and comparable tests.
300 literary essays have been included, and there are over 19,000
multiple-choice questions on a range of topics. The dataset assesses LLMs in
multitasking situations such as question answering, text generation, reading
comprehension, visual question answering, and more by including both textual
data and accompanying images. Using ChatGPT and BingChat, we evaluated LLMs on
the VNHSGE dataset and compared their performance with that of Vietnamese
students. The results show that ChatGPT and
BingChat both perform at a human level in a number of areas, including
literature, English, history, geography, and civics education. However, they
still have room to grow, especially in mathematics, physics,
chemistry, and biology. The VNHSGE dataset seeks to provide an adequate
benchmark for assessing the abilities of LLMs with its wide-ranging coverage
and variety of activities. We intend to promote future developments in the
creation of LLMs by making this dataset available to the scientific community,
especially in addressing LLMs' limitations in disciplines involving
mathematics and the natural sciences. Comment: 74 pages, 44 figures
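A benchmark of this kind is typically scored by comparing model answers to the key per subject. The sketch below is hypothetical; the per-subject answer format (lists of letter choices) is an assumption, not the dataset's actual schema.

```python
# Hypothetical per-subject accuracy scoring for a VNHSGE-style
# multiple-choice evaluation. Predictions and the answer key are
# dictionaries mapping a subject to an ordered list of letter choices.

def accuracy_by_subject(predictions, answer_key):
    scores = {}
    for subject, preds in predictions.items():
        key = answer_key[subject]
        correct = sum(p == k for p, k in zip(preds, key))
        scores[subject] = correct / len(key)
    return scores

# Invented example: a model gets 3 of 4 physics questions right.
scores = accuracy_by_subject({"physics": ["A", "B", "C", "D"]},
                             {"physics": ["A", "B", "D", "D"]})
```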
Awareness and preparedness of healthcare workers against the first wave of the COVID-19 pandemic: A cross-sectional survey across 57 countries.
BACKGROUND: Since the COVID-19 pandemic began, there have been concerns about the preparedness of healthcare workers (HCWs). This study aimed to describe the level of awareness and preparedness of hospital HCWs at the time of the first wave.
METHODS: This multinational, multicenter, cross-sectional survey was conducted among hospital HCWs from February to May 2020. We used hierarchical multivariate logistic regression to adjust for the influence of variables on awareness and preparedness. We then used association rule mining to identify relationships between HCWs' confidence in handling suspected COVID-19 patients and prior COVID-19 case-management training.
RESULTS: We surveyed 24,653 HCWs from 371 hospitals across 57 countries and received 17,302 responses (an overall response rate of 70.2%). The median COVID-19 preparedness score was 11.0 (interquartile range [IQR] = 6.0-14.0) and the median awareness score was 29.6 (IQR = 26.6-32.6). HCWs at COVID-19-designated facilities with previous outbreak experience, or HCWs trained for dealing with the SARS-CoV-2 outbreak, had significantly higher levels of preparedness and awareness (p<0.001). Association rule mining suggests that nurses and doctors who had a 'great extent of confidence' in handling suspected COVID-19 patients had participated in COVID-19 training courses. Male participants (mean difference = 0.34; 95% CI = 0.22, 0.46; p<0.001) and nurses (mean difference = 0.67; 95% CI = 0.53, 0.81; p<0.001) had higher preparedness scores than female participants and doctors.
INTERPRETATION: There was an unsurprisingly high level of awareness and preparedness among HCWs who participated in COVID-19 training courses. However, disparities existed along the lines of gender and type of HCW. It is unknown whether the differences in COVID-19 preparedness that we detected early in the pandemic translated into a disproportionate SARS-CoV-2 burden of disease by gender or HCW type.
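The association-rule measure behind findings like "training implies confidence" can be sketched as follows; the respondent records below are invented, not survey data.

```python
# Confidence of an association rule A -> B: among respondents whose
# attribute set contains A, the fraction that also contains B.
# Records are sets of categorical attributes per respondent.

def rule_confidence(records, antecedent, consequent):
    has_a = [r for r in records if antecedent <= r]  # subset test
    if not has_a:
        return 0.0
    return sum(1 for r in has_a if consequent <= r) / len(has_a)

# Invented respondents: 3 of 4 had COVID-19 training, 2 of those 3
# reported high confidence, so the rule's confidence is 2/3.
records = [
    {"nurse", "covid_training", "high_confidence"},
    {"doctor", "covid_training", "high_confidence"},
    {"nurse"},
    {"doctor", "covid_training"},
]
c = rule_confidence(records, {"covid_training"}, {"high_confidence"})
```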
Perpendicular magnetic anisotropy and the magnetization process in CoFeB/Pd multilayer films
Perpendicular magnetic anisotropy (PMA) and the dynamic magnetization reversal
process in [CoFeB t nm/Pd 1.0 nm]_n (t = 0.4, 0.6, 0.8, 1.0, and 1.2 nm;
n = 2-20) multilayer films have been studied by means of magnetic
hysteresis and Kerr effect measurements. Strong and controllable PMA with an
effective uniaxial anisotropy up to 7.7 × 10 J·m⁻³ and a
saturation magnetization as low as 200 emu/cc are achieved.
anisotropy of CoFeB/Pd interfaces, the main contribution to the PMA, is
separated from the effective uniaxial anisotropy of the films, and appears to
increase with the number of the CoFeB/Pd bilayers. Observation of the magnetic
domains during a magnetization reversal process using polar magneto-optical
Kerr microscopy shows the detailed behavior of nucleation and displacement of
the domain walls. Comment: 18 pages, 5 figures, original research article
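The separation of the interfacial contribution described in this abstract typically follows the standard phenomenological decomposition for magnetic multilayers, stated here as background rather than taken from the paper's text:

```latex
% Effective anisotropy energy per magnetic layer, split into a volume
% term and an interface term (the factor 2 assumes two equivalent
% CoFeB/Pd interfaces per CoFeB layer of thickness t):
\begin{equation}
K_{\mathrm{eff}}\, t = K_{V}\, t + 2K_{S}
\end{equation}
% A linear fit of K_eff * t versus t then yields the volume anisotropy
% K_V from the slope and the interfacial anisotropy K_S from the
% intercept, which is how the interface contribution is separated from
% the effective uniaxial anisotropy.
```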
AAGAN: Android Malware Generation System Based on Generative Adversarial Network
With the rapid evolution of mobile malware, especially Android malware, machine learning (ML)-based Android malware detection systems have drawn massive attention. Although ML algorithms have recently led to many vital breakthroughs in malware detection, they remain particularly vulnerable to adversarial example (AE) attacks. By applying small perturbations (e.g., modifying different kinds of features from the application's manifest file), an AE attack can cause a malicious application to be misclassified as legitimate. This paper proposes AAGAN, an automated Android malware generation system based on Generative Adversarial Networks (GANs) that can successfully deceive current ML detectors. Our experimental results indicate that AEs generated by our system can flip the prediction of state-of-the-art detection algorithms in 99% of cases on a real-world dataset. To defend against AE attacks, we improve the robustness of our detection system by repeatedly retraining with these newly generated AEs. Surprisingly, even after five rounds of retraining, AAGAN can still achieve an 89% success rate in bypassing our malware detection system.
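The adversarial-retraining loop described above can be sketched schematically with toy stand-ins (a signature-set "detector" and an attack that always emits the same samples), not AAGAN itself:

```python
# Each round: generate adversarial examples against the current
# detector, record the attacker's success rate, then retrain the
# detector on the evasions. Toy stand-ins are used throughout.

def adversarial_retraining(detector, generate_aes, retrain, rounds=5):
    history = []
    for _ in range(rounds):
        aes = generate_aes(detector)            # attack current model
        evaded = [x for x in aes if not detector(x)]
        history.append(len(evaded) / len(aes))  # attacker success rate
        detector = retrain(detector, evaded)    # harden on the evasions
    return detector, history

# Stand-ins: membership in a growing signature set plays the role of a
# detector, and the "attack" always emits the same three samples.
known = set()
make_detector = lambda sigs: (lambda x: x in sigs)
generate_aes = lambda det: [1, 2, 3]

def retrain(det, evaded):
    known.update(evaded)          # learn the evasions as signatures
    return make_detector(known)

detector, history = adversarial_retraining(make_detector(known),
                                           generate_aes, retrain, rounds=3)
```

Against a static attacker this converges after one round; the abstract's point is that an adaptive generator like AAGAN keeps finding new evasions even after several such rounds.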