1,960 research outputs found

    Low-cost deep learning UAV and Raspberry Pi solution to real time pavement condition assessment

    In this thesis, a real-time and low-cost solution to the autonomous condition assessment of pavement is proposed using deep learning, Unmanned Aerial Vehicle (UAV) and Raspberry Pi tiny computer technologies, which makes road maintenance and renovation management more efficient and cost-effective. A comparison study evaluated the performance of seven different combinations of meta-architectures for pavement distress classification. It was observed that the real-time object detection architecture SSD with a MobileNet feature extractor is the best combination for real-time defect detection on tiny computers. A low-cost Raspberry Pi smart defect detector camera was configured using the trained SSD MobileNet v1, which can be deployed with a UAV for real-time and remote pavement condition assessment. The preliminary results show that the smart pavement detector camera achieves an accuracy of 60% at 1.2 frames per second on the Raspberry Pi and 96% at 13.8 frames per second on a CPU-based computer.

    Metric Selection and Metric Learning for Matching Tasks

    A quarter of a century after the world-wide web was born, we have grown accustomed to having easy access to a wealth of data sets and open-source software. The value of these resources is restricted if they are not properly integrated and maintained. A lot of this work boils down to matching: finding existing records about entities and enriching them with information from a new data source. In the realm of code this means integrating new code snippets into a code base while avoiding duplication. In this thesis, we address two such matching problems. First, we leverage the diverse and mature set of string similarity measures in an iterative semi-supervised learning approach to string matching. It is designed to query a user to make a sequence of decisions on specific cases of string matching. We show that we can find almost optimal solutions after only a small amount of such input. The low labelling complexity of our algorithm is due to addressing the cold start problem that is inherent to Active Learning: by ranking queries by variance before the arrival of enough supervision information, and by a self-regulating mechanism that counteracts initial biases. Second, we address the matching of code fragments for deduplication. Programming code is not only a tool, but also a resource that itself demands maintenance. Code duplication is a frequent problem arising especially from modern development practice. There are many reasons to detect and address code duplicates, for example to keep a clean and maintainable codebase. In such more complex data structures, string similarity measures are inadequate. In their stead, we study a modern supervised Metric Learning approach to model code similarity with Neural Networks. We find that in such a model representing the elementary tokens with a pretrained word embedding is the most important ingredient. Our results show both qualitatively (by visualization) that relatedness is modelled well by the embeddings and quantitatively (by ablation) that the encoded information is useful for the downstream matching task. As a non-technical contribution, we unify the common challenges arising in supervised learning approaches to Record Matching, Code Clone Detection and generic Metric Learning tasks. We give a novel account of string similarity measures from a psychological standpoint and point out and document one longstanding naming conflict in string similarity measures. Finally, we point out the overlap of the latest research in Code Clone Detection with the field of Natural Language Processing.
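
The abstract above mentions ranking queries by variance while little supervision is available, but it does not say what the variance is taken over. The sketch below illustrates one possible reading, as an assumption: candidate string pairs are scored by several string similarity measures, and the pairs on which the measures disagree most (highest variance) are shown to the user first. The three measures, the function names, and the example pairs are hypothetical choices for illustration and are not taken from the thesis.

```python
from difflib import SequenceMatcher
from statistics import pvariance


def levenshtein_similarity(a: str, b: str) -> float:
    """Return 1 minus the edit distance normalized by the longer length."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def bigram_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the character bigram sets of both strings."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def sequence_ratio(a: str, b: str) -> float:
    """Similarity ratio from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical choice of measures; any set of similarity functions would do.
MEASURES = (levenshtein_similarity, bigram_jaccard, sequence_ratio)


def disagreement(pair):
    """Population variance of the similarity scores for one string pair."""
    a, b = pair
    return pvariance([measure(a, b) for measure in MEASURES])


def rank_queries(pairs):
    """Order candidate pairs so the most disputed ones are queried first."""
    return sorted(pairs, key=disagreement, reverse=True)


if __name__ == "__main__":
    candidates = [("abc", "abc"), ("jon smith", "john smyth")]
    for pair in rank_queries(candidates):
        print(pair, round(disagreement(pair), 4))
```

Identical strings receive the same score from every measure, so their variance is zero and they sink to the end of the queue; pairs that the measures judge differently rise to the top, which is where a user decision is most informative under this reading.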

    Deep Learning-Powered Computational Intelligence for Cyber-Attacks Detection and Mitigation in 5G-Enabled Electric Vehicle Charging Station

    Electric vehicle charging station (EVCS) infrastructure is the backbone of transportation electrification. However, the EVCS has various cyber-attack vulnerabilities in software, hardware, supply chain, and incumbent legacy technologies such as network, communication, and control. Therefore, proactively monitoring, detecting, and defending against these attacks is very important. The state-of-the-art approaches are not agile and intelligent enough to detect, mitigate, and defend against various cyber-physical attacks in the EVCS system. To overcome these limitations, this dissertation primarily designs, develops, implements, and tests data-driven deep learning-powered computational intelligence to detect and mitigate cyber-physical attacks at the network and physical layers of 5G-enabled EVCS infrastructure. Also, the 5G slicing application to ensure the security and service level agreement (SLA) in the EVCS ecosystem has been studied. Various cyber-attacks such as distributed denial of service (DDoS), false data injection (FDI), advanced persistent threats (APT), and ransomware attacks on the network in a standalone 5G-enabled EVCS environment have been considered. Mathematical models for the mentioned cyber-attacks have been developed. The impact of cyber-attacks on the EVCS operation has been analyzed. Various deep learning-powered intrusion detection systems have been proposed to detect attacks using local electrical and network fingerprints. Furthermore, a novel detection framework has been designed and developed to deal with ransomware threats in high-speed, high-dimensional, multimodal data and assets from eccentric stakeholders of the connected automated vehicle (CAV) ecosystem. To mitigate the adverse effects of cyber-attacks on EVCS controllers, novel data-driven digital clones based on Twin Delayed Deep Deterministic Policy Gradient (TD3) Deep Reinforcement Learning (DRL) have been developed. Also, various brute-force and controller clone-based methods have been devised and tested to aid the defense against, and mitigation of, the impact of the attacks on the EVCS operation. The performance of the proposed mitigation method has been compared with that of a benchmark Deep Deterministic Policy Gradient (DDPG)-based digital clones approach. Simulation results obtained from the Python, Matlab/Simulink, and NetSim software demonstrate that the cyber-attacks are disruptive and detrimental to the operation of EVCS. The proposed detection and mitigation methods are effective and perform better than the conventional and benchmark techniques for the 5G-enabled EVCS.

    Android source code vulnerability detection: a systematic literature review

    The use of mobile devices is rising daily in this technological era. An ever-increasing number of mobile applications is offered on mobile marketplaces to fulfil the needs of smartphone users. Many Android applications do not address the security aspects appropriately. This is often due to a lack of automated mechanisms to identify, test, and fix source code vulnerabilities at the early stages of design and development. Therefore, the need to fix such issues at the initial stages rather than providing updates and patches to the published applications is widely recognized. Researchers have proposed several methods to improve the security of applications by detecting source code vulnerabilities and malicious code. This Systematic Literature Review (SLR) focuses on Android application analysis and source code vulnerability detection methods and tools by critically evaluating 118 carefully selected technical studies published between 2016 and 2022. It highlights the advantages, disadvantages, and applicability of the proposed techniques, as well as potential improvements of those studies. Both Machine Learning (ML) based methods and conventional methods related to vulnerability detection are discussed, with more focus on ML-based methods since many recent studies conducted experiments with ML. Therefore, this paper aims to enable researchers to acquire in-depth knowledge in secure mobile application development while minimizing the vulnerabilities by applying ML methods. Furthermore, researchers can use the discussions and findings of this SLR to identify potential future research and development directions.

    Checking smart contracts with structural code embedding

    Ministry of Education, Singapore under its Academic Research Funding Tier

    How to get better embeddings with code pre-trained models? An empirical study

    Pre-trained language models have demonstrated powerful capabilities in the field of natural language processing (NLP). Recently, code pre-trained models (PTMs), which draw from the experiences of the NLP field, have also achieved state-of-the-art results in many software engineering (SE) downstream tasks. These code PTMs take into account the differences between programming languages and natural languages during pre-training and make adjustments to pre-training tasks and input data. However, researchers in the SE community still inherit habits from the NLP field when using these code PTMs to generate embeddings for SE downstream classification tasks, such as generating semantic embeddings for code snippets through special tokens and inputting code and text information in the same way as when pre-training the PTMs. In this paper, we empirically study five different PTMs (i.e. CodeBERT, CodeT5, PLBART, CodeGPT and CodeGen) with three different architectures (i.e. encoder-only, decoder-only and encoder-decoder) on four SE downstream classification tasks (i.e. code vulnerability detection, code clone detection, just-in-time defect prediction and function docstring mismatch detection) with respect to the two aforementioned aspects. Our experimental results indicate that (1) regardless of the architecture of the code PTMs used, embeddings obtained through special tokens do not sufficiently aggregate the semantic information of the entire code snippet; (2) the quality of code embeddings obtained by combining code data and text data in the same way as when pre-training the PTMs is poor and cannot guarantee richer semantic information; (3) using the method that aggregates the vector representations of all code tokens, the decoder-only PTMs can obtain code embeddings whose semantics are as rich as, or of even better quality than, those obtained from the encoder-only and encoder-decoder PTMs.
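
The contrast the abstract draws, between taking the embedding of a special token and aggregating the vectors of all code tokens, can be illustrated with a small sketch. The abstract does not say which aggregation is used, so mean pooling is assumed here purely for illustration, and the toy vectors stand in for the per-token outputs of a real model. The function names are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def first_token_embedding(token_vectors: Sequence[Vector]) -> Vector:
    """Use the vector of the first (special) token as the snippet embedding."""
    return list(token_vectors[0])


def mean_pool(token_vectors: Sequence[Vector],
              attention_mask: Sequence[int]) -> Vector:
    """Average the vectors of all non-padding tokens (mask value 1)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vector[k]
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


if __name__ == "__main__":
    # Toy per-token vectors: a special token, one code token, one padding slot.
    vectors = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
    mask = [1, 1, 0]  # the last position is padding and must be ignored
    print(first_token_embedding(vectors))
    print(mean_pool(vectors, mask))
```

The first function depends on a single position, while the second uses every unmasked token, which is the kind of aggregation the third finding refers to.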

    Deep learning for unsupervised domain adaptation in medical imaging: Recent advancements and future perspectives

    Deep learning has demonstrated remarkable performance across various tasks in medical imaging. However, these approaches primarily focus on supervised learning, assuming that the training and testing data are drawn from the same distribution. Unfortunately, this assumption may not always hold true in practice. To address these issues, unsupervised domain adaptation (UDA) techniques have been developed to transfer knowledge from a labeled domain to a related but unlabeled domain. In recent years, significant advancements have been made in UDA, resulting in a wide range of methodologies, including feature alignment, image translation, self-supervision, and disentangled representation methods, among others. In this paper, we provide a comprehensive literature review of recent deep UDA approaches in medical imaging from a technical perspective. Specifically, we categorize current UDA research in medical imaging into six groups and further divide them into finer subcategories based on the different tasks they perform. We also discuss the respective datasets used in the studies to assess the divergence between the different domains. Finally, we discuss emerging areas and provide insights and discussions on future research directions to conclude this survey.