11 research outputs found

    SparkFlow: towards high-performance data analytics for Spark-based genome analysis

    Recent advances in DNA sequencing technology have triggered next-generation sequencing (NGS) research at full scale. Big Data (BD) is becoming the main driver in analyzing these large-scale bioinformatic data. However, this complicated process has become the system bottleneck, requiring an amalgamation of scalable approaches to deliver the needed performance and hide the deployment complexity. Utilizing cutting-edge scientific workflows can robustly address these challenges. This paper presents a Spark-based alignment workflow called SparkFlow for massive NGS analysis over Singularity containers. SparkFlow is highly scalable, reproducible, and capable of parallelizing computation by utilizing data-level parallelism and load balancing techniques in HPC and Cloud environments. The proposed workflow capitalizes on benchmarking two state-of-the-art NGS workflows, i.e., BaseRecalibrator and ApplyBQSR. SparkFlow realizes the ability to accelerate large-scale cancer genomic analysis by scaling vertically (HyperThreading) and horizontally (on-demand provisioning). Our results demonstrate an inevitable trade-off between the targeted applications and processor architecture. SparkFlow achieves a decisive improvement in NGS computation performance, throughput, and scalability while maintaining deployment complexity. The paper’s findings aim to pave the way for a wide range of revolutionary enhancements and future trends within the High-performance Data Analytics (HPDA) genome analysis realm. Postprint

    FedCSD: A Federated Learning Based Approach for Code-Smell Detection

    This paper proposes a Federated Learning Code Smell Detection (FedCSD) approach that allows organizations to collaboratively train federated ML models while preserving their data privacy. These assertions are supported by three experiments that leverage three manually validated datasets aimed at detecting and examining different code smell scenarios. In experiment 1, a centralized training experiment, dataset two achieved the lowest accuracy (92.30%) with fewer smells, while datasets one and three achieved the highest accuracy with a slight difference (98.90% and 99.5%, respectively). Experiment 2 was concerned with cross-evaluation: each ML model was trained using one dataset and then evaluated on the other two datasets. Results from this experiment show a significant drop in the model's accuracy (lowest accuracy: 63.80%) where fewer smells exist in the training dataset, which has a noticeable reflection (technical debt) on the model's performance. Finally, the third experiment evaluates our approach by splitting the dataset across 10 companies. The ML model was trained on each company's site, then all updated model weights were transferred to the server. Ultimately, an accuracy of 98.34% was achieved by the global model trained across 10 companies for 100 training rounds. The results reveal a slight difference in the global model's accuracy compared to the highest accuracy of the centralized model, which can be ignored in favour of the global model's comprehensive knowledge, lower training cost, preservation of data privacy, and avoidance of the technical debt problem. Comment: 17 pages, 7 figures, journal paper
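
    The server-side step described in experiment 3 (clients train locally, then send updated weights to the server) can be illustrated with a minimal sketch. The abstract does not name the aggregation rule, so the code below assumes a sample-size-weighted average in the style of FedAvg; the function name `federated_average` and the flat weight vectors are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Combine client weight vectors into one global vector.

    Assumption (not stated in the abstract): each client's contribution is
    proportional to its number of local training samples, as in FedAvg.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")

    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        share = size / total  # this client's fraction of all samples
        for i, value in enumerate(weights):
            aggregated[i] += share * value
    return aggregated


# Two hypothetical clients; the second holds three times as much data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)
```

    In a full training loop this aggregation would run once per round, with the resulting global weights sent back to every client before the next round of local training.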

    Machine Learning Schemes for Anomaly Detection in Solar Power Plants

    The rapid industrial growth of solar energy is driving increasing interest in renewable power from smart grids and plants. Anomaly detection in photovoltaic (PV) systems is a demanding task. It is therefore vital to use the latest advances in machine learning technology to disclose different system anomalies accurately and in a timely manner. This paper addresses this issue by evaluating the performance of different machine learning schemes and applying them to detect anomalies in photovoltaic components. The following schemes are evaluated: AutoEncoder Long Short-Term Memory (AE-LSTM), Facebook-Prophet, and Isolation Forest. These models can identify the PV system’s healthy and abnormal actual behaviors. Our results provide clear insights for making an informed decision, especially regarding the experimental trade-offs in such a complex solution space.

    A cognitive deep learning approach for medical image processing

    In ophthalmic diagnostics, achieving precise segmentation of retinal blood vessels is a critical yet challenging task, primarily due to the complex nature of retinal images. The intricacies of these images often hinder the accuracy and efficiency of segmentation processes. To overcome these challenges, we introduce the cognitive DL retinal blood vessel segmentation (CoDLRBVS), a novel hybrid model that combines the deep learning capabilities of the U-Net architecture with a suite of advanced image processing techniques. This model integrates a preprocessing phase using a matched filter (MF) for feature enhancement and a post-processing phase employing morphological techniques (MT) for refining the segmentation output. The model also incorporates multi-scale line detection and scale space methods to enhance its segmentation capabilities. Hence, CoDLRBVS leverages the strengths of these combined approaches within the cognitive computing framework, endowing the system with human-like adaptability and reasoning. This strategic integration enables the model to emphasize blood vessels, segment them accurately, and detect vessels of varying sizes. CoDLRBVS achieves a notable mean accuracy of 96.7%, precision of 96.9%, sensitivity of 99.3%, and specificity of 80.4% across all of the studied datasets, including DRIVE, STARE, HRF, retinal blood vessel and Chase-DB1. CoDLRBVS has been compared with different models, and the resulting metrics surpass the compared models and establish a new benchmark in retinal vessel segmentation. The success of CoDLRBVS underscores its significant potential in advancing medical image processing, particularly in the realm of retinal blood vessel segmentation.
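
    For readers unfamiliar with the four metrics reported above, the following sketch shows their standard definitions in terms of pixel-level confusion counts. It is not taken from the paper; the function name `segmentation_metrics` and the example counts are hypothetical, and the abstract does not state which class (vessel or background) is treated as positive.

```python
from typing import Dict


def segmentation_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are the numbers of true-positive, false-positive,
    true-negative and false-negative pixels for one segmentation mask.
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fp == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("a metric is undefined because its denominator is zero")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # share of all pixels classified correctly
        "precision": tp / (tp + fp),        # share of predicted positives that are correct
        "sensitivity": tp / (tp + fn),      # share of actual positives that are found
        "specificity": tn / (tn + fp),      # share of actual negatives that are rejected
    }


# Hypothetical counts, used only to exercise the function.
metrics = segmentation_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```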

    Active Machine Learning Adversarial Attack Detection in the User Feedback Process

    Modern Information and Communication Technology (ICT)-based applications utilize current technological advancements for streaming data, as a way of adapting to the ever-changing technological landscape. Such efforts require accurate, meaningful, and trustworthy output from the streaming sensors, particularly during dynamic virtual sensing. However, to ensure that the sensing ecosystem is devoid of any sensor threats or active attacks, it is paramount to implement secure real-time strategies. Fundamentally, real-time detection of adversarial attacks/instances during the User Feedback Process (UFP) is the key to forecasting potential attacks in active learning. Moreover, at the time of writing, the existing literature lacks a comprehensive study focused on adversarial detection from an active machine learning perspective. Therefore, the authors posit the importance of detecting adversarial attacks in an active learning strategy. In this paper, through a UFP-threat-driven model, an attack is defined as any action that alters the learning system or its data. To achieve this, the study employed ambient data for human activity recognition collected from a smart environment (Continuous Ambient Sensors Dataset, CASA) with fully labeled connections; we intentionally subjected the dataset to wrong labels as a targeted/manipulative attack (by a malevolent labeler) in the UFP, under the assumption that the user labels were connected to unique identities. While the dataset's focus is to classify tasks and predict activities, our study focuses on active adversarial strategies from an information security point of view. Furthermore, the strategies for modeling threats have been presented using the Meta Attack Language (MAL) compiler for purposes of adversarial detection. The findings from the experiments conducted show that real-time adversarial identification and profiling during the UFP could significantly increase the accuracy of the learning process with a high degree of certainty, and pave the way towards automated adversarial detection and profiling approaches on the Internet of Cognitive Things (ICoT).
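
    The idea of profiling labelers whose labels are tied to unique identities can be sketched as follows. This is an illustration only and not the detector or the MAL threat model used in the paper: it assumes a trusted reference label is available for some samples, and the names `disagreement_rates`, `flag_suspicious`, the 0.5 threshold, and the example records are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (labeler_id, sample_id, submitted_label); the layout is an assumption.
Record = Tuple[str, str, str]


def disagreement_rates(records: Iterable[Record],
                       reference: Dict[str, str]) -> Dict[str, float]:
    """Fraction of each labeler's labels that contradict the reference labels."""
    totals: Dict[str, int] = defaultdict(int)
    wrong: Dict[str, int] = defaultdict(int)
    for labeler, sample, label in records:
        if sample not in reference:
            continue  # no trusted label to compare against
        totals[labeler] += 1
        if label != reference[sample]:
            wrong[labeler] += 1
    return {labeler: wrong[labeler] / totals[labeler] for labeler in totals}


def flag_suspicious(records: Iterable[Record],
                    reference: Dict[str, str],
                    threshold: float = 0.5) -> List[str]:
    """Return labelers whose disagreement rate exceeds the (assumed) threshold."""
    rates = disagreement_rates(records, reference)
    return sorted(labeler for labeler, rate in rates.items() if rate > threshold)


# Hypothetical feedback: "mallory" flips two of three activity labels.
reference = {"s1": "cook", "s2": "sleep", "s3": "read"}
records = [
    ("alice", "s1", "cook"), ("alice", "s2", "sleep"),
    ("mallory", "s1", "sleep"), ("mallory", "s2", "cook"), ("mallory", "s3", "read"),
]
print(flag_suspicious(records, reference))
```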

    An Overview of using of Artificial Intelligence in Enhancing Security and Privacy in Mobile Social Networks

    Mobile Social Networks (MSNs) have emerged as pivotal platforms for communication, information dissemination, and social connection in contemporary society. As their prevalence escalates, so too do concerns regarding security and privacy. This paper furnishes a detailed analysis of these pressing issues and elucidates how Artificial Intelligence (AI) can be instrumental in addressing them. The study thoroughly explores a spectrum of security and privacy challenges endemic to MSNs, such as data leakage, unauthorized access, cyberstalking, location privacy, and more. Additionally, the investigation expands to encompass problems like impersonation, phishing attacks, malware threats, information overload, user profiling, inadequate privacy policies, third-party application vulnerabilities, and privacy issues related to photos, videos, end-to-end encryption, Wi-Fi connections, and data retention. Each of these issues is dissected in depth, highlighting the potential risks and implications for users. Furthermore, the paper underlines how AI can help mitigate these problems, establishing its fundamental role in fortifying the security and privacy of MSNs. This thorough analysis offers valuable insights and feasible solutions using AI to bolster security and privacy in the ever-evolving landscape of Mobile Social Networks. © 2023 IEEE

    A Comprehensive Study on the Role of Machine Learning in 5G Security: Challenges, Technologies, and Solutions

    Fifth-generation (5G) mobile networks have already marked their presence globally, revolutionizing entertainment, business, healthcare, and other domains. While this leap forward brings numerous advantages in speed and connectivity, it also poses new challenges for security protocols. Machine learning (ML) and deep learning (DL) have been employed to augment traditional security measures, promising to mitigate risks and vulnerabilities. This paper conducts an exhaustive study to assess ML and DL algorithms’ role and effectiveness within the 5G security landscape. Also, it offers a profound dissection of the 5G network’s security paradigm, particularly emphasizing the transformative role of ML and DL as enabling security tools. This study starts by examining the unique architecture of 5G and its inherent vulnerabilities, contrasting them with emerging threat vectors. Next, we conduct a detailed analysis of the network’s underlying segments, such as network slicing, Massive Machine-Type Communications (mMTC), and edge computing, revealing their associated security challenges. By scrutinizing current security protocols and international regulatory impositions, this paper delineates the existing 5G security landscape. Finally, we outline the capabilities of ML and DL in redefining 5G security. We detail their application in enhancing anomaly detection, fortifying predictive security measures, and strengthening intrusion prevention strategies. This research sheds light on the present-day 5G security challenges and offers a visionary perspective, highlighting the intersection of advanced computational methods and future 5G security

    FedCSD: A Federated Learning Based Approach for Code-Smell Detection

    Software quality is critical, as low quality, or "code smell," increases technical debt and maintenance costs. There is a timely need for a collaborative model that detects and manages code smells by learning from diverse and distributed data sources while respecting privacy and providing a scalable solution for continuously integrating new patterns and practices in code quality management. However, the current literature is still missing such capabilities. This paper addresses these challenges by proposing a Federated Learning Code Smell Detection (FedCSD) approach, specifically targeting "God Class," to enable organizations to collaboratively train distributed ML models while safeguarding data privacy. To validate our approach, we conduct experiments using manually validated datasets to detect and analyze code smell scenarios. Experiment 1, a centralized training experiment, revealed varying accuracies across datasets, with dataset two achieving the lowest accuracy (92.30%) and datasets one and three achieving the highest (98.90% and 99.5%, respectively). Experiment 2, focusing on cross-evaluation, showed a significant drop in accuracy (lowest: 63.80%) when fewer smells were present in the training dataset, reflecting technical debt. Experiment 3 involved splitting the dataset across 10 companies, resulting in a global model accuracy of 98.34%, comparable to the centralized model's highest accuracy. The application of federated ML techniques demonstrates promising performance improvements in code-smell detection, benefiting both software developers and researchers.

    ASSERT: A Blockchain-Based Architectural Approach for Engineering Secure Self-Adaptive IoT Systems

    Internet of Things (IoT) systems are complex systems that can manage mission-critical, costly operations or the collection, storage, and processing of sensitive data. Therefore, security represents a primary concern that should be considered when engineering IoT systems. Additionally, several challenges need to be addressed. IoT systems’ environments are dynamic and uncertain. For instance, IoT devices can be mobile or might run out of battery, so they can suddenly become unavailable. To cope with such environments, IoT systems can be engineered as goal-driven and self-adaptive systems. A goal-driven IoT system is composed of a dynamic set of IoT devices and services that temporarily connect and cooperate to achieve a specific goal. Several approaches have been proposed to engineer goal-driven and self-adaptive IoT systems. However, none of the existing approaches enables goal-driven IoT systems to automatically detect security threats and autonomously adapt to mitigate them. Toward bridging these gaps, this paper proposes a distributed architectural Approach for engineering goal-driven IoT Systems that can autonomously SElf-adapt to secuRity Threats in their environments (ASSERT). ASSERT exploits techniques and adopts notions, such as agents, federated learning, feedback loops, and blockchain, for maintaining the systems’ security and enhancing the trustworthiness of the adaptations they perform. The results of the experiments that we conducted to validate the approach’s feasibility show that it performs and scales well when detecting security threats, performing autonomous security adaptations to mitigate the threats, and enabling systems’ constituents to learn collaboratively about security threats in their environments. © 2022 by the authors. Open access