726 research outputs found

    The Application of Deep Learning and Cloud Technologies to Data Science

    Get PDF
    Machine Learning and Cloud Computing have become a staple to businesses and educational institutions over the recent years. The two forefronts of big data solutions have garnered technology giants to race for the superior implementation of both Machine Learning and Cloud Computing. The objective of this thesis is to test and utilize AWS SageMaker in three different applications: time-series forecasting with sentiment analysis, automated Machine Learning (AutoML), and finally anomaly detection. The first study covered is a sentiment-based LSTM for stock price prediction. The LSTM was created with two methods, the first being SQL Server Data Tools, and the second being an implementation of LSTM using the Keras library. These results were then evaluated using accuracy, precision, recall, f-1 score, mean absolute error (MAE), root mean squared error (RMSE), and symmetric mean absolute percentage error (SMAPE). The results of this project were that the sentiment models all outperformed the control LSTM. The public model for Facebook on SQL Server Data Tools performed the best overall with 0.9743 accuracy and 0.9940 precision. The second study covered is an application of AWS SageMaker AutoPilot which is an AutoML platform designed to make Machine Learning more accessible to those without programming backgrounds. The methodology of this study follows the application of AWS Data Wrangler and AutoPilot from beginning of the process to completion. The results were evaluated using the metrics of: accuracy, precision, recall, and f-1 score. The best accuracy is given to the LightGBM model on the AI4I Maintenance dataset with an accuracy of 0.983. This model also scored the best on precision, recall, and F1 Score. The final study covered is an anomaly detection system for cyber security intrusion detection system data. The Intrusion Detection Systems that have been rule based are able to catch most of the cyber threats that are prevalent in network traffic; however, the copious amounts of alerts are nearly impossible for humans to keep up with. The methodology of this study follows a typical taxonomy of: data collection, data processing, model creation, and model evaluation. Both Random Cut Forest and XGBoost are implemented using AWS SageMaker. The Supervised Learning Algorithm of XGBoost was able to have the highest accuracy of all models with Model 2 giving an accuracy of 0.6183. This model also showed a Precision of 0.5902, Recall of 0.9649, and F1 Score 0.7324

    Optimizing on-demand resource deployment for peer-assisted content delivery (PhD thesis)

    Full text link
    Increasingly, content delivery solutions leverage client resources in exchange for service in a peer-to-peer (P2P) fashion. Such peer-assisted service paradigms promise significant infrastructure cost reduction, but suffer from the unpredictability associated with client resources, which is often exhibited as an imbalance between the contribution and consumption of resources by clients. This imbalance hinders the ability to guarantee a minimum service fidelity of these services to the clients. In this thesis, we propose a novel architectural service model that enables the establishment of higher fidelity services through (1) coordinating the content delivery to optimally utilize the available resources, and (2) leasing the least additional cloud resources, available through special nodes (angels) that join the service on-demand, and only if needed, to complement the scarce resources available through clients. While the proposed service model can be deployed in many settings, this thesis focuses on peer-assisted content delivery applications, in which the scarce resource is typically the uplink capacity of clients. We target three applications that require the delivery of fresh as opposed to stale content. The first application is bulk-synchronous transfer, in which the goal of the system is to minimize the maximum distribution time -- the time it takes to deliver the content to all clients in a group. The second application is live streaming, in which the goal of the system is to maintain a given streaming quality. The third application is Tor, the anonymous onion routing network, in which the goal of the system is to boost performance (increase throughput and reduce latency) throughout the network, and especially for bandwidth-intensive applications. For each of the above applications, we develop mathematical models that optimally allocate the already available resources. They also optimally allocate additional on-demand resource to achieve a certain level of service. Our analytical models and efficient constructions depend on some simplifying, yet impractical, assumptions. Thus, inspired by our models and constructions, we develop practical techniques that we incorporate into prototypical peer-assisted angel-enabled cloud services. We evaluate those techniques through simulation and/or implementation. (Major Advisor: Azer Bestavros

    Optimizing on-demand resource deployment for peer-assisted content delivery

    Full text link
    Increasingly, content delivery solutions leverage client resources in exchange for services in a pee-to-peer (P2P) fashion. Such peer-assisted service paradigm promises significant infrastructure cost reduction, but suffers from the unpredictability associated with client resources, which is often exhibited as an imbalance between the contribution and consumption of resources by clients. This imbalance hinders the ability to guarantee a minimum service fidelity of these services to clients especially for real-time applications where content can not be cached. In this thesis, we propose a novel architectural service model that enables the establishment of higher fidelity services through (1) coordinating the content delivery to efficiently utilize the available resources, and (2) leasing the least additional cloud resources, available through special nodes (angels) that join the service on-demand, and only if needed, to complement the scarce resources available through clients. While the proposed service model can be deployed in many settings, this thesis focuses on peer-assisted content delivery applications, in which the scarce resource is typically the upstream capacity of clients. We target three applications that require the delivery of real-time as opposed to stale content. The first application is bulk-synchronous transfer, in which the goal of the system is to minimize the maximum distribution time - the time it takes to deliver the content to all clients in a group. The second application is live video streaming, in which the goal of the system is to maintain a given streaming quality. The third application is Tor, the anonymous onion routing network, in which the goal of the system is to boost performance (increase throughput and reduce latency) throughout the network, and especially for clients running bandwidth-intensive applications. For each of the above applications, we develop analytical models that efficiently allocate the already available resources. They also efficiently allocate additional on-demand resource to achieve a certain level of service. Our analytical models and efficient constructions depend on some simplifying, yet impractical, assumptions. Thus, inspired by our models and constructions, we develop practical techniques that we incorporate into prototypical peer-assisted angel-enabled cloud services. We evaluate these techniques through simulation and/or implementation

    Nature-inspired survivability: Prey-inspired survivability countermeasures for cloud computing security challenges

    Get PDF
    As cloud computing environments become complex, adversaries have become highly sophisticated and unpredictable. Moreover, they can easily increase attack power and persist longer before detection. Uncertain malicious actions, latent risks, Unobserved or Unobservable risks (UUURs) characterise this new threat domain. This thesis proposes prey-inspired survivability to address unpredictable security challenges borne out of UUURs. While survivability is a well-addressed phenomenon in non-extinct prey animals, applying prey survivability to cloud computing directly is challenging due to contradicting end goals. How to manage evolving survivability goals and requirements under contradicting environmental conditions adds to the challenges. To address these challenges, this thesis proposes a holistic taxonomy which integrate multiple and disparate perspectives of cloud security challenges. In addition, it proposes the TRIZ (Teorija Rezbenija Izobretatelskib Zadach) to derive prey-inspired solutions through resolving contradiction. First, it develops a 3-step process to facilitate interdomain transfer of concepts from nature to cloud. Moreover, TRIZ’s generic approach suggests specific solutions for cloud computing survivability. Then, the thesis presents the conceptual prey-inspired cloud computing survivability framework (Pi-CCSF), built upon TRIZ derived solutions. The framework run-time is pushed to the user-space to support evolving survivability design goals. Furthermore, a target-based decision-making technique (TBDM) is proposed to manage survivability decisions. To evaluate the prey-inspired survivability concept, Pi-CCSF simulator is developed and implemented. Evaluation results shows that escalating survivability actions improve the vitality of vulnerable and compromised virtual machines (VMs) by 5% and dramatically improve their overall survivability. Hypothesis testing conclusively supports the hypothesis that the escalation mechanisms can be applied to enhance the survivability of cloud computing systems. Numeric analysis of TBDM shows that by considering survivability preferences and attitudes (these directly impacts survivability actions), the TBDM method brings unpredictable survivability information closer to decision processes. This enables efficient execution of variable escalating survivability actions, which enables the Pi-CCSF’s decision system (DS) to focus upon decisions that achieve survivability outcomes under unpredictability imposed by UUUR

    Empowering Patient Similarity Networks through Innovative Data-Quality-Aware Federated Profiling

    Get PDF
    Continuous monitoring of patients involves collecting and analyzing sensory data from a multitude of sources. To overcome communication overhead, ensure data privacy and security, reduce data loss, and maintain efficient resource usage, the processing and analytics are moved close to where the data are located (e.g., the edge). However, data quality (DQ) can be degraded because of imprecise or malfunctioning sensors, dynamic changes in the environment, transmission failures, or delays. Therefore, it is crucial to keep an eye on data quality and spot problems as quickly as possible, so that they do not mislead clinical judgments and lead to the wrong course of action. In this article, a novel approach called federated data quality profiling (FDQP) is proposed to assess the quality of the data at the edge. FDQP is inspired by federated learning (FL) and serves as a condensed document or a guide for node data quality assurance. The FDQP formal model is developed to capture the quality dimensions specified in the data quality profile (DQP). The proposed approach uses federated feature selection to improve classifier precision and rank features based on criteria such as feature value, outlier percentage, and missing data percentage. Extensive experimentation using a fetal dataset split into different edge nodes and a set of scenarios were carefully chosen to evaluate the proposed FDQP model. The results of the experiments demonstrated that the proposed FDQP approach positively improved the DQ, and thus, impacted the accuracy of the federated patient similarity network (FPSN)-based machine learning models. The proposed data-quality-aware federated PSN architecture leveraging FDQP model with data collected from edge nodes can effectively improve the data quality and accuracy of the federated patient similarity network (FPSN)-based machine learning models. Our profiling algorithm used lightweight profile exchange instead of full data processing at the edge, which resulted in optimal data quality achievement, thus improving efficiency. Overall, FDQP is an effective method for assessing data quality in the edge computing environment, and we believe that the proposed approach can be applied to other scenarios beyond patient monitoring

    Cybersecurity in travel and tourism: a risk-based approach

    Get PDF
    As the travel and tourism sector is embracing emerging technologies to redefine products, services, and consumer experiences, their cyber ecosystems become increasingly vulnerable to security risks related with these technologies, the huge amount of financial transactions they carry out and the valuable customer data they store. Over the last few years, several high-profile organizations in the sector made negative headlines because they did not pay appropriate attention to these risks and took an approach to cybersecurity that was fragmented, technology-focused and compliance-oriented. It is evident that a step change is needed, and this chapter presents a more comprehensive, business-driven and risk-based approach to building cybersecurity capability in an organization. The chapter starts with the business case for a cybersecurity strategy and then unfolds the components of a risk-based approach to cybersecurity

    Cognitive Machine Individualism in a Symbiotic Cybersecurity Policy Framework for the Preservation of Internet of Things Integrity: A Quantitative Study

    Get PDF
    This quantitative study examined the complex nature of modern cyber threats to propose the establishment of cyber as an interdisciplinary field of public policy initiated through the creation of a symbiotic cybersecurity policy framework. For the public good (and maintaining ideological balance), there must be recognition that public policies are at a transition point where the digital public square is a tangible reality that is more than a collection of technological widgets. The academic contribution of this research project is the fusion of humanistic principles with Internet of Things (IoT) technologies that alters our perception of the machine from an instrument of human engineering into a thinking peer to elevate cyber from technical esoterism into an interdisciplinary field of public policy. The contribution to the US national cybersecurity policy body of knowledge is a unified policy framework (manifested in the symbiotic cybersecurity policy triad) that could transform cybersecurity policies from network-based to entity-based. A correlation archival data design was used with the frequency of malicious software attacks as the dependent variable and diversity of intrusion techniques as the independent variable for RQ1. For RQ2, the frequency of detection events was the dependent variable and diversity of intrusion techniques was the independent variable. Self-determination Theory is the theoretical framework as the cognitive machine can recognize, self-endorse, and maintain its own identity based on a sense of self-motivation that is progressively shaped by the machine’s ability to learn. The transformation of cyber policies from technical esoterism into an interdisciplinary field of public policy starts with the recognition that the cognitive machine is an independent consumer of, advisor into, and influenced by public policy theories, philosophical constructs, and societal initiatives
    • …
    corecore