
    Data Replication and Its Alignment with Fault Management in the Cloud Environment

    Nowadays, exponential data growth has become one of the major challenges worldwide. It can cause a series of negative impacts such as network overloading, high system complexity, and inadequate data security. Cloud computing was developed as a novel paradigm to alleviate massive data processing challenges with its on-demand services and distributed architecture. Data replication strategically distributes the data access load across multiple cloud data centres by creating copies of the data at several of them. A replica-applied cloud environment not only achieves shorter response times, higher data availability, and a more balanced resource load, but also protects the cloud environment against upcoming faults. A reactive fault tolerance strategy is also required to handle faults once they have occurred. Consequently, data replication strategies should be aligned with reactive fault tolerance strategies to achieve a complete management chain in the cloud environment. In this thesis, a data replication and fault management framework is proposed to establish decentralised, overarching management of the cloud environment. Three data replication strategies are first proposed based on this framework. A replica creation strategy is proposed to reduce the total cost by jointly considering data dependency and access frequency in the replica creation decision-making process. In addition, a cloud-map-oriented and cost-efficiency-driven replica creation strategy is proposed to achieve the optimal cost reduction per replica in the cloud environment. The local and remote data relationships are further analysed through two novel data dependency types defined by data location: Within-DataCentre Data Dependency and Between-DataCentre Data Dependency. Furthermore, a network-performance-based replica selection strategy is proposed to avoid potential network overloading and to increase the number of concurrently running instances at the same time.
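
    The abstract gives no code; as a rough illustration only, the sketch below shows one way a replica creation decision could jointly weigh access frequency and data dependency against replica cost. The scoring formula, weights, and field names are assumptions, not the thesis's actual strategy.

```python
# Hypothetical sketch of a replica creation decision that jointly weighs
# access frequency and data dependency, as the abstract describes.
# The scoring formula, weights, and budget rule are illustrative assumptions.

def replica_creation_score(access_freq, dependency_degree, replica_cost,
                           w_freq=0.6, w_dep=0.4):
    """Benefit-per-cost score for creating a replica of one data item."""
    benefit = w_freq * access_freq + w_dep * dependency_degree
    return benefit / replica_cost

def select_replicas(candidates, budget):
    """Greedily create the replicas with the best cost efficiency."""
    ranked = sorted(candidates,
                    key=lambda c: replica_creation_score(
                        c["access_freq"], c["dependency_degree"], c["cost"]),
                    reverse=True)
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c["cost"] <= budget:
            chosen.append(c["id"])
            spent += c["cost"]
    return chosen

print(select_replicas([
    {"id": "d1", "access_freq": 120, "dependency_degree": 3, "cost": 10.0},
    {"id": "d2", "access_freq": 40, "dependency_degree": 8, "cost": 5.0},
], budget=12.0))
```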

    Improved Task Scheduling for Virtual Machines in the Cloud based on the Gravitational Search Algorithm

    The rapid and convenient provision of available computing resources is a crucial requirement in modern cloud computing environments. However, if only the execution time is taken into account when resources are scheduled, the result can be imbalanced workloads as well as significant under-utilisation of the involved Virtual Machines (VMs). In the present work a novel task scheduling scheme is introduced, based on a suitable adaptation of a modern and quite effective evolutionary optimization method, the Gravitational Search Algorithm (GSA). The proposed scheme aims at optimizing the entire scheduling procedure in terms of both task execution time and system (VM) resource utilisation. Moreover, the fitness function was selected to combine both of the above factors in an appropriately weighted function, in order to obtain better results for large inputs. Extensive simulation experiments show the efficiency of the proposed scheme, as well as its superiority over related approaches in the literature with similar objectives.
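
    As a hedged illustration of the weighted fitness idea the abstract describes, the sketch below combines makespan with VM utilisation in a single weighted score. The schedule encoding and the weight values are assumptions, not the paper's actual function.

```python
# Illustrative sketch (not the paper's code) of a weighted fitness function
# combining makespan and VM utilisation. Weights and encoding are assumptions.

def fitness(schedule, task_lengths, vm_speeds, w_time=0.7, w_util=0.3):
    """schedule[i] = index of the VM that task i is assigned to."""
    loads = [0.0] * len(vm_speeds)
    for task, vm in enumerate(schedule):
        loads[vm] += task_lengths[task] / vm_speeds[vm]
    makespan = max(loads)
    utilisation = sum(loads) / (len(vm_speeds) * makespan)  # in (0, 1]
    # Lower is better: short makespan and high utilisation.
    return w_time * makespan + w_util * (1.0 - utilisation)

print(fitness([0, 1, 1, 0], task_lengths=[4, 2, 6, 3], vm_speeds=[1.0, 2.0]))
```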

    A Survey on Array Storage, Query Languages, and Systems

    Since scientific investigation is one of the most important producers of massive amounts of ordered data, there is renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to partitioning arrays into chunks. The identification of a reduced set of array operators to form the foundation of an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete, though; we greatly appreciate pointers to any work we might have forgotten to mention.
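
    To make the storage topic concrete, here is a minimal sketch of regular array chunking, the partitioning scheme the survey's first topic revolves around. The helper names and chunk shape are illustrative assumptions.

```python
# Minimal sketch of regular (aligned) array chunking: the array is cut into
# fixed-shape tiles, each addressed by its origin coordinates.
import itertools
import numpy as np

def chunk_slices(shape, chunk_shape):
    """Yield (origin, slices) for each chunk of a regularly chunked array."""
    starts = [range(0, s, c) for s, c in zip(shape, chunk_shape)]
    for origin in itertools.product(*starts):
        yield origin, tuple(slice(o, o + c)
                            for o, c in zip(origin, chunk_shape))

array = np.arange(36).reshape(6, 6)
chunks = {origin: array[sl] for origin, sl in chunk_slices(array.shape, (3, 3))}
print(sorted(chunks))          # chunk origins: (0,0) (0,3) (3,0) (3,3)
print(chunks[(3, 3)].shape)    # each chunk is a (3, 3) tile
```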

    Towards mobile cloud computing with single sign-on access

    This is a post-peer-review, pre-copyedit version of an article published in Journal of Grid Computing. The final authenticated version is available online at: http://dx.doi.org/10.1007/s10723-017-9413-3
    The low computing power of mobile devices impedes the development of mobile applications with a heavy computing load. Mobile Cloud Computing (MCC) has emerged as the solution to this by connecting mobile devices with the "infinite" computing power of the Cloud. As mobile devices typically communicate over untrusted networks, it becomes necessary to secure the communications to avoid privacy-sensitive data breaches. This paper presents work on implementing MCC applications with secure communications. For that purpose, we built on COMPSs-Mobile, a redesigned implementation of the COMP Superscalar (COMPSs) framework aimed at MCC platforms. COMPSs-Mobile automatically exploits the parallelism inherent in an application and orchestrates its execution on a loosely-coupled distributed environment. To avoid vendor lock-in, this extension leverages the Generic Security Services Application Program Interface (GSSAPI) (RFC2743) as a generic way to access security services that provide communications with authentication, secrecy and integrity. Besides, GSSAPI allows applications to take advantage of more advanced features, such as Federated Identity or Single Sign-On, which the underlying security framework may provide. To validate the practicality of the proposal, we use Kerberos as the security services provider to implement SSO; however, applications do not authenticate themselves, and users are required to obtain and place the credentials beforehand. To evaluate the performance, we conducted tests running an application on a smartphone while offloading tasks to a private cloud. Our results show that the overhead of securing the communications is acceptable.
    This work has been supported by the Spanish Government (contracts TIN2012-34557, TIN2015-65316-P and grants BES-2013-067167, EEBB-I-15-09808 of the Research Training Program and SEV-2011-00067 of the Severo Ochoa Program), by the Generalitat de Catalunya (contract 2014-SGR-1051) and by the European Commission (ASCETiC project, FP7-ICT-2013.1.2 contract 610874). The second author was partially supported by the European Commission's Horizon2020 programme under grant agreement 653965 (AARC).
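
    As a conceptual illustration of the GSSAPI flow described above, the sketch below uses the python-gssapi bindings (an assumption for illustration; COMPSs-Mobile itself is not Python). The send_token/recv_token transport helpers and the service principal name are hypothetical.

```python
# Conceptual sketch of GSSAPI-secured messaging with python-gssapi.
# send_token/recv_token are hypothetical transport helpers, and the service
# name assumes a Kerberos principal already exists for the worker.
import gssapi

def secure_offload(send_token, recv_token, payload: bytes):
    service = gssapi.Name('compss@worker.example.org',
                          name_type=gssapi.NameType.hostbased_service)
    ctx = gssapi.SecurityContext(name=service, usage='initiate')
    in_token = None
    while not ctx.complete:
        out_token = ctx.step(in_token)   # context-establishment token exchange
        if out_token:
            send_token(out_token)
        if not ctx.complete:
            in_token = recv_token()
    # The established context provides integrity and, here, confidentiality.
    wrapped = ctx.wrap(payload, encrypt=True)
    send_token(wrapped.message)
    return ctx
```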

    CLOSURE: A cloud scientific workflow scheduling algorithm based on attack-defense game model

    The multi-tenant coexistence service mode exposes cloud-based scientific workflows to the risk of intrusion. For this problem, we propose a CLoud scientific wOrkflow SchedUling algoRithm based on an attack-defensE game model (CLOSURE). In the algorithm, attacks based on different operating system vulnerabilities are regarded as different "attack" strategies, and different operating system distributions in the virtual machine cluster executing the workflows are regarded as different "defense" strategies. The information available to the attacker and the defender is asymmetric: the defender cannot obtain information about the attacker's strategies, while the attacker can acquire information about the defender's strategies through a network scan. Therefore, we propose to dynamically switch the defense strategies during workflow execution, which weakens the effect of network scans and transforms the workflow security problem into an attack-defense game problem. The probability distribution of the optimal mixed defense strategy can then be obtained by calculating the Nash Equilibrium of the attack-defense game model, and based on this distribution, diverse VMs are provisioned for workflow execution. Furthermore, a task-VM mapping algorithm based on dynamic Heterogeneous Earliest Finish Time (HEFT) is presented to accelerate the defense strategy switching and improve workflow efficiency. Experiments are conducted in both simulated and real environments; the results demonstrate that, compared with other algorithms, the proposed algorithm can reduce the attacker's benefits by around 15.23% and decrease the time costs of the algorithm by around 7.86%.
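
    For a zero-sum formulation, the optimal mixed defense strategy the abstract mentions can be computed by linear programming. The sketch below is a generic minimax computation with an invented payoff matrix, not the paper's exact game model.

```python
# Illustrative minimax computation of a defender's optimal mixed strategy
# in a zero-sum attack-defense game, via linear programming. The payoff
# matrix is invented for the example.
import numpy as np
from scipy.optimize import linprog

def optimal_defense(payoff):
    """payoff[i][j]: attacker gain when attack i meets OS/defense j.
    Returns the defense mix minimising the attacker's worst-case gain."""
    n_attacks, n_defenses = payoff.shape
    # Variables: defense probabilities y_1..y_n, then the game value v.
    c = np.r_[np.zeros(n_defenses), 1.0]          # minimise v
    A_ub = np.c_[payoff, -np.ones(n_attacks)]     # P @ y - v <= 0 per attack
    b_ub = np.zeros(n_attacks)
    A_eq = np.r_[np.ones(n_defenses), 0.0][None]  # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n_defenses + [(None, None)])
    return res.x[:n_defenses], res.x[-1]

payoff = np.array([[4.0, 1.0, 2.0],   # attack on vuln A vs. 3 OS distros
                   [1.0, 3.0, 1.0]])  # attack on vuln B
strategy, value = optimal_defense(payoff)
print(strategy, value)
```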

    Towards Collaborative Scientific Workflow Management System

    The big data explosion has impacted several domains in recent years, from research areas to diverse business models. As this intensive amount of data opens up the possibility of several interesting knowledge discoveries, over the past few years diverse research domains have undergone a shift towards analyzing these massive amounts of data. Scientific Workflow Management Systems (SWfMSs) have gained much popularity in recent years in accelerating data-intensive analyses, visualization, and the discovery of important information. Data-intensive tasks are often significantly time-consuming and complex in nature, and hence SWfMSs are designed to efficiently support the specification, modification, execution, failure handling, and monitoring of the tasks in a scientific workflow. As far as the complexity, dimension, and volume of data are concerned, their effective analysis or management often becomes challenging for an individual and instead requires the collaboration of multiple scientists. Hence the notion of a 'Collaborative SWfMS' was coined, which has gained significant interest among researchers in recent years, as none of the existing SWfMSs directly supports real-time collaboration among scientists. In collaborative SWfMSs, consistency management in the face of conflicting concurrent operations by collaborators is a major challenge because of the highly interconnected document structure among the computational modules, where any minor change in one part of the workflow can strongly affect other parts of the collaborative workflow through the datalink relations among them. In addition to consistency management, studies show several other challenges that need to be addressed for a successful design of collaborative SWfMSs, such as sub-workflow composition and execution by different sub-groups, the relationship between scientific workflows and collaboration models, sub-workflow monitoring, and seamless integration and access control of the workflow components among collaborators. In this thesis, we propose a locking scheme to facilitate consistency management in collaborative SWfMSs. The proposed method works by locking workflow components at a granular attribute level, in addition to supporting locks on a targeted part of the collaborative workflow. We conducted several experiments to analyze the performance of the proposed method in comparison to related existing methods. Our studies show that the proposed method can reduce the average waiting time of a collaborator by up to 36% while increasing the average workflow update rate by up to 15% in comparison to existing descendent modular-level locking techniques for collaborative SWfMSs. We also propose a role-based access control technique for the management of collaborative SWfMSs, leveraging the Collaborative Interactive Application Methodology (CIAM) to investigate role-based access control in this context, and we present the proposed method with a use case from the Plant Phenotyping and Genotyping research domain. Recent studies show that collaborative SWfMSs often present different sets of opportunities and challenges. From our investigation of existing research on collaborative SWfMSs and the findings of our prior two studies, we propose an architecture for collaborative SWfMSs, and we present SciWorCS, a Collaborative Scientific Workflow Management System, as a proof of concept of the proposed architecture; to the best of our knowledge it is the first of its kind. We present several real-world use cases of scientific workflows using SciWorCS. Finally, we conduct several user studies using SciWorCS, comprising different real-world scientific workflows (i.e., from myExperiment), to understand user behavior and styles of work in the context of collaborative SWfMSs. In addition to evaluating SciWorCS, the user studies reveal several interesting facts that can significantly contribute to the research domain, as none of the existing methods considered such empirical studies and instead relied only on computer-generated simulated studies for evaluation.
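
    As a rough sketch of the granular attribute-level locking idea, the example below keeps one lock entry per (component, attribute) pair, so collaborators editing different attributes of the same module do not block each other. All names are assumptions, not SciWorCS code.

```python
# Hypothetical sketch of attribute-level locking for a collaborative
# workflow editor: each (component, attribute) pair is locked separately.
import threading

class AttributeLockTable:
    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}  # (component_id, attribute) -> collaborator id

    def acquire(self, component_id, attribute, collaborator):
        key = (component_id, attribute)
        with self._guard:
            if self._owners.get(key) not in (None, collaborator):
                return False           # another collaborator holds this lock
            self._owners[key] = collaborator
            return True

    def release(self, component_id, attribute, collaborator):
        key = (component_id, attribute)
        with self._guard:
            if self._owners.get(key) == collaborator:
                del self._owners[key]

locks = AttributeLockTable()
print(locks.acquire("module-7", "num_threads", "alice"))  # True
print(locks.acquire("module-7", "input_path", "bob"))     # True: other attr
print(locks.acquire("module-7", "num_threads", "bob"))    # False: held
```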

    Strengthening Construction Management in the Rural Rehab Line of Business

    The Five Key Observations
    Observation #1: Rural rehab success emanated from positive thinking and persistent implementation.
    Observation #2: Almost every RHRO would benefit from a substantial increase in the per-unit funding available, especially in light of the forthcoming HUD HOME requirement to establish written rehab standards in ten subcategories.
    Observation #3: A smartphone and tablet with 20 to 40 apps is the rehab specialist's Swiss Army knife. They are our GPS, calculator, spec writer, office lifeline in case of danger, camera, clock, cost estimator, calendar and a hundred other single-purpose but very important tools.
    Observation #4: The NeighborWorks® Rural Initiative could provide a clearinghouse for success techniques targeted to rural rehab. Each month it might focus on a specific aspect of rehab management: inspection checklists in January, green specs in February, a feasibility checklist in March, contractor qualification questionnaires in April and so on.
    Observation #5: Even with most components of the in-house contractor success formula in place, per the Statistic Research Institute 53% of construction firms go out of business within the first 4 years. It remains a very risky model that requires significant funding, staff experience, administrative support and risk tolerance.
    Three Rehab Production Models and Their Alternatives
    This middle section restates the introduction and methodology and offers a detailed review of the Traditional Rehab Specialist, Construction Management of Subcontractors, and In-House General Contractor production models. For each model the article provides: definition and staffing pattern, design roles and tasks for each major player, benefits and challenges, alternative models and, finally, recommendations for successful implementation.
    Focus Topics
    During our interview process, three ideas surfaced that were best served by a mini discussion rather than being embedded in the already large middle section. The three topics are: software and technology; management of community relations, marketing and quality control; and budget solutions.

    An anti-malware product test orchestration solution for multiple pluggable environments

    The term "automation" gets thrown around a lot these days in the software industry. However, the recent change in test automation in the software engineering process is driven by multiple factors: environmental factors, both external and internal, as well as industry-driven ones. Simply put, what we all understand by automation is the use of some technology to operate a task. The choice of the right tools, be they in-house or third-party software, can increase the effectiveness, efficiency and coverage of security product testing. Often, test environments are maintained at various stages in the testing process: developer's test, dedicated test, integration test and pre-production or business readiness test are some common phrases in software testing. On the other hand, abstraction is often introduced between different architectural layers, with ever-changing providers of virtualization platforms such as VMware, OpenStack and AWS serving as test execution environments, among many others in varying states of maintainability. As there is an obvious mismatch in configuration between development, testing and production environments, the software testing process is often slow and tedious for many organizations due to the lack of collaboration between IT Operations and Software Development teams. Because of this, identifying and addressing test-environment-related compatibility issues becomes a major concern for QA teams. In this context, this thesis presents a DevOps approach to, and an implementation of, an automated test execution solution named OneTA that can interact with multiple test environments, including isolated malware test environments. The study was performed to identify a common way of preparing test environments on in-house and publicly available virtualization platforms where distributed tests can run on a regular basis. The current solution allows security product testing in multiple pluggable environments in a single setup, utilizing modern DevOps practice to require minimal effort. This thesis project was carried out in collaboration with F-Secure, a leading cyber security company in Finland. The project deals with the company's internal environments for test execution, and explores the available infrastructures so that the software development team can use this solution as a test execution tool.
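
    As a hedged sketch of the pluggable-environment idea, the example below registers interchangeable test-environment backends behind one interface so the same orchestration code can drive any of them. The interface and all names are assumptions, since OneTA's internals are not given in the abstract.

```python
# Hypothetical sketch of a pluggable test-environment abstraction: each
# backend implements the same provision/run/teardown contract and registers
# itself, so the orchestrator never depends on a specific platform.
from abc import ABC, abstractmethod

class TestEnvironment(ABC):
    @abstractmethod
    def provision(self) -> None: ...
    @abstractmethod
    def run(self, test_suite: str) -> bool: ...
    @abstractmethod
    def teardown(self) -> None: ...

_REGISTRY = {}

def register(name):
    def wrap(cls):
        _REGISTRY[name] = cls
        return cls
    return wrap

@register("isolated-malware-lab")
class IsolatedLabEnvironment(TestEnvironment):
    def provision(self): print("cloning air-gapped VM snapshot")
    def run(self, test_suite): print(f"running {test_suite}"); return True
    def teardown(self): print("reverting snapshot")

def execute(env_name, test_suite):
    env = _REGISTRY[env_name]()   # same orchestration code for any backend
    env.provision()
    try:
        return env.run(test_suite)
    finally:
        env.teardown()

execute("isolated-malware-lab", "on-access-scan-suite")
```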

    Virtual Cluster Management for Analysis of Geographically Distributed and Immovable Data

    Thesis (Ph.D.) - Indiana University, Informatics and Computing, 2015
    Scenarios exist in the era of Big Data where computational analysis needs to utilize widely distributed and remote compute clusters, especially when the data sources are sensitive or extremely large and thus unable to be moved. A large dataset in Malaysia could be ecologically sensitive, for instance, and unable to be moved outside the country's boundaries. Controlling an analysis experiment in this virtual cluster setting can be difficult on multiple levels: setup and control, managing the behavior of the virtual cluster, and interoperability across the compute clusters. Further, datasets can be distributed among clusters, or even across data centers, so it becomes critical to utilize data locality information to optimize the performance of data-intensive jobs. Finally, datasets are increasingly sensitive and tied to certain administrative boundaries, though once the data has been processed, the aggregated or statistical results can be shared across those boundaries. This dissertation addresses the management and control of a widely distributed virtual cluster holding sensitive or otherwise immovable data sets through a controller. The Virtual Cluster Controller (VCC) gives control back to the researcher. It creates virtual clusters across multiple cloud platforms. In recognition of sensitive data, it can establish a single network overlay over widely distributed clusters. We define a novel class of data, notably immovable data that we call "pinned data", where the data is treated as a first-class citizen instead of being moved to where it is needed. We draw on our earlier work with a hierarchical data processing model, Hierarchical MapReduce (HMR), to process geographically distributed data, some of which are pinned data. The applications implemented in HMR use an extended MapReduce model in which computations are expressed as three functions: Map, Reduce, and GlobalReduce. Further, by facilitating information sharing among resources, applications, and data, the overall performance is improved. Experimental results show that the overhead of VCC is minimal. HMR outperforms the traditional MapReduce model when processing a particular class of applications. The evaluations also show that information sharing between resources and applications through the VCC shortens the hierarchical data processing time, as well as satisfying the constraints on the pinned data.
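
    The three-function model the abstract names can be illustrated with a small word-count example: each cluster reduces its own pinned data locally, and only the compact local results cross administrative boundaries for the global reduce. All helper names below are assumptions, not HMR's API.

```python
# Minimal sketch of the Map / Reduce / GlobalReduce structure of the
# extended MapReduce model, using word counts as a stand-in workload.
from collections import Counter

def map_fn(record):
    return Counter(record.split())

def reduce_fn(partials):
    total = Counter()
    for p in partials:
        total.update(p)
    return total  # local aggregate: small enough to leave the cluster

def global_reduce_fn(local_results):
    return reduce_fn(local_results)  # merge per-cluster aggregates

# Two clusters, each holding data that cannot be moved ("pinned data").
cluster_a = ["big data big clusters", "pinned data"]
cluster_b = ["data locality matters"]

local_a = reduce_fn(map_fn(r) for r in cluster_a)
local_b = reduce_fn(map_fn(r) for r in cluster_b)
print(global_reduce_fn([local_a, local_b]))
```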