
    Adaptive Energy-aware Scheduling of Dynamic Event Analytics across Edge and Cloud Resources

    The growing deployment of sensors as part of the Internet of Things (IoT) is generating thousands of event streams. Complex Event Processing (CEP) queries offer a useful paradigm for rapid decision-making over such data sources. While event analytics are often centralized in the Cloud, the deployment of capable edge devices in the field motivates cooperative event analytics that span Edge and Cloud computing. Here, we identify a novel problem of query placement on edge and Cloud resources for dynamically arriving and departing analytic dataflows. We define this as an optimization problem that minimizes the total makespan of all event analytics while meeting the energy and compute constraints of the resources. We propose four adaptive heuristics and three rebalancing strategies for such dynamic dataflows, and validate them using detailed simulations with 100-1000 edge devices and VMs. The results show that our heuristics offer O(seconds) planning time, return a valid, high-quality solution in all cases, and reduce the number of query migrations. Furthermore, applying the rebalancing strategies within these heuristics reduces the makespan by around 20-25%. Comment: 11 pages, 7 figures
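
    The abstract does not detail the heuristics themselves; below is a minimal Python sketch of one plausible energy-aware greedy placement pass for this kind of problem. The Resource and Query classes, the longest-task-first ordering, and the feasibility rules are illustrative assumptions, not the paper's actual algorithms.

    # Hypothetical greedy, energy-aware query placement (illustrative only).
    from dataclasses import dataclass

    @dataclass
    class Resource:
        name: str
        cpu_capacity: float    # remaining compute units
        energy_budget: float   # remaining energy budget (relevant on edge devices)

    @dataclass
    class Query:
        name: str
        cpu_demand: float      # compute units required
        energy_cost: float     # energy consumed on the chosen resource
        runtime: float         # estimated execution time in seconds

    def place_queries(queries: list[Query], resources: list[Resource]) -> dict[str, str]:
        """Assign each query to the feasible resource that finishes earliest,
        respecting per-resource compute and energy constraints."""
        finish_time = {r.name: 0.0 for r in resources}
        placement = {}
        # Longest queries first: the classic LPT rule for makespan reduction.
        for q in sorted(queries, key=lambda q: q.runtime, reverse=True):
            feasible = [r for r in resources
                        if r.cpu_capacity >= q.cpu_demand
                        and r.energy_budget >= q.energy_cost]
            if not feasible:
                raise RuntimeError(f"no feasible resource for {q.name}")
            best = min(feasible, key=lambda r: finish_time[r.name])
            best.cpu_capacity -= q.cpu_demand
            best.energy_budget -= q.energy_cost
            finish_time[best.name] += q.runtime
            placement[q.name] = best.name
        return placement

    An adaptive variant would rerun such a pass, or a cheaper incremental repair, as dataflows arrive and depart; that is where the paper's rebalancing strategies and query-migration counts come into play.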

    Data Placement And Task Mapping Optimization For Big Data Workflows In The Cloud

    Data-centric workflows naturally process and analyze huge volumes of datasets. In this new era of Big Data there is a growing need to enable data-centric workflows to perform computations at a scale far exceeding a single workstation's capabilities. Such applications can therefore benefit from distributed high performance computing (HPC) infrastructures like cluster, grid, or cloud computing. Although data-centric workflows have been applied extensively to structure complex scientific data analysis processes, they fail to address big data challenges or to leverage the dynamic resource provisioning available in the Cloud. The concept of "big data workflows" is proposed by our research group as the next generation of data-centric workflow technologies, addressing the limitations of existing workflow technologies in meeting big data challenges. Executing big data workflows in the Cloud is a challenging problem, as workflow tasks and data must be partitioned, distributed, and assigned to the cloud execution sites (multiple virtual machines). When such big data workflows run in a cloud distributed across several physical locations, the workflow execution time and the cloud resource utilization efficiency depend strongly on the initial placement and distribution of the workflow tasks and datasets across the multiple virtual machines. Several workflow management systems have been developed to help scientists use workflows; however, the data and workflow task placement issue has not yet been sufficiently addressed. In this dissertation, I propose BDAP (Big Data Placement strategy) for data placement and TPS (Task Placement Strategy) for task placement, which improve workflow performance by minimizing data movement across multiple virtual machines in the Cloud during workflow execution. In addition, I propose CATS (Cultural Algorithm Task Scheduling) for workflow scheduling, which improves workflow performance by minimizing workflow execution cost. In this dissertation, I 1) formalize the data and task placement problems in workflows, 2) propose a data placement algorithm that considers both the initial input datasets and the intermediate datasets produced during a workflow run, 3) propose a task placement algorithm that places workflow tasks before the workflow runs, 4) propose a workflow scheduling strategy that minimizes workflow execution cost once a deadline is provided by the user, and 5) perform extensive experiments in a distributed environment to validate that the proposed strategies provide an effective data and task placement solution, distributing big datasets and tasks onto appropriate virtual machines in the Cloud within reasonable time.
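
    As a concrete illustration of placement that minimizes data movement, here is a small Python sketch that assigns each task to the virtual machine already holding the largest volume of its input data. The function name, data model, and greedy rule are assumptions for illustration; they are not the dissertation's BDAP, TPS, or CATS algorithms.

    # Illustrative data-movement-aware task placement; the greedy rule and
    # names are assumptions, not the BDAP/TPS/CATS algorithms themselves.

    def place_tasks(task_inputs: dict[str, dict[str, int]],
                    vm_datasets: dict[str, set[str]]) -> dict[str, str]:
        """Assign each task to the VM already holding the largest volume of
        its input data, so fewer bytes are shipped between VMs at run time.

        task_inputs: task -> {dataset: size in MB}
        vm_datasets: vm   -> datasets currently resident on that VM
        """
        placement = {}
        for task, inputs in task_inputs.items():
            def local_bytes(vm: str) -> int:
                # Volume of this task's inputs already present on the VM.
                return sum(size for ds, size in inputs.items()
                           if ds in vm_datasets[vm])
            vm = max(vm_datasets, key=local_bytes)
            placement[task] = vm
            # The task's data now resides here, steering downstream tasks.
            vm_datasets[vm].update(inputs)
        return placement

    # t1 goes to vm2 (140 MB local); t2 goes to vm1, where its 500 MB input d3 lives.
    print(place_tasks(
        {"t1": {"d1": 100, "d2": 40}, "t2": {"d3": 500, "d1": 100}},
        {"vm1": {"d3"}, "vm2": {"d1", "d2"}},
    ))

    In the example, only the 100 MB of d1 must move for t2, rather than the 500 MB of d3; accounting for intermediate datasets in the same way is what distinguishes a runtime-aware strategy from a purely static one.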

    A two-stage framework for designing visual analytics systems to augment organizational analytical processes

    A perennially interesting research topic in visual analytics is how to develop systems that effectively support organizational knowledge workers' decision-making and reasoning processes. The primary objective of a visual analytics system is to facilitate analytical reasoning and the discovery of insights through interactive visual interfaces. It also enables the transfer of capability and expertise from where it resides to where it is needed, across individuals and organizations as necessary. The problem, however, is that domain analytical practices generally vary from organization to organization. This leads to diversified designs of visual analytics systems incorporating domain analytical processes, making it difficult to generalize success from one domain to another. Exacerbating this problem is the dearth of general models of analytical workflows that would enable such timely and effective designs. To alleviate these problems, this dissertation presents a two-stage framework for informing the design of a visual analytics system. The framework builds upon and extends current practices pertaining to analytical workflows and focuses, in particular, on their effect on the design of visual analytics systems for organizational environments. It aims to empower organizations with more systematic and purposeful information analyses by modeling domain users' reasoning processes. The first stage is an Observation and Designing stage, in which a visual analytics system is designed and implemented to abstract and encapsulate general organizational analytical processes through extensive collaboration with domain users. The second stage is a User-centric Refinement stage, which interactively enriches and refines the already encapsulated domain analysis process by understanding users' intentions through analysis of their task behavior. To implement this framework when designing a visual analytics system, this dissertation proposes four general design recommendations that, when followed, bring users closer to the center of their analytical processes. This dissertation makes three primary contributions. First, it presents a general characterization of the analytical workflow in organizational environments; this characterization fills the current lack of such a model and captures a set of domain analytical tasks commonly applicable across organizations. Second, it describes a two-stage framework for facilitating domain users' workflows by integrating their analytical models into interactive visual analytics systems. Finally, it presents recommendations on enriching and refining domain analysis by capturing and analyzing knowledge workers' analysis processes. To exemplify the generalizability of these design recommendations, the dissertation presents three visual analytics systems developed by following them: Taste for Xerox Corporation, OpsVis for Microsoft, and IRSV for the U.S. Department of Transportation. All three systems have been deployed to domain knowledge workers and adopted in their analytical practices, and extensive empirical evaluations demonstrate their efficacy in facilitating domain analytical processes.

    Resource provisioning and scheduling algorithms for hybrid workflows in edge cloud computing

    In recent years, Internet of Things (IoT) technology has been applied across a wide range of application domains to provide real-time monitoring, tracking, and analysis services. The worldwide number of IoT-connected devices is projected to reach 43 billion by 2023, and IoT technologies are expected to be engaged in 25% of the business sector. Latency-sensitive applications such as intelligent video surveillance, smart home, autonomous vehicles, and augmented reality are all emergent research directions in industry and academia. These applications require connecting large numbers of sensing devices to attain the desired level of service quality and decision accuracy in a timely manner. Moreover, continuous data streams impose the processing of large amounts of data, which adds a heavy overhead on computing and network resources. Thus, latency-sensitive and resource-intensive applications introduce new challenges for the current computing models, i.e., batch and stream. In this thesis, we refer to the integrated application model of stream and batch applications as a hybrid workflow model. The main challenge of the hybrid model is achieving the quality of service (QoS) requirements of the two computation systems. This thesis provides systematic and detailed modeling of hybrid workflows, describing the internal structure of each application type for the purposes of resource estimation, system tuning, and cost modeling. To optimize the execution of hybrid workflows, this thesis proposes algorithms, techniques, and frameworks for resource provisioning and task scheduling on various computing systems, including cloud, edge cloud, and cooperative edge cloud. Overall, the experimental results in this thesis provide strong evidence for the proposed understanding and vision of integrating stream and batch applications, and for how edge computing and other emergent technologies such as 5G networks and IoT will contribute to more sophisticated and intelligent solutions in many disciplines for a safer, more secure, healthy, smart, and sustainable society.
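
    The thesis's algorithms are only summarized above; as a rough illustration of the core tension in hybrid workflows, the following Python sketch co-schedules latency-bound stream tasks and throughput-oriented batch tasks on shared edge/cloud nodes. The QoS model, class names, and two-phase ordering are assumptions, not the thesis's actual provisioning algorithms.

    # Hypothetical two-phase scheduler for a hybrid stream/batch workload.
    from dataclasses import dataclass

    @dataclass
    class Node:
        name: str
        capacity: float      # compute units available
        latency_ms: float    # network latency to data sources (edge < cloud)

    @dataclass
    class Task:
        name: str
        demand: float                      # compute units required
        kind: str                          # "stream" or "batch"
        deadline_ms: float = float("inf")  # latency bound, used by stream tasks

    def schedule(tasks: list[Task], nodes: list[Node]) -> dict[str, str]:
        """Place latency-bound stream tasks first on nodes that meet their
        deadline (preferring low latency), then pack batch tasks onto
        whatever capacity remains (preferring the most free capacity)."""
        free = {n.name: n.capacity for n in nodes}
        plan = {}
        for t in [t for t in tasks if t.kind == "stream"]:
            ok = [n for n in nodes
                  if n.latency_ms <= t.deadline_ms and free[n.name] >= t.demand]
            if not ok:
                raise RuntimeError(f"stream task {t.name} cannot meet its QoS bound")
            n = min(ok, key=lambda n: n.latency_ms)
            free[n.name] -= t.demand
            plan[t.name] = n.name
        for t in [t for t in tasks if t.kind == "batch"]:
            n = max(nodes, key=lambda n: free[n.name])
            if free[n.name] < t.demand:
                raise RuntimeError(f"no remaining capacity for batch task {t.name}")
            free[n.name] -= t.demand
            plan[t.name] = n.name
        return plan

    Placing stream tasks first protects their latency bounds (favoring edge nodes), while batch tasks then soak up the remaining, typically cloud, capacity; a full solution would also address admission control, queuing, and monetary cost.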