2,543 research outputs found

    A locality-aware scientific workflow engine for fast-evolving spatiotemporal sensor data

    2017 Spring. Includes bibliographical references. Discerning knowledge from voluminous data involves a series of data manipulation steps. Scientists typically compose and execute workflows for these steps using scientific workflow management systems (SWfMSs). SWfMSs have been developed for several research communities including but not limited to bioinformatics, biology, astronomy, computational science, and physics. Parallel execution of workflows has been widely employed in SWfMSs by exploiting the storage and computing resources of grid and cloud services. However, none of these systems have been tailored for the needs of spatiotemporal analytics on real-time sensor data with high arrival rates. This thesis demonstrates the development and evaluation of a target-oriented workflow model that enables a user to specify dependencies among the workflow components, including data availability. The underlying spatiotemporal data dispersion and indexing scheme provides fast data search and retrieval to plan and execute computations comprising the workflow. This work includes a scheduling algorithm that targets minimizing data movement across machines while ensuring fair and efficient resource allocation among multiple users. The study includes empirical evaluations performed on the Google cloud

    BRAHMA(+): A Framework for Resource Scaling of Streaming and ASAP Time-Varying Workflows

    Automatic scaling of complex software-as-a-service application workflows is one of the most important problems concerning resource management in clouds. In this paper, we study the automatic workflow resource scaling problem for streaming and ASAP workflows, and its time-varying variant where the workflow resource requirements change over time. Service components of streaming workflows execute concurrently while those of ASAP workflows execute sequentially. We propose an intelligent framework, BRAHMA(+), which learns the workflow behavior and constructs a knowledge base that serves as its decision making engine. The proposed resource provisioning algorithms leverage the learned information curated in the knowledge base to make informed scaling decisions. Additionally, BRAHMA(+) uses online-learning strategies to keep the knowledge base up-to-date, thereby accommodating the changes in the workflow resource requirements over time. We evaluate the proposed algorithms using CloudSim simulations. Results on streaming and ASAP workflows, with both static and time-varying resource requirements, show that the proposed algorithms are effective and produce good cost-quality trade-offs. The proactive and hybrid algorithms meet the service level agreements and restrict deadline violations to a small fraction (3%-5% in the considered scenarios), while only suffering a marginal increase in average cost per component compared to the described baseline algorithms

    Adaptive planning for distributed systems using goal accomplishment tracking

    Goal accomplishment tracking is the process of monitoring the progress of a task or series of tasks towards completing a goal. Goal accomplishment tracking is used to monitor goal progress in a variety of domains, including workflow processing, teleoperation and industrial manufacturing. Practically, it involves the constant monitoring of task execution, analysis of this data to determine the task progress and notification of interested parties. This information is usually used in a passive way to observe goal progress. However, responding to this information may prevent goal failures. In addition, responding proactively in an opportunistic way can also lead to goals being completed faster. This paper proposes an architecture to support the adaptive planning of tasks for fault tolerance or opportunistic task execution based on goal accomplishment tracking. It argues that dramatically increased performance can be gained by monitoring task execution and altering plans dynamically

    VIOLA - A multi-purpose and web-based visualization tool for neuronal-network simulation output

    Neuronal network models and corresponding computer simulations are invaluable tools to aid the interpretation of the relationship between neuron properties, connectivity and measured activity in cortical tissue. Spatiotemporal patterns of activity propagating across the cortical surface as observed experimentally can, for example, be described by neuronal network models with layered geometry and distance-dependent connectivity. The interpretation of the resulting stream of multi-modal and multi-dimensional simulation data calls for integrating interactive visualization steps into existing simulation-analysis workflows. Here, we present a set of interactive visualization concepts called views for the visual analysis of activity data in topological network models, and a corresponding reference implementation VIOLA (VIsualization Of Layer Activity). The software is a lightweight, open-source, web-based and platform-independent application combining and adapting modern interactive visualization paradigms, such as coordinated multiple views, for massively parallel neurophysiological data. For a use-case demonstration we consider spiking activity data of a two-population, layered point-neuron network model subject to a spatially confined excitation originating from an external population. With the multiple coordinated views, an explorative and qualitative assessment of the spatiotemporal features of neuronal activity can be performed ahead of a detailed quantitative data analysis of specific aspects of the data. Furthermore, ongoing efforts including the European Human Brain Project aim at providing online user portals for integrated model development, simulation, analysis and provenance tracking, wherein interactive visual analysis tools are one component. Browser-compatible, web-technology based solutions are therefore required. Within this scope, with VIOLA we provide a first prototype. Comment: 38 pages, 10 figures, 3 tables

    FACILITATING AQUATIC INVASIVE SPECIES MANAGEMENT USING SATELLITE REMOTE SENSING AND MACHINE LEARNING FRAMEWORKS

    The urgent decision-making needs of invasive species managers can be better met by the integration of biodiversity big data with large-domain models and environmental data products in the form of new workflows and tools that facilitate data utilization across platforms. Timely risk assessments allow for the spatial prioritization of monitoring that could streamline invasive species management and improve managers’ ability to prevent irreversible damage, such that decision makers can focus surveillance and intervention efforts where they are likely to be most effective under budgetary and resource constraints. I present a workflow that generates rapid spatial risk assessments on aquatic invasive species by combining occurrence data, spatially explicit environmental data, and an ensemble approach to species distribution modeling using five machine learning algorithms. For proof of concept and validation, I tested this workflow using extensive spatial and temporal occurrence data from Rainbow Trout (RBT; Oncorhynchus mykiss) invasion in the upper Flathead River system in northwestern Montana, USA. Due to this workflow’s high performance against cross-validated datasets (87% accuracy) and congruence with known drivers of RBT invasion, I developed a tool that generates agile risk assessments based on the above workflow and suggest that it can be generalized to broader spatial and taxonomic scales in order to provide data-driven management information for early detection of potential invaders. I then use this tool as technical input for a management framework that guides users in incorporating and synthesizing the component features of the workflow and toolkit to derive actionable insight efficiently