128 research outputs found

    BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments

    Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing (HPC) techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems (SWfMS) and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. The framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives using queries to a provenance and annotation database, some of which are available as pre-built features of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the execution time of the case studies by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
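
    To make the idea of pre-built provenance queries concrete, the sketch below builds an in-memory SQLite table of task executions and asks for per-workflow runtime totals. This is only an illustration of the kind of performance query a provenance web application could expose; the table and column names are assumptions, not BioWorkbench's actual schema or API.

        # Hypothetical provenance query; the schema below is illustrative only,
        # not the provenance database described in the abstract.
        import sqlite3

        conn = sqlite3.connect(":memory:")
        conn.execute("""
            CREATE TABLE task_provenance (
                workflow   TEXT,   -- e.g. 'SwiftPhylo', 'SwiftGECKO'
                task       TEXT,   -- task name within the workflow
                start_time REAL,   -- seconds since workflow start
                end_time   REAL
            )
        """)
        conn.executemany(
            "INSERT INTO task_provenance VALUES (?, ?, ?, ?)",
            [
                ("SwiftPhylo", "align",      0.0,  42.5),
                ("SwiftPhylo", "build_tree", 42.5, 97.0),
                ("SwiftGECKO", "compare",    0.0, 120.0),
            ],
        )

        # Per-workflow total and average task duration: the kind of
        # performance summary that could be offered as a canned query.
        for row in conn.execute("""
            SELECT workflow,
                   SUM(end_time - start_time) AS total_seconds,
                   AVG(end_time - start_time) AS avg_task_seconds
            FROM task_provenance
            GROUP BY workflow
        """):
            print(row)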

    Multilevel Runtime Verification for Safety and Security Critical Cyber Physical Systems from a Model Based Engineering Perspective

    Advanced embedded system technology is one of the key driving forces behind the rapid growth of Cyber-Physical System (CPS) applications. A CPS consists of multiple coordinating and cooperating components, which are often software-intensive and interact with each other to achieve unprecedented tasks. Such highly integrated CPSs have complex interaction failures, attack surfaces, and attack vectors that must be protected and secured against. This dissertation advances the state of the art by developing a multilevel runtime monitoring approach for safety- and security-critical CPSs, with monitors at each level of processing and integration. Given that computation and data-processing vulnerabilities may exist at multiple levels in an embedded CPS, solutions placed at the levels where the faults or vulnerabilities originate are beneficial for timely detection of anomalies. Further, the increasing functional and architectural complexity of critical CPSs has significant operational implications for safety and security. These challenges call for new methods in which there is a continuum between design-time assurance and runtime, or operational, assurance. Towards this end, this dissertation explores Model-Based Engineering methods by which design assurance can be carried forward to the runtime domain, creating a shared responsibility for reducing the overall risk associated with the system in operation. A synergistic combination of Verification & Validation at design time and runtime monitoring at multiple levels is therefore beneficial in assuring the safety and security of critical CPSs. Furthermore, we realize our multilevel runtime monitoring framework on hardware using a stream-based runtime verification language.
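
    The abstract does not name the stream-based runtime verification language used, so as a rough illustration of the stream-based idea only, the sketch below evaluates a bounded-response property ("every request is acknowledged within k events") incrementally over an event stream. The property, event names, and bound are invented for the example and are not taken from the dissertation.

        # Toy stream-based monitor: every "request" must be followed by an
        # "ack" within `bound` subsequent events. Illustrates incremental
        # evaluation over a stream; not the language used in the dissertation.
        from typing import Iterable, Iterator

        def monitor(events: Iterable[str], bound: int = 3) -> Iterator[str]:
            pending = []  # indices of requests still awaiting an ack
            for i, event in enumerate(events):
                if event == "request":
                    pending.append(i)
                elif event == "ack" and pending:
                    pending.pop(0)  # oldest outstanding request is served
                # flag requests whose deadline has passed
                while pending and i - pending[0] >= bound:
                    yield (f"violation: request at event {pending[0]} "
                           f"not acked within {bound} events")
                    pending.pop(0)

        trace = ["request", "noise", "ack",
                 "request", "noise", "noise", "noise", "ack"]
        for verdict in monitor(trace):
            print(verdict)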

    Applications Development for the Computational Grid


    Scientific Grand Challenges: Crosscutting Technologies for Computing at the Exascale - February 2-4, 2010, Washington, D.C.


    Platforms for deployment of scalable on- and off-line data analytics.

    The ability to exploit the intelligence concealed in bulk data to generate actionable insights is increasingly providing competitive advantages to businesses, government agencies, and charitable organisations. The burgeoning field of Data Science, and its related applications in the field of Data Analytics, finds broader applicability with each passing year. This expansion of users and applications is matched by an explosion in tools, platforms, and techniques designed to exploit more types of data, in larger volumes, and at higher frequencies than ever before. This diversity in platforms and tools presents a new challenge for organisations aiming to integrate Data Science into their daily operations. Designing an analytic for a particular platform necessarily involves “lock-in” to that specific implementation; there are few opportunities for algorithmic portability. It is also increasingly difficult to find engineers who have experience with the diverse suite of tools available and who also understand the precise details of the domain in which they work: the semantics of the data, the nature of the queries and analyses to be executed, and the interpretation and presentation of results. The work presented in this thesis addresses these challenges by introducing a number of techniques that facilitate the creation of analytics for equivalent deployment across a variety of runtime frameworks and capabilities. In the first instance, this capability is demonstrated using the first Domain Specific Language and associated runtime environments to target multiple best-in-class frameworks for data analysis from the streaming and off-line paradigms. This capability is extended with a new approach to modelling analytics based around a semantically rich type system. An analytic planner using this model is detailed, empowering domain experts to build their own scalable analyses without any specific programming or distributed-systems knowledge. This planning technique is used to assemble complex ensembles of hybrid analytics, automatically applying multiple frameworks in a single workflow. Finally, this thesis demonstrates a novel approach to the speculative construction, compilation, and deployment of analytic jobs based around the observation of user interactions with an analytic planning system.
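
    The thesis's type-driven analytic model and planner are not detailed in this abstract, so the sketch below only loosely illustrates the planning idea under assumed names: analytics are represented as typed transforms, and a breadth-first search assembles a pipeline from an available input type to a requested output type. The type names and analytics are invented for the example.

        # Loose sketch of type-driven analytic planning: each analytic
        # declares an input and output type, and the planner chains
        # analytics so that a requested output type can be produced from an
        # available input type. Names here are invented for illustration.
        from collections import deque

        ANALYTICS = [
            # (name, input type, output type)
            ("tokenize",         "RawText",   "Tokens"),
            ("extract_entities", "Tokens",    "Entities"),
            ("geo_resolve",      "Entities",  "Locations"),
            ("heatmap",          "Locations", "Heatmap"),
        ]

        def plan(source_type: str, goal_type: str):
            """Breadth-first search over analytics; returns a pipeline of names."""
            queue = deque([(source_type, [])])
            seen = {source_type}
            while queue:
                current, pipeline = queue.popleft()
                if current == goal_type:
                    return pipeline
                for name, in_type, out_type in ANALYTICS:
                    if in_type == current and out_type not in seen:
                        seen.add(out_type)
                        queue.append((out_type, pipeline + [name]))
            return None  # no pipeline found

        print(plan("RawText", "Heatmap"))
        # -> ['tokenize', 'extract_entities', 'geo_resolve', 'heatmap']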