
    Distributed Web Service Coordination for Collaboration Applications and Biological Workflows

    In this dissertation work, we investigate the decentralized coordination of workflows over web services. To address distributed workflow coordination, we first developed “Web Coordination Bonds”, a set of dependency-modeling primitives that enable each web service to manage its own dependencies. Web bond primitives are as powerful as extended Petri nets and have sufficient modeling and expressive capability to capture workflow dependencies. We then designed and prototyped our “Web Service Coordination Management Middleware” (WSCMM) system, which enhances the current web services infrastructure to accommodate web-bond-enabled web services. Finally, building on the core concepts of web coordination bonds and WSCMM, we developed the “BondFlow” system, which allows easy configuration and distributed coordination of workflows. The footprint of the BondFlow runtime is 24 KB; the additional third-party software packages, a SOAP client and an XML parser, account for 115 KB.
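
    The abstract describes dependency primitives through which each service manages its own coordination state. The sketch below is only a hypothetical Python illustration of that idea; the class and method names are invented here and are not the BondFlow or WSCMM API.

```python
# Hypothetical sketch of dependency-primitive coordination, inspired by the
# idea that each service manages its own workflow dependencies. Names are
# illustrative only and do not reflect the BondFlow API.

class Service:
    def __init__(self, name):
        self.name = name
        self.prerequisites = []   # services this one depends on ("bonds")
        self.completed = False

    def bond_to(self, other):
        """Record a dependency on another service, held locally."""
        self.prerequisites.append(other)

    def try_execute(self):
        """Run only when every bonded prerequisite has completed."""
        if all(p.completed for p in self.prerequisites):
            print(f"{self.name}: executing")
            self.completed = True
        else:
            pending = [p.name for p in self.prerequisites if not p.completed]
            print(f"{self.name}: waiting on {pending}")


# A two-step workflow: 'ship' can only run after 'pay' completes.
pay, ship = Service("pay"), Service("ship")
ship.bond_to(pay)
ship.try_execute()   # waiting on ['pay']
pay.try_execute()    # executing
ship.try_execute()   # executing
```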

    An evaluation of Galaxy and Ruffus-scripting workflows system for DNA-seq analysis

    Magister Scientiae - MSc. Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and with its adoption becoming more widespread, there is an increasing need for access to customizable computational workflows that can simplify, and offer access to, computationally intensive analyses of genomic data. In this study, workflows in the Galaxy and Ruffus frameworks were designed and implemented with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library for writing scripts in an object-oriented style, allows a workflow to be built in terms of task dependencies and execution logic. In this study, a dual data-analysis technique was explored, focusing on a comparative evaluation of the Galaxy and Ruffus frameworks as used for composing analysis pipelines. To this end, we developed an analysis pipeline in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Furthermore, this study aimed to compare the Galaxy framework to Ruffus; preliminary analysis revealed that the Galaxy pipeline displayed a higher percentage of load and store instructions, whereas pipelines in Ruffus tended to be CPU bound and memory intensive. CPU usage, memory utilization, and execution runtime are graphically represented in this study. Our evaluation suggests that workflow frameworks have distinctly different features, from ease of use, flexibility, and portability to architectural design.
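
    To make the task-dependency style that Ruffus supports concrete, here is a minimal, hypothetical sketch of a two-step Ruffus pipeline. The task names, file suffixes, and placeholder bodies are invented for illustration and stand in for real NGS steps such as alignment or variant calling.

```python
# Minimal Ruffus-style sketch: two tasks chained by file suffix, with the
# dependency resolution and execution logic handled by the framework.
# Task names and suffixes are hypothetical.
from ruffus import originate, transform, suffix, pipeline_run

@originate(["sample1.fastq", "sample2.fastq"])
def fetch_reads(output_file):
    open(output_file, "w").close()      # stand-in for obtaining raw reads

@transform(fetch_reads, suffix(".fastq"), ".bam")
def align_reads(input_file, output_file):
    open(output_file, "w").close()      # stand-in for an aligner step

# Ruffus walks the dependency graph and only reruns out-of-date tasks.
pipeline_run([align_reads])
```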

    A Semantic Framework for Declarative and Procedural Knowledge

    In any scientific domain, the full set of data and programs has reached an “-ome” status, i.e. it has grown massively. The original article on the Semantic Web describes the evolution of a Web of actionable information, i.e. information derived from data through a semantic theory for interpreting the symbols. In a Semantic Web, methodologies are studied for describing, managing and analyzing both resources (domain knowledge) and applications (operational knowledge) - without any restriction on what and where they are respectively suitable and available in the Web - as well as for realizing automatic and semantic-driven workflows of Web applications elaborating Web resources. This thesis attempts to provide a synthesis among Semantic Web technologies, Ontology Research, and Knowledge and Workflow Management. Such a synthesis is represented by Resourceome, a Web-based framework consisting of two components which strictly interact with each other: an ontology-based and domain-independent knowledge management system (Resourceome KMS) - relying on a knowledge model where resource and operational knowledge are contextualized in any domain - and a semantic-driven workflow editor, manager and agent-based execution system (Resourceome WMS). The Resourceome KMS and the Resourceome WMS are exploited in order to realize semantic-driven formulations of workflows, where activities are semantically linked to any involved resource. On the whole, combining the use of domain ontologies and workflow techniques, Resourceome provides a flexible domain and operational knowledge organization, a powerful engine for semantic-driven workflow composition, and a distributed, automatic and transparent environment for workflow execution.

    Graphical programming system for dataflow language

    Dataflow languages are languages that support the notion of data flowing from one operation to another. The flow concept gives dataflow languages the advantage of representing dataflow programs in graphical form. This thesis presents a graphical programming system that supports the editing and simulation of dataflow programs. The system is implemented on an AT&T UNIX™ PC. A high-level graphical dataflow language, the GDF language, is defined in this thesis. In the GDF language, all operators are represented in graphical form. A graphical dataflow program is formed by drawing operators and connecting arcs in the Graphical Editor provided by the system. The system also includes a simulator for simulating the execution of a dataflow program, allowing a user to explore the power of concurrency and parallel processing. Several simulation control options are offered to facilitate the debugging of dataflow programs.
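
    As a rough illustration of the execution model such a simulator animates, the following Python sketch fires an operator as soon as tokens are present on all of its input arcs. It is a hypothetical toy, not the GDF language or its simulator.

```python
# A small token-passing simulation of the dataflow idea: an operator fires
# as soon as every one of its input arcs holds a value, then forwards its
# result along its output arcs. Illustration only.

class Operator:
    def __init__(self, name, func, n_inputs):
        self.name, self.func = name, func
        self.inputs = [None] * n_inputs     # arcs feeding this operator
        self.outputs = []                   # (operator, port) pairs

    def receive(self, port, value):
        self.inputs[port] = value
        if all(v is not None for v in self.inputs):   # ready to fire
            result = self.func(*self.inputs)
            print(f"{self.name} fired -> {result}")
            for op, p in self.outputs:
                op.receive(p, result)
            self.inputs = [None] * len(self.inputs)   # consume the tokens

# Graph computing (3 + 4) * 2, expressed purely as operators and arcs.
add = Operator("add", lambda a, b: a + b, 2)
mul = Operator("mul", lambda a, b: a * b, 2)
add.outputs.append((mul, 0))
add.receive(0, 3)
add.receive(1, 4)       # add fires, forwards 7 to mul's port 0
mul.receive(1, 2)       # mul fires -> 14
```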

    Enabling Security Analysis and Education of the Ethereum Platform: A Network Traffic Dissection Tool

    Ethereum, the decentralized global software platform powered by blockchain technology and known for its native cryptocurrency, Ether (ETH), provides a technology stack for building apps, holding assets, transacting, and communicating without control by a central authority. At the core of Ethereum’s network is a suite of purpose-built protocols known as DEVP2P, which gives the nodes in an Ethereum network the ability to discover, authenticate, and communicate with each other confidentially. This document discusses the creation of a new Wireshark dissector for DEVP2P’s discovery protocols, DiscoveryV4 and DiscoveryV5, and a dissector for RLPx, an extensible TCP transport protocol for a range of Ethereum node capabilities. Protocol dissectors in tools like Wireshark are commonly used to educate, develop, and analyze underlying network traffic. In support of creating the dissector, a custom private Ethereum Docker network was also created, facilitating communication among Go Ethereum execution clients and allowing the Wireshark dissector to capture live network data. Lastly, the dissector is used to understand the differences between DiscoveryV4 and DiscoveryV5, along with stepping through the network packets of RLPx to track a transaction executed on the network.
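
    For orientation, the sketch below splits a DiscoveryV4 UDP datagram into the fields a dissector would display, following the publicly documented DEVP2P wire layout (32-byte hash, 65-byte signature, 1-byte packet type, RLP payload). It is a hedged Python illustration only: it performs no keccak-256 or signature verification and is unrelated to the Wireshark dissector described above.

```python
# Minimal field split of a DiscoveryV4 datagram per the public DEVP2P spec:
# hash (32 bytes) || signature (65 bytes) || packet-type (1 byte) || RLP data.
# No hash or signature verification is attempted here.

PACKET_TYPES = {1: "Ping", 2: "Pong", 3: "FindNode", 4: "Neighbors"}

def dissect_discv4(datagram: bytes) -> dict:
    if len(datagram) < 98:                    # 32 + 65 + 1 byte minimum
        raise ValueError("datagram too short for DiscoveryV4")
    return {
        "hash": datagram[:32].hex(),          # keccak256(sig || type || data)
        "signature": datagram[32:97].hex(),
        "packet_type": PACKET_TYPES.get(datagram[97], f"unknown ({datagram[97]})"),
        "rlp_payload": datagram[98:],         # RLP-encoded message body
    }

# Dummy bytes, just to exercise the field offsets.
fields = dissect_discv4(bytes(32) + bytes(65) + b"\x01" + b"\xc0")
print(fields["packet_type"])                  # Ping
```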

    GRASP/Ada (Graphical Representations of Algorithms, Structures, and Processes for Ada): The development of a program analysis environment for Ada. Reverse engineering tools for Ada, task 1, phase 2

    The study, formulation, and generation of graphical representations of algorithms, structures, and processes for Ada (GRASP/Ada) are discussed in this second-phase report of a three-phase effort. Various graphical representations that can be extracted or generated from source code are described and categorized, with a focus on reverse engineering. The overall goal is to provide the foundation for a CASE (computer-aided software engineering) environment in which reverse engineering and forward engineering (development) are tightly coupled. Emphasis is on a subset of architectural diagrams that can be generated automatically from source code, with the control structure diagram (CSD) included for completeness.

    Next-generation information systems for genomics

    NIH Grant no. HG00739. The advent of next-generation sequencing technologies is transforming biology by enabling individual researchers to sequence the genomes of individual organisms or cells on a massive scale. In order to realize the translational potential of this technology, we will need advanced information systems to integrate and interpret this deluge of data. These systems must be capable of extracting the location and function of genes and biological features from genomic data, requiring the coordinated parallel execution of multiple bioinformatics analyses and intelligent synthesis of the results. The resulting databases must be structured to allow complex biological knowledge to be recorded in a computable way, which requires the development of logic-based knowledge structures called ontologies. To visualise and manipulate the results, new graphical interfaces and knowledge acquisition tools are required. Finally, to help understand complex disease processes, these information systems must be equipped with the capability to integrate and make inferences over multiple data sets derived from numerous sources. RESULTS: Here I describe the research, design and implementation of some of the components of such a next-generation information system. I first describe the automated pipeline system used for the annotation of the Drosophila genome, and the application of this system in genomic research. This was succeeded by the development of a flexible graph-oriented database system called Chado, which relies on the use of ontologies for structuring data and knowledge. I also describe research to develop, restructure and enhance a number of biological ontologies, adding a layer of logical semantics that increases the computability of these key knowledge sources. The resulting database and ontology collection can be accessed through a suite of tools. Finally, I describe how genome analysis, ontology-based database representation and powerful tools can be combined in order to make inferences about genotype-phenotype relationships within and across species. CONCLUSION: The large volumes of complex data generated by high-throughput genomic and systems biology technology threaten to overwhelm us unless we can devise better computing tools to assist us with their analysis. Ontologies are key technologies, but many existing ontologies are not interoperable or lack features that make them computable. Here I have shown how concerted ontology, tool and database development can be applied to make inferences of value to translational research.
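
    The role ontologies play in making such inferences computable can be illustrated with a small, hypothetical example: an annotation made to a specific term is recoverable through queries on any ancestor term via the transitive is_a relation. The terms and gene name below are invented for illustration; this is not Chado or any of the ontologies discussed above.

```python
# Toy illustration of ontology-driven inference: a gene annotated to a
# specific term is returned by queries on any ancestor term, following
# the transitive 'is_a' relation. Terms and gene names are made up.

is_a = {                                   # child term -> parent term
    "serine protease activity": "peptidase activity",
    "peptidase activity": "catalytic activity",
}
annotations = {"geneX": "serine protease activity"}   # direct annotation

def ancestors(term):
    """All terms reachable from 'term' by following is_a links upward."""
    seen = set()
    while term in is_a:
        term = is_a[term]
        seen.add(term)
    return seen

def genes_annotated_to(query_term):
    """Genes whose direct term is the query term or one of its descendants."""
    return [g for g, t in annotations.items()
            if t == query_term or query_term in ancestors(t)]

print(genes_annotated_to("catalytic activity"))   # ['geneX']
```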

    Dataflow development of medium-grained parallel software

    PhD Thesis. In the 1980s, multiple-processor computers (multiprocessors) based on conventional processing elements emerged as a popular solution to the continuing demand for ever-greater computing power. These machines offer a general-purpose parallel processing platform on which the size of program units which can be efficiently executed in parallel - the "grain size" - is smaller than that offered by distributed computing environments, though greater than that of some more specialised architectures. However, programming to exploit this medium-grained parallelism remains difficult. Concurrent execution is inherently complex, yet there is a lack of programming tools to support parallel programming activities such as program design, implementation, debugging, performance tuning and so on. In helping to manage complexity in sequential programming, visual tools have often been used to great effect, which suggests one approach towards the goal of making parallel programming less difficult. This thesis examines the possibilities which the dataflow paradigm has to offer as the basis for a set of visual parallel programming tools, and presents a dataflow notation designed as a framework for medium-grained parallel programming. The implementation of this notation as a programming language is discussed, and its suitability for the medium-grained level is examined. Funded by the Science and Engineering Research Council of Great Britain and the EC ERASMUS scheme.

    Quality measures and assurance for AI (Artificial Intelligence) software

    This report is concerned with the application of software quality and evaluation measures to AI software and, more broadly, with the question of quality assurance for AI software. Considered are not only the metrics that attempt to measure some aspect of software quality, but also the methodologies and techniques (such as systematic testing) that attempt to improve some dimension of quality without necessarily quantifying the extent of the improvement. The report is divided into three parts. Part 1 reviews existing software quality measures, i.e., those that have been developed for, and applied to, conventional software. Part 2 considers the characteristics of AI software, the applicability and potential utility of the measures and techniques identified in the first part, and reviews the few methods developed specifically for AI software. Part 3 presents an assessment and recommendations for the further exploration of this important area.
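
    As one concrete example of the kind of conventional measure Part 1 surveys, the sketch below approximates McCabe's cyclomatic complexity from a function's syntax tree using Python's standard ast module. The report does not prescribe this code; it is a hedged illustration of a structural quality metric.

```python
# Rough cyclomatic-complexity estimate: 1 + number of decision points found
# in the parsed source. Illustration of a conventional structural metric.
import ast

def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity for the given source string."""
    tree = ast.parse(source)
    decisions = (ast.If, ast.For, ast.While, ast.BoolOp,
                 ast.ExceptHandler, ast.IfExp)
    return 1 + sum(isinstance(node, decisions) for node in ast.walk(tree))

sample = """
def classify(x):
    if x > 0 and x < 10:
        return "small"
    for i in range(x):
        if i % 2:
            return "odd seen"
    return "other"
"""
print(cyclomatic_complexity(sample))   # counts the if/for/boolop decisions
```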

    A visual object-oriented environment for LISP.

    by Leong Hong Va. Thesis (M.Phil.), Chinese University of Hong Kong, 1989. Bibliography: leaves 142-146.