    DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines

    Integrated data analysis (IDA) pipelines, which combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring, are becoming increasingly common in practice. Interestingly, systems in these areas share many compilation and runtime techniques, and the underlying, increasingly heterogeneous, hardware infrastructure is converging as well. Yet, the programming paradigms, cluster resource management, data formats and representations, and execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage for increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines, describe the overall DAPHNE system architecture and its key components, and present the design of a vectorized execution engine for computational storage, HW accelerators, and local and distributed operations. Preliminary experiments comparing DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.
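
    To make the notion of an IDA pipeline concrete, the following Python sketch shows how such a workflow is typically stitched together today from separate systems, here DuckDB for query processing and scikit-learn for ML training and scoring; the file and column names are hypothetical. It is exactly this cross-system glue, with its redundant data representations and transfers, that an integrated infrastructure like DAPHNE aims to subsume.

        # Illustrative sketch of an IDA pipeline built from separate systems;
        # the file and column names are hypothetical.
        import duckdb
        from sklearn.linear_model import LinearRegression

        # Data management: declarative query over a (hypothetical) Parquet file.
        con = duckdb.connect()
        data = con.execute("""
            SELECT sensor_a, sensor_b, target
            FROM 'measurements.parquet'
            WHERE target IS NOT NULL
        """).df()

        # ML training: fit a model on the query result.
        model = LinearRegression().fit(data[["sensor_a", "sensor_b"]], data["target"])

        # ML scoring: predictions flow back into further analysis.
        data["prediction"] = model.predict(data[["sensor_a", "sensor_b"]])
        print(data.head())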

    Web service composition: A survey of techniques and tools

    Web services are a consolidated reality of the modern Web with a tremendous and still increasing impact on everyday computing tasks. They have turned the Web into the largest, most accepted, and most vivid distributed computing platform ever. Yet, the use and integration of Web services into composite services or applications, a delicate and conceptually non-trivial task, has still not reached its full potential. A consolidated analysis framework that advances the fundamental understanding of Web service composition building blocks in terms of concepts, models, languages, productivity support techniques, and tools is required. Such a framework is necessary to enable the effective exploration, understanding, assessment, comparison, and selection of service composition models, languages, techniques, platforms, and tools. This article establishes such a framework and reviews the state of the art in service composition from an unprecedented, holistic perspective.
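
    As a minimal illustration of what composition means at the code level, the following Python sketch chains two hypothetical RESTful services so that the output of one becomes the input of the next. Real compositions are usually expressed in dedicated models and languages (e.g., WS-BPEL) rather than hand-coded like this; the endpoints and JSON fields here are invented for illustration.

        # Orchestration-style composition sketch: data flows from one
        # (hypothetical) service into another.
        import requests

        def geocode(address: str) -> dict:
            # Service 1: resolve an address to coordinates.
            r = requests.get("https://geo.example.com/lookup", params={"q": address})
            r.raise_for_status()
            return r.json()  # assumed shape: {"lat": ..., "lon": ...}

        def forecast(lat: float, lon: float) -> dict:
            # Service 2: fetch a weather forecast for those coordinates.
            r = requests.get("https://weather.example.com/forecast",
                             params={"lat": lat, "lon": lon})
            r.raise_for_status()
            return r.json()

        # The composite service: the only place where both services meet.
        coords = geocode("1 Example Street")
        print(forecast(coords["lat"], coords["lon"]))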

    Privacy Preserving Network Security Data Analytics: Architectures and System Design

    An incessant rhythm of data breaches, data leaks, and privacy exposures highlights the need for better control over potentially sensitive data. History has shown that neither public nor private sector organizations are immune: lax data handling, incidental leakage, and adversarial breaches are all contributing factors. Prudent organizations should consider the sensitive nature of network security data. Logged events often contain data elements that are directly correlated with sensitive information about people and their activities, often at the same level of detail as sensor data. Our intent is to produce a database that holds network security data representative of people's interactions with network mid-points and end-points, without the problems of identifiability. In this paper we discuss architectures and propose a system design that supports a risk-based approach to privacy-preserving publication of network security data, enabling network security data analytics research.
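
    One common building block for this kind of de-identified publication is keyed pseudonymization of directly identifying fields, so that records remain linkable for analytics while the original values stay hidden. The Python sketch below illustrates the idea with HMAC; it is a deliberate simplification, not the risk-based design proposed in the paper, and the field names are hypothetical.

        # Keyed pseudonymization sketch: identifying fields are replaced by
        # deterministic HMAC digests, keeping records linkable without
        # exposing the originals. Field names are hypothetical.
        import hmac
        import hashlib

        SECRET_KEY = b"example-key-rotate-and-protect"

        def pseudonymize(value: str) -> str:
            # Same input -> same pseudonym; unlinkable without the key.
            return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

        event = {"src_ip": "192.0.2.17", "user": "alice", "action": "login_failed"}
        published = {
            "src_ip": pseudonymize(event["src_ip"]),
            "user": pseudonymize(event["user"]),
            "action": event["action"],  # non-identifying fields pass through
        }
        print(published)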

    Towards a Regression Test Selection Technique for Message-Based Software Integration

    Regression testing is essential to ensure software quality. Regression test-case selection is the process by which testers ensure that test cases rendered obsolete by changes to the system are not considered for further testing; this is the regression test-case selection problem. Although existing research has addressed many related problems, most existing regression test-case selection techniques cater to procedural systems, usually mathematical applications. Being academic, they lack the scalability and detail needed for multi-tier applications. Enterprise applications, in contrast, have become complex and distributed, leading to component-based architectures in which inter-process communication is a central activity. Messaging is the most widely employed inter-module interaction mechanism, and today's heavily Internet-dependent systems are based on Web services, which use XML for messaging. We propose an RTS technique that specifically targets such enterprise applications.
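
    To sketch what selection over message-based integration might look like, the following Python fragment re-runs a test only if one of the message types it exercises is produced or consumed by a changed component. The mappings are invented for illustration and stand in for the paper's actual change analysis.

        # Hypothetical selection step for a message-based RTS technique.

        # Which components produce or consume each message type
        # (in practice derivable from, e.g., XML message schemas and endpoints).
        MESSAGE_COMPONENTS = {
            "OrderPlaced":   {"order-service", "billing-service"},
            "InvoiceIssued": {"billing-service", "archive-service"},
            "UserCreated":   {"account-service"},
        }

        # Which message types each regression test exercises.
        TEST_MESSAGES = {
            "test_checkout":  {"OrderPlaced", "InvoiceIssued"},
            "test_signup":    {"UserCreated"},
            "test_archiving": {"InvoiceIssued"},
        }

        def select_tests(changed_components: set) -> set:
            # Keep a test if any of its messages touches a changed component.
            return {
                test for test, msgs in TEST_MESSAGES.items()
                if any(MESSAGE_COMPONENTS[m] & changed_components for m in msgs)
            }

        # Changing billing-service selects the checkout and archiving tests.
        print(select_tests({"billing-service"}))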

    An Overview of Software Development in Scientific Applications (Überblick zur Softwareentwicklung in Wissenschaftlichen Anwendungen)

    Fields of modern science and engineering need to solve more and more complex numerical problems, and the complexity of scientific software rises continuously as a result. This growth is caused by a number of changing requirements: coupled phenomena gain importance, and new technologies such as the computational Grid and graphical and heterogeneous multi-core processors have to be used to achieve high performance. This additional complexity can no longer be handled by small, specialised groups of scientists working in isolation; the interdisciplinary nature of scientific software presents new challenges for software engineering. A paradigm shift towards a stronger separation of concerns within interdisciplinary development groups becomes necessary, yet is so far visible only in its beginnings. The coupling of independently simulated physical phenomena is an important example of a software-engineering concern in the domain of computational science; in this context, different simulation programs each model only a part of a more complex coupled system. The present work gives an overview of paradigms that aim at making software development in the computational sciences more reliable and less interdependent, with a special focus on the development of coupled simulations.
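
    A minimal sketch of the partitioned-coupling pattern discussed here: two independently developed solvers exchange boundary values once per time step through a narrow coupling driver, so each team maintains only its own code. The "physics" below is a toy placeholder, not a real coupled simulation.

        # Toy partitioned coupling loop: the two solvers meet only in the driver.

        class FluidSolver:
            def __init__(self):
                self.boundary_temp = 300.0

            def step(self, wall_temp):
                # Relax toward the wall temperature supplied by the other code.
                self.boundary_temp += 0.1 * (wall_temp - self.boundary_temp)
                return self.boundary_temp

        class StructureSolver:
            def __init__(self):
                self.wall_temp = 350.0

            def step(self, fluid_temp):
                # Relax toward the fluid temperature supplied by the other code.
                self.wall_temp += 0.05 * (fluid_temp - self.wall_temp)
                return self.wall_temp

        # Coupling driver: exchange boundary values once per time step.
        fluid, structure = FluidSolver(), StructureSolver()
        t_wall = structure.wall_temp
        for step in range(5):
            t_fluid = fluid.step(t_wall)
            t_wall = structure.step(t_fluid)
            print(f"step {step}: fluid={t_fluid:.2f} wall={t_wall:.2f}")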