    Optimization of adaptive test design methods for the determination of steady-state data-driven models in terms of combustion engine calibration

    This thesis deals with the development of a model-based adaptive test design strategy with a focus on steady-state combustion engine calibration. The first research topic investigates how to handle limits in the input domain during an adaptive test design procedure. The second aims at identifying the test design method that provides the best model quality improvement in terms of overall model prediction error. To account for restricted areas in the input domain, a convex hull-based solution involving a convex cone algorithm is developed, the outcome of which serves as a boundary model for the test point search. A solution is derived that enables the application of the boundary model to high-dimensional problems without calculating the exact convex hull and cones. Furthermore, different data-driven engine modeling methods are compared, with the Gaussian process model emerging as the most suitable for model-based calibration. To determine an appropriate test design method for a Gaussian process model, two new strategies are developed and compared to state-of-the-art methods. A simulation-based study shows the greatest benefit for a modified mutual-information test design, followed by a newly developed relevance-based test design with lower computational effort. The boundary model and the relevance-based test design are integrated into a multicriterial test design strategy tailored to the requirements of combustion engine test bench measurements. A simulation-based study with seven and nine input parameters and four outputs each yielded an average model quality improvement of 36 % and an average measured input area volume increase of 65 % compared to a non-adaptive space-filling test design. For verification, the multicriterial test design was applied to a test bench measurement with seven inputs. Compared to a space-filling test design measurement, the improvement was confirmed, with an average model quality increase of 17 % across eight outputs and a 34 % larger measured input area.
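    The two building blocks described above combine naturally: the convex-hull boundary model keeps candidate test points inside the admissible input region, and the Gaussian process model indicates where a new measurement is most informative. The following sketch is illustrative only, not the thesis's implementation: it uses an exact Delaunay-based hull test (which the thesis explicitly avoids in high dimensions) and a plain predictive-variance criterion in place of the modified mutual-information and relevance-based designs; all names are hypothetical, and numpy, scipy, and scikit-learn are assumed.

    import numpy as np
    from scipy.spatial import Delaunay
    from sklearn.gaussian_process import GaussianProcessRegressor

    def inside_boundary(measured, candidates):
        # Boundary model: accept only candidates inside the convex hull of the
        # points measured so far; find_simplex returns -1 outside the hull.
        # This exact test needs more points than input dimensions and does not
        # scale to high dimensions, which is exactly what the thesis addresses.
        return Delaunay(measured).find_simplex(candidates) >= 0

    def next_test_point(X, y, candidates):
        # Fit a GP to the measurements taken so far, then pick the admissible
        # candidate with the largest predictive standard deviation, i.e. the
        # point where the model is least certain.
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        admissible = candidates[inside_boundary(X, candidates)]
        _, std = gp.predict(admissible, return_std=True)
        return admissible[np.argmax(std)]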

    Efficient processing of large-scale spatio-temporal data

    Millions of location-aware devices, such as mobile phones, cars, and environmental sensors, constantly report their positions, often in combination with a timestamp and further payload data, to a server for different kinds of analyses. While the location information of the devices and reported events is represented as points and polygons, raster data is another type of spatial data, produced for example by cameras and sensors. This big spatio-temporal data needs to be processed on scalable platforms, such as Hadoop and Apache Spark, which, however, are unaware of, e.g., spatial neighborhood, which makes them practically impossible to use for this kind of data. The repeated execution of programs during development and by different users results in long execution times and potentially high costs in rented clusters, which can be reduced by reusing commonly computed intermediate results. Within this thesis, we tackle the two challenges described above. First, we present the STARK framework for processing spatio-temporal vector and raster data on the Apache Spark stack.
    For the operators, we identify several possible algorithms and study how they can benefit from the underlying platform's properties. We further investigate how indexes can be realized in the distributed and parallel architecture of Big Data processing engines and compare data partitioning methods, which differ in how well they cope with data skew and data set size. Furthermore, an approach to reduce the amount of data to be processed at the operator level is presented. To shorten execution times, we introduce an approach to transparently recycle intermediate results of dataflow programs based on a decision model that uses the actual operator costs. To compute these costs, we instrument the programs with profiling code that gathers the execution time and result size of each operator. In the evaluation, we first compare the various implementation and configuration options in STARK and identify scenarios for when and how partitioning and indexing should be applied. We further compare STARK to related systems and show that we achieve significantly better execution times, not only when exploiting existing partitioning information. In the second part of the evaluation, we show that the transparent cost-based materialization and recycling of intermediate results can reduce the execution times of programs significantly.
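    The partitioning idea above can be sketched without any framework: assigning points to grid cells lets a distance join compare only points from the same or adjacent cells, which is the kind of early data reduction STARK performs at scale on Spark (with the skew-aware partitioners and indexes this sketch omits). The function below is plain Python with hypothetical names, not STARK's API.

    from collections import defaultdict
    from itertools import product

    def grid_distance_join(points_a, points_b, eps):
        # Partition points_b into square cells of side eps, keyed by cell index.
        cells = defaultdict(list)
        for x, y in points_b:
            cells[(int(x // eps), int(y // eps))].append((x, y))
        # For each point of points_a, only the 3x3 neighborhood of its own
        # cell can contain join partners within distance eps.
        pairs = []
        for ax, ay in points_a:
            ca, cb = int(ax // eps), int(ay // eps)
            for dx, dy in product((-1, 0, 1), repeat=2):
                for bx, by in cells.get((ca + dx, cb + dy), ()):
                    if (ax - bx) ** 2 + (ay - by) ** 2 <= eps * eps:
                        pairs.append(((ax, ay), (bx, by)))
        return pairs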

    Geometric Approaches to Big Data Modeling and Performance Prediction

    Big Data frameworks (e.g., Spark) have many configuration parameters, such as memory size, CPU allocation, and the number of nodes (parallelism). Regular users and even expert administrators struggle to understand the relationship between different parameter configurations and the overall performance of the system. In this work, we address this challenge by proposing a performance prediction framework that builds performance models over varied configurable parameters on Spark. Taking inspiration from the field of computational geometry, we construct a d-dimensional mesh using Delaunay triangulation over a selected set of features and predict the execution time of unknown feature configurations from this mesh. To minimize the time and resources spent building a model, we propose an adaptive sampling technique that collects as few training points as required. Our evaluation on a cluster of computers using several workloads shows that our prediction error is lower than that of state-of-the-art methods while requiring fewer training samples.
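    The geometric core of this approach can be reproduced with off-the-shelf tools: scipy builds the d-dimensional Delaunay mesh and interpolates linearly within each simplex. The snippet below is a minimal sketch with made-up sample data, not the authors' framework, and it omits the adaptive sampling loop.

    import numpy as np
    from scipy.interpolate import LinearNDInterpolator
    from scipy.spatial import Delaunay

    # Hypothetical training samples: (memory GB, cores, nodes) -> runtime in s.
    configs = np.array([[2, 4, 1], [8, 4, 2], [4, 8, 4],
                        [16, 16, 8], [8, 8, 2]], dtype=float)
    runtimes = np.array([410.0, 210.0, 150.0, 95.0, 160.0])

    mesh = Delaunay(configs)                        # d-dimensional triangulation
    predict = LinearNDInterpolator(mesh, runtimes)  # piecewise-linear on simplices
    print(predict([[6.0, 6.0, 2.0]]))               # NaN outside the convex hull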

    Reliable massively parallel symbolic computing : fault tolerance for a distributed Haskell

    As the number of cores in manycore systems grows exponentially, the number of failures is also predicted to grow exponentially. Hence massively parallel computations must be able to tolerate faults. Moreover, new approaches to language design and system architecture are needed to address the resilience of massively parallel heterogeneous architectures. Symbolic computation has underpinned key advances in mathematics and computer science, for example in number theory, cryptography, and coding theory. Computer algebra software systems facilitate symbolic mathematics. Developing these at scale has its own distinctive set of challenges, as symbolic algorithms tend to employ complex irregular data and control structures. SymGridParII is a middleware for parallel symbolic computing on massively parallel High Performance Computing platforms. A key element of SymGridParII is a domain-specific language (DSL) called Haskell Distributed Parallel Haskell (HdpH). It is explicitly designed for scalable distributed-memory parallelism and employs work stealing to load balance dynamically generated irregular task sizes. To investigate providing scalable fault-tolerant symbolic computation, we design, implement, and evaluate a reliable version of HdpH, called HdpH-RS. Its reliable scheduler detects and handles faults, using task replication as the key recovery strategy. The scheduler supports load balancing with a fault-tolerant work stealing protocol. The reliable scheduler is invoked through two fault tolerance primitives for implicit and explicit work placement, and through 10 fault-tolerant parallel skeletons that encapsulate common parallel programming patterns. The user is oblivious to many failures; they are instead handled by the scheduler. An operational semantics describes small-step reductions on states, and a simple abstract machine for scheduling transitions and task evaluation is presented. It defines the semantics of supervised futures and the transition rules for recovering tasks in the presence of failure. The transition rules are demonstrated with a fault-free execution and three executions that recover from faults. The fault-tolerant work stealing protocol has been abstracted into a Promela model, and the SPIN model checker is used to exhaustively search the model's state space to validate a key resiliency property of the protocol: an initially empty supervised future on the supervisor node will eventually be full in the presence of all possible combinations of failures. The performance of HdpH-RS is measured using five benchmarks. Supervised scheduling achieves a speedup of 757 with explicit task placement and 340 with lazy work stealing when executing Summatory Liouville on up to 1400 cores of an HPC architecture. Moreover, supervision overheads remain consistently low when scaling up to 1400 cores, and low recovery overheads are observed in the presence of frequent failure when lazy on-demand work stealing is used. A Chaos Monkey mechanism has been developed for stress testing resiliency with random failure combinations. All unit tests pass in the presence of random failure, terminating with the expected results.
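    The recovery strategy named above, replicating a supervised task when its worker fails, can be mimicked in a few lines of ordinary Python. The toy below simulates the fault with an exception and has none of HdpH-RS's distributed scheduling or work stealing; it only illustrates why the user stays oblivious to failures.

    import random
    from concurrent.futures import ProcessPoolExecutor

    def flaky_task(x):
        # Simulated node failure; the real case is a crashed remote worker.
        if random.random() < 0.3:
            raise RuntimeError("node down")
        return x * x

    def supervised(pool, fn, arg, retries=5):
        # A "supervised future": on failure the supervisor replicates the task
        # instead of propagating the fault, so the caller still gets a value.
        for _ in range(retries):
            try:
                return pool.submit(fn, arg).result()
            except RuntimeError:
                continue
        raise RuntimeError("task lost after %d replicas" % retries)

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            print([supervised(pool, flaky_task, i) for i in range(8)])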

    Cloud Based IoT Architecture

    The Internet of Things (IoT) and cloud computing have grown in popularity over the past decade as the internet becomes faster and more ubiquitous. Cloud platforms are well suited to handle IoT systems: they are accessible and resilient, and they provide a scalable solution for storing and analyzing large amounts of IoT data. IoT applications are complex software systems, and software developers need a thorough understanding of the capabilities, limitations, architecture, and design patterns of cloud platforms and cloud-based IoT tools to build an efficient, maintainable, and customizable IoT application. As the IoT landscape is constantly changing, research into cloud-based IoT platforms is either lacking or out of date. The goal of this thesis is to describe the basic components and requirements of a cloud-based IoT platform, to provide useful insights and experiences from implementing a cloud-based IoT solution using Microsoft Azure, and to discuss some of the shortcomings of combining IoT with a cloud platform.

    Design and development of auxiliary components for a new two-stroke, stratified-charge, lean-burn gasoline engine

    A unique stepped-piston engine was developed by a group of research engineers at Universiti Teknologi Malaysia (UTM) from 2003 to 2005. Their development work encompassed design, prototyping, and evaluation over a predetermined period and was iterative and challenging in nature. The main objective of the program was to demonstrate local R&D capability in small-engine work by producing a mobile power unit of comparable output, lower fuel consumption, and acceptable emissions relative to a crankcase-scavenged counterpart of similar displacement. A two-stroke engine was selected as it poses a number of technological challenges, notably raising its thermal efficiency, and success here would assist the group in future powertrain work at UTM. In its carbureted version, the single-cylinder air-cooled engine incorporates a three-port transfer system and a dedicated crankcase breather. These features give the prototype high induction efficiency and let it behave very much like a two-stroke engine while being equipped with a four-stroke crankcase lubrication system. After a series of analytical studies, the engine was subjected to laboratory trials. It was also tested on a small watercraft platform, with promising indications of its flexibility as a prime mover for mobile platforms. To further enhance its technology features, the researchers also embarked on the development of an add-on auxiliary system comprising an engine control unit (ECU), a direct-injector unit, a dedicated lubricant dispenser unit, and an embedded common-rail fuel unit. This support system was incorporated into the engine to demonstrate its environmentally friendly and fuel-economy features. The outcome of this complete package is described in the report, covering the methodology and the final characteristics of the mobile power plant.