Group-based parallel multi-scheduler for grid computing
With the advent of multicore computers, the scheduling of Grid jobs can be made more effective if scaled to fully utilize the underlying hardware and parallelized to exploit the multiple cores. The fact that sequential algorithms neither scale with multicore systems nor benefit from parallelism remains a major obstacle to scheduling in the Grid. As multicore systems become ever more pervasive in our computing lives, over-reliance on such systems for passive parallelism does not best harness their multiprocessors for Grid scheduling. An explicit means of exploiting parallelism for Grid scheduling is required. The Group-based Parallel Multi-scheduler, introduced in this paper, aims to exploit the benefits of multicore systems for Grid scheduling by splitting jobs and machines into paired groups and scheduling jobs independently and in parallel within those groups. We implemented two job grouping methods, Execution Time Balanced (ETB) and Execution Time Sorted then Balanced (ETSB), and two machine grouping methods, Evenly Distributed (EvenDist) and Similar Together (SimTog). For each method, we varied the number of groups between 2, 4 and 8, then executed the MinMin Grid scheduling algorithm independently within each group. We demonstrated that by splitting jobs and machines into groups before scheduling, the computation time of the scheduling process improved by as much as 85% over the ordinary MinMin algorithm when implemented on an HPC system. We also found that our balanced group-based approach achieved better results than our previous priority-based grouping approach.
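The group-then-schedule idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the job sizes, machine speeds, and two-group split are hypothetical, and only the ETB job grouping and an EvenDist-style machine split are shown.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def etb_groups(jobs, n_groups):
    """Execution Time Balanced (ETB): greedily place each job in the
    group with the smallest accumulated execution time."""
    groups = [[] for _ in range(n_groups)]
    totals = [0.0] * n_groups
    for job in jobs:
        i = totals.index(min(totals))
        groups[i].append(job)
        totals[i] += job
    return groups

def min_min(jobs, speeds):
    """Classic MinMin: repeatedly pick the (job, machine) pair with the
    smallest completion time, assign it, and update machine ready times."""
    ready = [0.0] * len(speeds)
    schedule = []
    remaining = list(jobs)
    while remaining:
        job, m, finish = min(
            ((j, m, ready[m] + j / speeds[m])
             for j in remaining for m in range(len(speeds))),
            key=lambda t: t[2])
        ready[m] = finish
        schedule.append((job, m))
        remaining.remove(job)
    return schedule, max(ready)

random.seed(42)
jobs = [random.uniform(1, 100) for _ in range(40)]   # hypothetical job sizes
speeds = [1.0, 2.0, 4.0, 8.0]                        # hypothetical machine speeds
job_groups = etb_groups(jobs, 2)
machine_groups = [speeds[::2], speeds[1::2]]         # EvenDist-style split
with ThreadPoolExecutor() as pool:                   # schedule each group in parallel
    results = list(pool.map(min_min, job_groups, machine_groups))
```

Because each group is scheduled independently, MinMin's per-iteration scan is over a fraction of the jobs and machines, which is where the reported speed-up comes from.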
Priority-grouping method for parallel multi-scheduling in Grid
Automation and analysis of high-dimensionality experiments in biocatalytic reaction screening
Biological catalysts are increasingly used in industry in high-throughput screening for
drug discovery or for the biocatalytic synthesis of active pharmaceutical
intermediates (APIs). Their activity is dependent on high-dimensionality
physicochemical processes, which are affected by numerous potentially interacting
factors such as temperature, pH, substrates, solvents, salinity, and so on. To generate
accurate models that map the performance of such systems, it is critical to develop
effective experimental and analytical frameworks. However, investigating numerous
factors of interest can become unfeasible for conventional manual experimentation
which can be time-consuming and prone to human error.
In this thesis, an effective framework for the execution and analysis of high-dimensionality experiments implementing a Design of Experiments (DoE)
methodology was created. DoE applies a statistical framework to the simultaneous
investigation of multiple factors of interest. To convert the DoE design into a
physically executable experiment, the Synthace Life Sciences R&D cloud platform was
used where experimental conditions were translated into liquid handling instructions
and executed on multiple automated devices. The framework was exemplified by
quantifying the activity of an industrially relevant biocatalyst, the CV2025 ω-transaminase enzyme from Chromobacterium violaceum, for the conversion of S-methylbenzylamine (MBA) and pyruvate into acetophenone and sodium alanine.
The automation and analysis of high-dimensionality experiments for screening of the
CV2025 TAm biocatalytic reaction were carried out in three sequential stages. In the
first stage, the basic process of Synthace-driven automated DoE execution was
demonstrated by executing traditional DoE studies. This comprised a screening
study that investigated the impact of nine factors of interest, after which an
optimisation study was conducted by taking forward five factors of interest using two
automated devices to optimise assay conditions further. In total, 480 experimental
conditions were executed and analysed to generate mathematical models that
identified an optimum. Robust assay conditions were identified which increased
enzyme activity >3-fold over the starting conditions. In the second stage, non-biological considerations that impact absorbance-based assay performance were
systematically investigated. These considerations were critical to ensuring reliable
and precise data generation from future high-dimensionality experiments and
included confirming spectrophotometer settings, selecting the microplate type and reaction volume, testing device precision, and managing evaporation as a function of
time.
The final stage of the work involved development of a framework for the
implementation of a modern type of DoE design called a space-filling design (SFD).
SFDs sample factors of interest at numerous settings and can provide a fine-grained
characterisation of high-dimensional systems in a single experimental run. However,
they are rarely used in biological research due to the large number of experiments
required and their demanding, highly variable pipetting requirements. The
established framework enabled the execution and analysis of an automated end-to-end SFD where 3,456 experimental conditions were prepared to investigate a
12-dimensional space characterising CV2025 TAm activity. Factors of interest included
temperature, pH, buffering agent types, enzyme stability, co-factor, substrate, salt,
and solvent concentrations. MATLAB scripts were developed to calculate important
biocatalysis metrics of product yield and initial rate which were then used to build
mathematical models that were physically validated to confirm successful model
prediction. The implementation of the framework provided greater insight into
numerous factors influencing CV2025 TAm activity in more dimensions than
previously reported in the literature and, to our knowledge, is the first large-scale
study that employs an SFD for assay characterisation.
The developed framework is generic in nature and represents a powerful tool for
rapid one-step characterisation of high-dimensionality systems. Industrial
implementation of the framework could help reduce the time and costs involved in
the development of high-throughput screens and biocatalytic reaction optimisation.
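The product-yield and initial-rate calculations that the thesis implements as MATLAB scripts can be sketched in Python as follows. The extinction coefficient, path length, and the assumption that the first few readings lie on the linear early phase are illustrative placeholders, not values from the thesis.

```python
def initial_rate(times, absorbances, ext_coeff, path_length=1.0, n_points=5):
    """Estimate the initial reaction rate (mM/min) from the early, linear
    part of an absorbance trace: convert absorbance to concentration via
    Beer-Lambert (A = eps * c * l), then fit a least-squares line through
    the first n_points readings and return its slope."""
    conc = [a / (ext_coeff * path_length) for a in absorbances[:n_points]]
    t = times[:n_points]
    n = len(t)
    mean_t = sum(t) / n
    mean_c = sum(conc) / n
    return (sum((ti - mean_t) * (ci - mean_c) for ti, ci in zip(t, conc))
            / sum((ti - mean_t) ** 2 for ti in t))

def product_yield(final_conc_mM, substrate_mM):
    """Product yield (%) relative to the limiting substrate loading."""
    return 100.0 * final_conc_mM / substrate_mM

# illustrative trace for acetophenone formation, using an assumed
# extinction coefficient of 12 mM^-1 cm^-1 (placeholder value)
rate = initial_rate([0, 1, 2, 3, 4], [0.0, 2.4, 4.8, 7.2, 9.6], ext_coeff=12.0)
```

On exactly linear data the fitted slope recovers the underlying rate; on real traces, `n_points` controls how much of the early phase is treated as linear.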
UQ and AI: data fusion, inverse identification, and multiscale uncertainty propagation in aerospace components
A key requirement for engineering designs is that they offer good performance across a range of uncertain conditions while exhibiting an admissibly low probability of failure. Meeting this requirement means accounting for the effect of the uncertainties associated with a candidate design. Uncertainty Quantification (UQ) methods are statistical methods that quantify the effect of the uncertainties inherent in a system on its performance. This thesis expands the envelope of UQ methods for the design of aerospace components, supporting the integration of UQ methods into product development by addressing four industrial challenges.
Firstly, a method for propagating uncertainty through computational models in a hierarchy of scales is described that is based on probabilistic equivalence and Non-Intrusive Polynomial Chaos (NIPC). This problem is relevant to the design of aerospace components because the computational models used to evaluate candidate designs are typically multiscale. The method was then extended into a formulation for inverse identification, in which the probability distributions of the material properties of a coupon are deduced from measurements of its response. We demonstrate how probabilistic equivalence and the Maximum Entropy Principle (MEP) may be used to combine simulation data with scarce experimental data, with the intention of making this stage of product design less expensive and time-consuming.
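To make the non-intrusive projection step concrete, the sketch below estimates Hermite polynomial chaos coefficients for a toy model of a standard normal input by Monte Carlo projection. The model, truncation order, and sample count are hypothetical and are not taken from the thesis.

```python
import math
import random

def hermite(k, x):
    """Probabilists' Hermite polynomial He_k(x) via the recurrence
    He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x)."""
    h0, h1 = 1.0, x
    if k == 0:
        return h0
    for n in range(1, k):
        h0, h1 = h1, x * h1 - n * h0
    return h1

def nipc_coeffs(model, order, n_samples=200_000, seed=1):
    """Non-intrusive projection for X ~ N(0,1):
    c_k = E[f(X) * He_k(X)] / k!, estimated by Monte Carlo."""
    rng = random.Random(seed)
    sums = [0.0] * (order + 1)
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        fx = model(x)
        for k in range(order + 1):
            sums[k] += fx * hermite(k, x)
    return [s / (n_samples * math.factorial(k)) for k, s in enumerate(sums)]

# toy model f(x) = x^2 + x, whose exact expansion is He0 + He1 + He2
coeffs = nipc_coeffs(lambda x: x * x + x, order=3)
mean = coeffs[0]                     # output mean is the zeroth coefficient
var = sum(math.factorial(k) * c * c  # output variance from the remaining terms
          for k, c in enumerate(coeffs) if k > 0)
```

Once the coefficients are known, output statistics fall out of the orthogonality of the basis, which is what makes matching distributions between scales ("probabilistic equivalence") tractable.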
The third contribution of this thesis is to develop two novel meta-modelling strategies to promote the wider exploration of the design space during the conceptual design phase. Design Space Exploration (DSE) in this phase is crucial as decisions made at the early, conceptual stages of an aircraft design can restrict the range of alternative designs available at later stages in the design process, despite limited quantitative knowledge of the interaction between requirements being available at this stage. A histogram interpolation algorithm is presented that allows the designer to interactively explore the design space with a model-free formulation, while a meta-model based on Knowledge Based Neural Networks (KBaNNs) is proposed in which the outputs of a high-level, inexpensive computer code are informed by the outputs of a neural network, in this way addressing the criticism of neural networks that they are purely data-driven and operate as black boxes.
The final challenge addressed by this thesis is how to iteratively improve a meta-model by expanding the dataset used to train it. Given the reliance of UQ methods on meta-models, this is an important challenge. This thesis proposes an adaptive learning algorithm for Support Vector Machine (SVM) meta-models, which are used to approximate an unknown function. In particular, we apply the adaptive learning algorithm to test cases in reliability analysis.
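The adaptive-learning loop can be illustrated with a deliberately simplified stand-in: a tiny perceptron replaces the SVM meta-model (to keep the sketch dependency-free), and the limit state, candidate pool, and budget are all hypothetical. The point being shown is only the sampling criterion: at each iteration, the candidate closest to the current decision boundary is labelled and added to the training set.

```python
import random

def train_linear(points, labels, epochs=200, lr=0.1):
    """Tiny perceptron stand-in for the SVM meta-model (illustrative only)."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x, y), t in zip(points, labels):
            if t * (w[0] * x + w[1] * y + b) <= 0:   # misclassified: update
                w[0] += lr * t * x
                w[1] += lr * t * y
                b += lr * t
    return w, b

def limit_state(x, y):
    """Hypothetical limit state: failure when x + y > 1."""
    return 1 - x - y

random.seed(0)
pool = [(random.random(), random.random()) for _ in range(500)]
train = random.sample(pool, 10)
labels = [1 if limit_state(x, y) > 0 else -1 for x, y in train]

for _ in range(20):                       # adaptive enrichment iterations
    w, b = train_linear(train, labels)
    # pick the unlabelled candidate closest to the current boundary
    cand = min((p for p in pool if p not in train),
               key=lambda p: abs(w[0] * p[0] + w[1] * p[1] + b))
    train.append(cand)
    labels.append(1 if limit_state(*cand) > 0 else -1)

w, b = train_linear(train, labels)
# failure-probability estimate from the refined meta-model over the pool
pf = sum(w[0] * x + w[1] * y + b < 0 for x, y in pool) / len(pool)
```

Concentrating new samples near the approximated limit state is what makes such schemes efficient for reliability analysis, where only the sign of the response matters.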
Simulating Land Use Land Cover Change Using Data Mining and Machine Learning Algorithms
The objectives of this dissertation are to: (1) review the breadth and depth of land use land cover (LUCC) issues that are being addressed by the land change science community by discussing how an existing model, Purdue's Land Transformation Model (LTM), has been used to better understand these very important issues; (2) summarize the current state-of-the-art in LUCC modeling in an attempt to provide a context for the advances in LUCC modeling presented here; (3) use a variety of statistical, data mining and machine learning algorithms to model single LUCC transitions in diverse regions of the world (e.g. United States and Africa) in order to determine which tools are most effective in modeling common LUCC patterns that are nonlinear; (4) develop new techniques for modeling multiple class (MC) transitions at the same time using existing LUCC models as these models are rare and in great demand; (5) reconfigure the existing LTM for urban growth boundary (UGB) simulation because UGB modeling has been ignored by the LUCC modeling community, and (6) compare two rule based models for urban growth boundary simulation for use in UGB land use planning.
The review of LTM applications during the last decade indicates that a model like the LTM has addressed a majority of land change science issues, although it has not explicitly been used to study terrestrial biodiversity issues. The review of existing LUCC models indicates that there is no unique typology to differentiate between LUCC model structures and that no models exist for UGBs. Simulations designed to compare multiple models show that ANN-based LTM results are similar to Multivariate Adaptive Regression Spline (MARS)-based models, and both ANN- and MARS-based models outperform Classification and Regression Tree (CART)-based models for modeling a single LULC transition; however, for modeling MC transitions, an ANN-based LTM-MC is similar in goodness of fit to CART, and both models outperform MARS in different regions of the world. In simulations across three regions (two in the United States and one in Africa), the LTM had better goodness-of-fit measures, while the outcomes of CART and MARS were more interpretable and understandable than the ANN-based LTM. Modeling MC LUCC requires the examination of several class separation rules and is thus more complicated than single LULC transition modeling; more research is clearly needed in this area. One of the greatest challenges identified with MC modeling is evaluating error distributions and map accuracies for multiple classes. A modified ANN-based LTM and a simple rule-based UGBM outperformed a null model in all cardinal directions. For the UGBM to be useful for planning, other factors need to be considered, including a separate routine that would determine urban quantity over time.
Feasibility analysis of using special purpose machines for drilling-related operations
This work focuses on special purpose machine tools (SPMs), which provide a modular platform for performing drilling-related operations. One of the main challenges in using SPMs is selecting the most appropriate machine tool among many alternatives. This thesis introduces a feasibility analysis procedure developed to support decision-making through the assessment of the strengths and limitations of SPMs. To achieve this, technical and economic feasibility analyses, a sensitivity analysis, and an optimisation model were developed, and a case study was provided for each analysis. The results indicated that although technical feasibility analysis leads decision-makers to select a feasible machine tool, complementary analyses are required for making an informed decision and improving profitability. Accordingly, a mathematical cost model was developed to perform economic and sensitivity analyses and to investigate the profitability of any selected SPM configuration. In addition, an optimisation procedure was applied to the cost model in order to investigate the effect of process parameters and the SPM configuration on the decision-making. Finally, the developed analyses were integrated, in a proper sequence, into a model that can evaluate whether an SPM is appropriate for producing a given part and achieving higher productivity. To validate this integrated model, three case studies were presented and their results discussed. The results showed that the developed model is a very useful tool for assisting manufacturers in evaluating the performance of SPMs against other alternatives from different perspectives.
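The economic and sensitivity analyses described above can be illustrated with a toy per-part cost model. All figures and the cost structure (straight-line depreciation plus labour plus tooling) are hypothetical simplifications, not the thesis's actual model.

```python
def cost_per_part(capital, salvage, life_years, parts_per_year,
                  labour_rate_per_h, cycle_time_s, tooling_per_part):
    """Per-part cost = straight-line depreciation + labour + tooling.
    All inputs are illustrative placeholders."""
    depreciation = (capital - salvage) / (life_years * parts_per_year)
    labour = labour_rate_per_h * cycle_time_s / 3600.0
    return depreciation + labour + tooling_per_part

# sensitivity to annual volume: a hypothetical SPM (high capital, short
# cycle time) versus a hypothetical conventional machine (the reverse)
for volume in (10_000, 50_000, 250_000):
    spm = cost_per_part(200_000, 20_000, 10, volume, 36.0, 12, 0.05)
    conv = cost_per_part(80_000, 8_000, 10, volume, 36.0, 45, 0.08)
    print(volume, round(spm, 3), round(conv, 3))
```

Even this toy model reproduces the qualitative conclusion that a technically feasible SPM is only economically attractive above a break-even production volume, which is why the thesis pairs the technical analysis with economic and sensitivity analyses.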
Scalable Task Schedulers for Many-Core Architectures
This thesis develops schedulers for many-core architectures with different optimization objectives. The proposed schedulers are designed to scale up as the number of cores increases, while continuing to provide guarantees on the quality of the schedule.
Validation of MultiView-based process models using graphical validation rules
The importance and prevalence of software are growing steadily in both business and private settings. The primary goal of using software is to optimise manual or already (partially) automated problems and tasks.
The central point of reference in software development is the software specification. Ideally, it contains all requirements relevant to the software solution.
Business process models are an increasingly important part of the specification. They describe the workflows of the software solution to be developed in the form of graphical process representations. As process models are increasingly enriched with requirements and information, such as legal provisions or details for model-driven software development, simple workflow diagrams grow into complex and extensive business process models.
Regardless of whether business process models serve purely for specification and documentation or are used for model-driven software development, a central goal is to ensure the correctness of their content and thus of the requirements modelled in them. Current software development processes often employ manual review procedures for this purpose, which are frequently time-consuming, costly, and error-prone. Automatable procedures, however, require formal specification languages, which are usually rejected in the context of business process modelling because of their mathematical-looking textual notation. In contrast to textual notations, graphical representations are often easier to understand and are more readily accepted, especially in business process modelling.
This thesis therefore presents a concept for checking the content correctness of business process models based on formal graphical validation rules. The concept is independent of both the modelling language of the business process models and the specification language of the validation rules.
To improve the manageability of increasingly complex and extensive business process models, a view concept called MultiView is also introduced. It reduces graphical complexity and assigns areas of responsibility (for example, data protection and security modelling) in business process modelling.
The overall concept was prototypically implemented in the ARIS Business Architect software and as a plug-in for the Eclipse development environment. It was evaluated both by applying a Requirements Engineering Tool Evaluation Framework to the Eclipse plug-in and by means of use cases from public administration, the ELSTER tax return, and SAP reference processes.