
    State-of-the-art in aerodynamic shape optimisation methods

    Aerodynamic optimisation has become an indispensable component of aerodynamic design over the past 60 years, with applications to aircraft, cars, trains, bridges, wind turbines, internal pipe flows, and cavities, among others, and is thus relevant to many facets of technology. With advances in computational power, automated design optimisation procedures have become more capable; however, the literature remains ambiguous and biased regarding the relative performance of optimisation architectures and the algorithms they employ. This paper provides a balanced critical review of the dominant optimisation approaches that have been integrated with aerodynamic theory for the purpose of shape optimisation. A total of 229 papers, published in more than 120 journals and conference proceedings, have been classified into six optimisation algorithm approaches. The material cited includes some of the most well-established authors and publications in the field of aerodynamic optimisation. This paper aims to eliminate bias toward particular algorithms by analysing the limitations, drawbacks, and benefits of the most widely used optimisation approaches. The review provides comprehensive but straightforward insight for non-specialists and a reference detailing the current state of the art for specialist practitioners.

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research investigation, in developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NNs) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been applied successfully in many areas, including science, engineering, medicine, business, banking, and telecommunications. This paper highlights, from a data mining perspective, the implementation of NNs, using supervised and unsupervised learning, for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
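    As a concrete illustration of the supervised usage described above, the following is a minimal sketch of NN-based classification using scikit-learn's MLPClassifier; the dataset and all parameters are illustrative choices, not taken from the paper.

```python
# Minimal sketch: a feed-forward NN classifier for a data mining task.
# Illustrative only; the paper does not prescribe this library or dataset.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale inputs, then train a small multi-layer perceptron.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```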

    Probabilistic modelling of oil rig drilling operations for business decision support: a real world application of Bayesian networks and computational intelligence.

    This work investigates the use of evolved Bayesian network learning algorithms based on computational-intelligence meta-heuristics. These algorithms are applied to a new domain using exclusive data made available to this project through an industry partnership with ODS-Petrodata, a business intelligence company in Aberdeen, Scotland. This research proposes statistical models that serve as the foundation for a novel operational tool for forecasting the performance of rig drilling operations. A prototype tool able to forecast the future performance of a drilling operation is created using the obtained data, the statistical model, and the experts' domain knowledge. This work makes the following contributions: applying K2GA and Bayesian networks to a real-world industry problem; developing a well-performing and adaptive solution to forecast oil drilling rig performance; using the knowledge of industry experts to guide the creation of competitive models; creating models able to forecast oil drilling rig performance consistently with nearly 80% forecast accuracy, using either logistic regression or Bayesian network learning with genetic algorithms; introducing the node juxtaposition analysis graph, which visualises how frequently node links appear across a set of orderings, thereby providing new insights when analysing node ordering landscapes; exploring the correlation between model score and model predictive accuracy, and showing that the two do not correlate; exploring a method for feature selection using multiple algorithms, drastically reducing the modelling time; and proposing new fixed-structure Bayesian network learning algorithms for node ordering search-space exploration. Finally, this work proposes real-world applications for the models based on current industry needs, such as recommender systems, an oil drilling rig selection tool, user-ready rig performance forecasting software, and rig scheduling tools.
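    To make the K2GA idea concrete, here is a minimal sketch of a genetic algorithm searching over node orderings, where each ordering is scored by the K2 greedy parent search. This is a reconstruction under stated assumptions: all function names, operators, and parameters are illustrative and not taken from the thesis.

```python
# Minimal sketch of K2GA: a genetic algorithm over node orderings, with each
# ordering scored by running the K2 greedy parent search under it.
# Illustrative assumptions throughout; data is an integer-coded numpy array.
from itertools import product

import numpy as np
from scipy.special import gammaln

def k2_family_score(data, node, parents, arity):
    """Log K2 metric for one node given a parent set (Cooper & Herskovits)."""
    r = arity[node]
    score = 0.0
    for ps in product(*[range(arity[p]) for p in parents]):
        mask = np.ones(len(data), dtype=bool)
        for p, v in zip(parents, ps):
            mask &= data[:, p] == v
        counts = np.bincount(data[mask, node], minlength=r)
        n = counts.sum()
        score += gammaln(r) - gammaln(n + r) + gammaln(counts + 1).sum()
    return score

def k2_search(data, ordering, arity, max_parents=2):
    """Greedy K2 parent selection respecting an ordering; returns total score."""
    total = 0.0
    for i, node in enumerate(ordering):
        parents, best = [], k2_family_score(data, node, [], arity)
        improved = True
        while improved and len(parents) < max_parents:
            improved = False
            for cand in ordering[:i]:
                if cand in parents:
                    continue
                s = k2_family_score(data, node, parents + [cand], arity)
                if s > best:
                    best, chosen, improved = s, cand, True
            if improved:
                parents.append(chosen)
        total += best
    return total

def k2ga(data, arity, pop=20, gens=30, rng=np.random.default_rng(0)):
    """Evolve node orderings: keep the top half, mutate it by swapping."""
    n = data.shape[1]
    population = [rng.permutation(n).tolist() for _ in range(pop)]
    for _ in range(gens):
        scores = [k2_search(data, o, arity) for o in population]
        ranked = [o for _, o in sorted(zip(scores, population), reverse=True)]
        elite = ranked[: pop // 2]
        children = []
        for parent in elite:
            child = parent[:]
            i, j = rng.choice(n, size=2, replace=False)
            child[i], child[j] = child[j], child[i]  # swap mutation
            children.append(child)
        population = elite + children
    return max(population, key=lambda o: k2_search(data, o, arity))
```

    For brevity this sketch uses only swap mutation; a full K2GA would normally add an ordering-preserving crossover operator as well.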

    A framework for automation of data recording, modelling, and optimal statistical control of production lines

    Unarguably, the automation of data collection and the subsequent statistical treatment enhance the quality of industrial management systems. The rise of accessible digital technologies has enabled the introduction of the Industry 4.0 pillars in local companies of the Cariri region. In particular, such practice contributes positively to the triple bottom line of sustainable development: People, Environment, and Economy. The present work aims to provide a general automated framework for data recording and statistical control of conveyor belts in production lines. The software has been developed in four layers: a graphical user interface, in PHP; database collection, search, and safeguarding, in MySQL; computational statistics, in R; and hardware control, in C. The computational statistics are based on the combination of artificial neural networks and autoregressive integrated moving average (ARIMA) models, via the minimal variance method. The hardware consists of open-source components, such as Arduino-based boards, together with modular or industrial sensors. Specifically, the embedded system is designed to constantly monitor and record a number of measurable characteristics of the conveyor belts (e.g. electric consumption and temperature) via a number of sensors, allowing both the computation of statistical control metrics and the evaluation of the quality of the production system. As a case study, the project makes use of a laminated limestone production line located at the Mineral Technology Center, Nova Olinda, Ceará state, Brazil.
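    The NN-ARIMA combination via the minimal variance method can be sketched as follows, assuming the standard minimum-variance weighting of two forecasts by their inverse error variances; this is an illustrative reconstruction, not the project's actual code (which is written in R).

```python
# Minimal sketch: minimum-variance combination of an ARIMA forecast and a
# neural-network forecast, weighted by inverse error variance on held-out data.
# All numbers are invented for illustration.
import numpy as np

def min_variance_weights(err_a, err_b):
    """Weights proportional to the inverse error variance of each model."""
    var_a, var_b = np.var(err_a), np.var(err_b)
    w_a = var_b / (var_a + var_b)
    return w_a, 1.0 - w_a

# Suppose we have validation-period errors and new forecasts from both models.
errors_arima = np.array([0.3, -0.1, 0.4, 0.2])
errors_nn = np.array([0.1, 0.0, -0.2, 0.1])
forecast_arima, forecast_nn = 51.2, 50.4

w_arima, w_nn = min_variance_weights(errors_arima, errors_nn)
combined = w_arima * forecast_arima + w_nn * forecast_nn
print(f"weights: ARIMA={w_arima:.2f}, NN={w_nn:.2f}, combined={combined:.2f}")
```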

    Bayesian networks for spatio-temporal integrated catchment assessment

    In this thesis, a methodology for integrated catchment water resources assessment using Bayesian networks was developed. A custom-made software application that combines Bayesian networks with GIS was used to facilitate data pre-processing and spatial modelling. Dynamic Bayesian networks were implemented in the software for time-series modelling.
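    As a sketch of the time-series (dynamic Bayesian network) component, the following illustrates forward filtering over a simple two-slice model with one hidden catchment state; all states, probabilities, and observations are invented for illustration and do not come from the thesis.

```python
# Minimal sketch: forward filtering in a simple dynamic Bayesian network with
# one hidden catchment state and one observation per time step.
import numpy as np

states = ["dry", "wet"]
# P(state_t | state_{t-1}): rows are previous state, columns are next state.
transition = np.array([[0.8, 0.2],
                       [0.3, 0.7]])
# P(obs | state): columns are observations "low flow" (0) and "high flow" (1).
emission = np.array([[0.9, 0.1],
                     [0.2, 0.8]])

belief = np.array([0.5, 0.5])           # prior over the hidden state
observations = [0, 0, 1, 1, 1]          # observed flow classes over time

for obs in observations:
    belief = transition.T @ belief      # predict: propagate one time slice
    belief *= emission[:, obs]          # update: weight by the observation
    belief /= belief.sum()              # normalise
    print(dict(zip(states, np.round(belief, 3))))
```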

    Machine learning of genomic profiles

    This thesis is concerned with machine learning and its application to genomic profiles. Machine learning is a branch of computer science concerned with the analysis and design of algorithms that derive rules and patterns from datasets. Genomic profiles describe alterations of the DNA, e.g. in its copy number. Tumour diseases are often caused by these genomic alterations. Various machine learning methods are examined for their applicability to genomic profiles. Furthermore, a loss function for survival-time data is designed. An analytical framework is then developed to find aberration patterns associated with a specific tumour disease. The framework comprises the pre-processing, feature selection, and discretisation of genomic profiles, as well as strategies for handling missing values and a multidimensional analysis, followed by the training and analysis of the classifier. This thesis also presents an explanation component that identifies the features important for the classification of a case and provides a measure of the correctness of a classification. Such an explanation component can form the basis for integrating a classifier, e.g. a support vector machine, into a decision support system. The methods developed in this thesis were successfully applied to biological questions such as early metastasis and micrometastasis, and led to the discovery of previously unknown tumour markers. In summary, the results of this thesis show that machine learning methods contribute to the understanding of genomic alterations and point to ways of further improving therapy for tumour patients.
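    One plausible reading of the explanation component, for a linear classifier, is to rank per-case feature contributions; the sketch below assumes a linear SVM and synthetic copy-number data, neither of which is taken from the thesis.

```python
# Minimal sketch: a linear SVM on synthetic copy-number profiles, with a
# per-case "explanation" ranking features by their contribution w_i * x_i.
# Synthetic data and the contribution heuristic are illustrative assumptions.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_cases, n_regions = 200, 30
X = rng.normal(0, 1, (n_cases, n_regions))      # simulated aberration scores
w_true = np.zeros(n_regions)
w_true[[3, 7]] = [2.0, -1.5]                    # two informative regions
y = (X @ w_true + rng.normal(0, 0.5, n_cases) > 0).astype(int)

clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

case = X[0]
contrib = clf.coef_[0] * case                   # per-feature contribution
top = np.argsort(np.abs(contrib))[::-1][:3]
for i in top:
    print(f"region {i}: contribution {contrib[i]:+.2f}")
print("decision value:", float(clf.decision_function([case])[0]))
```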

    Novel sampling techniques for reservoir history matching optimisation and uncertainty quantification in flow prediction

    Modern reservoir management has an increasing focus on accurately predicting the likely range of field recoveries. A variety of assisted history matching techniques has been developed across the research community concerned with this topic. These techniques are based on obtaining multiple models that closely reproduce the historical flow behaviour of a reservoir. The resulting set of history-matched models is then used to quantify uncertainty in predicting the future performance of the reservoir and to provide economic evaluations for different field development strategies. The key step in this workflow is to employ algorithms that sample the parameter space in an efficient and appropriate manner. The choice of algorithm affects how fast a model is obtained and how well the model fits the production data. The sampling techniques developed to date include, among others, gradient-based methods, evolutionary algorithms, and the ensemble Kalman filter (EnKF). This thesis has investigated and further developed the following sampling and inference techniques: Particle Swarm Optimisation (PSO), Hamiltonian Monte Carlo, and Population Markov Chain Monte Carlo. The inspected techniques can navigate the parameter space and produce history-matched models that can be used to quantify the uncertainty in the forecasts in a faster and more reliable way. The analysis of these techniques, compared with the Neighbourhood Algorithm (NA), has shown how the different techniques affect the predicted recovery from petroleum systems and the benefits of the developed methods over the NA. The history matching problem is multi-objective in nature, with the production data possibly consisting of multiple types, coming from different wells, and collected at different times. Multiple objectives can be constructed from these data and explicitly optimised in a multi-objective scheme. The thesis has extended PSO to handle multi-objective history matching problems in which a number of possibly conflicting objectives must be satisfied simultaneously. The benefits and efficiency of the multi-objective particle swarm optimisation scheme (MOPSO) are demonstrated on synthetic reservoirs. It is demonstrated that the MOPSO procedure can provide a substantial improvement in finding a diverse set of good-fitting models with fewer of the very costly forward simulation runs than the standard single-objective case, depending on how the objectives are constructed. The thesis has also shown how to tackle a large number of unknown parameters by coupling high-performance global optimisation algorithms, such as PSO, with model reduction techniques such as kernel principal component analysis (PCA) for parameterising spatially correlated random fields. The results of the PSO-PCA coupling applied to a recent SPE benchmark history matching problem demonstrate that the approach is applicable to practical problems. A comparison of PSO with the EnKF data assimilation method concluded that both methods obtained comparable results on the example case. This reinforces the need for a range of assisted history matching algorithms for more confidence in predictions.
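    A minimal sketch of the core PSO loop used in such sampling is shown below, with a simple quadratic misfit standing in for the costly reservoir-simulation mismatch; the swarm size, inertia, and acceleration coefficients are illustrative assumptions, not values from the thesis.

```python
# Minimal sketch of particle swarm optimisation minimising a misfit function.
# The quadratic misfit is a placeholder for the history-match mismatch between
# simulated and observed production data.
import numpy as np

def misfit(x):
    """Placeholder misfit; a real run would call the reservoir simulator."""
    return np.sum((x - 3.0) ** 2, axis=-1)

rng = np.random.default_rng(0)
n_particles, n_dims, n_iters = 30, 5, 100
w, c1, c2 = 0.7, 1.5, 1.5                     # inertia and acceleration terms

pos = rng.uniform(-10, 10, (n_particles, n_dims))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = misfit(pos)
gbest = pbest[np.argmin(pbest_val)].copy()

for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, n_dims))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos += vel
    val = misfit(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)].copy()

print("best misfit:", misfit(gbest), "at", np.round(gbest, 2))
```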

    Learning to hash for large scale image retrieval

    This thesis is concerned with improving the effectiveness of nearest neighbour search. Nearest neighbour search is the problem of finding the data-points in a database most similar to a query, and is a fundamental operation with wide applicability in many fields. The focus of this thesis is on hashing-based approximate nearest neighbour search methods that generate similar binary hashcodes for similar data-points. These hashcodes can be used as indices into the buckets of hash tables for fast search. This work explores how the quality of search can be improved by learning task-specific binary hashcodes. The generation of a binary hashcode comprises two main steps carried out sequentially: projection of the image feature vector onto the normal vectors of a set of hyperplanes partitioning the input feature space, followed by a quantisation operation that uses a single threshold to binarise the resulting projections into hashcodes. The degree to which these operations preserve the relative distances between the data-points in the input feature space directly influences the effectiveness of using the resulting hashcodes for nearest neighbour search. In this thesis I argue that the retrieval effectiveness of existing hashing-based nearest neighbour search methods can be increased by learning the thresholds and hyperplanes from the distribution of the input data. The first contribution is a model for learning multiple quantisation thresholds. I demonstrate that the best threshold positioning is projection specific and introduce a novel clustering algorithm for threshold optimisation. The second contribution extends this algorithm by learning the optimal allocation of quantisation thresholds per hyperplane. In doing so I argue that some hyperplanes are naturally more effective than others at capturing the distribution of the data and should therefore attract a greater allocation of quantisation thresholds. The third contribution focuses on the complementary problem of learning the hashing hyperplanes. I introduce a multi-step iterative model that, in the first step, regularises the hashcodes over a data-point adjacency graph, which encourages similar data-points to be assigned similar hashcodes. In the second step, binary classifiers are learnt to separate opposing bits with maximum margin. This algorithm is extended to learn hyperplanes that can generate similar hashcodes for similar data-points in two different feature spaces (e.g. text and images). Individually the performance of these algorithms is often superior to competitive baselines. I unify my contributions by demonstrating that learning hyperplanes and thresholds as part of the same model can yield an additive increase in retrieval effectiveness.
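    The projection-then-quantisation pipeline described above can be sketched as follows, using random hyperplanes and a single zero threshold as the simplest baseline (the thesis learns both instead of fixing them); the data and dimensions are illustrative.

```python
# Minimal sketch: hashing-based approximate nearest neighbour search.
# Random hyperplanes project the data; a single zero threshold binarises the
# projections into hashcodes, which are compared by Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_dims, n_bits = 1000, 64, 16

database = rng.normal(size=(n_points, n_dims))
hyperplanes = rng.normal(size=(n_dims, n_bits))   # normal vectors, one per bit

def hashcode(x):
    """Project onto hyperplane normals, then quantise at threshold zero."""
    return (x @ hyperplanes > 0).astype(np.uint8)

codes = hashcode(database)

query = rng.normal(size=n_dims)
q_code = hashcode(query)

# Rank database points by Hamming distance to the query's hashcode.
hamming = (codes != q_code).sum(axis=1)
nearest = np.argsort(hamming)[:5]
print("candidate neighbours:", nearest, "distances:", hamming[nearest])
```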

    Large-scale simulations of intrinsic parameter fluctuations in nano-scale MOSFETs

    Intrinsic parameter fluctuations have become a serious obstacle to the continued scaling of MOSFET devices, particularly in the sub-100 nm regime. The increase in intrinsic parameter fluctuations means that simulations on a statistical scale are necessary to capture device parameter distributions. In this work, large-scale simulations of samples of hundreds of thousands of devices are carried out in order to accurately characterise the statistical variability of the threshold voltage in a real 35 nm MOSFET. Simulations were performed for the two dominant sources of statistical variability: random discrete dopants (RDD) and line edge roughness (LER). In total, ∼400,000 devices have been simulated, taking approximately 500,000 CPU hours (60 CPU years). The results reveal the true shape of the distribution of threshold voltage, which is shown to be positively skewed for random dopants and negatively skewed for line edge roughness. Through further statistical analysis and data mining, techniques for reconstructing the distributions of the threshold voltage are developed. Using these techniques, methods are demonstrated that allow statistical enhancement of random dopant and line edge roughness simulations, thereby reducing the computational expense necessary to accurately characterise their effects. The accuracy of these techniques is analysed, and they are further verified against scaled and alternative device architectures. The combined effects of RDD and LER are also investigated, and it is demonstrated that the statistical combination of the individual RDD- and LER-induced distributions of threshold voltage closely matches that obtained from simulations. By applying the statistical enhancement techniques developed for RDD and LER, it is shown that the computational cost of characterising their effects can be reduced by one to two orders of magnitude.
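    As an illustration of the kind of distribution reconstruction described, the sketch below fits a skew-normal to a small sample of threshold voltages and uses the fitted model to estimate tail percentiles; the data are synthetic and the skew-normal choice is an assumption for illustration, not the thesis's reconstruction technique.

```python
# Minimal sketch: reconstructing a skewed threshold-voltage distribution from
# a modest sample, then estimating tail percentiles from the fitted model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Pretend these are threshold voltages (V) from a few hundred device simulations.
vt_sample = stats.skewnorm.rvs(a=4.0, loc=0.20, scale=0.03,
                               size=300, random_state=rng)

# Fit a skew-normal to the sample via maximum likelihood.
a, loc, scale = stats.skewnorm.fit(vt_sample)
fitted = stats.skewnorm(a, loc, scale)

# Percentiles that brute-force simulation would need many more runs to resolve.
for q in (0.001, 0.5, 0.999):
    print(f"P{q * 100:g} threshold voltage: {fitted.ppf(q):.4f} V")
```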