
    The discovery of new functional oxides using combinatorial techniques and advanced data mining algorithms

    Electroceramic materials research is a wide-ranging field driven by device applications. For many years, the demand for new materials was addressed largely through serial processing and analysis of samples, often similar in composition to those already characterised. The Functional Oxide Discovery project (FOXD) is a combinatorial materials discovery project combining high-throughput synthesis and characterisation with advanced data mining to develop novel materials. Dielectric ceramics are of interest for use in telecommunications equipment; oxygen ion conductors are examined for use in fuel cell cathodes. Both applications are subject to ever-increasing industry demands, and materials designs capable of meeting the stringent requirements are urgently needed. The London University Search Instrument (LUSI) is a combinatorial robot employed for materials synthesis. Ceramic samples are produced automatically using an ink-jet printer which mixes and prints inks onto alumina slides. The slides are transferred to a furnace for sintering and transported to other locations for analysis. Production and analysis data are stored in the project database, which is a valuable resource detailing the progress of the project and forming the basis for data mining. Materials design is a two-stage process. The first stage, forward prediction, is accomplished using an artificial neural network, a Baconian, inductive technique. In the second stage, the artificial neural network is inverted using a genetic algorithm. The artificial neural network prediction, stoichiometry and prediction reliability form the objectives for the genetic algorithm, which yields a selection of materials designs. The full potential of this approach is realised through the manufacture and characterisation of the designed materials. The resulting data improves the prediction algorithms, permitting iterative improvement to the designs and the discovery of completely new materials.
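    To make the two-stage design loop concrete, the following Python sketch pairs a toy property-prediction network with a simple genetic algorithm that inverts it. This is a hedged illustration only: the network weights, the 4-component composition encoding and the fitness weighting are assumptions, not the FOXD project's actual models, and the prediction-reliability objective mentioned above is omitted for brevity.

```python
# A hedged sketch of the two-stage design loop, assuming a generic
# property-prediction network; weights, the composition encoding and the
# fitness weighting are illustrative, and the prediction-reliability
# objective is omitted for brevity.
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 (forward prediction): a toy MLP mapping a 4-component composition
# vector to a predicted property (stands in for the trained network).
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def predict(x: np.ndarray) -> float:
    return (np.tanh(x @ W1 + b1) @ W2 + b2).item()

# Stage 2 (inversion): a genetic algorithm searches composition space for
# candidates that score well while keeping fractions summing to one.
def fitness(x: np.ndarray) -> float:
    stoich_penalty = abs(x.sum() - 1.0)         # stoichiometry objective
    return predict(x) - 10.0 * stoich_penalty   # weighted combination

pop = rng.random((50, 4))
for _ in range(100):
    scores = np.array([fitness(x) for x in pop])
    parents = pop[np.argsort(scores)[-10:]]                     # selection
    pop = parents[rng.integers(0, 10, 50)]                      # reproduction
    pop = np.clip(pop + rng.normal(0, 0.05, pop.shape), 0, 1)   # mutation

best = pop[np.argmax([fitness(x) for x in pop])]
print("candidate composition:", best / best.sum())
```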

    Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility

    This article studies the infinite-width limit of deep feedforward neural networks whose weights are dependent and modelled via a mixture of Gaussian distributions. Each hidden node of the network is assigned a nonnegative random variable that controls the variance of the outgoing weights of that node. We make minimal assumptions on these per-node random variables: they are iid, and their sum in each layer converges to some finite random variable in the infinite-width limit. Under this model, we show that each layer of the infinite-width neural network can be characterised by two simple quantities: a nonnegative scalar parameter and a Lévy measure on the positive reals. If the scalar parameters are strictly positive and the Lévy measures are trivial at all hidden layers, then one recovers the classical Gaussian process (GP) limit, obtained with iid Gaussian weights. More interestingly, if the Lévy measure of at least one layer is non-trivial, we obtain a mixture of Gaussian processes (MoGP) in the large-width limit. The behaviour of the neural network in this regime is very different from the GP regime: one obtains correlated outputs with non-Gaussian distributions, possibly with heavy tails. Additionally, we show that, in this regime, the weights are compressible, and some nodes have asymptotically non-negligible contributions, therefore representing important hidden features. Many sparsity-promoting neural network models can be recast as special cases of our approach, and we discuss their infinite-width limits; we also present an asymptotic analysis of the pruning error. We illustrate some of the benefits of the MoGP regime over the GP regime in terms of representation learning and compressibility on simulated, MNIST and Fashion-MNIST datasets.
    Comment: 96 pages, 15 figures, 9 tables
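    As a hedged illustration of the dependent-weight model described above, the following Python sketch draws one layer's weights with a shared per-node variance. The inverse-gamma distribution, layer width and ReLU activation are arbitrary choices; the paper only assumes iid nonnegative per-node variables, plus a per-layer sum-convergence condition this toy example does not enforce.

```python
# A hedged sketch (not the paper's construction): weights in a row share a
# per-node variance, so all outgoing weights of a hidden node are dependent.
# The inverse-gamma choice is arbitrary; the paper only assumes iid
# nonnegative per-node variables whose per-layer sum converges as the width
# grows, a scaling condition this toy example does not enforce.
import numpy as np

rng = np.random.default_rng(1)
WIDTH = 2000

def dependent_layer(h: np.ndarray) -> np.ndarray:
    """One hidden layer whose outgoing weights share per-node variances."""
    n = h.shape[0]
    lam = 1.0 / rng.gamma(shape=2.0, scale=1.0, size=n)   # per-node variances
    # Row j of W is drawn N(0, lam[j] / n): dependence within each row.
    W = rng.normal(0.0, np.sqrt(lam / n)[:, None], size=(n, WIDTH))
    return np.maximum(h @ W, 0.0)                          # ReLU activation

x = np.maximum(rng.normal(size=WIDTH), 0.0)
h = dependent_layer(x)

# Heavy per-node variances let a few units dominate, hinting at the
# compressibility result: most of the activation mass sits in few units.
mass = np.abs(h)
print("share of mass in 10 largest units:",
      mass[np.argsort(mass)[-10:]].sum() / mass.sum())
```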

    On a Bayesian Approach to Malware Detection and Classification through n-gram Profiles

    Detecting and correctly classifying malicious executables has become one of the major concerns in cyber security, especially because traditional detection systems have become less effective against the increasing number and danger of threats found nowadays. One way to differentiate benign from malicious executables is to leverage their hexadecimal representation by creating a set of binary features that completely characterise each executable. In this paper we present a novel supervised-learning Bayesian nonparametric approach for binary matrices that provides an effective probabilistic approach for malware detection. Moreover, owing to the model's flexible assumptions, we are able to use it in a multi-class framework where the interest lies in classifying malware into known families. Finally, a generalisation of the model, which provides a deeper understanding of the behaviour of each feature across groups, is also developed.
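    As a hedged sketch of the binary representation such a model ingests (not of the Bayesian nonparametric model itself), the following Python snippet builds a binary matrix of byte n-gram occurrences; the n-gram length, CRC32 hashing trick and feature dimension are illustrative assumptions.

```python
# A hedged sketch of the n-gram profile step only; the paper's Bayesian
# nonparametric model over the resulting binary matrix is not reproduced.
# The n-gram length, hashing trick and feature dimension are assumptions.
import zlib
import numpy as np

N, FEATURES = 2, 4096  # byte n-gram length and hashed feature dimension

def ngram_profile(raw: bytes) -> np.ndarray:
    """Binary presence vector over hashed byte n-grams of one executable."""
    row = np.zeros(FEATURES, dtype=np.uint8)
    for i in range(len(raw) - N + 1):
        row[zlib.crc32(raw[i:i + N]) % FEATURES] = 1
    return row

# Toy stand-ins for real executables (MZ and ELF magic bytes).
executables = [b"\x4d\x5a\x90\x00\x03\x00\x00\x00", b"\x7fELF\x02\x01\x01\x00"]

# One row per executable: the binary matrix a detector would ingest.
X = np.stack([ngram_profile(b) for b in executables])
print(X.shape, X.sum(axis=1))  # (2, 4096) and active-feature counts per file
```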

    Machine learning algorithms for efficient process optimisation of variable geometries at the example of fabric forming

    For optimal operation, modern production systems require careful tuning of the manufacturing processes they employ. Physics-based simulations can effectively support process optimisation, but their computation times are often a substantial obstacle. One way to save computation time is surrogate-based optimisation (SBO). Surrogates are computationally efficient, data-driven substitute models that guide the optimiser through the search space. They generally improve convergence, but prove unwieldy for changing optimisation tasks, such as frequent part adaptations to customer requirements. To solve such variable optimisation tasks efficiently as well, this thesis investigates how recent advances in machine learning (ML), in particular in neural networks, can complement existing SBO techniques. Three main aspects are considered: first, their potential as classical surrogates for SBO; second, their suitability for efficiently assessing the manufacturability of new part designs; and third, their capacity for efficient process optimisation across variable part geometries. These questions are in principle technology-independent and are investigated here using fabric forming as an example. The first part of this thesis (Chapter 3) discusses the suitability of deep neural networks as surrogates for SBO. Several network architectures are examined, and multiple ways of embedding them in an SBO framework are compared. The results demonstrate their suitability for SBO: for a fixed example geometry, all variants successfully minimise the objective function, and do so faster than a reference algorithm (a genetic algorithm). To assess the manufacturability of variable part geometries, Chapter 4 then investigates how geometry information can be incorporated into a process surrogate. Two ML approaches are compared: a feature-based and a grid-based approach. The feature-based approach scans a part for individual, process-relevant geometric features, whereas the grid-based approach interprets the geometry as a whole. Both approaches can learn the process behaviour in principle, but the grid-based approach proves easier to transfer to new geometry variants. The results also show that it is mainly the diversity, rather than the quantity, of the training data that determines this transferability. Finally, Chapter 5 combines the surrogate techniques for flexible geometries with variable process parameters to achieve efficient process optimisation for variable parts. To this end, an ML algorithm interacts with generic geometry examples in a simulation environment and learns which geometry requires which forming parameters. After training, the algorithm is able to give useful recommendations even for non-generic part geometries. Furthermore, the recommendations converge to the true process optimum at a speed similar to classical SBO, but without requiring any part-specific a-priori sampling. Once trained, the developed approach is therefore more efficient. Overall, this thesis shows how ML techniques can extend current SBO methods and thus efficiently support process and product optimisation at early stages of development. The findings of these investigations lead to follow-up questions for the further development of the methods, such as the integration of physical balance equations to make the model predictions more physically consistent.
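    For readers unfamiliar with SBO, the following minimal Python sketch shows the generic loop the thesis builds on: an inexpensive surrogate is refitted after every expensive simulation call and proposes the next candidate. The polynomial surrogate and the toy expensive_simulation function are placeholders, not the thesis's neural-network surrogates or forming simulations.

```python
# A hedged sketch of the generic SBO loop, with a polynomial surrogate and a
# toy objective standing in for the thesis's neural-network surrogates and
# fabric-forming simulations; all names and constants here are placeholders.
import numpy as np

rng = np.random.default_rng(2)

def expensive_simulation(p: float) -> float:
    """Placeholder for a slow forming simulation: parameter -> defect measure."""
    return (p - 0.63) ** 2 + 0.05 * np.sin(20 * p)

# A-priori sampling of the process-parameter space.
P = list(rng.uniform(0.0, 1.0, 5))
Y = [expensive_simulation(p) for p in P]

for _ in range(10):
    coeffs = np.polyfit(P, Y, deg=min(4, len(P) - 1))    # refit cheap surrogate
    grid = np.linspace(0.0, 1.0, 501)
    p_next = grid[np.argmin(np.polyval(coeffs, grid))]   # surrogate's proposal
    P.append(p_next)
    Y.append(expensive_simulation(p_next))               # one expensive call

print("best parameter found:", P[int(np.argmin(Y))])
```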

    Calibrating the Dice loss to handle neural network overconfidence for biomedical image segmentation

    The Dice similarity coefficient (DSC) is both a widely used metric and loss function for biomedical image segmentation, owing to its robustness to class imbalance. However, it is well known that the DSC loss is poorly calibrated, resulting in overconfident predictions that cannot be usefully interpreted in biomedical and clinical practice. Performance is often the only metric used to evaluate segmentations produced by deep neural networks, and calibration is often neglected. However, calibration is important for translation into biomedical and clinical practice, providing crucial contextual information to model predictions for interpretation by scientists and clinicians. In this study, we provide a simple yet effective extension of the DSC loss, named the DSC++ loss, that selectively modulates the penalty associated with overconfident, incorrect predictions. As a standalone loss function, the DSC++ loss achieves significantly improved calibration over the conventional DSC loss across six well-validated open-source biomedical imaging datasets, including both 2D binary and 3D multi-class segmentation tasks. Similarly, we observe significantly improved calibration when integrating the DSC++ loss into four DSC-based loss functions. Finally, we use softmax thresholding to illustrate that well-calibrated outputs enable tailoring of the recall-precision bias, an important post-processing technique for adapting model predictions to suit the biomedical or clinical task. The DSC++ loss overcomes the major limitation of the DSC loss, providing a suitable loss function for training deep learning segmentation models for use in biomedical and clinical practice. Source code is available at https://github.com/mlyg/DicePlusPlus
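    Since the abstract does not spell out the formula, here is a hedged Python sketch contrasting the standard soft Dice loss with one plausible reading of the DSC++ idea: an exponent gamma that modulates the false-positive and false-negative terms. The gamma placement is our assumption, not the authors' verified formulation; consult the linked repository for the real code.

```python
# A hedged sketch: the standard soft Dice loss next to one plausible reading
# of the DSC++ idea. The gamma placement is an assumption, not the authors'
# verified formula; see github.com/mlyg/DicePlusPlus for the real code.
import torch

def soft_dice_loss(p, g, eps=1e-6):
    """p: predicted foreground probabilities, g: binary ground-truth mask."""
    tp = (p * g).sum()
    fp = (p * (1 - g)).sum()
    fn = ((1 - p) * g).sum()
    return 1 - (2 * tp + eps) / (2 * tp + fp + fn + eps)

def modulated_dice_loss(p, g, gamma=2.0, eps=1e-6):
    # Raising values in (0, 1) to gamma > 1 shrinks low-confidence errors more
    # than confident ones, so confident mistakes dominate the penalty terms.
    tp = (p * g).sum()
    fp = ((p * (1 - g)) ** gamma).sum()   # assumed modulation of false positives
    fn = (((1 - p) * g) ** gamma).sum()   # assumed modulation of false negatives
    return 1 - (2 * tp + eps) / (2 * tp + fp + fn + eps)

p = torch.sigmoid(torch.randn(1, 1, 64, 64))   # dummy probability map
g = (torch.rand(1, 1, 64, 64) > 0.5).float()   # dummy binary mask
print(soft_dice_loss(p, g).item(), modulated_dice_loss(p, g).item())
```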

    Retrogressive Thaw Slump identification using U-Net and Satellite Image Inputs - Remote Sensing Imagery Segmentation using Deep Learning techniques

    Global warming has been a topic of discussion for many decades; however, its impact on the thaw of permafrost, and vice versa, has not been well captured or documented in the past. This may be due to most permafrost being in the Arctic and similarly vast remote areas, which makes data collection difficult and costly. A partial solution to this problem is the use of Remote Sensing imagery, which has been widely used for decades in documenting the changes in permafrost regions. Despite its many benefits, this methodology still required a manual assessment of images, which could be a slow and laborious task for researchers. Over the last decade, the growth of Deep Learning has helped address these limitations. The use of Deep Learning on Remote Sensing imagery has risen in popularity, mainly due to the increased availability and scale of Remote Sensing data. This has been fuelled in the last few years by open-source, multi-spectral, high spatial resolution data, such as the Sentinel-2 data used in this project. Notwithstanding the growth of Deep Learning for Remote Sensing imagery, its use for the particular case of identifying the thaw of permafrost, addressed in this project, has not been widely studied. To address this gap, the semantic segmentation model proposed in this project performs pixel-wise classification on satellite images to identify Retrogressive Thaw Slumps (RTSs), using a U-Net architecture. The successful identification of RTSs from satellite images is achieved with an average Dice score of 95% over the 39 test images evaluated, leading to the conclusion that it is possible to pre-process said images and achieve satisfactory results using 10-meter spatial resolution and as few as 4 spectral bands. Since these landforms can serve as a proxy for the thaw of permafrost, the aim is that this project can help make progress towards mitigating the impact of such a powerful geophysical phenomenon.
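    As a hedged illustration of the segmentation setup described above, the following Python sketch defines a small U-Net taking 4-band tiles (matching the four spectral bands mentioned) and producing a per-pixel slump logit; the depth, channel widths and tile size are illustrative assumptions, not the project's exact architecture.

```python
# A hedged sketch of a small U-Net for pixel-wise RTS segmentation: 4 input
# channels for the spectral bands, one output channel for the slump mask.
# Depth and channel widths are illustrative, not the project's exact setup.
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, bands=4):
        super().__init__()
        self.enc1, self.enc2 = block(bands, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bott = block(64, 128)
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)  # per-pixel slump logit

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bott(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))   # skip connection
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))  # skip connection
        return self.head(d1)

x = torch.randn(1, 4, 128, 128)  # one 128x128 tile with 4 spectral bands
print(TinyUNet()(x).shape)       # torch.Size([1, 1, 128, 128])
```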