The discovery of new functional oxides using combinatorial techniques and advanced data mining algorithms
Electroceramic materials research is a wide-ranging field driven by device applications.
For many years, the demand for new materials was addressed largely
through serial processing and analysis of samples often similar in composition to
those already characterised. The Functional Oxide Discovery project (FOXD) is a
combinatorial materials discovery project combining high-throughput synthesis and
characterisation with advanced data mining to develop novel materials.
Dielectric ceramics are of interest for use in telecommunications equipment; oxygen-ion
conductors are examined for use in fuel cell cathodes. Both applications
are subject to ever-increasing industry demands, and new materials capable of
meeting these stringent requirements are urgently needed.
The London University Search Instrument (LUSI) is a combinatorial robot employed
for materials synthesis. Ceramic samples are produced automatically using
an ink-jet printer which mixes and prints inks onto alumina slides. The slides are
transferred to a furnace for sintering and transported to other locations for analysis.
Production and analysis data are stored in the project database. The database
forms a valuable resource, detailing the progress of the project and providing a basis
for data mining.
Materials design is a two-stage process. The first stage, forward prediction, is accomplished
using an artificial neural network, a Baconian, inductive technique. In the
second stage, the artificial neural network is inverted using a genetic algorithm. The
network prediction, stoichiometry and prediction reliability form the objectives
for the genetic algorithm, which yields a selection of materials designs.
The full potential of this approach is realised through the manufacture and characterisation
of the materials. The resulting data improves the prediction algorithms,
permitting iterative improvement to the designs and the discovery of completely
new materials.
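The inversion stage described above can be sketched as follows: a genetic algorithm searches composition space for inputs whose forward prediction matches a target property value. The forward model, objective and GA parameters below are illustrative stand-ins, not the FOXD implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the trained forward-prediction network: it maps a
# 3-component composition vector to a predicted property value. The FOXD
# network itself is not reproduced here.
def forward_model(x):
    return 4.0 * x[0] + 2.0 * x[1] ** 2 - x[2]

def invert_by_ga(target, pop_size=60, generations=200):
    """Genetic-algorithm inversion: search composition space for designs
    whose forward prediction matches `target`."""
    pop = rng.random((pop_size, 3))
    for _ in range(generations):
        # Objective: prediction error of each candidate design.
        errors = np.array([abs(forward_model(x) - target) for x in pop])
        # Selection: keep the better half of the population.
        survivors = pop[np.argsort(errors)[: pop_size // 2]]
        # Crossover (averaging random parent pairs) plus Gaussian mutation.
        parents = survivors[rng.integers(0, len(survivors), (pop_size, 2))]
        pop = np.clip(parents.mean(axis=1)
                      + rng.normal(0.0, 0.02, (pop_size, 3)), 0.0, 1.0)
    errors = np.array([abs(forward_model(x) - target) for x in pop])
    return pop[np.argmin(errors)]

design = invert_by_ga(target=2.5)
```

A full design loop would add the stoichiometry and prediction-reliability objectives mentioned above, for example via a multi-objective GA such as NSGA-II.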
Levenberg-Marquardt optimised neural networks for trajectory tracking of autonomous ground vehicles
Trajectory tracking is an essential capability for robot operation in industrial automation. In this article, an artificial neural controller is proposed to tackle the trajectory-tracking problem of an autonomous ground vehicle (AGV). The controller is implemented based on a fractional-order proportional integral derivative (FOPID) control scheme that was designed in an earlier work. A non-holonomic model of the AGV is analysed and presented. The model includes the kinematic and dynamic characteristics and the actuation system of the AGV. The artificial neural controller consists of two artificial neural networks (ANNs) that are designed to control the inputs of the AGV. In order to train the two artificial neural networks,
the Levenberg-Marquardt (LM) algorithm was used to obtain the parameters of the ANNs. The proposed controller has been validated on a given reference trajectory. The obtained results show a considerable improvement in terms of minimising the trajectory-tracking error
over the FOPID controller.
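The Levenberg-Marquardt update used for training can be sketched on a toy least-squares problem. The model, data and damping schedule below are illustrative stand-ins, not the controller's actual networks:

```python
import numpy as np

# Toy model y = p0 * exp(p1 * x): a stand-in for the ANN parameterisation.
def model(p, x):
    return p[0] * np.exp(p[1] * x)

def levenberg_marquardt(x, y, p, lam=1e-2, iters=50):
    """Minimal LM iteration: Gauss-Newton step with adaptive damping."""
    for _ in range(iters):
        r = y - model(p, x)  # residuals
        # Finite-difference Jacobian of the model w.r.t. the parameters.
        J = np.empty((x.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p)
            dp[j] = 1e-6
            J[:, j] = (model(p + dp, x) - model(p, x)) / 1e-6
        # LM step: (J^T J + lam * I) step = J^T r
        step = np.linalg.solve(J.T @ J + lam * np.eye(p.size), J.T @ r)
        p_new = p + step
        # Accept and relax damping if the error improved, else tighten it.
        if np.sum((y - model(p_new, x)) ** 2) < np.sum(r ** 2):
            p, lam = p_new, lam * 0.5
        else:
            lam *= 2.0
    return p

x = np.linspace(0.0, 1.0, 40)
y = 2.0 * np.exp(1.5 * x)  # noiseless synthetic data
p_fit = levenberg_marquardt(x, y, p=np.array([1.0, 1.0]))
```

The damping parameter interpolates between gradient descent (large `lam`) and Gauss-Newton (small `lam`), which is what makes LM robust for network training on least-squares losses.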
Deep neural networks with dependent weights: Gaussian Process mixture limit, heavy tails, sparsity and compressibility
This article studies the infinite-width limit of deep feedforward neural
networks whose weights are dependent, and modelled via a mixture of Gaussian
distributions. Each hidden node of the network is assigned a nonnegative random
variable that controls the variance of the outgoing weights of that node. We
make minimal assumptions on these per-node random variables: they are iid and
their sum, in each layer, converges to some finite random variable in the
infinite-width limit. Under this model, we show that each layer of the
infinite-width neural network can be characterised by two simple quantities: a
non-negative scalar parameter and a Lévy measure on the positive reals. If
the scalar parameters are strictly positive and the Lévy measures are trivial
at all hidden layers, then one recovers the classical Gaussian process (GP)
limit, obtained with iid Gaussian weights. More interestingly, if the Lévy
measure of at least one layer is non-trivial, we obtain a mixture of Gaussian
processes (MoGP) in the large-width limit. The behaviour of the neural network
in this regime is very different from the GP regime. One obtains correlated
outputs, with non-Gaussian distributions, possibly with heavy tails.
Additionally, we show that, in this regime, the weights are compressible, and
some nodes have asymptotically non-negligible contributions, therefore
representing important hidden features. Many sparsity-promoting neural network
models can be recast as special cases of our approach, and we discuss their
infinite-width limits; we also present an asymptotic analysis of the pruning
error. We illustrate some of the benefits of the MoGP regime over the GP regime
in terms of representation learning and compressibility on simulated, MNIST and
Fashion MNIST datasets.
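The weight model can be illustrated numerically: each hidden node draws one nonnegative variance shared by all of its outgoing weights, so weights within a node are dependent even though nodes are iid. The exponential variance law below is an arbitrary illustrative choice, not the paper's general construction:

```python
import numpy as np

rng = np.random.default_rng(1)
width = 50_000  # wide hidden layer

# Per-node scale: one nonnegative variance per hidden node, shared by all
# of that node's outgoing weights (illustrative exponential choice).
node_var = rng.exponential(scale=1.0, size=width)

# Outgoing weights to two output units: Gaussian *given* the node variance,
# hence marginally a mixture of Gaussians, and dependent within a node.
w1 = rng.normal(0.0, np.sqrt(node_var))
w2 = rng.normal(0.0, np.sqrt(node_var))

# The dependence shows in the squared weights: a node with large variance
# sends large weights to *both* outputs.
corr = np.corrcoef(w1 ** 2, w2 ** 2)[0, 1]

# iid baseline (the classical GP limit): squared weights are uncorrelated.
v1 = rng.normal(size=width)
v2 = rng.normal(size=width)
corr_iid = np.corrcoef(v1 ** 2, v2 ** 2)[0, 1]
```

This within-node dependence is what produces the correlated, possibly heavy-tailed outputs of the MoGP regime, in contrast to the iid-weight GP limit.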
On a Bayesian Approach to Malware Detection and Classification through n-gram Profiles
Detecting and correctly classifying malicious executables has become one of
the major concerns in cyber security, especially because traditional detection
systems have become less effective with the increasing number and danger of
threats found nowadays. One way to differentiate benign from malicious
executables is to leverage their hexadecimal representation by creating a
set of binary features that completely characterise each executable. In this
paper we present a novel supervised learning Bayesian nonparametric approach
for binary matrices, that provides an effective probabilistic approach for
malware detection. Moreover, owing to the model's flexible assumptions, we
are able to use it in a multi-class framework where the interest lies in
classifying malware into known families. Finally, a generalisation of the model
which provides a deeper understanding of the behaviour across groups for each
feature is also developed.
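A minimal sketch of the kind of binary feature construction described above, assuming byte-level 2-grams over the hexadecimal representation; the paper's exact feature definition may differ:

```python
# Build a binary presence/absence matrix of byte n-grams: one row per
# executable, one column per n-gram observed anywhere in the corpus.
def byte_ngrams(data: bytes, n: int = 2):
    """Set of hex-encoded byte n-grams occurring in `data`."""
    return {data[i:i + n].hex() for i in range(len(data) - n + 1)}

def binary_feature_matrix(executables):
    # Vocabulary: every n-gram seen across all executables.
    vocab = sorted(set().union(*(byte_ngrams(b) for b in executables)))
    index = {g: j for j, g in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in executables]
    for i, b in enumerate(executables):
        for g in byte_ngrams(b):
            matrix[i][index[g]] = 1  # presence, not count
    return matrix, vocab

# Toy byte strings standing in for executable contents (illustrative only).
samples = [b"\x4d\x5a\x90\x00", b"\x4d\x5a\x50\x45"]
X, vocab = binary_feature_matrix(samples)
```

A matrix of this form is the natural input for the Bayesian nonparametric model over binary matrices that the abstract describes.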
Machine learning algorithms for efficient process optimisation of variable geometries, using fabric forming as an example
For optimal operation, modern production systems require careful tuning of the manufacturing processes involved. Physics-based simulations can effectively support process optimisation, but their computation times are often a substantial obstacle. One way to save computation time is surrogate-based optimisation (SBO). Surrogates are computationally efficient, data-driven substitute models that guide the optimiser through the search space. They generally improve convergence, but prove unwieldy for changing optimisation tasks, such as frequent part adaptations to customer requirements.
To solve such variable optimisation tasks efficiently as well, this thesis investigates how recent advances in machine learning (ML), in particular in neural networks, can complement existing SBO techniques. Three main aspects are considered: first, their potential as classical surrogates for SBO; second, their suitability for efficiently assessing the manufacturability of new part designs; and third, their possibilities for efficient process optimisation for variable part geometries. These questions are in principle applicable across technologies and are examined in this thesis using fabric forming as an example.
The first part of this thesis (Chapter 3) discusses the suitability of deep neural networks as surrogates for SBO. Several network architectures are investigated, and multiple ways of integrating them into an SBO framework are compared. The results demonstrate their suitability for SBO: for a fixed example geometry, all variants successfully minimise the objective function, and do so faster than a reference algorithm (a genetic algorithm).
To assess the manufacturability of variable part geometries, Chapter 4 then investigates how geometry information can be incorporated into a process surrogate. Two ML approaches are compared: a feature-based and a raster-based approach. The feature-based approach scans a part for individual, process-relevant geometric features, whereas the raster-based approach interprets the geometry as a whole. Both approaches can in principle learn the process behaviour, but the raster-based approach proves easier to transfer to new geometry variants. The results also show that this transferability is determined mainly by the diversity, rather than the quantity, of the training data.
Finally, Chapter 5 combines the surrogate techniques for flexible geometries with variable process parameters to achieve efficient process optimisation for variable parts. To this end, an ML algorithm interacts with generic geometry examples in a simulation environment and learns which geometry requires which forming parameters. After training, the algorithm is able to output useful recommendations even for non-generic part geometries. Furthermore, the recommendations converge to the true process optimum at a speed similar to classical SBO, but without requiring part-specific a-priori sampling. Once trained, the developed approach is therefore more efficient.
Overall, this thesis shows how ML techniques can extend current SBO methods and thus efficiently support process and product optimisation at early stages of development. The findings lead to follow-up questions for further development of the methods, such as the integration of physical balance equations to make the model predictions more physically consistent.
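The surrogate-based optimisation loop that the thesis builds on can be sketched as follows. The 1-D objective, polynomial surrogate and grid search below are toy stand-ins for the forming simulation and the surrogates actually studied:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for the expensive forming simulation (illustrative only).
def simulation(x):
    return (x - 0.3) ** 2 + 0.1 * np.sin(8.0 * x)

def sbo(n_init=4, n_iter=12):
    """Minimal SBO loop: fit a cheap surrogate to the evaluations so far,
    then run the expensive simulation where the surrogate is lowest."""
    X = list(rng.random(n_init))          # initial design of experiments
    y = [simulation(x) for x in X]
    grid = np.linspace(0.0, 1.0, 401)     # candidate points in the search space
    for _ in range(n_iter):
        # Cheap data-driven surrogate: a low-degree polynomial fit.
        coeffs = np.polyfit(X, y, deg=min(4, len(X) - 1))
        # The surrogate guides where to spend the next expensive evaluation.
        x_next = grid[np.argmin(np.polyval(coeffs, grid))]
        X.append(float(x_next))
        y.append(simulation(x_next))
    best = int(np.argmin(y))
    return X[best], y[best]

x_best, y_best = sbo()
```

The part-specific initial sampling (`n_init` simulation runs) is exactly the cost that Chapter 5's geometry-aware approach avoids once its model is trained.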
Calibrating the dice loss to handle neural network overconfidence for biomedical image segmentation
The Dice similarity coefficient (DSC) is both a widely used metric and loss function for biomedical image segmentation due to its robustness to class imbalance. However, it is well known that the DSC loss is poorly calibrated, resulting in overconfident predictions that cannot be usefully interpreted in biomedical and clinical practice. Performance is often the only metric used to evaluate segmentations produced by deep neural networks, and calibration is often neglected. However, calibration is important for translation into biomedical and clinical practice, providing crucial contextual information to model predictions for interpretation by scientists and clinicians. In this study, we provide a simple yet effective extension of the DSC loss, named the DSC++ loss, that selectively modulates the penalty associated with overconfident, incorrect predictions. As a standalone loss function, the DSC++ loss achieves significantly improved calibration over the conventional DSC loss across six well-validated open-source biomedical imaging datasets, including both 2D binary and 3D multi-class segmentation tasks. Similarly, we observe significantly improved calibration when integrating the DSC++ loss into four DSC-based loss functions. Finally, we use softmax thresholding to illustrate that well calibrated outputs enable tailoring of recall-precision bias, which is an important post-processing technique to adapt the model predictions to suit the biomedical or clinical task. The DSC++ loss overcomes the major limitation of the DSC loss, providing a suitable loss function for training deep learning segmentation models for use in biomedical and clinical practice. Source code is available at https://github.com/mlyg/DicePlusPlus
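For reference, the conventional soft DSC loss that DSC++ extends can be written as below. This is the standard formulation only; the DSC++ modulation of overconfident, incorrect predictions is available in the repository cited in the abstract.

```python
import numpy as np

def soft_dice_loss(pred, target, eps=1e-6):
    """Conventional soft Dice (DSC) loss for binary segmentation.

    pred:   predicted foreground probabilities in [0, 1]
    target: ground-truth labels in {0, 1}
    """
    intersection = np.sum(pred * target)
    denom = np.sum(pred) + np.sum(target)
    # 1 - DSC, with eps guarding against empty masks.
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)

# A perfect prediction gives (near-)zero loss...
perfect = soft_dice_loss(np.array([1.0, 0.0, 1.0]), np.array([1, 0, 1]))
# ...while a confident half-wrong prediction is penalised, but the loss alone
# does not distinguish confident from hedged errors - the calibration gap
# that motivates DSC++.
overconfident = soft_dice_loss(np.array([1.0, 1.0, 0.0]), np.array([1, 0, 1]))
```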
Retrogressive Thaw Slump identification using U-Net and Satellite Image Inputs - Remote Sensing Imagery Segmentation using Deep Learning techniques
Global warming has been a topic of discussion for many decades; however, its impact
on the thaw of permafrost, and vice versa, has not been well captured or
documented in the past. This may be because most permafrost lies in the Arctic and
similarly vast remote areas, which makes data collection difficult and costly.
A partial solution to this problem is the use of Remote Sensing imagery, which
has been widely used for decades to document changes in permafrost regions.
Despite its many benefits, this methodology still requires a manual assessment of images,
which can be a slow and laborious task for researchers. Over the last decade,
the growth of Deep Learning has helped address these limitations. The use of Deep
Learning on Remote Sensing imagery has risen in popularity, mainly due to the increased
availability and scale of Remote Sensing data. This has been fuelled in the
last few years by open-source multi-spectral high spatial resolution data, such as the
Sentinel-2 data used in this project.
Notwithstanding the growth of Deep Learning for Remote Sensing Imagery, its use
for the particular case of identifying the thaw of permafrost, addressed in this project,
has not been widely studied. To address this gap, the semantic segmentation model
proposed in this project performs pixel-wise classification on the satellite images for
the identification of Retrogressive Thaw Slumps (RTSs), using a U-Net architecture.
In this project, the successful identification of RTSs in satellite images is achieved
with an average Dice score of 95% over the 39 test images evaluated, showing that it
is possible to pre-process these images and achieve satisfactory results using 10-meter
spatial resolution and as few as 4 spectral bands. Since these landforms can serve as a
proxy for the thaw of permafrost, the aim is for this project to help make progress
towards mitigating the impact of such a powerful geophysical phenomenon.