
    Modular lifelong machine learning

    Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, training a deep neural network on a machine learning problem is expensive, and the overall training cost grows further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge-transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a long sequence of problems remains a challenge.

    Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.

    First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired of an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.

    Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.

    Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.

    Overall, this thesis identifies a number of important LML properties, which have not all been attained by past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
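    As a concrete illustration of the search problem that both PICLE and the black-box-optimisation view address, the following sketch greedily picks, layer by layer, whichever pre-trained or new module gives the best validation score. This is a naive baseline under stated assumptions, not the thesis's algorithm: `library`, `new_module_factory`, and `evaluate` are hypothetical stand-ins for the module store, a fresh-module constructor, and a routine that trains any free parameters and returns validation accuracy.

```python
# Minimal greedy baseline for choosing a module combination, in plain
# Python. A real modular LML system (e.g. PICLE) replaces the exhaustive
# per-layer scan with probabilistic models over modules.
def greedy_module_search(library, n_layers, new_module_factory, evaluate):
    """Choose one module per layer, maximising validation accuracy.

    library[l]          -- list of pre-trained candidate modules for layer l
    new_module_factory  -- builds a freshly initialised module for a layer
    evaluate            -- hypothetical: trains the free parameters of a
                           partial module path, returns validation accuracy
    """
    chosen, best_score = [], float("-inf")
    for layer in range(n_layers):
        candidates = list(library[layer]) + [new_module_factory(layer)]
        scored = [(evaluate(chosen + [c]), c) for c in candidates]
        best_score, best = max(scored, key=lambda pair: pair[0])
        chosen.append(best)
    return chosen, best_score
```

    A greedy pass evaluates on the order of L·K candidates rather than the K^L exhaustive combinations of K modules over L layers, which is what makes scaling to long problem sequences plausible at all; probabilistic models such as PICLE's then prioritise which of these evaluations are worth running.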

    Integrated Optical Fiber Sensor for Simultaneous Monitoring of Temperature, Vibration, and Strain in High Temperature Environment

    Important high-temperature parts of an aero-engine, especially the power-related fuel system and rotor system, are directly related to the reliability and service life of the engine. The working environment of these parts is extremely harsh, usually combining high temperature, vibration, and strain, which are the main factors leading to their failure. Therefore, the simultaneous measurement of high temperature, vibration, and strain is essential to monitor and ensure the safe operation of an aero-engine. In my thesis work, I have focused on the research and development of two new sensors for the fuel and rotor systems of an aero-engine, which must withstand the same high-temperature conditions, typically 900 °C or above, but have different requirements for vibration and strain measurement.

    Firstly, to meet the demand for high-temperature operation, high vibration sensitivity, and high strain resolution in fuel systems, an integrated sensor based on two fiber Bragg gratings in series (Bi-FBG sensor) is proposed and demonstrated for the simultaneous measurement of temperature, strain, and vibration. In this sensor, an L-shaped cantilever is introduced to improve the vibration sensitivity. By converting its free-end displacement into a stress effect on the FBG, the sensitivity of the L-shaped cantilever is improved by about 400% compared with that of straight cantilevers. To compensate for the low intrinsic strain sensitivity of FBGs, a spring-beam strain-sensitisation structure is designed, which concentrates strain deformation and increases the sensitivity to 5.44 pm/με. A novel decoupling method, Steps Decoupling and Temperature Compensation (SDTC), is proposed to address the interference between temperature, vibration, and strain: a model of the sensing characteristics and the interference between the different parameters is established to achieve accurate signal decoupling. Experimental tests have demonstrated the good performance of the sensor.

    Secondly, a sensor based on three cascaded fiber Fabry-Pérot interferometers (Tri-FFPI sensor) is designed and demonstrated for multiparameter measurement in engine rotor systems, which require the measurement of higher vibration frequencies and larger strains. In this sensor, the cascaded-FFPI structure is introduced to enable simultaneous high-temperature and large-strain measurement. A miniaturised FFPI with a cantilever is designed for high-frequency vibration measurement, and a geometric-parameter optimisation model is established to investigate the factors influencing its sensing characteristics. A cascaded-FFPI preparation method combining chemical etching and offset fusion is proposed to maintain the flatness and high reflectivity of the FFPI surfaces, which improves measurement accuracy. A new high-precision cavity-length demodulation method, based on vector matching and clustering-competition particle swarm optimisation (CCPSO), is developed to improve the demodulation accuracy of cascaded-FFPI cavity lengths. By investigating the correlation between the cascaded-FFPI spectrum and a multidimensional search space, cavity-length demodulation is transformed into a search for the highest correlation value in that space, overcoming the limit that spectral-wavelength resolution places on demodulation accuracy. Different clustering and competition behaviours are designed in CCPSO, reducing the demodulation error by 87.2% compared with the commonly used particle swarm optimisation method. Good performance and multiparameter decoupling have been successfully demonstrated in experimental tests.
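    For intuition on how a dual-grating sensor separates temperature from strain, the sketch below shows the textbook cross-sensitivity-matrix inversion; it is not the thesis's SDTC method, and the coefficients are illustrative placeholders apart from the 5.44 pm/με sensitised strain response quoted in the abstract.

```python
import numpy as np

# Each FBG's Bragg-wavelength shift is modelled as a linear mix of
# temperature and strain; two gratings with different sensitivities give
# an invertible 2x2 system. Values are placeholders, except the
# 5.44 pm/microstrain sensitised strain response from the abstract.
K = np.array([[10.0, 1.2],    # FBG1: [pm/degC, pm/microstrain]
              [11.5, 5.44]])  # FBG2: spring-beam sensitised strain response

def decouple(dlambda_pm):
    """Recover (dT in degC, dstrain in microstrain) from two shifts in pm."""
    return np.linalg.solve(K, np.asarray(dlambda_pm, dtype=float))

dT, dstrain = decouple([25.0, 40.0])  # e.g. 25 pm and 40 pm measured shifts
```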

    Technology for Low Resolution Space Based RSO Detection and Characterisation

    Space Situational Awareness (SSA) refers to all activities to detect, identify, and track objects in Earth orbit. SSA is critical to all current and future space activities and protects space assets by providing access control, conjunction warnings, and monitoring of the status of active satellites. Current SSA methods and infrastructure are not sufficient to account for the proliferation of space debris. In response to the need for better SSA, there have been many different research efforts looking to improve it, most of them requiring dedicated ground- or space-based infrastructure. In this thesis, a novel approach for the characterisation of Resident Space Objects (RSOs) from passive low-resolution space-based sensors is presented, together with all the background work performed to enable this novel method. Low-resolution space-based sensors are common on current satellites, and since many of these sensors are already in orbit, using them passively to detect RSOs can greatly augment SSA without expensive infrastructure or long lead times. One of the largest hurdles for research in this area is the lack of publicly available labelled data with which to test and confirm results. To overcome this hurdle, a simulation package, ORBITALS, was created. To verify and validate the ORBITALS simulator, it was compared with images from the Fast Auroral Imager, one of the only publicly available sources of low-resolution space-based images with auxiliary data. During the development of the ORBITALS simulator, it was found that generating these simulated images is computationally intensive when propagating the entire space catalogue. To overcome this, the currently used propagation method, the Simplified General Perturbations model (SGP4), was upgraded to run in parallel, reducing the computational time required to propagate entire catalogues of RSOs. From the characterisation results, it was found that the standard facet model with particle swarm optimisation performed best, estimating an RSO's attitude with 0.66° RMSE across a sequence and approximately 1% MAPE for the optical properties. This accomplishes the thesis goal of demonstrating the feasibility of low-resolution passive RSO characterisation from space-based platforms in a simulated environment.
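    The catalogue-propagation bottleneck can be sketched with the open-source `sgp4` Python package, which exposes a vectorised SGP4 implementation; the chunked process pool below is a generic stand-in for the thesis's parallel upgrade, and `tle_pairs` is an assumed list of two-line-element string pairs.

```python
# Sketch: propagate a whole catalogue of RSOs in parallel with the
# open-source `sgp4` package (not the thesis's own implementation).
from multiprocessing import Pool

import numpy as np
from sgp4.api import Satrec, SatrecArray, jday

def propagate_chunk(args):
    tles, jd, fr = args
    sats = SatrecArray([Satrec.twoline2rv(l1, l2) for l1, l2 in tles])
    err, pos, vel = sats.sgp4(jd, fr)  # pos, vel: (n_sats, n_times, 3), km
    return pos

def propagate_catalog(tle_pairs, jd, fr, workers=8, chunk=500):
    chunks = [(tle_pairs[i:i + chunk], jd, fr)
              for i in range(0, len(tle_pairs), chunk)]
    with Pool(workers) as pool:
        return np.concatenate(pool.map(propagate_chunk, chunks))

jd0, fr0 = jday(2024, 1, 1, 12, 0, 0.0)
# positions = propagate_catalog(tle_pairs, np.array([jd0]), np.array([fr0]))
```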

    DATA AUGMENTATION FOR SYNTHETIC APERTURE RADAR USING ALPHA BLENDING AND DEEP LAYER TRAINING

    Human-based object detection in synthetic aperture radar (SAR) imagery is complex and technical, laboriously slow but time critical: the perfect application for machine learning (ML). Training an ML network for object detection requires very large image datasets with embedded objects that are accurately and precisely labeled. Unfortunately, no such SAR datasets exist. Therefore, this paper proposes a method to synthesize wide field-of-view (FOV) SAR images by combining two existing datasets: SAMPLE, which is composed of both real and synthetic single-object chips, and MSTAR Clutter, which is composed of real wide-FOV SAR images. Synthetic objects are extracted from SAMPLE using threshold-based segmentation before being alpha-blended onto patches from MSTAR Clutter. To validate the novel synthesis method, individual object chips are created and classified using a simple convolutional neural network (CNN); testing is performed against the measured SAMPLE subset. A novel technique is also developed to investigate training activity in deep layers. The proposed data augmentation technique produces a 17% increase in the accuracy of measured SAR image classification. This improvement shows that any residual artifacts from segmentation and blending do not negatively affect ML, which is promising for future use in wide-area SAR synthesis. Outstanding Thesis. Major, United States Air Force. Approved for public release; distribution is unlimited.
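    A minimal sketch of the described synthesis step follows, assuming single-channel magnitude images scaled to [0, 1]; the threshold value and the random stand-in images are illustrative rather than the thesis's calibrated choices.

```python
import numpy as np

def embed_target(chip, clutter_patch, thresh=0.3):
    """Threshold-segment a target chip and alpha-blend it onto clutter."""
    alpha = (chip > thresh).astype(float)   # crude binary segmentation mask
    return alpha * chip + (1.0 - alpha) * clutter_patch

rng = np.random.default_rng(0)
chip = rng.random((64, 64)) * 0.2           # stand-in SAMPLE chip background
chip[24:40, 24:40] += 0.7                   # fake bright target region
clutter = rng.random((64, 64)) * 0.2        # stand-in MSTAR clutter patch
synthetic = embed_target(chip, clutter)
```

    In practice the binary mask would typically be feathered (e.g. blurred) before blending so the seam is less abrupt; the 17% accuracy gain reported above suggests that residual seam artifacts, if any, did not hurt the downstream CNN.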

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    The book offers a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using color and depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing, and its analysis is multidisciplinary: it requires familiarity with other areas of Artificial Intelligence (AI) such as machine learning, digital image processing, and psychology. This makes it a great opportunity to write a book which covers all of these topics for readers ranging from beginners to professionals in AI, and even those without an AI background. Our goal is to provide a standalone introduction to FMER analysis, in the form of theoretical descriptions for readers with no background in image processing, together with reproducible MATLAB practical examples. We also provide the basic definitions used in FMER analysis and describe the MATLAB libraries used in the text, which helps the reader apply the experiments to real-world applications. We believe this book is suitable for students, researchers, and professionals alike who need to develop practical skills along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with the key stages of the pipeline: color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expression recognition, feature extraction, and dimensionality reduction. Comment: This is the second edition of the book.

    Detection of entangled states supported by reinforcement learning

    Discrimination of entangled states is an important element of quantum-enhanced metrology. This typically requires low-noise detection technology. Such a challenge can be circumvented by introducing a nonlinear readout process. Traditionally, this is realized by reversing the very dynamics that generates the entangled state, which requires full control over the system evolution. In this work, we present nonlinear readout of highly entangled states by employing reinforcement learning (RL) to manipulate the spin-mixing dynamics in a spin-1 atomic condensate. The protocol found by RL drives the system towards an unstable fixed point, whereby the (to-be-sensed) phase perturbation is amplified by the subsequent spin-mixing dynamics. Working with a condensate of 10900 ^{87}Rb atoms, we achieve a metrological gain of 6.97 dB beyond the classical precision limit. Our work opens up new possibilities for unlocking the full potential of entanglement-enabled quantum enhancement in experiments.
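    To give a flavour of such a control-optimisation loop, the toy sketch below learns a pulse sequence with the cross-entropy method against a dummy reward. The paper uses reinforcement learning proper; the cross-entropy method is swapped in here for brevity, and `metrological_gain` is a fabricated stand-in objective, not a simulation of spin-1 condensate dynamics.

```python
import numpy as np

def metrological_gain(controls):
    """Dummy black-box reward; NOT spin-mixing physics."""
    target = np.sin(np.linspace(0.0, np.pi, controls.size))  # fake optimum
    return -np.sum((controls - target) ** 2)

def cem_optimise(dim=20, pop=64, elite=8, iters=50, seed=0):
    """Cross-entropy method: sample controls, keep elites, refit Gaussian."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, size=(pop, dim))
        rewards = np.array([metrological_gain(s) for s in samples])
        elites = samples[np.argsort(rewards)[-elite:]]
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mu

best_controls = cem_optimise()
```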

    Leveraging a machine learning based predictive framework to study brain-phenotype relationships

    An immense collective effort has been put towards the development of methods for quantifying brain activity and structure. In parallel, a similar effort has focused on collecting experimental data, resulting in ever-growing data banks of complex human in vivo neuroimaging data. Machine learning, a broad set of powerful and effective tools for identifying multivariate relationships in high-dimensional problem spaces, has proven to be a promising approach toward better understanding the relationships between the brain and different phenotypes of interest. However, applying machine learning within a predictive framework to neuroimaging data introduces several domain-specific problems and considerations, leaving the overarching question of how best to structure and run experiments ambiguous. In this work, I cover two explicit pieces of this larger question: the relationship between data representation and predictive performance, and a case study on issues related to data collected from disparate sites and cohorts. I then present the Brain Predictability toolbox, a software package that explicitly codifies, and makes more broadly accessible to researchers, the recommended steps in performing a predictive experiment, from framing a question to reporting results. This unique perspective ultimately offers recommendations, explicit analytical strategies, and example applications for using machine learning to study the brain.
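    The kind of predictive experiment the toolbox codifies can be sketched in a few lines of plain scikit-learn; the data below are simulated stand-ins for a subject-by-feature neuroimaging matrix and a phenotype, and the pipeline is a generic example rather than the Brain Predictability toolbox's own API.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 400))      # 200 subjects x 400 brain features
y = X[:, :10].sum(axis=1) + rng.normal(scale=2.0, size=200)  # toy phenotype

# Scale inside the pipeline so cross-validation never leaks test statistics.
model = make_pipeline(StandardScaler(),
                      RidgeCV(alphas=np.logspace(-3, 3, 13)))
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"5-fold R^2: {scores.mean():.2f} +/- {scores.std():.2f}")
```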

    Application of Track Geometry Deterioration Modelling and Data Mining in Railway Asset Management

    In the management of a modern European railway system, spending is predominantly allocated to maintaining and renewing the existing rail network rather than constructing completely new lines. In addition to their major costs, the maintenance and renewal of the existing rail network often cause traffic restrictions or line closures, which decrease the usability of the network. Therefore, timely maintenance that achieves long-lasting improvements is imperative for competitive and punctual rail traffic. This kind of maintenance requires a strong knowledge base on the current condition of track structures to support decision making. Track owners commission several different measurements that depict the condition of track structures and maintain comprehensive asset data repositories. Perhaps one of the most important data sources is the track recording car measurement history, which depicts the condition of track geometry at different times. These measurement results are important because they offer a reliable condition database: the measurements are done recurrently, two to six times a year in Finland depending on the track section; the same recording car is used for many years; the results are repeatable; and they provide a good overall picture of the condition of track structures. However, although high-quality data is available, there are major challenges in analysing it in practical asset management, because few established analytics methods exist. Practical asset management typically only monitors whether given threshold values are exceeded and subjectively assesses maintenance needs and the development of track structure condition. The lack of advanced analytics prevents the full utilisation of the available data in maintenance planning, which hinders decision making.

    The main goals of this dissertation study were to develop track geometry deterioration modelling methods, apply data mining in analysing currently available railway asset data, and implement the results of these studies in practical railway asset management. The development of track geometry deterioration modelling methods focused on utilising currently available data to produce novel information on the development of track structure condition, past maintenance effectiveness, and future maintenance needs. Data mining was applied in investigating the root causes of track geometry deterioration based on asset data. Finally, maturity models were applied as the basis for implementing track geometry deterioration modelling and track asset data analytics in practice. Based on the research findings, the currently available Finnish measurement and asset data were sufficient for the desired analyses.

    For the Finnish track inspection data, robust linear optimisation was developed for track geometry deterioration modelling. The modelling provided key figures which depict the condition of structures, maintenance effectiveness, and future maintenance needs; visualisations were also created from the modelling to enable practical use of the results. The exploratory data mining method applied, the General Unary Hypotheses Automaton (GUHA), could find interesting and hard-to-detect correlations within asset data. These correlations yielded novel observations on problematic track structure types, which could be used to direct further research at those structures; this would not have been possible without first identifying them through data mining. The implementation of track geometry deterioration modelling and asset data analytics in practice was approached by applying maturity models. Maturity models offered a practical way of structuring the required development work, as it could be divided into four maturity levels that created clear incremental goals. The maturity model and the incremental goals enabled wide-scale development planning in which progress can be segmented and monitored, supporting successful completion of the effort. The results of these studies demonstrate how currently available data can provide completely new and meaningful information when processed with advanced analytics. In addition to novel solutions for data analytics, this dissertation research also provides methods for implementing those solutions, as the true benefits of knowledge-based decision making are only realised in practical railway asset management.
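    The robust-trend idea can be sketched as follows, with scikit-learn's HuberRegressor standing in for the thesis's robust linear optimisation and all numbers synthetic: fit a robust line to one track section's roughness history, read off the deterioration rate, and extrapolate to a maintenance threshold.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor

# Synthetic inspection history: roughly four recordings per year, with a
# slow linear deterioration trend and one outlier measurement.
days = np.arange(0.0, 730.0, 90.0)
rng = np.random.default_rng(1)
roughness = 0.8 + 0.0012 * days + rng.normal(0.0, 0.03, days.size)
roughness[3] += 0.4                   # outlier the robust fit should resist

fit = HuberRegressor().fit(days.reshape(-1, 1), roughness)
rate = fit.coef_[0]                   # deterioration per day
threshold = 2.0                       # hypothetical maintenance limit
days_to_limit = (threshold - fit.intercept_) / rate
print(f"rate = {rate:.4f}/day, limit reached around day {days_to_limit:.0f}")
```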

    A New Paradigm for Knowledge Discovery and Design in Nanophotonics Based on Artificial Intelligence

    The design of photonic devices in the nanoscale regime that outperform bulky optical components has been a long-standing challenge in state-of-the-art applications. Accordingly, devising a comprehensive model to understand and explain the physics and dynamics of light-matter interaction in these nanostructures is a substantial step toward realizing novel photonic devices. This thesis presents a new paradigm that leverages artificial intelligence (AI) to design nanostructures and to understand the underlying physics of light-matter interactions. Given the large number of design parameters and the complex, non-unique nature of the input-output relations in nanophotonic structures, conventional approaches cannot be used for their design and analysis. The dimensionality reduction (DR) techniques in this research considerably reduce the computing requirements. This thesis also focuses on developing a reliable inverse-design approach by overcoming the non-uniqueness challenge. It presents a double-step DR technique that reduces the complexity of the inverse-design problem while preserving the information necessary for finding the optimum nanostructure for a desired functionality. I established an approach based on physics-driven metrics to explore the low-dimensional manifold of the design-response space and identify a favorable region of the reduced design space for the desired functionality. In the later part of the thesis, we show that the optimum nanostructure for a particular desired response can be obtained by employing manifold learning while minimizing the geometrical complexity. We also develop a manifold-learning-based technique for accelerating the design of nanostructures, focusing on selecting the optimum material and geometric parameters.
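    One way to picture the role of dimensionality reduction in inverse design is the sketch below: compress simulated responses with PCA, then look up the nearest known design for a target response. The dataset and "solver" are random stand-ins, and nearest-neighbour lookup is a deliberately simple proxy for the thesis's physics-driven metrics and manifold-learning pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
designs = rng.uniform(0.0, 1.0, size=(5000, 12))        # 12 geometric params
spectra = np.sin(designs @ rng.normal(size=(12, 200)))  # fake solver output

pca = PCA(n_components=10).fit(spectra)  # low-dimensional response manifold
z = pca.transform(spectra)
index = NearestNeighbors(n_neighbors=1).fit(z)

target = spectra[123] + rng.normal(0.0, 0.01, 200)      # desired response
_, idx = index.kneighbors(pca.transform(target[None, :]))
candidate_design = designs[idx[0, 0]]    # nearest known design as a start
```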