4,919 research outputs found

    Graduate Catalog of Studies, 2023-2024


    The State of the Art in Deep Learning Applications, Challenges, and Future Prospects: A Comprehensive Review of Flood Forecasting and Management

    Floods are a devastating natural calamity that can seriously harm both infrastructure and people. Accurate flood forecasting and control are essential to lessen these effects and safeguard populations. Because it can handle massive amounts of data and provide accurate forecasts, deep learning has emerged as a potent tool for improving flood prediction and control. This work thoroughly reviews the current state of deep learning applications in flood forecasting and management. The review covers the data sources utilized, the deep learning models used, and the assessment measures adopted to judge their efficacy. It critically assesses current approaches and points out their advantages and disadvantages. The article also examines challenges with data accessibility, the interpretability of deep learning models, and ethical considerations in flood prediction, and it describes potential directions for deep learning research to enhance flood prediction and control. These include incorporating uncertainty estimates into forecasts, integrating multiple data sources, developing hybrid models that combine deep learning with other methodologies, and enhancing the interpretability of deep learning models. Pursuing these research goals can help deep learning models become more precise and effective, which would result in better flood control plans and forecasts. Overall, this review is a useful resource for academics and professionals working on flood forecasting and management. By reviewing the current state of the art, emphasizing difficulties, and outlining potential areas for future study, it lays a solid basis for further work. By implementing cutting-edge deep learning algorithms, communities may better prepare for and lessen the destructive effects of floods, thereby protecting people and infrastructure.

    Demonstration of a Response Time Based Remaining Useful Life (RUL) Prediction for Software Systems

    Prognostics and Health Management (PHM) has been widely applied to hardware systems in the electronics and non-electronics domains but has not been explored for software. While software does not decay over time, it can degrade over release cycles. Software health management is confined to diagnostic assessments that identify problems, whereas prognostic assessment potentially indicates when in the future a problem will become detrimental. Relevant research areas exist, such as software defect prediction, software reliability prediction, predictive maintenance of software, software degradation, and software performance prediction, but all of these represent diagnostic models built upon historical data, and none of them can predict a remaining useful life (RUL) for software. This paper addresses the application of PHM concepts to software systems for fault prediction and RUL estimation. Specifically, it addresses how PHM can be used to make decisions for software systems such as version update and upgrade, module changes, system reengineering, rejuvenation, maintenance scheduling, budgeting, and total abandonment. The paper presents a method to prognostically and continuously predict the RUL of a software system based on usage parameters (e.g., the numbers and categories of releases) and performance parameters (e.g., response time). The model has been validated by comparing actual data with the results generated by predictive models. Statistical validation (regression validation and k-fold cross-validation) has also been carried out. A case study based on publicly available data for the Bugzilla application is presented. This case study demonstrates that PHM concepts can be applied to software systems and that RUL can be calculated to make system management decisions.
    Comment: This research methodology has opened up new and practical applications in the software domain. In the coming decades, we can expect a significant amount of attention and practical implementation in this area worldwide.
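The idea of predicting an RUL from response time can be illustrated with a minimal sketch. It is not the paper's model: it assumes a plain least-squares trend of mean response time against release number, and the function names, the data, and the 200 ms threshold are all hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def remaining_useful_life(releases, response_ms, threshold_ms):
    """Releases left until the fitted trend crosses threshold_ms.

    Returns None when the trend is flat or improving (no crossing ahead).
    """
    slope, intercept = fit_line(releases, response_ms)
    if slope <= 0:
        return None
    crossing = (threshold_ms - intercept) / slope
    return max(0.0, crossing - releases[-1])

# Hypothetical data: mean response time (ms) measured at each release.
releases = [1, 2, 3, 4, 5]
response_ms = [100.0, 110.0, 120.0, 130.0, 140.0]

# Hypothetical acceptability threshold of 200 ms.
rul = remaining_useful_life(releases, response_ms, threshold_ms=200.0)
print(f"Estimated RUL: {rul:.1f} releases")
```

With this made-up series the trend rises by 10 ms per release, so the threshold is reached six releases after the last measurement. A real analysis would also need the cross-validation step the abstract mentions.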

    Understanding Data Manipulation and How to Leverage it To Improve Generalization

    Augmentations and other transformations of data, either in the input or latent space, are a critical component of modern machine learning systems. While these techniques are widely used in practice and known to improve generalization in many cases, it is still unclear how data manipulation impacts learning and generalization. To take a step toward addressing this problem, this thesis focuses on understanding and leveraging data augmentation and alignment to improve machine learning performance and transfer. In the first part of the thesis, we establish a novel theoretical framework to understand how data augmentation (DA) impacts learning in linear regression and classification tasks. The results demonstrate how the spectrum of the augmented, transformed data plays a key role in characterizing the behavior of different augmentation strategies, especially in the overparameterized regime. The tools developed in this aim provide simple guidelines for building new augmentation strategies and a simple framework for comparing the generalization of different types of DA. In the second part of the thesis, we demonstrate how latent data alignment can be used to tackle the domain transfer problem, where training and testing datasets vary in distribution. Our algorithm builds upon joint clustering and data matching through optimal transport, and it outperforms the pure matching algorithm baselines on both synthetic and real datasets. Extensions of the generalization analysis and algorithm design for data augmentation and alignment to nonlinear models, such as artificial neural networks and random feature models, are discussed. This thesis provides tools and analyses for better data manipulation design, which benefit both supervised and unsupervised learning schemes. Ph.D.
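A small worked example can make the link between augmentation and the data spectrum concrete. This is a standard textbook observation, not the thesis's framework: in one-dimensional, no-intercept linear regression, training on copies of each input shifted by +s and -s gives the same slope as ridge regression with penalty n·s². The paired shifts are an assumed deterministic stand-in for zero-mean noise of variance s², and all names and data below are hypothetical.

```python
def ols_slope(xs, ys):
    """Least-squares slope for the no-intercept model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def ridge_slope(xs, ys, lam):
    """Ridge slope for y = w * x with L2 penalty lam * w**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def augment(xs, ys, s):
    """Replace each sample (x, y) by two copies shifted by +s and -s; labels unchanged."""
    aug_x = [x + d for x in xs for d in (s, -s)]
    aug_y = [y for y in ys for _ in (0, 1)]
    return aug_x, aug_y

# Hypothetical noiseless data with true slope 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
s = 0.5

aug_x, aug_y = augment(xs, ys, s)
w_aug = ols_slope(aug_x, aug_y)                       # plain OLS on augmented data
w_ridge = ridge_slope(xs, ys, lam=len(xs) * s ** 2)   # ridge on original data

print(f"augmented OLS slope: {w_aug:.6f}")
print(f"ridge slope:         {w_ridge:.6f}")
```

The two slopes coincide because the shifts add n·s² to the sum of squared inputs while leaving the input-label cross term unchanged. This is the one-dimensional version of an augmentation reshaping the spectrum of the data.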

    Leveraging a machine learning based predictive framework to study brain-phenotype relationships

    An immense collective effort has been put towards the development of methods for quantifying brain activity and structure. In parallel, a similar effort has focused on collecting experimental data, resulting in ever-growing data banks of complex human in vivo neuroimaging data. Machine learning, a broad set of powerful and effective tools for identifying multivariate relationships in high-dimensional problem spaces, has proven to be a promising approach toward better understanding the relationships between the brain and different phenotypes of interest. However, applying machine learning within a predictive framework to the study of neuroimaging data introduces several domain-specific problems and considerations, leaving the overarching question of how best to structure and run experiments ambiguous. In this work, I cover two explicit pieces of this larger question: the relationship between data representation and predictive performance, and a case study on issues related to data collected from disparate sites and cohorts. I then present the Brain Predictability toolbox, a software package that explicitly codifies, and makes more broadly accessible to researchers, the recommended steps in performing a predictive experiment, from framing a question to reporting results. This unique perspective ultimately offers recommendations, explicit analytical strategies, and example applications for using machine learning to study the brain.

    Application of Track Geometry Deterioration Modelling and Data Mining in Railway Asset Management

    In the management of a modern European railway system, spending is predominantly allocated to maintaining and renewing the existing rail network rather than constructing completely new lines. In addition to major costs, the maintenance and renewals of the existing rail network often cause traffic restrictions or line closures, which decrease the usability of the rail network. Therefore, timely maintenance that achieves long-lasting improvements is imperative for achieving competitive and punctual rail traffic. This kind of maintenance requires a strong knowledge base for decision making regarding the current condition of track structures. Track owners commission several different measurements that depict the condition of track structures and maintain comprehensive asset management data repositories. Perhaps one of the most important data sources is the track recording car measurement history, which depicts the condition of track geometry at different times. These measurement results are important because they offer a reliable condition database: the measurements are done recurrently, two to six times a year in Finland depending on the track section; the same recording car is used for many years; the results are repeatable; and they provide a good overall idea of the condition of track structures. However, although high-quality data is available, there are major challenges in analysing the data in practical asset management because there are few established methods for analytics. 
Practical asset management typically only monitors whether given threshold values are exceeded and subjectively assesses maintenance needs and development in the condition of track structures. The lack of advanced analytics prevents the full utilisation of the available data in maintenance planning, which hinders decision making. The main goals of this dissertation study were to develop track geometry deterioration modelling methods, apply data mining in analysing currently available railway asset data, and implement the results from these studies into practical railway asset management. The development of track geometry deterioration modelling methods focused on utilising currently available data for producing novel information on the development in the condition of track structures, past maintenance effectiveness, and future maintenance needs. Data mining was applied in investigating the root causes of track geometry deterioration based on asset data. Finally, maturity models were applied as the basis for implementing track geometry deterioration modelling and track asset data analytics into practice. Based on the research findings, currently available Finnish measurement and asset data was sufficient for the desired analyses. For the Finnish track inspection data, robust linear optimisation was developed for track geometry deterioration modelling. The modelling provided key figures, which depict the condition of structures, maintenance effectiveness, and future maintenance needs. Moreover, visualisations were created from the modelling to enable the practical use of the modelling results. The applied exploratory data mining method, General Unary Hypotheses Automaton (GUHA), could find interesting and hard-to-detect correlations within asset data. With these correlations, novel observations on problematic track structure types were made. 
The observations could be utilised for allocating further research to problematic track structures, which would not have been possible without using data mining to identify these structures. The implementation of track geometry deterioration modelling and asset data analytics into practice was approached by applying maturity models. The use of maturity models offered a practical way of approaching future development, as the development could be divided into four maturity levels, which created clear incremental goals. The maturity model and the incremental goals enabled wide-scale development planning in which progress can be segmented and monitored, which supports successful project completion. The results from these studies demonstrate how currently available data can be used to provide completely new and meaningful information when advanced analytics are used. In addition to novel solutions for data analytics, this dissertation research also provided methods for implementing the solutions, as the true benefits of knowledge-based decision making are obtained only in practical railway asset management.
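A robust linear deterioration model of the kind described above can be sketched as follows. The abstract does not specify the dissertation's robust linear optimisation, so this sketch swaps in the Theil-Sen estimator (median of pairwise slopes) as an assumed stand-in; the roughness index, the threshold, and the measurement values are hypothetical.

```python
from itertools import combinations
from statistics import median

def theil_sen(ts, ys):
    """Robust line fit: median of pairwise slopes, then median intercept."""
    points = list(zip(ts, ys))
    slopes = [
        (y2 - y1) / (t2 - t1)
        for (t1, y1), (t2, y2) in combinations(points, 2)
        if t2 != t1
    ]
    slope = median(slopes)
    intercept = median([y - slope * t for t, y in points])
    return slope, intercept

def years_until_threshold(ts, ys, threshold):
    """Years after the last measurement until the fitted line reaches threshold.

    Returns None if the fitted trend is not deteriorating (slope <= 0).
    """
    slope, intercept = theil_sen(ts, ys)
    if slope <= 0:
        return None
    return max(0.0, (threshold - intercept) / slope - ts[-1])

# Hypothetical data: a roughness index (mm) per year since the last tamping.
# The reading at t = 3 is a deliberate outlier, e.g. a faulty measurement run.
years = [0.0, 1.0, 2.0, 3.0, 4.0]
roughness_mm = [1.0, 1.2, 1.4, 5.0, 1.8]

slope, intercept = theil_sen(years, roughness_mm)
remaining = years_until_threshold(years, roughness_mm, threshold=2.0)
print(f"slope={slope:.3f} mm/year, intercept={intercept:.3f} mm, "
      f"remaining={remaining:.2f} years")
```

The single outlier barely moves the median-based fit, whereas an ordinary least-squares line would be pulled upward by it. The fitted slope then serves as a deterioration-rate key figure, and the threshold crossing gives a rough estimate of the future maintenance need.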

    Knowledge-based Modelling of Additive Manufacturing for Sustainability Performance Analysis and Decision Making

    Additive manufacturing (AM) has been considered viable for complex geometries, topology-optimized parts, and parts that are otherwise difficult to produce using conventional manufacturing processes. 
Despite the advantages, one of the prevalent challenges in AM has been the poor capability of producing functional parts at production volumes that are competitive with traditional manufacturing. Modelling and simulation are powerful tools that can help shorten the design-build-test cycle by enabling rapid analysis of various product designs and process scenarios. Nevertheless, the capabilities and limitations of traditional and advanced manufacturing technologies define the bounds for new product development. Thus, it is important that designers have access to methods and tools that enable them to model and simulate product performance and the associated manufacturing process performance in order to realize functional, high-value products. The motivation for this dissertation research stems from the ongoing development of a novel high temperature superconducting (HTS) magnet assembly, which operates in a cryogenic environment. Its complexity requires the convergence of multidisciplinary expertise during design and prototyping. The research applies knowledge-based modelling to aid manufacturing process analysis and decision making in the design of mechanical components of the HTS magnet. Further, it explores the feasibility of using AM in the production of the HTS magnet assembly. The developed approach uses product-process integrated modelling based on physical experiments to generate quantitative and qualitative information that defines process-structure-property-performance interactions for given material-process combinations. The resulting interactions are then integrated into a graph-based model that can aid in design space exploration to assist early design and manufacturing decision making. To do so, test components are fabricated using two metal AM processes: wire and arc additive manufacturing and selective laser melting. 
Metal alloys commonly used in structural applications (stainless steel, mild steel, high-strength low-alloyed steel, aluminium, and copper alloys) are tested for their mechanical, thermal, and electrical properties. In addition, microstructural characterization of the alloys is performed to further understand the impact of manufacturing process parameters on material properties. The integrated modelling approach combines the collected experimental data, existing analytical and empirical relationships, and other data-driven models (e.g., finite element models, machine learning models) in the form of a decision support system that enables optimal selection of material, manufacturing technology, process parameters, and other control variables for attaining the desired structure, property, and performance characteristics of the final printed component. Manufacturing decision making is performed through the implementation of a probabilistic model, i.e., a Bayesian network model, which is robust, modular, and adaptable to other manufacturing systems and product designs. The ability of the model to improve the throughput and quality of additive manufacturing processes will advance sustainable manufacturing goals.

    Govwise procurement vocabulary (GPV) - An alternative to the Common Procurement Vocabulary (CPV)

    Internship Report presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
    In recent years, the world has witnessed emerging legislation on open data. Some of the main goals include stimulating economic growth through the re-use of data, addressing societal challenges, enhancing evidence-based policymaking, increasing efficiency in public administrations, fostering the development of new technologies such as AI, and enhancing both citizens' participation in political decisions and the transparency of those decisions (European Commission, Open Data, 2021). Govwise is an Advanced Analytics Platform developed to provide a wide range of data analytics to governmental organizations via a SaaS model. The goal is to make use of open data policies by producing valuable information and tackling the challenges that such a data deluge raises. These challenges constitute the scope of the internship reported here. The whole process is described at a high level, starting with the data sources and the respective ETL process and ending with the production of the analyses and dashboards that constitute the Govwise product. The focus, however, is on the classification model developed to address a major need of the company: clustering Portuguese public procurement contracts. These are initially classified with a CPV (Common Procurement Vocabulary) code, which does not satisfy the needs of Govwise. Therefore, the end goal of the model is to generate an alternative classification for contracts and tenders in Portuguese public procurement, the GPV (Govwise procurement vocabulary).