11,447 research outputs found

    Machine learning in solar physics

    Full text link
    The application of machine learning in solar physics has the potential to greatly enhance our understanding of the complex processes that take place in the atmosphere of the Sun. By using techniques such as deep learning, we are now in the position to analyze large amounts of data from solar observations and identify patterns and trends that may not have been apparent using traditional methods. This can help us improve our understanding of explosive events like solar flares, which can have a strong effect on the Earth environment. Predicting hazardous events on Earth becomes crucial for our technological society. Machine learning can also improve our understanding of the inner workings of the sun itself by allowing us to go deeper into the data and to propose more complex models to explain them. Additionally, the use of machine learning can help to automate the analysis of solar data, reducing the need for manual labor and increasing the efficiency of research in this field.Comment: 100 pages, 13 figures, 286 references, accepted for publication as a Living Review in Solar Physics (LRSP

    Non-parametric online market regime detection and regime clustering for multidimensional and path-dependent data structures

    Full text link
    In this work we present a non-parametric online market regime detection method for multidimensional data structures using a path-wise two-sample test derived from a maximum mean discrepancy-based similarity metric on path space that uses rough path signatures as a feature map. The latter similarity metric has been developed and applied as a discriminator in recent generative models for small data environments, and has been optimised here to the setting where the size of new incoming data is particularly small, for faster reactivity. On the same principles, we also present a path-wise method for regime clustering which extends our previous work. The presented regime clustering techniques were designed as ex-ante market analysis tools that can identify periods of approximatively similar market activity, but the new results also apply to path-wise, high dimensional-, and to non-Markovian settings as well as to data structures that exhibit autocorrelation. We demonstrate our clustering tools on easily verifiable synthetic datasets of increasing complexity, and also show how the outlined regime detection techniques can be used as fast on-line automatic regime change detectors or as outlier detection tools, including a fully automated pipeline. Finally, we apply the fine-tuned algorithms to real-world historical data including high-dimensional baskets of equities and the recent price evolution of crypto assets, and we show that our methodology swiftly and accurately indicated historical periods of market turmoil.Comment: 65 pages, 52 figure

    ABC: Adaptive, Biomimetic, Configurable Robots for Smart Farms - From Cereal Phenotyping to Soft Fruit Harvesting

    Get PDF
    Currently, numerous factors, such as demographics, migration patterns, and economics, are leading to the critical labour shortage in low-skilled and physically demanding parts of agriculture. Thus, robotics can be developed for the agricultural sector to address these shortages. This study aims to develop an adaptive, biomimetic, and configurable modular robotics architecture that can be applied to multiple tasks (e.g., phenotyping, cutting, and picking), various crop varieties (e.g., wheat, strawberry, and tomato) and growing conditions. These robotic solutions cover the entire perception–action–decision-making loop targeting the phenotyping of cereals and harvesting fruits in a natural environment. The primary contributions of this thesis are as follows. a) A high-throughput method for imaging field-grown wheat in three dimensions, along with an accompanying unsupervised measuring method for obtaining individual wheat spike data are presented. The unsupervised method analyses the 3D point cloud of each trial plot, containing hundreds of wheat spikes, and calculates the average size of the wheat spike and total spike volume per plot. Experimental results reveal that the proposed algorithm can effectively identify spikes from wheat crops and individual spikes. b) Unlike cereal, soft fruit is typically harvested by manual selection and picking. To enable robotic harvesting, the initial perception system uses conditional generative adversarial networks to identify ripe fruits using synthetic data. To determine whether the strawberry is surrounded by obstacles, a cluster complexity-based perception system is further developed to classify the harvesting complexity of ripe strawberries. c) Once the harvest-ready fruit is localised using point cloud data generated by a stereo camera, the platform’s action system can coordinate the arm to reach/cut the stem using the passive motion paradigm framework, as inspired by studies on neural control of movement in the brain. Results from field trials for strawberry detection, reaching/cutting the stem of the fruit with a mean error of less than 3 mm, and extension to analysing complex canopy structures/bimanual coordination (searching/picking) are presented. Although this thesis focuses on strawberry harvesting, ongoing research is heading toward adapting the architecture to other crops. The agricultural food industry remains a labour-intensive sector with a low margin, and cost- and time-efficiency business model. The concepts presented herein can serve as a reference for future agricultural robots that are adaptive, biomimetic, and configurable

    Space‐Scale Resolved Surface Fluxes Across a Heterogeneous, Mid‐Latitude Forested Landscape

    Get PDF
    The Earth\u27s surface is heterogeneous at multiple scales owing to spatial variability in various properties. The atmospheric responses to these heterogeneities through fluxes of energy, water, carbon, and other scalars are scale-dependent and nonlinear. Although these exchanges can be measured using the eddy covariance technique, widely used tower-based measurement approaches suffer from spectral losses in lower frequencies when using typical averaging times. However, spatially resolved measurements such as airborne eddy covariance measurements can detect such larger scale (meso-β, meso-γ) transport. To evaluate the prevalence and magnitude of these flux contributions, we applied wavelet analysis to airborne flux measurements over a heterogeneous mid-latitude forested landscape, interspersed with open water bodies and wetlands. The measurements were made during the Chequamegon Heterogeneous Ecosystem Energy-balance Study Enabled by a High-density Extensive Array of Detectors intensive field campaign. We ask, how do spatial scales of surface-atmosphere fluxes vary over heterogeneous surfaces across the day and across seasons? Measured fluxes were separated into smaller-scale turbulent and larger-scale mesoscale contributions. We found significant mesoscale contributions to sensible and latent heat fluxes through summer to autumn which would not be resolved in single-point tower measurements through traditional time-domain half-hourly Reynolds decomposition. We report scale-resolved flux transitions associated with seasonal and diurnal changes of the heterogeneous study domain. This study adds to our understanding of surface-atmospheric interactions over unstructured heterogeneities and can help inform multi-scale model-data integration of weather and climate models at a sub-grid scale

    Colour technologies for content production and distribution of broadcast content

    Get PDF
    The requirement of colour reproduction has long been a priority driving the development of new colour imaging systems that maximise human perceptual plausibility. This thesis explores machine learning algorithms for colour processing to assist both content production and distribution. First, this research studies colourisation technologies with practical use cases in restoration and processing of archived content. The research targets practical deployable solutions, developing a cost-effective pipeline which integrates the activity of the producer into the processing workflow. In particular, a fully automatic image colourisation paradigm using Conditional GANs is proposed to improve content generalisation and colourfulness of existing baselines. Moreover, a more conservative solution is considered by providing references to guide the system towards more accurate colour predictions. A fast-end-to-end architecture is proposed to improve existing exemplar-based image colourisation methods while decreasing the complexity and runtime. Finally, the proposed image-based methods are integrated into a video colourisation pipeline. A general framework is proposed to reduce the generation of temporal flickering or propagation of errors when such methods are applied frame-to-frame. The proposed model is jointly trained to stabilise the input video and to cluster their frames with the aim of learning scene-specific modes. Second, this research explored colour processing technologies for content distribution with the aim to effectively deliver the processed content to the broad audience. In particular, video compression is tackled by introducing a novel methodology for chroma intra prediction based on attention models. Although the proposed architecture helped to gain control over the reference samples and better understand the prediction process, the complexity of the underlying neural network significantly increased the encoding and decoding time. Therefore, aiming at efficient deployment within the latest video coding standards, this work also focused on the simplification of the proposed architecture to obtain a more compact and explainable model

    Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse

    Get PDF
    This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses. This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups. In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service whereby patterns in users’ speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018—6 November having been the date of midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent finds that there are regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena

    Object Segmentation and Reconstruction Using Infrastructure Sensor Nodes for Autonomous Mobility

    Get PDF
    This thesis focuses on the Lidar point cloud processing for the infrastructure sensor node that serves as the perception system for autonomous robots with general mobility in indoor applications. Compared with typical schemes mounting sensors on the robots, the method acquires data from infrastructure sensor nodes, providing a more comprehensive view of the environment, which benefits the robot's navigation. The number of sensors would not need to be increased even for multiple robots, significantly reducing costs. In addition, with a central perception system using the infrastructure sensor nodes navigating every robot, a more comprehensive understanding of the current environment and all the robots' locations can be obtained for the control and operation of the autonomous robots. For a robot in the detection range of the sensor node, the sensor node can detect and segment obstacles in its driveable area and reconstruct the incomplete, sparse point cloud of objects upon their movement. The complete shape by the reconstruction benefits the localization and path planning which follows the perception part of the robot's system. Considering the sparse Lidar data and the variety of object categories in the environment, a model-free scheme is selected for object segmentation. Point segmentation starts with background filtering. Considering the complexity of the indoor environment, a depth-matching-based background removal approach is first proposed. However, later tests imply that the method is adequate but not time-efficient. Therefore, based on the depth matching-based method, a process that only focuses on the drive-able area of the robot is proposed, and the computational complexity is significantly reduced. With optimization, the computation time for processing one frame of data can be greatly increased, from 0.2 second by the first approach to 0.01 second by the second approach. After background filtering, the remaining points for occurring objects are segmented as separate clusters using an object clustering algorithm. With independent clusters of objects, an object tracking algorithm is followed to allocate the point clusters with IDs and arrange the clusters in a time sequence. With a stream of clusters for a specific object in a time sequence, point registration is deployed to aggregate the clusters into a complete shape. And as noticed during the experiment, one of the differences between indoor and outdoor environments is that contact between objects in the indoor environment is much more common. The objects in contact are likely to be segmented as a single cluster by the model-free clustering algorithm, which needs to be avoided in the reconstruction process. Therefore an improvement is made in the tracking algorithm when contact happens. The algorithms in this thesis have been experimentally evaluated and presented

    Application of Track Geometry Deterioration Modelling and Data Mining in Railway Asset Management

    Get PDF
    Modernin rautatiejärjestelmän hallinnassa rahankäyttö kohdistuu valtaosin nykyisen rataverkon korjauksiin ja parannuksiin ennemmin kuin uusien ratojen rakentamiseen. Nykyisen rataverkon kunnossapitotyöt aiheuttavat suurten kustannusten lisäksi myös usein liikennerajoitteita tai yhteyksien väliaikaisia sulkemisia, jotka heikentävät rataverkon käytettävyyttä Siispä oikea-aikainen ja pitkäaikaisia parannuksia aikaansaava kunnossapito ovat edellytyksiä kilpailukykyisille ja täsmällisille rautatiekuljetuksille. Tällainen kunnossapito vaatii vankan tietopohjan radan nykyisestä kunnosta päätöksenteon tueksi. Ratainfran omistajat teettävät päätöksenteon tueksi useita erilaisia radan kuntoa kuvaavia mittauksia ja ylläpitävät kattavia omaisuustietorekistereitä. Kenties tärkein näistä datalähteistä on koneellisen radantarkastuksen tuottamat mittaustulokset, jotka kuvastavat radan geometrian kuntoa. Nämä mittaustulokset ovat tärkeitä, koska ne tuottavat luotettavaa kuntotietoa: mittaukset tehdään toistuvasti, 2–6 kertaa vuodessa Suomessa rataosasta riippuen, mittausvaunu pysyy useita vuosia samana, tulokset ovat hyvin toistettavia ja ne antavat hyvän yleiskuvan radan kunnosta. Vaikka laadukasta dataa on paljon saatavilla, käytännön omaisuudenhallinnassa on merkittäviä haasteita datan analysoinnissa, sillä vakiintuneita menetelmiä siihen on vähän. Käytännössä seurataan usein vain mittaustulosten raja-arvojen ylittymistä ja pyritään subjektiivisesti arvioimaan rakenteiden kunnon kehittymistä ja korjaustarpeita. Kehittyneen analytiikan puutteet estävät kuntotietojen laajamittaisen hyödyntämisen kunnossapidon suunnittelussa, mikä vaikeuttaa päätöksentekoa. Tämän väitöskirjatutkimuksen päätavoitteita olivat kehittää ratageometrian heikkenemiseen mallintamismenetelmiä, soveltaa tiedonlouhintaa saatavilla olevan omaisuusdatan analysointiin sekä jalkauttaa kyseiset tutkimustulokset käytännön rataomaisuudenhallintaan. Ratageometrian heikkenemisen mallintamismenetelmien kehittämisessä keskityttiin tuottamaan nykyisin saatavilla olevasta datasta uutta tietoa radan kunnon kehityksestä, tehdyn kunnossapidon tehokkuudesta sekä tulevaisuuden kunnossapitotarpeista. Tiedonlouhintaa sovellettiin ratageometrian heikkenemisen juurisyiden selvittämiseen rataomaisuusdatan perusteella. Lopuksi hyödynnettiin kypsyysmalleja perustana ratageometrian heikkenemisen mallinnuksen ja rataomaisuusdatan analytiikan käytäntöön viennille. Tutkimustulosten perusteella suomalainen radantarkastus- ja rataomaisuusdata olivat riittäviä tavoiteltuihin analyyseihin. Tulokset osoittivat, että robusti lineaarinen optimointi soveltuu hyvin suomalaisen rataverkon ratageometrian heikkenemisen mallinnukseen. Mallinnuksen avulla voidaan tuottaa tunnuslukuja, jotka kuvaavat rakenteen kuntoa, kunnossapidon tehokkuutta ja tulevaa kunnossapitotarvetta, sekä muodostaa havainnollistavia visualisointeja datasta. Rataomaisuusdatan eksploratiiviseen tiedonlouhintaan käytetyn GUHA-menetelmän avulla voitiin selvittää mielenkiintoisia ja vaikeasti havaittavia korrelaatioita datasta. Näiden tulosten avulla saatiin uusia havaintoja ongelmallisista ratarakennetyypeistä. Havaintojen avulla voitiin kohdentaa jatkotutkimuksia näihin rakenteisiin, mikä ei olisi ollut mahdollista, jollei tiedonlouhinnan avulla olisi ensin tunnistettu näitä rakennetyyppejä. Kypsyysmallin soveltamisen avulla luotiin puitteet ratageometrian heikkenemisen mallintamisen ja rataomaisuusdatan analytiikan kehitykselle Suomen rataomaisuuden hallinnassa. Kypsyysmalli tarjosi käytännöllisen tavan lähestyä tarvittavaa kehitystyötä, kun eteneminen voitiin jaotella neljään eri kypsyystasoon, jotka loivat selkeitä välitavoitteita. Kypsyysmallin ja asetettujen välitavoitteiden avulla kehitys on suunniteltua ja edistystä voidaan jaotella, mikä antaa edellytykset tämän laajamittaisen kehityksen onnistuneelle läpiviennille. Tämän väitöskirjatutkimuksen tulokset osoittavat, miten nykyisin saatavilla olevasta datasta saadaan täysin uutta ja merkityksellistä tietoa, kun sitä käsitellään kehittyneen analytiikan avulla. Tämä väitöskirja tarjoaa datankäsittelyratkaisujen luomisen ja soveltamisen lisäksi myös keinoja niiden käytäntöönpanolle, sillä tietopohjaisen päätöksenteon todelliset hyödyt saavutetaan vasta käytännön radanpidossa.In the management of a modern European railway system, spending is predominantly allocated to maintaining and renewing the existing rail network rather than constructing completely new lines. In addition to major costs, the maintenance and renewals of the existing rail network often cause traffic restrictions or line closures, which decrease the usability of the rail network. Therefore, timely maintenance that achieves long-lasting improvements is imperative for achieving competitive and punctual rail traffic. This kind of maintenance requires a strong knowledge base for decision making regarding the current condition of track structures. Track owners commission several different measurements that depict the condition of track structures and have comprehensive asset management data repositories. Perhaps one of the most important data sources is the track recording car measurement history, which depicts the condition of track geometry at different times. These measurement results are important because they offer a reliable condition database; the measurements are done recurrently, two to six times a year in Finland depending on the track section; the same recording car is used for many years; the results are repeatable; and they provide a good overall idea of the condition of track structures. However, although high-quality data is available, there are major challenges in analysing the data in practical asset management because there are few established methods for analytics. Practical asset management typically only monitors whether given threshold values are exceeded and subjectively assesses maintenance needs and development in the condition of track structures. The lack of advanced analytics prevents the full utilisation of the available data in maintenance planning which hinders decision making. The main goals of this dissertation study were to develop track geometry deterioration modelling methods, apply data mining in analysing currently available railway asset data, and implement the results from these studies into practical railway asset management. The development of track geometry deterioration modelling methods focused on utilising currently available data for producing novel information on the development in the condition of track structures, past maintenance effectiveness, and future maintenance needs. Data mining was applied in investigating the root causes of track geometry deterioration based on asset data. Finally, maturity models were applied as the basis for implementing track geometry deterioration modelling and track asset data analytics into practice. Based on the research findings, currently available Finnish measurement and asset data was sufficient for the desired analyses. For the Finnish track inspection data, robust linear optimisation was developed for track geometry deterioration modelling. The modelling provided key figures, which depict the condition of structures, maintenance effectiveness, and future maintenance needs. Moreover, visualisations were created from the modelling to enable the practical use of the modelling results. The applied exploratory data mining method, General Unary Hypotheses Automaton (GUHA), could find interesting and hard-to-detect correlations within asset data. With these correlations, novel observations on problematic track structure types were made. The observations could be utilised for allocating further research for problematic track structures, which would not have been possible without using data mining to identify these structures. The implementation of track geometry deterioration and asset data analytics into practice was approached by applying maturity models. The use of maturity models offered a practical way of approaching future development, as the development could be divided into four maturity levels, which created clear incremental goals for development. The maturity model and the incremental goals enabled wide-scale development planning, in which the progress can be segmented and monitored, which enhances successful project completion. The results from these studies demonstrate how currently available data can be used to provide completely new and meaningful information, when advanced analytics are used. In addition to novel solutions for data analytics, this dissertation research also provided methods for implementing the solutions, as the true benefits of knowledge-based decision making are obtained in only practical railway asset management
    corecore