988 research outputs found

    Prediction of poor health in small ruminants and companion animals with accelerometers and machine learning

    Global warming is one of the biggest challenges of our times, and academics, industries and other actors are making significant efforts to tackle the problem. In agriculture, precision farming is part of the solution to environmental sustainability and has been researched increasingly in recent years. Indeed, it has the potential to increase livestock yield and decrease the carbon footprint of production while maintaining welfare. The thesis begins with a review of developments in automated animal monitoring and then moves on to a case study of a health monitoring system for small ruminants in South Africa. As a demonstration and validation of the potential use case of the system, the method we propose is then applied to a second study, which concerns feline health. Low- and middle-income countries will be strongly affected by the changing climate and its impacts. We devise our method based on two small-scale South African sheep and goat farms, where assessment of the health status of individual animals is a key step in the timely and targeted treatment of infections, which is critical in the fight against anthelmintic and antimicrobial resistance. The FAMACHA scoring system has been used successfully to detect anaemia caused by infection with the parasitic nematode Haemonchus contortus in small ruminants and is an effective way to identify individuals in need of treatment. However, assessing FAMACHA is labour-intensive and costly, as individuals must be manually examined at frequent intervals. Here, we used accelerometers to measure the individual activity of extensively grazed small ruminants exposed to natural Haemonchus contortus worm infection in southern Africa over long time scales (13+ months). When combined with machine learning for missing data imputation and classification, we find that this activity data can predict poorer health, as well as which individuals respond to treatment, with precision up to 80%.
We demonstrate that these classifiers remain robust over time. Interpretation of the trained classifiers reveals that poorer health can be predicted mainly by night-time activity levels in the sheep. Our study reveals behavioural patterns across two small ruminant species, which low-cost biologgers can exploit to detect subtle changes in animal health and enable timely and targeted intervention. This has real potential to improve economic outcomes and animal welfare, as well as to limit the use of anthelmintic drugs and reduce pressure on anthelmintic resistance in both commercial and resource-poor communal farming. The validation of the proposed techniques with a different study group is discussed in the latter part of the thesis. We used the accelerometry data of indoor cats equipped with wearable accelerometers, in conjunction with their health status, to detect signs of degenerative joint disease, and adapted our machine-learning pipeline to analyse bursts of high activity in the cats. We were able to classify high-activity events with precision up to 70% despite the relatively small dataset, adding further evidence for the viability of animal health monitoring with accelerometers.

    IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

    With the widespread adoption of sensors and smart devices in recent years, the data generation speed of Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, specifically effective model selection, design/tuning, and updating, which have created massive demand for experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift, causing model performance to degrade. To reduce human effort, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we review existing methods for the model selection, tuning, and updating procedures in AutoML in order to identify and summarize the optimal solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, we conduct a case study of applying AutoML to IoT anomaly detection problems. Lastly, we discuss and classify the challenges and research directions for this domain. Comment: Published in Engineering Applications of Artificial Intelligence (Elsevier, IF:7.8); Code/An AutoML tutorial is available at Github link: https://github.com/Western-OC2-Lab/AutoML-Implementation-for-Static-and-Dynamic-Data-Analytic

    Social networks & price forecasting: The case of Bitcoins

    Bachelor's Final Projects in Statistics UB-UPC, Faculty of Economics and Business (UB) and Faculty of Mathematics and Statistics (UPC), Academic year: 2017-2018, Supervisors: Karina Gibert; Montserrat Guillén. The main conceptual element this thesis revolves around is the idea of using social networks as a data source. First, classical trading theory and the current usage of data obtained from social networks are reviewed. Taking all this information into account, the Bitcoin price is forecast using both classical methods and machine-learning neural networks. Obtaining data from social networks adds another layer of complexity, since the sources must be accessed through APIs and by scraping the web directly. The results of this implementation are presented with a strong focus on visualisation, using several different techniques. Finally, after a critical discussion, a Future Work chapter outlines many possible follow-ups.

    Robust data cleaning procedure for large scale medium voltage distribution networks feeders

    Relatively little attention has been given to the short-term load forecasting problem of primary substations, mainly because load forecasts were not essential to secure the operation of passive distribution networks. With the increasing uptake of intermittent generation, distribution networks are becoming active, since power flows can change direction in a somewhat volatile fashion. The volatility of power flows introduces operational constraints on voltage control, system fault levels, thermal limits, system losses and high reverse power flows. Today, greater observability of the networks is essential to maintain a safe overall system and to maximise the utilisation of existing assets. Hence, to identify and anticipate any forthcoming critical operational conditions, network operators are compelled to broaden their visibility of the networks to time horizons that include not only real-time information but also hour-ahead and day-ahead forecasts. With this change in paradigm, large numbers of short-term load forecasters are progressively being integrated as an essential component of distribution networks' control and planning tools. The acquisition of large-scale real-world data is prone to errors, and anomalies in data sets can lead to erroneous forecasting outcomes. Hence, data cleansing is an essential first step in data-driven learning techniques. Data cleansing is a labour-intensive and time-consuming task for the following reasons: 1) selecting a suitable cleansing method is not trivial, 2) generalising or automating a cleansing procedure is challenging, and 3) there is a risk of introducing new errors into the data. This thesis attempts to maximise the performance of large-scale forecasting models by addressing the quality of the modelling data.
Thus, the objectives of this research are to identify the causes of poor data quality, to design an automatic data cleansing procedure suitable for large-scale distribution network datasets, and to propose a rigorous framework for modelling MV distribution network feeder time series with deep learning architectures. The thesis discusses in detail the challenges in handling and modelling real-world distribution feeder time series. It also discusses a robust technique to detect outliers in the presence of level shifts, and suitable missing-value imputation techniques. All the concepts have been demonstrated on large real-world distribution network data. Open Access
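
The abstract does not say which outlier detector the thesis uses. As a rough illustration of why a windowed, median-based detector can tolerate level shifts, the sketch below assumes a Hampel-style filter: each sample is compared with the median of its local window, scaled by the median absolute deviation (MAD). The function name `hampel_outliers` and the parameters `half_window` and `n_sigmas` are hypothetical and are not taken from the thesis.

```python
from statistics import median


def hampel_outliers(values, half_window=3, n_sigmas=3.0):
    """Return indices of samples far from their local median.

    Hypothetical illustration only, not the thesis's method. Because the
    median is computed over a sliding window, a sustained level shift moves
    the reference level with it, so only isolated spikes are flagged.
    """
    scale = 1.4826  # makes the MAD comparable to a standard deviation for Gaussian data
    flagged = []
    for i, x in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        med = median(window)
        mad = median([abs(v - med) for v in window])
        # Note: when the window is perfectly flat (mad == 0), any nonzero
        # deviation is flagged; real data may need a minimum scale instead.
        if abs(x - med) > n_sigmas * scale * mad:
            flagged.append(i)
    return flagged
```

For example, a series that steps from one level to another and contains a single spike would be expected to yield only the spike's index, since samples on either side of the step agree with the median of their own window.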

    Quantification of streamflow components using recursive filters: a case study of the Paracatu River Basin (SF-7), Brazil

    The quantification of streamflow components is important for water resource management. In this study, recursive signal filters were used to separate the base flow, interflow, and quick flow of nested sub-basins in the Paracatu River Basin (SF-7). As a preliminary step, the consistency of the flow data was assessed through stationarity analyses and multivariate gap filling. A methodology is presented in which the filters are calibrated using the influence of surface runoff and the inflection of the recession curve in the dry season. The filters were improved with a logical constraint that limits overestimation of the flow at each iteration of the algorithm. The results were consistent with previous studies and with hydrogeological and climatological maps.
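
The abstract does not name the specific filters used. As a minimal sketch of the general idea, the code below assumes a one-parameter Lyne-Hollick-style recursive filter, a common choice for baseflow separation, and adds a constraint that keeps the estimated base flow between zero and the observed flow at every iteration. The function name `baseflow_filter` and the default `alpha` are assumptions for illustration, not values from the study.

```python
def baseflow_filter(flow, alpha=0.925):
    """Single forward pass of a Lyne-Hollick-style recursive filter.

    Hypothetical sketch: `alpha` is the filter parameter (assumed default),
    and the clamping below stands in for the kind of logical constraint the
    abstract describes. Returns the estimated base flow for each time step.
    """
    if not flow:
        return []
    quick_prev = 0.0
    baseflow = [flow[0]]  # assumption: the first sample is entirely base flow
    for prev, curr in zip(flow, flow[1:]):
        # High-pass recursion that estimates the quick-flow component.
        quick = alpha * quick_prev + 0.5 * (1.0 + alpha) * (curr - prev)
        quick = max(quick, 0.0)  # quick flow cannot be negative
        # Constraint: base flow must stay within [0, observed flow].
        base = min(max(curr - quick, 0.0), curr)
        baseflow.append(base)
        quick_prev = curr - base  # keep the recursion consistent with the clamp
    return baseflow
```

In practice such filters are often run in several forward and backward passes and calibrated per basin; a single pass is shown here only to make the per-iteration constraint visible.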

    Reinforcement learning in ophthalmology: potential applications and challenges to implementation

    Reinforcement learning is a subtype of machine learning in which a virtual agent, functioning within a set of predefined rules, aims to maximise a specified outcome or reward. This agent can consider multiple variables and many parallel actions at once to optimise its reward, thereby solving complex, sequential problems. Clinical decision making requires physicians to optimise patient outcomes within a set practice framework and, thus, presents considerable opportunity for the implementation of reinforcement learning-driven solutions. We provide an overview of reinforcement learning, and focus on potential applications within ophthalmology. We also explore the challenges associated with development and implementation of reinforcement learning solutions and discuss possible approaches to address them

    Design and validation of novel methods for long-term road traffic forecasting

    132 p. Road traffic management is a critical aspect of the design and planning of complex urban transport networks, for which vehicle flow forecasting is an essential component. As a testimony to its paramount relevance in transport planning and logistics, thousands of scientific research works have covered the traffic forecasting topic during the last 50 years. In the beginning, most approaches relied on autoregressive models and other analysis methods suited to time series data. During the last two decades, the development of new technology, platforms and techniques for massive data processing under the Big Data umbrella, the availability of data from multiple sources fostered by the Open Data philosophy, and an ever-growing need of decision makers for accurate traffic predictions have shifted the spotlight to data-driven procedures. Even in this favourable context, with an abundance of open data to experiment with and advanced techniques to exploit them, most predictive models reported in the literature aim for short-term forecasts, and their performance degrades when the prediction horizon is increased. Long-term forecasting strategies are scarcer, and are commonly based on the detection of patterns and the assignment of data to them. These approaches can perform reasonably well unless an unexpected event provokes unpredictable changes, or the allocation to a pattern is inaccurate. The core of the work in this Thesis has revolved around data-driven traffic forecasting, ultimately pursuing long-term forecasts. This has broadly entailed a deep analysis and understanding of the state of the art, and dealing with incompleteness of data, among other lesser issues. In addition, the second part of this dissertation presents an application outlook of the developed techniques, providing methods and unexpected insights into the local impact of traffic on pollution. The obtained results reveal that the impact of vehicular emissions on the pollution levels is overshadowe