
    Anomaly Detection of Smart Meter Data

    Presently, households and buildings account for almost one-third of total energy consumption, and this share continues to rise as more buildings install smart meter sensors and connect to the Smart Grid. The Smart Grid uses sensors and ICT technologies to achieve an uninterrupted power supply and minimize power wastage. Abnormalities and faults in sensors lead to power wastage; in addition, studying a building's consumption pattern can lead to a substantial reduction in wastage, saving millions of dollars. According to studies, 20% of the energy consumed by buildings is wasted due to the above factors. In this work, we propose an approach for detecting anomalies in the power consumption of smart meter data from an open dataset of 10 houses from Ausgrid Corporation Australia. Since power consumption may be affected by various factors, such as weather conditions during the year, it was necessary to discover anomalies while accounting for seasonal periods such as weather seasons, day/night and holidays. Consequently, the first part of this thesis identifies the outliers and obtains data with labels (normal or anomalous). We use the Facebook Prophet algorithm along with power-consumption domain knowledge to detect anomalies in two years of half-hour-sampled data. After generating the dataset with anomaly labels, we propose a method to classify future power consumption as anomalous or normal. We use four different machine learning approaches for classifying anomalies and also measure the run-time of the different classification algorithms. We achieve a G-mean score of 97 per cent.
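
    A minimal sketch of the interval-based labelling step described above, assuming Prophet's standard Python API; the file name, column names and the 99% interval width are illustrative, not taken from the thesis.

        # Label half-hourly readings anomalous when they escape Prophet's
        # prediction interval. Seasonality settings echo the abstract's
        # concern with weather seasons, day/night cycles and holidays.
        import pandas as pd
        from prophet import Prophet

        df = pd.read_csv("house_01.csv")                    # hypothetical file
        df = df.rename(columns={"timestamp": "ds", "kwh": "y"})
        df["ds"] = pd.to_datetime(df["ds"])

        m = Prophet(interval_width=0.99,                    # wide band, fewer false alarms
                    daily_seasonality=True,
                    weekly_seasonality=True,
                    yearly_seasonality=True)
        m.fit(df)

        forecast = m.predict(df[["ds"]])
        merged = df.merge(forecast[["ds", "yhat_lower", "yhat_upper"]], on="ds")
        merged["anomaly"] = ((merged["y"] < merged["yhat_lower"]) |
                             (merged["y"] > merged["yhat_upper"]))
        print(merged["anomaly"].mean())                     # fraction of flagged points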

    Turn-Level Active Learning for Dialogue State Tracking

    Dialogue state tracking (DST) plays an important role in task-oriented dialogue systems. However, collecting a large amount of turn-by-turn annotated dialogue data is costly and inefficient. In this paper, we propose a novel turn-level active learning framework for DST that actively selects which turns in dialogues to annotate. Given a limited labelling budget, experimental results demonstrate the effectiveness of selective annotation of dialogue turns. Additionally, our approach achieves DST performance comparable to traditional training approaches with significantly less annotated data, providing a more efficient way to annotate new dialogue data. (Comment: EMNLP 2023 Main Conference)
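
    The acquisition function is not specified in the abstract; as a hedged illustration, the sketch below uses entropy-based uncertainty sampling, a common active-learning baseline, to choose which turns to annotate under a fixed budget.

        # Pick the `budget` turns whose predicted slot-value distributions
        # have the highest entropy, i.e. where the DST model is least sure.
        import numpy as np

        def select_turns(turn_probs, budget):
            """turn_probs[i] is the model's probability distribution over
            candidate slot values for turn i. Returns the indices of the
            `budget` most uncertain turns."""
            entropies = [-(p * np.log(p + 1e-12)).sum() for p in turn_probs]
            return list(np.argsort(entropies)[::-1][:budget])

        # Toy usage: of three turns, annotate the single most uncertain one.
        probs = [np.array([0.9, 0.1]), np.array([0.5, 0.5]), np.array([0.7, 0.3])]
        print(select_turns(probs, budget=1))                # -> [1]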

    Scalable Hierarchical Gaussian Process Models for Regression and Pattern Classification

    Gaussian processes, which are distributions over functions, are powerful nonparametric tools for the two major machine learning tasks: regression and classification. Both tasks are concerned with learning input-output mappings from example input-output pairs. In Gaussian process (GP) regression and classification, such mappings are modeled by Gaussian processes. In GP regression, the likelihood is Gaussian for continuous outputs, and hence closed-form solutions for prediction and model selection can be obtained. In GP classification, the likelihood is non-Gaussian for discrete/categorical outputs; hence closed-form solutions are not available, and one must resort to approximate inference methods.
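
    For reference, the closed-form GP regression prediction the abstract alludes to follows the standard equations (e.g. Rasmussen and Williams, 2006): mean k_*^T (K + sigma_n^2 I)^{-1} y and variance k(x_*, x_*) - k_*^T (K + sigma_n^2 I)^{-1} k_*. A minimal numpy sketch:

        import numpy as np

        def rbf(a, b, lengthscale=1.0):
            # Squared-exponential kernel between two 1-D input arrays.
            d2 = (a[:, None] - b[None, :]) ** 2
            return np.exp(-0.5 * d2 / lengthscale ** 2)

        def gp_predict(x_train, y_train, x_test, noise=0.1):
            K = rbf(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
            k_star = rbf(x_train, x_test)                 # shape (n, m)
            alpha = np.linalg.solve(K, y_train)
            mean = k_star.T @ alpha
            var = rbf(x_test, x_test).diagonal() - np.einsum(
                "ij,ij->j", k_star, np.linalg.solve(K, k_star))
            return mean, var

        x = np.linspace(0, 5, 20)
        y = np.sin(x) + 0.1 * np.random.randn(20)
        mu, v = gp_predict(x, y, np.array([2.5]))         # predict at x* = 2.5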

    Optimal use of computing equipment in an automated industrial inspection context

    This thesis deals with automatic defect detection. The objective was to develop the techniques required by a small manufacturing business to make cost-efficient use of inspection technology. In our work on inspection techniques we discuss image acquisition and the choice between custom and general-purpose processing hardware. We examine the classes of general-purpose computer available and study popular operating systems in detail. We highlight the advantages of a hybrid system interconnected via a local area network and develop a sophisticated suite of image-processing software based on it. We quantitatively study the performance of elements of the TCP/IP networking protocol suite and comment on appropriate protocol selection for parallel distributed applications. We implement our own distributed application based on these findings. In our work on inspection algorithms we investigate the potential uses of iterated function series and Fourier transform operators when preprocessing images of defects in aluminium plate acquired using a linescan camera. We employ a multi-layer perceptron neural network trained by backpropagation as a classifier. We examine the effect of the number of nodes in the hidden layer on the training process and on the network's ability to identify faults in images of aluminium plate. We investigate techniques for introducing positional independence into the network's behaviour. We analyse the pattern of weights induced in the network after training in order to gain insight into the logic of its internal representation. We conclude that the backpropagation training process is so computationally intensive that it presents a real barrier to further development in practical neural network techniques, and we seek ways to achieve a speed-up. We consider the training process as a search problem and arrive at a process involving multiple, parallel search "vectors" and aspects of genetic algorithms. We implement the system as the aforementioned distributed application and comment on its performance.
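
    The abstract does not spell out the thesis's search scheme; the sketch below is a generic population-based weight search with genetic-algorithm-style selection and mutation, in the spirit of the parallel search "vectors" described. All parameters are illustrative.

        # A population of candidate weight vectors is evaluated in parallel,
        # the best quarter survives, and offspring are made by mutation.
        import numpy as np

        def evolve_weights(loss_fn, dim, pop_size=32, generations=200, sigma=0.1):
            pop = np.random.randn(pop_size, dim)
            for _ in range(generations):
                losses = np.array([loss_fn(w) for w in pop])
                elite = pop[np.argsort(losses)[: pop_size // 4]]      # keep best 25%
                children = np.repeat(elite, 4, axis=0)
                children += sigma * np.random.randn(*children.shape)  # mutate
                pop = children
            losses = np.array([loss_fn(w) for w in pop])
            return pop[np.argmin(losses)]

        # Toy usage: minimise a quadratic "loss" over 10 weights; in the
        # thesis's setting loss_fn would be the network's training error.
        best = evolve_weights(lambda w: float(np.sum(w ** 2)), dim=10)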

    Can Threshold-Based Sensor Alerts be Analysed to Detect Faults in a District Heating Network?

    Older IoT “smart sensors” create system alerts from threshold rules on reading values. These simple thresholds are not very flexible to changes in the network, and because of the large number of false positives generated, the alerts are often ignored by network operators. Current state-of-the-art analytical models typically create alerts using raw sensor readings as the primary input. However, as greater numbers of sensors are deployed, the growth in the number of readings that must be processed becomes problematic, and the number of analytic models deployed to each of these systems is also increasing as analysis is broadened. This study investigates whether alerts created using threshold rules can be used to predict network faults. By using threshold-based alerts instead of raw continuous readings, the amount of data that the analytic models need to process is greatly reduced. The study used alert data from a European city’s District Heating network, generated by “smart sensors” applying threshold rules. Analytic models were tested to find the most accurate prediction of a network fault. Work order (maintenance) records were used as the target variable, indicating that a fault had occurred at the same time and location where the alert was active. The target variable was highly imbalanced (96:4), with the minority class being cases where a work order was required. The decision tree model developed used misclassification costs to achieve reasonable accuracy, with a trade-off between precision (0.63) and recall (0.56). The sparse nature of the alert data may be to blame for this result. The results show promise that this method could work well on datasets with better sensor coverage.
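
    The study's actual cost matrix is not given here; a comparable effect can be approximated in scikit-learn with class weights, as in this hedged sketch on synthetic stand-in data (the weight of 24 mirrors the 96:4 imbalance; depth and data are illustrative).

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import precision_score, recall_score

        rng = np.random.default_rng(0)
        X = rng.random((2000, 8))                     # stand-in alert features
        y = (rng.random(2000) < 0.04).astype(int)     # ~4% fault (work order) rate

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

        tree = DecisionTreeClassifier(
            class_weight={0: 1, 1: 24},               # penalise missed faults heavily
            max_depth=6,
            random_state=0)
        tree.fit(X_tr, y_tr)

        pred = tree.predict(X_te)
        print(precision_score(y_te, pred, zero_division=0),
              recall_score(y_te, pred, zero_division=0))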

    COMPUTATIONAL MODELLING OF HUMAN AESTHETIC PREFERENCES IN THE VISUAL DOMAIN: A BRAIN-INSPIRED APPROACH

    Following the rise of neuroaesthetics as a research domain, computational aesthetics has also seen a resurgence in popularity over the past decade, with many works using novel computer vision and machine learning techniques to evaluate the aesthetic value of visual information. This thesis presents a new approach where low-level features inspired by the human visual system are extracted from images to train a machine learning-based system to classify visual information by its aesthetics, regardless of the type of visual media. Extensive tests are developed to highlight the strengths and weaknesses of such low-level features while establishing good practices in the study of computational aesthetics. The aesthetic classification system is tested not only on the most widely used dataset of photographs, called AVA, on which it is trained initially, but also on other photographic datasets to evaluate the robustness of the learnt aesthetic preferences across other rating communities. The system is then assessed in terms of aesthetic classification on other types of visual media to investigate whether the learnt aesthetic preferences represent photography rules or more general aesthetic rules. The skill transfer from aesthetic classification of photos to videos demonstrates a satisfying correct classification rate of videos without any prior training on the test set created by Tzelepis et al. Moreover, the initial photograph classifier can also be used on feature films to investigate the classifier’s learnt visual preferences, since films provide a large number of easily labelled frames. The study on aesthetic classification of videos concludes with a case study on the work of an online content creator: the classifier recognised a significantly greater percentage of aesthetically high frames in videos filmed in studios than in those filmed on the go. The results obtained across datasets containing videos of diverse natures demonstrate the extent of the system’s aesthetic knowledge. To conclude, the evolution of low-level visual features is studied in popular culture, such as in paintings and brand logos. The work attempts to link aesthetic preferences during contemplation tasks, such as aesthetic rating of photographs, with preferred low-level visual features in art creation. It questions whether the use of favoured visual features varies over the life of a painter, implicitly showing a relationship with artistic expertise. Findings display significant changes in the use of universally preferred features over influential abstract painters’ careers, such as an increase in cardinal lines and the colour blue; changes that were not observed in landscape painters. Regarding brand logos, only a few features evolved in a significant manner, most of them being colour-related. Despite the incredible amount of data available online, phenomena developing over an entire life remain complicated to study. These computational experiments show that simple approaches focusing on the fundamentals, instead of high-level measures, make it possible to analyse artists’ visual preferences, as well as to extract a community’s visual preferences from photos or videos, while limiting the impact of cultural and personal experiences.
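
    The thesis's exact feature set is not listed in the abstract; the sketch below computes plausible low-level, vision-inspired features (colour statistics plus an edge-orientation histogram) of the kind such a system might use, purely as an illustration.

        import numpy as np

        def low_level_features(rgb):
            """rgb: HxWx3 float array in [0, 1]. Returns a small feature vector."""
            gray = rgb @ np.array([0.299, 0.587, 0.114])    # luminance
            gy, gx = np.gradient(gray)
            orientation = np.arctan2(gy, gx)
            # 8-bin histogram of edge orientations, weighted by edge strength,
            # which would capture e.g. a preference for cardinal lines.
            hist, _ = np.histogram(orientation, bins=8, range=(-np.pi, np.pi),
                                   weights=np.hypot(gx, gy))
            hist = hist / (hist.sum() + 1e-12)
            colour_stats = np.r_[rgb.mean(axis=(0, 1)), rgb.std(axis=(0, 1))]
            return np.r_[colour_stats, hist]

        img = np.random.rand(64, 64, 3)                     # stand-in image
        print(low_level_features(img).shape)                # (14,)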

    Estimating UK House Prices using Machine Learning

    House price estimation is an important subject for property owners, property developers, investors and buyers. It has featured in many academic research papers and some government and commercial reports. The price of a house may vary depending on several features, including geographic location, tenure, age, type, size and market. Existing studies have largely focused on applying single or multiple machine learning techniques to single or grouped datasets to identify the best performing algorithms, models and/or most important predictors; this paper instead proposes a cumulative layering approach in what it describes as a Multi-feature House Price Estimation (MfHPE) framework. The MfHPE is a process-oriented, data-driven, machine learning based framework that does not just identify the best performing algorithms or the features that drive model accuracy, but also exploits a cumulative multi-feature layering approach to creating, optimising and evaluating machine learning models, so as to produce tangible insights that support decision-making by stakeholders within the housing ecosystem and a more realistic estimation of house prices. Fundamentally, the development of the MfHPE framework leverages the Design Science Research Methodology (DSRM), and HM Land Registry’s Price Paid Data is ingested as the base transactions data. 1.1 million London-based transaction records between January 2011 and December 2020 have been used for model design, optimisation and evaluation, while 84,051 transactions from 2021 have been used for model validation. With the capacity for updates to existing datasets and the introduction of new datasets and algorithms, the framework also leverages a range of neighbourhood and macroeconomic features, including the location of rail stations, supermarkets and bus stops, the inflation rate, GDP, the employment rate, the Consumer Price Index (CPIH) and the unemployment rate, to explore their impact on the estimation of house prices and their influence on the behaviour of machine learning algorithms. Five machine learning algorithms have been used and three evaluation metrics applied. Results show that the layered introduction of new varieties of features across multiple tiers improved performance in 50% of models and changed which models performed best, and that the choice of evaluation metrics should not be based on technical problem types alone but on three components: (i) critical business objectives or project goals; (ii) the variety of features; and (iii) the machine learning algorithms.
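
    A minimal sketch of the cumulative layering idea: each tier appends a group of features, the model is retrained, and the metric is compared across tiers. The tier names, features and model choice here are illustrative stand-ins, not the framework's actual configuration.

        import numpy as np
        import pandas as pd
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import mean_absolute_error

        tiers = {
            "base":       ["floor_area", "property_type", "tenure"],
            "+neighbour": ["dist_rail", "dist_supermarket", "dist_bus"],
            "+macro":     ["inflation", "gdp", "unemployment"],
        }

        # Stand-in data; in the framework this is Price Paid Data plus joins.
        df = pd.DataFrame(np.random.rand(5000, 10),
                          columns=sum(tiers.values(), []) + ["price"])

        features = []
        for name, cols in tiers.items():
            features += cols                              # cumulative layering
            X_tr, X_te, y_tr, y_te = train_test_split(
                df[features], df["price"], random_state=0)
            model = RandomForestRegressor(n_estimators=100, random_state=0)
            model.fit(X_tr, y_tr)
            print(name, mean_absolute_error(y_te, model.predict(X_te)))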

    Progame: event-based machine learning approach for in-game marketing

    Abstract. The gaming industry has grown significantly, which has led to an increase in collected player and usage data, including game events, player interactions, connections between players and individual preferences. Such big data has many use cases, such as identifying gaming bottlenecks, detecting and predicting anomalies and suspicious usage patterns for security, and specifying real-time offers via fine-grained user profiling based on interest profiles. Personalized offer timing could reduce product cannibalization, and ethical methods increase the trust of customers. The goal of this thesis is to predict the value and time of the next in-game purchase in a mobile game. Event-based purchase data, daily in-game behaviour metrics and session data are aggregated into a single data table, from which samples of 50,000 data points are taken. The features are analyzed for linear correlation with the labels, and their combinations are used as input for three machine learning algorithms: Random Forest, Support Vector Machine and Multi-Layer Perceptron. Both purchase value and purchase time correlate with features related to previous purchase behaviour. The Multi-Layer Perceptron showed the lowest error in predicting both labels, improving on a trivial baseline predictor by 22.0% for value in USD and 20.7% for days until purchase. For ethical customer-behaviour prediction, sharing research knowledge and involving customers in the data analysis process are suggested to build awareness.
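
    A minimal sketch of the reported comparison: a Multi-Layer Perceptron regressor against a trivial baseline that always predicts the training-set mean, on synthetic stand-in data (the feature columns and target are hypothetical).

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import mean_absolute_error

        rng = np.random.default_rng(0)
        X = rng.random((50_000, 6))                   # e.g. previous-purchase features
        y = 5 * X[:, 0] + rng.normal(0, 1, 50_000)    # next purchase value (USD)

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
        mlp.fit(X_tr, y_tr)

        mae_mlp = mean_absolute_error(y_te, mlp.predict(X_te))
        mae_base = mean_absolute_error(y_te, np.full_like(y_te, y_tr.mean()))
        print(f"improvement over baseline: {100 * (1 - mae_mlp / mae_base):.1f}%")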