47 research outputs found

    Computational Intelligence for classification and forecasting of solar photovoltaic energy production and energy consumption in buildings

    This thesis presents several novel applications of Computational Intelligence techniques to energy-related problems. More specifically, we refer to the assessment of the energy produced by a solar photovoltaic installation and to the evaluation of buildings' energy consumption. Recently, thanks also to the growing evolution of technologies, the energy sector has drawn the attention of the research community in proposing useful tools to deal with issues of energy efficiency in buildings and with solar energy production management. Thus, we address two kinds of problems. The first concerns the efficient management of solar photovoltaic energy installations, e.g., for efficiently monitoring performance, finding faults, or planning energy distribution in the electrical grid. This problem was tackled with two different approaches: a forecasting approach and a fuzzy classification approach for energy production estimation, starting from some knowledge of environmental variables. The forecasting system developed is able to reproduce the instantaneous curve of daily energy produced by the solar panels of the installation, with a forecasting horizon of one day. It combines neural networks and time series analysis models. The fuzzy classification system, instead, extracts linguistic knowledge about the amount of energy produced by the installation, exploiting an optimal fuzzy rule base and genetic algorithms. The developed model is the result of a novel hierarchical methodology for building fuzzy systems, which may be applied in several areas. The second problem is related to energy efficiency in buildings, for cost reduction and load scheduling purposes, and was tackled by proposing a forecasting system for energy consumption in office buildings.
    The proposed system exploits a neural network to estimate the energy consumption due to lighting over a time interval of a few hours, starting from considerations of available natural daylight.
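    The thesis combines neural networks with time-series analysis models; as a loose illustration of day-ahead forecasting from past daily totals, the sketch below fits a simple autoregressive model by least squares on synthetic data. All names and the data are illustrative, not taken from the thesis.

```python
import numpy as np

# Minimal sketch: a least-squares autoregressive model stands in for the
# thesis's neural/time-series hybrid. Synthetic daily energy data only.
rng = np.random.default_rng(0)
days = np.arange(60)
# synthetic daily energy (kWh): seasonal trend plus noise
energy = 20 + 5 * np.sin(2 * np.pi * days / 30) + rng.normal(0, 0.5, 60)

p = 7  # use the previous week of daily totals as predictors
X = np.column_stack([energy[i:len(energy) - p + i] for i in range(p)])
y = energy[p:]

# fit intercept + AR coefficients by least squares
coef, *_ = np.linalg.lstsq(np.column_stack([np.ones(len(X)), X]), y, rcond=None)

def forecast_next_day(history):
    """Predict tomorrow's energy total from the last p daily totals."""
    return coef[0] + history[-p:] @ coef[1:]

pred = forecast_next_day(energy)
```

    A real system would replace the linear model with a neural network and add environmental covariates, but the sliding-window setup is the same.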

    Density Preserving Sampling: Robust and Efficient Alternative to Cross-validation for Error Estimation

    Estimation of the generalization ability of a classification or regression model is an important issue, as it indicates the expected performance on previously unseen data and is also used for model selection. Currently used generalization error estimation procedures, such as cross-validation (CV) or bootstrap, are stochastic and thus require multiple repetitions to produce reliable results, which can be computationally expensive, if not prohibitive. The correntropy-inspired density-preserving sampling (DPS) procedure proposed in this paper eliminates the need for repeating the error estimation procedure by dividing the available data into subsets that are guaranteed to be representative of the input dataset. This allows the production of low-variance error estimates with an accuracy comparable to 10 times repeated CV at a fraction of the computations required by CV. The method can also be used for model ranking and selection. This paper derives the DPS procedure and investigates its usability and performance using a set of public benchmark datasets and standard classifiers.
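    The paper's DPS uses a correntropy-based criterion to build representative subsets; the sketch below only illustrates the underlying goal with a much simpler deterministic stratified split that preserves class proportions without any randomness.

```python
import numpy as np

# Illustrative only: a deterministic class-proportion-preserving split,
# NOT the correntropy-based DPS procedure from the paper.
def stratified_halves(labels):
    """Deterministically split sample indices into two class-balanced folds."""
    folds = ([], [])
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        folds[0].extend(idx[0::2])   # alternate members of each class
        folds[1].extend(idx[1::2])
    return [np.sort(f) for f in folds]

labels = np.array([0] * 60 + [1] * 40)
a, b = stratified_halves(labels)
```

    Because the split is deterministic, the resulting error estimate needs no repetition, which is the practical advantage DPS claims over repeated CV.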

    Recognizing daily and sports activities in two open source machine learning environments using body-worn sensor units

    This study provides a comparative assessment of different techniques for classifying human activities performed while wearing inertial and magnetic sensor units on the chest, arms and legs. The gyroscope, accelerometer and magnetometer in each unit are tri-axial. A naive Bayesian classifier, artificial neural networks (ANNs), a dissimilarity-based classifier, three types of decision trees, Gaussian mixture models (GMMs) and support vector machines (SVMs) are considered. A feature set extracted from the raw sensor data using principal component analysis is used for classification. Three different cross-validation techniques are employed to validate the classifiers. A performance comparison of the classifiers is provided in terms of their correct differentiation rates, confusion matrices and computational cost. The highest correct differentiation rates are achieved with ANNs (99.2%), SVMs (99.2%) and a GMM (99.1%). GMMs may be preferable because of their lower computational requirements. Regarding the position of sensor units on the body, those worn on the legs are the most informative. Comparing the different sensor modalities indicates that if only a single sensor type is used, the highest classification rates are achieved with magnetometers, followed by accelerometers and gyroscopes. The study also provides a comparison between two commonly used open source machine learning environments (WEKA and PRTools) in terms of their functionality, manageability, classifier performance and execution times. © The British Computer Society 2013. All rights reserved.
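    The study's pipeline reduces raw sensor features with principal component analysis before classification; the sketch below shows that pattern on synthetic data, with PCA computed via SVD and a nearest-mean classifier standing in for the ANN/SVM/GMM comparison.

```python
import numpy as np

# Sketch of the feature pipeline only: PCA via SVD, then a nearest-mean
# classifier. Data is synthetic; the study used WEKA/PRTools classifiers.
rng = np.random.default_rng(1)
# two synthetic "activities", 50 windows each, 12 raw features per window
walk = rng.normal(0.0, 1.0, (50, 12))
run = rng.normal(3.0, 1.0, (50, 12))
X = np.vstack([walk, run])
y = np.array([0] * 50 + [1] * 50)

# PCA: centre the data, then project onto the top-3 right singular vectors
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
Z = (X - mean) @ Vt[:3].T

# nearest-mean classification in the reduced space
centroids = np.array([Z[y == c].mean(axis=0) for c in (0, 1)])

def classify(z):
    return int(np.argmin(np.linalg.norm(centroids - z, axis=1)))

acc = float(np.mean([classify(z) == t for z, t in zip(Z, y)]))
```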

    Physically inspired methods and development of data-driven predictive systems.

    Traditionally, the building of predictive models is perceived as a combination of both science and art. Although the designer of a predictive system effectively follows a prescribed procedure, their domain knowledge as well as expertise and intuition in the field of machine learning are often irreplaceable. However, in many practical situations it is possible to build well-performing predictive systems by following a rigorous methodology, offsetting not only the lack of domain knowledge but also a partial lack of expertise and intuition with computational power. The generalised predictive model development cycle discussed in this thesis is an example of such a methodology which, despite being computationally expensive, has been successfully applied to real-world problems. The proposed predictive system design cycle is a purely data-driven approach. The quality of the data used to build the system is thus of crucial importance. In practice, however, the data is rarely perfect. Common problems include missing values, high dimensionality or a very limited amount of labelled exemplars. In order to address these issues, this work investigated and exploited inspirations coming from physics. The novel use of well-established physical models in the form of potential fields has resulted in the derivation of a comprehensive Electrostatic Field Classification Framework for supervised and semi-supervised learning from incomplete data. Although computational power constantly becomes cheaper and more accessible, it is not infinite. Therefore, efficient techniques able to exploit the finite predictive information content of the data and limit the computational requirements of the resource-hungry predictive system design procedure are very desirable. In designing such techniques, this work once again investigated and exploited inspirations coming from physics.
    By using an analogy with a set of interacting particles and the resulting Information Theoretic Learning framework, the Density Preserving Sampling technique has been derived. This technique acts as a computationally efficient alternative to cross-validation, which fits well within the proposed methodology. All methods derived in this thesis have been thoroughly tested on a number of benchmark datasets. The proposed generalised predictive model design cycle has been successfully applied to two real-world environmental problems, in which a comparative study of Density Preserving Sampling and cross-validation has also been performed, confirming the great potential of the proposed methods.
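    The core analogy behind the potential-field approach can be sketched very loosely: each labelled sample acts as a "charge", and a query point is assigned to the class exerting the strongest Coulomb-like potential on it. The actual Electrostatic Field Classification Framework (field dynamics, semi-supervised handling of incomplete data) is far richer than this toy version, and all names and data below are illustrative.

```python
import numpy as np

# Toy potential-field classifier: class score = summed 1/r potential of
# that class's labelled samples. Illustrative only, not the thesis framework.
def potential(points, q, eps=1e-9):
    """Coulomb-like 1/r potential at point q from a set of unit charges."""
    r = np.linalg.norm(points - q, axis=1)
    return float(np.sum(1.0 / (r + eps)))

class_a = np.array([[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2]])
class_b = np.array([[3.0, 3.0], [3.2, 2.9], [2.8, 3.1]])

def classify(q):
    return "A" if potential(class_a, q) > potential(class_b, q) else "B"

label = classify(np.array([0.3, 0.3]))
```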

    Applications of pattern classification to time-domain signals

    Many different kinds of physics are used in sensors that produce time-domain signals, such as ultrasonics, acoustics, seismology, and electromagnetics. The waveforms generated by these sensors are used to measure events or detect flaws in applications ranging from industrial to medical and defense-related domains. Interpreting the signals is challenging because of the complicated physics of the interaction of the fields with the materials and structures under study. Often the method of interpreting the signal varies by the application, but automatic detection of events in signals is always useful in order to attain results quickly with less human error. One method of automatic interpretation of data is pattern classification, which is a statistical method that assigns predicted labels to raw data associated with known categories. In this work, we use pattern classification techniques to aid automatic detection of events in signals using features extracted by a particular application of the wavelet transform, the Dynamic Wavelet Fingerprint (DWFP), as well as features selected through physical interpretation of the individual applications. The wavelet feature extraction method is general for any time-domain signal, and the classification results can be improved by features drawn from the particular domain. The success of this technique is demonstrated through four applications: the development of an ultrasonographic periodontal probe, the identification of flaw type in Lamb wave tomographic scans of an aluminum pipe, prediction of roof falls in a limestone mine, and automatic identification of individual Radio Frequency Identification (RFID) tags regardless of their programmed codes. The method has been shown to achieve high accuracy, sometimes as high as 98%.
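    The DWFP itself produces binary "fingerprint" images from the continuous wavelet transform; the sketch below only demonstrates the underlying idea of wavelet features for time-domain signals, using a hand-rolled single-level Haar transform whose detail coefficients localise a transient, much as a flaw echo would be localised.

```python
import numpy as np

# Single-level Haar wavelet transform (illustrative stand-in for the DWFP).
def haar_level1(signal):
    """Return (approximation, detail) coefficients of one Haar level."""
    s = np.asarray(signal, dtype=float)
    if len(s) % 2:                      # pad odd-length signals
        s = np.append(s, s[-1])
    approx = (s[0::2] + s[1::2]) / np.sqrt(2)
    detail = (s[0::2] - s[1::2]) / np.sqrt(2)
    return approx, detail

t = np.linspace(0, 1, 256, endpoint=False)
sig = np.sin(2 * np.pi * 5 * t)         # smooth background signal
sig[128] += 2.0                         # sharp transient, like a flaw echo

approx, detail = haar_level1(sig)
flaw_location = int(np.argmax(np.abs(detail)))  # detail coefficients localise it
```

    The detail coefficients suppress the smooth background and concentrate energy at the transient, which is why wavelet-domain features make good classifier inputs for event detection.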

    The Key Factors in Physical Activity Type Detection Using Real-Life Data: A Systematic Review

    Background: Physical activity (PA) is paramount for human health and well-being. However, there is a lack of information regarding the types of PA and the way they can exert an influence on functional and mental health as well as quality of life. Studies have measured and classified PA type in controlled conditions, but only provided limited insight into the validity of classifiers under real-life conditions. The advantage of utilizing the type dimension and the significance of real-life study designs for PA monitoring brought us to conduct a systematic literature review on PA type detection (PATD) under real-life conditions focused on three main criteria: methods for detecting PA types, using accelerometer data collected by portable devices, and real-life settings.
    Method: The search of the databases, Web of Science, Scopus, PsycINFO, and PubMed, identified 1,170 publications. After screening of titles, abstracts and full texts using the above selection criteria, 21 publications were included in this review.
    Results: This review is organized according to the three key elements constituting the PATD process using real-life datasets, including data collection, preprocessing, and PATD methods. Recommendations regarding these key elements are proposed, particularly regarding two important PA classes, i.e., posture and motion activities. Existing studies generally reported high to near-perfect classification accuracies. However, the data collection protocols and performance reporting schemes used varied significantly between studies, hindering a transparent performance comparison across methods.
    Conclusion: Generally, considerably fewer studies focused on PA types, compared to other measures of PA assessment, such as PA intensity, and even fewer focused on real-life settings. To reliably differentiate the basic postures and motion activities in real life, two 3D accelerometers (thigh and hip) sampling at 20 Hz were found to provide the minimal sensor configuration. Decision trees are the most common classifier used in practical applications with real-life data. Despite the significant progress made over the past years in assessing PA in real-life settings, it remains difficult, if not impossible, to compare the performance of the various proposed methods. Thus, there is an urgent need for labeled, fully documented, and openly available reference datasets including a common evaluation framework.
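    Since the review identifies decision trees as the most common practical classifier, a hand-written two-node tree can sketch how posture and motion classes separate on thigh-accelerometer features. The thresholds and feature names here are illustrative, not taken from any reviewed study.

```python
# Hand-written two-node decision tree over features that would be computed
# from a 20 Hz thigh-accelerometer window. Thresholds are illustrative only.
def classify_window(thigh_incline_deg, accel_sd_g):
    """Classify one accelerometer window into a coarse activity type."""
    if accel_sd_g > 0.1:               # high signal variance -> movement
        return "motion"
    # static window: thigh inclination separates sitting/lying from standing
    return "sit/lie" if thigh_incline_deg > 45 else "stand"

samples = [
    (10, 0.02),   # thigh near vertical, still   -> stand
    (80, 0.03),   # thigh near horizontal, still -> sit/lie
    (30, 0.50),   # high variance                -> motion
]
predictions = [classify_window(a, s) for a, s in samples]
```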

    Stacked regression with a generalization of the Moore-Penrose pseudoinverse

    In practice, a number of classification methods are often available, and we are not able to clearly determine which method is optimal. We propose a combined method that allows us to consolidate information from multiple sources into a better classifier. Stacked regression (SR) is a method for forming linear combinations of different classifiers to give improved classification accuracy. The Moore-Penrose (MP) pseudoinverse is a general way to find the solution to a system of linear equations. This paper presents the use of a generalization of the MP pseudoinverse of a matrix in SR. However, for datasets with a greater number of features our exact method is computationally too slow to achieve good results, so we propose a genetic approach to solve the problem. Experimental results on various real datasets demonstrate that the improvements are efficient and that this approach outperforms the classical SR method, providing a significant reduction in the mean classification error rate.
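    The classical SR baseline the paper builds on can be sketched directly: base-classifier scores are combined by a least-squares weight vector computed with the ordinary Moore-Penrose pseudoinverse. The paper's actual contribution (a generalised pseudoinverse plus a genetic search) is not reproduced here; the data below is synthetic.

```python
import numpy as np

# Classical stacked regression via the MP pseudoinverse (synthetic data).
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 200).astype(float)        # true binary labels

# scores from three imperfect "base classifiers": label plus differing noise
F = np.column_stack([y + rng.normal(0, s, 200) for s in (0.3, 0.5, 0.8)])

w = np.linalg.pinv(F) @ y                        # least-squares stacking weights
stacked = (F @ w > 0.5).astype(float)            # thresholded combined score
acc = float(np.mean(stacked == y))
```

    The pseudoinverse gives the minimum-norm least-squares weights in one step; the weights naturally favour the less noisy base classifiers.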

    Automatic detection of microaneurysms in colour fundus images for diabetic retinopathy screening.

    Regular eye screening is essential for the early detection and treatment of diabetic retinopathy. This paper presents a novel automatic screening system for diabetic retinopathy that focuses on the detection of the earliest visible signs of retinopathy, namely microaneurysms. Microaneurysms are small dots on the retina, formed by the ballooning out of a weak part of the capillary wall. The detection of microaneurysms at an early stage is vital, and it is the first step in preventing diabetic retinopathy. The paper first explores the existing systems and applications related to diabetic retinopathy screening, with a focus on microaneurysm detection methods. The proposed decision support system consists of the automatic acquisition, screening and classification of diabetic retinopathy colour fundus images, which could assist in the detection and management of the disease. Several feature extraction methods and the circular Hough transform have been employed in the proposed microaneurysm detection system, alongside the fuzzy histogram equalisation method. The latter has been applied in the preprocessing stage of the diabetic retinopathy eye fundus images and provided improved results for detecting the microaneurysms.
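    As a preprocessing sketch only, plain histogram equalisation (shown below in numpy) stands in for the fuzzy variant used in the paper; it illustrates why contrast enhancement helps small dark lesions such as microaneurysms stand out before detection. The synthetic patch is illustrative.

```python
import numpy as np

# Plain histogram equalisation (stand-in for the paper's fuzzy variant).
def hist_equalise(img):
    """Map 8-bit grey levels through the normalised cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size
    lut = np.round(255 * cdf).astype(np.uint8)   # look-up table per grey level
    return lut[img]

rng = np.random.default_rng(3)
# synthetic low-contrast fundus patch: grey levels squeezed into [100, 140)
patch = rng.integers(100, 140, (64, 64)).astype(np.uint8)
enhanced = hist_equalise(patch)
```

    Stretching the narrow grey-level range across the full 8-bit scale makes subtle dot-like structures easier for the subsequent Hough-based detector to find.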

    The characterisation and automatic classification of transmission line faults

    A country's ability to sustain and grow its industrial and commercial activities is highly dependent on a reliable electricity supply. Electrical faults on transmission lines are a cause of both interruptions to supply and voltage dips. These are the most common events impacting electricity users and also have the largest financial impact on them. This research focuses on understanding the causes of transmission line faults and developing methods to automatically identify these causes. Records of faults occurring on the South African power transmission system over a 16-year period have been collected and analysed to find statistical relationships between local climate, key design parameters of the overhead lines and the main causes of power system faults. The results characterise the performance of the South African transmission system on a probabilistic basis and illustrate differences in fault cause statistics for the summer and winter rainfall areas of South Africa and for different times of the year and day. This analysis lays a foundation for reliability analysis and fault pattern recognition taking environmental features such as local geography, climate and power system parameters into account. A key aspect of using pattern recognition techniques is selecting appropriate classifying features. Transmission line fault waveforms are characterised by instantaneous symmetrical component analysis to describe the transient and steady-state fault conditions. The waveform and environmental features are used to develop single nearest neighbour classifiers to identify the underlying cause of transmission line faults. A classification accuracy of 86% is achieved using a single nearest neighbour classifier. This classification performance is found to be superior to that of decision tree, artificial neural network and naïve Bayes classifiers.
    The results achieved demonstrate that transmission line faults can be automatically classified according to cause.
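    The symmetrical-component transform used to characterise the fault waveforms can be sketched directly: three phase phasors map to zero-, positive- and negative-sequence components via the standard analysis matrix. The example phasors below are illustrative, not fault records from the thesis.

```python
import numpy as np

# Standard symmetrical-component analysis of three-phase phasors.
a = np.exp(2j * np.pi / 3)                 # 120-degree rotation operator
analysis = np.array([[1, 1,     1],
                     [1, a,     a**2],
                     [1, a**2,  a]]) / 3   # maps (Va, Vb, Vc) -> (V0, V1, V2)

def symmetrical_components(phasors):
    """Return (zero, positive, negative) sequence components."""
    return analysis @ np.asarray(phasors)

# a balanced three-phase set contains only a positive-sequence component
balanced = [1.0, a**2, a]                  # Va, Vb (-120 deg), Vc (+120 deg)
s0, s1, s2 = symmetrical_components(balanced)
```

    During a fault the negative- and zero-sequence components become non-zero in characteristic proportions, which is what makes them useful classifying features for the nearest neighbour classifier.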