    Machine learning for predicting energy efficiency of buildings: a small data approach

    This paper presents a method for predicting the energy efficiency of buildings using artificial intelligence tools. The scope is twofold: predicting the heating load and the cooling load of buildings. A distinguishing feature of this research is that the analysis is performed with a limited amount of data. An improved augmentation and prediction method (the input-doubling method) is proposed, in which data are processed within each cluster of the studied dataset. The clusters are obtained with the fast and easy-to-implement k-means method, and a prediction is then made with the input-doubling method within each cluster separately. The method was tested on a real-world dataset of 768 observations. The proposed approach showed high prediction accuracy, no overfitting, and good generalization. Compared with existing methods, accuracy (MSE) improved by 40-46% over SVR with an RBF kernel, which is the basis of the improved method, and by 5-12% over the closest existing hierarchical predictor.
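The cluster-then-predict structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the input-doubling method and the SVR predictor are not reproduced, and a per-cluster target mean stands in as a placeholder model. The function names (`kmeans`, `fit_cluster_models`, `predict`) and the farthest-first initialization are assumptions made for illustration only.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iters=50):
    """Lloyd's k-means with a deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def fit_cluster_models(X, y, k):
    """Cluster X, then fit one placeholder model (the target mean) per cluster."""
    centroids = kmeans(X, k)
    sums, counts = [0.0] * k, [0] * k
    for xi, yi in zip(X, y):
        c = nearest(xi, centroids)
        sums[c] += yi
        counts[c] += 1
    means = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    return centroids, means


def predict(x, centroids, means):
    """Route x to its nearest cluster and use that cluster's model."""
    return means[nearest(x, centroids)]
```

In the approach described above, the placeholder mean would be replaced by a predictor trained only on the observations of the corresponding cluster.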

    An ensemble of GRNN neural networks based on shifted response surfaces for e-commerce tasks

    Solving e-commerce problems, which in most cases are represented by non-linear response surfaces, is an important task. Existing computational intelligence methods are not always appropriate because their training and debugging procedures are complex. Non-iterative tools and neural networks without training also do not provide satisfactory accuracy. Accuracy can be improved using different ensembling techniques. Therefore, the paper describes a new ensemble method based on generalized regression neural networks. The basic idea of the developed ensemble is to linearize the response surface given by the data of the available sample. To this end, the surface obtained by the general regression neural network is fed to the input of a linear neural-like structure. This combination improves the accuracy of the ensemble. The described ensemble is used to predict the price of used cars based on the most suitable independent attributes. In common practice, solving this task requires expert knowledge. The urgency of solving this problem is substantiated. The dataset contains vehicle characteristics and sale prices of 1436 used cars, and its main attributes are provided. The optimal parameters were selected experimentally. The performance of different existing methods was compared. The methods were evaluated by the root mean square error on a test data sample. Compared with known methods such as the General Regression Neural Network, Radial Basis Function Neural Network, Linear Regression, Lasso Regression, and Support Vector Machine Regressor, the proposed ensemble achieved the highest accuracy. The results are compared with estimates based on Condorcet's jury theorem. The proposed method was implemented in Python.
Thus, the developed general regression neural network ensemble, based on offsets of response surfaces and additionally using the linear-type Neural-like Structure of the Successive Geometric Transformations Model, can be used to solve various e-commerce tasks that require high precision.
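The idea of feeding a GRNN output into a linear stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the ensemble from the paper: the linear stage here is an ordinary least-squares line over a single input (the GRNN output) rather than the Successive Geometric Transformations Model structure, the response-surface offsets are not modelled, and the leave-one-out training of the linear stage is a design choice of this sketch. All function names and the `sigma` smoothing parameter are illustrative.

```python
import math


def grnn_predict(x, X_train, y_train, sigma):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * sigma ** 2))
        for xi in X_train
    ]
    total = sum(weights)
    if total == 0.0:  # every kernel underflowed; fall back to the plain mean
        return sum(y_train) / len(y_train)
    return sum(w * t for w, t in zip(weights, y_train)) / total


def fit_line(g, y):
    """Least-squares fit y ~ a * g + b for a single input g."""
    n = len(g)
    mg, my = sum(g) / n, sum(y) / n
    var = sum((gi - mg) ** 2 for gi in g)
    a = sum((gi - mg) * (yi - my) for gi, yi in zip(g, y)) / var if var else 0.0
    return a, my - a * mg


def fit_two_stage(X, y, sigma):
    """Fit the linear stage on leave-one-out GRNN outputs."""
    # Each output is computed without its own sample, so the linear stage
    # is not trained on predictions that already contain the target.
    g = [
        grnn_predict(X[i], X[:i] + X[i + 1:], y[:i] + y[i + 1:], sigma)
        for i in range(len(X))
    ]
    return fit_line(g, y)


def predict_two_stage(x, X, y, sigma, a, b):
    """Apply the linear correction to the GRNN output for a new point x."""
    return a * grnn_predict(x, X, y, sigma) + b
```

The GRNN itself needs no iterative training, so the only fitted quantities in this sketch are the two coefficients of the linear stage.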


    The subject matter of the article is fuzzy clustering of high-dimensional data based on the ensemble approach when the number and shape of clusters are not known. The goal of the work is to create a neuro-fuzzy approach for clustering data that arrive as a stream for online processing when the number and shape of clusters are unknown. The following tasks are solved in the article: the input feature space is compressed in online mode; a model of neural network ensembles for data clustering is built; an ensemble of neuro-fuzzy networks for clustering high-dimensional data is developed; an approach for clustering data in online mode is worked out. The following results are obtained. The main idea of the proposed approach is based on a modification of the fuzzy C-means algorithm. To reduce the dimension of the input space, a modified Hebb-Sanger network is proposed; this network is faster and is built from modified Oja neurons. A speed-optimized learning algorithm for the Oja neuron is proposed. Such a network implements principal component analysis in online mode at high speed. Conclusions. For cases where the reduction-compression procedure cannot be used because the physical meaning of the original space might be lost, a new clustering criterion was introduced; this criterion contains both the well-known polynomial fuzzifier and a weighting of the individual components of the deviations of the presented patterns from the cluster centroids. A recurrent modification based on the algorithms proposed in this article is introduced. A mathematical model is developed to assess clustering quality using the Xie-Beni index, modified for online mode.
The experimental results confirm that the proposed system can solve a wide range of data mining tasks when data sets are processed online, the number and shape of clusters are not known, and the number of observations is large.
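For orientation, the standard batch forms of two quantities mentioned in this abstract, the fuzzy C-means membership with a polynomial fuzzifier `m` and the Xie-Beni index, can be sketched as below. This is the textbook formulation only: the online modification of the index and the component-wise weighting introduced in the article are not reproduced, and the function names are illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def fcm_memberships(x, centroids, m=2.0):
    """Fuzzy C-means membership of x in each cluster (values sum to 1)."""
    d = [sq_dist(x, c) for c in centroids]
    # A point lying exactly on a centroid belongs fully to that cluster.
    if 0.0 in d:
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)  # exponent applied to squared distances
    inv = [(1.0 / di) ** power for di in d]
    total = sum(inv)
    return [v / total for v in inv]


def xie_beni(X, centroids, m=2.0):
    """Xie-Beni index: fuzzy compactness over n times the minimum separation."""
    compactness = 0.0
    for x in X:
        u = fcm_memberships(x, centroids, m)
        compactness += sum(
            (ui ** m) * sq_dist(x, c) for ui, c in zip(u, centroids)
        )
    # Smallest squared distance between any two distinct centroids.
    separation = min(
        sq_dist(ci, cj)
        for i, ci in enumerate(centroids)
        for j, cj in enumerate(centroids)
        if i < j
    )
    return compactness / (len(X) * separation)
```

Lower values of the index indicate compact, well-separated clusters, so it can be used to compare candidate partitions of the same data.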