1,611 research outputs found

    3rd Workshop in Symbolic Data Analysis: book of abstracts

    This workshop is the third regular meeting of researchers interested in Symbolic Data Analysis. The main aim of the event is to bring people together and encourage the exchange of ideas from the different fields that contribute to Symbolic Data Analysis, including Mathematics, Statistics, Computer Science, Engineering, and Economics, among others.

    Verentarpeen ennustaminen: Havaintoja ja toteutus (Forecasting Blood Demand: Findings and Implementation)

    Reliable blood supply chains are critically important for modern medicine. However, blood inventories are perishable, which frames the issue as an inventory management problem with separable supply and demand components. Inventory management can be improved via multiple avenues, but reliable demand estimation is among the most powerful ones, as it helps the parties involved in blood collection to scale the collection to projected demand, thus reducing the number of outdated units and alleviating shortages. The Finnish Red Cross Blood Service (FRCBS) is responsible for maintaining the blood supply chain in Finland. Currently, operational-level (donor mobilization) estimates of demand are created weekly using in-house expertise, and planning-level (budgeting) estimates are produced as machine-generated statistical forecasts. This thesis aimed to examine the historical performance of the statistical forecasts used for budgeting and to investigate whether they could be improved and expanded to monthly and weekly forecasts for different types of red blood cells. The work consisted of reviewing the published literature on short-term and long-term blood demand forecasting, examining the available data, establishing appropriate metrics for evaluation, and testing alternative models. We find that the mean absolute percentage error of the current forecasting methods can be improved by 22.2% with an additional data preprocessing step and by 50.1% by changing to a better model. The temporal resolution of forecasting was improved by changing the data source. We also discovered that the nature of the blood demand signal changes significantly around 2017, underlining the need to develop forecasting systems that can adapt to changes. Our final implementation is built into an R Markdown file that outputs an easily accessible HTML report.
Further exploration is warranted, especially if the aim is to use forecasting operationally someday.
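
The abstract above reports accuracy as mean absolute percentage error (MAPE). The following is a minimal sketch of how MAPE and a relative improvement between two forecasts can be computed. It assumes the reported percentages are relative reductions in MAPE, which the abstract does not state explicitly. The function names and all numbers are hypothetical and are not taken from the thesis.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is divided by it.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical monthly demand and two hypothetical forecasts (not thesis data).
actual = [100.0, 120.0, 80.0, 110.0]
baseline_forecast = [90.0, 132.0, 88.0, 99.0]
improved_forecast = [95.0, 126.0, 84.0, 104.5]

baseline_mape = mape(actual, baseline_forecast)
improved_mape = mape(actual, improved_forecast)
print(f"baseline MAPE: {baseline_mape:.1f}%")
print(f"improved MAPE: {improved_mape:.1f}%")
print(f"relative improvement: {relative_improvement(baseline_mape, improved_mape):.1f}%")
```

Because MAPE divides by the observed value, it is undefined for periods with zero demand, which matters more at weekly resolution than at monthly resolution.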

    Quantifying Forecast Uncertainty in the Energy Domain

    This dissertation focuses on quantifying forecast uncertainties in the energy domain, especially for the electricity and natural gas industries. Accurate forecasts help the energy industry minimize its production costs. However, inaccurate weather forecasts, unusual human behavior, sudden changes in economic conditions, unpredictable availability of renewable sources (wind and solar), etc., represent uncertainties in the energy demand-supply chain. In the current smart grid era, total electricity demand from non-renewable sources is influenced by the uncertainty of the renewable sources. Thus, quantifying forecast uncertainty has become important for improving the quality of forecasts and decision making. In the natural gas industry, the task of the gas controllers is to guide the hourly natural gas flow so that it remains within certain daily maximum and minimum flow limits to avoid penalties. Due to inherent uncertainties in the natural gas forecasts, setting such maximum and minimum flow limits a day or more in advance is difficult. Probabilistic forecasts (cumulative distribution functions), which quantify forecast uncertainty, are a useful tool to help gas controllers make such tough decisions. Three methods (parametric, semi-parametric, and non-parametric) are presented in this dissertation to generate 168-hour horizon probabilistic forecasts for two real utilities (electricity and natural gas) in the US. Probabilistic forecasting is used as a tool to solve a real-life problem in the natural gas industry. A benchmark was created based on the existing solution, which assumes that the forecast error is normal. Two new probabilistic forecasting methods are implemented in this work without the normality assumption. There is no single popular evaluation technique available to assess probabilistic forecasts, which is one reason for people’s lack of interest in using probabilistic forecasts.
Existing scoring rules are complicated, dataset dependent, and place less emphasis on reliability (the empirical distribution matches the observed distribution) than on sharpness (the smallest distance between any two quantiles of a CDF). A graphical way to evaluate probabilistic forecasts, along with two new scoring rules, is offered in this work. The non-parametric and semi-parametric probabilistic forecasting methods outperformed the benchmark method on unusual days (days that are difficult to forecast) as well as on other days.
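
The benchmark described above assumes normally distributed forecast errors. As a rough illustration only, the sketch below builds quantile forecasts under that assumption and checks reliability as empirical coverage (the share of observations at or below a nominal quantile). Sharpness is approximated here by the width between two quantiles, which is a simplification of the definition given in the abstract. The function names, the error standard deviation, and the data are hypothetical; this is not the dissertation's method or one of its new scoring rules.

```python
from statistics import NormalDist


def normal_quantile_forecast(point_forecast, error_std, level):
    """Quantile of a forecast whose error is assumed normal with the given std."""
    return NormalDist(mu=point_forecast, sigma=error_std).inv_cdf(level)


def empirical_coverage(observations, point_forecasts, error_std, level):
    """Share of observations at or below the nominal `level` quantile.

    For a reliable forecast this share should be close to `level`.
    """
    hits = sum(
        obs <= normal_quantile_forecast(pf, error_std, level)
        for obs, pf in zip(observations, point_forecasts)
    )
    return hits / len(observations)


def interval_width(error_std, lower=0.05, upper=0.95):
    """Width between two quantiles; narrower intervals mean sharper forecasts."""
    dist = NormalDist(mu=0.0, sigma=error_std)
    return dist.inv_cdf(upper) - dist.inv_cdf(lower)


# Hypothetical hourly flows and point forecasts (not dissertation data).
observations = [9.0, 11.0, 10.5, 8.0]
point_forecasts = [10.0, 10.0, 10.0, 10.0]

coverage = empirical_coverage(observations, point_forecasts, error_std=1.0, level=0.5)
print(f"coverage at the 0.5 quantile: {coverage:.2f}")
print(f"90% interval width: {interval_width(1.0):.2f}")
```

Repeating the coverage check over a grid of nominal levels and plotting nominal against empirical values gives a simple reliability diagram.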

    Creation of a web application for smart park system and parking prediction study for system integration

    In the last few years, information systems have grown exponentially, and parking information systems are one example. Reliable, up-to-date information on parking slot availability is very important for reducing traffic. Parking slot prediction is also a new topic that has already begun to be applied; San Francisco in the United States and Santander in Spain are examples of projects carried out to obtain this kind of information. The aim of this thesis is to study and evaluate methodologies for parking slot prediction and to integrate them into a web application where all kinds of users can see the current parking status as well as the future status according to the predictions of a parking model. The source of the data is ancillary to this work, but it still needs to be understood in order to understand parking behaviour. Many modelling techniques are used for this purpose, such as time series analysis, decision trees, neural networks and clustering. In this work, the author explains the techniques best suited to this problem, analyzes the results, and points out the advantages and disadvantages of each one. The model learns the periodic and seasonal patterns of parking status behaviour and, with this knowledge, can predict future status values for a given date. The data come from Smart Park Ontinyent and consist of parking occupancy status with timestamps, stored in a database. After data acquisition, data analysis and pre-processing were needed before the models could be implemented. The first test used a boosting ensemble classifier over a set of decision trees, created with the C5.0 algorithm from a set of training samples, to assign a prediction value to each object. In addition to the predictions, this work provides error measurements that indicate how reliable the predictions are.
The second test used the tbats function to fit a seasonal exponential smoothing model. As a final test, a combination of the previous two models was tried to see how it would perform. The results were quite good for all of them, with average errors of 6.2, 6.6 and 5.4 in vacancy predictions for the three models respectively. For a car park with 47 places, this means a 10% average error in parking slot predictions. This result could be even better if data covering a longer period were available. To make this information visible and accessible to anyone with an internet-connected device, a web application was built. Besides displaying the data, the application offers several functions that make searching for parking easier. The new functions, apart from parking prediction, are: (1) park distances from the user location, which gives the distances from the user's current location to the different car parks in the city; (2) geocoding, the service for matching a literal description or an address to a concrete location; (3) geolocation, the service for positioning the user; and (4) a parking list panel, which is neither a service nor a function but simply a better visualization and handling of the information.
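
The abstract does not say how the two models were combined. As a sketch under the assumption that the combination is a plain average of the two models' vacancy predictions, the code below computes the mean absolute error of each model and of the average, and expresses an error as a share of the 47-place capacity mentioned above. The function names and the data are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted vacancies."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and equally long")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def combine_by_average(pred_a, pred_b):
    """Combine two forecasts by averaging them (an assumed combination rule)."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


def error_as_share_of_capacity(error, capacity):
    """Express an error in vacancies as a percentage of the car park capacity."""
    return 100.0 * error / capacity


# Hypothetical vacancies and predictions (not thesis data).
actual = [20, 15, 30, 10]
model_a = [24, 13, 27, 14]
model_b = [18, 19, 31, 8]
combined = combine_by_average(model_a, model_b)

for name, pred in [("model A", model_a), ("model B", model_b), ("combined", combined)]:
    mae = mean_absolute_error(actual, pred)
    share = error_as_share_of_capacity(mae, capacity=47)
    print(f"{name}: MAE {mae:.2f} vacancies, {share:.1f}% of capacity")
```

In this made-up data the two models err in opposite directions, so the average beats both; a combination only helps when the individual errors partly cancel.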

    Machine Learning Tools in the Predictive Analysis of ERCOT Load Demand Data

    The electric load industry has seen a significant transformation over the last few decades, culminating in the establishment and implementation of electricity markets. This transition separates electric generation services into a distinct, more competitive sector of the industry, which introduces greater unpredictability into the system. Forecasting power system load has developed into a core research area in power and energy demand engineering, because a constant balance between electricity supply and demand must be maintained. The purpose of this thesis is to reduce power system uncertainty by improving forecasting accuracy through the use of sophisticated machine learning techniques. Additionally, this research provides machine learning-based forecasting methodologies for the three forecasting professions from a variety of perspectives, incorporating several deep learning configurations such as Naïve/default, Hyperparameter Tuning, and Custom Early Stopping. We begin by creating long short-term memory (LSTM) and gated recurrent unit (GRU) models for ERCOT demand data, and then compare them to some of the most well-known supervised machine learning models, such as ARIMA and SARIMA, to identify the best set of models for long- and short-term load forecasting. We will also use multiple comparison approaches, such as the radar chart and the Pygal radar chart, to perform a thorough evaluation of each of the deep learning models before settling on the best model.
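
The abstract mentions a custom early stopping configuration without describing its rule. The sketch below shows a generic patience-based rule, assumed here for illustration only: training stops once the validation loss has failed to improve for `patience` consecutive epochs. The function name, parameters, and loss values are hypothetical and are not taken from the thesis.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    An epoch counts as an improvement only if it lowers the best loss so far
    by more than `min_delta`. Training stops after `patience` consecutive
    epochs without improvement, or at the last epoch if that never happens.
    """
    if not val_losses:
        raise ValueError("val_losses must not be empty")
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses per epoch.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.6]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(f"stopped at epoch {stop_epoch}, best weights from epoch {best_epoch}")
```

With a larger `patience` the same loss curve would run to the final epoch and pick up the late improvement, which is the usual trade-off between training time and the risk of stopping too soon.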

    Short-Term Passenger Demand Forecasting Using Univariate Time Series Theory

    The purpose of the paper is to identify a model of the passenger demand time series for suburban bus transport whose parameters are statistically significant and whose residuals are random, and to analyse its forecasting performance. Box-Jenkins, exponential smoothing and multiple linear regression models are used in order to design a more accurate and reliable model than the ones used nowadays. The forecasting accuracy of the models is evaluated by comparing the mean absolute percentage errors of the different forecasting approaches. In accordance with the main goal of the paper, an ARIMA model was identified that fulfils almost all statistical criteria, with the exception of the normality of the model residuals. In spite of this limitation, the identified model was shown to have the best forecasting ability of the approaches compared in the paper. The published findings are expected to help increase the accuracy of passenger demand forecasting.
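
Exponential smoothing is one of the model families named above. As a minimal sketch, assuming the simplest variant (simple exponential smoothing with no trend or seasonality, which may differ from what the paper used), the code below produces one-step-ahead forecasts. The function name, the smoothing parameter, and the passenger counts are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    Returns (forecasts, next_forecast). forecasts[t] is the forecast for
    period t made before observing series[t]; the first forecast is
    initialised to the first observation. next_forecast is the forecast for
    the period after the end of the series.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    forecasts = []
    for value in series:
        forecasts.append(level)
        # Move the level towards the new observation by a fraction alpha.
        level = alpha * value + (1 - alpha) * level
    return forecasts, level


# Hypothetical daily passenger counts.
passengers = [10.0, 12.0, 14.0]
forecasts, next_forecast = simple_exponential_smoothing(passengers, alpha=0.5)
print("in-sample forecasts:", forecasts)
print("forecast for the next period:", next_forecast)
```

A small `alpha` reacts slowly to new observations and smooths heavily, while `alpha` close to 1 makes the forecast track the most recent value.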

    Lijst van Engelse statistische termen (List of English Statistical Terms)
