983 research outputs found

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore's Law. Big Data analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools for understanding the huge volumes of accumulated data and for leveraging this information to gain insights into the finance industry. Because the finance industry manufactures no physical products, data has become the most valuable asset of financial organisations seeking actionable business insights. Data mining techniques help here by allowing access to the right information at the right time. The finance industry uses these techniques in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering detection, marketing and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. The review also finds that hybrid methods are more accurate in prediction, closely followed by neural network techniques. This survey offers an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area. In particular, it offers a good view of data mining techniques in computational finance for beginners who want to work in the field.

    Quantifying Forecast Uncertainty in the Energy Domain

    This dissertation focuses on quantifying forecast uncertainties in the energy domain, especially for the electricity and natural gas industries. Accurate forecasts help the energy industry minimize its production costs. However, inaccurate weather forecasts, unusual human behavior, sudden changes in economic conditions, the unpredictable availability of renewable sources (wind and solar), and similar factors introduce uncertainties into the energy demand-supply chain. In the current smart grid era, total electricity demand from non-renewable sources is influenced by the uncertainty of the renewable sources. Quantifying forecast uncertainty has therefore become important for improving the quality of forecasts and decision making. In the natural gas industry, the task of the gas controllers is to guide the hourly natural gas flow so that it remains within certain daily maximum and minimum flow limits to avoid penalties. Due to inherent uncertainties in natural gas forecasts, setting such maximum and minimum flow limits a day or more in advance is difficult. Probabilistic forecasts (cumulative distribution functions), which quantify forecast uncertainty, are a useful tool to guide gas controllers in making such tough decisions. Three methods (parametric, semi-parametric, and non-parametric) are presented in this dissertation to generate 168-hour horizon probabilistic forecasts for two real utilities (electricity and natural gas) in the US. Probabilistic forecasting is used as a tool to solve a real-life problem in the natural gas industry. A benchmark was created based on the existing solution, which assumes that the forecast error is normal. Two new probabilistic forecasting methods are implemented in this work without the normality assumption. There is no single popular evaluation technique available to assess probabilistic forecasts, which is one reason for the lack of interest in using them. Existing scoring rules are complicated, dataset dependent, and place less emphasis on reliability (the empirical distribution matches the observed distribution) than on sharpness (the smallest distance between any two quantiles of a CDF). A graphical way to evaluate probabilistic forecasts, along with two new scoring rules, is offered in this work. The non-parametric and semi-parametric probabilistic forecasting methods outperformed the benchmark method on unusual days (days that are difficult to forecast) as well as on other days.
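
    As a rough illustration of the benchmark idea described in this abstract (forecast errors assumed normal) and of the two qualities it names, the following minimal Python sketch builds forecast quantiles from past errors and then checks an interval's empirical coverage and width. It is not the dissertation's method: the function names, the definition of error as observed minus forecast, and the use of mean interval width as a sharpness proxy are all assumptions made for the example.

```python
from statistics import NormalDist, mean, stdev


def normal_quantile_forecast(point_forecast, past_errors, probs):
    """Quantiles of the forecast distribution, ASSUMING normal forecast errors.

    Errors are taken as (observed - forecast); this sign convention is an
    assumption of the sketch.
    """
    error_dist = NormalDist(mean(past_errors), stdev(past_errors))
    return {p: point_forecast + error_dist.inv_cdf(p) for p in probs}


def empirical_coverage(lowers, uppers, observed):
    """Fraction of observations inside their forecast interval (reliability check)."""
    hits = sum(lo <= y <= hi for lo, hi, y in zip(lowers, uppers, observed))
    return hits / len(observed)


def mean_interval_width(lowers, uppers):
    """Average interval width, used here as a simple (assumed) sharpness proxy."""
    return mean(hi - lo for lo, hi in zip(lowers, uppers))
```

    Under these assumptions, an interval built from the 0.1 and 0.9 quantiles would be called reliable if `empirical_coverage` stays close to 0.8 over many days, and sharper if `mean_interval_width` is smaller at the same coverage.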

    Air pollution forecasts: An overview

    © 2018 by the authors. Licensee MDPI, Basel, Switzerland. Air pollution is defined as a phenomenon that is harmful to the ecological system and to the normal conditions of human existence and development, arising when some substances in the atmosphere exceed a certain concentration. In the face of increasingly serious environmental pollution problems, scholars have conducted a significant amount of related research, in which the forecasting of air pollution has been of paramount importance. The air pollution forecast is the basis for taking effective pollution control measures, so accurate forecasting of air pollution has become an important task. Extensive research indicates that the methods of air pollution forecasting can be broadly divided into three classical categories: statistical forecasting methods, artificial intelligence methods, and numerical forecasting methods. More recently, some hybrid models have been proposed, which can improve forecast accuracy. To provide a clear perspective on air pollution forecasting, this study reviews the theory and application of those forecasting models. In addition, based on a comparison of different forecasting methods, the advantages and disadvantages of some of the methods are also provided. This study aims to provide an overview of air pollution forecasting methods for easy access and reference by researchers, which will be helpful in further studies.

    Flood Forecasting Using Machine Learning Methods

    This book is a printed edition of the Special Issue Flood Forecasting Using Machine Learning Methods that was published in Wate

    A fuzzy theory-based machine learning method for workdays and weekends short-term load forecasting

    Countries around the globe have introduced renewable energies (RE) and reduced the dependency of power systems on fossil resources to address extensive environmental risks. However, such large-scale energy transitions pose a great challenge to power systems due to the volatility of RE. Meanwhile, power demand is increasing over time and shows temporal characteristics, such as seasonal and peak-valley patterns. Whether the future power system, with a larger proportion of RE, can meet the surging but fluctuating electricity demand remains an open question. Previous studies on short-term load forecasting focused more on forecasting accuracy than on stability. Further, there is a relative paucity of research into temporal patterns. To fill these research gaps, this paper proposes a fuzzy theory-based machine learning model for short-term load forecasting on workdays and weekends. Fuzzy time series (FTS) is applied for data mining, and a back propagation (BP) neural network is used as the main predictor for short-term load forecasting. To exploit the trade-offs between forecasting stability and accuracy, multi-objective optimization is applied to tune the parameters of the BP network. Moreover, an interval forecasting architecture with several statistical tests is constructed to address forecasting uncertainties. Short-term load data from Victoria, Australia, are selected as a case study. Results demonstrate that the proposed method can significantly boost forecasting stability and accuracy, and can support strategy making in the field of energy and electricity system management and planning. (c) 2021 The Author. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). Industrial Ecolog

    A Clustering-Based Hybrid Support Vector Regression Model to Predict Container Volume at Seaport Sanitary Facilities

    An accurate prediction of freight volume at the sanitary facilities of seaports is a key factor in improving planning operations and resource allocation. This study proposes a hybrid approach to forecast container volume at the sanitary facilities of a seaport. The methodology consists of a three-step procedure that combines the strengths of linear and non-linear models with the capability of a clustering technique. First, a self-organizing map (SOM) is used to decompose the time series into smaller clusters that are easier to predict. Second, a seasonal autoregressive integrated moving average (SARIMA) model is applied to each cluster to obtain its predicted values and residuals. Finally, these values are used as inputs to a support vector regression (SVR) model, together with the historical data of the cluster. The final prediction integrates the prediction results of each cluster. The experimental results showed that the proposed model provided accurate predictions and outperformed the other models tested. The proposed model can be used as an automatic decision-making tool by seaport management, since it allows resources to be planned in advance, avoiding congestion and time delays.

    Leveraging Artificial Neural Networks for Modeling Hydrogeological Time Series

    In addressing global challenges such as the sustainable management and use of available groundwater resources, the development of new, efficient and easily transferable modeling approaches is of crucial importance. Artificial neural networks (ANNs) are particularly well suited to this: as machine learning methods, they can independently learn and exploit relevant relationships from larger datasets of suitable parameters. This thesis investigates the use of ANNs for modeling and forecasting hydrogeological time series. In four studies, which form the main part of the thesis, various research questions are developed and their solvability using ANNs is demonstrated. Clustering hydrographs is one way to identify spatial and temporal patterns in groundwater dynamics. This is important for characterizing aquifers, identifying influencing factors and developing effective management methods. For these reasons, the first study develops a clustering method based on self-organizing maps that groups groundwater hydrographs with similar dynamics within heterogeneous datasets. To characterize groundwater dynamics, the method uses so-called features, which also allow hydrographs of variable data quality to be processed. Its application is successfully demonstrated in the Upper Rhine Graben in Germany and France using a dataset of about 1800 weekly hydrographs. An analysis of the clustering results shows that external influencing factors overlap in complex ways in space and time and often cannot be separated. Nevertheless, some clusters can be clearly attributed to external factors (e.g. groundwater management). This is followed by a detailed comparison of different ANN models for groundwater level forecasting. The models examined are nonlinear autoregressive models with exogenous inputs (NARX), long short-term memory networks (LSTM) and convolutional neural networks (CNN), each for both single-value and sequence forecasts. Only a few meteorological parameters are used as input data, but these are widely available and easy to measure, which ensures the broad transferability of the approach. The results show that all model types have fundamentally good forecasting properties and that NARX models generally produce the most precise forecasts, closely followed by CNNs. For practical applicability, CNNs show the greatest overall potential, since they depend less on the pseudo-random network initialization than NARX models and compute many times faster than both recurrent alternatives. CNNs nevertheless achieve high accuracy and are flexible to implement. CNNs therefore form the basis for the further research questions investigated. The subsequent study examines the development of groundwater levels in Germany in the context of climate change. Using CNNs with temperature and precipitation from three climate scenarios (RCP2.6, 4.5 and 8.5), future groundwater levels at 118 selected monitoring sites in Germany are modeled and the direct influence of the future climate is estimated. Important secondary factors such as anthropogenic influences are, however, not included in the simulations. Under RCP8.5 (the pessimistic scenario), widespread and pronounced declines in groundwater levels are to be expected, with a spatial pattern of stronger decreases mainly in northern and eastern Germany. The results for the more optimistic scenarios RCP2.6 and RCP4.5 also show decreasing trends, but with comparatively less significant changes. This highlights the positive influence of reduced greenhouse gas emissions, although even for the most optimistic scenario, RCP2.6, some projections show declining groundwater levels across Germany. Finally, the thesis focuses on karst spring discharge. For modeling, the existing CNN approaches are used and, in addition, a 2D approach, likewise based on CNNs, is developed that allows gridded spatial data to be processed directly as inputs. In many cases this solves the problem of insufficient availability of meteorological input data in the catchment. Both approaches show very good results in all test areas and in some cases outperform existing models. The direct comparison between the conventional and the spatial modeling approach does not allow a final judgment on the superiority of either approach in terms of accuracy. The spatial and temporal completeness of the input data is, however, a major advantage of the spatial approach. Furthermore, the spatial approach shows potential for localizing and, given appropriate data availability and further development of the approach, also for delineating spring catchments in karst.
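
    The distinction this abstract draws between single-value and sequence forecasts can be pictured as two ways of pairing a window of meteorological inputs with groundwater-level targets. The Python sketch below is only an illustration under stated assumptions: the function name `make_windows`, the `lookback` and `horizon` parameters, and the choice to start the targets directly after the input window are hypothetical and not taken from the thesis.

```python
def make_windows(inputs, targets, lookback, horizon=1):
    """Pair each window of `lookback` input rows with the next `horizon` targets.

    inputs  : list of per-time-step feature rows (e.g. [precipitation, temperature])
    targets : list of per-time-step values to forecast (e.g. groundwater level)
    horizon : 1 gives a single-value target, >1 gives a sequence target
    All names and the alignment of inputs to targets are assumptions of this sketch.
    """
    samples = []
    last_start = len(inputs) - lookback - horizon + 1
    for start in range(last_start):
        end = start + lookback
        window = inputs[start:end]
        future = targets[end:end + horizon]
        # Single-value forecasts keep a scalar target, sequence forecasts a list.
        samples.append((window, future[0] if horizon == 1 else future))
    return samples
```

    With `horizon=1` each sample carries one target value; with a larger `horizon` each sample carries a list of consecutive values, which is the shape a sequence-forecasting model would be trained on.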

    Delineation of precipitation areas from MODIS visible and infrared imagery with artificial neural networks

    An important phase in a nowcasting system is the diagnosis of the forecast variables. This work focuses on the diagnosis of precipitation. The Nimrod automatic nowcasting system at the Met Office (UK) has long used Meteosat visible and infrared data to supplement the data it receives from the UK weather radar network to produce rainfall analyses. With the advent of Meteosat Second Generation (MSG) attention has focused on how best to use the larger range of spectral information from MSG to improve the rainfall analyses. Earlier work at the Met Office had suggested artificial neural networks (ANNs) to be a useful tool for such applications. Pending the availability of data from MSG, ANNs were used to process data from appropriate visible and infrared channels on the MODIS instrument. Sixty daytime winter cases were collected, and Nimrod radar rainfall analyses provided 'ground truth' for both training and testing the ANNs. The optimal combination of MODIS channels was investigated and it was found that almost all the skill in rain/no rain discrimination was provided by the radiance values from six selected spectral channels. A notable result was that the 1.64 µm channel had no value as a discriminator when used alone, but produced a large increase in skill when used in conjunction with a visible channel. The ANN with MODIS data was found to outperform the corresponding Nimrod look-up table technique applied to Meteosat data. Application of the technique to SEVIRI data is proposed, as is extension to other seasons. Copyright © 2005 Royal Meteorological Societ