6 research outputs found

    Abnormal event detection by using data mining and machine learning methods : modelling normality and anomalies

    No full text
    This diploma thesis discusses several approaches for detecting unusual and extraordinary events, which merit separate and specialised investigation, as early as possible in a time series of regularly recorded data and, where possible, for predicting them. The task, which is introduced in more detail in the opening chapter together with several application areas in which it arises, is approached with a hybrid of methods from data mining, machine learning and soft computing. Following a short introduction to the basics of time series and the associated statistical prediction models, these terms are delimited more precisely before the focus turns to the individual component methods. A presentation of tools for outlier and novelty detection is followed by a concluding discussion of the simulation results obtained in the project from which this thesis arose. The text ends with an outlook on possible model extensions and future work.

    Comparison of Prediction Models for Delays of Trains by using Data Mining and Machine Learning Methods

    No full text
    On the one hand, a tight schedule is desirable and very efficient for freight transport companies. On the other hand, a tight schedule increases the impact of delays and cancellations. Furthermore, predicting delays is extremely complex because they depend on many influencing factors. To address these issues, this work presents an approach to forecasting train delays using data mining and machine learning methods. For this purpose, an international rail freight transport company provided us with a large amount of historical data on freight and passenger train runs. To obtain a suitable prediction model, we apply a knowledge discovery in databases (KDD) process, which consists of the steps data selection, data preprocessing, data transformation, data mining and interpretation/evaluation. After the data selection and data preprocessing steps, we transform categorical features via a one-hot encoder as well as via embeddings with various embedding sizes. Furthermore, we present a transformation method for cyclical continuous features such as weekday. In the actual data mining step, we use the prepared historical data to perform a regression analysis that forecasts train delays, and we compare several regression models: decision tree, random forest, extra trees and gradient boosting regression. An adequate prediction model will be integrated into an agent-based model that tests the robustness of train networks.
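    The abstract above mentions a transformation for cyclical features such as weekday but does not say which one is used. A minimal sketch of one common choice, sine/cosine encoding, is given below; the function names `encode_cyclical` and `circle_distance` are hypothetical and this is not necessarily the authors' method.

```python
import math


def encode_cyclical(value, period):
    """Map a cyclical value (e.g. weekday 0..6, period 7) to (sin, cos).

    Assumption: sine/cosine encoding is used here as an illustration only;
    the abstract does not name the transformation the authors applied.
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def circle_distance(a, b, period):
    """Euclidean distance between two encoded cyclical values."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


if __name__ == "__main__":
    # Weekdays numbered 0 (Monday) to 6 (Sunday), an assumed convention.
    for day in range(7):
        s, c = encode_cyclical(day, 7)
        print(day, round(s, 3), round(c, 3))
```

    With this encoding, Sunday (6) and Monday (0) end up as close to each other as any other pair of adjacent days, which a plain integer code 0..6 would not reflect.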

    More than Bags of Words : Sentiment Analysis with Word Embeddings

    No full text
    Moving beyond the dominant bag-of-words approach to sentiment analysis, we introduce an alternative procedure based on distributed word embeddings. The strength of word embeddings is their ability to capture similarities in word meaning. We use word embeddings as part of a supervised machine learning procedure that estimates levels of negativity in parliamentary speeches. The procedure’s accuracy is evaluated with crowd-coded training sentences, and its external validity through a study of patterns of negativity in Austrian parliamentary speeches. The results show the potential of the word embeddings approach for sentiment analysis in the social sciences.

    Calculating Shadows with U-Nets for Urban Environments (Short Paper)

    No full text
    Shadow calculation is an important prerequisite for many urban and environmental analyses, such as the assessment of solar energy potential. We propose a neural network approach that can be trained with 3D geographical information and predicts the presence and depth of shadows. We adapt a U-Net algorithm traditionally used in biomedical image segmentation and train it on sections of Styria, Austria. Our two-step approach first predicts the binary existence of shadows and then estimates their depth. Our results for the Styria case study show that both models predict shadows with over 80% accuracy, which is satisfactory for real-world applications but still leaves room for improvement.

    Methods for integrated simulation: 10 concepts to integrate

    No full text
    This note summarises the current status of the work of EUROSIM's and ASIM's Technical Committees "Data Driven System Simulation", with the main emphasis on Big Data integration in simulation. The overview suggests ten developed concepts and methods that should be considered, implemented and documented in modern simulation studies with Big Data.

    More than Bags of Words: Sentiment Analysis with Word Embeddings

    No full text
    Moving beyond the dominant bag-of-words approach to sentiment analysis, we introduce an alternative procedure based on distributed word embeddings. The strength of word embeddings is their ability to capture similarities in word meaning. We use word embeddings as part of a supervised machine learning procedure that estimates levels of negativity in parliamentary speeches. The procedure’s accuracy is evaluated with crowd-coded training sentences, and its external validity through a study of patterns of negativity in Austrian parliamentary speeches. The results show the potential of the word embeddings approach for sentiment analysis in the social sciences. © 2018 Elena Rudkowsky, Martin Haselmayer, Matthias Wastian, Marcelo Jenny, Štefan Emrich and Michael Sedlmai