
    Neural Networks for Complex Data

    Artificial neural networks are simple and efficient machine learning tools. Defined originally in the traditional setting of simple vector data, neural network models have evolved to address an increasing range of difficulties of complex real-world problems, ranging from time-evolving data to sophisticated data structures such as graphs and functions. This paper summarizes advances on those themes from the last decade, with a focus on results obtained by members of the SAMM team of Université Paris

    A Two-stage Architecture for Stock Price Forecasting by Integrating Self-Organizing Map and Support Vector Regression

    Stock price prediction has attracted much attention from both practitioners and researchers. However, most studies in this area have ignored the non-stationary nature of stock price series. That is, stock price series do not exhibit identical statistical properties at each point in time. As a result, the relationships between stock price series and their predictors are quite dynamic. It is challenging for any single artificial technique to effectively address these problematic characteristics of stock price series. One potential solution is to hybridize different artificial techniques. To this end, this study employs a two-stage architecture for better stock price prediction. Specifically, the self-organizing map (SOM) is first used to decompose the whole input space into regions where data points with similar statistical distributions are grouped together, so as to contain and capture the non-stationary property of financial series. After decomposing heterogeneous data points into several homogeneous regions, support vector regression (SVR) is applied to forecast financial indices. The proposed technique is empirically tested using stock price series from seven major financial markets. The results show that the performance of stock price prediction can be significantly enhanced by using the two-stage architecture in comparison with a single SVR model.
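As a rough illustration of the two-stage idea described above (not the paper's implementation), the sketch below partitions lagged windows of a synthetic series with a minimal hand-rolled 1-D SOM and then fits one predictor per region; a plain least-squares regressor stands in for SVR so the example needs only numpy, and all data and parameters are made up:

```python
import numpy as np

class MiniSOM1D:
    """Minimal 1-D self-organizing map (illustrative sketch only)."""
    def __init__(self, n_units, dim, lr=0.5, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_units, dim))
        self.lr, self.sigma = lr, sigma

    def bmu(self, x):
        # index of the best-matching unit for input vector x
        return int(np.argmin(((self.w - x) ** 2).sum(axis=1)))

    def train(self, X, epochs=20):
        for t in range(epochs):
            lr = self.lr * (1 - t / epochs)          # decaying learning rate
            for x in X:
                b = self.bmu(x)
                d = np.arange(len(self.w)) - b       # grid distance to BMU
                h = np.exp(-(d ** 2) / (2 * self.sigma ** 2))
                self.w += lr * h[:, None] * (x - self.w)

# Stage 1: SOM partitions lagged windows of a synthetic "price" series.
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=500))             # random-walk stand-in
lag = 5
X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
y = series[lag:]                                     # one-step-ahead target

som = MiniSOM1D(n_units=4, dim=lag)
som.train(X)
regions = np.array([som.bmu(x) for x in X])

# Stage 2: one regressor per homogeneous region (least squares stands
# in here for SVR to keep the sketch dependency-free).
models = {}
for r in np.unique(regions):
    mask = regions == r
    A = np.c_[X[mask], np.ones(mask.sum())]          # add intercept column
    models[r] = np.linalg.lstsq(A, y[mask], rcond=None)[0]

def predict(x):
    # route the query window to its region's model
    return float(np.r_[x, 1.0] @ models[som.bmu(x)])

print(round(predict(X[-1]), 3))
```

In a closer reproduction of the paper, stage 2 would fit an SVR (e.g. with an RBF kernel) on each region instead of the linear stand-in.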

    Decision support with data-analysis methods in a nuclear power plant

    Early fault detection is an important issue in the nuclear industry. Methods based on the self-organizing map (SOM) for dynamic systems are discussed and developed to help operators and plant experts in their decision making, and are used together with other methods. Visualization plays an important role in this research. Prototype systems are built to test the basic principles. Five different studies are presented in detail. This report summarizes test case 4 (TC4), "Decision support at a nuclear power plant", in the NoTeS and NoTeS2 projects of the TEKES MASI research program.

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore's Law. Big Data analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools for understanding the vast accumulated data and leveraging this information to gain insights into the finance industry. Because there are no physical products to manufacture in the finance industry, data has become the most valuable asset of financial organisations for obtaining actionable business insights. This is where data mining techniques come to the rescue, allowing access to the right information at the right time. These techniques are used by the finance industry in areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering detection, marketing and prediction of price movements, to name a few. This work surveys research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared with loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of the data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been shown that hybrid methods are more accurate in prediction, closely followed by neural network techniques. This survey provides an overview of applications of data mining techniques in the finance industry and a summary of methodologies for researchers in this area; in particular, it offers a good introduction to data mining techniques in computational finance for beginners who want to work in the field.

    Financial time series analysis with competitive neural networks

    The main objective of this Master's thesis is the modelling of non-stationary time series data. While classical statistical models attempt to correct non-stationary data through differencing and de-trending, I attempt to create localized clusters of stationary time series data through the use of the self-organizing map algorithm. While numerous techniques have been developed that model time series using the self-organizing map, I attempt to build a mathematical framework that justifies its use in the forecasting of financial time series. Additionally, I compare existing forecasting methods using the SOM with those for which a framework has been developed but which have not been applied in a forecasting context. I then compare these methods with the well-known ARIMA method of time series forecasting. The second objective of this thesis is to demonstrate the self-organizing map's ability to cluster data vectors, as it was originally developed as a neural network approach to clustering. Specifically, I will demonstrate its clustering abilities on limit order book data and present various visualization methods of its output.
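A toy version of the localized-cluster idea (hypothetical synthetic data, not from the thesis): windows of a two-regime series are summarized by per-window mean and standard deviation and grouped by competitive learning, i.e. the degenerate SOM in which the neighborhood has shrunk to just the winning unit, so that low- and high-volatility windows land in different clusters:

```python
import numpy as np

# Synthetic two-regime series: low volatility, then high volatility.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 0.1, 300), rng.normal(0, 1.0, 300)])

win = 10
X = np.array([series[i:i + win] for i in range(0, len(series) - win, win)])
F = np.c_[X.mean(axis=1), X.std(axis=1)]   # summary features per window

# Competitive learning: a SOM whose neighborhood covers only the winner.
w = np.vstack([F[0], F[-1]]).copy()        # one prototype seeded per regime
for _ in range(50):
    for f in F:
        b = np.argmin(((w - f) ** 2).sum(axis=1))   # best-matching unit
        w[b] += 0.2 * (f - w[b])                    # move winner toward f

labels = np.array([np.argmin(((w - f) ** 2).sum(axis=1)) for f in F])
for c in (0, 1):
    # each cluster is locally homogeneous in volatility
    print(c, round(float(X[labels == c].std()), 3))
```

Within each cluster the windows share similar statistical properties, which is the sense in which the SOM yields "localized clusters of stationary data" rather than a single globally differenced series.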

    A two-stage architecture for stock price forecasting by integrating self-organizing map and support vector regression

    Author name used in the publication: JJ Po-An Hsieh. 2008-2009 > Academic research: refereed > Publication in refereed journal. Accepted Manuscript. Published

    Data-Driven Methods for Data Center Operations Support

    During the last decade, cloud technologies have been evolving at an impressive pace, such that we are now living in a cloud-native era where developers can leverage an unprecedented landscape of (possibly managed) services for orchestration, compute, storage, load-balancing, monitoring, etc. The possibility of on-demand access to a diverse set of configurable virtualized resources allows for building more elastic, flexible and highly resilient distributed applications. Behind the scenes, cloud providers sustain the heavy burden of maintaining the underlying infrastructures, consisting of large-scale distributed systems, partitioned and replicated among many geographically dispersed data centers to guarantee scalability, robustness to failures, high availability and low latency. The larger the scale, the more cloud providers have to deal with complex interactions among the various components, such that monitoring, diagnosing and troubleshooting issues become incredibly daunting tasks. To keep up with these challenges, development and operations practices have undergone significant transformations, especially in terms of improving the automations that make releasing new software, and responding to unforeseen issues, faster and more sustainable at scale. The resulting paradigm is nowadays referred to as DevOps. However, while such automations can be very sophisticated, traditional DevOps practices fundamentally rely on reactive mechanisms that typically require careful manual tuning and supervision from human experts. To minimize the risk of outages, and the related costs, it is crucial to provide DevOps teams with suitable tools that enable a proactive approach to data center operations. This work presents a comprehensive data-driven framework to address the most relevant problems that can be experienced in large-scale distributed cloud infrastructures.
    These environments are indeed characterized by a very large availability of diverse data, collected at each level of the stack, such as: time series (e.g., physical host measurements, virtual machine or container metrics, networking component logs, application KPIs); graphs (e.g., network topologies, fault graphs reporting dependencies among hardware and software components, performance issue propagation networks); and text (e.g., source code, system logs, version control system history, code review feedback). Such data are also typically updated with relatively high frequency, and subject to distribution drifts caused by continuous configuration changes to the underlying infrastructure. In such a highly dynamic scenario, traditional model-driven approaches alone may be inadequate at capturing the complexity of the interactions among system components. DevOps teams would certainly benefit from having robust data-driven methods to support their decisions based on historical information. For instance, effective anomaly detection capabilities may help in conducting more precise and efficient root-cause analysis, and leveraging accurate forecasting and intelligent control strategies would improve resource management. Given their ability to deal with high-dimensional, complex data, deep learning-based methods are the most straightforward option for the realization of the aforementioned support tools. On the other hand, because of their complexity, these kinds of models often require huge processing power, and suitable hardware, to be operated effectively at scale. These aspects must be carefully addressed when applying such methods in the context of data center operations. Automated operations approaches must be dependable and cost-efficient, so as not to degrade the services they are built to improve.

    Selection of representative feature training sets with self-organized maps for optimized time series modeling and prediction: application to forecasting daily drought conditions with ARIMA and neural network models

    While the simulation of stochastic time series is challenging due to their inherently complex nature, this is compounded by the arbitrary yet widely accepted feature data usage methods frequently applied during the model development phase. A pertinent context in which these practices appear is the forecasting of drought events. This chapter considers optimization of feature data usage by sampling daily data sets via self-organizing maps to select representative training and testing subsets and, accordingly, improve the performance of effective drought index (EDI) prediction models. The effect is observed through a comparison of artificial neural network (ANN) and autoregressive integrated moving average (ARIMA) models incorporating the SOM approach, based on an inspection of commonly used performance indices for the city of Brisbane. This study shows that SOM-ANN ensemble models demonstrate predictive performance for EDI values competitive with that of ARIMA models.
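A schematic of this selection idea (synthetic stand-in data; not the chapter's code): a small 1-D SOM partitions daily feature vectors into regimes, and the training subset is then drawn proportionally from every SOM unit so that each regime is represented, instead of relying on an arbitrary chronological or random split:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical daily feature vectors (e.g. rainfall, temperature, EDI lags).
data = rng.normal(size=(365, 3))

# Tiny 1-D SOM trained with a Gaussian neighborhood over the unit grid.
units, lr, sigma, epochs = 3, 0.3, 0.8, 30
w = data[rng.choice(len(data), units, replace=False)].copy()
for t in range(epochs):
    for x in data:
        b = np.argmin(((w - x) ** 2).sum(axis=1))        # winning unit
        h = np.exp(-((np.arange(units) - b) ** 2) / (2 * sigma ** 2))
        w += lr * (1 - t / epochs) * h[:, None] * (x - w)

bmus = np.array([np.argmin(((w - x) ** 2).sum(axis=1)) for x in data])

# Representative training set: sample the same fraction from every unit,
# so each statistical regime is covered by the training data.
train_idx = np.concatenate([
    rng.choice(np.where(bmus == u)[0],
               max(1, int(0.7 * (bmus == u).sum())), replace=False)
    for u in np.unique(bmus)
])
test_idx = np.setdiff1d(np.arange(len(data)), train_idx)
print(len(train_idx), len(test_idx))
```

The resulting `train_idx`/`test_idx` split would then feed the downstream ANN or ARIMA model; the stratification over SOM units, rather than the regressor itself, is the point of the sketch.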