7 research outputs found
Flexible spatial models on the example of temperature in China
Spatial modeling of temperature is of crucial importance for agriculture, industry and ecology. This work presents interpolation methods for the daily average temperature in China over the period from 1957 to 2009. Because of the country's complex topography and diverse climate, the flexibility of the spatial models is of great importance. This study attempts to develop techniques that minimize the spatial prediction error and capture temperature extremes. The research extends the copula-based interpolation method and proposes the innovative IDW-GEV model. Spatial regression, kriging and inverse distance interpolation are used as benchmarks to evaluate the performance of the suggested techniques.
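As a rough illustration of the inverse distance interpolation benchmark named in the abstract, the following minimal sketch estimates a value at an unobserved location as a distance-weighted average of station readings. It is not the authors' implementation: the function name `idw_predict`, the power parameter of 2, the planar coordinates and the station values are all assumptions made for the example. A real application would likely use geographic distances and tune the power parameter.

```python
import math


def idw_predict(stations, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    `stations` is a list of (x, y, value) tuples in planar coordinates.
    Weights are 1 / distance**power, so nearer stations count more.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        dist = math.hypot(x - target[0], y - target[1])
        if dist == 0.0:
            # Target coincides with a station: return its reading directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations (arbitrary planar units) with temperatures in degrees C.
stations = [(0.0, 0.0, 10.0), (2.0, 0.0, 14.0), (0.0, 2.0, 12.0)]
estimate = idw_predict(stations, (1.0, 1.0))
print(round(estimate, 6))
```

Because the target point in this example is equidistant from all three stations, the weights are equal and the estimate reduces to the plain average of the readings.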
Statistical and data mining methods for classification task
In recent years, classification has become a topical problem in many areas of decision making. Such tasks can be solved with both statistical and data mining techniques. The goal of this thesis is to determine whether data mining algorithms can compete with statistical methods. The thesis describes linear discriminant analysis (LDA), kernel discriminant analysis (KDA), classification trees (CRT) and single-layer neural networks (NNET). For the statistical classifiers, it explains how to obtain the discriminant function and the discrimination boundaries; for the data mining models, it reviews the algorithms for building classifiers and the methods for avoiding overfitting. Finally, model assessment and selection are discussed. To compare the classical statistical techniques with the data mining algorithms empirically, simulations were carried out in the statistical software R.
Key words: data mining, classifier, discriminant analysis, classification trees, neural networks, overall accuracy
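To make the idea of a discriminant function and a discrimination boundary concrete, here is a minimal sketch of two-class linear discriminant analysis for a single feature. It is an illustration under stated assumptions, not the thesis code (which used R): the function name `fit_lda_1d`, the class labels and the sample data are hypothetical, and the classes are assumed to share one pooled variance and to have different means.

```python
import math
from statistics import mean


def fit_lda_1d(class_a, class_b):
    """Fit a two-class LDA on one feature; return (classify, boundary)."""
    n_a, n_b = len(class_a), len(class_b)
    mu_a, mu_b = mean(class_a), mean(class_b)
    # Pooled within-class variance (LDA assumes it is shared by both classes).
    squares = sum((x - mu_a) ** 2 for x in class_a)
    squares += sum((x - mu_b) ** 2 for x in class_b)
    var = squares / (n_a + n_b - 2)
    # Class priors estimated from the sample proportions.
    prior_a = n_a / (n_a + n_b)
    prior_b = n_b / (n_a + n_b)

    def score(x, mu, prior):
        # Linear discriminant function for one class.
        return x * mu / var - mu * mu / (2 * var) + math.log(prior)

    def classify(x):
        if score(x, mu_a, prior_a) >= score(x, mu_b, prior_b):
            return "a"
        return "b"

    # Point where both discriminant scores are equal (requires mu_a != mu_b).
    boundary = (mu_a + mu_b) / 2 - var * math.log(prior_a / prior_b) / (mu_a - mu_b)
    return classify, boundary


# Hypothetical training samples for two classes.
classify, boundary = fit_lda_1d([1.0, 2.0, 3.0], [5.0, 6.0, 7.0])
print(boundary, classify(3.9), classify(4.1))
```

With equal class sizes the priors cancel, so the boundary falls at the midpoint of the two class means; unequal priors shift it toward the less frequent class.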
A Forest Full of Risk Forecasts for Managing Volatility
We propose a hybrid approach to modeling stock return volatility in a large panel of stocks that combines random forests, a machine learning method, with ordinary least squares regression models. Our model's time-varying parameters are assumed to be data-driven functions of idiosyncratic stock information and changing market conditions. The empirical analysis demonstrates our model’s superior risk forecasting performance across multiple forecast horizons and 186 S&P 500 constituents, resulting in significantly higher utility for volatility-managed investments. Furthermore, this superior forecast performance is observed uniformly across firm characteristics.
How much is the view from the window worth?: Machine learning-driven hedonic pricing model of the real estate market
Understanding customers’ perception of the value of a good's constituent characteristics is among the key questions in any pricing strategy. Hedonic pricing allows such an analysis and is frequently applied in economic fields. Although it is regarded as a benchmark in its original form, the availability of new data sources and the development of machine learning techniques have created room for further improvement. In this study, we propose a general framework for applying machine learning tools to enhance the hedonic pricing model in several directions. We do this, first, by adding image and text sources to conventional data and then by applying an advanced nonparametric prediction model. Lastly, we use model-agnostic analysis to uncover new pricing factors and unravel complex relationships that could not be captured by conventional models.