Learning Models over Relational Data using Sparse Tensors and Functional Dependencies
Integrated solutions for analytics over relational databases are of great
practical importance as they avoid the costly repeated loop data scientists
have to deal with on a daily basis: select features from data residing in
relational databases using feature extraction queries involving joins,
projections, and aggregations; export the training dataset defined by such
queries; convert this dataset into the format of an external learning tool; and
train the desired model using this tool. These integrated solutions are also
fertile ground for theoretically fundamental and challenging problems at the
intersection of relational and statistical data models.
This article introduces a unified framework for training and evaluating a
class of statistical learning models over relational databases. This class
includes ridge linear regression, polynomial regression, factorization
machines, and principal component analysis. We show that, by combining key
tools from database theory, such as schema information, query structure,
functional dependencies, and recent advances in query evaluation algorithms,
with tools from linear algebra, such as tensor and matrix operations, one can
formulate relational analytics problems and design efficient algorithms that
exploit both query and data structure to solve them.
This theoretical development informed the design and implementation of the
AC/DC system for structure-aware learning. We benchmark the performance of
AC/DC against R, MADlib, libFM, and TensorFlow. For typical retail forecasting
and advertisement planning applications, AC/DC can learn polynomial regression
models and factorization machines with at least the same accuracy as its
competitors and up to three orders of magnitude faster, whenever the
competitors do not run out of memory, exceed a 24-hour timeout, or encounter
internal design limitations.
Comment: 61 pages, 9 figures, 2 tables
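The key observation behind this line of work is that a ridge-style model needs only aggregate statistics, a Gram matrix X'X and a correlation vector X'y, which can in principle be accumulated over the join results rather than over a materialized training matrix. A minimal sketch of that idea, assuming synthetic data and an illustrative block-streaming loop (this is not the AC/DC implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

lam = 0.1  # ridge penalty (illustrative value)

# Sufficient statistics for ridge regression: Gram matrix X'X and X'y.
# In a relational setting these aggregates could be computed while
# streaming over join results, without materializing the full matrix.
G = np.zeros((3, 3))
c = np.zeros(3)
for start in range(0, 1000, 100):  # stream over blocks of rows
    Xb, yb = X[start:start + 100], y[start:start + 100]
    G += Xb.T @ Xb
    c += Xb.T @ yb

# Closed-form ridge solution from the accumulated aggregates.
theta = np.linalg.solve(G + lam * np.eye(3), c)
```

The block-wise accumulation yields exactly the same model as a single pass over the full matrix, which is what makes pushing the aggregates past joins attractive.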
On the Model-Based Interpretation of Filters and the Reliability of Trend-Cycle Estimates
The paper is concerned with a class of trend-cycle filters, encompassing popular ones such as the Hodrick-Prescott filter, that are derived using the Wiener-Kolmogorov signal extraction theory under maintained models that prove unrealistic in applied time series analysis. When the maintained model is misspecified, inference about the unobserved components, and in particular their first two conditional moments given the observations, is not delivered by the Kalman filter and smoother or by the Wiener-Kolmogorov filter for the maintained model. The paper proposes a model-based framework according to which the same class of filters is adapted to the particular time series under investigation; via a suitable decomposition of the innovation process, it is shown that any linear time series with an ARIMA representation can be broken down into orthogonal trend and cycle components for which the class of filters is optimal. Finite-sample inferences are provided by the Kalman filter and smoother for the relevant state space representation of the decomposition. In this framework it is possible to discuss two aspects of the reliability of the signal estimates: the mean square error of the final estimates and the extent of the revisions. The paper discusses and illustrates how the uncertainty is related to features of the series and the design parameters of the filter, the role of smoothness priors, and the fundamental trade-off between the uncertainty and the magnitude of the revisions as new observations become available.
Keywords: signal extraction, revisions, Kalman filter and smoother
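For concreteness, the Hodrick-Prescott filter named above has a simple penalized-least-squares closed form: the trend solves tau = (I + lam * D2'D2)^{-1} y, where D2 is the second-difference operator and lam the smoothness penalty (1600 for quarterly data). A minimal sketch on simulated data, using dense linear algebra; the paper's model-based adaptation of the filter is not attempted here:

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Hodrick-Prescott trend: minimize ||y - tau||^2 + lam * ||D2 tau||^2,
    with closed-form solution tau = (I + lam * D2'D2)^{-1} y."""
    n = len(y)
    # Second-difference operator D2, shape (n-2, n).
    D2 = np.zeros((n - 2, n))
    for i in range(n - 2):
        D2[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lam * D2.T @ D2, y)
    cycle = y - trend
    return trend, cycle

# Simulated quarterly-style series: linear trend plus noise (illustrative).
rng = np.random.default_rng(1)
t = np.arange(120)
y = 0.05 * t + rng.normal(scale=0.2, size=120)
trend, cycle = hp_filter(y)
```

Because the penalty acts on second differences, an exactly linear series passes through untouched (zero cycle), which is a quick sanity check on any implementation.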
An empirical study on the various stock market prediction methods
Investment in the stock market is one of the most popular investment activities. However, predicting the stock market has remained a hard task because of the non-linearity it exhibits. The non-linearity is due to multiple interacting factors such as the global economy, political situations, sector performance, economic indicators, foreign institutional investment, domestic institutional investment, and so on. A proper set of such representative factors must be analyzed to build an efficient prediction model, and even marginal improvements in prediction accuracy can be profitable for investors. This review provides a detailed analysis of research papers presenting stock market prediction techniques, assessed in separate time series analysis and sentiment analysis sections. A detailed discussion of research gaps and issues is presented. The reviewed articles are analyzed based on their use of prediction techniques, optimization algorithms, feature selection methods, datasets, toolsets, evaluation metrics, and input parameters. The techniques are further investigated to analyze the relations of prediction methods with feature selection algorithms, datasets, and input parameters. In addition, major problems with the present techniques are also discussed. This survey will provide researchers with deeper insight into various aspects of current stock market prediction methods.
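The time-series evaluation setting the survey covers can be illustrated with a minimal baseline: a linear model on lagged returns with a chronological train/test split, which avoids the look-ahead bias that invalidates many market-prediction evaluations. The data here are synthetic and the feature choice (three lagged returns) is an illustrative assumption, not a method from any reviewed paper:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic daily returns with mild autocorrelation (illustrative only).
n = 500
r = np.zeros(n)
for t in range(1, n):
    r[t] = 0.3 * r[t - 1] + rng.normal(scale=0.01)

# Features: the previous k returns; target: the next return.
k = 3
X = np.column_stack([r[k - 1 - i:n - 1 - i] for i in range(k)])
y = r[k:]

# Chronological split: never train on data from the test period.
split = int(0.8 * len(y))
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

coef, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
pred = X_te @ coef
mse_model = np.mean((y_te - pred) ** 2)
mse_naive = np.mean(y_te ** 2)  # predict-zero baseline for comparison
```

Comparing against a trivial baseline on a held-out, strictly later period is the minimal hygiene any of the surveyed techniques should be held to.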
Eigenvalue filtering in VAR models with application to the Czech business cycle
We propose the method of eigenvalue filtering as a new tool to extract time series subcomponents (such as business-cycle or irregular components) defined by properties of the underlying eigenvalues. We logically extend the Beveridge-Nelson decomposition of VAR time-series models, focusing on the transient component. We introduce the canonical state-space representation of VAR models to facilitate this type of analysis. We illustrate eigenvalue filtering by examining a stylized model of inflation determination estimated on Czech data. We characterize the estimated components of CPI, WPI, and import inflation, together with the real production wage and real output, survey their basic properties, and impose an identification scheme to calculate the structural innovations. We test the results in a simple bootstrap simulation experiment. We find two major areas for further research: first, verifying and improving the robustness of the method, and second, exploring the method's potential for empirical validation of structural economic models.
JEL Classification: C32, E32
Keywords: Beveridge-Nelson decomposition, business cycle, eigenvalues, filtering, inflation, time series analysis
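A minimal numerical sketch of the ingredients named above, a VAR estimated on differenced data, the eigenvalues that characterize its subcomponents, and a Beveridge-Nelson-style transient component, might look like the following. The bivariate system, its coefficient matrix, and the VAR(1) order are illustrative assumptions, not the paper's estimated model:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[0.5, 0.1],
              [0.0, 0.3]])  # assumed VAR(1) coefficient matrix (illustrative)

# Simulate differenced data: dz_t = A dz_{t-1} + e_t.
n = 2000
dz = np.zeros((n, 2))
for t in range(1, n):
    dz[t] = A @ dz[t - 1] + rng.normal(scale=0.1, size=2)

# OLS estimate of the VAR(1) matrix from the simulated differences.
Y, X = dz[1:], dz[:-1]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T

# The eigenvalues of the (companion) matrix govern the persistence of the
# transient dynamics; filtering by eigenvalue magnitude is what selects
# subcomponents such as business-cycle versus irregular.
eigvals = np.linalg.eigvals(A_hat)

# Beveridge-Nelson transient component at the last observation:
#   cycle_t = -A(I - A)^{-1} dz_t   (negated sum of expected future changes)
cycle = -(A_hat @ np.linalg.solve(np.eye(2) - A_hat, dz[-1]))
```

With all eigenvalues inside the unit circle, the expected future changes sum to a finite transient component, which is exactly the quantity the Beveridge-Nelson decomposition removes from the trend.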