2,140 research outputs found

    Learning Models over Relational Data using Sparse Tensors and Functional Dependencies

    Integrated solutions for analytics over relational databases are of great practical importance as they avoid the costly repeated loop data scientists have to deal with on a daily basis: select features from data residing in relational databases using feature extraction queries involving joins, projections, and aggregations; export the training dataset defined by such queries; convert this dataset into the format of an external learning tool; and train the desired model using this tool. These integrated solutions are also a fertile ground of theoretically fundamental and challenging problems at the intersection of relational and statistical data models. This article introduces a unified framework for training and evaluating a class of statistical learning models over relational databases. This class includes ridge linear regression, polynomial regression, factorization machines, and principal component analysis. We show that, by synergizing key tools from database theory such as schema information, query structure, functional dependencies, recent advances in query evaluation algorithms, and from linear algebra such as tensor and matrix operations, one can formulate relational analytics problems and design efficient (query and data) structure-aware algorithms to solve them. This theoretical development informed the design and implementation of the AC/DC system for structure-aware learning. We benchmark the performance of AC/DC against R, MADlib, libFM, and TensorFlow. For typical retail forecasting and advertisement planning applications, AC/DC can learn polynomial regression models and factorization machines with at least the same accuracy as its competitors and up to three orders of magnitude faster than its competitors whenever they do not run out of memory, exceed a 24-hour timeout, or encounter internal design limitations.
    Comment: 61 pages, 9 figures, 2 tables
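
As a rough illustration of why aggregation queries are relevant to model training, the sketch below relies on a standard fact: ridge linear regression depends on the data only through the aggregates X^T X and X^T y. This is not the AC/DC algorithm and does not reproduce its join-aware or functional-dependency-aware computation; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def accumulate_aggregates(batches):
    """Sum X^T X and X^T y over row batches without stacking all rows."""
    xtx, xty = None, None
    for X, y in batches:
        if xtx is None:
            d = X.shape[1]
            xtx = np.zeros((d, d))
            xty = np.zeros(d)
        xtx += X.T @ X
        xty += X.T @ y
    return xtx, xty

def ridge_from_aggregates(xtx, xty, lam):
    """Solve (X^T X + lam * I) w = X^T y using only the aggregates."""
    d = xtx.shape[0]
    return np.linalg.solve(xtx + lam * np.eye(d), xty)

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=200)

# Two batches stand in for partial results of a feature extraction query.
batches = [(X[:100], y[:100]), (X[100:], y[100:])]
xtx, xty = accumulate_aggregates(batches)
w_agg = ridge_from_aggregates(xtx, xty, lam=0.1)

# Reference solution computed from the full design matrix.
w_full = np.linalg.solve(X.T @ X + 0.1 * np.eye(3), X.T @ y)
```

Because the aggregates are sums over rows, they can be accumulated piecewise, which is what makes it plausible to compute them inside a database rather than after exporting the full training dataset.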

    On the Model-Based Interpretation of Filters and the Reliability of Trend-Cycle Estimates

    The paper is concerned with a class of trend-cycle filters, encompassing popular ones, such as the Hodrick-Prescott filter, that are derived using the Wiener-Kolmogorov signal extraction theory under maintained models that prove unrealistic in applied time series analysis. As the maintained model is misspecified, inferences about the unobserved components, and in particular their first two conditional moments given the observations, are not delivered by the Kalman filter and smoother or the Wiener-Kolmogorov filter for the maintained model. The paper proposes a model-based framework according to which the same class of filters is adapted to the particular time series under investigation; via a suitable decomposition of the innovation process, it is shown that any linear time series with ARIMA representation can be broken down into orthogonal trend and cycle components, for which the class of filters is optimal. Finite sample inferences are provided by the Kalman filter and smoother for the relevant state space representation of the decomposition. In this framework it is possible to discuss two aspects of the reliability of the signals’ estimates: the mean square error of the final estimates and the extent of the revisions. The paper discusses and illustrates how the uncertainty is related to features of the series and the design parameters of the filter, the role of smoothness priors, and the fundamental trade-off between the uncertainty and the magnitude of the revisions as new observations become available.
    Keywords: Signal Extraction, Revisions, Kalman filter and Smoother.
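
For readers unfamiliar with the Hodrick-Prescott filter named above, here is a minimal sketch of its standard finite-sample form as a penalized least squares problem: the trend minimizes the squared deviation from the series plus lam times the squared second differences of the trend. This is the textbook filter only, not the model-based adaptation the paper proposes; the function name `hp_filter` and the default lam=1600 (a common choice for quarterly data) are assumptions.

```python
import numpy as np

def hp_filter(y, lam=1600.0):
    """Return (trend, cycle) from the Hodrick-Prescott filter.

    The trend solves (I + lam * D^T D) trend = y, where D is the
    (n-2) x n second-difference operator.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    trend = np.linalg.solve(np.eye(n) + lam * (D.T @ D), y)
    return trend, y - trend

# A purely linear series has zero second differences, so the
# filter returns it unchanged as the trend.
line = 2.0 + 0.5 * np.arange(20)
line_trend, line_cycle = hp_filter(line)
```

The smoothing parameter lam is the design parameter in this form: larger values penalize curvature more heavily and push the trend toward a straight line.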

    An empirical study on the various stock market prediction methods

    Investing in the stock market is one of the most popular forms of investment. However, predicting the stock market remains a hard task because of the non-linearity it exhibits. The non-linearity is due to multiple influencing factors such as the global economy, political situations, sector performance, economic indicators, foreign institutional investment, domestic institutional investment, and so on. A proper set of such representative factors must be analyzed to build an efficient prediction model. Even a marginal improvement in prediction accuracy can be profitable for investors. This review provides a detailed analysis of research papers presenting stock market prediction techniques. These techniques are assessed in the time series analysis and sentiment analysis sections. A detailed discussion of research gaps and issues is presented. The reviewed articles are analyzed based on their use of prediction techniques, optimization algorithms, feature selection methods, datasets, toolsets, evaluation metrics, and input parameters. The techniques are further investigated to analyze the relations of prediction methods with feature selection algorithms, datasets, and input parameters. In addition, major problems raised by the present techniques are also discussed. This survey will provide researchers with deeper insight into various aspects of current stock market prediction methods.

    Eigenvalue filtering in VAR models with application to the Czech business cycle

    We propose the method of eigenvalue filtering as a new tool to extract time series subcomponents (such as business-cycle or irregular) defined by properties of the underlying eigenvalues. We logically extend the Beveridge-Nelson decomposition of the VAR time-series models focusing on the transient component. We introduce the canonical state-space representation of the VAR models to facilitate this type of analysis. We illustrate the eigenvalue filtering by examining a stylized model of inflation determination estimated on the Czech data. We characterize the estimated components of CPI, WPI and import inflations, together with the real production wage and real output, survey their basic properties, and impose an identification scheme to calculate the structural innovations. We test the results in a simple bootstrap simulation experiment. We find two major areas for further research: first, verifying and improving the robustness of the method, and second, exploring the method’s potential for empirical validation of structural economic models.
    JEL Classification: C32, E32
    Keywords: Beveridge-Nelson decomposition, business cycle, eigenvalues, filtering, inflation, time series analysis
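
To make "the underlying eigenvalues" of a VAR model concrete, the sketch below builds the standard companion (first-order state-space) matrix of a VAR(p) and computes its eigenvalues. This is the textbook companion form and may differ from the canonical representation the paper introduces; the filtering step itself is not reproduced. The coefficient matrices and the function name are hypothetical.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack VAR(p) coefficient matrices A_1..A_p into the companion matrix."""
    p = len(coefs)
    k = coefs[0].shape[0]
    top = np.hstack(coefs)  # first block row: [A_1, ..., A_p]
    if p == 1:
        return top
    # Lower block rows shift the lagged state: [I, 0]
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

# Illustrative VAR(2) coefficients for two variables (assumed values).
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])

F = companion_matrix([A1, A2])
eigvals = np.linalg.eigvals(F)

# The VAR is stationary when every eigenvalue lies strictly inside the unit circle.
is_stable = bool(np.all(np.abs(eigvals) < 1.0))
```

Each eigenvalue's modulus governs how quickly the associated dynamic mode decays, and a nonzero imaginary part indicates oscillation, which is the kind of property a component definition could be based on.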