14 research outputs found

    Forecasting the Index of Financial Safety (IFS) of South Africa using neural networks

    This paper investigates neural network tools, in particular the nonlinear autoregressive model with exogenous input (NARX), for forecasting the future conditions of the Index of Financial Safety (IFS) of South Africa. Based on the time series used to construct the IFS for South Africa (Matkovskyy, 2012), a NARX model was built to forecast future values of the index, and the results are benchmarked against those of Bayesian vector-autoregressive (BVAR) models. The results show that the NARX model applied to the IFS of South Africa and trained with the Levenberg-Marquardt algorithm can deliver forecasts of adequate quality at lower computational expense than BVAR models with different priors.
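    As a rough illustration of the NARX idea described above (not the paper's model or data), the sketch below regresses the next value of a series on lagged values of the series and of exogenous inputs. The IFS series, the exogenous indicators, the lag order, and the network size are hypothetical placeholders, and scikit-learn's "lbfgs" solver stands in for the Levenberg-Marquardt training used in the paper, which scikit-learn does not provide.

```python
# Minimal NARX-style sketch: predict the next index value from lagged index
# values and lagged exogenous inputs. MLPRegressor with the "lbfgs" solver
# stands in for the Levenberg-Marquardt-trained network used in the paper.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_narx_features(y, x, lags=4):
    """Stack lagged target values y and lagged exogenous inputs x."""
    rows, targets = [], []
    for t in range(lags, len(y)):
        rows.append(np.concatenate([y[t - lags:t], x[t - lags:t].ravel()]))
        targets.append(y[t])
    return np.array(rows), np.array(targets)

# Hypothetical placeholder series; replace with the IFS series and its inputs.
rng = np.random.default_rng(0)
ifs = np.cumsum(rng.normal(size=200))          # stand-in for the IFS index
exog = rng.normal(size=(200, 3))               # stand-in exogenous indicators

X, y = make_narx_features(ifs, exog, lags=4)
split = int(0.8 * len(y))
model = MLPRegressor(hidden_layer_sizes=(10,), solver="lbfgs",
                     max_iter=2000, random_state=0)
model.fit(X[:split], y[:split])
rmse = np.sqrt(np.mean((model.predict(X[split:]) - y[split:]) ** 2))
print("held-out RMSE:", rmse)
```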

    VC-dimension of univariate decision trees

    PubMed ID: 25594983. In this paper, we give and prove lower bounds on the Vapnik-Chervonenkis (VC) dimension of the univariate decision tree hypothesis class. The VC dimension of a univariate decision tree depends on the VC dimensions of its subtrees and on the number of inputs. Via a search algorithm that calculates the VC dimension of univariate decision trees exhaustively, we show that our VC-dimension bounds are tight for simple trees. To verify that the bounds are useful, we also use them to obtain VC generalization bounds for complexity control using structural risk minimization in decision trees, i.e., pruning. Our simulation results show that structural risk minimization pruning using the VC-dimension bounds finds trees that are more accurate than those pruned using cross-validation.
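    The paper's specific VC-dimension bounds for univariate trees are not reproduced in this abstract, so the sketch below only illustrates how such bounds plug into structural risk minimization: hypothetical pruned subtrees are scored with the classical Vapnik generalization bound and the one with the smallest bound is selected. The (training error, VC-dimension) pairs and the sample size are made up for illustration.

```python
# Sketch of structural risk minimization (SRM) model selection with a
# VC-type generalization bound. The per-tree VC-dimension values below are
# placeholders; the paper derives the actual bounds for univariate trees.
import math

def vc_bound(train_error, vc_dim, n, delta=0.05):
    """Classical Vapnik bound: training error plus a VC complexity penalty."""
    penalty = math.sqrt((vc_dim * (math.log(2 * n / vc_dim) + 1)
                         + math.log(4 / delta)) / n)
    return train_error + penalty

# Hypothetical pruned subtrees: (training error, VC-dimension) pairs.
candidates = [(0.02, 120), (0.05, 40), (0.09, 12), (0.15, 4)]
n = 1000  # training sample size

best = min(candidates, key=lambda c: vc_bound(c[0], c[1], n))
for err, h in candidates:
    print(f"train_err={err:.2f} vc_dim={h:3d} bound={vc_bound(err, h, n):.3f}")
print("SRM picks the subtree with train_err=%.2f, vc_dim=%d" % best)
```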

    Resolution of similar patterns in a solvable model of unsupervised deep learning with structured data

    Empirical data, on which deep learning relies, has substantial internal structure, yet prevailing theories often disregard this aspect. Recent research has led to the definition of structured data ensembles, aimed at equipping established theoretical frameworks with interpretable structural elements, a pursuit that aligns with the broader objectives of spin glass theory. We consider a one-parameter structured ensemble where data consists of correlated pairs of patterns, and a simplified model of unsupervised learning, whereby the internal representation of the training set is fixed at each layer. A mean field solution of the model identifies a set of layer-wise recurrence equations for the overlaps between the internal representations of an unseen input and of the training set. The bifurcation diagram of this discrete-time dynamics is topologically inequivalent to the unstructured one, and displays transitions between different phases, selected by varying the load (the number of training pairs divided by the width of the network). The network's ability to resolve different patterns undergoes a discontinuous transition to a phase where signal processing along the layers dissipates differential information about an input's proximity to the different patterns in a pair. A critical value of the parameter tuning the correlations separates regimes where data structure improves or hampers the identification of a given pair of patterns.
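    The mean-field recurrence equations themselves are not given in the abstract, so the sketch below is only a scaffold: it iterates a placeholder layer-to-layer map for the overlap and scans the load to tabulate the attractors it converges to, which is how a bifurcation diagram of this kind would be traced once the paper's actual equations are substituted for the placeholder.

```python
# Scaffold for a bifurcation diagram of a layer-wise overlap recurrence.
# `recurrence` is a generic placeholder map, NOT the mean-field equations
# derived in the paper; substitute those to reproduce its analysis.
import numpy as np

def recurrence(m, load):
    """Placeholder layer-to-layer map for the overlap m at a given load."""
    return np.tanh(2.0 * m / (1.0 + load))   # illustrative only

def attractors(load, depth=200, n_init=20):
    """Iterate the map from several initial overlaps and record where it lands."""
    finals = []
    for m0 in np.linspace(0.01, 1.0, n_init):
        m = m0
        for _ in range(depth):
            m = recurrence(m, load)
        finals.append(round(float(m), 4))
    return sorted(set(finals))

for load in np.linspace(0.1, 2.0, 5):
    print(f"load={load:.2f} -> attractors {attractors(load)}")
```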

    What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation

    One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a pre-specified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data. In general, for a) the solution found is worse when the function to be approximated is more complex, for b) oversize networks can result in lower training and generalization error, and for c) the use of committee or ensemble techniques can be more beneficial as the amount of noise in the training data is increased. For the experiments we performed, we do not obtain the optimal solution in any case. We further support the observation that larger networks can produce better training and generalization error using a face recognition example where a network with many more parameters than training points generalizes better than smaller networks. (Also cross-referenced as UMIACS-TR-96-22.)
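    A minimal sketch of the kind of comparison described above (not the paper's face-recognition experiments): an undersized network, an oversized network, and a small ensemble of oversized networks are fit to a noisy synthetic regression task and evaluated against the noise-free target. The task, network sizes, noise level, and ensemble size are arbitrary choices for illustration.

```python
# Illustrative sketch (not the paper's experiments): compare an undersized
# network, an oversized network, and a small ensemble on a noisy task.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=300)   # noisy training target
Xte = np.linspace(-3, 3, 200).reshape(-1, 1)
yte = np.sin(Xte).ravel()                                  # noise-free truth

def mse(model):
    return np.mean((model.predict(Xte) - yte) ** 2)

small = MLPRegressor(hidden_layer_sizes=(2,), max_iter=5000, random_state=0).fit(X, y)
large = MLPRegressor(hidden_layer_sizes=(100,), max_iter=5000, random_state=0).fit(X, y)
ensemble = [MLPRegressor(hidden_layer_sizes=(100,), max_iter=5000,
                         random_state=s).fit(X, y) for s in range(5)]
ens_pred = np.mean([m.predict(Xte) for m in ensemble], axis=0)

print("small net MSE:", mse(small))
print("large net MSE:", mse(large))
print("ensemble MSE: ", np.mean((ens_pred - yte) ** 2))
```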

    Characterizing Rational versus Exponential Learning Curves

    We consider the standard problem of learning a concept from random examples. Here a learning curve is defined to be the expected error of a learner's hypotheses as a function of training sample size. Haussler, Littlestone, and Warmuth have shown that, in the distribution-free setting, the smallest expected error a learner can achieve in the worst case over a class of concepts C converges rationally to zero error; i.e., as Θ(1/t) in the training sample size t. However, Cohn and Tesauro have recently demonstrated that exponential convergence can often be observed in experimental settings (i.e., average error decreasing as e^(−Θ(t))). By addressing a simple non-uniformity in the original analysis, this paper shows how the dichotomy between rational and exponential worst-case learning curves can be recovered in the distribution-free theory. In particular, our results support the experimental findings of Cohn and Tesauro: for finite concept classes any consistent learner achieves exponential convergence, even in the worst case, whereas for continuous concept classes no learner can exhibit sub-rational convergence for every target concept and domain distribution. We also draw a precise boundary between rational and exponential convergence for simple concept chains, showing that somewhere-dense chains always force rational convergence in the worst case, while exponential convergence can always be achieved for nowhere-dense chains.
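    As a toy illustration of the finite-class result (not taken from the paper), the simulation below runs a consistent learner over a small finite class of threshold concepts and estimates its expected error as a function of the sample size t; the decay it exhibits is roughly exponential, in line with the dichotomy described above. The domain, the concept class, and the target are arbitrary.

```python
# Toy simulation (not from the paper): a consistent learner over a small
# finite concept class. Its expected error decays roughly exponentially in
# the sample size t, matching the finite-class result described above.
import numpy as np

rng = np.random.default_rng(2)
domain = np.arange(20)                                    # finite domain
concepts = [lambda x, a=a: x >= a for a in range(21)]     # finite threshold class
target = concepts[10]

def expected_error(t, trials=2000):
    errs = []
    for _ in range(trials):
        sample = rng.choice(domain, size=t)
        labels = target(sample)
        # consistent learner: first concept agreeing with the whole sample
        h = next(c for c in concepts if np.array_equal(c(sample), labels))
        errs.append(np.mean(h(domain) != target(domain)))
    return np.mean(errs)

for t in (1, 2, 4, 8, 16, 32):
    print(f"t={t:2d}  expected error ~ {expected_error(t):.4f}")
```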