
    Modular Approaches for Designing Feedforward Neural Networks to Solve Classification Problems.

    Perhaps the most popular approach for solving classification problems is the backpropagation method. In this approach, a feedforward network is initially built by an intelligent guess regarding the architecture and the number of hidden nodes, and then trained using an iterative algorithm. The major problem with this approach is the slow rate of convergence of the training. To overcome this difficulty, two different modular approaches have been investigated. In the first approach, the classification task is decomposed into sub-tasks, where each sub-task is solved by a sub-network. An algorithm is presented for building and training each sub-network using any of the single-node learning rules, such as the perceptron rule. Simulation results on a digit recognition task show that this approach reduces the training time compared to backpropagation while achieving comparable generalization. This approach has the added benefit that it develops the structure of the network as learning proceeds, as opposed to backpropagation, where the structure of the network is guessed and fixed prior to learning. The second approach investigates a recently developed technique for training feedforward networks called corner classification. Training by corner classification is a single-step method, unlike the computationally intensive iterative method of backpropagation. In this dissertation, modifications are made to the corner classification algorithm in order to improve its generalization capability. Simulation results on the digit recognition task show that the generalization capabilities of corner classification networks are comparable to those of backpropagation networks, while the learning speed of corner classification is far ahead of the iterative methods. However, designing the network by corner classification produces a large number of hidden nodes. A pruning procedure is therefore developed that eliminates some of the redundant hidden nodes; the pruned network has generalization comparable to that of the unpruned network.
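The single-node perceptron rule mentioned above can be sketched as follows. This is a minimal illustration on a toy AND task, not the dissertation's digit-recognition setup; the data and parameter names are hypothetical:

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, max_epochs=100):
    """Single-node perceptron rule: on each misclassified example,
    nudge the weight vector toward (or away from) that input."""
    w = np.zeros(X.shape[1] + 1)                   # weights plus bias term
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append constant 1 input
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(Xb, y):                  # labels yi in {-1, +1}
            if yi * (w @ xi) <= 0:                 # misclassified or on boundary
                w += lr * yi * xi
                errors += 1
        if errors == 0:                            # converged: data separated
            break
    return w

# Toy linearly separable task (logical AND with +/-1 labels)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([-1, -1, -1, 1])
w = train_perceptron(X, y)
preds = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

The rule converges in finitely many updates whenever the data are linearly separable, which is why each sub-network can restrict itself to a sub-task it can separate.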

    Pattern recognition methods for EMG prosthetic control

    In this work we focus on pattern recognition methods related to EMG upper-limb prosthetic control. After giving a detailed review of the most widely used classification methods, we propose a new classification approach, motivated by a Fourier-analysis comparison between able-bodied and trans-radial amputee subjects. We thus suggest a different classification method, which considers each surface electrode's contribution separately, together with five time-domain features, obtaining an average classification accuracy of 75% on a sample of trans-radial amputees. We propose an automatic feature selection procedure, formulated as a minimization problem, in order to improve the method and its robustness.
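As an illustration of per-electrode time-domain features, the sketch below computes a classic five-feature set (mean absolute value, waveform length, RMS, zero crossings, slope sign changes). The abstract does not say which five features were used, so this particular set, the threshold parameter, and the signals are assumptions:

```python
import numpy as np

def td_features(window, thresh=0.01):
    """Five classic time-domain EMG features for one electrode channel.
    (A hypothetical choice -- the text does not specify which five.)"""
    x = np.asarray(window, dtype=float)
    dx = np.diff(x)
    mav = np.mean(np.abs(x))                        # mean absolute value
    wl = np.sum(np.abs(dx))                         # waveform length
    rms = np.sqrt(np.mean(x ** 2))                  # root mean square
    zc = np.sum((x[:-1] * x[1:] < 0) &              # zero crossings above noise
                (np.abs(dx) > thresh))
    ssc = np.sum((dx[:-1] * dx[1:] < 0) &           # slope sign changes
                 (np.maximum(np.abs(dx[:-1]), np.abs(dx[1:])) > thresh))
    return np.array([mav, wl, rms, zc, ssc])

def feature_vector(channels):
    """Each electrode contributes its own feature block, kept separate."""
    return np.concatenate([td_features(ch) for ch in channels])

# Two hypothetical electrode windows -> a 10-dimensional feature vector
sig = [np.sin(np.linspace(0, 4 * np.pi, 200)),
       np.cos(np.linspace(0, 4 * np.pi, 200))]
fv = feature_vector(sig)
```

Keeping each electrode's block separate, rather than pooling across channels, matches the per-electrode treatment described in the abstract.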

    Efficient Learning Machines

    Computer science

    Robust Algorithms for Estimating Vehicle Movement from Motion Sensors Within Smartphones

    Building sustainable traffic control solutions for urban streets (e.g., eco-friendly signal control) and highways requires effective and reliable sensing capabilities for monitoring traffic flow conditions, so that both the temporal and spatial extents of congestion are observed. This would enable optimal control strategies to be implemented for maximizing efficiency and minimizing the environmental impacts of traffic. Various types of traffic detection systems, such as inductive loops, radar, and cameras, have been used for these purposes. However, these systems are limited, both in scope and in time. Using GPS as an alternative is not always viable because of problems such as urban canyons, battery depletion, and precision errors. In this research, a novel approach has been taken, in which smartphone low-energy sensors (such as the accelerometer) are exploited. The ubiquitous use of smartphones in everyday life, coupled with the fact that they can collect, store, compute, and transmit data, makes them a feasible and inexpensive alternative to the mainstream methods. Machine learning techniques have been used to develop models that classify vehicle movement and detect the stop and start points during a trip. Classifiers such as logistic regression, discriminant analysis, classification trees, support vector machines, neural networks, and hidden Markov models have been tested. Hidden Markov models substantially outperformed all the other methods. Feature quality plays a key role in the success of a model; it was found that the features exploiting the variance of the data were the most effective. To assist in quantifying the performance of the machine learning models, a performance metric called the Change Point Detection Performance Metric (CPDPM) was developed. CPDPM proved very useful for evaluating models whose goal is to find the change points in time series data with high accuracy and precision.
The integration of accelerometer data, even along the motion direction, yielded speed estimates with a steady drift, because of factors such as phone sensor bias, vibration, gravity, and other noise. A calibration method was developed that makes use of the predicted stop and start points and the slope of the integrated accelerometer data, and achieves high accuracy in estimating speed. The developed models can serve as the basis for many applications. One such field is fuel consumption and CO2 emission estimation, in which speed is the main input. Transportation mode detection can also be improved by integrating speed information. In Vehicle (Phone)-to-Infrastructure (V2I) systems, the model outputs, such as the stop and start instances, the average speed along a corridor, and the queue length at an intersection, can provide useful information for traffic engineers, planners, and decision makers.
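A minimal sketch of the idea behind the calibration step, assuming trapezoidal integration and a linear drift model anchored at the detected stop points; the signal, bias value, and function names are hypothetical, not the dissertation's exact method:

```python
import numpy as np

def estimate_speed(acc, t, stop_idx):
    """Integrate motion-direction acceleration to speed, then remove the
    linear drift (sensor bias, gravity leakage) using instants where the
    vehicle is known to be stopped, i.e. where speed must be zero."""
    # Trapezoidal integration of acceleration -> raw (drifting) speed
    v = np.concatenate([[0.0],
                        np.cumsum((acc[1:] + acc[:-1]) / 2 * np.diff(t))])
    # Fit a line through the integrated-speed values at the stop points
    # and subtract it: a simple stand-in for the calibration step.
    slope, intercept = np.polyfit(t[stop_idx], v[stop_idx], 1)
    return v - (slope * t + intercept)

# Hypothetical trip: accelerate for 10 s, brake back to a stop, with a
# constant sensor bias corrupting the accelerometer reading.
t = np.linspace(0, 20, 401)
true_acc = np.where(t < 10, 0.5, -0.5)   # m/s^2: speed up, then slow down
acc = true_acc + 0.05                    # constant bias -> steady drift
stop_idx = np.array([0, 400])            # detected stop/start instants
v = estimate_speed(acc, t, stop_idx)
```

With a constant bias the drift is exactly linear, so anchoring the correction at two stop points recovers the triangular speed profile (peaking near 5 m/s at t = 10 s).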

    IST Austria Thesis

    Deep learning is best known for its empirical success across a wide range of applications spanning computer vision, natural language processing and speech. Of equal significance, though perhaps less known, are its ramifications for learning theory: deep networks have been observed to perform surprisingly well in the high-capacity regime, also known as the overfitting or underspecified regime. Classically, this regime on the far right of the bias-variance curve is associated with poor generalisation; however, recent experiments with deep networks challenge this view. This thesis is devoted to investigating various aspects of underspecification in deep learning. First, we argue that deep learning models are underspecified on two levels: a) any given training dataset can be fit by many different functions, and b) any given function can be expressed by many different parameter configurations. We refer to the second kind of underspecification as parameterisation redundancy and we precisely characterise its extent. Second, we characterise the implicit criteria (the inductive bias) that guide learning in the underspecified regime. Specifically, we consider a nonlinear but tractable classification setting, and show that given the choice, neural networks learn classifiers with a large margin. Third, we consider learning scenarios where the inductive bias is not by itself sufficient to deal with underspecification. We then study different ways of ‘tightening the specification’: i) In the setting of representation learning with variational autoencoders, we propose a hand-crafted regulariser based on mutual information. ii) In the setting of binary classification, we consider soft-label (real-valued) supervision. We derive a generalisation bound for linear networks supervised in this way and verify that soft labels facilitate fast learning. Finally, we explore an application of soft-label supervision to the training of multi-exit models.
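Parameterisation redundancy can be illustrated with the positive scaling symmetry of ReLU networks: rescaling a hidden unit's incoming weights and bias by c > 0 while dividing its outgoing weight by the same c leaves the computed function unchanged, since relu(c·z) = c·relu(z). A minimal sketch (the sizes and values are arbitrary, not taken from the thesis):

```python
import numpy as np

def two_layer(x, W1, b1, w2):
    """f(x) = w2 . relu(W1 x + b1): a two-layer ReLU network."""
    return np.maximum(W1 @ x + b1, 0.0) @ w2

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)

# Rescale each hidden unit by its own positive factor c, and divide the
# corresponding outgoing weight by c: a different parameter configuration
# that expresses exactly the same function.
c = np.array([2.0, 0.5, 3.0, 1.5])
W1b, b1b, w2b = c[:, None] * W1, c * b1, w2 / c

x = rng.normal(size=3)
y1 = two_layer(x, W1, b1, w2)
y2 = two_layer(x, W1b, b1b, w2b)
```

Every choice of per-unit factors c gives a distinct point in parameter space computing the same function, which is one source of the redundancy the thesis characterises.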

    Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data

    Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent. To better understand this empirical observation, we consider the generalization error of two-layer neural networks trained to interpolation by gradient descent on the logistic loss following random initialization. We assume the data comes from well-separated class-conditional log-concave distributions and allow for a constant fraction of the training labels to be corrupted by an adversary. We show that in this setting, neural networks exhibit benign overfitting: they can be driven to zero training error, perfectly fitting any noisy training labels, and simultaneously achieve minimax optimal test error. In contrast to previous work on benign overfitting that requires linear or kernel-based predictors, our analysis holds in a setting where both the model and the learning dynamics are fundamentally nonlinear.