
    Passive Shallow Water Automated Target Recognition using Deep Convolutional Bidirectional Long Short Term Memory

    The extremely challenging nature of passive acoustic surveillance makes it a key area of research in Naval Non-Co-operative Target Recognition, especially in Anti-Submarine Warfare systems. In shallow waters, the complex acoustics due to the highly varying ambient background noise, as well as the multi-modal propagation in the surface-bottom bounded channel, make surveillance even more difficult. In this work, an ensemble of Convolutional Neural Network and Bidirectional Long Short Term Memory stages employing soft attention is used to effectively capture the spectro-temporal dynamics of the target signature. To alleviate the overall computational cost associated with the optimal model search in the extensive hyperparameter space, a recursive model elimination scheme, making frugal use of the available resources, is also proposed. Experimental analysis on acoustic target records, collected from the shallows of the Arabian Sea, has yielded encouraging results in terms of model accuracy, precision and recall.
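
    A minimal PyTorch sketch of the kind of CNN-to-BiLSTM stack with soft attention described above; the layer sizes, spectrogram input shape, and number of target classes are illustrative assumptions, not the authors' configuration.

    import torch
    import torch.nn as nn

    class CnnBiLstmAttention(nn.Module):
        """Illustrative CNN -> BiLSTM -> soft-attention classifier for spectrogram input."""
        def __init__(self, n_mels=64, n_classes=5, hidden=128):
            super().__init__()
            # Convolutional front end: captures local spectro-temporal patterns.
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d((2, 1)),                        # pool frequency only, keep time steps
                nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                nn.MaxPool2d((2, 1)),
            )
            feat_dim = 32 * (n_mels // 4)
            # Bidirectional LSTM models longer-range temporal dynamics.
            self.bilstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
            # Soft attention pools the time axis into a single weighted summary.
            self.attn = nn.Linear(2 * hidden, 1)
            self.out = nn.Linear(2 * hidden, n_classes)

        def forward(self, x):                                # x: (batch, 1, n_mels, time)
            h = self.cnn(x)                                  # (batch, 32, n_mels/4, time)
            h = h.permute(0, 3, 1, 2).flatten(2)             # (batch, time, feat_dim)
            h, _ = self.bilstm(h)                            # (batch, time, 2*hidden)
            w = torch.softmax(self.attn(h), dim=1)           # attention weights over time
            ctx = (w * h).sum(dim=1)                         # weighted temporal summary
            return self.out(ctx)                             # class logits

    logits = CnnBiLstmAttention()(torch.randn(8, 1, 64, 100))  # e.g. a batch of 8 spectrogram snippets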

    Combined optimization algorithms applied to pattern classification

    Accurate classification by minimizing the error on test samples is the main goal in pattern classification. Combinatorial optimization is a well-known method for solving minimization problems; however, only a few examples of classifiers are described in the literature where combinatorial optimization is used in pattern classification. Recently, there has been a growing interest in combining classifiers and improving the consensus of results for greater accuracy. In the light of the "No Free Lunch Theorems", we analyse the combination of simulated annealing, a powerful combinatorial optimization method that produces high quality results, with the classical perceptron algorithm. This combination is called the LSA machine. Our analysis aims at finding paradigms for problem-dependent parameter settings that ensure high classification results. Our computational experiments on a large number of benchmark problems lead to results that either outperform or are at least competitive with results published in the literature. Apart from parameter settings, our analysis focuses on a difficult problem in computation theory, namely the network complexity problem. The depth vs size problem of neural networks is one of the hardest problems in theoretical computing, with very little progress over the past decades. In order to investigate this problem, we introduce a new recursive learning method for training hidden layers in constant depth circuits. Our findings make contributions to a) the field of Machine Learning, as the proposed method is applicable in training feedforward neural networks, and to b) the field of circuit complexity by proposing an upper bound for the number of hidden units sufficient to achieve a high classification rate. One of the major findings of our research is that the size of the network can be bounded by the input size of the problem, with an approximate upper bound of 8 + √2n/n threshold gates being sufficient for a small error rate, where n := log|SL| and SL is the training set.
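
    The basic idea of pairing simulated annealing with a perceptron-style threshold unit can be sketched as below; the neighbourhood move, cooling schedule, and toy data are assumptions for illustration, not the problem-dependent parameter settings analysed in the work.

    import numpy as np

    rng = np.random.default_rng(0)

    def errors(w, X, y):
        """Number of misclassified samples for a linear threshold unit."""
        return int(np.sum(np.sign(X @ w) != y))

    def sa_perceptron(X, y, steps=5000, t0=2.0, alpha=0.999):
        """Simulated annealing search over perceptron weights (illustrative LSA-style loop)."""
        w = rng.normal(size=X.shape[1])
        best_w, best_e = w.copy(), errors(w, X, y)
        t, e = t0, best_e
        for _ in range(steps):
            cand = w + rng.normal(scale=0.1, size=w.shape)    # local move in weight space
            ce = errors(cand, X, y)
            # Accept improvements always, worse moves with Boltzmann probability.
            if ce <= e or rng.random() < np.exp((e - ce) / max(t, 1e-9)):
                w, e = cand, ce
                if e < best_e:
                    best_w, best_e = w.copy(), e
            t *= alpha                                        # geometric cooling schedule
        return best_w, best_e

    # Toy linearly separable data with a bias column appended.
    X = rng.normal(size=(200, 2))
    y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)
    Xb = np.hstack([X, np.ones((200, 1))])
    w, e = sa_perceptron(Xb, y)
    print("training errors:", e)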

    Herding as a Learning System with Edge-of-Chaos Dynamics

    Herding defines a deterministic dynamical system at the edge of chaos. It generates a sequence of model states and parameters by alternating parameter perturbations with state maximizations, where the sequence of states can be interpreted as "samples" from an associated MRF model. Herding differs from maximum likelihood estimation in that the sequence of parameters does not converge to a fixed point, and differs from an MCMC posterior sampling approach in that the sequence of states is generated deterministically. Herding may be interpreted as a "perturb and map" method where the parameter perturbations are generated using a deterministic nonlinear dynamical system rather than randomly from a Gumbel distribution. This chapter studies the distinct statistical characteristics of the herding algorithm and shows that the fast convergence rate of the controlled moments may be attributed to edge-of-chaos dynamics. The herding algorithm can also be generalized to models with latent variables and to a discriminative learning setting. The perceptron cycling theorem ensures that the fast moment matching property is preserved in the more general framework.
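
    The alternation described above, a state maximization followed by a parameter perturbation toward target moments, fits in a few lines of NumPy; the binary state space, identity features, and target moments below are made up purely for illustration.

    import numpy as np

    # Discrete state space: all binary vectors of length 3, with identity features phi(s) = s.
    states = np.array([[int(b) for b in f"{i:03b}"] for i in range(8)], dtype=float)
    target_moments = np.array([0.7, 0.5, 0.2])    # assumed moments the "samples" should match

    w = np.zeros(3)
    samples = []
    for t in range(1000):
        s = states[np.argmax(states @ w)]         # state maximization: currently most favoured state
        w += target_moments - s                   # parameter perturbation; w never settles to a fixed point
        samples.append(s)

    print(np.mean(samples, axis=0))               # empirical moments approach the targets at a fast rate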

    Doctor of Philosophy

    Deep Neural Networks (DNNs) are the state-of-the-art solution in a growing number of tasks including computer vision, speech recognition, and genomics. However, DNNs are computationally expensive as they are carefully trained to extract and abstract features from raw data using multiple layers of neurons with millions of parameters. In this dissertation, we primarily focus on inference, e.g., using a DNN to classify an input image. This is an operation that will be repeatedly performed on billions of devices in the datacenter, in self-driving cars, in drones, etc. We observe that DNNs spend the vast majority of their runtime performing matrix-by-vector multiplications (MVMs). MVMs have two major bottlenecks: fetching the matrix and performing sum-of-product operations. To address these bottlenecks, we use in-situ computing, where the matrix is stored in programmable resistor arrays, called crossbars, and sum-of-product operations are performed using analog computing. In this dissertation, we propose two hardware units, ISAAC and Newton. In ISAAC, we show that in-situ computing designs can outperform DNN digital accelerators, if they leverage pipelining, smart encodings, and can distribute a computation in time and space, within crossbars, and across crossbars. In the ISAAC design, roughly half the chip area/power can be attributed to the analog-to-digital conversion (ADC), i.e., it remains the key design challenge in mixed-signal accelerators for deep networks. In spite of the ADC bottleneck, ISAAC is able to outperform the computational efficiency of the state-of-the-art design (DaDianNao) by 8x. In Newton, we take advantage of a number of techniques to address ADC inefficiency. These techniques exploit matrix transformations, heterogeneity, and smart mapping of computation to the analog substrate. We show that Newton can increase the efficiency of in-situ computing by an additional 2x. Finally, we show that in-situ computing, unfortunately, cannot be easily adapted to handle training of deep networks, i.e., it is only suitable for inference of already-trained networks. By improving the efficiency of DNN inference with ISAAC and Newton, we move closer to low-cost deep learning that in turn will have societal impact through self-driving cars, assistive systems for the disabled, and precision medicine.
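
    A toy numerical model of the in-situ idea, hedged: weights programmed as non-negative crossbar conductances (split into positive and negative arrays), an analog sum-of-products read out per column, and a coarse ADC stage. The bit widths and array sizes are arbitrary placeholders, not ISAAC's or Newton's actual parameters.

    import numpy as np

    def quantize(x, bits, x_max):
        """Coarse model of an ADC: clip and round the analog column currents to 2^bits levels."""
        levels = 2 ** bits - 1
        return np.round(np.clip(x, 0, x_max) / x_max * levels) / levels * x_max

    def crossbar_mvm(weights, v_in, adc_bits=8):
        """Toy in-situ matrix-by-vector product on a resistive crossbar.

        Inputs drive the word lines as voltages; each bit line sums currents
        I = G @ V in the analog domain; the ADC digitizes each column current.
        """
        g_pos = np.clip(weights, 0, None)         # conductances encoding positive weights
        g_neg = np.clip(-weights, 0, None)        # conductances encoding negative weights
        i_pos = g_pos @ v_in                      # analog sum-of-products per column
        i_neg = g_neg @ v_in
        i_max = max(i_pos.max(), i_neg.max(), 1e-9)
        # The ADC is the expensive step; its resolution bounds the result's accuracy.
        return quantize(i_pos, adc_bits, i_max) - quantize(i_neg, adc_bits, i_max)

    rng = np.random.default_rng(0)
    W = rng.normal(size=(16, 64))                 # one layer's weight matrix mapped to a crossbar
    x = rng.random(64)                            # input activations as word-line voltages
    print(np.max(np.abs(crossbar_mvm(W, x) - W @ x)))   # quantization error vs. exact MVM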

    Predictive Accuracy of Recommender Algorithms

    Recommender systems present a customized list of items based upon user or item characteristics with the objective of reducing a large number of possible choices to a smaller ranked set most likely to appeal to the user. A variety of algorithms for recommender systems have been developed and refined including applications of deep learning neural networks. Recent research reports point to a need to perform carefully controlled experiments to gain insights about the relative accuracy of different recommender algorithms, because studies evaluating different methods have not used a common set of benchmark data sets, baseline models, and evaluation metrics. The dissertation used publicly available sources of ratings data with a suite of three conventional recommender algorithms and two deep learning (DL) algorithms in controlled experiments to assess their comparative accuracy. Results for the non-DL algorithms conformed well to published results and benchmarks. The two DL algorithms did not perform as well and illuminated known challenges implementing DL recommender algorithms as reported in the literature. Model overfitting is discussed as a potential explanation for the weaker performance of the DL algorithms and several regularization strategies are reviewed as possible approaches to improve predictive error. Findings justify the need for further research in the use of deep learning models for recommender systems.
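
    As a generic illustration of the kind of regularization reviewed for overfitting, here is a minimal matrix-factorization recommender trained by SGD with L2 weight decay on the latent factors; this is a sketch under assumed hyperparameters, not one of the dissertation's benchmarked algorithms.

    import numpy as np

    def train_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.1, epochs=20):
        """Matrix factorization by SGD with L2 regularization on the latent factors.

        ratings: list of (user, item, rating) triples. The reg term shrinks the
        factors toward zero, a standard guard against overfitting.
        """
        rng = np.random.default_rng(0)
        P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
        Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
        for _ in range(epochs):
            for u, i, r in ratings:
                err = r - P[u] @ Q[i]
                P[u] += lr * (err * Q[i] - reg * P[u])
                Q[i] += lr * (err * P[u] - reg * Q[i])
        return P, Q

    # Tiny synthetic example: 3 users, 4 items.
    data = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (1, 2, 1.0), (2, 0, 4.0), (2, 3, 2.0)]
    P, Q = train_mf(data, n_users=3, n_items=4)
    print(round(P[0] @ Q[0], 2))   # predicted rating for user 0, item 0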

    Applying Deep Machine Learning for psycho-demographic profiling of Internet users using O.C.E.A.N. model of personality

    In the modern era, each Internet user leaves enormous amounts of auxiliary digital residuals (footprints) by using a variety of on-line services, and all this data has been collected and stored for many years. Recent works demonstrated that it is possible to apply simple machine learning methods to analyze collected digital footprints and to create psycho-demographic profiles of individuals. However, while these works clearly demonstrated the applicability of machine learning methods for such an analysis, the simple prediction models created still lack the accuracy necessary to be successfully applied to practical needs. We assumed that using advanced deep machine learning methods may considerably increase the accuracy of predictions. We started with simple machine learning methods to estimate basic prediction performance and moved further by applying advanced methods based on shallow and deep neural networks. We then compared the prediction power of the studied models and drew conclusions about their performance. Finally, we made hypotheses about how prediction accuracy can be further improved. As a result of this work, we provide the full source code used in the experiments in the corresponding GitHub repository for all interested researchers and practitioners. We believe that applying deep machine learning to psycho-demographic profiling may have an enormous impact on society (for good or for worse) and provides means for Artificial Intelligence (AI) systems to better understand humans by creating their psychological profiles. Thus AI agents may achieve the human-like ability to participate in conversation (communication) flow by anticipating human opponents' reactions, expectations, and behavior.
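
    A hedged sketch of the shallow-versus-deeper model progression described above, predicting the five O.C.E.A.N. trait scores from footprint features; the feature dimensionality, layer sizes, use of scikit-learn's MLPRegressor, and random placeholder data are all assumptions standing in for the study's actual pipeline.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    # Placeholder data: rows are users, columns are digital-footprint features,
    # targets are the five O.C.E.A.N. trait scores scaled to [0, 1].
    rng = np.random.default_rng(0)
    X = rng.random((1000, 50))
    y = rng.random((1000, 5))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    shallow = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
    deep = MLPRegressor(hidden_layer_sizes=(128, 64, 32), max_iter=500, random_state=0)

    for name, model in [("shallow", shallow), ("deep", deep)]:
        model.fit(X_tr, y_tr)
        print(name, "R^2 on held-out users:", round(model.score(X_te, y_te), 3))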

    THE SPATIAL INDUCTIVE BIAS OF DEEP LEARNING

    In the past few years, Deep Learning has become the method of choice for producing state-of-the-art results on machine learning problems involving images, text, and speech. The explosion of interest in these techniques has resulted in a large number of successful applications of deep learning, but relatively few studies exploring the nature of and reason for that success. This dissertation is motivated by a desire to understand and reproduce the performance characteristics of deep learning systems, particularly Convolutional Neural Networks (CNNs). One factor in the success of CNNs is that they have an inductive bias that assumes a certain type of spatial structure is present in the data. We give a formal definition of how this type of spatial structure can be characterised, along with some statistical tools for testing whether spatial structure is present in a given dataset. These tools are applied to several standard image datasets, and the results are analyzed. We demonstrate that CNNs rely heavily on the presence of such structure, and then show several ways that a similar bias can be introduced into other methods. The first is a partition-based method for training Restricted Boltzmann Machines and Deep Belief Networks, which is able to speed up convergence significantly without changing the overall representational power of the network. The second is a deep partitioned version of Principal Component Analysis, which demonstrates that a spatial bias can be useful even in a model that is non-connectionist and completely linear. The third is a variation on projective Random Forests, which shows that we can introduce a spatial bias with only minor changes to the algorithm, and no externally imposed partitioning is required. In each case, we can show that introducing a spatial bias results in improved performance on spatial data.
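
    To make the "partitioned linear model" idea concrete, here is a rough sketch of a two-level patch-wise PCA over image data: a PCA per spatial partition followed by a PCA over the concatenated local codes. The patch size, component counts, and use of scikit-learn's PCA are assumptions, not the dissertation's exact construction.

    import numpy as np
    from sklearn.decomposition import PCA

    def partitioned_pca(images, patch=7, n_local=8, n_global=32):
        """Two-level PCA with a spatial partition.

        Level 1 fits a PCA per image patch (the local spatial bias); level 2 fits a
        PCA on the concatenated local codes, mixing information across the partition.
        """
        n, h, w = images.shape
        codes, local_models = [], []
        for r in range(0, h - patch + 1, patch):
            for c in range(0, w - patch + 1, patch):
                block = images[:, r:r + patch, c:c + patch].reshape(n, -1)
                pca = PCA(n_components=n_local).fit(block)
                local_models.append(pca)
                codes.append(pca.transform(block))
        codes = np.hstack(codes)                          # (n, n_patches * n_local)
        top = PCA(n_components=n_global).fit(codes)
        return local_models, top, top.transform(codes)

    imgs = np.random.default_rng(0).random((500, 28, 28))   # stand-in for an image dataset
    _, _, features = partitioned_pca(imgs)
    print(features.shape)                                    # (500, 32)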

    Efficient Models and Algorithms for Image Processing for Industrial Applications

    Image processing and computer vision are now part of our daily life and allow artificial intelligence systems to see and perceive the world with a visual system similar to the human one. In the quest to improve performance, computer vision algorithms have reached remarkable computational complexity. This high computational complexity is mitigated by the availability of hardware capable of supporting these computational demands. However, high-performance hardware cannot always be relied upon when one wants to make the research product usable. In this work, we have focused on the development of computer vision algorithms and methods with low computational complexity but high performance. The first approach is to study the relationship between Fourier-based metrics and Wasserstein distances in order to propose alternative metrics to the latter, considerably reducing the time required to obtain comparable results. In the second case, we start from an industrial problem and develop a deep learning model for change detection, obtaining state-of-the-art performance while reducing the required computational complexity by at least a third compared to the existing literature.
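
    The flavour of the first contribution, trading an exact Wasserstein computation for a cheaper Fourier-side quantity, can be sketched in one dimension. The specific proxy below (low-frequency spectral differences down-weighted by frequency) is a generic illustrative choice, not the metric derived in the work.

    import numpy as np

    def wasserstein_1d(p, q):
        """Exact 1-Wasserstein distance between two histograms on the same uniform grid
        (in units of the grid spacing): the L1 distance between their CDFs."""
        return np.sum(np.abs(np.cumsum(p) - np.cumsum(q)))

    def fourier_proxy(p, q):
        """Cheap Fourier-side surrogate: differences of the two spectra, down-weighted
        by frequency, computed with a single FFT per histogram."""
        d = np.fft.rfft(p - q)
        k = np.arange(1, len(d))                 # skip k = 0 (both histograms sum to 1)
        return np.sum(np.abs(d[1:]) / k)

    rng = np.random.default_rng(0)
    bins = np.linspace(-4, 4, 129)
    a = np.histogram(rng.normal(0.0, 1.0, 10000), bins=bins)[0].astype(float)
    b = np.histogram(rng.normal(0.5, 1.2, 10000), bins=bins)[0].astype(float)
    a, b = a / a.sum(), b / b.sum()
    print(wasserstein_1d(a, b), fourier_proxy(a, b))   # correlated quantities, the second far cheaper at scale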