
    Broad Learning System Based on Maximum Correntropy Criterion

    As an effective and efficient discriminative learning method, the Broad Learning System (BLS) has received increasing attention due to its outstanding performance in various regression and classification problems. However, the standard BLS is derived under the minimum mean square error (MMSE) criterion, which is not always a good choice because of its sensitivity to outliers. To enhance the robustness of BLS, we propose in this work to adopt the maximum correntropy criterion (MCC) to train the output weights, obtaining a correntropy-based broad learning system (C-BLS). Thanks to the inherent advantages of MCC, the proposed C-BLS is expected to achieve excellent robustness to outliers while maintaining the original performance of the standard BLS in Gaussian or noise-free environments. In addition, three alternative incremental learning algorithms for C-BLS are developed, derived from a weighted regularized least-squares solution rather than the pseudoinverse formula. With these incremental learning algorithms, the system can be updated quickly, without retraining from scratch, when new samples arrive or the network needs to be expanded. Experiments on various regression and classification datasets are reported to demonstrate the desirable performance of the new methods.
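
    The idea of training output weights under MCC through a weighted regularized least-squares solution can be illustrated with a minimal sketch. This is not the C-BLS algorithm from the paper: it assumes a Gaussian kernel, a fixed feature matrix `A`, and a simple fixed-point (iteratively reweighted) update. The function name `mcc_ridge` and the parameters `sigma`, `lam`, and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np

def mcc_ridge(A, y, sigma=1.0, lam=1e-3, n_iter=20):
    """Illustrative MCC fit via iteratively reweighted ridge regression.

    A: (n, d) feature matrix, y: (n,) targets.
    sigma: Gaussian kernel width (assumed), lam: ridge penalty (assumed).
    """
    d = A.shape[1]
    # Start from the ordinary (MMSE) ridge solution.
    w = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)
    for _ in range(n_iter):
        e = y - A @ w
        # Gaussian kernel weights: large residuals (outliers) get weights near 0.
        k = np.exp(-e**2 / (2.0 * sigma**2))
        Ak = A * k[:, None]  # rows of A scaled by their weights
        # Weighted regularized least squares: (A^T K A + lam I) w = A^T K y
        w = np.linalg.solve(A.T @ Ak + lam * np.eye(d), Ak.T @ y)
    return w
```

    Each pass solves an ordinary ridge problem in which every sample is scaled by its kernel weight, so samples with large residuals contribute almost nothing to the next estimate.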

    ADVANCED REPRESENTATION LEARNING STRATEGIES FOR BIG DATA ANALYSIS

    With the fast technological advancement in data storage and machine learning, big data analytics has become a core component of various practical applications ranging from industrial automation to medical diagnosis and from cyber-security to space exploration. Recent studies show that every day, more than 1.8 billion photos/images are posted on social media, and 720 thousand hours of videos are uploaded to YouTube. Thus, to handle this large amount of visual data efficiently, image/video classification, object detection/recognition, and segmentation tasks have gathered a lot of attention over the past decade. Consequently, researchers in this domain have proposed various feature extraction, feature learning, and feature encoding algorithms for improving the generalization performance of the aforesaid tasks. For example, the generalization performance of image classification models mainly depends on the choice of data representation. These models aim at building comprehensive representation learning (RL) strategies to encode the relationship between the input and output attributes from the raw big data. Existing RL strategies can be divided into three general categories: statistical approaches (e.g., probabilistic-based analysis and correlation-based measures), unsupervised learning (e.g., autoencoders), and supervised learning (e.g., deep convolutional neural networks (DCNNs)). Among these categories, the unsupervised and supervised learning strategies using artificial neural networks (ANNs) have been widely adopted. In this direction, several auxiliary ideas have been proposed over the past decade to improve the learning capability of ANNs. For instance, the Moore-Penrose (MP) inverse is exploited to refine the parameters (weights and biases) of a trained network. However, the existing MP inverse-based RL methods have an important limitation.
The representations learned through the MP inverse-based strategies suffer from loosely connected feature coding, resulting in poor object representations that lack discriminative power. To address this issue, this dissertation proposes a set of eight novel MP inverse-based RL algorithms. The first part of this dissertation, from Chapter 4 to Chapter 7, is dedicated to proposing novel width-growth models based on the subnet neural network (SNN) for representation learning and image classification. In this part, a novel feature learning algorithm, coined Wi-HSNN, is proposed, followed by an improved batch-by-batch learning algorithm, called OS-HSNN. Then, two novel SNNs are introduced to detect extreme outliers for one-class classification (OCC). Finally, a semi-supervised SNN, named SS-HSNN, is introduced to extend the strategy from the supervised learning domain to the semi-supervised learning domain. The second part of this thesis, comprising Chapter 8 and Chapter 9, focuses on improving the performance of existing multilayer neural networks by harnessing the MP inverse. Here, a novel weight optimization strategy is proposed to improve the performance of multilayer extreme learning machines (ELMs), where the MP inverse is used to feed back the classification imprecision information from the output layer to the hidden layers. Then, a novel fast retraining framework is proposed to enhance the efficiency of transfer learning of DCNNs. The effectiveness of the proposed subnet- and retraining-based algorithms has been evaluated on several widely used image classification datasets, such as ImageNet and Places-365. Furthermore, we validated the performance of the proposed strategies in some extended domains, such as ship-target detection, food image classification, camera model identification, and misinformation identification. The experimental results illustrate the superiority of the proposed algorithms.

    Non-iterative and Fast Deep Learning: Multilayer Extreme Learning Machines

    In the past decade, deep learning techniques have powered many aspects of our daily life and drawn ever-increasing research interest. However, conventional deep learning approaches, such as the deep belief network (DBN), restricted Boltzmann machine (RBM), and convolutional neural network (CNN), suffer from a time-consuming training process due to the fine-tuning of a large number of parameters and the complicated hierarchical structure. Furthermore, this complexity makes it difficult to theoretically analyze and prove the universal approximation capability of those conventional deep learning approaches. To tackle these issues, multilayer extreme learning machines (ML-ELMs) were proposed, which accelerate the development of deep learning. Compared with conventional deep learning, ML-ELMs are non-iterative and fast due to the random feature mapping mechanism. In this paper, we perform a thorough review of the development of ML-ELMs, including the stacked ELM autoencoder (ELM-AE), residual ELM, and local receptive field based ELM (ELM-LRF), and address their applications. In addition, we discuss the connection between random neural networks and conventional deep learning.
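
    The non-iterative training enabled by random feature mapping can be sketched with a basic single-hidden-layer ELM, the building block that multilayer variants stack. This is an illustrative sketch rather than any specific ML-ELM from the review: the function names `elm_fit` and `elm_predict`, the `tanh` activation, and the ridge term `lam` are assumptions made for the example.

```python
import numpy as np

def elm_fit(X, Y, n_hidden=50, lam=1e-6, seed=0):
    """Fit a single-hidden-layer ELM in one shot (no gradient descent).

    Hidden weights W and biases b are drawn at random and never trained;
    only the output weights beta are solved for in closed form.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # random feature mapping
    # Regularized least squares for the output weights.
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Apply the fixed random mapping, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Usage: regress sin(x) on [-3, 3] with a single linear solve.
X = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
Y = np.sin(X)
W, b, beta = elm_fit(X, Y)
mse = float(np.mean((elm_predict(X, W, b, beta) - Y) ** 2))
```

    Because the hidden layer is fixed, training reduces to a single linear solve, which is the source of the speed advantage described in the abstract.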

    Mathematics and Digital Signal Processing

    Modern computer technology has opened up new opportunities for the development of digital signal processing methods. The applications of digital signal processing have expanded significantly and today include audio and speech processing, sonar, radar, and other sensor array processing, spectral density estimation, statistical signal processing, digital image processing, signal processing for telecommunications, control systems, biomedical engineering, and seismology, among others. This Special Issue is aimed at wide coverage of the problems of digital signal processing, from mathematical modeling to the implementation of problem-oriented systems. The basis of digital signal processing is digital filtering. Wavelet analysis implements multiscale signal processing and is used to solve applied problems of de-noising and compression. Processing of visual information, including image and video processing and pattern recognition, is actively used in robotic systems and industrial process control today. Improving digital signal processing circuits and developing new signal processing systems can improve the technical characteristics of many digital devices. The development of new methods of artificial intelligence, including artificial neural networks and brain-computer interfaces, opens up new prospects for the creation of smart technology. This Special Issue contains the latest technological developments in mathematics and digital signal processing. The stated results are of interest to researchers in the field of applied mathematics and developers of modern digital signal processing systems.

    Self-Adaptive, Dynamic, Integrated Statistical and Information Theory Learning

    The paper analyses and positions various error measures applied in neural network training and finds that there is no single best measure, although there is a set of measures whose relative superiority changes across learning situations. A remarkable measure called $E_{Exp}$, published by Silva and his research partners, represents a research direction that successfully combines several measures with fixed importance weighting during learning. The main idea of the paper is to go further and integrate this relative importance into the neural network training algorithm(s) through a novel error measure called $E_{ExpAbs}$. This approach is incorporated into the Levenberg-Marquardt training algorithm, so a novel version of that algorithm is also introduced, resulting in a self-adaptive, dynamic learning algorithm. This dynamism has positive effects not only on the accuracy of the resulting model but also on the training process itself. Comprehensive algorithm tests showed that the proposed algorithm dynamically integrates the two big worlds of statistics and information theory, which is the key novelty of the paper. Comment: 62 pages, 30 figures, original article

    Learning understandable classifier models.

    The topic of this dissertation is the automation of the process of extracting understandable patterns and rules from data. An unprecedented amount of data is available to anyone with a computer connected to the Internet. The disciplines of Data Mining and Machine Learning have emerged over the last two decades to face this challenge. This has led to the development of many tools and methods. These tools often produce models that make very accurate predictions about previously unseen data. However, models built by the most accurate methods are usually hard for humans to understand or interpret. Consequently, they deliver only decisions, without any explanations, and hence do not directly lead to the acquisition of new knowledge. This dissertation contributes to bridging the gap between accurate opaque models and those that are less accurate but more transparent to humans. The dissertation first defines the problem of learning from data. It surveys the state-of-the-art methods for supervised learning of both understandable and opaque models from data, as well as unsupervised methods that detect features present in the data. It describes popular methods of rule extraction from unintelligible models, which rewrite them into an understandable form. Limitations of rule extraction are described. A novel definition of understandability, which ties together computational complexity and learning, is provided to show that rule extraction is an NP-hard problem. Next, the dissertation discusses whether one can expect that even an accurate classifier has learned new knowledge. The survey ends with a presentation of two approaches to building understandable classifiers. On the one hand, understandable models must be able to accurately describe relations in the data. On the other hand, a description of the output of a system in terms of its input often requires the introduction of intermediate concepts, called features.
Therefore, it is crucial to develop methods that describe the data with understandable features and are able to use those features to present the relation that describes the data. Novel contributions of this thesis follow the survey. Two families of rule extraction algorithms are considered. First, a method that can work with any opaque classifier is introduced. Artificial training patterns are generated in a mathematically sound way and used to train more accurate understandable models. Subsequently, two novel algorithms that require the opaque model to be a Neural Network are presented. They rely on access to the network's weights and biases to induce rules encoded as Decision Diagrams. Finally, the topic of feature extraction is considered. The impact of imposing non-negativity constraints on the weights of a neural network is examined. It is proved that a three-layer network with non-negative weights can shatter any given set of points, and experiments are conducted to assess the accuracy and interpretability of such networks. Then, a novel path-following algorithm that finds robust sparse encodings of data is presented. In summary, this dissertation contributes to improved understandability of classifiers in several tangible and original ways. It introduces three distinct aspects of achieving this goal: infusion of additional patterns from the underlying pattern distribution into rule learners, the derivation of decision diagrams from neural networks, and achieving sparse coding with neural networks with non-negative weights.

    Friction, Vibration and Dynamic Properties of Transmission System under Wear Progression

    This reprint focuses on wear and fatigue analysis, the dynamic properties of coating surfaces in transmission systems, and non-destructive condition monitoring for the health management of transmission systems. Transmission systems play a vital role in various types of industrial structures, including wind turbines, vehicles, mining and material-handling equipment, offshore vessels, and aircraft. Surface wear is an inevitable phenomenon during the service life of transmission systems (such as on gearboxes, bearings, and shafts), and wear propagation can reduce the durability of the contact coating surface. As a result, the performance of the transmission system can degrade significantly, which can cause sudden shutdown of the whole system and lead to unexpected economic loss and accidents. Therefore, to ensure adequate health management of the transmission system, it is necessary to investigate the friction, vibration, and dynamic properties of its contact coating surface and monitor its operating conditions.