481 research outputs found

    How meta-heuristic algorithms contribute to deep learning in the hype of big data analytics

    Deep learning (DL) is one of the fastest-emerging contemporary machine learning techniques; it mimics the cognitive patterns of the animal visual cortex to learn new abstract features automatically through deep, hierarchical layers. DL is believed to be a suitable tool for extracting insights from very large volumes of so-called big data. Nevertheless, one of the three "V"s of big data is velocity, which implies that learning has to be incremental because data accumulate rapidly. DL must therefore be fast and accurate. By design, DL extends the feed-forward artificial neural network with many hidden layers of neurons, forming a deep neural network (DNN). Training a DNN is inefficient because of the very long training time required. Obtaining the most accurate DNN within a reasonable run-time is a challenge, given the potentially many parameters in the DNN model configuration and the high dimensionality of the feature space in the training dataset. Meta-heuristics have a history of optimizing machine learning models successfully. How well meta-heuristics can be used to optimize DL in the context of big data analytics is the topic we consider in this paper. As a position paper, we review recent advances in applying meta-heuristics to DL, discuss their pros and cons, and point out some feasible research directions for bridging the gaps between meta-heuristics and DL.

    Quaternion-based deep belief networks fine-tuning

    Deep learning techniques have been paramount in recent years, mainly due to their outstanding results in a number of applications. In this paper, we address the issue of fine-tuning the parameters of Deep Belief Networks by means of meta-heuristics in which real-valued decision variables are described by quaternions. Such approaches essentially perform optimization in fitness landscapes that are mapped to a different representation based on hypercomplex numbers, which may generate smoother surfaces. We can therefore map the optimization process onto a new space representation that is more suitable for learning parameters. We also propose two approaches based on Harmony Search and quaternions that outperform the state-of-the-art results obtained so far on three public datasets for the reconstruction of binary images.
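
The idea of describing a real-valued decision variable by a quaternion can be illustrated with a minimal sketch. It is not the authors' implementation: the norm-based mapping, the assumption that each quaternion component lies in [0, 1], and the names `quaternion_norm` and `to_real` are all illustrative choices that the abstract does not confirm.

```python
import math
import random


def quaternion_norm(q):
    """Euclidean norm of a quaternion given as a 4-tuple (x0, x1, x2, x3)."""
    return math.sqrt(sum(c * c for c in q))


def to_real(q, lower, upper):
    """Map a quaternion to a real value in [lower, upper].

    Assumption: every component of q lies in [0, 1], so the norm lies
    in [0, 2] and dividing by 2 rescales it to [0, 1].
    """
    unit = quaternion_norm(q) / 2.0
    return lower + unit * (upper - lower)


# Hypothetical usage: one quaternion encodes one hyperparameter,
# here a learning rate bounded to [0.01, 0.9].
rng = random.Random(42)
q = tuple(rng.random() for _ in range(4))
learning_rate = to_real(q, 0.01, 0.9)
print(q, learning_rate)
```

Under this sketch the optimizer would search over four components per parameter, and the fitness function would only ever see the mapped real value.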

    Handling dropout probability estimation in convolution neural networks using meta-heuristics

    Deep learning-based approaches have been paramount in recent years, mainly due to their outstanding results in several application domains, ranging from face and object recognition to handwritten digit identification. Convolutional Neural Networks (CNNs) have attracted considerable attention since they model the intrinsic and complex working mechanisms of the brain. However, one main shortcoming of such models is overfitting, which prevents the network from predicting unseen data effectively. In this paper, we address this problem by properly selecting a regularization parameter known as Dropout in the context of CNNs using meta-heuristic-driven techniques. As far as we know, this is the first attempt to tackle this issue using this methodology. For comparison purposes, we also consider a default dropout parameter and a dropout-less CNN. The results revealed that optimizing Dropout-based CNNs is worthwhile, mainly due to the ease of finding suitable dropout probability values without needing to set new parameters empirically.
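
Searching for a dropout probability with a meta-heuristic can be sketched as follows. The abstract does not name the specific meta-heuristics used, so this sketch swaps in a basic Harmony Search purely for illustration. The objective `toy_validation_error` is a hypothetical stand-in for training a CNN and measuring its validation error, and all parameter values (`hms`, `hmcr`, `par`, `bandwidth`) are assumptions.

```python
import random


def harmony_search(objective, lower, upper, hms=10, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=500, seed=0):
    """Minimize a one-dimensional objective with a basic Harmony Search."""
    rng = random.Random(seed)
    # Harmony memory: a small population of candidate solutions.
    memory = [rng.uniform(lower, upper) for _ in range(hms)]
    scores = [objective(x) for x in memory]
    for _ in range(iterations):
        if rng.random() < hmcr:
            # Memory consideration: reuse a stored value...
            candidate = rng.choice(memory)
            if rng.random() < par:
                # ...and sometimes nudge it (pitch adjustment).
                candidate += rng.uniform(-bandwidth, bandwidth)
        else:
            # Otherwise draw a fresh random value.
            candidate = rng.uniform(lower, upper)
        candidate = min(max(candidate, lower), upper)
        score = objective(candidate)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        if score < scores[worst]:
            memory[worst] = candidate
            scores[worst] = score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def toy_validation_error(p):
    """Hypothetical stand-in: pretend the error is lowest at p = 0.3."""
    return (p - 0.3) ** 2 + 0.1


best_p, best_err = harmony_search(toy_validation_error, 0.0, 0.9)
print(best_p, best_err)
```

In a real experiment each call to the objective would train and evaluate a network, so the iteration budget would have to be far smaller than the 500 evaluations used here.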

    Text Detection in Natural Scenes and Technical Diagrams with Convolutional Feature Learning and Cascaded Classification

    An enormous number of digital images are generated and stored every day. Understanding text in these images is an important challenge with large impacts for academic, industrial and domestic applications. Recent studies address the difficulty of separating text targets from noise and background, all of which vary greatly in natural scenes. To tackle this problem, we develop a text detection system to analyze and utilize visual information in a data-driven, automatic and intelligent way. The proposed method incorporates features learned from data, including patch-based coarse-to-fine detection (Text-Conv), connected component extraction using region growing, and graph-based word segmentation (Word-Graph). Text-Conv is a sliding window-based detector, with convolution masks learned using the Convolutional k-means algorithm (Coates et al., 2011). Unlike convolutional neural networks (CNNs), a single vector/layer of convolution mask responses is used to classify patches. An initial coarse detection considers both local and neighboring patch responses, followed by refinement using varying aspect ratios and rotations for a smaller local detection window. Different levels of visual detail from ground truth are utilized in each step, first using constraints on bounding box intersections, and then a combination of bounding box and pixel intersections. Combining masks from different Convolutional k-means initializations, e.g., seeded using random vectors and then support vectors, improves performance. The Word-Graph algorithm uses contextual information to improve word segmentation and prune false character detections based on visual features and spatial context. Our system obtains pixel, character, and word detection f-measures of 93.14%, 90.26%, and 86.77% respectively for the ICDAR 2015 Robust Reading Focused Scene Text dataset, outperforming state-of-the-art systems and producing highly accurate text detection masks at the pixel level.
To investigate the utility of our feature learning approach for other image types, we perform tests on 8-bit greyscale USPTO patent drawing diagram images. An ensemble of AdaBoost classifiers with different convolutional features (MetaBoost) is used to classify patches as text or background. The Tesseract OCR system is used to recognize characters in detected labels and enhance performance. With appropriate pre-processing and post-processing, f-measures of 82% for part label location, and 73% for valid part label locations and strings are obtained, which are the best obtained to date for the USPTO patent diagram data set used in our experiments. To sum up, an intelligent refinement of convolutional k-means-based feature learning and novel automatic classification methods are proposed for text detection, which obtain state-of-the-art results without the need for strong prior knowledge. Different ground truth representations along with features including edges, color, shape and spatial relationships are used coherently to improve accuracy. Different variations of feature learning are explored, e.g. support vector-seeded clustering and MetaBoost, with results suggesting that increased diversity in learned features benefits convolution-based text detectors.