
    An Exploration of Controlling the Content Learned by Deep Neural Networks

    With the great success of deep neural networks (DNNs), how to obtain a trustworthy model attracts more and more attention. Generally, people feed raw data directly to the DNN during training. However, the entire training process is a black box, in which the knowledge learned by the DNN is out of control, and this carries many risks. The most common one is overfitting. As research on neural networks has deepened, additional and probably greater risks have recently been discovered. Related research shows that unintended clues can hide in the training data because of the randomness and the finite scale of the data. Some of these clues build meaningless but explicit links between input data and output data, called "shortcuts". The DNN then makes decisions based on these shortcuts, a phenomenon also called "network cheating". Such shortcut knowledge learned by the DNN ruins the training and makes the DNN's performance unreliable. Therefore, we need to control the raw data used in training. In this dissertation, we name the explicit raw data "content" and the implicit logic learned by the DNN "knowledge". By quantifying the information in the DNN's training, we find that the information learned by the network is much less than the information contained in the dataset. This indicates that it is unnecessary to train the neural network with all of the information: training with partial information can achieve an effect similar to using full information. In other words, it is possible to control the content fed into the DNN, and the strategy shown in this study can reduce the risks mentioned above (e.g., overfitting and shortcuts). Moreover, using reconstructed data (with partial information) to train the network can reduce the complexity of the network and accelerate training.
In this dissertation, we provide a pipeline to implement content control in the DNN's training. A series of experiments proves its feasibility in two applications: one is human brain anatomical structure analysis, and the other is human pose detection and classification.
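The abstract's idea of training on reconstructed data that carries only partial information can be sketched with a low-rank (PCA) reconstruction. The dissertation's actual reconstruction pipeline is not given here, so the NumPy-only SVD approach and the component count `k` below are illustrative assumptions, not the author's method:

```python
import numpy as np

def pca_reconstruct(X, k):
    """Reconstruct X from its top-k principal components.

    The result carries only part of the dataset's information
    (the k strongest directions of variance); training on such
    reconstructions is one way to control the content a DNN sees.
    k is an illustrative choice, not a value from the dissertation.
    """
    mu = X.mean(axis=0)
    Xc = X - mu
    # SVD rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    # keep only the first k components, discard the rest
    X_low = (U[:, :k] * S[:k]) @ Vt[:k]
    return X_low + mu

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
X5 = pca_reconstruct(X, k=5)
# the rank-5 reconstruction can never fit better than the full-rank one
err_low = np.linalg.norm(X - X5)
err_full = np.linalg.norm(X - pca_reconstruct(X, k=20))
```

The reconstructed `X5` has rank at most 5 after centering, so a network trained on it sees strictly less information than one trained on `X`.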

    A Genetic Bayesian Approach for Texture-Aided Urban Land-Use/Land-Cover Classification

    Urban land-use/land-cover classification is entering a new era with the increased availability of high-resolution satellite imagery and new methods such as texture analysis and artificial intelligence classifiers. Recent research has demonstrated exciting improvements from using fractal dimension, lacunarity, and Moran’s I in classification, but the integration of these spatial metrics has seldom been investigated. Also, previous research focused more on developing new classifiers than on improving the robust, simple, and fast maximum likelihood classifier. The goal of this dissertation research is to develop a new approach that utilizes a texture vector (fractal dimension, lacunarity, and Moran’s I), combined with a new genetic Bayesian classifier, to improve urban land-use/land-cover classification accuracy. The approach was demonstrated on examples of different land-use/land-cover types using post-Katrina IKONOS imagery of New Orleans. Because previous geometric-step and arithmetic-step implementations of the triangular prism algorithm can leave a significant number of pixels unutilized when measuring local fractal dimension, the divisor-step method was developed and found to yield more accurate estimates. In addition, a new lacunarity estimator based on the triangular prism method and the gliding-box algorithm was developed and found to be better than existing gray-scale estimators for classifying land-use/land-cover from IKONOS imagery. The accuracy of fractal dimension-aided classification was less sensitive to window size than that of lacunarity and Moran’s I. In general, the optimal window size for the texture vector-aided approach is 27x27 to 37x37 pixels (i.e., 108x108 to 148x148 meters). As expected, the texture vector-aided approach yielded 2-16% better accuracy than the individual textural index-aided approaches.
Compared to the per-pixel maximum likelihood classification, the proposed genetic Bayesian classifier yielded a 12% accuracy improvement by optimizing prior probabilities with the genetic algorithm, whereas the integrated approach with a texture vector and the genetic Bayesian classifier significantly improved classification accuracy by 17-21%. Compared to the neural network classifier and genetic algorithm-support vector machines, the genetic Bayesian classifier was slightly less accurate but more computationally efficient and required less human supervision. This research not only develops a new approach of integrating texture analysis with artificial intelligence for classification, but also reveals a promising avenue of using advanced texture analysis and classification methods to associate socioeconomic statuses with remote sensing image textures.
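Of the three texture metrics in the texture vector, Moran's I has the most compact definition: I = (N/W) Σᵢ Σⱼ wᵢⱼ(xᵢ − x̄)(xⱼ − x̄) / Σᵢ(xᵢ − x̄)². A minimal sketch over a pixel window, assuming binary rook (4-neighbor) contiguity weights — the dissertation's exact weight scheme is not specified here:

```python
import numpy as np

def morans_i(window):
    """Moran's I spatial autocorrelation for a 2-D pixel window,
    using binary rook (4-neighbor) contiguity weights.
    Near +1: spatially clustered texture; near -1: checkerboard-like."""
    x = np.asarray(window, dtype=float)
    z = x - x.mean()
    n = x.size
    num = 0.0   # sum of w_ij * z_i * z_j over neighbor pairs
    w = 0       # total weight W
    rows, cols = x.shape
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    num += z[i, j] * z[ni, nj]
                    w += 1
    return (n / w) * num / (z ** 2).sum()

checker = np.indices((6, 6)).sum(axis=0) % 2   # perfect checkerboard
gradient = np.tile(np.arange(6.0), (6, 1))     # smooth left-to-right ramp
print(morans_i(checker))   # -1.0: perfectly dispersed
print(morans_i(gradient))  # positive: spatially autocorrelated
```

Sliding such a window (e.g., the 27x27 to 37x37 sizes found optimal above) across the image produces the Moran's I band of the texture vector.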

    Classification and Analysis of Android Malware Images Using Feature Fusion Technique

    Its densely packed functionality and artificial intelligence (AI)-powered applications have made the Android operating system a big player in the market. Android smartphones have become an integral part of life, and users rely on their smart devices for making calls, sending text messages, navigation, games, and financial transactions, to name a few. This evolution of the smartphone community has opened new horizons for malware developers. As malware variants grow at a tremendous rate every year, there is an urgent need to combat stealth malware techniques. This paper proposes a visualization and machine learning-based framework for classifying Android malware. Android malware applications from the DREBIN dataset were converted into grayscale images. In the first phase of the experiment, the proposed framework transforms Android malware into fifteen different image sections and identifies malware files by exploiting handcrafted features associated with Android malware images. Algorithms such as the Gray Level Co-occurrence Matrix (GLCM), Global Image deScripTors (GIST), and Local Binary Patterns (LBP) are used to extract the handcrafted features from the image sections. The extracted features were then classified using machine learning algorithms such as K-Nearest Neighbors, Support Vector Machines, and Random Forests. In the second phase of the experiment, handcrafted features were fused with CNN features to form the feature fusion strategy. The classification performance was evaluated against every malware image file section, and the results obtained using the feature fusion strategy were compared with the handcrafted-feature results. The experiment results show that the Feature Fusion-SVM model is best suited for the identification and classification of Android malware using the certificate and Android Manifest (CR + AM) malware images, attaining a high accuracy of 93.24%.
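The first step above, rendering a file's raw bytes as a grayscale image, can be sketched as follows. The fixed `width` and the zero-padding are illustrative assumptions (in practice the width is often chosen from the file size), and the paper's split into fifteen file sections is not reproduced here:

```python
import numpy as np

def bytes_to_grayscale(data, width=256):
    """Render raw bytes as a 2-D grayscale image (one byte = one pixel).

    width=256 is an illustrative choice, not the paper's setting.
    The tail is zero-padded so the byte string reshapes into full rows.
    """
    buf = np.frombuffer(data, dtype=np.uint8)
    rows = -(-buf.size // width)              # ceiling division
    img = np.zeros(rows * width, dtype=np.uint8)
    img[:buf.size] = buf                      # copy bytes, pad remainder
    return img.reshape(rows, width)

# toy stand-in for one malware file section (e.g., AndroidManifest bytes)
sample = bytes(range(256)) * 3
img = bytes_to_grayscale(sample, width=64)
print(img.shape)   # (12, 64)
```

Texture descriptors such as GLCM or LBP are then computed on images like `img`, one per file section.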

    Siamese Instance Search for Tracking

    In this paper we present a tracker which is radically different from state-of-the-art trackers: we apply no model updating, no occlusion detection, no combination of trackers, no geometric matching, and still deliver state-of-the-art tracking performance, as demonstrated on the popular online tracking benchmark (OTB) and six very challenging YouTube videos. The presented tracker simply matches the initial patch of the target in the first frame with candidates in a new frame and returns the most similar patch by a learned matching function. The strength of the matching function comes from being extensively trained generically, i.e., without any data of the target, using a Siamese deep neural network, which we design for tracking. Once learned, the matching function is used as is, without any adapting, to track previously unseen targets. It turns out that the learned matching function is so powerful that a simple tracker built upon it, coined Siamese INstance search Tracker (SINT), which only uses the original observation of the target from the first frame, suffices to reach state-of-the-art performance. Further, we show that the proposed tracker even allows for target re-identification after the target was absent for a complete video shot. Comment: This paper is accepted to the IEEE Conference on Computer Vision and Pattern Recognition, 201
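The test-time matching described above — embed the first-frame patch once, embed each candidate, return the most similar — can be sketched as follows. The learned Siamese branch is replaced by a hypothetical `embed` stand-in (a fixed random projection), since the trained network itself is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(42)
# hypothetical stand-in for the learned Siamese branch:
# a fixed linear map from a flattened 16x16 patch to a feature vector
W = rng.normal(size=(64, 16 * 16))

def embed(patch):
    """Map a 16x16 patch to a unit-length feature vector.
    In SINT this would be the output of the trained Siamese CNN branch."""
    v = W @ np.asarray(patch, dtype=float).ravel()
    return v / np.linalg.norm(v)

def track(initial_patch, candidates):
    """Return the index of the candidate most similar to the target.
    No model update: the first-frame embedding is matched as-is."""
    q = embed(initial_patch)
    scores = [float(q @ embed(c)) for c in candidates]
    return int(np.argmax(scores))

target = rng.normal(size=(16, 16))
candidates = [rng.normal(size=(16, 16)) for _ in range(5)]
candidates[3] = target + 0.01 * rng.normal(size=(16, 16))  # near-copy
print(track(target, candidates))  # picks the near-identical candidate
```

Because both branches share the same fixed embedding, the same `track` call also supports re-identification: the first-frame query can be matched against candidates from any later shot without adaptation.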