11 research outputs found

    Open set learning with augmented category by exploiting unlabelled data (open-LACU)

    Considering the nature of unlabelled data, it is common for partially labelled training datasets to contain samples that belong to novel categories. Although these so-called observed novel categories exist in the training data, they do not belong to any of the training labels. In contrast, open-sets define novel categories as those unobserved during training, but present during testing. This research is the first to generalize between observed and unobserved novel categories within a new learning policy called open-set learning with augmented category by exploiting unlabeled data, or open-LACU. This study conducts a high-level review of novelty detection so as to differentiate between the research fields that concern observed novel categories and the research fields that concern unobserved novel categories. Open-LACU is then introduced as a synthesis of the relevant fields to maintain the advantages of each within a single learning policy. Currently, we are finalising the first open-LACU network, which will be combined with this pre-print to be sent for publication. Comment: 11 pages
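
The distinction between observed and unobserved novel categories can be illustrated with a small decision rule. The abstract does not describe the open-LACU network itself, so the following is only a hypothetical sketch under stated assumptions: the classifier emits one logit per labelled class plus one extra logit for an augmented category (observed novel samples), and a max-softmax threshold, a common open-set baseline, flags unobserved novel samples. The function names and the threshold value are illustrative and not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, known_labels, threshold=0.5):
    """Hypothetical decision rule (not the authors' method).

    `logits` holds len(known_labels) + 1 scores; the last score belongs to
    the augmented category that absorbs novel samples seen, unlabelled,
    during training. A low maximum probability is treated as a sign that
    the sample belongs to a category never seen during training.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unobserved-novel"   # rejected by the open-set threshold
    if best == len(known_labels):
        return "observed-novel"     # augmented (K+1-th) category
    return known_labels[best]


labels = ["cat", "dog"]
print(predict([5.0, 0.0, 0.0], labels))   # confident known class
print(predict([0.0, 0.0, 5.0], labels))   # augmented category wins
print(predict([0.1, 0.0, 0.05], labels))  # no class is confident
```

In this sketch the augmented category handles novelty that was present in the unlabelled training data, while the threshold handles novelty that appears only at test time.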

    COVID-19 Patterns Identification using Generative Adversarial Networks Based Implementation

    Abstract: Predictive analytics and medical diagnostics are two of the most important fields of study that have a lot of room for growth. Today, the COVID-19 virus has a huge impact, but it changes a lot. This virus has spread across the world, and there is currently no vaccine for it. The number of cases in India now stands at more than 10,000, and more than 300 people have died from it. Twenty people in the world have COVID. Neural network technology has made big changes. The Generative Adversarial Network (GAN) is used to analyse pictures and multimedia data in huge areas with great speed. Medical images from COVID-19 data sets will be looked at to see if they can predict what will happen to patients. Medical images, such as X-rays and CT scans, are used to train the GANs, which build, change, and analyse data sets and key points with advanced deep learning models. If GANs are used in the general prediction study, they can help traditional neural networks outperform them in a lot of places. This study is meant to help people better plan for mining and information exploration by combining work done on benchmark data sets with more advanced text.

    VAE-Info-cGAN: Generating Synthetic Images by Combining Pixel-level and Feature-level Geospatial Conditional Inputs

    Training robust supervised deep learning models for many geospatial applications of computer vision is difficult due to dearth of class-balanced and diverse training data. Conversely, obtaining enough training data for many applications is financially prohibitive or may be infeasible, especially when the application involves modeling rare or extreme events. Synthetically generating data (and labels) using a generative model that can sample from a target distribution and exploit the multi-scale nature of images can be an inexpensive solution to address scarcity of labeled data. Towards this goal, we present a deep conditional generative model, called VAE-Info-cGAN, that combines a Variational Autoencoder (VAE) with a conditional Information Maximizing Generative Adversarial Network (InfoGAN), for synthesizing semantically rich images simultaneously conditioned on a pixel-level condition (PLC) and a macroscopic feature-level condition (FLC). Dimensionally, the PLC can only vary in the channel dimension from the synthesized image and is meant to be a task-specific input. The FLC is modeled as an attribute vector in the latent space of the generated image which controls the contributions of various characteristic attributes germane to the target distribution. An interpretation of the attribute vector to systematically generate synthetic images by varying a chosen binary macroscopic feature is explored. Experiments on a GPS trajectories dataset show that the proposed model can accurately generate various forms of spatio-temporal aggregates across different geographic locations while conditioned only on a raster representation of the road network. 
The primary intended application of the VAE-Info-cGAN is synthetic data (and label) generation for targeted data augmentation for computer vision-based modeling of problems relevant to geospatial analysis and remote sensing. Comment: 10 pages, 4 figures, Peer-reviewed and accepted version of the paper published at the 13th ACM SIGSPATIAL International Workshop on Computational Transportation Science (IWCTS 2020)

    Learning from small and imbalanced dataset of images using generative adversarial neural networks.

    The performance of deep learning models is unmatched by any other approach in supervised computer vision tasks such as image classification. However, training these models requires a lot of labeled data, which are not always available. Labelling a massive dataset is largely a manual and very demanding process. Thus, this problem has led to the development of techniques that bypass the need for labelling at scale. Despite this, existing techniques such as transfer learning, data augmentation and semi-supervised learning have not lived up to expectations. Some of these techniques do not account for other classification challenges, such as a class-imbalance problem. Thus, these techniques mostly underperform when compared with fully supervised approaches. In this thesis, we propose new methods to train a deep model on image classification with a limited number of labeled examples. This was achieved by extending state-of-the-art generative adversarial networks with multiple fake classes and network switchers. These new features enabled us to train a classifier using large unlabeled data, while generating class specific samples. The proposed model is label agnostic and is suitable for different classification scenarios, ranging from weakly supervised to fully supervised settings. This was used to address classification challenges with limited labeled data and a class-imbalance problem. Extensive experiments were carried out on different benchmark datasets. Firstly, the proposed approach was used to train a classification model and our findings indicated that the proposed approach achieved better classification accuracies, especially when the number of labeled samples is small. Secondly, the proposed approach was able to generate high-quality samples from class-imbalance datasets. The samples' quality is evident in improved classification performances when generated samples were used in neutralising class-imbalance. 
The results are thoroughly analyzed and, overall, our method showed superior performance over popular resampling techniques and the AC-GAN model. Finally, we successfully applied the proposed approach as a new augmentation technique to two challenging real-world problems: faces with attributes and legacy engineering drawings. The results obtained demonstrate that the proposed approach is effective even in extreme cases.

    Machine Learning Modeling for Image Segmentation in Manufacturing and Agriculture Applications

    Doctor of Philosophy. Department of Industrial & Manufacturing Systems Engineering. Shing I Chang. This dissertation focuses on applying machine learning (ML) modelling to image segmentation tasks in various applications such as additive manufacturing monitoring, agricultural soil cover classification, and laser scribing quality control. The proposed ML framework uses various ML models, such as a gradient boosting classifier and a deep convolutional neural network, to improve and automate image segmentation tasks. In recent years, supervised ML methods have been widely adopted for image processing applications in various industries. The presence of cameras installed in production processes has generated a vast amount of image data that can potentially be used for process monitoring. Specifically, deep supervised machine learning models have been successfully implemented to build automatic tools for filtering and classifying useful information for process monitoring. However, successful implementations of deep supervised learning algorithms depend on several factors, such as the distribution and size of the training data, the selected ML models, and consistency in the target domain distribution, which may change with environmental conditions over time. The proposed framework takes advantage of general-purpose, trained supervised learning models and applies them to process monitoring applications related to manufacturing and agriculture. In Chapter 2, a layer-wise framework is proposed to monitor the quality of 3D-printed parts based on top-view images. The proposed statistical process monitoring method starts with self-start control charts that require only two successful initial prints. Unsupervised machine learning methods can be used for problems in which high accuracy is not required, but statistical process monitoring usually demands high classification accuracies to avoid Type I and II errors.
To address the challenges that lighting poses for image processing with unsupervised methods, a supervised Gradient Boosting Classifier (GBC) with 93 percent accuracy is adopted to classify each printed layer from the printing bed. Despite the power of GBC and other decision-tree-based ML models compared to unsupervised ML models, their capability is limited in terms of accuracy and running time for complex classification problems such as soil cover classification. In Chapter 3, a deep convolutional neural network (DCNN) for semantic segmentation is trained to quantify and monitor soil coverage in agricultural fields. The trained model is capable of accurately quantifying green canopy cover, counting plants, and classifying stubble. Due to the wide variety of scenarios in a real agricultural field, 3942 high-resolution images were collected and labeled for the training and test data sets. The difficulty of collecting, cleaning, and labeling this dataset was the motivation to find a better approach to alleviate the data-wrangling burden of ML model training. One of the most influential factors is the need for a high volume of labeled data from the exact problem domain in terms of feature space and the distributions of data of all classes. Image data preparation for deep learning model training is expensive in terms of labelling time due to tedious manual processing. Multiple human labelers can work simultaneously, but inconsistent labeling will generate a training data set that often compromises model performance. In addition, training an ML model for a complicated problem from scratch also demands vast computational power. One of the potential approaches for alleviating data wrangling challenges is transfer learning (TL). In Chapter 4, a TL approach was adopted for monitoring three laser scribing characteristics – scribe width, straightness, and debris – to answer these challenges.
The proposed transfer deep convolutional neural network (TDCNN) model can reduce time-consuming and costly data preparation. The proposed framework leverages a deep learning model already trained for a similar problem and uses only 21 images gleaned from the problem domain. The proposed TDCNN overcame the data challenge by leveraging the DCNN model called VGG16, already trained for basic geometric features using more than two million pictures. Appropriate image processing techniques were provided to measure scribe width and line straightness as well as total scribe and debris area using classified images with 96 percent accuracy. In addition to the fact that the TDCNN functions with fewer trainable parameters (i.e., 5 million versus 15 million for VGG16), increasing the training size to 154 did not provide a significant improvement in accuracy, which shows that the TDCNN does not need a high volume of data to be successful. Finally, Chapter 5 summarizes the proposed work and lays out topics for future research.
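
The "15 million for VGG16" figure can be sanity-checked with simple arithmetic. The sketch below assumes that the figure refers to the 13 convolutional layers of the standard VGG16 architecture (3x3 kernels with biases), excluding the fully connected head; the abstract does not say which layers were counted, and the 5 million trainable parameters of the TDCNN are not broken down in the text, so only the VGG16 side is reproduced here.

```python
# (in_channels, out_channels) of the 13 convolutional layers in the
# standard VGG16 architecture; every layer uses 3x3 kernels.
VGG16_CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]


def conv_params(in_channels, out_channels, kernel_size=3):
    """Weights plus one bias per output channel for a 2D convolution."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def total_conv_params(layers=VGG16_CONV_LAYERS):
    """Sum the parameter counts of all listed convolutional layers."""
    return sum(conv_params(cin, cout) for cin, cout in layers)


print(total_conv_params())  # 14714688, i.e. roughly 15 million
```

Under this assumption, freezing about two thirds of those weights and training the rest would be one way to arrive at a figure near 5 million, but that split is a guess and not something the abstract states.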

    Gotham city. Predicting ‘corrupted’ municipalities with machine learning

    The economic costs of white-collar crimes, such as corruption, bribery, embezzlement, abuse of authority, and fraud, are substantial. How to eradicate them is a mounting task in many countries. Using police archives, we apply machine learning algorithms to predict corruption crimes in Italian municipalities. Drawing on input data from 2011, our classification trees correctly forecast over 70 % (about 80 %) of the municipalities that will experience corruption episodes (an increase in corruption crimes) over the period 2012–2014. We show that algorithmic predictions could strengthen the ability of the 2012 Italy's anti-corruption law to fight white-collar delinquencies and prevent the occurrence of such crimes while preserving transparency and accountability of the policymaker

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model deals with face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model allows the quality of the interaction to be evaluated quantitatively using statistical tools that measure how effective the recognition phase is. In this paper we apply this theory to the case where one of the interactants is a robot; in this case, the recognition phases performed by the robot and the human have to be revised w.r.t. the original model. The model is applied to Berrick, a recent open-source low-cost robotic head platform, where gazing is the social signal to be considered.