5 research outputs found

    Various generative adversarial networks model for synthetic prohibitory sign image generation

    Synthetic image generation is a critical issue in computer vision. Traffic sign images synthesized from standard templates are commonly used to build recognition algorithms, offering a varied and low-cost source of training data. Convolutional Neural Networks (CNNs) achieve excellent traffic sign detection and recognition given sufficient annotated training data, and the reliability of the entire vision system depends on these networks. However, obtaining traffic sign datasets for most countries in the world is complicated. This work uses several generative adversarial network (GAN) models to construct realistic synthetic images: Least Squares Generative Adversarial Networks (LSGAN), Deep Convolutional Generative Adversarial Networks (DCGAN), and Wasserstein Generative Adversarial Networks (WGAN). It also examines the quality of the images produced by the various GANs under different parameter settings, varying the number and scale of the input images. The Structural Similarity Index (SSIM) and Mean Squared Error (MSE) are used to measure image fidelity, comparing the SSIM values between each generated image and the corresponding real image. The results show that the generated images display a strong similarity to the real images when more training images are used. LSGAN outperformed the other GAN models in the experiment, achieving the maximum SSIM values with 200 input images, 2000 epochs, and an image size of 32 × 32.
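
    To make the evaluation step concrete, here is a minimal Python sketch of the SSIM/MSE comparison described above, using scikit-image. The 32 × 32 image size follows the paper's setup; the function name and the random stand-in arrays are illustrative assumptions, not the authors' code.

```python
# Sketch: compare a GAN-generated sign image with its real counterpart
# using SSIM and MSE, as in the evaluation described in the abstract.
import numpy as np
from skimage.metrics import structural_similarity, mean_squared_error

def compare_images(real: np.ndarray, generated: np.ndarray) -> tuple[float, float]:
    """Return (SSIM, MSE) for a pair of 32x32 RGB uint8 images."""
    assert real.shape == generated.shape == (32, 32, 3)
    ssim_val = structural_similarity(real, generated, channel_axis=-1, data_range=255)
    mse_val = mean_squared_error(real, generated)
    return float(ssim_val), float(mse_val)

# Toy usage with random stand-in arrays; in the paper's setting these would be
# a real prohibitory sign image and a GAN-generated one.
rng = np.random.default_rng(0)
real = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
fake = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
print(compare_images(real, fake))
```

    A higher SSIM (closer to 1) indicates stronger structural similarity to the real image, which is how the abstract ranks LSGAN above the other GAN models.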

    Model-based occlusion disentanglement for image-to-image translation

    Image-to-image translation is affected by entanglement phenomena, which may occur when the target data encompass occlusions such as raindrops, dirt, etc. Our unsupervised, model-based learning approach disentangles scene and occlusions, while benefiting from an adversarial pipeline to regress the physical parameters of the occlusion model. The experiments demonstrate that our method is able to handle varying types of occlusions and generate highly realistic translations, qualitatively and quantitatively outperforming the state of the art on multiple datasets.
    Comment: ECCV 2020
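
    The core adversarial idea, a critic that both scores realism and regresses the physical parameters of the occlusion model, can be sketched loosely in PyTorch as below. The architecture, the two-parameter occlusion model, and all names are illustrative assumptions, not the paper's actual pipeline.

```python
# Loose sketch: a critic with an auxiliary head that regresses physical
# occlusion parameters (e.g. drop density/size) alongside the real/fake score.
import torch
import torch.nn as nn

class OcclusionCritic(nn.Module):
    def __init__(self, n_params: int = 2):  # hypothetical 2-parameter occlusion model
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.adv_head = nn.Linear(128, 1)            # real/fake score
        self.param_head = nn.Linear(128, n_params)   # occlusion parameter regression

    def forward(self, img: torch.Tensor):
        h = self.features(img)
        return self.adv_head(h), self.param_head(h)

# One illustrative loss computation: an adversarial term plus a regression
# term on images whose occlusion parameters are known, encouraging the
# pipeline to keep occlusion factors separate from scene content.
critic = OcclusionCritic()
fake_img = torch.randn(4, 3, 64, 64)   # stand-in for generator output
true_params = torch.rand(4, 2)         # stand-in for known occlusion parameters
score, pred_params = critic(fake_img)
loss = nn.functional.binary_cross_entropy_with_logits(
    score, torch.zeros_like(score)     # fake label = 0
) + nn.functional.mse_loss(pred_params, true_params)
```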

    Deep learning-based signal processing approaches for improved tracking of human health and behaviour with wearable sensors

    This thesis explores two lines of research in the context of sequential data and machine learning in the remote environment, i.e., outside the lab setting, using data acquired from wearable devices. Firstly, we explore Generative Adversarial Networks (GANs) as a reliable tool for time series generation, imputation and forecasting. Secondly, we investigate the applicability of novel deep learning frameworks to sequential data processing and their advantages over traditional methods. More specifically, we use our models to unlock additional insights and biomarkers in human-centric datasets. Our first research avenue concerns the generation of sequential physiological data. Access to physiological data, particularly medical data, has become heavily regulated in recent years, which has created bottlenecks in developing computational models to assist in diagnosing and treating patients. Therefore, we explore GAN models to generate medical time series data that adhere to privacy-preserving regulations. We present our novel methods for generating and imputing synthetic, multichannel sequential medical data while complying with privacy regulations. Addressing these concerns allows medical data to be shared and disseminated and, in turn, advances clinical research in the relevant fields. Secondly, we explore novel deep learning technologies applied to human-centric sequential data to unlock further insights while addressing the idea of environmentally sustainable AI. We develop novel deep learning processing methods to estimate human activity and heart rate through convolutional networks. We also introduce our ‘time series-to-time series GAN’, which maps photoplethysmograph data to blood pressure measurements. Importantly, we denoise artefact-laden biosignal data to a competitive standard using a custom objective function and a novel application of GANs. These deep learning methods help to produce nuanced biomarkers and state-of-the-art insights from human physiological data. The work laid out in this thesis provides a foundation for state-of-the-art deep learning methods for sequential data processing while keeping a keen eye on sustainable AI.
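
    As a rough illustration of the ‘time series-to-time series GAN’ idea mentioned above, the sketch below pairs an LSTM generator that maps PPG windows to blood pressure windows with an LSTM critic that judges how plausible the produced BP sequences look. Layer sizes, the window length, and all names are assumptions for illustration, not the thesis's architecture.

```python
# Sketch: sequence-to-sequence translation GAN mapping PPG to blood pressure.
import torch
import torch.nn as nn

class Seq2SeqGenerator(nn.Module):
    """Maps a PPG sequence (batch, T, 1) to a BP sequence (batch, T, 1)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, num_layers=2,
                           batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, ppg: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(ppg)
        return self.head(h)

class SeqCritic(nn.Module):
    """Scores how plausible a BP sequence looks (higher = more realistic)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, bp: torch.Tensor) -> torch.Tensor:
        _, (h_n, _) = self.rnn(bp)
        return self.head(h_n[-1])

gen, critic = Seq2SeqGenerator(), SeqCritic()
ppg = torch.randn(8, 256, 1)   # 8 windows of 256 PPG samples (stand-in data)
fake_bp = gen(ppg)
adv_score = critic(fake_bp)    # adversarial signal; in practice a paired
                               # L1/L2 term against reference BP would be added
```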

    Multimedia Forensics

    This book is open access. Media forensics has never been more relevant to societal life. Not only does media content represent an ever-increasing share of the data traveling on the net and the preferred means of communication for most users, it has also become an integral part of the most innovative applications in the digital information ecosystem serving various sectors of society, from entertainment to journalism to politics. Undoubtedly, advances in deep learning and computational imaging have contributed significantly to this outcome. The underlying technologies that drive this trend, however, also pose a profound challenge to establishing trust in what we see, hear, and read, and make media content the preferred target of malicious attacks. In this new threat landscape, powered by innovative imaging technologies and sophisticated tools based on autoencoders and generative adversarial networks, this book fills an important gap. It presents a comprehensive review of state-of-the-art forensic capabilities relating to media attribution, integrity and authenticity verification, and counter-forensics. Its content is developed to give practitioners, researchers, photo and video enthusiasts, and students a holistic view of the field.

    Understanding and Exploiting the Latent Space to improve Machine Learning models eXplainability

    In recent years, Artificial Intelligence (AI) and Machine Learning (ML) systems have dramatically increased their capabilities, achieving human-like or even human-superior performance in specific tasks. This increased performance has gone hand in hand with an increase in the complexity of AI and ML models, compromising their transparency and trustworthiness and making them inscrutable black boxes for decision making. Explainable AI (XAI) is a field that seeks to make the decisions suggested by ML models more transparent to human users by providing different types of explanations. This thesis explores the possibility of using a reduced feature space called the “latent space”, produced by a particular kind of ML model, as a means for the explanation process. First, we study the possibility of navigating the latent space as a form of interactive explanation to better understand the rationale behind the model’s predictions. Second, we propose an interpretable-by-design approach that makes the explanation process completely transparent to the user. Third, we exploit mathematical properties of the latent space of certain ML models (similarity and linearity) to produce explanations that are shown to be more plausible and accurate than those of existing competitors in the state of the art. To validate our approach, we perform extensive benchmarking on different datasets, with respect to both existing metrics and new ones introduced in our work to highlight new XAI problems beyond the current literature.
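
    As a concrete illustration of navigating the latent space and exploiting its linearity, the sketch below encodes an instance, moves linearly toward another instance's latent code, and decodes each intermediate point to observe how a black-box classifier's prediction changes along the path. The untrained encoder, decoder, and classifier are stand-in assumptions; only the traversal mechanics are the point.

```python
# Sketch: linear traversal in a latent space as an interactive explanation.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(20, 4))     # input -> latent (stand-in)
decoder = nn.Sequential(nn.Linear(4, 20))     # latent -> input (stand-in)
classifier = nn.Sequential(nn.Linear(20, 2))  # black box to explain (stand-in)

x_src, x_dst = torch.randn(1, 20), torch.randn(1, 20)

with torch.no_grad():
    z_src, z_dst = encoder(x_src), encoder(x_dst)
    for alpha in torch.linspace(0.0, 1.0, steps=5):
        z = (1 - alpha) * z_src + alpha * z_dst   # linear path in latent space
        x_hat = decoder(z)                        # synthetic neighbour of x_src
        p = classifier(x_hat).softmax(dim=-1)
        print(f"alpha={float(alpha):.2f}  p(class 0)={p[0, 0].item():.3f}")
```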