175 research outputs found

    Privacy-preserving information hiding and its applications

    Get PDF
    The phenomenal advances in cloud computing technology have raised concerns about data privacy. Aided by the modern cryptographic techniques such as homomorphic encryption, it has become possible to carry out computations in the encrypted domain and process data without compromising information privacy. In this thesis, we study various classes of privacy-preserving information hiding schemes and their real-world applications for cyber security, cloud computing, Internet of things, etc. Data breach is recognised as one of the most dreadful cyber security threats in which private data is copied, transmitted, viewed, stolen or used by unauthorised parties. Although encryption can obfuscate private information against unauthorised viewing, it may not stop data from illegitimate exportation. Privacy-preserving Information hiding can serve as a potential solution to this issue in such a manner that a permission code is embedded into the encrypted data and can be detected when transmissions occur. Digital watermarking is a technique that has been used for a wide range of intriguing applications such as data authentication and ownership identification. However, some of the algorithms are proprietary intellectual properties and thus the availability to the general public is rather limited. A possible solution is to outsource the task of watermarking to an authorised cloud service provider, that has legitimate right to execute the algorithms as well as high computational capacity. Privacypreserving Information hiding is well suited to this scenario since it is operated in the encrypted domain and hence prevents private data from being collected by the cloud. Internet of things is a promising technology to healthcare industry. A common framework consists of wearable equipments for monitoring the health status of an individual, a local gateway device for aggregating the data, and a cloud server for storing and analysing the data. However, there are risks that an adversary may attempt to eavesdrop the wireless communication, attack the gateway device or even access to the cloud server. Hence, it is desirable to produce and encrypt the data simultaneously and incorporate secret sharing schemes to realise access control. Privacy-preserving secret sharing is a novel research for fulfilling this function. In summary, this thesis presents novel schemes and algorithms, including: • two privacy-preserving reversible information hiding schemes based upon symmetric cryptography using arithmetic of quadratic residues and lexicographic permutations, respectively. • two privacy-preserving reversible information hiding schemes based upon asymmetric cryptography using multiplicative and additive privacy homomorphisms, respectively. • four predictive models for assisting the removal of distortions inflicted by information hiding based respectively upon projection theorem, image gradient, total variation denoising, and Bayesian inference. • three privacy-preserving secret sharing algorithms with different levels of generality

    Preventing Unauthorized AI Over-Analysis by Medical Image Adversarial Watermarking

    Full text link
    The advancement of deep learning has facilitated the integration of Artificial Intelligence (AI) into clinical practices, particularly in computer-aided diagnosis. Given the pivotal role of medical images in various diagnostic procedures, it becomes imperative to ensure the responsible and secure utilization of AI techniques. However, the unauthorized utilization of AI for image analysis raises significant concerns regarding patient privacy and potential infringement on the proprietary rights of data custodians. Consequently, the development of pragmatic and cost-effective strategies that safeguard patient privacy and uphold medical image copyrights emerges as a critical necessity. In direct response to this pressing demand, we present a pioneering solution named Medical Image Adversarial watermarking (MIAD-MARK). Our approach introduces watermarks that strategically mislead unauthorized AI diagnostic models, inducing erroneous predictions without compromising the integrity of the visual content. Importantly, our method integrates an authorization protocol tailored for legitimate users, enabling the removal of the MIAD-MARK through encryption-generated keys. Through extensive experiments, we validate the efficacy of MIAD-MARK across three prominent medical image datasets. The empirical outcomes demonstrate the substantial impact of our approach, notably reducing the accuracy of standard AI diagnostic models to a mere 8.57% under white box conditions and 45.83% in the more challenging black box scenario. Additionally, our solution effectively mitigates unauthorized exploitation of medical images even in the presence of sophisticated watermark removal networks. Notably, those AI diagnosis networks exhibit a meager average accuracy of 38.59% when applied to images protected by MIAD-MARK, underscoring the robustness of our safeguarding mechanism

    On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

    Get PDF
    Deployed image classification pipelines are typically dependent on the images captured in real-world environments. This means that images might be affected by different sources of perturbations (e.g. sensor noise in low-light environments). The main challenge arises by the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has, hence, attracted wide interest within the computer vision communities. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before being processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. It turned out that the proposed CORF-augmented pipeline achieved comparable results on noise-free images to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise

    Deep Intellectual Property: A Survey

    Full text link
    With the widespread application in industrial manufacturing and commercial services, well-trained deep neural networks (DNNs) are becoming increasingly valuable and crucial assets due to the tremendous training cost and excellent generalization performance. These trained models can be utilized by users without much expert knowledge benefiting from the emerging ''Machine Learning as a Service'' (MLaaS) paradigm. However, this paradigm also exposes the expensive models to various potential threats like model stealing and abuse. As an urgent requirement to defend against these threats, Deep Intellectual Property (DeepIP), to protect private training data, painstakingly-tuned hyperparameters, or costly learned model weights, has been the consensus of both industry and academia. To this end, numerous approaches have been proposed to achieve this goal in recent years, especially to prevent or discover model stealing and unauthorized redistribution. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field. More than 190 research contributions are included in this survey, covering many aspects of Deep IP Protection: challenges/threats, invasive solutions (watermarking), non-invasive solutions (fingerprinting), evaluation metrics, and performance. We finish the survey by identifying promising directions for future research.Comment: 38 pages, 12 figure

    Classifiers and machine learning techniques for image processing and computer vision

    Get PDF
    Orientador: Siome Klein GoldensteinTese (doutorado) - Universidade Estadual de Campinas, Instituto da ComputaçãoResumo: Neste trabalho de doutorado, propomos a utilizaçãoo de classificadores e técnicas de aprendizado de maquina para extrair informações relevantes de um conjunto de dados (e.g., imagens) para solução de alguns problemas em Processamento de Imagens e Visão Computacional. Os problemas de nosso interesse são: categorização de imagens em duas ou mais classes, detecçãao de mensagens escondidas, distinção entre imagens digitalmente adulteradas e imagens naturais, autenticação, multi-classificação, entre outros. Inicialmente, apresentamos uma revisão comparativa e crítica do estado da arte em análise forense de imagens e detecção de mensagens escondidas em imagens. Nosso objetivo é mostrar as potencialidades das técnicas existentes e, mais importante, apontar suas limitações. Com esse estudo, mostramos que boa parte dos problemas nessa área apontam para dois pontos em comum: a seleção de características e as técnicas de aprendizado a serem utilizadas. Nesse estudo, também discutimos questões legais associadas a análise forense de imagens como, por exemplo, o uso de fotografias digitais por criminosos. Em seguida, introduzimos uma técnica para análise forense de imagens testada no contexto de detecção de mensagens escondidas e de classificação geral de imagens em categorias como indoors, outdoors, geradas em computador e obras de arte. Ao estudarmos esse problema de multi-classificação, surgem algumas questões: como resolver um problema multi-classe de modo a poder combinar, por exemplo, caracteríisticas de classificação de imagens baseadas em cor, textura, forma e silhueta, sem nos preocuparmos demasiadamente em como normalizar o vetor-comum de caracteristicas gerado? Como utilizar diversos classificadores diferentes, cada um, especializado e melhor configurado para um conjunto de caracteristicas ou classes em confusão? Nesse sentido, apresentamos, uma tecnica para fusão de classificadores e caracteristicas no cenário multi-classe através da combinação de classificadores binários. Nós validamos nossa abordagem numa aplicação real para classificação automática de frutas e legumes. Finalmente, nos deparamos com mais um problema interessante: como tornar a utilização de poderosos classificadores binarios no contexto multi-classe mais eficiente e eficaz? Assim, introduzimos uma tecnica para combinação de classificadores binarios (chamados classificadores base) para a resolução de problemas no contexto geral de multi-classificação.Abstract: In this work, we propose the use of classifiers and machine learning techniques to extract useful information from data sets (e.g., images) to solve important problems in Image Processing and Computer Vision. We are particularly interested in: two and multi-class image categorization, hidden messages detection, discrimination among natural and forged images, authentication, and multiclassification. To start with, we present a comparative survey of the state-of-the-art in digital image forensics as well as hidden messages detection. Our objective is to show the importance of the existing solutions and discuss their limitations. In this study, we show that most of these techniques strive to solve two common problems in Machine Learning: the feature selection and the classification techniques to be used. Furthermore, we discuss the legal and ethical aspects of image forensics analysis, such as, the use of digital images by criminals. We introduce a technique for image forensics analysis in the context of hidden messages detection and image classification in categories such as indoors, outdoors, computer generated, and art works. From this multi-class classification, we found some important questions: how to solve a multi-class problem in order to combine, for instance, several different features such as color, texture, shape, and silhouette without worrying about the pre-processing and normalization of the combined feature vector? How to take advantage of different classifiers, each one custom tailored to a specific set of classes in confusion? To cope with most of these problems, we present a feature and classifier fusion technique based on combinations of binary classifiers. We validate our solution with a real application for automatic produce classification. Finally, we address another interesting problem: how to combine powerful binary classifiers in the multi-class scenario more effectively? How to boost their efficiency? In this context, we present a solution that boosts the efficiency and effectiveness of multi-class from binary techniques.DoutoradoEngenharia de ComputaçãoDoutor em Ciência da Computaçã

    A survey on vulnerability of federated learning: A learning algorithm perspective

    Get PDF
    Federated Learning (FL) has emerged as a powerful paradigm for training Machine Learning (ML), particularly Deep Learning (DL) models on multiple devices or servers while maintaining data localized at owners’ sites. Without centralizing data, FL holds promise for scenarios where data integrity, privacy and security and are critical. However, this decentralized training process also opens up new avenues for opponents to launch unique attacks, where it has been becoming an urgent need to understand the vulnerabilities and corresponding defense mechanisms from a learning algorithm perspective. This review paper takes a comprehensive look at malicious attacks against FL, categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. In this survey, we focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types, Data to Model (D2M), Model to Data (M2D), Model to Model (M2M) and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions and potential areas for improvement. Defense strategies have evolved from using a singular metric to excluding malicious clients, to employing a multifaceted approach examining client models at various phases. In this survey paper, our research indicates that the to-learn data, the learning gradients, and the learned model at different stages all can be manipulated to initiate malicious attacks that range from undermining model performance, reconstructing private local data, and to inserting backdoors. We have also seen these threat are becoming more insidious. While earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures. This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications. The categorized bibliography can be found at: https://github.com/Rand2AI/Awesome-Vulnerability-of-Federated-Learning

    A survey on vulnerability of federated learning: A learning algorithm perspective

    Get PDF
    Federated Learning (FL) has emerged as a powerful paradigm for training Machine Learning (ML), particularly Deep Learning (DL) models on multiple devices or servers while maintaining data localized at owners’ sites. Without centralizing data, FL holds promise for scenarios where data integrity, privacy and security and are critical. However, this decentralized training process also opens up new avenues for opponents to launch unique attacks, where it has been becoming an urgent need to understand the vulnerabilities and corresponding defense mechanisms from a learning algorithm perspective. This review paper takes a comprehensive look at malicious attacks against FL, categorizing them from new perspectives on attack origins and targets, and providing insights into their methodology and impact. In this survey, we focus on threat models targeting the learning process of FL systems. Based on the source and target of the attack, we categorize existing threat models into four types, Data to Model (D2M), Model to Data (M2D), Model to Model (M2M) and composite attacks. For each attack type, we discuss the defense strategies proposed, highlighting their effectiveness, assumptions and potential areas for improvement. Defense strategies have evolved from using a singular metric to excluding malicious clients, to employing a multifaceted approach examining client models at various phases. In this survey paper, our research indicates that the to-learn data, the learning gradients, and the learned model at different stages all can be manipulated to initiate malicious attacks that range from undermining model performance, reconstructing private local data, and to inserting backdoors. We have also seen these threat are becoming more insidious. While earlier studies typically amplified malicious gradients, recent endeavors subtly alter the least significant weights in local models to bypass defense measures. This literature review provides a holistic understanding of the current FL threat landscape and highlights the importance of developing robust, efficient, and privacy-preserving defenses to ensure the safe and trusted adoption of FL in real-world applications. The categorized bibliography can be found at: https://github.com/Rand2AI/Awesome-Vulnerability-of-Federated-Learning

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    Get PDF
    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).</p

    Information embedding and retrieval in 3D printed objects

    Get PDF
    Deep learning and convolutional neural networks have become the main tools of computer vision. These techniques are good at using supervised learning to learn complex representations from data. In particular, under limited settings, the image recognition model now performs better than the human baseline. However, computer vision science aims to build machines that can see. It requires the model to be able to extract more valuable information from images and videos than recognition. Generally, it is much more challenging to apply these deep learning models from recognition to other problems in computer vision. This thesis presents end-to-end deep learning architectures for a new computer vision field: watermark retrieval from 3D printed objects. As it is a new area, there is no state-of-the-art on many challenging benchmarks. Hence, we first define the problems and introduce the traditional approach, Local Binary Pattern method, to set our baseline for further study. Our neural networks seem useful but straightfor- ward, which outperform traditional approaches. What is more, these networks have good generalization. However, because our research field is new, the problems we face are not only various unpredictable parameters but also limited and low-quality training data. To address this, we make two observations: (i) we do not need to learn everything from scratch, we know a lot about the image segmentation area, and (ii) we cannot know everything from data, our models should be aware what key features they should learn. This thesis explores these ideas and even explore more. We show how to use end-to-end deep learning models to learn to retrieve watermark bumps and tackle covariates from a few training images data. Secondly, we introduce ideas from synthetic image data and domain randomization to augment training data and understand various covariates that may affect retrieve real-world 3D watermark bumps. We also show how the illumination in synthetic images data to effect and even improve retrieval accuracy for real-world recognization applications
    • …
    corecore