15 research outputs found

    Structure propagation for zero-shot learning

    The key to zero-shot learning (ZSL) is finding an information transfer model that bridges the gap between images and semantic information (texts or attributes). Existing ZSL methods usually construct the compatibility function between images and class labels by considering the relevance among semantic classes (the manifold structure of semantic classes). However, the relationship among image classes (the manifold structure of image classes) is also very important for constructing the compatibility model. Because unseen classes make the relationship among image classes difficult to capture, the manifold structure of image classes is often ignored in ZSL. So that the manifold structure of image classes and that of semantic classes complement each other, we propose structure propagation (SP) to improve the performance of ZSL for classification. SP jointly considers the manifold structure of image classes and that of semantic classes to approximate the intrinsic structure of object classes. Moreover, SP can describe the constraint condition between the compatibility function and these manifold structures to balance the influence of the structure propagation iteration. The SP solution provides not only unseen class labels but also the relationship between the two manifold structures, which encodes the positive transfer in structure propagation. Experimental results demonstrate that SP attains promising results on the AwA, CUB, Dogs and SUN databases.

    Application and Theory of Multimedia Signal Processing Using Machine Learning or Advanced Methods

    This Special Issue is a book collecting peer-reviewed papers on advanced technologies related to the applications and theory of signal processing for multimedia systems using ML or advanced methods. The multimedia signal processing covered includes image, video, and audio processing, character recognition, and optimization of communication channels for networks. The specific contents included in this book are data hiding, encryption, object detection, image classification, and character recognition. Academics and colleagues who are interested in these topics will find it interesting to read.

    Robust Computer Vision Against Adversarial Examples and Domain Shifts

    Recent advances in deep learning have achieved remarkable success in various computer vision problems. Driven by growing computing resources and a vast amount of data, deep learning technology is reshaping human life. However, Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, in which carefully crafted perturbations can easily fool DNNs into making wrong predictions. On the other hand, DNNs generalize poorly to domain shifts, as they suffer from performance degradation when encountering data from new visual distributions. We view these issues from the perspective of robustness. More precisely, existing deep learning technology is not reliable enough for many scenarios, where adversarial examples and domain shifts are among the most critical. The lack of reliability inevitably limits DNNs from being deployed in more important computer vision applications, such as self-driving vehicles and medical instruments that have major safety concerns. To overcome these challenges, we focus on investigating and addressing the robustness of deep learning-based computer vision approaches. The first part of this thesis attempts to robustify computer vision models against adversarial examples. We dive into such adversarial robustness from four aspects: novel attacks for strengthening benchmarks, empirical defenses validated by a third-party evaluator, generalizable defenses that can defend against multiple and unforeseen attacks, and defenses specifically designed for less explored tasks. The second part of this thesis improves the robustness against domain shifts via domain adaptation. We dive into two important domain adaptation settings: unsupervised domain adaptation, which is the most common, and source-free domain adaptation, which is more practical in real-world scenarios. The last part explores the intersection of adversarial robustness and domain adaptation fields to provide new insights for robust DNNs. We study two directions: adversarial defense for domain adaptation and adversarial defense via domain adaptation. This dissertation aims at more robust, reliable, and trustworthy computer vision.

    Deep learning for characterizing full-color 3D printers: accuracy, robustness, and data-efficiency

    High-fidelity color and appearance reproduction via multi-material-jetting full-color 3D printing has seen increasing applications, including art and cultural artifacts preservation, product prototypes, game character figurines, stop-motion animated movies, and 3D-printed prostheses such as dental restorations or prosthetic eyes. To achieve high-quality appearance reproduction via full-color 3D printing, a prerequisite is an accurate optical printer model, that is, a function predicting the optical/visual properties (e.g. spectral reflectance, color, and translucency) of the resulting print from an arrangement or ratio of printing materials. For appearance 3D printing, the model needs to be inverted to determine the printing material arrangement that reproduces distinct optical/visual properties such as color. Therefore, the accuracy of optical printer models plays a crucial role for the final print quality. The process of fitting an optical printer model's parameters for a printing system is called optical characterization, which requires test prints and optical measurements. The objective of developing a printer model is to maximize prediction performance such as accuracy, while minimizing optical characterization efforts including printing, post-processing, and measuring. In this thesis, I aim at leveraging deep learning to achieve holistically-performant optical printer models, in terms of three different performance aspects of optical printer models: 1) accuracy, 2) robustness, and 3) data efficiency. First, for model accuracy, we propose two deep learning-based printer models that both achieve high accuracies with only a moderate number of required training samples. Experiments show that both models outperform the traditional cellular Neugebauer model by large margins: up to 6 times higher accuracy, or up to 10 times less data for a similar accuracy. The high accuracy could enhance or even enable color- and translucency-critical applications of 3D printing such as dental restorations or prosthetic eyes. Second, for model robustness, we propose a methodology to induce physically-plausible constraints and smoothness into deep learning-based optical printer models. Experiments show that the model not only almost always corrects implausible relationships between material arrangement and the resulting optical/visual properties, but also ensures significantly smoother predictions. The robustness and smoothness improvements are important to alleviate or avoid unacceptable banding artifacts on textures of the final printouts, particularly for applications where texture details must be preserved, such as for reproducing prosthetic eyes whose texture must match the companion (healthy) eye. Finally, for data efficiency, we propose a learning framework that significantly improves printer models' data efficiency by employing existing characterization data from other printers. We also propose a contrastive learning-based approach to learn dataset embeddings that are extra inputs required by the aforementioned learning framework. Experiments show that the learning framework can drastically reduce the number of required samples for achieving an application-specific prediction accuracy. For some printers, it requires only 10% of the samples to achieve a similar accuracy as the state-of-the-art model. The significant improvement in data efficiency makes it economically possible to frequently characterize 3D printers to achieve more consistent output across different printers over time, which is crucial for color- and translucency-critical individualized mass production. With these proposed deep learning-based methodologies significantly improving the three performance aspects (i.e. accuracy, robustness, and data efficiency), a holistically-performant optical printer model can be achieved, which is particularly important for color- and translucency-critical applications such as dental restorations or prosthetic eyes.

    Real-time Ultrasound Signals Processing: Denoising and Super-resolution

    Ultrasound (US) acquisition is widespread in the biomedical field, due to its low cost, portability, and non-invasiveness for the patient. The processing and analysis of US signals, such as images, 2D videos, and volumetric images, allows the physician to monitor the evolution of the patient's disease and supports diagnosis and treatments (e.g., surgery). US images are affected by speckle noise, generated by the overlap of US waves. Furthermore, low-resolution images are acquired when a high acquisition frequency is applied to accurately characterise the behaviour of anatomical features that quickly change over time. Denoising and super-resolution of US signals are relevant to improve the visual evaluation of the physician and the performance and accuracy of processing methods, such as segmentation and classification. The main requirements for the processing and analysis of US signals are real-time execution, preservation of anatomical features, and reduction of artefacts. In this context, we present a novel framework for the real-time denoising of US 2D images based on deep learning and high-performance computing, which reduces noise while preserving anatomical features in real-time execution. We extend our framework to the denoising of arbitrary US signals, such as 2D videos and 3D images, and we integrate denoising algorithms that account for spatio-temporal signal properties into an image-to-image deep learning model. As a building block of this framework, we propose a novel denoising method belonging to the class of low-rank approximations, which learns and predicts the optimal thresholds of the Singular Value Decomposition. While previous denoising work trades off computational cost against effectiveness, the proposed framework achieves the results of the best denoising algorithms in terms of noise removal, anatomical feature preservation, and geometric and texture properties conservation, in a real-time execution that respects industrial constraints. The framework reduces the artefacts (e.g., blurring) and preserves the spatio-temporal consistency among frames/slices; also, it is general with respect to the denoising algorithm, anatomical district, and noise intensity. Then, we introduce a novel framework for the real-time reconstruction of the non-acquired scan lines through an interpolating method; a deep learning model improves the results of the interpolation to match the target image (i.e., the high-resolution image). We improve the accuracy of the prediction of the reconstructed lines through the design of the network architecture and the loss function. In the context of signal approximation, we introduce our kernel-based sampling method for the reconstruction of 2D and 3D signals defined on regular and irregular grids, with an application to US 2D and 3D images. Our method improves on previous work in terms of sampling quality, approximation accuracy, and geometry reconstruction with a slightly higher computational cost. For both denoising and super-resolution, we evaluate the compliance with the real-time requirement of US applications in the medical domain and provide a quantitative evaluation of denoising and super-resolution methods on US and synthetic images. Finally, we discuss the role of denoising and super-resolution as pre-processing steps for segmentation and predictive analysis of breast pathologies.

    Technologies of information transmission and processing

    The collection contains articles on scientific and theoretical developments in the fields of telecommunication networks, information security, and information transmission and processing technologies. It is intended for researchers in the field of infocommunications, as well as lecturers, postgraduate students, master's students, and undergraduate students of technical universities.

    Applied Metaheuristic Computing

    For decades, Applied Metaheuristic Computing (AMC) has been a prevailing optimization technique for tackling perplexing engineering and business problems, such as scheduling, routing, ordering, bin packing, assignment, and facility layout planning. This is partly because the classic exact methods are constrained by prior assumptions, and partly because heuristics are problem-dependent and lack generalization. AMC, on the contrary, guides the course of low-level heuristics to search beyond local optimality, which impairs the capability of traditional computation methods. This topic series has collected quality papers proposing cutting-edge methodology and innovative applications which drive the advances of AMC.

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information being exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia contents are generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated instant distribution and sharing of digital images on digital social platforms, generating a great amount of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with the use of deep learning techniques. In response to these threats, the multimedia forensics community has produced major research efforts regarding the identification of the source and the detection of manipulation. In all cases (e.g., forensic investigations, fake news debunking, information warfare, and cyberattacks) where images and videos serve as critical evidence, forensic technologies that help to determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics to tackle new and serious challenges to ensure media authenticity.

    Symmetry-Adapted Machine Learning for Information Security

    Symmetry-adapted machine learning has shown encouraging ability to mitigate the security risks in information and communication technology (ICT) systems. It is a subset of artificial intelligence (AI) that relies on the principle of processing future events by learning from past events or historical data. The autonomous nature of symmetry-adapted machine learning supports effective data processing and analysis for security detection in ICT systems without the interference of human authorities. Many industries are developing machine-learning-adapted solutions to support security for smart hardware, distributed computing, and the cloud. In our Special Issue book, we focus on the deployment of symmetry-adapted machine learning for information security in various application areas. This security approach can support effective methods to handle the dynamic nature of security attacks by extracting and analyzing data to identify hidden patterns. The main topics of this Issue include malware classification, intrusion detection systems, image watermarking, color image watermarking, a battlefield target aggregation behavior recognition model, IP cameras, Internet of Things (IoT) security, service function chains, indoor positioning systems, and cryptanalysis.