
    Information embedding and retrieval in 3D printed objects

    Deep learning and convolutional neural networks have become the main tools of computer vision. These techniques excel at using supervised learning to learn complex representations from data; in particular, under limited settings, image recognition models now perform better than the human baseline. However, computer vision aims to build machines that can see, which requires models to extract more valuable information from images and videos than recognition alone. In general, it is much more challenging to transfer these deep learning models from recognition to other problems in computer vision. This thesis presents end-to-end deep learning architectures for a new computer vision task: watermark retrieval from 3D printed objects. As this is a new area, there is no state of the art on challenging benchmarks. Hence, we first define the problems and introduce a traditional approach, the Local Binary Pattern method, to set a baseline for further study. Our neural networks are straightforward yet effective, outperforming the traditional approach and generalizing well. However, because our research field is new, we face not only various unpredictable parameters but also limited, low-quality training data. To address this, we make two observations: (i) we do not need to learn everything from scratch, since much is already known about image segmentation; and (ii) we cannot learn everything from data, so our models should be aware of which key features they should learn. This thesis explores these ideas and extends them. We show how to use end-to-end deep learning models to learn to retrieve watermark bumps and handle covariates from a few training images. Second, we introduce ideas from synthetic image data and domain randomization to augment training data and to understand the various covariates that may affect retrieval of real-world 3D watermark bumps. We also show how illumination in synthetic image data affects, and can even improve, retrieval accuracy in real-world recognition applications.
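    The Local Binary Pattern baseline named in the abstract can be sketched as follows. This is a minimal illustration of the classic 8-neighbour LBP descriptor, not the thesis's implementation; the `lbp_codes` helper and its bit ordering are assumptions for the example.

    ```python
    import numpy as np

    def lbp_codes(img):
        """Basic 8-neighbour Local Binary Pattern for interior pixels.

        Each centre pixel is compared with its 8 neighbours; every
        neighbour that is >= the centre contributes one bit to an
        8-bit code. Histograms of these codes form the LBP texture
        descriptor used as a classical baseline.
        """
        img = np.asarray(img, dtype=float)
        c = img[1:-1, 1:-1]  # centre pixels (interior only)
        # neighbour offsets, clockwise from top-left, with bit weights
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                   (1, 1), (1, 0), (1, -1), (0, -1)]
        codes = np.zeros_like(c, dtype=np.uint8)
        for bit, (dy, dx) in enumerate(offsets):
            nb = img[1 + dy: img.shape[0] - 1 + dy,
                     1 + dx: img.shape[1] - 1 + dx]
            codes |= (nb >= c).astype(np.uint8) << bit
        return codes

    # A bright bump on a dark background: all neighbours are darker
    # than the centre, so no bit is set and the code is 0.
    patch = np.array([[10, 10, 10],
                      [10, 20, 10],
                      [10, 10, 10]])
    print(lbp_codes(patch))  # -> [[0]]
    ```

    For watermark retrieval, such codes would be histogrammed over local windows and matched against reference patterns; a flat (textureless) region yields the all-ones code 255.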

    Recent Advances in Signal Processing

    Signal processing is a critical component of most new technological inventions and of a wide variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have always favored closed-form tractability over real-world accuracy; these constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want exposure to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be grouped into five areas by application: image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Robust digital watermarking techniques for multimedia protection

    The growing problem of the unauthorized reproduction of digital multimedia data such as movies, television broadcasts, and similar digital products has triggered worldwide efforts to identify and protect multimedia contents. Digital watermarking technology provides law enforcement officials with a forensic tool for tracing and catching pirates. Watermarking refers to the process of adding a structure called a watermark to an original data object, which includes digital images, video, audio, maps, text messages, and 3D graphics. Such a watermark can be used for several purposes including copyright protection, fingerprinting, copy protection, broadcast monitoring, data authentication, indexing, and medical safety. The proposed thesis addresses the problem of multimedia protection and consists of three parts. In the first part, we propose new image watermarking algorithms that are robust against a wide range of intentional and geometric attacks, flexible in data embedding, and computationally fast. The core idea behind our proposed watermarking schemes is to use transforms that have different properties which can effectively match various aspects of the signal's frequencies. We embed the watermark many times in all the frequencies to provide better robustness against attacks and increase the difficulty of destroying the watermark. The second part of the thesis is devoted to a joint exploitation of the geometry and topology of 3D objects and its subsequent application to 3D watermarking. The key idea consists of capturing the geometric structure of a 3D mesh in the spectral domain by computing the eigen-decomposition of the mesh Laplacian matrix. We also use the fact that the global shape features of a 3D model may be reconstructed using small low-frequency spectral coefficients. The eigen-analysis of the mesh Laplacian matrix is, however, prohibitively expensive. 
To lift this limitation, we first partition the 3D mesh into smaller 3D sub-meshes, and then we repeat the watermark embedding process as much as possible in the spectral coefficients of the compressed 3D sub-meshes. The visual error of the watermarked 3D model is evaluated by computing a nonlinear visual error metric between the original 3D model and the watermarked model obtained by our proposed algorithm. The third part of the thesis is devoted to video watermarking. We propose robust, hybrid scene-based MPEG video watermarking techniques based on a high-order tensor singular value decomposition of the video image sequences. The key idea behind our approaches is to use scene change analysis to embed the watermark repeatedly in a fixed number of the intra-frames. These intra-frames are represented as 3D tensors with two dimensions in space and one dimension in time. We embed the watermark information in the singular values of these high-order tensors, which have good stability and represent the video properties. Illustrations of numerical experiments with synthetic and real data demonstrate the potential and the much-improved performance of the proposed algorithms in multimedia watermarking.
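    The spectral embedding idea described above (eigen-decomposition of the mesh Laplacian, then perturbation of low-frequency coefficients) can be sketched on a toy mesh. This is a hedged illustration, not the thesis's algorithm: the tetrahedron "mesh", the combinatorial Laplacian, and the embedding strength `alpha` are all assumptions for the example.

    ```python
    import numpy as np

    # Toy "mesh": the 4 vertices of a tetrahedron, fully connected.
    coords = np.array([[0., 0., 0.],
                       [1., 0., 0.],
                       [0., 1., 0.],
                       [0., 0., 1.]])
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]

    # Combinatorial (graph) Laplacian L = D - A.
    n = len(coords)
    L = np.zeros((n, n))
    for i, j in edges:
        L[i, j] = L[j, i] = -1.0
        L[i, i] += 1.0
        L[j, j] += 1.0

    # Eigen-decomposition gives an orthonormal "spectral" basis;
    # small eigenvalues correspond to low-frequency shape modes.
    evals, evecs = np.linalg.eigh(L)

    # Project vertex coordinates onto the spectral basis.
    spec = evecs.T @ coords

    # Embed a watermark bit by slightly perturbing a low-frequency
    # coefficient (index 0 is the constant mode, so start at 1).
    alpha = 0.01          # embedding strength (assumed parameter)
    spec_wm = spec.copy()
    spec_wm[1] += alpha

    # Reconstruct the watermarked mesh; because the basis is
    # orthonormal, the geometric distortion is bounded by alpha.
    coords_wm = evecs @ spec_wm
    print(np.max(np.abs(coords_wm - coords)))
    ```

    On a real mesh the Laplacian would come from the actual connectivity, the mesh would first be partitioned into sub-meshes to keep `eigh` tractable, and the perturbation sign would encode the watermark bits.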

    Artifact magnification on deepfake videos increases human detection and subjective confidence

    Full text link
    The development of technologies for easily and automatically falsifying video has raised practical questions about people's ability to detect false information online. How vulnerable are people to deepfake videos? What technologies can be applied to boost their performance? Human susceptibility to deepfake videos is typically measured in laboratory settings, which do not reflect the challenges of real-world browsing. In typical browsing, deepfakes are rare, engagement with the video may be short, participants may be distracted, or the video streaming quality may be degraded. Here, we tested deepfake detection under these ecological viewing conditions, and found that detection was lowered in all cases. Principles from signal detection theory indicated that different viewing conditions affected different dimensions of detection performance. Overall, this suggests that the current literature underestimates people's susceptibility to deepfakes. Next, we examined how computer vision models might be integrated into users' decision process to increase accuracy and confidence during deepfake detection. We evaluated the effectiveness of communicating the model's prediction to the user by amplifying artifacts in fake videos. We found that artifact amplification was highly effective at making fake videos distinguishable from real ones, in a manner that was robust across viewing conditions. Additionally, compared to a traditional text-based prompt, artifact amplification was more convincing: people accepted the model's suggestion more often, and reported higher final confidence in their model-supported decision, particularly for more challenging videos. Overall, this suggests that visual indicators that cause distortions on fake videos may be highly effective at mitigating the impact of falsified video.
    Comment: 8 pages, 4 figures
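    The signal-detection-theory analysis mentioned in the abstract separates sensitivity from response bias. A standard way to compute these is the d' (sensitivity) and criterion statistics from hit and false-alarm rates; the sketch below uses only the Python standard library, and the example rates are hypothetical, not data from the study.

    ```python
    from statistics import NormalDist

    def dprime_criterion(hit_rate, fa_rate):
        """Signal-detection sensitivity (d') and criterion (c).

        d' = z(H) - z(F) measures how separable "fake" and "real"
        are for the observer; c = -(z(H) + z(F)) / 2 measures bias
        toward one response. z is the inverse standard-normal CDF.
        Rates must be strictly between 0 and 1 (in practice,
        extreme rates are corrected before this step).
        """
        z = NormalDist().inv_cdf
        d_prime = z(hit_rate) - z(fa_rate)
        criterion = -0.5 * (z(hit_rate) + z(fa_rate))
        return d_prime, criterion

    # Hypothetical observer: 80% hits on fakes, 30% false alarms
    # on real videos -> moderate sensitivity, slight liberal bias.
    d, c = dprime_criterion(0.80, 0.30)
    print(round(d, 2), round(c, 2))
    ```

    Degraded viewing conditions can then be compared on these two axes separately: a drop in d' means the fakes became genuinely harder to discriminate, while a shift in c means observers merely changed how willing they were to call a video fake.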