167 research outputs found

    CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields

    Neural Radiance Fields (NeRF) have the potential to become a major representation for media. Since training a NeRF has never been an easy task, the protection of its model copyright should be a priority. In this paper, by analyzing the pros and cons of possible copyright protection solutions, we propose to protect the copyright of NeRF models by replacing the original color representation in NeRF with a watermarked color representation. A distortion-resistant rendering scheme is then designed to guarantee robust message extraction from 2D renderings of the NeRF. Our proposed method directly protects the copyright of NeRF models while maintaining high rendering quality and bit accuracy compared with alternative solutions. (Comment: 11 pages, 6 figures; accepted by ICCV 2023, non-camera-ready version.)
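
    As a hedged illustration of the watermarked color idea, the sketch below fuses a base NeRF color with a secret bit string through a small MLP so the message can survive into rendered pixels. The module name, layer sizes, and message length are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a watermarked color representation: a small MLP
# fuses the per-point color predicted by NeRF with a secret bit string so
# that the message can later be recovered from 2D renderings.
import torch
import torch.nn as nn

class WatermarkedColor(nn.Module):  # name and architecture are assumptions
    def __init__(self, n_bits: int = 48, hidden: int = 128):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(3 + n_bits, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # watermarked RGB in [0, 1]
        )

    def forward(self, rgb: torch.Tensor, message: torch.Tensor) -> torch.Tensor:
        # rgb: (N, 3) colors from the base NeRF; message: (N, n_bits) in {0, 1}
        return self.fuse(torch.cat([rgb, message], dim=-1))

colors = torch.rand(1024, 3)                       # stand-in per-sample radiance
bits = torch.randint(0, 2, (1, 48)).float().expand(1024, -1)
wm_colors = WatermarkedColor()(colors, bits)       # rendered in place of `colors`
```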

    Information embedding and retrieval in 3D printed objects

    Deep learning and convolutional neural networks have become the main tools of computer vision. These techniques are good at using supervised learning to learn complex representations from data; in particular, under limited settings, image recognition models now perform better than the human baseline. However, computer vision aims to build machines that can see, which requires models to extract more valuable information from images and videos than recognition alone. In general, it is much more challenging to transfer these deep learning models from recognition to other problems in computer vision. This thesis presents end-to-end deep learning architectures for a new computer vision task: watermark retrieval from 3D printed objects. As it is a new area, there is no state of the art on many challenging benchmarks. Hence, we first define the problems and introduce a traditional approach, the Local Binary Pattern method, to set a baseline for further study. Our neural networks are simple yet effective, outperforming the traditional approaches, and they generalize well. However, because the research field is new, we face not only various unpredictable covariates but also limited, low-quality training data. To address this, we make two observations: (i) we do not need to learn everything from scratch, since much is already known about image segmentation, and (ii) we cannot learn everything from data, so our models should be aware of which key features they should learn. This thesis explores these ideas and more. We show how to use end-to-end deep learning models to retrieve watermark bumps and handle covariates from only a few training images. We then introduce ideas from synthetic image data and domain randomization to augment the training data and to understand the various covariates that may affect the retrieval of real-world 3D watermark bumps. We also show how illumination in synthetic image data affects, and can even improve, retrieval accuracy in real-world recognition applications.
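
    For context, the Local Binary Pattern baseline mentioned above can be sketched as follows; the patch size, parameters, and use of scikit-image are assumptions for illustration rather than the thesis code.

```python
# Sketch of the traditional Local Binary Pattern (LBP) baseline: encode each
# pixel by thresholding its circular neighbourhood against the centre value,
# then summarize the patch with a histogram of the resulting codes.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_features(patch: np.ndarray, points: int = 8, radius: float = 1.0) -> np.ndarray:
    """Uniform-LBP histogram for an 8-bit grayscale patch."""
    codes = local_binary_pattern(patch, points, radius, method="uniform")
    n_bins = points + 2                  # uniform patterns plus one "non-uniform" bin
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

patch = (np.random.rand(32, 32) * 255).astype(np.uint8)  # stand-in watermark-bump patch
print(lbp_features(patch))                                # feature vector for a classifier
```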

    T4DT: Tensorizing Time for Learning Temporal 3D Visual Data

    Unlike 2D raster images, there is no single dominant representation for 3D visual data processing. Different formats like point clouds, meshes, or implicit functions each have their strengths and weaknesses. Still, grid representations such as signed distance functions have attractive properties in 3D as well: in particular, they offer constant-time random access and are eminently suitable for modern machine learning. Unfortunately, the storage size of a grid grows exponentially with its dimension, so grids often exceed memory limits even at moderate resolutions. This work explores various low-rank tensor formats, including the Tucker, tensor train, and quantics tensor train decompositions, to compress time-varying 3D data. Our method iteratively computes, voxelizes, and compresses each frame's truncated signed distance function and applies tensor rank truncation to condense all frames into a single compressed tensor that represents the entire 4D scene. We show that low-rank tensor compression yields extremely compact representations for storing and querying time-varying signed distance functions, significantly reducing the memory footprint of 4D scenes while, surprisingly, preserving their geometric quality. Unlike existing iterative learning-based approaches such as DeepSDF and NeRF, our method uses a closed-form algorithm with theoretical guarantees.
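
    A minimal sketch of the tensor-train idea applied to a single voxelized frame follows, assuming a generic TT-SVD with a relative truncation tolerance; this is not the T4DT pipeline itself.

```python
# Generic TT-SVD: compress a dense voxel grid into a chain of small 3-way
# cores by sweeping truncated SVDs along the dimensions. Tolerance and the
# synthetic grid are illustrative.
import numpy as np

def tt_svd(tensor: np.ndarray, tol: float = 1e-3):
    """Decompose `tensor` into tensor-train cores via sequential truncated SVDs."""
    dims, cores, r_prev = tensor.shape, [], 1
    mat = tensor.reshape(dims[0], -1)
    for n in dims[:-1]:
        mat = mat.reshape(r_prev * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = max(1, int(np.sum(s > tol * s[0])))   # drop small singular values
        cores.append(u[:, :rank].reshape(r_prev, n, rank))
        mat = s[:rank, None] * vt[:rank]             # remainder carried forward
        r_prev = rank
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

grid = np.fromfunction(lambda i, j, k: np.sin(0.2 * (i + j + k)), (64, 64, 64))
cores = tt_svd(grid)                                 # one TSDF frame -> TT cores
print([c.shape for c in cores], "vs dense", grid.shape)
```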

    It is hard to see a needle in a haystack: Modeling contrast masking effect in a numerical observer

    Within the framework of a virtual clinical trial for breast imaging, we aim to develop numerical observers that follow the same detection-performance trends as a typical human observer. In our prior work, we showed that by including the spatiotemporal contrast sensitivity function (stCSF) of the human visual system (HVS) in a multi-slice channelized Hotelling observer (msCHO), we can correctly predict the trends of a typical human observer's performance with respect to the viewing parameters of browsing speed, viewing distance, and contrast. In this work, we further improve our numerical observer by modeling contrast masking. After the stCSF, contrast masking is the second most prominent property of the HVS; it refers to the fact that the presence of one signal affects the visibility threshold of another. Our results indicate that the improved numerical observer better predicts changes in detection performance with background complexity.
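
    For readers unfamiliar with the underlying observer model, a plain channelized Hotelling observer can be sketched as below; the channel bank, signal, and data are illustrative assumptions, and the stCSF and masking stages described above are omitted.

```python
# Generic channelized Hotelling observer (CHO): project images onto a small
# channel bank, build the Hotelling template w = S^-1 (mu_s - mu_b), and
# report a detectability index d'. Illustrative only: random backgrounds and
# difference-of-Gaussians channels, no stCSF or contrast-masking model.
import numpy as np

rng = np.random.default_rng(0)
size, n_imgs, n_ch = 64, 200, 4

# Difference-of-Gaussians channel bank (one channel per spatial scale).
y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2]
r2 = x**2 + y**2
channels = np.stack([
    np.exp(-r2 / (2 * (2.0 * 2**j) ** 2)) - np.exp(-r2 / (2 * (1.3 * 2**j) ** 2))
    for j in range(n_ch)
]).reshape(n_ch, -1)

signal = np.exp(-r2 / 18.0).ravel()                 # faint Gaussian lesion
backgrounds = rng.normal(0, 1, (n_imgs, size * size))
v_b = backgrounds @ channels.T                      # channel responses, signal absent
v_s = (backgrounds + 0.5 * signal) @ channels.T     # channel responses, signal present

S = 0.5 * (np.cov(v_b.T) + np.cov(v_s.T))           # pooled channel covariance
w = np.linalg.solve(S, v_s.mean(0) - v_b.mean(0))   # Hotelling template
d_prime = (w @ (v_s.mean(0) - v_b.mean(0))) / np.sqrt(w @ S @ w)
print(f"detectability d' = {d_prime:.2f}")
```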

    Open Data and Digital Morphology

    Over the past two decades, the development of methods for visualising and analysing specimens digitally, in three and even four dimensions, has transformed the study of living and fossil organisms. However, the initial promise that the widespread application of such methods would facilitate access to the underlying digital data has not been fully achieved. The underlying datasets for many published studies are not readily or freely available, introducing a barrier to verification, reproducibility, and the reuse of data. There is currently no agreement or policy on the amount and type of data that should be made available alongside studies that use, and in some cases are wholly reliant on, digital morphology. Here, we propose a set of recommendations for minimum standards and additional best practices for three-dimensional digital data publication, and we review the issues around data storage, management, and accessibility.

    Automatic cerebrovascular segmentation methods - a review

    Cerebrovascular diseases, which affect the blood vessels and the blood supply to the brain, are among the leading causes of mortality worldwide. To diagnose and study abnormalities in the cerebrovascular system, accurate segmentation methods are needed: the shape, direction, and distribution of blood vessels can be studied using automatic segmentation, helping doctors visualize the cerebrovascular system. Due to the vessels' complex shape and topology, however, automatic segmentation remains a challenge. In this paper, some of the latest approaches for segmenting magnetic resonance angiography (MRA) images are explained, including deep convolutional neural networks (CNNs), three-dimensional CNNs (3D-CNNs), and the 3D U-Net. Finally, these methods are compared to evaluate their performance; the 3D U-Net performs best among the described methods.
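
    To make the 3D U-Net architecture concrete, here is a deliberately tiny sketch with a single encoder/decoder stage; the channel widths and depth are illustrative assumptions, and published cerebrovascular models are substantially deeper.

```python
# Deliberately tiny 3D U-Net: one encoder stage, one decoder stage, and a
# skip connection, applied to a single-channel MRA volume.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, 3, padding=1), nn.BatchNorm3d(c_out), nn.ReLU(),
        nn.Conv3d(c_out, c_out, 3, padding=1), nn.BatchNorm3d(c_out), nn.ReLU(),
    )

class TinyUNet3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = conv_block(1, 16)
        self.down = nn.MaxPool3d(2)
        self.mid = conv_block(16, 32)
        self.up = nn.ConvTranspose3d(32, 16, 2, stride=2)
        self.dec = conv_block(32, 16)            # 16 skip channels + 16 upsampled
        self.head = nn.Conv3d(16, 1, 1)          # per-voxel vessel logits

    def forward(self, x):
        e = self.enc(x)
        m = self.mid(self.down(e))
        d = self.dec(torch.cat([e, self.up(m)], dim=1))
        return self.head(d)

volume = torch.randn(1, 1, 64, 64, 64)           # stand-in MRA patch
print(TinyUNet3D()(volume).shape)                # -> torch.Size([1, 1, 64, 64, 64])
```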

    Facial re-enactment, speech synthesis and the rise of the Deepfake

    Emergent technologies in the fields of audio speech synthesis and video facial manipulation have the potential to drastically impact our societal patterns of multimedia consumption. At a time when social media and internet culture are plagued by misinformation, propaganda, and “fake news”, the latent misuse of these tools represents a looming threat to fragile systems of information sharing and social democratic discourse. It has thus become increasingly recognised in both academic and mainstream journalism that the ramifications of these tools must be examined to determine what they are and how their widespread availability can be managed. This research project examines four emerging software programs – Face2Face, FakeApp, Adobe VoCo and Lyrebird – that are designed to facilitate the synthesis of speech and the manipulation of facial features in videos. I explore their positive industry applications and the potentially negative consequences of their release into the public domain. Consideration is directed to how such consequences and risks can be ameliorated through detection, regulation, and education. A final analysis of these three competing threads then attempts to address whether the practical and commercial applications of these technologies are outweighed by the inherently unethical or illegal uses they engender, and, if so, what we can do in response.