10 research outputs found

    SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation

    In this paper, we propose a scribble-based video colorization network with temporal aggregation called SVCNet. It can colorize monochrome videos based on different user-given color scribbles. It addresses three common issues in scribble-based video colorization: colorization vividness, temporal consistency, and color bleeding. To improve colorization quality and strengthen temporal consistency, we adopt two sequential sub-networks in SVCNet for precise colorization and temporal smoothing, respectively. The first stage includes a pyramid feature encoder to incorporate color scribbles with a grayscale frame, and a semantic feature encoder to extract semantics. The second stage refines the output of the first stage by aggregating information from neighboring colorized frames (short-range connections) and the first colorized frame (a long-range connection). To alleviate color bleeding artifacts, we learn video colorization and segmentation simultaneously. Furthermore, we perform the majority of operations at a fixed small image resolution and use a Super-resolution Module at the tail of SVCNet to recover the original size, which allows SVCNet to handle different image resolutions at inference time. Finally, we evaluate the proposed SVCNet on the DAVIS and Videvo benchmarks. The experimental results demonstrate that SVCNet produces both higher-quality and more temporally consistent videos than other well-known video colorization approaches. The code and models can be found at https://github.com/zhaoyuzhi/SVCNet. Comment: accepted by IEEE Transactions on Image Processing (TIP)
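
    The two-stage design described above can be sketched structurally. The following is a minimal, hypothetical PyTorch skeleton, not the actual SVCNet implementation: layer widths are placeholders, and the semantic feature encoder, segmentation branch, and Super-resolution Module are omitted; the real architecture is in the linked repository.

```python
# Minimal structural sketch (hypothetical) of a two-stage scribble-based video
# colorization pipeline in the spirit of SVCNet. Layer sizes are illustrative;
# the semantic encoder, segmentation head, and Super-resolution Module are omitted.
import torch
import torch.nn as nn


class ColorizationStage(nn.Module):
    """Stage 1: fuse a grayscale frame with user color scribbles and predict ab channels."""

    def __init__(self, feat=32):
        super().__init__()
        # encoder over the concatenated grayscale frame (1 ch) and scribble map (2 ch, ab hints)
        self.encoder = nn.Sequential(
            nn.Conv2d(1 + 2, feat, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, feat * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(feat * 2, 2, 3, padding=1),  # predicted ab channels
        )

    def forward(self, gray, scribbles):
        return self.decoder(self.encoder(torch.cat([gray, scribbles], dim=1)))


class TemporalStage(nn.Module):
    """Stage 2: refine the current prediction with neighboring frames (short-range)
    and the first colorized frame (long-range)."""

    def __init__(self, feat=32):
        super().__init__()
        # input: current ab + neighbor ab + first-frame ab = 6 channels
        self.refine = nn.Sequential(
            nn.Conv2d(2 * 3, feat, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat, 2, 3, padding=1),
        )

    def forward(self, current_ab, neighbor_ab, first_ab):
        return self.refine(torch.cat([current_ab, neighbor_ab, first_ab], dim=1))


if __name__ == "__main__":
    gray = torch.randn(1, 1, 128, 128)   # fixed small working resolution
    hints = torch.zeros(1, 2, 128, 128)  # sparse user scribbles (ab values)
    stage1, stage2 = ColorizationStage(), TemporalStage()
    ab = stage1(gray, hints)
    ab = stage2(ab, ab, ab)              # with real video, pass true neighbor/first-frame predictions
    print(ab.shape)                      # torch.Size([1, 2, 128, 128])
```

    In the real pipeline, Stage 2 would receive genuinely neighboring colorized frames and the first colorized frame rather than copies of the current prediction, and a super-resolution module would upsample from the fixed working resolution back to the input size.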

    A review of image and video colorization: From analogies to deep learning

    Image colorization is a classic and important topic in computer graphics, where the aim is to add color to a monochromatic input image to produce a colorful result. In this survey, we present the history of colorization research in chronological order and summarize popular algorithms in this field. Early works on colorization mostly focused on developing techniques to improve colorization quality. In the last few years, researchers have considered more possibilities, such as combining colorization with natural language processing (NLP), and have focused more on industrial applications. To better control the color, various types of color control have been designed, such as providing reference images or color scribbles. We have created a taxonomy of colorization methods according to the input type, divided into grayscale, sketch-based, and hybrid. The pros and cons are discussed for each algorithm, and the algorithms are compared according to their main characteristics. Finally, we discuss how deep learning, and in particular generative adversarial networks (GANs), has changed this field
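
    As a compact illustration of the input-type taxonomy mentioned above, the sketch below encodes the three categories as a Python enum; the names and notes are our own shorthand, not terminology taken verbatim from the survey.

```python
# Illustrative encoding (not from the survey) of the input-type taxonomy:
# grayscale, sketch-based, and hybrid colorization inputs.
from enum import Enum, auto


class ColorizationInput(Enum):
    GRAYSCALE = auto()     # monochrome photo or video frame
    SKETCH_BASED = auto()  # line art / sketch to be filled with plausible colors
    HYBRID = auto()        # combinations of the above with extra guidance signals


def guidance_examples(kind: ColorizationInput) -> str:
    """Typical user guidance associated with each input type (illustrative only)."""
    return {
        ColorizationInput.GRAYSCALE: "color scribbles, reference images, or text descriptions",
        ColorizationInput.SKETCH_BASED: "color hints placed on regions of the sketch",
        ColorizationInput.HYBRID: "mixed guidance signals combined in one pipeline",
    }[kind]


print(guidance_examples(ColorizationInput.GRAYSCALE))
```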

    The effects of user-AI co-creation on UI design tasks

    With the boost in GPU computation power and the development of neural networks over the recent decade, many AI techniques have been invented and show strong potential for improving human tasks. The GAN (generative adversarial network), as one such recent AI technique, has a powerful ability to perform image generation tasks. In addition, many researchers are working on exploring the potential of user-AI collaboration, and on understanding it, by developing prototypes with the help of neural networks (such as GANs). Unlike previous works that focus on simple sketching tasks, this work studied the user experience with a UI design task to understand how AI could improve or harm the user experience in practical and complex design tasks. The findings are as follows: the multiple-hint AI turned out to be more user-friendly, and it is important to study and understand how the AI's presentation should be designed for user-AI collaboration. Based on these findings and previous works, this research discusses what factors should be taken into consideration when designing user-AI collaboration tools

    Image and Video Forensics

    Nowadays, images and videos have become the main modalities of information exchanged in everyday life, and their pervasiveness has led the image forensics community to question their reliability, integrity, confidentiality, and security. Multimedia content is generated in many different ways through the use of consumer electronics and high-quality digital imaging devices, such as smartphones, digital cameras, tablets, and wearable and IoT devices. The ever-increasing convenience of image acquisition has facilitated the instant distribution and sharing of digital images on social platforms, generating a great volume of exchanged data. Moreover, the pervasiveness of powerful image editing tools has allowed the manipulation of digital images for malicious or criminal ends, up to the creation of synthesized images and videos with deep learning techniques. In response to these threats, the multimedia forensics community has devoted major research efforts to identifying the source of content and detecting manipulation. In all cases where images and videos serve as critical evidence (e.g., forensic investigations, fake-news debunking, information warfare, and cyberattacks), forensic technologies that help determine the origin, authenticity, and integrity of multimedia content can become essential tools. This book aims to collect a diverse and complementary set of articles that demonstrate new developments and applications in image and video forensics, tackling new and serious challenges to ensure media authenticity

    Assistive visual content creation tools via multimodal correlation analysis

    Visual imagery is ubiquitous in society and can take various formats: from 2D sketches and photographs to photorealistic 3D renderings and animations. The creation processes for each of these media have their own unique challenges and methodologies that artists need to overcome and master. For example, for an artist to depict a 3D scene in a 2D drawing, they need to understand foreshortening effects to position and scale objects accurately on the page; or, when modeling 3D scenes, artists need to understand how light interacts with objects and materials to achieve a desired appearance. Many of these tasks can be complex, time-consuming, and repetitive for content creators. The goal of this thesis is to develop tools that relieve artists of some of these burdens and assist them in the creation process. The key hypothesis is that understanding the relationships between the multiple signals present in the scene being created enables such assistive tools. This thesis proposes three assistive tools. First, we present an image degradation model for depth-augmented image editing to help evaluate the quality of the image manipulation. Second, we address the problem of teaching novices to draw objects accurately by automatically generating easy-to-follow sketching tutorials for arbitrary 3D objects. Finally, we propose a method to automatically transfer 2D parametric user edits made to rendered 3D scenes to global variations of the original scene
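
    To make the last contribution concrete, here is a toy numpy sketch of the general idea of transferring a parametric 2D edit to a scene variation: a simple global gain/gamma tone curve is fitted from an original and an edited rendering, then re-applied to a new rendering. This is a hypothetical simplification for illustration, not the method developed in the thesis.

```python
# Toy illustration (hypothetical, not the thesis method): model a user's edit as a
# global gain/gamma tone curve fitted between an original and an edited rendering,
# then re-apply that curve to a new rendering (a "global variation" of the scene).
import numpy as np


def fit_gain_gamma(original, edited, eps=1e-6):
    """Fit edited ~= gain * original**gamma in log space (stand-in for a parametric edit)."""
    x = np.log(np.clip(original, eps, 1.0)).ravel()
    y = np.log(np.clip(edited, eps, 1.0)).ravel()
    gamma, log_gain = np.polyfit(x, y, 1)  # slope = gamma, intercept = log(gain)
    return np.exp(log_gain), gamma


def apply_edit(render, gain, gamma):
    """Re-apply the fitted edit to another rendering of the scene."""
    return np.clip(gain * np.power(np.clip(render, 0.0, 1.0), gamma), 0.0, 1.0)


rng = np.random.default_rng(0)
original = rng.random((64, 64, 3))                 # original rendering, values in [0, 1]
edited = np.clip(1.1 * original ** 0.8, 0.0, 1.0)  # synthetic "user edit": brighten + lift shadows

gain, gamma = fit_gain_gamma(original, edited)
new_view = rng.random((64, 64, 3))                 # a variation of the scene (new view or lighting)
transferred = apply_edit(new_view, gain, gamma)
print(f"fitted gain={gain:.2f}, gamma={gamma:.2f}")
```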

    Compendium of U.S. Copyright Office Practices, Third Edition

    The Compendium of U.S. Copyright Office Practices, Third Edition (the “Compendium” or “Third Edition”) is the administrative manual of the Register of Copyrights concerning Title 17 of the United States Code and Chapter 37 of the Code of Federal Regulations. It provides instruction to agency staff regarding their statutory duties and provides expert guidance to copyright applicants, practitioners, scholars, the courts, and members of the general public regarding institutional practices and related principles of law. The Compendium documents and explains the many technical requirements, regulations, and legal interpretations of the U.S. Copyright Office with a primary focus on the registration of copyright claims, documentation of copyright ownership, and recordation of copyright documents, including assignments and licenses. It describes the wide range of services that the Office provides for searching, accessing, and retrieving information located in its extensive collection of copyright records and the associated fees for these services. The Compendium provides guidance regarding the contents and scope of particular registrations and records. And it seeks to educate applicants about a number of common mistakes, such as providing incorrect, ambiguous, or insufficient information, or making overbroad claims of authorship. The Compendium does not cover every principle of copyright law or detail every aspect of the Office’s administrative practices. The Office may, in exceptional circumstances, depart from its normal practices to ensure an outcome that is most appropriate. The Compendium does not override any existing statute or regulation. The policies and practices set forth in the Compendium do not in themselves have the force and effect of law and are not binding upon the Register of Copyrights or Copyright Office staff. However, the Compendium does explain the legal rationale and determinations of the Copyright Office, where applicable, including circumstances where there is no controlling judicial authority