67 research outputs found

    BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields

    Full text link
    Neural rendering combines ideas from classical computer graphics and machine learning to synthesize images from real-world observations. NeRF, short for Neural Radiance Fields, is a recent innovation that uses AI algorithms to create 3D objects from 2D images. By leveraging an interpolation approach, NeRF can produce new 3D reconstructed views of complicated scenes. Rather than directly restoring the whole 3D scene geometry, NeRF generates a volumetric representation called a "radiance field," which is capable of creating color and density for every point within the relevant 3D space. The broad appeal and notoriety of NeRF make it imperative to examine the existing research on the topic comprehensively. While previous surveys on 3D rendering have primarily focused on traditional computer vision-based or deep learning-based approaches, only a handful of them discuss the potential of NeRF. However, such surveys have predominantly focused on NeRF's early contributions and have not explored its full potential. NeRF is a relatively new technique continuously being investigated for its capabilities and limitations. This survey reviews recent advances in NeRF and categorizes them according to their architectural designs, especially in the field of novel view synthesis. Comment: 22 pages, 1 figure, 5 tables

    A multimedia indexing and retrieval framework for multimedia database systems

    No full text
    The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
    Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.

    Guest Editors’ Introduction

    No full text

    IEEE IRI 2012 INTERNATIONAL TECHNICAL PROGRAM COMMITTEE

    No full text
    Welcome to the proceedings of the 13th IEEE International Conference on Information Reuse and Integration (IEEE IRI 2012) in Las Vegas, Nevada, USA. Information Reuse and Integration (IRI) aims at maximizing the reuse of information by creating simple, rich, and reusable knowledge representations and consequently explores strategies for integrating this knowledge into legacy systems. IRI plays a pivotal role in the capture, representation, maintenance, integration, validation, and extrapolation of information; and applies both information and knowledge for enhancing decision-making in various application domains. During more than a decade of conferences, IRI has established itself as an internationally renowned forum for researchers and practitioners to exchange ideas, connect with colleagues, and advance the state of the art and practice of current and future research in information reuse and integration.

    Semantic Event Extraction Using Neural Network Ensembles

    No full text
    This paper proposes a novel semantic content analysis framework for reliable video event extraction, which is essential for high-level video indexing and retrieval. In this work, we aim to address the unique challenges posed in rare event detection, where positive examples (i.e., eventful data points) are vastly outnumbered and thus overshadowed by negative ones (i.e., noneventful data points). The proposed framework tackles this issue by integrating the strength of multimodal content analysis and neural network ensembles. Specifically, due to the rareness of the target events, the bootstrapped sampling method is adopted to reduce the effect of class imbalance and a group of component neural networks is constructed accordingly. Thereafter, a weighting scheme is applied to intelligently traverse and combine the component network predictions. The effectiveness of the proposed framework is demonstrated over a large collection of soccer video data with different styles produced by different broadcasters.
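
    The abstract above does not give implementation details, so the following is only a minimal sketch of the general idea: draw balanced bootstrap samples from an imbalanced data set, fit one component model per sample, and combine the component predictions with weights. Every name here is hypothetical. The 1-D nearest-centroid classifier is a stand-in for a component neural network, and weighting by validation accuracy is an assumed scheme, not necessarily the one used in the paper.

```python
import random


def balanced_bootstrap(positives, negatives, rng):
    """Resample the rare positives and an equal number of negatives, with replacement."""
    k = len(positives)
    pos = [rng.choice(positives) for _ in range(k)]
    neg = [rng.choice(negatives) for _ in range(k)]
    return pos, neg


def fit_centroid_model(pos, neg):
    """Hypothetical stand-in for a component network: 1-D nearest-centroid classifier."""
    c_pos = sum(pos) / len(pos)
    c_neg = sum(neg) / len(neg)

    def predict(x):
        # Label 1 (event) if x is closer to the positive centroid, else 0.
        return 1 if abs(x - c_pos) < abs(x - c_neg) else 0

    return predict


def accuracy(model, labelled):
    """Fraction of (x, y) pairs the model labels correctly."""
    return sum(model(x) == y for x, y in labelled) / len(labelled)


def build_ensemble(positives, negatives, validation, n_models=5, seed=0):
    """Train one component per balanced bootstrap sample; weight = validation accuracy (assumed)."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        pos, neg = balanced_bootstrap(positives, negatives, rng)
        model = fit_centroid_model(pos, neg)
        ensemble.append((model, accuracy(model, validation)))
    return ensemble


def predict_weighted(ensemble, x):
    """Weighted vote over the component predictions."""
    total = sum(weight for _, weight in ensemble)
    score = sum(weight * model(x) for model, weight in ensemble)
    return 1 if total > 0 and score / total >= 0.5 else 0


# Toy data: 3 rare "event" points against 60 "non-event" points.
positives = [9.0, 10.0, 11.0]
negatives = [float(i % 3) for i in range(60)]
validation = [(10.0, 1), (1.0, 0), (0.0, 0), (9.5, 1)]
ensemble = build_ensemble(positives, negatives, validation)
```

    Because each bootstrap sample holds as many negatives as positives, no single component is trained on the raw 20:1 imbalance, and the weighted vote then lets components that do better on held-out data count for more.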