67 research outputs found
BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields
Neural rendering combines ideas from classical computer graphics and machine
learning to synthesize images from real-world observations. NeRF, short for
Neural Radiance Fields, is a recent innovation that uses AI algorithms to
create 3D objects from 2D images. By leveraging an interpolation approach, NeRF
can produce new 3D reconstructed views of complicated scenes. Rather than
directly restoring the whole 3D scene geometry, NeRF generates a volumetric
representation called a "radiance field," which is capable of creating color
and density for every point within the relevant 3D space. The broad appeal and
notoriety of NeRF make it imperative to examine the existing research on the
topic comprehensively. While previous surveys on 3D rendering have primarily
focused on traditional computer vision-based or deep learning-based approaches,
only a handful of them discuss the potential of NeRF. However, such surveys
have predominantly focused on NeRF's early contributions and have not explored
its full potential. NeRF is a relatively new technique continuously being
investigated for its capabilities and limitations. This survey reviews recent
advances in NeRF and categorizes them according to their architectural designs,
especially in the field of novel view synthesis. Comment: 22 page, 1 figure, 5 tabl
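To make the "color and density for every point" idea concrete, the following is a minimal sketch of how samples along one camera ray can be blended into a pixel color. It is not taken from the survey; it assumes the standard alpha-compositing quadrature used in NeRF-style volume rendering, and the function name `composite_ray` and its arguments are hypothetical.

```python
import math


def composite_ray(densities, colors, deltas):
    """Blend per-sample (density, RGB) values along one ray into a pixel color.

    densities: volume density at each sample along the ray
    colors:    RGB triple at each sample
    deltas:    distance between consecutive samples
    All names are illustrative assumptions, not an API from the survey.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed so far
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb, delta in zip(densities, colors, deltas):
        # Opacity contributed by this sample over its segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        for k in range(3):
            pixel[k] += weight * rgb[k]
        # Light remaining for the samples further along the ray.
        transmittance *= 1.0 - alpha
    return pixel, transmittance
```

An empty ray (all densities zero) leaves the pixel black and the transmittance at 1, while a very dense first sample hides everything behind it.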
A multimedia indexing and retrieval framework for multimedia database systems
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model
A dynamic user concept pattern learning framework for content-based image retrieval
IEEE IRI 2012 INTERNATIONAL TECHNICAL PROGRAM COMMITTEE
Welcome to the proceedings of the 13th IEEE International Conference on Information Reuse and Integration (IEEE IRI 2012) in Las Vegas, Nevada, USA. Information Reuse and Integration (IRI) aims at maximizing the reuse of information by creating simple, rich, and reusable knowledge representations and consequently explores strategies for integrating this knowledge into legacy systems. IRI plays a pivotal role in the capture, representation, maintenance, integration, validation, and extrapolation of information; and applies both information and knowledge for enhancing decision-making in various application domains. During more than a decade of conferences, IRI has established itself as an internationally renowned forum for researchers and practitioners to exchange ideas, connect with colleagues, and advance the state of the art and practice of current and future research in information reuse and integration
Semantic Event Extraction Using Neural Network Ensembles
This paper proposes a novel semantic content analysis framework for reliable video event extraction, which is essential for high-level video indexing and retrieval. In this work, we aim to address the unique challenges posed in rare event detection, where positive examples (i.e., eventful data points) are vastly outnumbered and thus overshadowed by negative ones (i.e., noneventful data points). The proposed framework tackles this issue by integrating the strength of multimodal content analysis and neural network ensembles. Specifically, due to the rareness of the target events, the bootstrapped sampling method is adopted to reduce the effect of class imbalance and a group of component neural networks are constructed consequently. Thereafter, a weighting scheme is applied to intelligently traverse and combine the component network predictions. The effectiveness of the proposed framework is demonstrated over a large collection of soccer video data with different styles produced by different broadcasters.
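The two ingredients named in the abstract, bootstrapped sampling against class imbalance and a weighted combination of component predictions, can be sketched as below. The abstract gives no details, so this is only one plausible reading: each bootstrap set keeps all positives and draws an equal number of negatives with replacement, and the combination is a plain weighted average. The names `balanced_bootstrap` and `weighted_vote` are hypothetical, and the component networks themselves are left out.

```python
import random


def balanced_bootstrap(positives, negatives, n_sets, seed=0):
    """Build n_sets training sets, each with all positives plus an
    equally sized with-replacement sample of negatives (an assumption;
    the paper's exact sampling scheme is not given in the abstract)."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_sets):
        sampled_neg = [rng.choice(negatives) for _ in positives]
        sets.append(list(positives) + sampled_neg)
    return sets


def weighted_vote(scores, weights):
    """Combine per-component scores with a weighted average
    (a stand-in for the paper's unspecified weighting scheme)."""
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total
```

Each set would train one component model; at prediction time the component scores are merged with `weighted_vote`, for example using validation accuracy as the weight.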
Spatiotemporal vehicle tracking: the use of unsupervised learning-based segmentation and object tracking
In this paper, a framework for spatiotemporal vehicle tracking using unsupervised learning-based segmentation and object tracking is presented. An adaptive background learning and subtraction method is proposed and applied to two real-traffic video sequences to obtain more accurate spatiotemporal information on the vehicle objects. As demonstrated in the experiments, almost all vehicle objects are successfully identified through this framework
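As a rough illustration of adaptive background learning and subtraction, the sketch below keeps a running-average background and flags pixels that differ from it by more than a threshold. This is a common textbook scheme, assumed here for illustration; the abstract does not say which update rule the paper uses. The function names, the learning `rate`, and the `threshold` are hypothetical, and frames are flat lists of grayscale intensities.

```python
def update_background(background, frame, rate=0.05):
    """Move the background estimate a small step toward the new frame
    (exponential running average; an assumed update rule)."""
    return [(1.0 - rate) * b + rate * f for b, f in zip(background, frame)]


def foreground_mask(background, frame, threshold=30.0):
    """Mark pixels whose intensity differs from the background
    by more than the threshold as foreground (candidate vehicles)."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]
```

A small `rate` lets the model absorb slow lighting changes while still keeping moving vehicles out of the background estimate.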