6 research outputs found

    MORE: Simultaneous Multi-View 3D Object Recognition and Pose Estimation

    Object recognition and pose estimation are two key functionalities that robots need in order to interact safely with humans and with their environments. Although both tasks use visual input, most state-of-the-art approaches tackle them as two separate problems, since recognition requires a view-invariant representation whereas pose estimation necessitates a view-dependent description. Nowadays, multi-view Convolutional Neural Network (MVCNN) approaches show state-of-the-art classification performance. Although MVCNN object recognition has been widely explored, there has been very little research on multi-view object pose estimation, and even less on addressing the two problems simultaneously. Moreover, the poses of the virtual cameras in MVCNN methods are usually pre-defined, which limits the applicability of such approaches. In this paper, we propose an approach capable of handling object recognition and pose estimation simultaneously. In particular, we develop a deep object-agnostic entropy estimation model capable of predicting the best viewpoints of a given 3D object. The views obtained from these viewpoints are then fed to the network to simultaneously predict the pose and the category label of the target object. Experimental results show that the views obtained from such positions are descriptive enough to achieve a good accuracy score. Code is available online at: https://github.com/tparisotto/more_mvcn
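
    The abstract describes selecting the most informative viewpoints with a learned, object-agnostic entropy estimation model. The snippet below is only a minimal Python sketch of that idea: it scores candidate viewpoints by the Shannon entropy computed directly from their rendered depth images (a hand-crafted proxy, not the paper's learned model) and keeps the top-k views. The rendering step is assumed to happen elsewhere, and all function names are illustrative assumptions.

# Illustrative sketch (not the paper's implementation): rank candidate
# viewpoints of an object by the Shannon entropy of their rendered depth
# images, as a stand-in for the learned entropy-estimation model.
import numpy as np

def view_entropy(depth_image: np.ndarray, bins: int = 64) -> float:
    """Shannon entropy of a single rendered depth image (higher = more informative)."""
    hist, _ = np.histogram(depth_image, bins=bins, density=True)
    hist = hist[hist > 0]
    # Normalise to a probability distribution before computing entropy.
    p = hist / hist.sum()
    return float(-(p * np.log2(p)).sum())

def select_best_views(depth_images: list[np.ndarray], k: int = 3) -> list[int]:
    """Return the indices of the k views with the highest entropy."""
    scores = [view_entropy(img) for img in depth_images]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Example with synthetic depth maps standing in for rendered views.
rng = np.random.default_rng(0)
views = [rng.random((128, 128)) for _ in range(12)]
print(select_best_views(views, k=3))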

    3D Model Retrieval Based on Vision Feature Fusion

    This paper presents a modified view-based 3D model retrieval algorithm that uses the multi-view features of a 3D model as its overall feature descriptor and matches models by fusing their contour and texture features. Two-dimensional depth images of the 3D model are first rendered from different views on a spherical bounding box; the contour features and texture features of these images are then fused to form the descriptor of the 3D model. The experimental results show that the proposed method achieves considerable improvements in retrieval speed and effectiveness compared with other view-based 3D model retrieval methods.
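
    As a rough illustration of the view-based fusion described above, the following Python sketch computes a simple contour-like descriptor (a gradient-orientation histogram) and a texture descriptor (an intensity histogram) for each depth view, concatenates them into one model descriptor, and retrieves the nearest database model by L2 distance. The choice of descriptors, the equal-weight concatenation, and the assumption of normalized depth images are illustrative simplifications, not the paper's exact features.

# Hedged sketch of contour + texture feature fusion for view-based retrieval.
import numpy as np

def contour_descriptor(view: np.ndarray, bins: int = 16) -> np.ndarray:
    """Histogram of gradient orientations, a rough proxy for contour shape."""
    gy, gx = np.gradient(view.astype(float))
    angles = np.arctan2(gy, gx)[np.hypot(gx, gy) > 1e-3]
    hist, _ = np.histogram(angles, bins=bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

def texture_descriptor(view: np.ndarray, bins: int = 16) -> np.ndarray:
    """Histogram of pixel intensities as a simple texture statistic (views in [0, 1])."""
    hist, _ = np.histogram(view, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def model_descriptor(views: list[np.ndarray]) -> np.ndarray:
    """Fuse contour and texture features of all views into one vector (same view count per model)."""
    parts = [np.concatenate([contour_descriptor(v), texture_descriptor(v)]) for v in views]
    return np.concatenate(parts)

def retrieve(query: np.ndarray, database: list[np.ndarray]) -> int:
    """Return the index of the database model closest to the query (L2 distance)."""
    return int(np.argmin([np.linalg.norm(query - d) for d in database]))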

    Canonical 3D object orientation for interactive light-field visualization

    Light-field visualization allows users to freely choose a preferred observation location within the display's valid field of view. As such 3D visualization technology offers continuous motion parallax, the user's location determines the perceived orientation of the visualized content when static objects and scenes are considered. In the case of interactive light-field visualization, arbitrary rotation of the content enables efficient orientation changes without any actual user movement. However, while content orientation preference is a subjective matter, it can also be managed and assessed objectively. In this paper, we present a series of subjective tests carried out on a real light-field display to address static content orientation preference. State-of-the-art objective methodologies were used to evaluate the experimental setup and the content. We then used the subjective results to develop our own objective metric for canonical orientation selection.
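
    The abstract mentions deriving an objective metric for canonical orientation selection from subjective results, but does not detail it. The sketch below is only a hypothetical illustration of such a selection step: given mean opinion scores collected for a set of discrete content orientations, it smooths the scores circularly over the angle axis and returns the best-scoring orientation. The smoothing window and the example scores are invented for demonstration.

# Minimal, assumption-laden sketch: pick a canonical orientation as the angle
# whose (circularly) smoothed subjective score is highest. Not the paper's metric.
import numpy as np

def canonical_orientation(angles_deg: np.ndarray, scores: np.ndarray, window: int = 3) -> float:
    """Return the angle whose smoothed subjective score is highest."""
    kernel = np.ones(window) / window
    # Wrap-around padding because orientation is periodic.
    padded = np.concatenate([scores[-window:], scores, scores[:window]])
    smoothed = np.convolve(padded, kernel, mode="same")[window:-window]
    return float(angles_deg[int(np.argmax(smoothed))])

angles = np.arange(0, 360, 30)                  # tested orientations
mos = np.array([3.1, 3.4, 4.2, 4.6, 4.1, 3.5,   # hypothetical mean opinion scores
                2.9, 2.7, 3.0, 3.3, 3.2, 3.0])
print(canonical_orientation(angles, mos))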

    Exploiting user interactivity in quality assessment of point cloud imaging

    Point clouds are a new modality for the representation of plenoptic content and a popular alternative for creating immersive media. Despite recent progress in capture, display, storage, delivery and processing, the problem of reliably assessing the quality of point clouds, both subjectively and objectively, remains largely open. In this study, we extend the state of the art in projection-based objective quality assessment of point cloud imaging by investigating the impact of the number of viewpoints employed to assess the visual quality of a content, while discarding information that does not belong to the object under assessment, such as background color. Additionally, we propose assigning weights to the projected views based on interactivity information obtained during subjective evaluation experiments. In the experiment that was conducted, human observers assessed a carefully selected collection of typical contents, subject to geometry and color degradations due to compression. The point cloud models were rendered using cubes as primitive elements with adaptive sizes based on local neighborhoods. Our results show that employing a larger number of projected views does not necessarily lead to better predictions of visual quality, while user interactivity information can improve prediction performance.
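
    A minimal sketch of the interactivity-weighted, projection-based quality idea described above is given below, assuming PSNR as the per-view metric and viewing time as the interactivity signal; both choices, as well as the function names, are assumptions rather than the study's actual formulation. Each projected view of the degraded point cloud is scored against the corresponding reference projection, and the per-view scores are averaged with weights proportional to how long users spent at each viewpoint.

# Hedged sketch of interactivity-weighted projection-based quality assessment.
import numpy as np

def view_psnr(reference: np.ndarray, degraded: np.ndarray, peak: float = 1.0) -> float:
    """PSNR between matching projections of the reference and degraded content (values in [0, peak])."""
    mse = float(np.mean((reference - degraded) ** 2))
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def weighted_quality(ref_views, deg_views, view_times) -> float:
    """Average per-view quality scores, weighted by time users spent at each viewpoint."""
    weights = np.asarray(view_times, dtype=float)
    weights /= weights.sum()
    scores = np.array([view_psnr(r, d) for r, d in zip(ref_views, deg_views)])
    return float(np.dot(weights, scores))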

    Visualization, Adaptation, and Transformation of Procedural Grammars

    Procedural shape grammars are powerful tools for the automatic generation of highly detailed 3D content from a set of descriptive rules. Variations are easy to encode in stochastic and parametric grammars, and an uncountable number of models can be generated quickly. While shape grammars offer these advantages over manual 3D modeling, they also suffer from certain drawbacks. We present three novel methods that address some of the limitations of shape grammars. First, it is often difficult to grasp the diversity of models defined by a given grammar. We propose a pipeline to automatically generate, cluster, and select a set of representative preview images for a grammar. The system is based on a new view attribute descriptor that measures how suitable an image is for representing a model and that enables the comparison of different models derived from the same grammar. Second, the default distribution of models in a stochastic grammar is often undesirable. We introduce a framework that allows users to design a new probability distribution for a grammar without editing its rules. Gaussian process regression interpolates user preferences from a set of scored models over the entire shape space, and a symbol split operation adapts the grammar so that it generates models according to the learned distribution. Third, it is hard to combine elements of two grammars to create new designs. We present design transformations and grammar co-derivation for creating new designs from existing ones. Algorithms for fine-grained rule merging can generate a large space of design variations and can be used to create animated transformation sequences between different procedural designs. Our contributions to visualizing, adapting, and transforming grammars make the procedural modeling methodology more accessible to non-programmers.
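
    The second contribution above relies on Gaussian process regression to interpolate user preference scores over a shape space. The following numpy-only Python sketch shows that interpolation step in its simplest form, with a one-dimensional shape-parameter space, an RBF kernel, and hypothetical user scores; the actual system operates on grammar-derived models and also adapts the grammar via symbol splits, which is not shown here.

# Illustrative numpy-only sketch: interpolate a few user scores over a shape
# space with Gaussian process (RBF-kernel) regression.
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, length_scale: float = 0.5) -> np.ndarray:
    """Squared-exponential covariance between two sets of 1-D shape parameters."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_query, noise: float = 1e-2) -> np.ndarray:
    """Posterior mean of a GP fitted to user scores, evaluated at query parameters."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf_kernel(x_query, x_train)
    return k_star @ np.linalg.solve(K, y_train)

# A few user-scored shape parameters and the interpolated preference curve.
x_scored = np.array([0.1, 0.3, 0.5, 0.8])   # shape-space coordinates of scored models
scores = np.array([2.0, 4.5, 3.0, 1.0])     # hypothetical user preference scores
x_all = np.linspace(0.0, 1.0, 11)
preference = gp_predict(x_scored, scores, x_all)
# Sampling new models proportionally to the learned preference would follow here.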