Automatic visual recognition using parallel machines

Author: Chen Yui-Liang
Publication venue: Digital Commons @ NJIT
Publication date: 31/10/1995

Invariant features and quick matching algorithms are two major concerns in automatic visual recognition. The former reduce the size of an established model database, and the latter shorten the computation time. This dissertation discusses both line invariants under perspective projection and a parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines is demonstrated through the dramatically reduced time complexity. Our algorithms are implemented on the AP1000 MIMD parallel machines. For processing an object with n features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n^2). Two applications, one for shape matching and the other for chain-code extraction, are used to demonstrate the usefulness of our methods. Invariants from four general lines under perspective projection are also discussed. In contrast to the approach that uses epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without the need for camera calibration. A projective invariant recognition system based on a hypothesis-generation-testing scheme is run on the hypercube parallel architecture. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, a process called transfer. A hypothesis-generation-testing scheme is then implemented on the hypercube parallel architecture.
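A rough illustration of how a dynamic-programming match can go from O(n^2) sequential work to O(n) parallel steps: the sketch below is not the dissertation's algorithm. It assumes edit distance between two chain codes as a stand-in matching cost, and the function name is hypothetical. Cells on the same anti-diagonal of the table depend only on earlier diagonals, so with one processor per cell each diagonal could be filled in a single step.

```python
def edit_distance_wavefront(a, b):
    """Edit distance between sequences a and b, filled by anti-diagonals.

    Returns (distance, steps), where steps is the number of anti-diagonals
    swept. Illustrative stand-in only, not the AP1000 implementation.
    """
    n, m = len(a), len(b)
    # dp[i][j] = cost of aligning a[:i] with b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j

    steps = 0
    # Sweep anti-diagonals d = i + j. Every cell on one diagonal reads only
    # cells from diagonals d-1 and d-2, so the inner loop has no internal
    # dependencies and could run concurrently.
    for d in range(2, n + m + 1):
        steps += 1
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            substitute = dp[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            dp[i][j] = min(substitute, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[n][m], steps


# Two 4-element chain codes: 16 table cells, but only 7 diagonal steps.
distance, steps = edit_distance_wavefront([0, 0, 1, 2], [0, 1, 1, 2])
```

The sequential version visits all n*m cells, while the number of diagonals grows as n + m - 1, which is the linear step count a parallel machine could approach.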

Digital Commons @ New Jersey Institute of Technology (NJIT)

ShadowNeuS: Neural SDF Reconstruction by Shadow Ray Supervision

Author: Ling Jingwang
Wang Zhibo
Xu Feng
Publication date: 25/11/2022

By supervising camera rays between a scene and multi-view image planes, NeRF reconstructs a neural scene representation for the task of novel view synthesis. On the other hand, shadow rays between the light source and the scene have yet to be considered. Therefore, we propose a novel shadow ray supervision scheme that optimizes both the samples along the ray and the ray location. By supervising shadow rays, we successfully reconstruct a neural SDF of the scene from single-view pure shadow or RGB images under multiple lighting conditions. Given single-view binary shadows, we train a neural network to reconstruct a complete scene not limited by the camera's line of sight. By further modeling the correlation between the image colors and the shadow rays, our technique can also be effectively extended to RGB inputs. We compare our method with previous works on challenging tasks of shape reconstruction from single-view binary shadow or RGB images and observe significant improvements. The code and data will be released. Project page: https://gerwang.github.io/shadowneus
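To make the notion of a shadow ray concrete, here is a minimal sketch that marches from a scene point toward the light through a signed distance function and reports whether the ray is blocked. It is only a geometric illustration under stated assumptions: the SDF is an analytic unit sphere rather than a learned network, the visibility test is a hard yes/no rather than a differentiable supervision signal, and all names are hypothetical.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere (assumed toy scene)."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, center))) - radius


def in_shadow(point, light, sdf=sphere_sdf, eps=1e-4, max_steps=128):
    """Sphere-trace from point toward light; True if the SDF surface blocks it.

    The query point is assumed NOT to lie on the SDF surface itself,
    otherwise it would immediately register as self-shadowed.
    """
    to_light = [l - p for p, l in zip(point, light)]
    dist_to_light = math.sqrt(sum(c * c for c in to_light))
    direction = [c / dist_to_light for c in to_light]

    t = 0.0
    for _ in range(max_steps):
        if t >= dist_to_light:
            return False  # reached the light without hitting anything
        pos = [p + t * d for p, d in zip(point, direction)]
        d = sdf(pos)
        if d < eps:
            return True  # ray hit the surface: the point is shadowed
        t += d  # safe step: no surface is closer than d
    return False


light = (0.0, 5.0, 0.0)
under_sphere = in_shadow((0.0, -2.0, 0.0), light)  # sphere sits between point and light
off_to_side = in_shadow((3.0, -2.0, 0.0), light)   # ray passes clear of the sphere
```

A binary shadow image is, in effect, this test evaluated for every visible pixel, which is why it can constrain geometry that the camera never sees directly.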

arXiv.org e-Print Archive

The MVP sensor planning system for robotic vision tasks

Author: Allen Peter K.
Tarabanis Konstantinos
Tsai Roger Y.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/1995

The MVP (machine vision planner) model-based sensor planning system for robotic vision is presented. MVP automatically synthesizes desirable camera views of a scene based on geometric models of the environment, optical models of the vision sensors, and models of the task to be achieved. The generic task of feature detectability has been chosen since it is applicable to many robot-controlled vision systems. For such a task, features of interest in the environment are required to be simultaneously visible, inside the field of view, in focus, and magnified as required. In this paper, we present a technique that poses the vision sensor planning problem in an optimization setting and determines viewpoints that satisfy all of the above requirements simultaneously and with a margin. In addition, we present experimental results of this technique when applied to a robotic vision system that consists of a camera mounted on a robot manipulator in a hand-eye configuration.

Columbia University Academic Commons

3D Model Assisted Image Segmentation

Author: Hutter Marcus
Jayawardena Srimal
Yang Di
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/02/2016

The Australian National University

3D Model Assisted Image Segmentation

Author: Di Yang
Marcus Hutter
Srimal Jayawardena
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011

The problem of segmenting a given image into coherent regions is important in computer vision, and many industrial applications require segmenting a known object into its components. Examples include identifying individual parts of a component for process control work in a manufacturing plant and identifying parts of a car from a photo for automatic damage detection. Unfortunately, most of an object’s parts of interest in such applications share the same pixel characteristics, having similar colour and texture. This makes segmenting the object into its components a non-trivial task for conventional image segmentation algorithms. In this paper, we propose a “Model Assisted Segmentation” method to tackle this problem. A 3D model of the object is registered over the given image by optimising a novel gradient-based loss function. This registration obtains the full 3D pose from an image of the object. The image can have an arbitrary view of the object and is not limited to a particular set of views. The segmentation is subsequently performed using a level-set based method, using the projected contours of the registered 3D model as initialisation curves. The method is fully automatic and requires no user interaction. Also, the system does not require any prior training. We present our results on photographs of a real car.

CiteSeerX

Crossref

The Australian National University

Integrated smoothed location model and data reduction approaches for multi variables classification

Author: Hashibah Hamid
Publication date: 01/01/2014

The Smoothed Location Model is a classification rule that deals with a mixture of continuous and binary variables simultaneously. This rule discriminates groups in a parametric form using the conditional distribution of the continuous variables given each pattern of the binary variables. To conduct a practical classification analysis, the objects must first be sorted into the cells of a multinomial table generated from the binary variables. Then, the parameters in each cell are estimated using the sorted objects. However, in many situations the estimated parameters are poor if the number of binary variables is large relative to the sample size. A large number of binary variables creates too many empty multinomial cells, leading to a high sparsity problem and ultimately to exceedingly poor performance of the constructed rule. In the worst case, the rule cannot be constructed. To overcome these shortcomings, this study proposes new strategies to extract adequate variables that contribute to optimum performance of the rule. Combinations of two extraction techniques are introduced, namely 2PCA and PCA+MCA, with new cutpoints of eigenvalue and total variance explained, to determine adequate extracted variables which lead to a minimum misclassification rate. The outcomes from these extraction techniques are used to construct the smoothed location models, which then produce two new approaches of classification called 2PCALM and 2DLM. Numerical evidence from simulation studies demonstrates that the computed misclassification rate indicates no significant difference between the extraction techniques in normal and non-normal data. Nevertheless, both proposed approaches are slightly affected for non-normal data and severely affected for highly overlapping groups. Investigations on some real data sets show that the two approaches are competitive with, and better than, other existing classification methods.
The overall findings reveal that both proposed approaches can be considered an improvement to the location model and alternatives to other classification methods, particularly in handling mixed variables with a large number of binary variables.
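The sparsity problem described above can be seen with a few lines of arithmetic. The sketch below is an assumption-laden illustration, not the thesis's procedure: it maps each object's binary pattern to a cell of the multinomial table and reports the fraction of cells left empty. The function names and the sample data are hypothetical.

```python
from collections import Counter


def cell_index(binary_vars):
    """Map a pattern of 0/1 values to a cell number in the multinomial table."""
    index = 0
    for k, bit in enumerate(binary_vars):
        index += bit << k  # variable k contributes 2**k when it equals 1
    return index


def empty_cell_fraction(samples, num_binary):
    """Fraction of the 2**num_binary cells that receive no objects."""
    occupied = Counter(cell_index(s) for s in samples)
    total_cells = 2 ** num_binary
    return (total_cells - len(occupied)) / total_cells


# Three objects described by three binary variables: 8 cells, only 2 occupied.
samples = [[0, 0, 0], [1, 0, 1], [1, 0, 1]]
fraction = empty_cell_fraction(samples, num_binary=3)
```

Because the number of cells doubles with each added binary variable while the sample size stays fixed, the empty fraction approaches 1 quickly, which is the motivation for reducing the binary variables before building the rule.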

Universiti Utara Malaysia: UUM eTheses

A Comprehensive Review and Open Challenges on Visual Question Answering Models

Author: Gupta Ashutosh
Kalla Mukesh
Koshti Dipali
Shaik Fasi Ahamad
Sharma Arvind
Publication venue: 'Universidad de Santander - UDES'
Publication date: 01/09/2023

Thanks to recent developments in artificial intelligence, users are now able to interact actively with images and pose different questions about them. In turn, an answer in natural language is expected. The study discusses a variety of datasets that can be used to examine applications for visual question answering (VQA), as well as their advantages and disadvantages. Four different forms of VQA models (simple joint embedding-based models, attention-based models, knowledge-incorporated models, and domain-specific VQA models) are examined in depth in this article. We also critically assess the drawbacks and future possibilities of all current state-of-the-art (SOTA), end-to-end VQA models. Finally, we present directions and guidelines for further development of VQA models.

Revistas UDES (Universidad de Santander)

Estimating Vertical Visual Field Luminance Measurements From Ceiling-Based Measurements Using Machine Learning

Author: Chinwa Songwa Prince U.
Publication date: 25/09/2020

Pure OAI Repository