9 research outputs found

3D Object Reconstruction from Imperfect Depth Data Using Extended YOLOv3 Network

Author: Damaševičius Robertas
Ho Edmond
Kulikajevas Audrius
Maskeliūnas Rytis
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

State-of-the-art intelligent versatile applications provoke the usage of full 3D, depth-based streams, especially in the scenarios of intelligent remote control and communications, where virtual and augmented reality will soon become outdated and are forecasted to be replaced by point cloud streams providing explorable 3D environments of communication and industrial data. One of the most novel approaches employed in modern object reconstruction methods is to use a priori knowledge of the objects that are being reconstructed. Our approach is different as we strive to reconstruct a 3D object within much more difficult scenarios of limited data availability. The data stream is often limited by insufficient depth camera coverage and, as a result, the objects are occluded and data is lost. Our proposed hybrid artificial neural network modifications have improved the reconstruction results by 8.53%, which allows for much more precise filling of occluded object sides and reduction of noise during the process. Furthermore, the addition of object segmentation masks and individual object instance classification is a leap forward towards general-purpose scene reconstruction, as opposed to a single object reconstruction task, due to the ability to mask out overlapping object instances and use only the masked object area in the reconstruction process.

Northumbria Research Link

KTUePubl (Repository of Kaunas University of Technology)

3DGen: Triplane Latent Diffusion for Textured Mesh Generation

Author: Gupta Anchit
Jones Ian
Nie Yixin
Oğuz Barlas
Xiong Wenhan
Publication date: 27/03/2023

Latent diffusion models for image generation have crossed a quality threshold which enabled them to achieve mass adoption. Recently, a series of works have made advancements towards replicating this success in the 3D domain, introducing techniques such as point cloud VAE, triplane representation, neural implicit surfaces and differentiable rendering based training. We take another step along this direction, combining these developments in a two-step pipeline consisting of 1) a triplane VAE which can learn latent representations of textured meshes and 2) a conditional diffusion model which generates the triplane features. For the first time this architecture allows conditional and unconditional generation of high quality textured or untextured 3D meshes across multiple diverse categories in a few seconds on a single GPU. It outperforms previous work substantially on image-conditioned and unconditional generation on mesh quality as well as texture generation. Furthermore, we demonstrate the scalability of our model to large datasets for increased quality and diversity. We will release our code and trained models

arXiv.org e-Print Archive

Signature Verification Using Siamese Convolutional Neural Networks

Author: Chika Yinka-Banjo
Chinonso Okoli
Publication venue: Covenant University, Ota, Nigeria
Publication date: 17/12/2019

This research describes the process of building a Siamese neural network for signature verification. The network, which uses two similar base neural networks as its underlying architecture, was built, trained and evaluated in this project. The base networks were two similar convolutional neural networks sharing the same weights during training. This architecture, commonly known as the Siamese network, helped reduce the amount of training data needed for its implementation and thus increased the model’s efficiency by 13%. Each convolutional network was made up of three convolutional layers, three pooling layers and one fully connected layer, whose outputs were passed to the contrastive loss function for comparison. A threshold function determined whether the signatures were forged. An initial accuracy of 78% led to tweaking and improvement of the model, which achieved a better prediction accuracy of 93%.
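The comparison step described in this abstract (embedding distance, contrastive loss, threshold decision) can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the function names, the margin of 1.0, the threshold of 0.5, and the convention that a genuine pair is pulled together while a forged pair is pushed beyond the margin are all assumptions, since the abstract does not state them.

```python
import math


def euclidean_distance(embedding_a, embedding_b):
    """Distance between the two embeddings produced by the shared-weight branches."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(embedding_a, embedding_b)))


def contrastive_loss(distance, is_genuine_pair, margin=1.0):
    """Contrastive loss for one pair (margin value is an assumption).

    Genuine pairs are penalised for being far apart; forged pairs are
    penalised only while they are closer than the margin.
    """
    if is_genuine_pair:
        return 0.5 * distance ** 2
    return 0.5 * max(0.0, margin - distance) ** 2


def is_forged(distance, threshold=0.5):
    """Threshold decision (threshold value is an assumption)."""
    return distance > threshold


# Hypothetical embeddings of a reference signature and a questioned signature.
reference = [0.1, 0.9, 0.3]
questioned = [0.2, 0.8, 0.3]
d = euclidean_distance(reference, questioned)
print(d, contrastive_loss(d, is_genuine_pair=True), is_forged(d))
```

In practice the threshold would be tuned on held-out pairs rather than fixed in advance.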

Covenant Journals (Covenant University)

Deep-Learning-Based 3-D Surface Reconstruction—A Survey

Author: Benediktsson Jón Atli
Cavallaro Gabriele
Debus Charlotte
Farshian Anis
Götz Markus
Nießner Matthias
Streit Achim
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 19/12/2023

In the last decade, deep learning (DL) has significantly impacted industry and science. Initially largely motivated by computer vision tasks in 2-D imagery, the focus has shifted toward 3-D data analysis. In particular, 3-D surface reconstruction, i.e., reconstructing a 3-D shape from sparse input, is of great interest to a large variety of application fields. DL-based approaches show promising quantitative and qualitative surface reconstruction performance compared to traditional computer vision and geometric algorithms. This survey provides a comprehensive overview of these DL-based methods for 3-D surface reconstruction. To this end, we will first discuss input data modalities, such as volumetric data, point clouds, and RGB, single-view, multiview, and depth images, along with corresponding acquisition technologies and common benchmark datasets. For practical purposes, we also discuss evaluation metrics enabling us to judge the reconstructive performance of different methods. The main part of the document will introduce a methodological taxonomy ranging from point- and mesh-based techniques to volumetric and implicit neural approaches. Recent research trends, both methodological and for applications, are highlighted, pointing toward future developments

KITopen

Explain what you see: argumentation-based learning and robotic vision

Author: Ayoobi Hamed
Publication venue: 'University of Groningen Press'
Publication date: 01/01/2023

In this thesis, we have introduced new techniques for the problems of open-ended learning, online incremental learning, and explainable learning. These methods have applications in the classification of tabular data, 3D object category recognition, and 3D object parts segmentation. We have utilized argumentation theory and probability theory to develop these methods. The first proposed open-ended online incremental learning approach is Argumentation-Based online incremental Learning (ABL). ABL works with tabular data and can learn with a small number of learning instances using an abstract argumentation framework and bipolar argumentation framework. It has a higher learning speed than state-of-the-art online incremental techniques. However, it has high computational complexity. We have addressed this problem by introducing Accelerated Argumentation-Based Learning (AABL). AABL uses only an abstract argumentation framework and uses two strategies to accelerate the learning process and reduce the complexity. The second proposed open-ended online incremental learning approach is the Local Hierarchical Dirichlet Process (Local-HDP). Local-HDP aims at addressing two problems of open-ended category recognition of 3D objects and segmenting 3D object parts. We have utilized Local-HDP for the task of object part segmentation in combination with AABL to achieve an interpretable model to explain why a certain 3D object belongs to a certain category. The explanations of this model tell a user that a certain object has specific object parts that look like a set of the typical parts of certain categories. Moreover, integrating AABL and Local-HDP leads to a model that can handle a high degree of occlusion

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

On the Automation and Diagnosis of Visual Intelligence

Author: Liu Chenxi
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 08/01/2021

One of the ultimate goals of computer vision is to equip machines with visual intelligence: the ability to understand a scene at the level that is indistinguishable from human's. This not only requires detecting the 2D or 3D locations of objects, but also recognizing their semantic categories, or even higher level interactions. Thanks to decades of vision research as well as recent developments in deep learning, we are closer to this goal than ever. But to keep closing the gap, more research is needed on two themes. One, current models are still far from perfect, so we need a mechanism to keep proposing new, better models to improve performance. Two, while we are pushing for performance, it is also important to do careful analysis and diagnosis of existing models, to make sure we are indeed moving in the right direction. In this dissertation, I study either of the two research themes for various steps in the visual intelligence pipeline. The first part of the dissertation focuses on category-level understanding of 2D images, which is arguably the most critical step in the visual intelligence pipeline as it bridges vision and language. The theme is on automating the process of model improvement: in particular, the architecture of neural networks. The second part extends the visual intelligence pipeline along the language side, and focuses on the more challenging language-level understanding of 2D images. The theme also shifts to diagnosis, by examining existing models, proposing interpretable models, or building diagnostic datasets. The third part continues in the diagnosis theme, this time extending along the vision side, focusing on how incorporating 3D scene knowledge may facilitate the evaluation of image recognition models

JScholarship

Reconstruction of 3D Object Shape Using Hybrid Modular Neural Network Architecture Trained on 3D Models from ShapeNetCore Dataset

Author: Audrius Kulikajevas
Baldwin
Kainz
Robertas Damaševičius
Rytis Maskeliūnas
Sanjay Misra
Publication venue: 'MDPI AG'

Crossref

Reconstruction of 3D Object Shape Using Hybrid Modular Neural Network Architecture Trained on 3D Models from ShapeNetCore Dataset

Author: Audrius Kulikajevas
Robertas Damaševičius
Rytis Maskeliūnas
Sanjay Misra
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

Depth-based reconstruction of the three-dimensional (3D) shape of objects is one of the core problems in computer vision, with many commercial applications. However, 3D scanning for point cloud-based video streaming is expensive and is generally unattainable for an average user due to the required setup of multiple depth sensors. We propose a novel hybrid modular artificial neural network (ANN) architecture, which can reconstruct smooth polygonal meshes from a single depth frame, using a priori knowledge. The architecture of the neural network consists of separate nodes for recognition of object type and for reconstruction, thus allowing for easy retraining and extension to new object types. We performed recognition of nine real-world objects using the neural network trained on the ShapeNetCore model dataset. The results, evaluated quantitatively using the Intersection-over-Union (IoU), Completeness, Correctness and Quality metrics and qualitatively by visual inspection, demonstrate the robustness of the proposed architecture with respect to different viewing angles and illumination conditions.
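For readers unfamiliar with the overlap metrics named in this abstract, the sketch below computes IoU, Completeness and Correctness between two sets of occupied voxel coordinates. It assumes the common textbook definitions (IoU = TP / (TP + FP + FN), Completeness = TP / (TP + FN), Correctness = TP / (TP + FP)) and a voxel-set representation; the abstract does not give the paper's exact formulas, so the function name and these definitions are assumptions, and the Quality metric is left out for the same reason.

```python
def voxel_overlap_metrics(predicted, reference):
    """Overlap metrics between two collections of occupied voxel coordinates.

    Definitions are the usual textbook ones and are an assumption here,
    not taken from the paper.
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)    # voxels present in both
    false_pos = len(predicted - reference)   # reconstructed but not in the reference
    false_neg = len(reference - predicted)   # in the reference but missed

    def ratio(numerator, denominator):
        # Guard against empty inputs.
        return numerator / denominator if denominator else 0.0

    return {
        "iou": ratio(true_pos, true_pos + false_pos + false_neg),
        "completeness": ratio(true_pos, true_pos + false_neg),
        "correctness": ratio(true_pos, true_pos + false_pos),
    }


# Hypothetical toy example: integer (x, y, z) voxel indices.
predicted = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
reference = {(1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)}
print(voxel_overlap_metrics(predicted, reference))
```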

KTUePubl (Repository of Kaunas University of Technology)

Directory of Open Access Journals