
    Fast 3D reconstruction with single shot technology: engineering and computing challenges

    The GMPR 3D scanning technologies provide fast, wide-area scanning from a single instantaneous shot. A surface can be reconstructed in 40 milliseconds from a pattern of stripes projected onto the target object. The method operates on a single image or on a video sequence, in both the near-infrared (NIR) and visible spectra. In this talk we describe the engineering and computing principles behind the technologies, highlight the main achievements of our research to date and discuss a number of remaining challenges.

    Fast 3D Reconstruction using Structured Light Methods

    In this presentation we discuss the use of structured light scanners for the general problem of 3D surface reconstruction. We show that projecting patterns of light provides an inexpensive means of consistent 3D scanning at high resolution, in real time and from single images. The main difficulty of such techniques is pattern decoding or stripe indexing, which can be substantially non-trivial and hard to solve reliably. We discuss existing techniques and show how minimal light coding in the projected stripes can resolve the inherent ambiguities found in stripe patterns across surface discontinuities. We also discuss how our real-time solution using structured near-infrared light can overcome ambient illumination and be used in a variety of medical contexts.
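    As a rough illustration of the stripe-indexing and triangulation steps discussed above (not the GMPR implementation), the sketch below assumes a calibrated camera, a known plane in camera coordinates for each projected stripe, and an unambiguous left-to-right stripe order; the simple peak detector and all function names are hypothetical.

        import numpy as np

        def detect_stripe_peaks(row, threshold=0.5):
            # Return column positions of local intensity maxima (stripe centres) in one image row.
            return [c for c in range(1, len(row) - 1)
                    if row[c] > threshold and row[c] >= row[c - 1] and row[c] > row[c + 1]]

        def triangulate(image, fx, fy, cx, cy, stripe_planes):
            # Recover 3D points by intersecting camera rays with the known projector stripe planes.
            # stripe_planes[k] = (n, d), the plane n . x = d of stripe k in camera coordinates.
            points = []
            for r, row in enumerate(image):
                for k, c in enumerate(detect_stripe_peaks(row)):
                    if k >= len(stripe_planes):
                        break
                    n, d = stripe_planes[k]
                    ray = np.array([(c - cx) / fx, (r - cy) / fy, 1.0])  # ray through pixel (r, c)
                    denom = n @ ray
                    if abs(denom) > 1e-9:
                        points.append((d / denom) * ray)                 # ray/plane intersection
            return np.asarray(points)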

    Novel methods for real-time 3D facial recognition

    In this paper we discuss our approach to real-time 3D face recognition. We argue the need for real-time operation in a realistic scenario and highlight the required pre- and post-processing operations for effective 3D facial recognition. We focus attention on operations including face and eye detection, and on fast post-processing operations such as hole filling, mesh smoothing and noise removal. We compare strategies for hole filling, including bilinear interpolation, polynomial interpolation and Laplace interpolation, and conclude that bilinear interpolation is preferred. Gaussian and moving-average smoothing strategies are compared, and it is shown that moving average can have the edge over Gaussian smoothing. The regions around the eyes normally carry a considerable amount of noise, and strategies for replacing the eyeball with a spherical surface and for using an elliptical mask in conjunction with hole filling are compared. Results show that the elliptical mask with hole filling works well on face models and is simpler to implement. Finally, performance issues are considered, and the system is demonstrated to perform real-time 3D face recognition in just over 1.2 seconds per face model for a small database.
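    A minimal sketch of two of the post-processing steps compared above, hole filling and moving-average smoothing, assuming the face is stored as a single-valued depth map with NaN marking holes; the row-wise linear interpolation is a simplified stand-in for bilinear hole filling, not the authors' code.

        import numpy as np

        def fill_holes(depth):
            # Fill NaN holes row by row with linear interpolation between valid neighbours.
            filled = depth.copy()
            cols = np.arange(depth.shape[1])
            for row in filled:
                valid = ~np.isnan(row)
                if valid.sum() >= 2:
                    row[~valid] = np.interp(cols[~valid], cols[valid], row[valid])
            return filled

        def moving_average(depth, k=3):
            # Smooth the depth map with a k x k moving-average window.
            pad = k // 2
            padded = np.pad(depth, pad, mode="edge")
            out = np.empty_like(depth)
            for r in range(depth.shape[0]):
                for c in range(depth.shape[1]):
                    out[r, c] = padded[r:r + k, c:c + k].mean()
            return out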

    Developing Interactive 3D Models for E-Learning Applications

    Some issues concerning the development of interactive 3D models for e-learning applications are considered. Given that 3D data sets are normally large and interactive display demands high-performance computation, a natural solution is to place the computational burden on the client machine rather than on the server. Mozilla and Google opted for a combination of client-side technologies, JavaScript and OpenGL, to handle 3D graphics in a web browser (Mozilla 3D and O3D respectively). Based on the O3D model, core web technologies are considered, and an example of the full process, from the generation of a 3D model to its interactive visualization in a web browser, is described. The challenging issue of creating realistic 3D models of real-world objects is discussed and a method based on line projection for fast 3D reconstruction is presented. The generated model is then visualized in a web browser. The experiments demonstrate that visualization of 3D data in a web browser can provide a quality user experience. Moreover, the development of web applications is facilitated by the O3D JavaScript extension, allowing web designers to focus on generating 3D content.
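    The pipeline above ends with the reconstructed model being rendered client-side; as a rough sketch of that hand-off (not the O3D code itself), the snippet below serialises a triangle mesh to JSON so that browser-side JavaScript could fetch and draw it. The file layout and names are assumptions for illustration.

        import json

        def export_mesh(vertices, faces, path):
            # Write a triangle mesh as flat JSON arrays that client-side code
            # (for example an O3D/WebGL scene) could load and render.
            payload = {
                "positions": [coord for v in vertices for coord in v],  # x0, y0, z0, x1, ...
                "indices": [i for tri in faces for i in tri],           # triangle vertex indices
            }
            with open(path, "w") as f:
                json.dump(payload, f)

        # Example: a single triangle
        export_mesh([(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)], "mesh.json")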

    Real-time 3D Face Recognition using Line Projection and Mesh Sampling

    The main contribution of this paper is a novel method for automatic 3D face recognition based on sampling a 3D mesh structure in the presence of noise. A structured light method using line projection is employed, in which a 3D face is reconstructed from a single 2D shot. The process from image acquisition to recognition is described, with a focus on real-time operation. Recognition results are presented, and it is demonstrated that the method can perform recognition in just over one second per subject in continuous operation mode and is thus suitable for real-time operation.
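    A minimal sketch of the mesh-sampling and matching idea, assuming the reconstructed face is available as a single-valued depth map and that recognition reduces to a nearest-neighbour search over stored feature vectors; the grid size and distance measure are illustrative, not the published method.

        import numpy as np

        def sample_mesh(depth, grid=(32, 32)):
            # Sample the face depth map on a coarse regular grid to get a fixed-length feature vector.
            rows = np.linspace(0, depth.shape[0] - 1, grid[0]).astype(int)
            cols = np.linspace(0, depth.shape[1] - 1, grid[1]).astype(int)
            return depth[np.ix_(rows, cols)].ravel()

        def recognise(probe_depth, gallery):
            # Return the identity whose stored feature vector is closest to the probe.
            probe = sample_mesh(probe_depth)
            return min(gallery, key=lambda name: np.linalg.norm(gallery[name] - probe))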

    Efficient 3D data compression through parameterization of free-form surface patches

    This paper presents a new method for 3D data compression based on the parameterization of surface patches. The technique is applied to data that can be defined as single-valued functions, which is the case for 3D patches obtained using standard 3D scanners. The method defines a number of mesh cutting planes, and the intersections of those planes with the mesh define a set of sampling points. These points contain an explicit structure that allows both the x and y coordinates to be defined parametrically. The z values are interpolated using high-degree polynomials, and results show that compression rates above 99% are achieved while preserving the quality of the mesh.
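    A minimal sketch of the z-value interpolation step, assuming each cutting plane yields a 1D profile of z samples that is replaced by the coefficients of a high-degree polynomial; the degree, profile length and ratio calculation are illustrative only.

        import numpy as np

        def compress_profile(z_values, degree=8):
            # Fit a high-degree polynomial to the z samples along one cutting plane
            # and keep only its coefficients.
            t = np.linspace(0.0, 1.0, len(z_values))      # parametric position along the plane
            return np.polyfit(t, z_values, degree)

        def decompress_profile(coeffs, n_samples):
            # Re-sample the polynomial to recover the profile.
            return np.polyval(coeffs, np.linspace(0.0, 1.0, n_samples))

        # Storage saved for one 200-sample profile: 1 - 9/200, i.e. about 95%
        z = np.sin(np.linspace(0.0, 3.0, 200))
        saved = 1.0 - len(compress_profile(z)) / len(z)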

    Text classification by Convolution Networks for Data-Driven Decision Making

    Recent advances in automation and data-driven intelligence from sophisticated Artificial Intelligence (AI) technologies have had an impact on all areas of knowledge and economic activity. AI deep learning is a method of learning and extracting knowledge from large amounts of data: AI algorithms iteratively learn from data, finding hidden features and providing insights without explicitly programmed features. Text classification can be cast as a generic problem whose solution can have a significant impact on data-driven decision processes and ERP (Enterprise Resource Planning) information systems. Normally, classification is carried out against a given taxonomy. Wrong classifications may arise from inconsistent taxonomies, incomplete descriptions, wrong interpretation of a category, inconsistent language translation, human error, algorithm design and so on. In this paper, we address the issue of automatic product classification from unconstrained textual descriptions using machine learning techniques. Rather than defining words in a vocabulary (as is normally the case, for instance, with Google's word2vec technique), this research focuses on character-based classification through a temporal convolution network, as in Crepe (Character-level Convolutional Networks for Text Classification). The advantage is that instead of defining a vocabulary with tens of thousands of words, the vocabulary is made up of a small character set composed of the letters a-z, the numbers 0-9 and special characters. Furthermore, because in any language words are defined by a sequence of characters, the relationships between the characters within a word or across words are learned by the temporal convolution, which removes the need to learn words per se. The research used product descriptions from 6 categories: bakery, chilled, dairy, drinks, fruit and vegetables, and meat and fish. A total of 8612 samples were used, separated into a training set (7751 samples, or 90% of the data) and an unseen test set (861 samples, or 10% of the data). The network has 15 convolution layers followed by 2 fully connected layers and was implemented using the Torch framework on a Mac Pro running macOS Sierra with a 3.5 GHz 6-core Intel Xeon E5 processor and 16 GB of memory. The achieved overall accuracy of 91% is impressive given that the classification features were extracted from character sequences only and that the descriptions are extremely short. It is shown that character-based classification is a valid solution for short descriptions, and we are now investigating alternative network designs and expanding the training set.
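    A minimal sketch of the character-level encoding and a small temporal (1D) convolution classifier over the six product categories, written with Keras for brevity rather than the Torch implementation used in the paper; the alphabet, sequence length and layer sizes are assumptions and do not reproduce the 15-layer network.

        import numpy as np
        from tensorflow.keras import layers, models

        ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,-'&/%"   # assumed character set
        MAX_LEN = 128                                               # product descriptions are short
        IDX = {ch: i for i, ch in enumerate(ALPHABET)}

        def encode(text):
            # One-hot encode a description character by character; unknown characters stay all-zero.
            x = np.zeros((MAX_LEN, len(ALPHABET)), dtype=np.float32)
            for i, ch in enumerate(text.lower()[:MAX_LEN]):
                if ch in IDX:
                    x[i, IDX[ch]] = 1.0
            return x

        def build_classifier(n_classes=6):
            # Small temporal convolution network over character one-hot sequences.
            m = models.Sequential([
                layers.Input(shape=(MAX_LEN, len(ALPHABET))),
                layers.Conv1D(64, 7, activation="relu"),
                layers.MaxPooling1D(3),
                layers.Conv1D(64, 3, activation="relu"),
                layers.GlobalMaxPooling1D(),
                layers.Dense(128, activation="relu"),
                layers.Dense(n_classes, activation="softmax"),
            ])
            m.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
            return m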

    Partial Differential Equations for 3D Data Compression and Reconstruction

    This paper describes a PDE-based method for 3D reconstruction of surface patches. The PDE method is exploited using data obtained from standard 3D scanners. First, the original surface data are intersected by a number of cutting planes, and the intersection points on the mesh are represented by Fourier transforms in each plane. Information on the number of vertices and on the scale of the surface is recorded and, together, these efficiently define the compressed data. The PDE method is then applied at the reconstruction stage by defining PDE surface patches between the cutting planes.
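    A minimal sketch of the Fourier step described above, assuming each cutting-plane profile is a 1D signal of z values whose low-frequency coefficients are retained; the number of coefficients kept is arbitrary, and the PDE patch reconstruction between planes is not shown.

        import numpy as np

        def compress_profile(z_values, keep=10):
            # Keep only the first `keep` Fourier coefficients of one cutting-plane profile.
            return np.fft.rfft(z_values)[:keep]

        def reconstruct_profile(coeffs, n_samples):
            # Zero-pad the truncated spectrum and invert it to a profile of the original length.
            spectrum = np.zeros(n_samples // 2 + 1, dtype=complex)
            spectrum[:len(coeffs)] = coeffs
            return np.fft.irfft(spectrum, n=n_samples)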

    Improving product classification using generative recurrent networks

    The issue addressed in this paper is the use of machine learning techniques for the automatic classification of product descriptions. The problem arises when database entries do not match perfectly, so it is questionable whether a description refers to the same item, product or service. A typical example is the merging of disparate databases required, for instance, when one business buys out a competitor. An obvious solution would be to train an AI system to perform the classification. The problem is that AI deep learning networks require vast amounts of training data, normally tens or hundreds of thousands of samples, and such data are normally not available. We have investigated network models to augment the training data set in a flexible but reliable way. The principle is to train a network with the objective of generating new data that are similar but not identical to the input data. Validation of the newly generated data is performed by a second network trained on the original data, which outputs a simple binary decision (yes/no) on whether the generated data are acceptably similar to the original data. Accepted data would eventually become part of an augmented training set, improving the network's ability to classify unseen data. We designed and implemented a recurrent network with Keras, an open-source neural network library written in Python. The network is based on the LSTM (Long Short-Term Memory) model, which has proved useful for a large number of problems with time dependencies. The encoding of product descriptions is character-based, so, once trained, the network outputs a character and tries to predict what the next character will be. With an appropriate training set to learn the structure of the data, such networks can output valid vectors. We show that LSTMs, together with character-based text encoding, are a good solution to the problem and represent the state of the art in recurrent neural networks. Future work involves improvements to the network design, testing SimpleRNN or GRU (Gated Recurrent Unit) layers in place of LSTMs, and fine-tuning of network parameters.
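    A minimal sketch of a character-level LSTM generator in Keras along the lines described above; the alphabet, sequence length and sampling loop are assumptions, and the second, validating network is omitted.

        import numpy as np
        from tensorflow.keras import layers, models

        ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,-"   # assumed character set
        SEQ_LEN = 40
        IDX = {ch: i for i, ch in enumerate(ALPHABET)}

        def build_generator():
            # LSTM that predicts the next character from the previous SEQ_LEN characters.
            m = models.Sequential([
                layers.Input(shape=(SEQ_LEN, len(ALPHABET))),
                layers.LSTM(128),
                layers.Dense(len(ALPHABET), activation="softmax"),
            ])
            m.compile(optimizer="adam", loss="categorical_crossentropy")
            return m

        def generate(model, seed, length=80):
            # Produce new description text one character at a time from a seed string.
            text = seed
            for _ in range(length):
                x = np.zeros((1, SEQ_LEN, len(ALPHABET)), dtype=np.float32)
                for i, ch in enumerate(text[-SEQ_LEN:]):
                    if ch in IDX:
                        x[0, i, IDX[ch]] = 1.0
                probs = model.predict(x, verbose=0)[0]
                probs = probs / probs.sum()                      # guard against rounding drift
                text += ALPHABET[np.random.choice(len(ALPHABET), p=probs)]
            return text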