A survey of visual preprocessing and shape representation techniques
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
A Novel Inpainting Framework for Virtual View Synthesis
Multi-view imaging has stimulated significant research to enhance the user experience of free viewpoint video, allowing interactive navigation between views and the freedom to select a desired view to watch. This usually involves transmitting both textural and depth information captured from different viewpoints to the receiver, to enable the synthesis of an arbitrary view. In rendering these virtual views, perceptual holes can appear when regions hidden in the original view by a closer object become visible in the virtual view. To provide a high-quality experience, these holes must be filled in a visually plausible way, in a process known as inpainting. This is challenging because the missing information is generally unknown and the hole regions can be large. Recently, depth-based inpainting techniques have been proposed to address this challenge; while these generally perform better than non-depth-assisted methods, they are not very robust and can produce perceptual artefacts.
This thesis presents a new inpainting framework that innovatively exploits depth and textural self-similarity characteristics to construct subjectively enhanced virtual viewpoints. The framework makes three significant contributions to the field: i) the exploitation of view information to jointly inpaint textural and depth hole regions; ii) the introduction of the novel concept of self-similarity characterisation which is combined with relevant depth information; and iii) an advanced self-similarity characterising scheme that automatically determines key spatial transform parameters for effective and flexible inpainting.
The presented inpainting framework has been critically analysed and shown to provide superior performance both perceptually and numerically compared to existing techniques, especially in terms of lower visual artefacts. It provides a flexible, robust framework for developing new inpainting strategies for the next generation of interactive multi-view technologies.
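The depth-assisted patch matching at the heart of such inpainting can be sketched as follows. This is a minimal illustration only: the function names, the sum-of-squared-differences cost, and the quadratic depth penalty are assumptions for exposition, not the thesis's actual framework.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_source_patch(hole_context, candidates, depths, hole_depth, w=1.0):
    """Pick the candidate patch whose pixels best match the known context
    around the hole, penalising candidates that lie at a different depth.
    The depth penalty is a hedged sketch of 'depth-assisted' matching:
    holes exposed by disocclusion should be filled from the background,
    i.e. from patches near the hole's own depth."""
    best, best_cost = None, float("inf")
    for patch, depth in zip(candidates, depths):
        cost = ssd(hole_context, patch) + w * (depth - hole_depth) ** 2
        if cost < best_cost:
            best, best_cost = patch, cost
    return best

# A candidate matching both texture and depth wins over a mismatched one:
fill = best_source_patch([1, 2, 3], [[1, 2, 3], [9, 9, 9]], [10, 2], 10)
```

Self-similarity characterisation, as the thesis proposes, would additionally search spatially transformed versions of each candidate patch rather than only direct copies.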
A NEW TECHNIQUE IN MOBILE ROBOT SIMULTANEOUS LOCALIZATION AND MAPPING
ABSTRACT In field or indoor environments it is usually not possible to provide service robots with detailed a priori environment and task models. In such environments, robots will need to create a dimensionally accurate geometric model by moving around and scanning the surroundings with their sensors, while minimizing the complexity of the required sensing hardware. In this work, an iterative algorithm is proposed to plan the visual exploration strategy of service robots, enabling them to efficiently build a graph model of their environment without the need for costly sensors. In this algorithm, the information content present in sub-regions of a 2-D panoramic image of the environment is determined from the robot's current location using a single camera fixed on the mobile robot. Using a metric based on Shannon's information theory, the algorithm determines, from the 2-D image, potential locations of nodes from which to further image the environment. Using a feature tracking process, the algorithm helps navigate the robot to each new node, where the imaging process is repeated. A Mellin transform and tracking process is used to guide the robot back to a previous node. This cycle of imaging, evaluation, branching, and retracing continues until the robot has mapped the environment to a pre-specified level of detail. The effectiveness of this algorithm is verified experimentally through the exploration of an indoor environment by a single mobile robot agent using a limited sensor suite. KEYWORDS: Service robots, visual mapping, self-localization, information theory, Mellin transform.
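The entropy-based scoring of image sub-regions described above can be sketched like this. It is a minimal illustration assuming the panorama is reduced to a 1-D strip of grey levels; all names are hypothetical and this is not the thesis implementation.

```python
import math

def entropy(region):
    """Shannon entropy (in bits) of the grey-level histogram of a region."""
    counts = {}
    for v in region:
        counts[v] = counts.get(v, 0) + 1
    n = len(region)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def rank_subregions(panorama, window):
    """Split a 1-D strip of grey levels into fixed windows and rank them
    by information content; high-entropy windows suggest structure worth
    imaging again from a new graph node."""
    scores = []
    for start in range(0, len(panorama) - window + 1, window):
        scores.append((entropy(panorama[start:start + window]), start))
    return sorted(scores, reverse=True)

# A blank wall (uniform grey) scores 0 bits; a cluttered region scores high:
strip = [5] * 8 + [1, 9, 3, 7, 2, 8, 4, 6]
ranked = rank_subregions(strip, 8)
```

In the paper's setting the same idea applies per 2-D sub-window of the panoramic image, and the top-scoring windows become candidate node locations.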
A multiparametric texture-feature extraction method oriented to image segmentation
As will be seen in the following background chapter, there are many varied ways of approaching texture analysis, but none of them is oriented to real-time (video-rate) computation. Given the lack of methods that place such emphasis on processing time, the objective of this thesis is to define and develop a new texture-feature extraction method that works in real time. To achieve this high operating speed, a further objective is to present the design of a specific architecture to implement the algorithm that computes the defined texture parameters, together with the algorithm that classifies those parameters and segments the image into regions of similar texture.

Chapter 2 explains the most relevant methods in texture characterisation, covering the most important statistical and structural approaches. The same chapter situates the new method presented in this thesis within the main existing approaches, and briefly reviews texture synthesis, a way of quantitatively evaluating the characterisation of the texture of an image. Chapter 3 focuses on the method presented in this work: the proposed texture parameters, the need for them, and their definitions are introduced. Since the parameters are highly perceptual and follow no mathematical model, the same chapter uses a statistical technique called discriminant analysis to show that every parameter contributes enough information for the separability of texture regions and that all of them are necessary for discriminating textures.

Chapter 4 shows how the information supplied by the feature-extraction system is processed in order to classify the data and segment the image according to its textures. The pattern-recognition stage is carried out in two phases: learning and operation. A comparative study between several texture classification methods and the method presented in this thesis is also given, showing the method's good performance at a genuinely reduced computation time. The chapter closes with an analysis of the robustness of the method on images with different levels of random noise. Chapter 5 presents the results obtained by extracting texture features in several real applications. Our method is applied to aerial images, to agricultural environments, and to situations that require real-time processing, such as the segmentation of road images and an industrial application of inspection and quality control in textile printing. At the end of the chapter we discuss two effects that can influence the correct recovery of results: zoom and perspective changes in the texture images. Chapter 6 presents the architecture designed specifically for computing the texture parameters in real time, describes the algorithm for assigning texture groups, and demonstrates its video-rate operating speed. Finally, Chapter 7 presents the conclusions and future lines of work derived from this thesis, together with the papers we have published in relation to this work and to texture analysis. The bibliographic references and appendices conclude the work.
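As an illustration of the statistical texture measures surveyed in the background chapter, a grey-level co-occurrence contrast can be sketched as follows. This is a generic Haralick-style example, not the thesis's own perceptual parameters, which are defined differently.

```python
def cooccurrence(img, dx=1):
    """Normalised grey-level co-occurrence matrix for a horizontal
    neighbour at offset dx, stored sparsely as a dict of pixel pairs."""
    counts = {}
    for row in img:
        for x in range(len(row) - dx):
            pair = (row[x], row[x + dx])
            counts[pair] = counts.get(pair, 0) + 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def contrast(glcm):
    """Haralick-style contrast: large when neighbouring grey levels differ,
    zero for a perfectly flat region."""
    return sum(p * (i - j) ** 2 for (i, j), p in glcm.items())

smooth = [[0, 0, 0, 0]] * 4   # a flat region: contrast 0
stripes = [[0, 3, 0, 3]] * 4  # a strongly textured region: high contrast
```

Measures of this kind, computed over a sliding window, yield one feature vector per pixel neighbourhood, which is what a subsequent classification stage segments into regions of similar texture.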
Local and deep texture features for classification of natural and biomedical images
Developing efficient feature descriptors is very important in many computer vision applications, including biomedical image analysis. In the two decades before deep learning approaches became popular for image classification, texture features proved very effective at capturing gradient variation in an image. Following the success of the Local Binary Pattern (LBP) descriptor, many variants were introduced to further improve classification results. However, image classification becomes more complicated as the number of images and the number of classes grow, and more robust approaches are then required. In this thesis, we address the problem of analyzing biomedical images by using a combination of local and deep features. First, we propose a novel descriptor based on the motif Peano scan concept, called Joint Motif Labels (JML). We then combine the features extracted by the JML descriptor with two other descriptors, Rotation Invariant Co-occurrence among Local Binary Patterns (RIC-LBP) and Joint Adaptive Median Binary Patterns (JAMBP). In addition, we construct another descriptor, Motif Patterns encoded by RIC-LBP, and use it in our classification framework. We enrich the performance of the framework by combining these local descriptors with features extracted from a pre-trained deep network, VGG-19: the 4096 features of the fully connected 'fc7' layer are extracted and combined with the proposed local descriptors. Finally, we show that a Random Forests (RF) classifier can be used to obtain superior performance in biomedical image analysis. Testing was performed on two standard biomedical datasets and three further standard texture datasets.
Results show that our framework beats state-of-the-art accuracy on the biomedical datasets, and that the combination of local features produces promising results on the standard texture datasets.
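A minimal sketch of the local-descriptor side of such a pipeline, an 8-neighbour LBP histogram that could be concatenated with the deep 'fc7' features, might look like this (illustrative only; the thesis's JML, RIC-LBP and JAMBP descriptors are considerably more elaborate):

```python
def lbp_code(img, y, x):
    """8-neighbour local binary pattern code for the pixel at (y, x):
    each neighbour >= centre contributes one bit."""
    centre = img[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels; this is
    the kind of local feature vector that would be concatenated with the
    4096 deep 'fc7' features before training a Random Forests classifier."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]  # every neighbour >= centre: code 255
```

The concatenated vector, e.g. `lbp_histogram(patch) + list(fc7_features)`, is then one training sample for the RF classifier.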
Detection of copy-move forgery in digital images using different computer vision approaches
Image forgery detection approaches are many and varied, but they generally all serve the same objectives: to detect and to localize the forgery. Copy-move forgery is widespread, and its detection (CMFD) remains a challenging problem. In this thesis, we first investigate the problems and challenges faced by existing copy-move forgery detection algorithms, and then propose integrating multiple forensic strategies to overcome these problems and increase the efficiency of detecting and localizing forgery from the same input image. Testing and evaluation of our copy-move forgery detector show results that are enhanced by techniques drawn from across the computer vision field. Because digital image forgery is a growing problem, driven by readily available technology that makes the process relatively easy for forgers, we propose strategies and applications based on the PatchMatch algorithm and deep neural networks (DNNs). We further focus on a convolutional neural network (CNN) architecture used within a generative adversarial network (GAN) and transfer-learning setting. The F-measure (FM), recall, precision, accuracy, and efficiency of the proposed algorithms are calculated and compared with a selection of algorithms from the literature using the same evaluation function, to ensure a fair comparison. The FM score reaches 0.98, with an efficiency rate exceeding 90.5% in most active and passive forgery detection tasks, indicating that the proposed methods are highly robust. The results show high efficiency in detecting and localizing forgery across different image formats for both active and passive detection. The proposed methods therefore successfully address the main issues investigated in copy-move forgery detection: first, they increase detection efficiency under a wide range of manipulations applied to a copy-moved image; second, they detect and localize the copy-moved patches as distinct from the pristine patches in the forged image. Finally, our experiments show that the overall validation accuracy of the proposed deep learning approach is 90%, subject to the iteration limit. Further enhancement of the deep learning and transfer learning approach is recommended for future work.
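The block-matching idea that underlies many copy-move detectors, and that PatchMatch-style search makes efficient and approximate, can be sketched in its simplest exact-match form. This is an illustration only, not the thesis's algorithm, and exact matching is not robust to the manipulations (compression, scaling, noise) that the proposed methods handle.

```python
def find_duplicate_blocks(img, b=2):
    """Exact-match copy-move sketch: slide a b-by-b window over the image,
    key each block by its pixel content, and report coordinate pairs whose
    blocks are identical (candidate copy-moved regions)."""
    seen, matches = {}, []
    for y in range(len(img) - b + 1):
        for x in range(len(img[0]) - b + 1):
            key = tuple(tuple(img[y + dy][x + dx] for dx in range(b))
                        for dy in range(b))
            if key in seen:
                matches.append((seen[key], (y, x)))
            else:
                seen[key] = (y, x)
    return matches

# The 2x2 block at (0, 0) has been "copy-moved" to (0, 3):
forged = [[1, 2, 0, 1, 2],
          [3, 4, 0, 3, 4]]
```

Practical detectors replace the exact key with a robust block descriptor and a nearest-neighbour search, then filter matches by a consistent displacement before localizing the forged region.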