14 research outputs found

The Department of Electrical and Computer Engineering Newsletter

Author: University of Dayton. Department of Electrical and Computer Engineering
Publication venue: eCommons
Publication date: 01/04/2012

Spring 2012 news and notes for the University of Dayton's Department of Electrical and Computer Engineering. https://ecommons.udayton.edu/ece_newsletter/1002/thumbnail.jp

University of Dayton

A CNN based hybrid approach towards automatic image registration

Author: Arun Pattathal Vijayakumar
Publication venue: Vilnius Gediminas Technical University
Publication date: 26/09/2013

Image registration is a key component of spatial analyses that involve different data sets of the same area. Automatic approaches in this domain have seen the application of several intelligent methodologies over the past decade; however, the accuracy of these approaches has been limited by their inability to properly model shape and contextual information. In this paper, we investigate the possibility of an evolutionary computing based framework for automatic image registration. A Cellular Neural Network (CNN) has been found to be effective in improving the feature matching and resampling stages of registration, and the complexity of the approach has been considerably reduced using coreset optimization. A CNN-Prolog based approach has been adopted to dynamically use spectral and spatial information for representing contextual knowledge. The salient features of this work are feature point optimisation, adaptive resampling and intelligent object modelling. Investigations over various satellite images revealed that considerable success has been achieved with the procedure. The methodology was also shown to be effective in providing intelligent interpretation and adaptive resampling.

VGTU Journals (Vilnius Gediminas Technical University - Vilnius Tech)

Information fusion in content based image retrieval: A comprehensive overview

Author: GIACINTO GIORGIO
PIRAS LUCA
Publication venue: Elsevier BV
Publication date: 01/01/2017

An ever-increasing part of communication between people involves the use of pictures, due to the cheap availability of powerful cameras on smartphones and of storage space. The rising popularity of social networking applications such as Facebook, Twitter, and Instagram, and of instant messaging applications such as WhatsApp and WeChat, is clear evidence of this phenomenon, due to the opportunity of sharing in real time a pictorial representation of the context each individual is living in. The media rapidly exploited this phenomenon, using the same channel either to publish their reports or to gather additional information on an event through the community of users. While the real-time use of images is managed through metadata associated with the image (e.g., the timestamp, the geolocation, tags), their retrieval from an archive might be far from trivial, as an image bears a rich semantic content that goes beyond the description provided by its metadata. It turns out that after more than 20 years of research on Content-Based Image Retrieval (CBIR), the giant increase in the number and variety of images available in digital format is challenging the research community. It is quite easy to see that any approach aiming at facing such challenges must rely on different image representations that need to be conveniently fused in order to adapt to the subjectivity of image semantics. This paper offers a journey through the main information fusion ingredients that a recipe for the design of a CBIR system should include to meet the demanding needs of users.

Archivio istituzionale della ricerca - Università di Cagliari

A Review of Environmental Context Detection for Navigation Based on Multiple Sensors

Author: Feriol Florent
Vivet Damien
Watanabe Yoko
Publication venue: MDPI AG
Publication date: 01/01/2021

Current navigation systems use multi-sensor data to improve localization accuracy, but often without certainty about the quality of those measurements in certain situations. Context detection will enable us to build an adaptive navigation system that improves the precision and robustness of its localization solution by anticipating possible degradation in sensor signal quality (GNSS in urban canyons, for instance, or camera-based navigation in a non-textured environment). That is why context detection is considered the future of navigation systems. Thus, it is important first to define this concept of context for navigation and to find a way to extract it from available information. This paper overviews existing GNSS and on-board vision-based solutions for environmental context detection. This review shows that most state-of-the-art research works focus on only one type of data. It confirms that the main perspective for this problem is to combine different indicators from multiple sensors.

Open Archive Toulouse Archive Ouverte

Developing affordable high-throughput plant phenotyping methods for breeding of cereals and tuber crops

Author: Leiva Fernanda
Publication date: 01/01/2023

High-throughput plant phenotyping (HTPP) is a fast, accurate, and non-destructive process for evaluating plants' health and environmental adaptability. HTPP accelerates the identification of agronomic traits of interest, eliminates subjectivism (which is innate to humans), and facilitates the development of adapted genotypes. Current HTPP methods often rely on imaging sensors and computer vision both in the field and under controlled (indoor) conditions. However, their use is limited by the costs and complexity of the necessary instrumentation, data analysis tools, and software. This issue could be overcome by developing more cost-efficient and user-friendly methods that let breeders, farmers, and stakeholders access the benefits of HTPP. To assist such efforts, this thesis presents an ensemble of dedicated affordable phenotyping methods using RGB imaging for a range of key applications under controlled conditions. The affordable Phenocave imaging system for use in controlled conditions was developed to facilitate studies on the effects of abiotic stresses by gathering data on important plant characteristics related to growth, yield, and adaptation to growing conditions and cultivation systems. Phenocave supports imaging sensors including visible (RGB), spectroscopic (multispectral and hyperspectral), and thermal imaging. Additionally, a pipeline for RGB image analysis was implemented as a plugin for the free and easy-to-use software ImageJ. This plugin has since proven to be an accurate alternative to conventional measurements that produces highly reproducible results. A subsequent study was conducted to evaluate the effects of heat and drought stress on plant growth and grain nutrient composition in wheat, an important staple cereal in Sweden. 
The effects of stress on plant growth were evaluated using image analysis, while stress-induced changes in the abundance of key plant compounds were evaluated by analyzing the nutrient composition of grains via chromatography. This led to the discovery of genotypes whose harvest quality remains stable under heat and drought stress. The next objective was to evaluate biotic stress; for this case, the effect of the fungal disease Fusarium head blight (FHB), which affects grain development in wheat, was investigated. For this purpose, seed phenotyping parameters were used to determine the components and settings of a statistical model that predicts the occurrence of FHB. The results reveal that grain morphology traits, such as length and width, are significantly affected by the disease. Another study was carried out to estimate the disease severity of common scab (CS) in potatoes, a widely popular food source. CS occurs on the tubers and reduces their visual appeal, significantly affecting their market value. Tubers were analyzed by a deep learning-based method to estimate disease lesion areas caused by CS. Results showed a high correlation between the predictions and expert visual scorings of the disease, and the method proved to be a potential tool for selecting genotypes that fulfill market standards and are resistant to CS. Both case studies highlight the role of imaging in plant health monitoring and its integration into the larger picture of plant health management. The methods presented in this work are a starting point for bridging the gap between costs and accessibility to imaging technology. These are affordable and user-friendly resources for generating pivotal knowledge on plant development and genotype selection. 
In the future, image acquisition for all the methods can be integrated into the Phenocave system, potentially allowing for a more automated and efficient plant health monitoring process, leading to the identification of genotypes tolerant to biotic and abiotic stresses.

Epsilon Open Archive

Reconocimiento de gestos dinámicos y su aplicación al lenguaje de señas (Dynamic gesture recognition and its application to sign language)

Author: Ronchetti Franco
Publication date: 23/03/2017

Automatic recognition of human gestures is a complex, multidisciplinary problem that has not yet been fully solved. Since the emergence of digital video capture technologies, there have been attempts to recognize dynamic gestures for various purposes. The incorporation of new technologies such as depth sensors and high-resolution cameras, together with the greater processing power of current devices, enables the development of new technologies capable of detecting different movements and acting in real time. Unlike speech recognition, which has more than 40 years of research behind it, this topic is relatively new in the scientific community and is evolving rapidly as new devices and new computer vision algorithms appear. Capturing and recognizing dynamic gestures allows them to be used in diverse application areas such as patient monitoring, control in video game environments, navigation and manipulation of virtual environments, and translation of sign language lexicons, among other applications of interest. In particular, sign language can be understood as a specific case of dynamic gesture recognition, one that has recently been highly valued by various institutions because it provides direct help to hearing-impaired people. Using an automatic sign language recognition system to translate the gestures of an interpreter requires tackling a series of diverse tasks. First, there are different approaches depending on the sensing device used. Although invasive devices such as data gloves exist, this thesis analyzes only non-invasive devices of two types: conventional RGB cameras and depth cameras (with particular interest in the new RGB-D devices).
Once the gesture has been captured, several pre-processing stages are required to identify regions of interest such as the hands and face of the subject/interpreter, and then to identify the trajectories of the gesture performed. Moreover, sign language in particular exhibits enormous variability in the postures or configurations the hand can adopt, which makes this a particularly complex problem. Addressing this requires proper generation of both static and dynamic descriptors. This is one of the main lines of research in this thesis. In addition, because each region has its own specific language grammar, a database of Argentinian Sign Language (Lengua de Señas Argentina, LSA) is required, which did not exist until now. For the reasons above, the general objective of this thesis is to develop a complete process for interpreting and translating Argentinian Sign Language from videos obtained with an RGB camera. First, a study of the state of the art in gesture recognition was carried out. Intelligent techniques for image and video processing were investigated, as well as the different types of descriptors currently available. As preliminary work, a strategy was developed that is capable of processing human actions captured with an MS Kinect device. The strategy implements a probabilistic SOM neural network (ProbSOM) with a descriptor specifically designed to retain temporal information. This work outperformed the existing results at the time on two well-known databases. In the field of sign language, two main contributions were made. First, a database specifically for the recognition of Argentinian signs was developed.
This included an image database with 16 of the most commonly used hand configurations in the language, together with a database of high-resolution videos of 64 different signs, totaling 3200 videos. These databases were recorded with 10 different interpreters and several repetitions, allowing their use with classical machine learning techniques. In addition, the interpreters wore colored gloves as markers in these databases. This was done to simplify segmenting the hands from the images/videos, so that the remaining classification stages could proceed. In this way, new researchers can evaluate other recognition algorithms without having to worry about this segmentation stage. Second, two sign classification methods were designed and implemented, and both were evaluated successfully on the aforementioned databases. The first method is dedicated to classifying hand configurations (static gestures). Here, probabilistic clustering was used to correctly classify the 16 possible configurations in the database, yielding a simple and powerful recognizer. The second classification model enabled the classification of segmented signs in videos. It consists of a probabilistic system based on information captured from both hands, where three main components are evaluated for each: the position, the configuration, and the movement of the hand. This separation yielded a modular system, with different sub-classifiers that can be swapped and evaluated independently.
Obtaining suitable descriptors for these subsystems requires processing that involves correct segmentation and tracking of the interpreter's hands, classification of the different configurations, and a proper representation of the movement information. To evaluate the models developed, various tests were performed on the databases created. First, cross-validation tests were run using a percentage of the samples for training and the rest for testing. Additionally, the robustness of the system to new, previously unseen interpreters was evaluated. To this end, 9 of the 10 individuals in the database were used as input data for the system, and evaluation was performed on the remaining individual. All of these experiments showed excellent results, with an error rate below 5%. Furthermore, to assess the effectiveness of the implemented model, some of the sub-classifiers were replaced with techniques better known in the literature, such as Markov models or feed-forward neural networks, demonstrating the soundness of the strategies proposed in this thesis. Doctor in Computer Science

Publikationer från KTH

Centro de Servicios en Gestión de Información

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Detailed and Practical 3D Reconstruction with Advanced Photometric Stereo Modelling

Author: Logothetis Fotios
Publication venue: University of Cambridge
Publication date: 04/05/2019

3D object reconstruction has always been one of the main objectives of computer vision. After many decades of research, most techniques are still unsuccessful at recovering high-resolution surfaces, especially for objects with limited surface texture. Moreover, most shiny materials are particularly hard to reconstruct. Photometric Stereo (PS), which operates by capturing multiple images under changing illumination, has traditionally been one of the most successful techniques at recovering a large amount of surface detail, by exploiting the relationship between shading and local shape. However, using PS has been highly impractical because most approaches are only applicable in a very controlled lab setting and limited to objects experiencing diffuse reflection. Nevertheless, recent advances in differential modelling have made complicated Photometric Stereo models possible, and variational optimisations for these kinds of models show remarkable resilience to real-world imperfections such as non-Gaussian noise and other outliers. Thus, a highly accurate, photometric-based reconstruction system is now possible. The contribution of this thesis is threefold. First, the Photometric Stereo model is extended to deal with arbitrary ambient lighting. This is a step towards acquisition in a non-fully controlled lab setting. Second, the need for a priori knowledge of the light source brightness and attenuation characteristics is relaxed, as an alternating optimisation procedure is proposed which is able to estimate these parameters. This extension allows for quick acquisition with inexpensive LEDs that exhibit unpredictable illumination characteristics (flickering, etc.). Finally, a volumetric parameterisation is proposed which allows one to tackle the multi-view Photometric Stereo problem in a similar manner, in a simple unified differential model.
This final extension allows for complete object reconstruction, merging information from multiple images taken from multiple viewpoints under variable illumination. The theoretical work in this thesis is experimentally evaluated in a number of challenging real-world experiments, with data captured by custom-made hardware. In addition, the generality of the proposed models is demonstrated by presenting a differential model for the shape from polarisation problem, which leads to a unified optimisation problem, fusing information from both methods. This allows for the acquisition of geometrical information about objects such as semi-transparent glass, hitherto hard to deal with.

Apollo (Cambridge)

AIUCD2016 - Book of Abstracts

Publication venue: AIUCD
Publication date: 01/01/2017

This volume collects the abstracts of the contributions accepted at the AIUCD 2016 conference, titled "Edizioni digitali: rappresentazione, interoperabilità, analisi del testo e infrastrutture" (Digital editions: representation, interoperability, text analysis and infrastructures). It was the fifth conference of the Associazione di Informatica Umanistica e Cultura Digitale (AIUCD), held in Venice from 7 to 9 September 2016, and was dedicated to the representation and study of text from various points of view (resources, analysis, publication infrastructures), with the aim of bringing philologists, historians, digital humanists, computational linguists, logicians, computer scientists and computer engineers into dialogue around the text. The present volume therefore collects only the abstracts of the papers accepted at the conference, which received a favourable opinion from expert reviewers through an anonymous review process under the responsibility of the AIUCD 2016 Scientific Committee.

AMS Acta