Visual vocabularies for category-level object recognition
This thesis focuses on the study of visual vocabularies for category-level object recognition. Specifically, we present novel approaches for building visual codebooks. Our aim is not just to obtain more discriminative and more compact visual codebooks, but to bridge the gap between visual features and semantic concepts. A novel approach for obtaining class-representative visual words is presented. It is based on the maximisation of a novel cluster precision criterion, a procedure called Cluster Precision Maximisation (CPM), and on an adaptive threshold refinement scheme for agglomerative clustering algorithms based on correlation clustering techniques. The objective is to increase the vocabulary compactness while at the same time improving the recognition rate and further increasing the representativeness of the visual words. Moreover, we describe a novel clustering-aggregation-based approach for building efficient and semantic visual vocabularies. It consists of a novel framework for incorporating neighboring appearances of local descriptors into the vocabulary construction, and a rigorous approach for adding meaningful spatial coherency among the local features into the visual codebooks. We also propose an efficient high-dimensional data clustering algorithm, the Fast Reciprocal Nearest Neighbours (Fast-RNN). Our approach, which is a sped-up version of the standard RNN algorithm, is based on the projection search paradigm. Finally, we release a new database of images called Image Collection of Annotated Real-world Objects (ICARO), which is especially designed for evaluating category-level object recognition systems. An exhaustive comparison of ICARO with other well-known datasets used within the same context is carried out. We also propose a benchmark for both object classification and detection.
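As a rough illustration of the idea behind reciprocal nearest neighbours, the sketch below finds all pairs of points that are each other's nearest neighbour, which is the kind of pair an RNN-style agglomerative clustering would merge first. It is a brute-force version written only for clarity: it does not implement the projection search speed-up of Fast-RNN, and the function names `nearest` and `reciprocal_pairs` are hypothetical, not taken from the thesis.

```python
import math


def nearest(i, points):
    """Index of the point closest to points[i] (brute force, Euclidean)."""
    best, best_d = None, math.inf
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)
        if d < best_d:
            best, best_d = j, d
    return best


def reciprocal_pairs(points):
    """Pairs (i, j), i < j, where i and j are each other's nearest neighbour."""
    nn = [nearest(i, points) for i in range(len(points))]
    return sorted(
        (i, j)
        for i, j in enumerate(nn)
        if j is not None and i < j and nn[j] == i
    )


# Toy descriptors: two tight pairs and one isolated point.
descriptors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (2.0, 2.0)]
pairs = reciprocal_pairs(descriptors)
```

Here the isolated point has a nearest neighbour, but that neighbour prefers another point, so the isolated point is not part of any reciprocal pair.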
Road Sign Analysis Using Multisensory Data
This paper deals with the problem of estimating the following road sign parameters: height, dimensions, visibility distance, and partial occlusions. This work belongs to a framework whose main applications involve road sign maintenance, driver assistance, and inventory systems. We propose a multisensory system composed of two cameras, a GPS receiver, and a distance measurement device, all of them installed in a car. The process consists of several steps, which include road sign detection, recognition and tracking, and road sign parameter estimation. Using trigonometric properties, a camera model, and the information provided by the tracking subsystem and the distance measurement sensors, we estimate the road sign parameters. Results show that the described calculation methodology provides a correct estimation for all types of traffic signs.
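To give a feel for how a camera model and a distance measurement can yield sign dimensions and height, here is a minimal sketch based on the textbook pinhole camera model. It is not the paper's actual method, which the abstract does not detail. It assumes the optical axis is parallel to the road, image rows increase downwards, and the focal length is expressed in pixels; all function and parameter names are hypothetical.

```python
def sign_size_m(pixel_extent, distance_m, focal_px):
    """Real-world extent (metres) of an object spanning pixel_extent pixels.

    Pinhole model: size = pixel_extent * distance / focal_length_in_pixels.
    """
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    return pixel_extent * distance_m / focal_px


def sign_height_m(row_px, principal_row_px, distance_m, focal_px, camera_height_m):
    """Height above the road of an image point at row row_px.

    Assumes a level camera: rows above the principal point map to
    positions above the camera height.
    """
    if focal_px <= 0:
        raise ValueError("focal length must be positive")
    offset_m = (principal_row_px - row_px) * distance_m / focal_px
    return camera_height_m + offset_m


# Illustrative numbers only: a sign 60 px wide, seen 20 m away
# with a 1200 px focal length.
width = sign_size_m(60, 20.0, 1200.0)
height = sign_height_m(420, 540, 20.0, 1200.0, 1.2)
```

In this toy setting the sign would be 1 m wide with its reference point about 3.2 m above the road; a real system would also have to account for camera tilt and lens distortion.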
Plexiform neurofibroma of the buccal mucosa: a case report
We present a case of solitary plexiform neurofibroma located in
the cheek region, at the submucosal level. Its interest lies in
the fact that, although neurofibroma is the most common tumor of
neurogenic origin, it is an uncommon entity that rarely arises
in the oral cavity, and the plexiform variety is rarer still.
Clinically, oral neurofibromas present as innocuous lesions with
few symptoms; when symptoms do appear, they result from nerve
compression. In our case the tumor was asymptomatic apart from
its size. There is no definitive radiological image. The tumor
is associated with certain polyglandular syndromes and
phakomatoses. Treatment is essentially surgical excision,
although there are doubts about its suitability and some authors
are looking for new non-surgical treatments. Taking the case
description as a starting point, we review the literature,
focusing on the epidemiology, clinical behavior, diagnostic
methods, and treatment of this type of benign tumor.
Visual Semantic Navigation with Real Robots
Visual Semantic Navigation (VSN) is the ability of a robot to learn visual
semantic information for navigating in unseen environments. These VSN models
are typically tested in those virtual environments where they are trained,
mainly using reinforcement learning based approaches. Therefore, we do not yet
have an in-depth analysis of how these models would behave in the real world.
In this work, we propose a new solution to integrate VSN models into real
robots, so that we have true embodied agents. We also release a novel ROS-based
framework for VSN, ROS4VSN, so that any VSN-model can be easily deployed in any
ROS-compatible robot and tested in a real setting. Our experiments with two
different robots, where we have embedded two state-of-the-art VSN agents,
confirm that there is a noticeable performance difference of these VSN
solutions when tested in real-world and simulation environments. We hope that
this research will endeavor to provide a foundation for addressing this
consequential issue, with the ultimate aim of advancing the performance and
efficiency of embodied agents within authentic real-world scenarios. Code to
reproduce all our experiments can be found at
https://github.com/gramuah/ros4vsn
Assistive Robot with an AI-Based Application for the Reinforcement of Activities of Daily Living: Technical Validation with Users Affected by Neurodevelopmental Disorders
In this work, we propose the first study of a technical validation of an assistive robotic platform, which has been designed to assist people with neurodevelopmental disorders. The platform is called LOLA2 and it is equipped with an artificial intelligence-based application to reinforce the learning of daily life activities in people with neurodevelopmental problems. LOLA2 has been integrated with an ROS-based navigation system and a user interface for healthcare professionals and their patients to interact with it. Technically, we have been able to embed all these modules into an NVIDIA Jetson Xavier board, as well as an artificial intelligence agent for online action detection (OAD). This OAD approach provides a detailed report on the degree of performance of a set of daily life activities that are being learned or reinforced by users. The entire human–robot interaction process for working with users with neurodevelopmental disorders has been designed by a multidisciplinary team. Among its main features are the ability to control the robot with a joystick, a graphical user interface application that shows video tutorials with the activities to reinforce or learn, and the ability to monitor the progress of the users as they complete tasks. The main objective of the assistive robotic platform LOLA2 is to provide a system that allows therapists to track how well the users understand and perform daily tasks. This paper focuses on the technical validation of the proposed platform and its application. To do so, we have carried out a set of tests with four users with neurodevelopmental problems and special physical conditions under the supervision of the corresponding therapeutic personnel. We present detailed results of all interventions with end users, analyzing the usability, effectiveness, and limitations of the proposed technology. During its initial technical validation with real users, LOLA2 was able to detect the actions of users with disabilities with high precision. It was able to distinguish four assigned daily actions with high accuracy, but some actions were more challenging due to the physical limitations of the users. Generally, the presence of the robot in the therapy sessions received excellent feedback from medical professionals as well as patients. Overall, this study demonstrates that our developed robot is capable of assisting and monitoring people with neurodevelopmental disorders in performing their daily living tasks.
This research was funded by project AIRPLANE, with reference PID2019-104323RB-C31, of Spain’s Ministry of Science and Innovation.