SkiMap: An Efficient Mapping Framework for Robot Navigation
We present a novel mapping framework for robot navigation featuring a multi-level querying system that can rapidly produce representations as diverse as a 3D voxel grid, a 2.5D height map and a 2D occupancy grid. These are inherently embedded into a memory- and time-efficient core data structure organized as a Tree of SkipLists. Compared to the well-known Octree representation, our approach exhibits better time efficiency, thanks to its simple and highly parallelizable computational structure, and a similar memory footprint when mapping large workspaces. Peculiarly within the realm of mapping for robot navigation, our framework supports real-time erosion and re-integration of measurements upon reception of optimized poses from the sensor tracker, so as to continuously improve the accuracy of the map.

Comment: Accepted by the International Conference on Robotics and Automation (ICRA) 2017. This is the submitted version; the final published version may be slightly different.
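The multi-representation idea above can be illustrated with a toy sketch. This is not SkiMap's actual implementation (which uses skip lists at each tree level for ordered, lock-friendly access); plain nested dicts stand in for the per-axis skip lists, and all names and parameters are illustrative assumptions:

```python
from collections import defaultdict

class VoxelMap:
    """Toy stand-in for SkiMap's Tree of SkipLists: a three-level nested
    mapping indexed by discretised (x, y, z) coordinates. SkiMap uses a
    skip list per level; ordinary dicts are used here for brevity, giving
    up the ordered-traversal guarantees of the real structure."""

    def __init__(self, resolution=0.1):
        self.resolution = resolution  # voxel edge length in metres (illustrative)
        # x-index -> y-index -> {z-index: hit count}
        self.grid = defaultdict(lambda: defaultdict(dict))

    def _key(self, v):
        return int(v // self.resolution)

    def integrate(self, x, y, z):
        """Register one range measurement falling inside a voxel."""
        ix, iy, iz = self._key(x), self._key(y), self._key(z)
        col = self.grid[ix][iy]
        col[iz] = col.get(iz, 0) + 1

    def occupancy_2d(self):
        """2D occupancy grid: any hit in a column marks (x, y) occupied."""
        return {(ix, iy) for ix, ys in self.grid.items()
                for iy, col in ys.items() if col}

    def height_map(self):
        """2.5D height map: highest occupied voxel per (x, y) column."""
        return {(ix, iy): max(col) * self.resolution
                for ix, ys in self.grid.items()
                for iy, col in ys.items() if col}

m = VoxelMap(resolution=0.1)
m.integrate(0.42, 0.13, 0.57)
m.integrate(0.42, 0.13, 0.94)
print(m.occupancy_2d())   # {(4, 1)}
print(m.height_map())
```

The point of the design is that the 2D and 2.5D views are derived from the same 3D store by traversing fewer levels, rather than being maintained as separate maps.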
American sign language posture understanding with deep neural networks
Sign language is a visually oriented, natural, non-verbal communication medium. Sharing linguistic properties with its respective spoken language, it consists of a set of gestures, postures and facial expressions. Although sign language is a mode of communication among deaf people, most hearing people cannot interpret it. It would therefore be constructive to translate sign postures artificially. In this paper, a capsule-based deep neural network sign-posture translator for American Sign Language (ASL) fingerspelling is presented. Performance validation shows that the approach can identify sign language postures with an accuracy of about 99%. Unlike previous neural network approaches, which mainly relied on fine-tuning and transfer learning from pre-trained models, the developed capsule network architecture does not require a pre-trained model. The framework uses a capsule network with adaptive pooling, which is the key to its high accuracy. The framework is not limited to sign language understanding; it also has scope for non-verbal communication in Human-Robot Interaction (HRI).
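The abstract credits adaptive pooling for the network's accuracy. The idea of adaptive (average) pooling is that the output grid size is fixed while the window sizes adapt to the input, so a downstream capsule or dense head always sees the same shape. A minimal NumPy sketch of that operation (mirroring the window scheme used by common deep-learning libraries, not the paper's actual network):

```python
import numpy as np

def adaptive_avg_pool_2d(x, out_h, out_w):
    """Adaptive average pooling: maps any (H, W) feature map to a fixed
    (out_h, out_w) grid by averaging over proportionally sized windows."""
    h, w = x.shape
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        # window rows [floor(h*i/out_h), ceil(h*(i+1)/out_h))
        r0, r1 = (h * i) // out_h, -(-h * (i + 1) // out_h)
        for j in range(out_w):
            c0, c1 = (w * j) // out_w, -(-w * (j + 1) // out_w)
            out[i, j] = x[r0:r1, c0:c1].mean()
    return out

fmap = np.arange(36, dtype=float).reshape(6, 6)    # a 6x6 feature map
pooled = adaptive_avg_pool_2d(fmap, 2, 2)          # always 2x2, whatever the input
print(pooled)   # [[ 7. 10.] [25. 28.]]
```

Because the output shape is independent of the input shape, the same head can consume images of varying resolution without resizing.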
Dream Formulations and Deep Neural Networks: Humanistic Themes in the Iconology of the Machine-Learned Image
This paper addresses the interpretability of deep learning-enabled image
recognition processes in computer vision science in relation to theories in art
history and cognitive psychology on the vision-related perceptual capabilities
of humans. Examination of what is determinable about the machine-learned image in comparison to humanistic theories of visual perception, particularly art historian Erwin Panofsky's methodology for image analysis and psychologist Eleanor Rosch's theory of graded categorization according to prototypes, reveals surprising similarities suggesting that researchers in the arts and the sciences would benefit greatly from closer collaboration. Utilizing the examples of Google's
DeepDream and the Machine Learning and Perception Lab at Georgia Tech's
Grad-CAM: Gradient-weighted Class Activation Mapping programs, this study
suggests that a revival of art historical research in iconography and formalism
in the age of AI is essential for shaping the future navigation and
interpretation of all machine-learned images, given the rapid developments in
image recognition technologies.

Comment: 29 pages, 8 figures. This paper was originally presented as Dream Formulations and Image Recognition: Algorithms for the Study of Renaissance Art, at Critical Approaches to Digital Art History, The Villa I Tatti, The Harvard University Center for Italian Renaissance Studies and The Newberry Center for Renaissance Studies, Renaissance Society of America Annual Meeting, Chicago, 31 March 201
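One of the two systems the study discusses, Grad-CAM, has a compact core computation: each channel of a convolutional layer is weighted by the global average of the class-score gradient over that channel, and the heatmap is the ReLU of the weighted sum of activation maps. A NumPy sketch of just that step, with synthetic arrays standing in for real activations and gradients:

```python
import numpy as np

def grad_cam(activations, gradients):
    """Core Grad-CAM step (Selvaraju et al.): channel weights are the
    spatially averaged gradients; the heatmap is the ReLU of the
    weighted sum of activation maps. Inputs have shape (channels, H, W)."""
    weights = gradients.mean(axis=(1, 2))             # alpha_k: GAP over H, W
    cam = np.tensordot(weights, activations, axes=1)  # sum_k alpha_k * A^k
    return np.maximum(cam, 0)                         # ReLU keeps positive evidence

# Synthetic stand-ins for a conv layer's activations and gradients.
acts = np.ones((2, 3, 3))
grads = np.stack([np.full((3, 3), 0.5), np.full((3, 3), -1.0)])
heatmap = grad_cam(acts, grads)
print(heatmap)   # 0.5*1 + (-1.0)*1 = -0.5 -> ReLU -> all zeros
```

The ReLU is what restricts the visualization to regions that argue *for* the class, which is exactly the property that makes such maps readable as icono­graphic evidence in the study's sense.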
Visual complexity modelling based on image features fusion of multiple kernels
Humans' perception of visual complexity is often regarded as one of the key principles of aesthetic order, and is intimately related to the physiological, neurological and, possibly, psychological characteristics of the human mind. For these reasons, creating accurate computational models of visual complexity is a demanding task. Building upon previous work in the field (Forsythe et al., 2011; Machado et al., 2015), we explore the use of Machine Learning techniques to create computational models of visual complexity. For that purpose, we use a dataset composed of 800 visual stimuli divided into five categories, describing each stimulus by 329 features based on edge detection, compression error and Zipf's law. In an initial stage, a comparative analysis of representative state-of-the-art Machine Learning approaches is performed. Subsequently, we conduct an exhaustive outlier analysis. We analyze the impact of removing the extreme outliers, concluding that Feature Selection Multiple Kernel Learning obtains the best results, yielding an average correlation to humans' perception of complexity of 0.71 with only twenty-two features. These results outperform the current state-of-the-art, showing the potential of this technique for regression.

Funding: Xunta de Galicia, GRC2014/049; Portuguese Foundation for Science and Technology, SBIRC, PTDC/EIA-EIA/115667/2009; Xunta de Galicia, Ref. XUGA-PGIDIT-10TIC105008-PR; Ministerio de Ciencia y Tecnología, TIN2008-06562/TIN; Ministerio de Economía y Competitividad, FJCI-2015-2607
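The compression-error family of features the abstract mentions can be illustrated with a simple stand-in: the ratio of an image's zlib-compressed size to its raw size. This is only indicative of the feature family, not the paper's exact 329-feature set (which also covers edge detection and Zipf's-law measures):

```python
import zlib
import numpy as np

def compression_complexity(image):
    """Compression-based complexity estimate for an 8-bit image:
    compressed size divided by raw size. Visually simple, repetitive
    stimuli compress well (ratio near 0); noisy, complex stimuli are
    nearly incompressible (ratio near, or just above, 1)."""
    raw = image.astype(np.uint8).tobytes()
    return len(zlib.compress(raw, level=9)) / len(raw)

flat = np.zeros((64, 64))                      # uniform image
rng = np.random.default_rng(0)
noise = rng.integers(0, 256, size=(64, 64))    # white-noise image
print(compression_complexity(flat))            # near 0: highly compressible
print(compression_complexity(noise))           # near 1: barely compressible
```

A regressor then maps such feature values (together with the edge and Zipf features) to the human complexity ratings, which is where the Multiple Kernel Learning step comes in.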
"Artificial" Sign Language Interpreter
This Bachelor's thesis (TFG) aims to create an Artificial Intelligence (AI) model based on Convolutional Neural Networks (CNNs) capable of detecting the static letters of the American Sign Language (ASL) alphabet, that is, those that do not require movement to be recognized. To achieve this goal, a series of tutorials was first completed to gain a deeper understanding of how CNNs work. With this knowledge, the model was built by adding several convolutional layers, dense layers and data augmentation layers, which proved fundamental for obtaining good results, with an accuracy of up to 94%. Once the model was ready, a real-time recognition interface was created using the camera. After integrating the model with the camera, very satisfactory results were obtained in recognizing the signs of the ASL alphabet, thus achieving the initial objective of creating a model capable of recognizing ASL.
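The convolutional layers this thesis stacks are all built on one operation: sliding a small kernel over the image and taking a product-sum at each position. A minimal NumPy sketch of that operation (the thesis's actual model, built with standard deep-learning tooling, adds many filters, nonlinearities, pooling and dense layers on top; the edge-detector kernel below is just an illustrative example):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """'Valid' 2D convolution (strictly, cross-correlation, as in most
    deep-learning libraries): slide the kernel over the image and take
    the elementwise product-sum at each position."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge kernel responding at the boundary of a half-bright image.
img = np.zeros((4, 4)); img[:, 2:] = 1.0
sobel_x = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
print(conv2d_valid(img, sobel_x))   # responds strongly along the edge
```

In a trained CNN the kernels are not hand-picked like this one; they are learned weights, and the static-letter classification emerges from stacking many such learned filters.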
A Comprehensive Review of AI-enabled Unmanned Aerial Vehicle: Trends, Vision, and Challenges
In recent years, the combination of artificial intelligence (AI) and unmanned
aerial vehicles (UAVs) has brought about advancements in various areas. This
comprehensive analysis explores the changing landscape of AI-powered UAVs and environmentally friendly computing in their applications. It covers emerging trends, futuristic visions, and the inherent challenges that come with this relationship. The
study examines how AI plays a role in enabling navigation, detecting and
tracking objects, monitoring wildlife, enhancing precision agriculture,
facilitating rescue operations, conducting surveillance activities, and
establishing communication among UAVs using environmentally conscious computing
techniques. By delving into the interaction between AI and UAVs, this analysis
highlights the potential for these technologies to revolutionise industries
such as agriculture, surveillance practices, disaster management strategies,
and more. While envisioning these possibilities, it also examines ethical considerations, safety concerns, regulatory frameworks yet to be established, and the responsible deployment of AI-enhanced UAV systems. By consolidating insights from research endeavours in this field, this review provides an understanding of the evolving landscape of AI-powered UAVs while setting the stage for further exploration in this transformative domain.