108,235 research outputs found
Fireground location understanding by semantic linking of visual objects and building information models
This paper presents an outline for improved localization and situational awareness in fire emergency situations, based on semantic technology and computer vision techniques. The novelty of our methodology lies in the semantic linking of video object recognition results from visual and thermal cameras with Building Information Models (BIM). The current limitations and possibilities of certain building information streams in the context of fire safety and fire incident management are addressed in this paper. Furthermore, our data management tools match higher-level semantic metadata descriptors of BIM with deep-learning-based visual object recognition and classification networks. Based on these matches, estimates of camera, object, and event positions in the BIM model can be generated, transforming it from a static source of information into a rich, dynamic data provider. Previous work has already investigated the possibility of linking BIM with low-cost point sensors for fireground understanding, but these approaches did not take into account the benefits of video analysis and recent developments in semantics and feature-learning research. Finally, the strengths of the proposed approach compared to the state of the art are its (semi-)automatic workflow, its generic and modular setup, and its multi-modal strategy, which make it possible to automatically create situational awareness, improve localization, and facilitate overall fireground understanding.
IR-Depth Face Detection and Lip Localization Using Kinect V2
Face recognition and lip localization are two main building blocks in the development of audio-visual automatic speech recognition (AV-ASR) systems. In many earlier works, face recognition and lip localization were conducted under uniform lighting conditions with simple backgrounds. However, such conditions are seldom the case in real-world applications. In this paper, we present an approach to face recognition and lip localization that is invariant to lighting conditions. This is done by employing infrared and depth images captured by the Kinect V2 device. First, we present the use of infrared images for face detection. Second, we use the face's inherent depth information to reduce the search area for the lips by developing a nose-point detection method. Third, we further reduce the search area by using a depth segmentation algorithm to separate the face from its background. Finally, with the reduced search range, we present a method for lip localization based on depth gradients. Experimental results demonstrate an accuracy of 100% for face detection and 96% for lip localization.
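The depth-based search-space reduction described above can be sketched in a few lines: the nose tip is approximated as the closest valid pixel to the camera, and the lip line is then sought below it as the row with the strongest vertical depth gradient. This is a minimal illustration under those assumptions, not the authors' implementation; the function names, the gradient criterion, and the depth values are made up for the sketch.

```python
import numpy as np

def nose_point(depth_face):
    """Nose tip approximated as the closest valid pixel to the camera
    (zero depth marks an invalid Kinect reading)."""
    d = np.where(depth_face > 0, depth_face, np.inf)
    return np.unravel_index(np.argmin(d), d.shape)

def localize_lips(depth_face):
    """Search below the nose for the row with the strongest mean vertical
    depth gradient, a crude stand-in for the lip contour."""
    ny, nx = nose_point(depth_face)
    lower = depth_face[ny:, :].astype(float)
    grad = np.abs(np.diff(lower, axis=0))       # vertical depth differences
    row = ny + 1 + int(np.argmax(grad.mean(axis=1)))
    return row, nx
```

On a synthetic face-shaped depth map with a nose bump and a depth step at the mouth, the functions recover both landmarks; a real pipeline would of course operate on the segmented face region only.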
Visual location awareness for mobile robots using feature-based vision
Department Head: L. Darrell Whitley. Spring 2010. Includes bibliographical references (pages 48-50). This thesis presents an evaluation of the feature-based visual recognition paradigm for the task of mobile robot localization. Although many works describe feature-based visual robot localization, they often do so using complex methods for map building and position estimation that obscure the underlying vision system's performance. One of the main contributions of this work is the development of an evaluation algorithm employing simple models for location awareness, with a focus on evaluating the underlying vision system. While SeeAsYou is used as a prototypical vision system for evaluation, the algorithm is designed so that it can be used with other feature-based vision systems as well. The main result is that feature-based recognition with SeeAsYou provides some information but is not strong enough to reliably achieve location awareness without temporal context. Adding a simple temporal model, however, points to more reliable localization performance.
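The benefit of a simple temporal model can be illustrated with first-order forward filtering over per-frame recognition scores: a sticky transition prior suppresses isolated misclassifications that would fool a purely per-frame recognizer. This is a generic sketch, not the model used in the thesis; the transition probabilities and likelihood values are illustrative assumptions.

```python
import numpy as np

def temporal_filter(frame_likelihoods, stay_prob=0.8):
    """Forward filtering over location posteriors: at each frame, propagate
    the belief through a sticky transition matrix (the robot usually stays
    where it is), then weight by the per-frame recognition likelihoods."""
    n_loc = frame_likelihoods.shape[1]
    T = np.full((n_loc, n_loc), (1 - stay_prob) / (n_loc - 1))
    np.fill_diagonal(T, stay_prob)
    belief = np.full(n_loc, 1.0 / n_loc)        # uniform prior over locations
    estimates = []
    for lik in frame_likelihoods:
        belief = (T.T @ belief) * lik           # predict, then fuse evidence
        belief /= belief.sum()
        estimates.append(int(np.argmax(belief)))
    return estimates
```

A sequence whose frames mostly favour one location, with a single noisy frame favouring another, is still classified consistently once filtered.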
Application of Channel Modeling for Indoor Localization Using TOA and RSS
Recently, considerable attention has been paid to indoor geolocation using wireless local area network (WLAN) and wireless personal area network (WPAN) devices. As more applications using these technologies emerge in the market, the need for accurate and reliable localization increases. In response to this need, a number of technologies and associated algorithms have been introduced in the literature. These algorithms resolve the location either by using estimated distances between a mobile station (MS) and at least three reference points (via triangulation) or by pattern recognition through radio frequency (RF) fingerprinting. Since RF fingerprinting, which requires on-site measurements, is a time-consuming process, it is desirable to replace this procedure with results obtained from radio channel modeling techniques. Localization algorithms use either the received signal strength (RSS) or the time of arrival (TOA) of the received signal as their localization metric. TOA-based systems are sensitive to the available bandwidth and to the occurrence of undetected direct path (UDP) channel conditions, while RSS-based systems are less sensitive to the bandwidth and more resilient to UDP conditions. Therefore, the comparative performance evaluation of different positioning systems is a multifaceted and challenging problem. This dissertation demonstrates the viability of radio channel modeling techniques to eliminate the costly fingerprinting process in pattern recognition algorithms by introducing novel ray tracing (RT)-assisted RSS- and TOA-based algorithms. Two sets of empirical data obtained by radio channel measurements are used to create a baseline for comparative performance evaluation of localization algorithms. The first database was obtained by WiFi RSS measurements on the first floor of the Atwater Kent Laboratory, an academic building on the campus of WPI, and the other by ultra-wideband (UWB) channel measurements on the third floor of the same building.
Using the results of the measurement campaign, we specifically analyze the comparative behavior of TOA- and RSS-based indoor localization algorithms employing triangulation or pattern recognition with the different bandwidths adopted in WLAN and WPAN systems. Finally, we introduce a new RT-assisted hybrid RSS-TOA algorithm that employs neural networks. The resulting algorithm demonstrates superior performance compared to conventional RSS- and TOA-based algorithms in wideband systems.
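The triangulation step referred to above is commonly solved as a linearized least-squares problem once TOA estimates are converted to ranges. The following is a minimal sketch of that standard step, not code from the dissertation; the anchor layout and noise-free TOAs are assumed for illustration.

```python
import numpy as np

C = 3e8  # propagation speed, m/s

def toa_trilaterate(anchors, toas):
    """Linearized least-squares 2-D position estimate from TOA-derived ranges
    to three or more reference points. Subtracting the range equation of the
    first anchor from the others cancels the quadratic terms in the unknown
    position, leaving a linear system A p = b."""
    ranges = C * np.asarray(toas, dtype=float)
    x0, y0 = anchors[0]
    A, b = [], []
    for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
        A.append([2 * (xi - x0), 2 * (yi - y0)])
        b.append(ranges[0]**2 - ri**2 + xi**2 - x0**2 + yi**2 - y0**2)
    pos, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return pos
```

With more than three anchors the same code yields the least-squares solution, which is how measurement noise and UDP-induced range bias are typically averaged out in practice.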
Text localization and recognition in natural scene images
Text localization and recognition (text spotting) in natural scene images is an interesting task with many practical applications. Algorithms for text spotting may be used to help visually impaired subjects navigate unknown environments; to build autonomous driving systems that automatically avoid collisions with pedestrians or identify speed limits and warn the driver about possible infractions; and to ease or solve tedious and repetitive data entry tasks that are still carried out manually by humans. While Optical Character Recognition (OCR) from scanned documents is a solved problem, the same cannot be said for text spotting in natural images. In fact, this latter class of images contains plenty of difficult situations that text spotting algorithms need to deal with in order to reach acceptable recognition rates. During my PhD research I focused my studies on the development of novel systems for text localization and recognition in natural scene images. The two main works of these three years of PhD studies are presented in this thesis: (i) in the first, I propose a hybrid system that exploits the key ideas of region-based and connected-component (CC)-based text localization approaches to localize uncommon fonts and writings in natural images; (ii) in the second, I describe a novel deep-learning-based system that exploits Convolutional Neural Networks and enhanced stable CCs to achieve good text spotting results on challenging data sets. During the development of both methods, my focus has always been on maintaining acceptable computational complexity and high reproducibility of the achieved results.
Exploring Food Detection using CNNs
One of the most common critical factors directly related to the cause of chronic disease is unhealthy diet. In this sense, building an automatic system for food analysis could allow a better understanding of the nutritional information of the food eaten and could thus help in taking corrective actions towards a better diet. The Computer Vision community has focused its efforts on several areas of visual food analysis, such as food detection, food recognition, food localization, and portion estimation, among others. For food detection, the best results in the state of the art were obtained using Convolutional Neural Networks. However, the results of these different approaches were obtained on different datasets and are therefore not directly comparable. This article proposes an overview of the latest advances in food detection and an optimal model, based on the GoogLeNet Convolutional Neural Network, principal component analysis, and a support vector machine, that outperforms the state of the art on two public food/non-food datasets.
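The described pipeline (CNN features, then PCA, then an SVM) can be sketched with scikit-learn. Here random clusters stand in for GoogLeNet's penultimate-layer activations, and the feature dimensionality, number of components, and kernel choice are illustrative assumptions rather than the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for CNN activations: two well-separated synthetic clusters
# representing food and non-food images (1024-D, like GoogLeNet's pool layer).
rng = np.random.default_rng(0)
food = rng.normal(1.0, 0.3, size=(50, 1024))
non_food = rng.normal(-1.0, 0.3, size=(50, 1024))
X = np.vstack([food, non_food])
y = np.array([1] * 50 + [0] * 50)

# Reduce the high-dimensional features with PCA, then classify with an SVM.
clf = make_pipeline(StandardScaler(), PCA(n_components=16), SVC(kernel="rbf"))
clf.fit(X, y)
```

In a real system the feature matrix would come from forwarding images through a pretrained network, and hyperparameters would be chosen by cross-validation rather than fixed as here.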
Study and development of a reliable fiducials-based localization system for multicopter UAVs flying indoor
The recent evolution of technology in the automation, agriculture, IoT, and aerospace fields has created a growing demand for mobile robots capable of autonomous operation and movement to accomplish various tasks. Aerial platforms are expected to play a central role in the future due to their versatility and swift intervention capabilities. However, the effective utilization of these platforms faces a significant challenge in localization, which is a vital aspect of their interaction with the surrounding environment. While GNSS localization systems have established themselves as reliable solutions for open-space scenarios, the same approach is not viable for indoor settings, where localization remains an open problem, as witnessed by the lack of extensive literature on the topic.
In this thesis, we address this challenge by proposing a dependable solution for small multi-rotor UAVs using a Visual Inertial Odometry localization system. Our KF-based localization system reconstructs the pose by fusing data from onboard sensors. The primary source of information is the recognition of AprilTag fiducial markers, strategically placed in known positions to form a "map".
Building upon prior research and thesis work conducted at our university, we extend and enhance this system. We begin with a concise introduction, followed by a justification of our chosen strategies based on the current state of the art. We provide an overview of the key theoretical, mathematical, and technical aspects that support our work. These concepts are fundamental to the design of innovative strategies that address challenges such as the fusion of data from multiple AprilTag detections and the elimination of misleading measurements. To validate our algorithms and their implementation, we conduct experimental tests on two distinct platforms, using localization accuracy and computational complexity as performance indices to demonstrate the practical viability of our proposed system.
By tackling the critical issue of indoor localization for aerial platforms, this thesis aims to contribute to the advancement of robotics technology, opening avenues
for enhanced autonomy and efficiency across various domains.
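The fusion and outlier-rejection steps mentioned in the abstract follow the standard Kalman-filter pattern: each AprilTag detection yields a position measurement that is first gated on its innovation and, if accepted, fused into the state. The sketch below is a generic textbook version (position-velocity state, position-only measurements), not the thesis's actual filter; the dimensions and gate threshold are assumptions.

```python
import numpy as np

H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position, not velocity

def kf_update(x, P, z, R):
    """Kalman correction step: fuse a tag-derived position measurement z
    (covariance R) into the 6-D state x (position + velocity, covariance P)."""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P

def gate(x, P, z, R, thresh=9.0):
    """Chi-square innovation gate: reject misleading tag detections whose
    Mahalanobis distance from the predicted position is too large."""
    nu = z - H @ x
    S = H @ P @ H.T + R
    return float(nu @ np.linalg.inv(S) @ nu) < thresh
```

A detection consistent with the predicted pose passes the gate and pulls the state toward the measurement; a grossly inconsistent one is simply discarded.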
Low-effort place recognition with WiFi fingerprints using deep learning
Using WiFi signals for indoor localization is the main localization modality of existing personal indoor localization systems operating on mobile devices. WiFi fingerprinting is also used for mobile robots, as WiFi signals are usually available indoors and can provide a rough initial position estimate or be used together with other positioning systems. Currently, the best solutions rely on filtering, manual data analysis, and time-consuming parameter tuning to achieve reliable and accurate localization. In this work, we propose to use deep neural networks to significantly lower the manual effort of localization system design while still achieving satisfactory results. Adopting the state-of-the-art hierarchical approach, we employ a DNN system for building/floor classification. We show that stacked autoencoders make it possible to efficiently reduce the feature space in order to achieve robust and precise classification. The proposed architecture is verified on the publicly available UJIIndoorLoc dataset and the results are compared with other solutions.
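The reduce-then-classify idea can be sketched compactly. A linear autoencoder trained to convergence spans the top principal subspace, so the sketch below uses SVD as a stand-in for a trained encoder, followed by nearest-centroid floor classification on the codes. The fingerprint dimensions and the centroid classifier are illustrative assumptions; the paper's actual system uses stacked (nonlinear) autoencoders feeding a DNN classifier.

```python
import numpy as np

def encode(X, code_dim):
    """Stand-in encoder: a linear autoencoder trained to convergence spans
    the top principal subspace, so that subspace is computed directly via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:code_dim].T

def fit_centroids(codes, labels):
    """One centroid per building/floor label in the reduced feature space."""
    return {l: codes[labels == l].mean(axis=0) for l in np.unique(labels)}

def classify(code, centroids):
    """Assign the floor whose centroid lies nearest to the encoded fingerprint."""
    return min(centroids, key=lambda l: float(np.linalg.norm(code - centroids[l])))
```

On synthetic RSS fingerprints where different floors see different access points strongly, a two-dimensional code is already enough to separate the floors, which mirrors the paper's point that aggressive dimensionality reduction need not hurt classification.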