Pedestrian Navigation using Artificial Neural Networks and Classical Filtering Techniques
The objective of this thesis is to explore the improvements achieved by combining classical filtering methods with Artificial Neural Networks (ANNs) for pedestrian navigation. ANNs have improved dramatically in their ability to approximate various functions, and neural network solutions have surpassed many classical navigation techniques. However, research using ANNs appears to focus on the ability of neural networks alone. Combining ANNs with classical filtering methods has the potential to bring the beneficial aspects of both techniques together and increase accuracy in many different applications. Pedestrian navigation is used as a medium to explore this process through a localization approach and a Pedestrian Dead Reckoning (PDR) approach. Pedestrian navigation is dominated by Global Positioning System (GPS) based methods, but urban and indoor environments pose difficulties for GPS-based navigation. A novel urban data set is created for testing various localization- and PDR-based pedestrian navigation solutions. Cell phone data, including images and accelerometer, gyroscope, and magnetometer readings, is collected to train the ANNs. The ANN methods are explored first, aiming for a low root mean square error (RMSE) between the predicted and original trajectories. After analyzing the localization and PDR solutions, they are combined in an extended Kalman filter (EKF) to achieve a 20% reduction in RMSE. This takes the best localization result of 35 m, combined with an underperforming PDR solution with a 171 m RMSE, to create an EKF solution with a 28 m RMSE over a one-hour test collection.
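The fusion step described in this abstract can be illustrated with a minimal Kalman-style filter that predicts with PDR displacements and corrects with absolute localization fixes. This is a simplified sketch with an isotropic scalar covariance; the class name and all noise values are illustrative assumptions, not the thesis's actual parameters.

```python
# Minimal sketch of fusing PDR steps with absolute localization fixes.
# All noise values and names here are illustrative assumptions.

class PdrLocalizationFilter:
    def __init__(self, x, y, var=25.0):
        self.x, self.y = x, y
        self.var = var                 # position variance (m^2), isotropic

    def predict(self, dx, dy, step_var=4.0):
        """Dead-reckoning prediction: apply a PDR displacement."""
        self.x += dx
        self.y += dy
        self.var += step_var           # uncertainty grows with each step

    def update(self, zx, zy, meas_var=35.0 ** 2):
        """Correct with an absolute localization fix (e.g. image-based)."""
        k = self.var / (self.var + meas_var)   # Kalman gain
        self.x += k * (zx - self.x)
        self.y += k * (zy - self.y)
        self.var *= (1.0 - k)

f = PdrLocalizationFilter(0.0, 0.0)
f.predict(0.7, 0.1)                    # one PDR step
f.update(1.0, 0.0)                     # one localization fix
```

The corrected state lands between the dead-reckoned prediction and the measurement, weighted by their variances, which is the mechanism by which a noisy localization source can still trim drift out of a PDR track.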
Pedestrian detection and tracking using stereo vision techniques
Automated pedestrian detection, counting and tracking have received significant attention from the computer vision community of late. Many of the person detection techniques described in the literature work well in controlled environments, such as laboratory settings with a small number of people, which allows various simplifying assumptions to be made about this complex problem. The performance of these techniques, however, tends to deteriorate when presented with unconstrained environments where pedestrian appearances, numbers, orientations, movements, occlusions and lighting conditions violate these convenient assumptions. Recently, 3D stereo information has been proposed as a way to overcome some of these issues and to guide pedestrian detection. This thesis presents such an approach: after obtaining robust 3D information via a novel disparity estimation technique, pedestrian detection is performed via a 3D point clustering process within a region-growing framework. This clustering process avoids hard thresholds by using biometrically inspired constraints and a number of plan-view statistics. This pedestrian detection technique requires no external training and robustly handles challenging real-world unconstrained environments from various camera positions and orientations. In addition, this thesis presents a continuous detect-and-track approach, with additional kinematic constraints and explicit occlusion analysis, to obtain robust temporal tracking of pedestrians over time. These approaches are experimentally validated on challenging datasets consisting of both synthetic data and real-world sequences gathered from a number of environments. In each case, the techniques are evaluated using both 2D and 3D ground-truth methodologies.
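The region-growing clustering idea can be sketched as a simple flood fill over plan-view points: a seed point absorbs neighbours within a radius, and each connected group becomes one candidate pedestrian. The fixed neighbour radius below is an assumed stand-in for the thesis's biometrically inspired constraints, which avoid exactly this kind of hard threshold.

```python
# Illustrative region-growing clustering of ground-plane (plan view)
# points. The 0.4 m neighbour radius is an assumed parameter standing
# in for the biometric constraints used in the thesis.
import math

def grow_clusters(points, radius=0.4):
    """Group (x, z) ground-plane points by region growing."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        frontier, cluster = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= radius]
            for j in near:
                unvisited.discard(j)
            frontier.extend(near)
            cluster.extend(near)
        clusters.append(sorted(cluster))
    return clusters

# Two well-separated pairs of points -> two clusters.
pts = [(0.0, 0.0), (0.2, 0.1), (3.0, 3.0), (3.1, 3.2)]
```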
Object Tracking in Video with Part-Based Tracking by Feature Sampling
Visual tracking of arbitrary objects is an active research topic in computer vision, with applications across multiple disciplines including video surveillance, activity analysis, robot vision, and human-computer interfaces. Despite great progress in object tracking in recent years, it remains a challenge to design trackers that can deal with difficult tracking scenarios, such as camera motion, object motion change, occlusion, illumination changes, and object deformation. A promising way of tackling these problems is a part-based method: one which models and tracks small regions of the object and estimates the location of the object from the tracked parts' positions. These approaches typically model parts of objects with histograms of various hand-crafted features extracted from the region in which the part is located. However, it is unclear how such relatively homogeneous regions should be represented to form an effective part-based tracker. In this thesis we present a part-based tracker that includes a model for object parts designed to empirically characterise the underlying colour distribution of an image region, representing it by pairs of randomly selected colour features and counts of how many pixels are similar to each feature. This novel feature representation is used to find probable locations for the part in future frames via a Bhattacharyya distance-based metric, modified to prefer higher-quality matches. Sets of candidate patch locations are generated by randomly sampling non-shearing affine transformations of the part's previous locations, and the most likely sets of parts are locally optimised to allow for small intra-frame object deformations. We also present a study of model initialisation in online, model-free tracking and evaluate several techniques for selecting the regions of an image, given a target bounding box, that are most likely to contain the object.
The strengths and limitations of the combined tracker are evaluated on the VOT2016 and VOT2018 datasets using their evaluation protocol, which also allows an extensive evaluation of parameter robustness. The presented tracker is ranked first among part-based trackers on the VOT2018 dataset and is particularly robust to changes in object and camera motion, as well as object size changes
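The matching metric underlying the part search above can be sketched with the standard Bhattacharyya coefficient between normalised histograms; the thesis's quality-preferring modification is omitted here, so this is only the baseline form of the similarity.

```python
# Sketch of histogram matching via the Bhattacharyya distance, the
# baseline similarity underlying the part matching described above.
import math

def bhattacharyya_distance(p, q):
    """Distance between two normalised histograms p and q.

    Returns 0 for identical histograms and approaches 1 as the
    histograms' overlap vanishes."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))   # coefficient
    return math.sqrt(max(0.0, 1.0 - bc))

h1 = [0.5, 0.3, 0.2]
h2 = [0.5, 0.3, 0.2]       # identical -> distance 0
h3 = [0.0, 0.1, 0.9]       # dissimilar -> large distance
```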
Fusion Of Multiple Inertial Measurements Units And Its Application In Reduced Cost, Size, Weight, And Power Synthetic Aperture Radars
Position, navigation, and timing (PNT) is the concept of determining where an object is on the Earth (position), the destination of the object (navigation), and when the object is in these positions (timing). In autonomous applications, these three attributes are crucial to determining the control inputs required to control and move the platform through an area. Traditionally, position information is gathered mainly using a global positioning system (GPS), which can provide positioning sufficient for most PNT applications. However, GPS navigational solutions are limited by slower update rates and limited accuracy, and can be unreliable. GPS solutions update more slowly because the signal must travel a great distance from the satellite to the receiver. Additionally, the accuracy of the GPS solution depends on the environment of the receiver and the effects of additional reflections that introduce ambiguity into the positional solution. As a result, the positional solution can become unstable or unreliable if the ambiguities are significant and greatly impact its accuracy. A common way to address these shortcomings is to introduce an additional sensor focused on measuring the physical state of the platform. Inertial measurement units (IMUs) are popularly used and can provide faster position updates, as the signal transmission time is eliminated. Furthermore, because the IMU directly measures the physical forces that contribute to the position of the platform, the ambiguities caused by additional signal reflections are also eliminated. Although the introduction of the IMU helps mitigate some of the shortcomings of GPS, these sensors introduce a slightly different set of challenges, since the position must be estimated from the physical forces they measure.
Position estimates start from the previously known position and apply changes based on the accelerations measured by the IMUs. Because IMUs intrinsically have sensor noise and errors in their measurements, these errors directly impact the accuracy of the estimated position. The inaccuracies are further compounded as each erroneous position estimate becomes the basis for future position calculations. Inertial navigation systems (INS) have been developed to pair IMUs with GPS to overcome the challenges brought by each sensor independently. The data from each sensor is processed using a technique known as data fusion, where the statistical likelihood of each positional solution is evaluated and used to estimate the most likely position given the observations from each sensor. Data fusion allows the navigation solution to provide a position at the sampling rate of the fastest sensor while also limiting the compounding errors intrinsic to using IMUs. Synthetic aperture radar (SAR) is an application that uses a moving radar to synthetically generate a larger aperture and create images of a target scene. The larger aperture allows a finer spatial resolution, resulting in higher-quality SAR images. For SAR applications, the PNT solution is fundamental to producing a quality image because the radar reports only the range to a target. To form an image, the range to each target must be aligned over the coherent processing interval (CPI). In doing so, the energy reflected from the target as the radar moves can be combined coherently and resolved to a pixel in the image product. In practice, the position of the radar is measured using a navigational solution utilizing a GPS and an IMU. Inaccuracies in these solutions directly degrade the image quality in a SAR system because the range measured by the radar will not agree with the calculated range to the location represented by the pixel. As a result, the final image becomes unfocused and the target is blurred across multiple pixels.
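The compounding of IMU errors described above can be made concrete with a toy double integration: even a small constant accelerometer bias, integrated into velocity and then position, produces position error that grows quadratically with time. The bias value and sample rate below are illustrative assumptions.

```python
# Sketch of IMU error compounding in dead reckoning: a stationary
# platform is integrated with a constant accelerometer bias, and the
# position error grows roughly as 0.5 * bias * t**2. The bias and
# 100 Hz rate are illustrative assumptions.

def dead_reckon(bias, dt, steps):
    """Double-integrate a constant accelerometer error into position."""
    vel, pos = 0.0, 0.0
    for _ in range(steps):
        vel += bias * dt           # erroneous acceleration feeds velocity
        pos += vel * dt            # erroneous velocity feeds position
    return pos

drift_10s = dead_reckon(bias=0.01, dt=0.01, steps=1000)   # 10 s at 100 Hz
drift_20s = dead_reckon(bias=0.01, dt=0.01, steps=2000)   # 20 s at 100 Hz
```

Doubling the integration time roughly quadruples the drift, which is why uncorrected IMU-only navigation degrades so quickly and why the GPS corrections discussed above are needed.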
For INS systems, the accuracy of the final position estimate depends on the accuracy of the sensors in the system. An easy way to increase the accuracy of the INS solution is to upgrade to a higher-grade IMU. As a result, the errors compounded by the IMU estimations are minimized because the intrinsic noise perturbations are smaller. The trade-off is that IMU sensors increase in cost, size, weight, and power (C-SWAP) as sensor quality increases. This increase in C-SWAP is a challenge when utilizing higher-grade IMUs in INS navigational solutions for SAR applications, and the problem is amplified when developing miniaturized SAR systems. In this dissertation, a method of leveraging data fusion to combine multiple IMUs into higher-accuracy INS solutions is presented; specifically, C-SWAP can be reduced by utilizing lower-quality IMUs. The use of lower-quality IMUs presents an additional challenge of providing positional solutions at the rates required for SAR. A method of interpolating the position provided by the fusion algorithm while maintaining positional accuracy is also presented in this dissertation.
The methods presented in this dissertation succeed in providing accurate positional solutions from lower C-SWAP INS. The presented methods are verified in simulations of motion paths, and the results of the fusion algorithms are evaluated for accuracy. The methods are also exercised in both ground and flight tests, and the results are compared to a third-party high-accuracy position solution as an accuracy metric. Lastly, the algorithms are implemented in a miniaturized SAR system, and both ground and airborne SAR tests are conducted to evaluate their effectiveness. In general, the designed algorithms are capable of producing positional accuracy at the rate required to focus SAR images in a miniaturized SAR system.
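The basic payoff of fusing several low-grade sensors can be sketched with inverse-variance weighting, the textbook rule for combining independent measurements of the same quantity: the fused variance is smaller than any individual sensor's. This is a schematic stand-in for the dissertation's fusion algorithms, and the variances below are illustrative, not its sensor models.

```python
# Minimal sketch of combining multiple IMU readings of one quantity by
# inverse-variance weighting. Three sensors of equal quality yield a
# fused estimate with one third of the single-sensor variance.

def fuse(measurements, variances):
    """Inverse-variance weighted mean and the resulting fused variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    est = sum(w * m for w, m in zip(weights, measurements)) / total
    return est, 1.0 / total

# Three cheap accelerometers reading the same true value of 1.0 m/s^2.
est, var = fuse([1.0, 1.2, 0.8], [0.04, 0.04, 0.04])
```

This is the sense in which several low-C-SWAP IMUs can stand in for one expensive unit: accuracy is bought with redundancy rather than sensor grade.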
Energy Minimization for Multiple Object Tracking
Multiple target tracking aims at reconstructing trajectories of several
moving targets in a dynamic scene, and is of significant relevance for a
large number of applications. For example, predicting a pedestrian’s
action may be employed to warn an inattentive driver and reduce road
accidents; understanding a dynamic environment will facilitate
autonomous robot navigation; and analyzing crowded scenes can prevent
fatalities in mass panics.
The task of multiple target tracking is challenging for various reasons:
First of all, visual data is often ambiguous. For example, the objects
to be tracked can remain undetected due to low contrast and occlusion.
At the same time, background clutter can cause spurious measurements
that distract the tracking algorithm. A second challenge arises when
multiple measurements appear close to one another. Resolving
correspondence ambiguities leads to a combinatorial problem that quickly
becomes more complex with every time step. Moreover, a realistic model
of multi-target tracking should take physical constraints into account.
This is not only important at the level of individual targets but also
regarding interactions between them, which adds to the complexity of the
problem.
In this work the challenges described above are addressed by means of
energy minimization. Given a set of object detections, an energy
function describing the problem at hand is minimized with the goal of
finding a plausible solution for a batch of consecutive frames. Such
offline tracking-by-detection approaches have substantially advanced the
performance of multi-target tracking. Building on these ideas, this
dissertation introduces three novel techniques for multi-target tracking
that extend the state of the art as follows: The first approach
formulates the energy in discrete space, building on the work of Berclaz
et al. (2009). All possible target locations are reduced to a regular
lattice and tracking is posed as an integer linear program (ILP),
enabling (near) global optimality. Unlike prior work, however, the
proposed formulation includes a dynamic model and additional constraints
that enable performing non-maxima suppression (NMS) at the level of
trajectories. These contributions improve the performance both
qualitatively and quantitatively with respect to annotated ground truth.
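The lattice formulation above can be illustrated, in a heavily simplified single-target form, as dynamic programming over discrete locations; the actual method is a multi-target integer linear program with a dynamic model and trajectory-level NMS, so this sketch shows only the discrete-lattice idea. Scores and the lattice are toy assumptions.

```python
# Toy single-target stand-in for discrete-lattice tracking: find the
# max-score path through a T x L lattice of detection scores, with a
# linear penalty on moving between cells (a crude dynamic model).

def best_path(unary, step_cost=1.0):
    """Return the best location index per frame via Viterbi-style DP."""
    T, L = len(unary), len(unary[0])
    score = list(unary[0])
    back = []
    for t in range(1, T):
        prev, score, ptr = score, [0.0] * L, [0] * L
        for l in range(L):
            j = max(range(L), key=lambda j: prev[j] - step_cost * abs(j - l))
            score[l] = unary[t][l] + prev[j] - step_cost * abs(j - l)
            ptr[l] = j
        back.append(ptr)
    l = max(range(L), key=lambda l: score[l])
    path = [l]
    for ptr in reversed(back):        # trace back through the frames
        l = ptr[l]
        path.append(l)
    return path[::-1]
```

For a lattice whose strongest detections march diagonally, the recovered path follows them, paying one unit of motion cost per frame.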
The second technical contribution is a continuous energy function for
multiple target tracking that overcomes the limitations imposed by
spatial discretization. The continuous formulation is able to capture
important aspects of the problem, such as target localization or motion
estimation, more accurately. More precisely, the data term as well as
all phenomena including mutual exclusion and occlusion, appearance,
dynamics and target persistence are modeled by continuous differentiable
functions. The resulting non-convex optimization problem is minimized
locally by standard conjugate gradient descent in combination with
custom discontinuous jumps. The more accurate representation of the
problem leads to a powerful and robust multi-target tracking approach,
which shows encouraging results on particularly challenging video
sequences.
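The continuous-energy idea above can be sketched in one dimension: a data term pulls each state toward its detection while a dynamics term penalises acceleration, and the sum is minimised by gradient steps. The thesis minimises a richer non-convex energy (including exclusion, occlusion, appearance, and persistence) with conjugate gradients and discontinuous jumps; plain gradient descent, the weights, and the step size here are illustrative simplifications.

```python
# Toy continuous tracking energy: data term (x - detection)^2 plus a
# dynamics term on second differences, minimised by gradient descent.

def minimise(detections, lam=1.0, lr=0.02, iters=3000):
    x = list(detections)                          # initialise at detections
    for _ in range(iters):
        g = [0.0] * len(x)
        for i in range(len(x)):
            g[i] += 2.0 * (x[i] - detections[i])        # data term gradient
        for i in range(1, len(x) - 1):
            acc = x[i - 1] - 2.0 * x[i] + x[i + 1]      # second difference
            g[i - 1] += 2.0 * lam * acc                 # dynamics gradients
            g[i] -= 4.0 * lam * acc
            g[i + 1] += 2.0 * lam * acc
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

smoothed = minimise([0.0, 1.0, 1.5, 3.5, 4.0])
```

The minimiser trades off fidelity to the noisy detections against trajectory smoothness, which is exactly the balance the continuous formulation expresses.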
Both previous methods concentrate on reconstructing trajectories, while
disregarding the target-to-measurement assignment problem. To unify both
data association and trajectory estimation into a single optimization
framework, a discrete-continuous energy is presented in Part III of this
dissertation. Leveraging recent advances in discrete optimization
(Delong et al., 2012), it is possible to formulate multi-target tracking
as a model-fitting approach, where discrete assignments and continuous
trajectory representations are combined into a single objective
function. To enable efficient optimization, the energy is minimized
locally by alternating between the discrete and the continuous set of
variables.
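The alternation between discrete and continuous variables can be sketched as a model-fitting loop: detections are assigned to trajectory models (discrete step), then each model is refit to its assigned detections (continuous step). Real trajectories, label costs, and the energy of Delong et al. are far richer; straight-line models and the loop below are a toy illustration.

```python
# Schematic discrete-continuous alternation: assign (t, x) detections
# to straight-line trajectory models, then refit each line, repeat.

def fit_line(pts):
    """Least-squares line x(t) = a*t + b through (t, x) points."""
    n = len(pts)
    st = sum(t for t, _ in pts); sx = sum(x for _, x in pts)
    stt = sum(t * t for t, _ in pts); stx = sum(t * x for t, x in pts)
    denom = n * stt - st * st
    a = (n * stx - st * sx) / denom if denom else 0.0
    b = (sx - a * st) / n
    return a, b

def alternate(dets, models, rounds=5):
    for _ in range(rounds):
        groups = [[] for _ in models]             # discrete: assignment
        for t, x in dets:
            j = min(range(len(models)),
                    key=lambda j: (models[j][0] * t + models[j][1] - x) ** 2)
            groups[j].append((t, x))
        models = [fit_line(g) if g else m         # continuous: refitting
                  for g, m in zip(groups, models)]
    return models

# Two crossing targets: one moving as x = t, the other as x = 10 - t.
dets = [(t, float(t)) for t in range(5)] + [(t, 10.0 - t) for t in range(5)]
models = alternate(dets, [(0.0, 0.0), (0.0, 10.0)])
```

Each half-step can only lower the joint objective, which is why alternating local minimisation of this kind is efficient even though the combined problem is non-convex.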
The final contribution of this dissertation is an extensive discussion
on performance evaluation and comparison of tracking algorithms, which
points out important practical issues that ought not to be ignored.
Automatic Multilevel Feature Abstraction in Adaptable Machine Vision Systems
Vision is a complex task which can be accomplished with apparent ease by biological systems, but for which the design of artificial systems is difficult. Although machine vision systems can be successfully designed for a specific task, under certain conditions, they are likely to fail if circumstances change. This was the motivation for the research into ways in which systems can be self-designing and adaptable to new visual tasks. The research was conducted in three vital areas of concern for machine vision systems.
The first area is finding a suitable architecture for forming an appropriate representation for the current task. The research investigated the application of Hypernetworks theory to building a multilevel, generally-applicable representation, through repeated application of a fundamental 'self-similarity' principle, that parts of objects assembled under a particular relation at one level, form whole objects at the next. Results show that this is potentially a powerful approach for autonomously generating an adaptable system-architecture suitable for multiple visual tasks.
The second area is the autonomous extraction of suitable low-level features, which the research investigated through random generation of minimally-constrained pixel-configurations and algorithmic generation of homogeneous and heterogeneous polygons. The results suggest that, despite the simplicity of the features making them vulnerable to image transformations, these are promising approaches worth developing further.
The third area is automatic feature selection. The research explored management of 'dimensionality' and of 'combinatorial explosion', as well as how to locate relevant features at multiple representation levels, in the context of 'emergence' of structure. Results indicate that this approach can find useful 'intermediate-level' constructs through analysis of the connectivity of the simplices representing objects at higher levels.
The research concludes that the proposed novel approaches to tackling the above issues, in particular the application of hypernetworks to the formation of multilevel representations and the resulting emergence of higher-level structure, are fruitful.
A Comprehensive Review on Autonomous Navigation
The field of autonomous mobile robots has undergone dramatic advancements
over the past decades. Despite achieving important milestones, several
challenges are yet to be addressed. Aggregating the achievements of the
robotics community in survey papers is vital to keep track of the current
state of the art and the challenges that must be tackled in the future. This
paper tries to provide a comprehensive review of autonomous mobile robots
covering topics such as sensor types, mobile robot platforms, simulation tools,
path planning and following, sensor fusion methods, obstacle avoidance, and
SLAM. The motivation for presenting a survey paper is twofold. First, the
autonomous navigation field evolves quickly, so writing survey papers regularly
is crucial to keep the research community aware of the field's current status.
Second, deep learning methods have revolutionized many fields including
autonomous navigation. Therefore, this paper also gives an appropriate
treatment of the role of deep learning in autonomous navigation. Future work
and research gaps are also discussed.