On the Utility of Representation Learning Algorithms for Myoelectric Interfacing
Electrical activity produced by muscles during voluntary movement reflects the firing patterns of the relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control is apparent. Although myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites remains an active area of inquiry.
This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding, and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for the automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture, with an accompanying training framework, from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms. Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures, intended to reduce the training burden.
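As a hedged illustration of the kind of model Paper I describes, the following minimal PyTorch sketch decodes multi-label movements from HD-sEMG windows. The electrode-grid size, window length and number of movement labels are assumptions chosen for brevity, not values taken from the dissertation.

# Illustrative sketch only: a minimal multi-label movement decoder for
# HD-sEMG windows. The 8x16 channel grid, 64-sample window and 12 labels
# are placeholder assumptions, not the dissertation's configuration.
import torch
import torch.nn as nn

class EMGDecoder(nn.Module):
    def __init__(self, n_labels: int = 12):
        super().__init__()
        # Simplification: treat the HD-sEMG electrode grid as a 2D image
        # with one input channel per time sample in the window.
        self.features = nn.Sequential(
            nn.Conv2d(64, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, n_labels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 64 time samples, 8 rows, 16 columns)
        return self.head(self.features(x).flatten(1))

# Multi-label decoding: an independent sigmoid per movement, so several
# degrees of freedom can be flagged active simultaneously; training
# would use BCEWithLogitsLoss rather than softmax cross-entropy.
window = torch.randn(1, 64, 8, 16)
active = torch.sigmoid(EMGDecoder()(window)) > 0.5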
Visual place recognition for improved open and uncertain navigation
Visual place recognition localises a query place image by comparing it against a reference database of known place images, a fundamental element of robotic navigation. Recent work focuses on using deep learning to learn image descriptors for this task that are invariant to appearance changes caused by dynamic lighting, weather and seasonal conditions. However, these descriptors require greater computational resources than are available on robotic hardware; have few SLAM frameworks designed to utilise them; return a relative comparison between image descriptors that is difficult to interpret; cannot be used for appearance invariance in other navigation tasks, such as scene classification; and are unable to identify query images from an open environment that have no true match in the reference database. This thesis addresses these challenges with three contributions. The first is a lightweight visual place recognition descriptor combined with a probabilistic filter that addresses a subset of the visual SLAM problem in real time. The second combines visual place recognition and scene classification for appearance-invariant scene classification, extended to recognise unknown scene classes when navigating an open environment. The final contribution uses comparisons between query and reference image descriptors to classify whether they result in a true or false positive localisation, and whether a true match for the query image exists in the reference database.
Edinburgh Centre for Robotics and Engineering and Physical Sciences Research Council (EPSRC) funding.
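A minimal sketch of the descriptor-matching step that underlies visual place recognition, including a rejection threshold for the open-environment case in which a query has no true match in the database. The descriptor dimensionality, threshold value and synthetic data are illustrative assumptions, not the thesis's method.

# Illustrative sketch only: nearest-neighbour place recognition over
# L2-normalised descriptors, with open-set rejection. The threshold
# tau and dimension 256 are placeholder assumptions.
import numpy as np

def localise(query_desc: np.ndarray, reference: np.ndarray, tau: float = 0.8):
    """query_desc: (D,) L2-normalised descriptor; reference: (N, D)."""
    sims = reference @ query_desc          # cosine similarity per reference image
    best = int(np.argmax(sims))
    if sims[best] < tau:                   # below threshold: treat as an unseen place
        return None
    return best                            # index of the matched reference image

rng = np.random.default_rng(0)
db = rng.normal(size=(100, 256))
db /= np.linalg.norm(db, axis=1, keepdims=True)
q = db[42] + 0.01 * rng.normal(size=256)   # noisy re-observation of place 42
q /= np.linalg.norm(q)
print(localise(q, db))                     # 42 for a close match, None otherwise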
Visual Guidance for Unmanned Aerial Vehicles with Deep Learning
Unmanned Aerial Vehicles (UAVs) have been widely applied in the military and civilian domains. In recent years, the operation mode of UAVs has been evolving from teleoperation to autonomous flight. To fulfil the goal of autonomous flight, a reliable guidance system is essential. Since the combination of the Global Positioning System (GPS) and an Inertial Navigation System (INS) cannot sustain autonomous flight in situations where GPS is degraded or unavailable, using computer vision as a primary method for UAV guidance has been widely explored. Moreover, GPS does not provide the robot with any information on the presence of obstacles.
Stereo cameras have a complex architecture and need a minimum baseline to generate a disparity map. By contrast, monocular cameras are simple and require fewer hardware resources. Benefiting from state-of-the-art Deep Learning (DL) techniques, especially Convolutional Neural Networks (CNNs), a monocular camera is sufficient to estimate mid-level visual representations, such as depth maps and optical flow (OF) maps, of the environment. Therefore, the objective of this thesis is to develop a real-time visual guidance method for UAVs in cluttered environments using a monocular camera and DL.
The three major tasks performed in this thesis are investigating the development of DL techniques for monocular depth estimation (MDE), developing real-time CNNs for MDE, and developing visual guidance methods on the basis of the developed MDE system. A comprehensive survey is conducted, covering Structure from Motion (SfM)-based methods, traditional handcrafted feature-based methods, and state-of-the-art DL-based methods; more importantly, it also investigates the application of MDE in robotics. Based on the survey, two CNNs for MDE are developed. In addition to promising accuracy, these two CNNs run at high frame rates (126 fps and 90 fps respectively) on a single modest-power Graphics Processing Unit (GPU).
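For illustration, a toy encoder-decoder depth network in PyTorch shows the general shape of CNN-based MDE. The layer sizes here are placeholder assumptions; the thesis's real-time networks will differ in architecture and capacity.

# Illustrative sketch only: a compact encoder-decoder for monocular
# depth estimation. All layer sizes are assumptions chosen for brevity.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder halves the resolution twice while widening channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder upsamples back to the input resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
            nn.Softplus(),  # depth must be positive
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(rgb))  # (B, 1, H, W) depth map

depth = TinyDepthNet()(torch.randn(1, 3, 192, 640))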
Regarding the third task, visual guidance for UAVs is first developed on top of the designed MDE networks. To improve the robustness of UAV guidance, OF maps are integrated into the developed visual guidance method. A cross-attention module is applied to fuse the features learned from the depth maps and OF maps. The fused features are then passed through a deep reinforcement learning (DRL) network to generate the policy for guiding the flight of the UAV. Additionally, a simulation framework is developed that integrates AirSim, Unreal Engine and PyTorch. The effectiveness of the developed visual guidance method is validated through extensive experiments in this simulation framework.
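A minimal sketch of cross-attention fusion between depth-map and optical-flow feature streams, assuming token-shaped feature maps and a hypothetical embedding dimension; the thesis's actual module may be structured differently.

# Illustrative sketch only: depth features attend over optical-flow
# features and the attended context is added back onto the depth
# stream. Dimensions are placeholder assumptions.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, depth_feats, flow_feats):
        # Query = depth tokens; key/value = flow tokens.
        ctx, _ = self.attn(depth_feats, flow_feats, flow_feats)
        return depth_feats + ctx  # residual fusion of the two streams

fused = CrossAttentionFusion()(torch.randn(1, 50, 128),
                               torch.randn(1, 50, 128))
# In a pipeline like the one described above, `fused` would feed the
# DRL policy network that produces flight commands.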
Transport 2040: Impact of Technology on Seafarers - The Future of Work
Dense Visual Simultaneous Localisation and Mapping in Collaborative and Outdoor Scenarios
Dense visual simultaneous localisation and mapping (SLAM) systems can produce 3D reconstructions that are digital facsimiles of the physical space they describe. Systems that can produce dense maps with this level of fidelity in real time provide foundational spatial reasoning capabilities for many downstream tasks in autonomous robotics. Over the past 15 years, mapping small-scale, indoor environments, such as desks and buildings, with a single slow-moving, hand-held sensor has been one of the central focuses of dense visual SLAM research.
However, most dense visual SLAM systems exhibit a number of limitations that mean they cannot be directly applied in collaborative or outdoor settings. The contribution of this thesis is to address these limitations through the development of new systems and algorithms for collaborative dense mapping, efficient dense alternation, and outdoor operation with fast camera motion and wide field-of-view (FOV) cameras. We use ElasticFusion, a state-of-the-art dense SLAM system, as our starting point, with each of these contributions implemented as a novel extension to that system.
We first present a collaborative dense SLAM system that allows a number of cameras, starting with unknown initial relative positions, to maintain local maps with the original ElasticFusion algorithm. Visual place recognition across local maps yields constraints that allow the maps to be aligned into a common global reference frame, facilitating collaborative mapping and the tracking of multiple cameras within a shared map.
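A minimal sketch of how a single place-recognition constraint can bring one local map into another's reference frame, assuming 4x4 homogeneous pose matrices; the function and variable names are hypothetical, not taken from the thesis.

# Illustrative sketch only: place recognition links keyframe i in map A
# to keyframe j in map B with a relative pose T_ij (pose of j expressed
# in i's frame). That single constraint fixes the map-B-to-map-A
# transform, letting map B's geometry be re-expressed in A's frame.
import numpy as np

def align_map_b(T_A_i, T_B_j, T_ij, points_B):
    """T_A_i: pose of keyframe i in map A; T_B_j: pose of keyframe j
    in map B; points_B: (N, 3) map-B points. All transforms are 4x4."""
    # T_A_j = T_A_i @ T_ij and T_A_j = T_A_B @ T_B_j, hence:
    T_A_B = T_A_i @ T_ij @ np.linalg.inv(T_B_j)
    pts_h = np.c_[points_B, np.ones(len(points_B))]  # homogeneous coords
    return (T_A_B @ pts_h.T).T[:, :3]                # map B points in A's frame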
Within dense alternation-based SLAM systems, the standard approach is to fuse every frame into the dense model without considering whether the information contained within the frame is already captured by the dense map and is therefore redundant. As the number of cameras or the scale of the map increases, this approach becomes inefficient. In our second contribution, we address this inefficiency by introducing a novel information-theoretic approach to keyframe selection that allows the system to avoid processing redundant information. We implement the procedure within ElasticFusion, demonstrating a marked reduction in the number of frames required by the system to estimate an accurate, denoised surface reconstruction.
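As a hedged sketch of information-theoretic frame selection, the following approximates per-pixel information gain under a Gaussian depth-fusion model and fuses a frame only when the mean gain clears a threshold. The gain measure and threshold are assumptions for illustration, not the thesis's exact criterion.

# Illustrative sketch only: skip frames whose estimated information
# gain over the current dense model is small. Variances are per-pixel
# and strictly positive; the 0.1 threshold is a placeholder.
import numpy as np

def should_fuse(pred_var: np.ndarray, meas_var: np.ndarray,
                min_gain: float = 0.1) -> bool:
    """pred_var: variance of the model-rendered depth per pixel;
    meas_var: variance of the incoming depth measurement per pixel."""
    # Posterior variance after fusing two Gaussian estimates.
    post_var = 1.0 / (1.0 / pred_var + 1.0 / meas_var)
    # Entropy reduction of a Gaussian: 0.5 * log(prior var / posterior var).
    gain = 0.5 * np.log(pred_var / post_var)
    return float(np.mean(gain)) > min_gain  # fuse only informative frames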
Before dense SLAM techniques can be applied in outdoor scenarios, we must first address their reliance on active depth cameras and their lack of suitability for fast camera motion. In our third contribution, we present an outdoor dense SLAM system. The system overcomes the need for an active sensor by employing neural-network-based depth inference to predict the geometry of the scene as it appears in each image. To address the issue of camera tracking during fast motion, we employ a hybrid architecture, combining elements of both dense and sparse SLAM systems to perform camera tracking and to achieve globally consistent dense mapping.
Automotive applications present a particularly important setting for dense visual SLAM systems. Such applications are characterised by their use of wide-FOV cameras and are therefore not accurately modelled by the standard pinhole camera model. The fourth contribution of this thesis is to extend the above hybrid sparse-dense monocular SLAM system to cater for large-FOV fisheye imagery. This is achieved by reformulating the mapping pipeline in terms of the Kannala-Brandt fisheye camera model. To estimate depth, we introduce a new version of the PackNet depth estimation neural network (Guizilini et al., 2020) adapted for fisheye inputs.
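A minimal sketch of Kannala-Brandt projection, the fisheye model the reformulated pipeline adopts: the radial angle theta is mapped through an odd-order polynomial instead of the pinhole tan(theta). The intrinsics and distortion coefficients below are placeholder values, not calibration results from the thesis.

# Illustrative sketch only: project a 3D point with the Kannala-Brandt
# fisheye model. fx, fy, cx, cy and k1..k4 are placeholder values.
import numpy as np

def kb_project(p, fx, fy, cx, cy, k):
    x, y, z = p
    r = np.hypot(x, y)
    theta = np.arctan2(r, z)                       # angle from the optical axis
    # Odd-order polynomial d(theta) replaces the pinhole tan(theta),
    # which is what lets the model handle FOVs beyond 180 degrees.
    d = theta * (1 + k[0]*theta**2 + k[1]*theta**4
                   + k[2]*theta**6 + k[3]*theta**8)
    u = fx * d * (x / r) + cx if r > 0 else cx
    v = fy * d * (y / r) + cy if r > 0 else cy
    return u, v

print(kb_project((0.5, 0.2, 1.0), 280.0, 280.0, 320.0, 240.0,
                 (-0.01, 0.02, -0.005, 0.001)))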
To demonstrate the effectiveness of our contributions, we present experimental results computed by processing the synthetic ICL-NUIM dataset of Handa et al. (2014) as well as the real-world TUM-RGBD dataset of Sturm et al. (2012). For outdoor SLAM, we show the results of our system processing the autonomous driving KITTI and KITTI-360 datasets of Geiger et al. (2012a) and Liao et al. (2021) respectively.