Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy
With the advent of agriculture 3.0 and 4.0, researchers are increasingly
focusing on the development of innovative smart farming and precision
agriculture technologies by introducing automation and robotics into the
agricultural processes. Autonomous agricultural field machines have been
gaining significant attention from farmers and industries to reduce costs,
human workload, and required resources. Nevertheless, achieving sufficient
autonomous navigation capabilities requires the simultaneous cooperation of
different processes: localization, mapping, and path planning are just some of
the steps that aim to provide the machine with the right set of skills to
operate in semi-structured and unstructured environments. In this context, this
study presents a low-cost local motion planner for autonomous navigation in
vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer
control algorithm. The first algorithm exploits the disparity map and its depth
representation to generate a proportional control for the robotic platform.
Concurrently, a second back-up algorithm, based on representation learning and
resilient to illumination variations, can take control of the machine in case
of a momentary failure of the first block. Moreover, owing to the dual
nature of the system, after initial training of the deep learning model with an
initial dataset, the close synergy between the two algorithms opens the
possibility of exploiting new automatically labeled data, coming from the
field, to extend the model's existing knowledge. The machine learning algorithm
has been trained and tested, using transfer learning, with images acquired
during different field surveys in northern Italy and then optimized
for on-device inference with model pruning and quantization. Finally, the
overall system has been validated with a customized robot platform in the
relevant environment.
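The abstract describes the depth-based control layer only at a high level. The following minimal Python sketch shows one way a proportional command could be derived from an RGB-D depth map; the left/right free-space heuristic, the gains, and the function name are assumptions for illustration, not the paper's actual control law.

```python
import numpy as np

def proportional_cmd(depth_map, k_ang=1.5, v_max=0.5, stop_dist=0.4):
    """Toy proportional controller from a depth map in meters.

    Hypothetical illustration: split the frame into left/right halves,
    steer toward the side with more free space, and slow down as the
    frontal clearance shrinks. Not the paper's exact control law.
    """
    h, w = depth_map.shape
    roi = depth_map[h // 3: 2 * h // 3, :]               # central horizontal band
    roi = np.where(np.isfinite(roi) & (roi > 0), roi, np.nan)

    left = np.nanmean(roi[:, : w // 2])                   # mean clearance, left half
    right = np.nanmean(roi[:, w // 2:])                   # mean clearance, right half
    front = np.nanmin(roi[:, w // 3: 2 * w // 3])         # closest frontal obstacle

    # Angular velocity proportional to the free-space imbalance.
    angular = k_ang * (right - left) / max(left + right, 1e-6)
    # Linear velocity proportional to frontal clearance, clipped at v_max.
    linear = 0.0 if front < stop_dist else min(v_max, v_max * front / 2.0)
    return linear, angular
```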
Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks
Convolutional Neural Networks (CNNs) have consistently achieved
state-of-the-art results in image Super-Resolution (SR), representing an
exceptional opportunity for the remote sensing field to extract further
information and knowledge from captured data. However, most of the works
published in the literature so far have focused on the Single-Image
Super-Resolution problem. At present, satellite-based remote sensing
platforms offer huge data availability with high temporal resolution and low
spatial resolution. In this context, the presented research proposes a novel
residual attention model (RAMS) that efficiently tackles the multi-image
super-resolution task, simultaneously exploiting spatial and temporal
correlations to combine multiple images. We introduce a visual feature
attention mechanism with 3D convolutions to obtain aware data fusion
and information extraction from the multiple low-resolution images, transcending
the limitations of the local receptive field of convolutional operations. Moreover,
since the multiple inputs depict the same scene, our representation learning network makes
extensive use of nested residual connections to let redundant
low-frequency signals flow through and focus the computation on the more important
high-frequency components. Extensive experimentation and evaluations against
other available solutions, for both single- and multi-image super-resolution,
have demonstrated that the proposed deep learning-based solution can be
considered state-of-the-art for Multi-Image Super-Resolution for remote sensing
applications.
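As a rough illustration of the attention-with-3D-convolutions idea, the sketch below shows a simplified residual feature-attention block in PyTorch. It is an assumption-laden stand-in, not the published RAMS architecture: the channel counts and the squeeze-style attention branch are illustrative choices.

```python
import torch
import torch.nn as nn

class FeatureAttention3D(nn.Module):
    """Simplified residual feature-attention block with 3D convolutions.

    Hypothetical sketch: a 3D conv processes the stack of low-resolution
    acquisitions (temporal axis treated as depth), a squeezed attention
    branch re-weights the fused features, and a residual connection lets
    low-frequency content flow through.
    """

    def __init__(self, channels=32, reduction=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
        )
        self.attention = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),                        # global context
            nn.Conv3d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                   # per-channel weights
        )

    def forward(self, x):                                   # x: (B, C, T, H, W)
        features = self.body(x)
        return x + features * self.attention(features)      # residual + re-weighting
```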
A Cost-Effective Person-Following System for Assistive Unmanned Vehicles with Deep Learning at the Edge
The vital statistics of the last century highlight a sharp increase in the
average age of the world population, with a consequent growth in the number of
older people. Service robotics applications have the potential to provide
systems and tools to support autonomous and self-sufficient older adults in
their homes in everyday life, thereby avoiding the need to have them monitored
by third parties. In this context, we propose a cost-effective modular
solution to detect and follow a person in an indoor, domestic environment. We
exploited the latest advancements in deep learning optimization techniques, and
we compared different neural network accelerators to provide a robust and
flexible person-following system at the edge. Our proposed cost-effective and
power-efficient solution is fully integrable with pre-existing navigation
stacks and lays the foundation for the development of fully autonomous and
self-contained service robotics applications.
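As a hypothetical illustration of how a detector output could drive the following behavior, the sketch below converts a person bounding box into velocity commands; the gains, the target size ratio, and the function name are assumptions, not the paper's implementation.

```python
def follow_cmd(bbox, frame_shape, target_area_ratio=0.15,
               k_yaw=1.2, k_lin=0.8):
    """Toy person-following controller driven by a detector bounding box.

    Hypothetical illustration: keep the detected person centred in the
    image (yaw command) and at a roughly constant apparent size
    (forward speed). bbox = (x_min, y_min, x_max, y_max) in pixels;
    frame_shape = (height, width).
    """
    h, w = frame_shape
    x_min, y_min, x_max, y_max = bbox

    # Horizontal offset of the box centre, normalized to [-1, 1].
    center_offset = ((x_min + x_max) / 2 - w / 2) / (w / 2)
    # Fraction of the frame occupied by the person.
    area_ratio = (x_max - x_min) * (y_max - y_min) / (w * h)

    yaw_rate = -k_yaw * center_offset                   # turn toward the person
    linear = k_lin * (target_area_ratio - area_ratio)   # approach or back off
    return max(linear, 0.0), yaw_rate
```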
Local Planners with Deep Reinforcement Learning for Indoor Autonomous Navigation
Autonomous indoor navigation requires an elaborate and accurate algorithmic stack, able to guide robots through cluttered, unstructured, and dynamic environments. Global and local path planning, mapping, localization, and decision making are only some of the required layers that undergo heavy research from the scientific community to achieve the requirements for fully functional autonomous navigation. In recent years, Deep Reinforcement Learning (DRL) has proven to be a competitive short-range guidance solution for power-efficient, low-computational-cost point-to-point local planners. One of the main strengths of this approach is the possibility to train a DRL agent in a simulated environment that encapsulates robot dynamics and task constraints and then deploy its learned point-to-point navigation policy in a real setting. However, although DRL easily integrates complex mechanical dynamics and multimodal signals into a single model, the effect of different sensor data on navigation performance has not been investigated yet. In this paper, we compare two different DRL navigation solutions that leverage LiDAR and depth camera information, respectively. The agents are trained in the same simulated environment and tested on a common benchmark to highlight the strengths and criticalities of each technique.
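The abstract compares LiDAR- and depth-based agents only at a high level. The hypothetical PyTorch fragment below shows how the two sensor modalities could feed the same point-to-point policy head, so that only the observation encoder changes between the two solutions; network sizes, beam counts, and the goal encoding are assumptions.

```python
import torch
import torch.nn as nn

class LidarEncoder(nn.Module):
    """Encodes a 1D LiDAR range scan into a feature vector (assumed 180 beams)."""
    def __init__(self, n_beams=180, out_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_beams, 128), nn.ReLU(),
                                 nn.Linear(128, out_dim), nn.ReLU())
    def forward(self, scan):               # scan: (B, n_beams)
        return self.net(scan)

class DepthEncoder(nn.Module):
    """Encodes a downsampled depth image into a feature vector (assumed 64x64)."""
    def __init__(self, out_dim=64):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
                                  nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                                  nn.Flatten())
        self.fc = nn.Sequential(nn.Linear(32 * 14 * 14, out_dim), nn.ReLU())
    def forward(self, depth):              # depth: (B, 1, 64, 64)
        return self.fc(self.conv(depth))

class PointToPointPolicy(nn.Module):
    """Actor head shared by both encoders: outputs (linear, angular) commands."""
    def __init__(self, encoder, goal_dim=2):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.Linear(64 + goal_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 2), nn.Tanh())
    def forward(self, obs, goal):          # goal: relative (distance, heading)
        return self.head(torch.cat([self.encoder(obs), goal], dim=-1))
```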
UAV and Machine Learning Based Refinement of a Satellite-Driven Vegetation Index for Precision Agriculture
Precision agriculture is considered a fundamental approach to pursuing
low-input, high-efficiency, and sustainable agriculture through
site-specific management practices. To achieve this objective, a
reliable and up-to-date description of the local status of crops is required.
Remote sensing, and in particular satellite-based imagery, has proved to be a
valuable tool in crop mapping, monitoring, and disease assessment. However,
freely available satellite imagery with low or moderate resolution shows some
limits in specific agricultural applications, e.g., where crops are grown in
rows. Indeed, in this framework, the satellite's output could be biased by
intra-row covering, giving inaccurate information about crop status. This paper
presents a novel satellite imagery refinement framework, based on a deep
learning technique which exploits information properly derived from
high-resolution images acquired by unmanned aerial vehicle (UAV) airborne
multispectral sensors. To train the convolutional neural network, only a single
UAV-driven dataset is required, making the proposed approach simple and
cost-effective. A vineyard in Serralunga d'Alba (Northern Italy) was chosen as
a case study for validation purposes. Refined satellite-driven normalized
difference vegetation index (NDVI) maps, acquired in four different periods
during the vine growing season, were shown by correlation analysis and ANOVA
to describe crop status better than the raw datasets. In addition, 3-class
vineyard vigor maps, a valuable tool for growers, were profitably derived
from the NDVI maps using a K-means-based classifier.
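For reference, NDVI is the standard normalized difference of the near-infrared and red bands, and the 3-class vigor maps can be obtained by clustering NDVI values. The sketch below illustrates both steps with hypothetical helper names; the clustering setup is an assumption, not the paper's exact pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def ndvi(nir, red, eps=1e-6):
    """NDVI = (NIR - Red) / (NIR + Red), computed per pixel."""
    return (nir - red) / (nir + red + eps)

def vigor_map(ndvi_map, n_classes=3):
    """Cluster valid NDVI pixels into n_classes vigor levels with K-means."""
    valid = np.isfinite(ndvi_map)
    values = ndvi_map[valid].reshape(-1, 1)
    labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(values)
    # Relabel clusters so that 0 = lowest vigor and n_classes-1 = highest.
    order = np.argsort([values[labels == k].mean() for k in range(n_classes)])
    remap = np.zeros(n_classes, dtype=int)
    remap[order] = np.arange(n_classes)
    out = np.full(ndvi_map.shape, -1, dtype=int)   # -1 marks invalid pixels
    out[valid] = remap[labels]
    return out
```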
Exploring Subgroup Performance In End-to-End Speech Models
End-to-End Spoken Language Understanding models are generally evaluated according to their overall accuracy, or separately on (a priori defined) data subgroups of interest. We propose a technique for analyzing model performance at the subgroup level, which considers all subgroups that can be defined via a given set of metadata and are above a specified minimum size. The metadata can represent user characteristics, recording conditions, and speech targets. Our technique is based on advances in model bias analysis, enabling efficient exploration of the resulting subgroups. A fine-grained analysis reveals how model performance varies across subgroups, identifying modeling issues or bias towards specific subgroups. We compare the subgroup-level performance of models based on wav2vec 2.0 and HuBERT on the Fluent Speech Commands dataset. The experimental results illustrate how subgroup-level analysis reveals a finer and more complete picture of performance changes when models are replaced, automatically identifying the subgroups that most benefit or fail to benefit from the change.
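As a hypothetical illustration of the kind of subgroup exploration described, the sketch below enumerates all metadata-value combinations up to a given order, discards subgroups below a minimum size, and reports per-subgroup accuracy. The column names, the boolean `correct` flag, and the pandas-based layout are assumptions, not the authors' implementation.

```python
from itertools import combinations
import pandas as pd

def subgroup_accuracy(df, metadata_cols, min_size=30, max_order=2):
    """Enumerate subgroups defined by metadata combinations and score each.

    Hypothetical sketch: `df` is assumed to hold one row per utterance
    with a boolean `correct` column and one column per metadata
    attribute (e.g. speaker age, noise level, intent).
    """
    rows = []
    for order in range(1, max_order + 1):
        for cols in combinations(metadata_cols, order):
            for values, group in df.groupby(list(cols)):
                if len(group) < min_size:
                    continue                        # skip too-small subgroups
                values = values if isinstance(values, tuple) else (values,)
                rows.append({
                    "subgroup": dict(zip(cols, values)),
                    "size": len(group),
                    "accuracy": group["correct"].mean(),
                })
    return pd.DataFrame(rows).sort_values("accuracy")
```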
Machine Learning Algorithms and their Embedded Implementation for Service Robotics Applications
The abstract is available in the attachment.
Efficient-CapsNet: Capsule Network with Self-Attention Routing
Deep convolutional neural networks, assisted by architectural design
strategies, make extensive use of data augmentation techniques and layers with
a high number of feature maps to embed object transformations. That is highly
inefficient and, for large datasets, implies a massive redundancy of feature
detectors. Even though capsule networks are still in their infancy, they
constitute a promising solution to extend current convolutional networks and
endow artificial visual perception with a process to encode all feature affine
transformations more efficiently. Indeed, a properly working capsule network
should theoretically achieve better results with a considerably lower parameter
count, owing to its intrinsic capability to generalize to novel viewpoints.
Nevertheless, little attention has been given to this relevant aspect. In this
paper, we investigate the efficiency of capsule networks and, pushing their
capacity to the limits with an extreme architecture with barely 160K
parameters, we prove that the proposed architecture is still able to achieve
state-of-the-art results on three different datasets with only 2% of the
original CapsNet parameters. Moreover, we replace dynamic routing with a novel
non-iterative, highly parallelizable routing algorithm that can easily cope
with a reduced number of capsules. Extensive experimentation with other capsule
implementations has proved the effectiveness of our methodology and the
capability of capsule networks to efficiently embed visual representations that
are more prone to generalization.
Comment: Accepted by Scientific Reports.
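The non-iterative routing is only named in the abstract. The fragment below is a minimal, hypothetical sketch of a self-attention-style routing step between capsule layers, meant to illustrate the idea of computing coupling coefficients in a single pass rather than reproduce the paper's exact algorithm.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    """Standard capsule squashing non-linearity."""
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

def self_attention_routing(u_hat):
    """Non-iterative, self-attention-style routing (simplified sketch).

    u_hat: prediction vectors of shape (B, N_in, N_out, D), i.e. what each
    input capsule predicts for each output capsule. Coupling coefficients
    come from the agreement among predictions, computed in a single pass.
    """
    d = u_hat.size(-1)
    # Pairwise agreement among input-capsule predictions, per output capsule.
    scores = torch.einsum("bind,bjnd->bnij", u_hat, u_hat) / d ** 0.5
    # One coefficient per input capsule: softmax over its total agreement.
    coupling = F.softmax(scores.sum(dim=-1), dim=-1)         # (B, N_out, N_in)
    # Weighted sum of predictions, then squash to obtain output capsules.
    out = torch.einsum("bni,bind->bnd", coupling, u_hat)      # (B, N_out, D)
    return squash(out)
```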