DSCA-PSPNet: Dynamic spatial-channel attention pyramid scene parsing network for sugarcane field segmentation in satellite imagery
Sugarcane plays a vital role in many global economies, and its efficient cultivation is critical for sustainable development. A central challenge in sugarcane yield prediction and cultivation management is the precise segmentation of sugarcane fields from satellite imagery. This task is complicated by numerous factors, including varying environmental conditions, scale variability, and spectral similarities between crops and non-crop elements. To address these segmentation challenges, we introduce DSCA-PSPNet, a novel deep learning model with a unique architecture that combines a modified ResNet34 backbone, the Pyramid Scene Parsing Network (PSPNet), and newly proposed Dynamic Squeeze-and-Excitation Context (D-scSE) blocks. Our model effectively adapts to discern the importance of both spatial and channel-wise information, providing superior feature representation for sugarcane fields. We have also created a comprehensive high-resolution satellite imagery dataset from Guangxi’s Fusui County, captured on December 17, 2017, which encompasses a broad spectrum of sugarcane field characteristics and environmental conditions. In comparative studies, DSCA-PSPNet outperforms other state-of-the-art models, achieving an Intersection over Union (IoU) of 87.58%, an accuracy of 92.34%, a precision of 93.80%, a recall of 93.21%, and an F1-Score of 92.38%. Application tests on an RTX 3090 GPU, with input image resolutions of 512 × 512, yielded a prediction time of 4.57ms, a parameter size of 22.57MB, GFLOPs of 11.41, and a memory size of 84.47MB. An ablation study emphasized the vital role of the D-scSE module in enhancing DSCA-PSPNet’s performance. Our contributions in dataset generation and model development open new avenues for tackling the complexities of sugarcane field segmentation, thus contributing to advances in precision agriculture. The source code and dataset will be available on the GitHub repository https://github.com/JulioYuan/DSCA-PSPNet/tree/main
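For illustration, the sketch below shows a combined spatial- and channel-attention block in the spirit of the D-scSE module described above. It follows the standard scSE design; the learnable mixing weight alpha is an assumption standing in for the paper's "dynamic" weighting, not the authors' actual implementation.

```python
# Minimal sketch of a spatial-and-channel squeeze-and-excitation block.
# The mixing weight `alpha` is an illustrative assumption, not the D-scSE design itself.
import torch
import torch.nn as nn

class SpatialChannelSE(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: global pooling -> bottleneck MLP -> sigmoid gate per channel.
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: 1x1 conv -> sigmoid gate per pixel.
        self.sse = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=1), nn.Sigmoid())
        # Hypothetical learnable balance between the two attention paths.
        self.alpha = nn.Parameter(torch.tensor(0.5))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        channel_att = x * self.cse(x)   # re-weight channels
        spatial_att = x * self.sse(x)   # re-weight spatial locations
        a = torch.sigmoid(self.alpha)   # keep the mixing weight in (0, 1)
        return a * channel_att + (1.0 - a) * spatial_att

if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)
    print(SpatialChannelSE(64)(feats).shape)  # torch.Size([1, 64, 32, 32])
```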
Simulation-based test case generation for unmanned aerial vehicles in the neighborhood of real flights
Unmanned aerial vehicles (UAVs), also known as drones, are acquiring increasing autonomy. With their commercial adoption, testing their functional and non-functional requirements, and in particular their safety requirements, has become a critical concern. Simulation-based testing is a fundamental practice, but the testing scenarios considered in software-in-the-loop testing may not be representative of the actual scenarios experienced in the field.
In this paper, we propose SURREAL (teSting Uavs in the neighboRhood of REAl fLights), a novel search-based approach that analyses logs of real UAV flights and automatically generates simulation-based tests in the neighborhood of such real flights, thereby improving the realism and representativeness of the simulation-based tests. This is done in two steps: first, SURREAL faithfully replicates the given UAV flight in the simulation environment, generating a simulation-based test that mirrors a pre-logged real-world behavior. Then, it smoothly manipulates the replicated flight conditions to discover slightly modified flight scenarios that are challenging or trigger misbehaviors of the UAV under test in simulation. In our experiments, we were able to replicate a real flight accurately in the simulation environment and to expose unstable and potentially unsafe behavior in the neighborhood of a flight, which even led to crashes.
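As a rough illustration of the neighborhood-search idea, the sketch below hill-climbs over small perturbations of a replicated flight configuration, keeping the neighbor with the largest simulated deviation. The function simulate_flight is a toy stand-in and none of the names reflect SURREAL's actual interface.

```python
# Hill-climbing over perturbed flight configurations; all functions are hypothetical stand-ins.
import random

def simulate_flight(config: dict) -> float:
    """Stand-in for a software-in-the-loop run; returns a deviation metric
    (e.g. max drift from the planned trajectory, in metres)."""
    # Toy surrogate: deviation grows with wind speed and payload mass.
    return 0.1 * config["wind_speed"] + 0.05 * config["payload_mass"]

def neighborhood_search(replicated: dict, iterations: int = 200, step: float = 0.05):
    """Keep the perturbed neighbour that maximises the observed deviation."""
    best, best_dev = dict(replicated), simulate_flight(replicated)
    for _ in range(iterations):
        candidate = dict(best)
        key = random.choice(list(candidate))
        candidate[key] *= 1.0 + random.uniform(-step, step)  # small perturbation
        dev = simulate_flight(candidate)
        if dev > best_dev:
            best, best_dev = candidate, dev
    return best, best_dev

if __name__ == "__main__":
    replicated = {"wind_speed": 3.0, "payload_mass": 0.8}  # from the replayed real flight
    print(neighborhood_search(replicated))
```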
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Dataset Distillation with Convexified Implicit Gradients
We propose a new dataset distillation algorithm using reparameterization and convexification of implicit gradients (RCIG) that substantially improves the state of the art. To this end, we first formulate dataset distillation as a bi-level optimization problem. Then, we show how implicit gradients can be used effectively to compute meta-gradient updates. We further equip the algorithm with a convexified approximation that corresponds to learning on top of a frozen finite-width neural tangent kernel. Finally, we reduce the bias in implicit gradients by parameterizing the neural network so that the final-layer parameters can be computed analytically given the body parameters. RCIG establishes a new state of the art on a diverse series of dataset distillation tasks. Notably, with one image per class on resized ImageNet, RCIG sees on average a 108% improvement over the previous state-of-the-art distillation algorithm. Similarly, we observe a 66% gain over the previous state of the art on Tiny-ImageNet and 37% on CIFAR-100.
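For intuition, the following sketch casts dataset distillation as a bi-level problem in a deliberately stripped-down form: a one-step inner update of a tiny linear model on the synthetic set, with the real-data loss backpropagated through that step to update the synthetic images and soft labels. RCIG's implicit gradients and NTK convexification are not reproduced here; all sizes and learning rates are illustrative.

```python
# Simplified bi-level dataset distillation: inner step on synthetic data,
# outer step on real data. Not RCIG; a toy instance of the formulation only.
import torch
import torch.nn.functional as F

def distill(x_real, y_real, n_syn=10, n_classes=10, steps=500, inner_lr=0.1, outer_lr=0.01):
    d = x_real.shape[1]
    x_syn = torch.randn(n_syn, d, requires_grad=True)
    y_syn = torch.randn(n_syn, n_classes, requires_grad=True)  # learnable soft labels
    opt = torch.optim.Adam([x_syn, y_syn], lr=outer_lr)
    for _ in range(steps):
        w = torch.zeros(d, n_classes, requires_grad=True)
        # Inner problem: one gradient step on the synthetic set (graph kept for meta-grads).
        inner_loss = F.cross_entropy(x_syn @ w, y_syn.softmax(dim=1))
        g = torch.autograd.grad(inner_loss, w, create_graph=True)[0]
        w_adapted = w - inner_lr * g
        # Outer problem: evaluate the adapted model on real data, update the synthetic set.
        outer_loss = F.cross_entropy(x_real @ w_adapted, y_real)
        opt.zero_grad()
        outer_loss.backward()
        opt.step()
    return x_syn.detach(), y_syn.detach()

if __name__ == "__main__":
    x = torch.randn(256, 32)
    y = torch.randint(0, 10, (256,))
    print(distill(x, y)[0].shape)  # torch.Size([10, 32])
```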
The Application of Data Analytics Technologies for the Predictive Maintenance of Industrial Facilities in Internet of Things (IoT) Environments
In industrial production environments, the maintenance of equipment has a decisive influence on costs and on the plannability of production capacities. In particular, unplanned failures during production times cause high costs, unplanned downtimes and possibly additional collateral damage. Predictive Maintenance starts here and tries to predict a possible failure and its cause so early that its prevention can be prepared and carried out in time. In order to be able to predict malfunctions and failures, the industrial plant with its characteristics, as well as wear and ageing processes, must be modelled. Such modelling can be done by replicating its physical properties. However, this is very complex and requires enormous expert knowledge about the plant and about wear and ageing processes of each individual component. Neural networks and machine learning make it possible to train such models using data and offer an alternative, especially when very complex and non-linear behaviour is evident.
For models to make predictions, as much data as possible is needed about the condition of a plant and its environment, together with production planning data. In Industrial Internet of Things (IIoT) environments, the amount of available data is constantly increasing. Intelligent sensors and highly interconnected production facilities produce a steady stream of data. The sheer volume of data, but also the steady stream in which data is transmitted, place high demands on the data processing systems. If a participating system wants to perform live analyses on the incoming data streams, it must be able to process the incoming data at least as fast as the continuous data stream delivers it. If this is not the case, the system falls further and further behind in processing and thus in its analyses. This also applies to Predictive Maintenance systems, especially if they use complex and computationally intensive machine learning models. If sufficiently scalable hardware resources are available, this may not be a problem at first. However, if this is not the case or if the processing takes place on decentralised units with limited hardware resources (e.g. edge devices), the runtime behaviour and resource requirements of the type of neural network used can become an important criterion.
This thesis addresses Predictive Maintenance systems in IIoT environments using neural networks and Deep Learning, where the runtime behaviour and the resource requirements are relevant. The question is whether it is possible to achieve better runtimes with similar result quality using a new type of neural network. The focus is on reducing the complexity of the network and improving its parallelisability. Inspired by projects in which complexity was distributed to less complex neural subnetworks by upstream measures, two hypotheses emerged and are presented in this thesis: a) distributing complexity into simpler subnetworks leads to faster processing overall, despite the overhead this creates, and b) if a neural cell has a deeper internal structure, this leads to a less complex network. Within the framework of a qualitative study, an overall impression of Predictive Maintenance applications in IIoT environments using neural networks was developed. Based on the findings, a novel model layout, the Sliced Long Short-Term Memory Neural Network (SlicedLSTM), was developed. The SlicedLSTM implements the assumptions made in the aforementioned hypotheses in its inner model architecture.
Within the framework of a quantitative study, the runtime behaviour of the SlicedLSTM was compared with that of a reference model in the form of laboratory tests. The study uses synthetically generated data from a NASA project to predict failures of modules of aircraft gas turbines. The dataset contains 1,414 multivariate time series with 104,897 samples of test data and 160,360 samples of training data.
As a result, it could be shown for the specific application and the data used that the SlicedLSTM delivers faster processing times with similar result accuracy, and thus clearly outperforms the reference model in this respect. The hypotheses about the influence of complexity in the internal structure of the neural cells were confirmed by the study carried out in the context of this thesis.
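As an illustration of hypothesis a), the sketch below splits a multivariate sensor window into feature slices and processes each slice with a smaller LSTM before a shared prediction head. It mirrors the described idea of distributing complexity into simpler subnetworks; it is not the thesis's actual SlicedLSTM cell, whose internal structure is not reproduced here.

```python
# Parallel small LSTMs over feature slices; an illustration of the idea, not SlicedLSTM itself.
import torch
import torch.nn as nn

class SlicedSequenceModel(nn.Module):
    def __init__(self, n_features: int, n_slices: int = 4, hidden: int = 16):
        super().__init__()
        assert n_features % n_slices == 0
        self.slice_size = n_features // n_slices
        # One small LSTM per feature slice instead of one large LSTM over all features.
        self.slices = nn.ModuleList(
            [nn.LSTM(self.slice_size, hidden, batch_first=True) for _ in range(n_slices)]
        )
        self.head = nn.Linear(hidden * n_slices, 1)  # e.g. remaining useful life

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features); each sub-LSTM sees only its slice of the features.
        parts = x.split(self.slice_size, dim=2)
        last_states = [lstm(part)[0][:, -1, :] for lstm, part in zip(self.slices, parts)]
        return self.head(torch.cat(last_states, dim=1))

if __name__ == "__main__":
    batch = torch.randn(8, 30, 24)  # 8 windows, 30 time steps, 24 sensor channels
    print(SlicedSequenceModel(n_features=24)(batch).shape)  # torch.Size([8, 1])
```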
GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization
There are plenty of graph neural network (GNN) accelerators being proposed. However, they rely heavily on users' hardware expertise and are usually optimized for one specific GNN model, making them challenging for practical use. Therefore, in this work, we propose GNNBuilder, the first automated, generic, end-to-end GNN accelerator generation framework. It features four advantages: (1) GNNBuilder can automatically generate GNN accelerators for a wide range of GNN models arbitrarily defined by users; (2) GNNBuilder takes the standard PyTorch programming interface, introducing zero overhead for algorithm developers; (3) GNNBuilder supports end-to-end code generation, simulation, accelerator optimization, and hardware deployment, enabling push-button GNN accelerator design; (4) GNNBuilder is equipped with accurate performance models of its generated accelerators, enabling fast and flexible design space exploration (DSE). In the experiments, we first show that our accelerator performance model has errors within for latency prediction and for BRAM count prediction. Second, we show that our generated accelerators can outperform CPU by and GPU by . This framework is open-source, and the code is available at https://anonymous.4open.science/r/gnn-builder-83B4/.
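To illustrate advantage (2), the sketch below defines a small GNN with the standard PyTorch interface, the kind of model such a generator framework would take as input. The dense-adjacency message passing is a self-contained stand-in; GNNBuilder's supported layer set and actual build API are not shown here.

```python
# A user-defined GNN in plain PyTorch; the build step that would consume it is not shown.
import torch
import torch.nn as nn

class TwoLayerGCN(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hidden)
        self.lin2 = nn.Linear(hidden, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Mean aggregation over neighbours via a row-normalised dense adjacency matrix.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.lin1((adj @ x) / deg))
        return self.lin2((adj @ h) / deg)

if __name__ == "__main__":
    x = torch.randn(5, 8)                   # 5 nodes, 8 input features
    adj = (torch.rand(5, 5) > 0.5).float()  # random dense adjacency
    model = TwoLayerGCN(8, 16, 3)
    print(model(x, adj).shape)              # torch.Size([5, 3])
    # A generator such as GNNBuilder would then turn `model` into accelerator code.
```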
IoT-Based Vehicle Monitoring and Driver Assistance System Framework for Safety and Smart Fleet Management
Curbing road accidents has always been one of the utmost priorities in every country. In Malaysia, the Traffic Investigation and Enforcement Department reported that the total number of road accidents increased from 373,071 to 533,875 in the last decade. One of the significant causes of road accidents is driver behaviour. However, drivers' behaviour is difficult for enforcement teams or fleet operators to regulate, especially for heavy vehicles. We proposed adopting the Internet of Things (IoT) and its emerging technologies to monitor drivers' behaviour and driving patterns and to alert drivers, in order to reduce road accidents. In this work, we proposed a lane tracking and iris detection algorithm that alerts the driver when the vehicle sways out of its lane and when the driver feels drowsy, respectively. We implemented electronic devices such as cameras, a global positioning system (GPS) module, a global system for mobile communications (GSM) module, and a microcontroller as an intelligent transportation system in the vehicle. Using the same in-vehicle camera, we implemented face recognition for person identification and recorded working duration, for authentication and operational health monitoring, respectively. With the GPS module, we monitored the vehicle's speed and issued alerts when it exceeded the permissible limit. We integrated IoT into the system so that the fleet centre can monitor the driver's behavioural activities and raise alerts in real time through the user access portal. We validated the system successfully on Malaysian roads. The outcome of this pilot project benefits the safety of drivers, public road users, and passengers. The impact of this framework leads to new regulation by government agencies towards a merit and demerit system, real-time fleet monitoring in intelligent transportation systems, and socio-economic benefits such as cheaper health premiums. The collected big data can be used to predict drivers' behaviour in the future.
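The abstract does not detail the iris-detection algorithm; as a generic stand-in, the sketch below flags drowsiness with the common eye-aspect-ratio heuristic computed from six eye landmarks, assuming landmark extraction from the in-vehicle camera happens upstream. The threshold and frame count are illustrative.

```python
# Eye-aspect-ratio drowsiness check; a generic stand-in, not the paper's algorithm.
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    """eye: (6, 2) landmarks ordered corner, top x2, corner, bottom x2."""
    vertical = np.linalg.norm(eye[1] - eye[5]) + np.linalg.norm(eye[2] - eye[4])
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return vertical / (2.0 * horizontal)

class DrowsinessMonitor:
    def __init__(self, ear_threshold: float = 0.21, min_frames: int = 15):
        self.ear_threshold = ear_threshold
        self.min_frames = min_frames   # ~0.5 s of closed eyes at 30 fps
        self.counter = 0

    def update(self, left_eye: np.ndarray, right_eye: np.ndarray) -> bool:
        ear = (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
        self.counter = self.counter + 1 if ear < self.ear_threshold else 0
        return self.counter >= self.min_frames  # True -> raise the in-vehicle alert

if __name__ == "__main__":
    closed_eye = np.array([[0, 0], [2, 0.3], [4, 0.3], [6, 0], [4, -0.3], [2, -0.3]])
    monitor = DrowsinessMonitor(min_frames=3)
    print([monitor.update(closed_eye, closed_eye) for _ in range(4)])  # [False, False, True, True]
```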
Approximate Computing Survey, Part I: Terminology and Software & Hardware Approximation Techniques
The rapid growth of demanding applications in domains applying multimedia processing and machine learning has marked a new era for edge and cloud computing. These applications involve massive data and compute-intensive tasks, and thus typical computing paradigms in embedded systems and data centers are stressed to meet the worldwide demand for high performance. Concurrently, the landscape of the semiconductor field in the last 15 years has established power as a first-class design concern. As a result, the computing systems community is forced to find alternative design approaches to facilitate high-performance and/or power-efficient computing. Among the examined solutions, Approximate Computing has attracted ever-increasing interest, with research works applying approximations across the entire traditional computing stack, i.e., at the software, hardware, and architectural levels. Over the last decade, a plethora of approximation techniques has emerged in software (programs, frameworks, compilers, runtimes, languages), hardware (circuits, accelerators), and architectures (processors, memories). The current article is Part I of our comprehensive survey on Approximate Computing; it reviews its motivation, terminology and principles, and classifies and presents the technical details of state-of-the-art software and hardware approximation techniques.
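As a concrete example of one software-level technique from the surveyed class, the sketch below applies loop perforation to a simple reduction: it skips a fixed fraction of loop iterations and lets the sampled result stand in for the exact one, trading accuracy for execution time. The kernel and perforation rate are illustrative.

```python
# Loop perforation on a mean computation; skip factor and workload are illustrative.
def mean_exact(values):
    return sum(values) / len(values)

def mean_perforated(values, skip_factor: int = 4):
    # Execute only every `skip_factor`-th iteration; the sample mean stands in
    # for the full mean, keeping the error small for well-behaved inputs.
    sampled = values[::skip_factor]
    return sum(sampled) / len(sampled)

if __name__ == "__main__":
    data = [float(i % 97) for i in range(1_000_000)]
    exact, approx = mean_exact(data), mean_perforated(data)
    print(f"exact={exact:.3f} approx={approx:.3f} error={abs(exact - approx) / exact:.2%}")
```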
Challenges for Monocular 6D Object Pose Estimation in Robotics
Object pose estimation is a core perception task that enables, for example, object grasping and scene understanding. The widely available, inexpensive and high-resolution RGB sensors and CNNs that allow for fast inference based on this modality make monocular approaches especially well suited for robotics applications. We observe that previous surveys on object pose estimation establish the state of the art for varying modalities, single- and multi-view settings, and datasets and metrics that consider a multitude of applications. We argue, however, that those works' broad scope hinders the identification of open challenges that are specific to monocular approaches and the derivation of promising future challenges for their application in robotics. By providing a unified view on recent publications from both robotics and computer vision, we find that occlusion handling, novel pose representations, and formalizing and improving category-level pose estimation are still fundamental challenges that are highly relevant for robotics. Moreover, to further improve robotic performance, large object sets, novel objects, refractive materials, and uncertainty estimates are central, largely unsolved open challenges. In order to address them, ontological reasoning, deformability handling, scene-level reasoning, realistic datasets, and the ecological footprint of algorithms need to be improved.
Pyramid Semantic Graph-based Global Point Cloud Registration with Low Overlap
Global point cloud registration is essential in many robotics tasks such as loop closing and relocalization. Unfortunately, registration often suffers from low overlap between point clouds, a frequent occurrence in practical applications due to occlusion and viewpoint change. In this paper, we propose a graph-theoretic framework to address the problem of global point cloud registration with low overlap. To this end, we construct a consistency graph to facilitate robust data association and employ graduated non-convexity (GNC) for reliable pose estimation, following state-of-the-art (SoTA) methods. Unlike previous approaches, we use semantic cues to scale down the dense point clouds, thus reducing the problem size. Moreover, we address the ambiguity arising from the consistency threshold by constructing a pyramid graph with multi-level consistency thresholds. We then propose a cascaded gradient ascent method to solve the resulting densest clique problem and obtain multiple pose candidates for every consistency threshold. Finally, fast geometric verification is employed to select the optimal estimate from the multiple pose candidates. Our experiments, conducted on a self-collected indoor dataset and the public KITTI dataset, demonstrate that our method achieves the highest success rate despite the low overlap of point clouds and low semantic quality. We have open-sourced our code for this project at https://github.com/HKUST-Aerial-Robotics/Pagor.
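For illustration, the sketch below builds the kind of consistency graph described above: each putative correspondence is a node, and two nodes are connected when the pairwise distances they imply in the source and target clouds agree within a threshold, with several thresholds forming the pyramid of consistency levels. The cascaded densest-clique solver and GNC pose estimation are not reproduced here.

```python
# Consistency-graph construction over putative correspondences, at several thresholds.
import numpy as np

def consistency_graph(src_pts: np.ndarray, dst_pts: np.ndarray, threshold: float) -> np.ndarray:
    """src_pts, dst_pts: (N, 3) matched points; returns a boolean (N, N) adjacency matrix."""
    d_src = np.linalg.norm(src_pts[:, None, :] - src_pts[None, :, :], axis=2)
    d_dst = np.linalg.norm(dst_pts[:, None, :] - dst_pts[None, :, :], axis=2)
    adj = np.abs(d_src - d_dst) < threshold   # rigid motions preserve pairwise distances
    np.fill_diagonal(adj, False)
    return adj

def pyramid_graphs(src_pts, dst_pts, thresholds=(0.1, 0.3, 0.9)):
    """One consistency graph per threshold level, from strict to permissive."""
    return {t: consistency_graph(src_pts, dst_pts, t) for t in thresholds}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(20, 3))
    dst = src + rng.normal(scale=0.02, size=(20, 3))   # near-rigid (identity) alignment
    dst[15:] = rng.normal(size=(5, 3))                  # 5 outlier correspondences
    for t, adj in pyramid_graphs(src, dst).items():
        print(t, int(adj.sum()) // 2, "consistent pairs")
```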