16,456 research outputs found
Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control
This paper provides an overview of the current state-of-the-art in selective
harvesting robots (SHRs) and their potential for addressing the challenges of
global food production. SHRs have the potential to increase productivity,
reduce labour costs, and minimise food waste by selectively harvesting only
ripe fruits and vegetables. The paper discusses the main components of SHRs,
including perception, grasping, cutting, motion planning, and control. It also
highlights the challenges in developing SHR technologies, particularly in the
areas of robot design, motion planning and control. The paper also discusses
the potential benefits of integrating AI, soft robots, and data-driven
methods to enhance the performance and robustness of SHR systems. Finally, the
paper identifies several open research questions in the field and highlights
the need for further research and development efforts to advance SHR
technologies to meet the challenges of global food production. Overall, this
paper provides a starting point for researchers and practitioners interested in
developing SHRs and highlights the need for more research in this field.
Comment: Preprint, to appear in Journal of Field Robotics
Security and Privacy Problems in Voice Assistant Applications: A Survey
Voice assistant applications have become ubiquitous nowadays. Two models that
provide the two most important functions for real-life applications (e.g.,
Google Home, Amazon Alexa, Siri) are Automatic Speech Recognition (ASR)
models and Speaker Identification (SI) models. According to recent studies,
security and privacy threats have also emerged with the rapid development of
the Internet of Things (IoT). The security issues researched include attack
techniques toward machine learning models and other hardware components widely
used in voice assistant applications. The privacy issues include technical
information stealing and policy-level privacy breaches. Voice assistant
applications take a steadily growing market share every year, but their privacy
and security issues continue to cause huge economic losses and endanger
users' sensitive personal information. Thus, it is important to have a
comprehensive survey to outline the categorization of the current research
regarding the security and privacy problems of voice assistant applications.
This paper summarizes and assesses five kinds of security attacks and three
types of privacy threats in papers published in the top-tier conferences of
the cyber security and voice domains.
Comment: 5 figures
Testing the nomological network for the Personal Engagement Model
The study of employee engagement has been a key focus of management for over three decades. The academic literature on engagement has generated multiple definitions, but there are two primary models of engagement: the Personal Engagement Model (PEM) of Kahn (1990) and the Work Engagement Model (WEM) of Schaufeli et al. (2002). While the former is cited by most authors as the seminal work on engagement, research has tended to focus on individual elements of the model, and most theoretical work on engagement has used the WEM to consider the topic.
The purpose of this study was to test all the elements of the nomological network of the PEM to determine whether the complete model of personal engagement is viable. This was done using data from a large, complex public sector workforce. Survey questions were designed to test each element of the PEM and administered to a sample of the workforce (n = 3,103). The scales were tested and refined using confirmatory factor analysis, and then the model was tested to determine the structure of the nomological network. This was validated, and the generalisability of the final model was tested across different work and organisational types.
The results showed that the PEM is viable but there were differences from what was originally proposed by Kahn (1990). Specifically, of the three psychological conditions deemed necessary for engagement to occur, meaningfulness, safety, and availability, only meaningfulness was found to contribute to employee engagement. The model demonstrated that employees experience meaningfulness through both the nature of the work that they do and the organisation within which they do their work. Finally, the findings were replicated across employees in different work types and different organisational types.
This thesis makes five contributions to the engagement paradigm. It advances engagement theory by testing the PEM and showing that it is an adequate representation of engagement. A model for testing the causal mechanism for engagement has been articulated, demonstrating that meaningfulness in work is a primary mechanism for engagement. The research has shown the key aspects of the workplace in which employees experience meaningfulness: the nature of the work that they do and the organisation within which they do it. It has demonstrated that this is consistent across organisations and types of work. Finally, it has developed a reliable measure of the different elements of the PEM, which will support future research in this area.
CoRe-Sleep: A Multimodal Fusion Framework for Time Series Robust to Imperfect Modalities
Sleep abnormalities can have severe health consequences. Automated sleep
staging, i.e. labelling the sequence of sleep stages from the patient's
physiological recordings, could simplify the diagnostic process. Previous work
on automated sleep staging has achieved great results, mainly relying on the
EEG signal. However, often multiple sources of information are available beyond
EEG. This can be particularly beneficial when the EEG recordings are noisy or
even missing completely. In this paper, we propose CoRe-Sleep, a Coordinated
Representation multimodal fusion network that is particularly focused on
improving the robustness of signal analysis on imperfect data. We demonstrate
how appropriately handling multimodal information can be the key to achieving
such robustness. CoRe-Sleep tolerates noisy or missing modality segments,
allowing training on incomplete data. Additionally, it shows state-of-the-art
performance when testing on both multimodal and unimodal data using a single
model on SHHS-1, the largest publicly available study that includes sleep stage
labels. The results indicate that training the model on multimodal data does
positively influence performance when tested on unimodal data. This work aims
at bridging the gap between automated analysis tools and their clinical
utility.
Comment: 10 pages, 4 figures, 2 tables, journal
Concept Graph Neural Networks for Surgical Video Understanding
We constantly integrate our knowledge and understanding of the world to
enhance our interpretation of what we see.
This ability is crucial in application domains which entail reasoning about
multiple entities and concepts, such as AI-augmented surgery. In this paper, we
propose a novel way of integrating conceptual knowledge into temporal analysis
tasks via temporal concept graph networks. In the proposed networks, a global
knowledge graph is incorporated into the temporal analysis of surgical
instances, learning the meaning of concepts and relations as they apply to the
data. We demonstrate our results on surgical video data for tasks such as
verification of the critical view of safety and estimation of the Parkland
grading scale. The results show that our method improves recognition and
detection on complex benchmarks and enables other analytic applications
of interest.
Loop Closure Detection Based on Object-level Spatial Layout and Semantic Consistency
Visual simultaneous localization and mapping (SLAM) systems face challenges
in detecting loop closure under large viewpoint changes. In
this paper, we present an object-based loop closure detection method based on
the spatial layout and semantic consistency of the 3D scene graph. Firstly, we
propose an object-level data association approach based on the semantic
information from semantic labels, intersection over union (IoU), object color,
and object embedding. Subsequently, multi-view bundle adjustment with the
associated objects is utilized to jointly optimize the poses of objects and
cameras. We represent the refined objects as a 3D spatial graph with semantics
and topology. Then, we propose a graph matching approach to select
corresponding objects based on the structural layout and semantic property
similarity of vertices' neighbors. Finally, we jointly optimize camera
trajectories and object poses in an object-level pose graph optimization, which
results in a globally consistent map. Experimental results demonstrate that our
proposed data association approach can construct more accurate 3D semantic
maps, and our loop closure method is more robust than point-based and
object-based methods in circumstances with large viewpoint changes.
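The abstract above lists intersection over union (IoU) as one of the cues for object-level data association. As a minimal sketch of that quantity only, assuming axis-aligned 2D boxes given as `(x_min, y_min, x_max, y_max)` (the function name `box_iou` and the box format are illustrative assumptions, not taken from the paper, which may use 3D or projected boxes):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes, each (x_min, y_min, x_max, y_max).

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes have no meaningful overlap ratio.
    return inter / union if union > 0 else 0.0
```

A data association step would typically combine such a score with the other cues named in the abstract (semantic labels, color, embeddings); how they are weighted is not stated in the abstract.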
CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition
We present CrossLoc3D, a novel 3D place recognition method that solves a
large-scale point matching problem in a cross-source setting. Cross-source
point cloud data corresponds to point sets captured by depth sensors with
different accuracies or from different distances and perspectives. We address
the challenges in terms of developing 3D place recognition methods that account
for the representation gap between points captured by different sources. Our
method handles cross-source data by utilizing multi-grained features and
selecting convolution kernel sizes that correspond to the most prominent features.
Inspired by diffusion models, our method uses a novel iterative refinement
process that gradually shifts the embedding spaces from different sources to a
single canonical space for better metric learning. In addition, we present
CS-Campus3D, the first 3D aerial-ground cross-source dataset consisting of
point cloud data from both aerial and ground LiDAR scans. The point clouds in
CS-Campus3D have representation gaps and other features like different views,
point densities, and noise patterns. We show that our CrossLoc3D algorithm
achieves an improvement of 4.74% to 15.37% in top-1 average recall
on our CS-Campus3D benchmark and performance comparable to
state-of-the-art 3D place recognition methods on the Oxford RobotCar dataset. We will
release the code and the CS-Campus3D benchmark.
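The top-1 recall metric reported above can be illustrated with a small sketch. It assumes each place is represented by an embedding vector, that retrieval uses Euclidean nearest-neighbor search, and that ground-truth matches are given per query as a set of database indices; the name `top1_recall` and these conventions are assumptions for illustration, not the paper's evaluation code.

```python
import math


def top1_recall(query_embs, db_embs, true_matches):
    """Fraction of queries whose nearest database entry is a true match.

    query_embs:   list of query embedding vectors
    db_embs:      list of database embedding vectors
    true_matches: list of sets; true_matches[i] holds the database indices
                  that count as correct for query i (assumed format)
    """
    hits = 0
    for q_idx, q in enumerate(query_embs):
        # Index of the database embedding closest to this query.
        nearest = min(range(len(db_embs)),
                      key=lambda i: math.dist(q, db_embs[i]))
        if nearest in true_matches[q_idx]:
            hits += 1
    return hits / len(query_embs)
```

An "average" recall would then be the mean of this value over several query/database splits, under the same assumptions.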
Learning Spiking Neural Systems with the Event-Driven Forward-Forward Process
We develop a novel credit assignment algorithm for information processing
with spiking neurons without requiring feedback synapses. Specifically, we
propose an event-driven generalization of the forward-forward and the
predictive forward-forward learning processes for a spiking neural system that
iteratively processes sensory input over a stimulus window. As a result, the
recurrent circuit computes the membrane potential of each neuron in each layer
as a function of local bottom-up, top-down, and lateral signals, facilitating a
dynamic, layer-wise parallel form of neural computation. Unlike spiking neural
coding, which relies on feedback synapses to adjust neural electrical activity,
our model operates purely online and forward in time, offering a promising way
to learn distributed representations of sensory data patterns with temporal
spike signals. Notably, our experimental results on several pattern datasets
demonstrate that the event-driven forward-forward (ED-FF) framework works well
for training a dynamic recurrent spiking system capable of both classification
and reconstruction.
Deep Transfer Learning Applications in Intrusion Detection Systems: A Comprehensive Review
Globally, contemporary industrial control systems are increasingly being
connected to the external Internet. As a result, there is an immediate need
to protect the network from a range of threats. The key infrastructure of
industrial activity may be protected from harm by using an intrusion detection
system (IDS), a preventive mechanism, to recognize new kinds of
dangerous threats and hostile activities. The most recent artificial
intelligence (AI) techniques used to create IDS in many kinds of industrial
control networks are examined in this study, with a particular emphasis on
IDS-based deep transfer learning (DTL). The latter can be seen as a type of
information fusion that merges and/or adapts knowledge from multiple domains to
enhance the performance of the target task, particularly when the labeled data
in the target domain is scarce. Publications issued after 2015 were taken into
account. The selected publications were divided into three categories:
DTL-only and IDS-only papers are covered in the introduction and background, and
DTL-based IDS papers form the core of this review.
Researchers will be able to have a better grasp of the current state of DTL
approaches used in IDS in many different types of networks by reading this
review paper. Other useful information, such as the datasets used, the sort of
DTL employed, the pre-trained network, IDS techniques, the evaluation metrics
including accuracy/F-score and false alarm rate (FAR), and the improvement
gained, is also covered. The algorithms and methods that are used in several
studies, or that clearly illustrate the principle of a DTL-based IDS
subcategory, are presented to the reader.
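The evaluation metrics named above (accuracy, F-score, and false alarm rate) can all be derived from a binary confusion matrix. The sketch below assumes "attack" is the positive class and uses the standard definitions; the function name `ids_metrics` and its return format are illustrative assumptions, not taken from the review.

```python
def ids_metrics(tp, fp, fn, tn):
    """Accuracy, F1-score and false alarm rate from confusion-matrix counts.

    tp: attacks flagged as attacks      fp: benign traffic flagged as attacks
    fn: attacks missed                  tn: benign traffic passed as benign
    Hypothetical helper for illustration only.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # FAR: share of benign samples that raised an alarm (false positive rate).
    far = fp / (fp + tn) if (fp + tn) else 0.0

    return {"accuracy": accuracy, "f1": f1, "far": far}
```

Reporting FAR alongside the F-score matters for intrusion detection because benign traffic usually dominates, so even a small false positive rate can translate into many alarms.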
Development of a semantic memory battery for patients with temporal lobe epilepsy
The most common focal epilepsy is that in which the epileptogenic focus is located in the medial temporal lobe and is secondary to sclerosis with atrophy of the amygdalo-hippocampal region, with an epileptogenic network that encompasses the anterior portion of the temporal lobe. Patients sometimes require surgical treatment that includes unilateral resection of both regions: the anterior pole and the amygdala-hippocampus complex. These structures have been shown to be highly important for the processing of semantic memory (anterotemporal region) and episodic memory (amygdalo-hippocampal region), so patients who undergo this intervention often report cognitive complaints related to both types of memory. However, the neuropsychological assessments routinely performed in the various Epilepsy Units do not seem able to detect all the cognitive problems that occur in these patients since, despite the difficulties the patients report, the assessments show no impairment. The main hypothesis of the present work is that these complaints stem from types of memory that are not included in current neuropsychological tests, so the patients' problems cannot be properly identified. First, it is proposed that semantic memory is affected, but only for words with a low frequency of use in daily life, which current conventional assessments do not analyse. Second, other problems that go undetected are due to a deficit in memory consolidation, measured as accelerated long-term forgetting, which is detected when the recall assessment period is extended. Moreover, these impairments are expected to be more pronounced in patients whose epileptogenic focus is located in the left temporal lobe.
The main objectives of this work are to assess, in patients with medial temporal lobe epilepsy treated surgically by anterior temporal lobectomy with amygdalohippocampectomy, the presence of impairments in verbal memory, both semantic and episodic, and to determine their lateralising value according to the affected hemisphere. The study was based on comparing patients with temporal lobe epilepsy (TLE) treated with anterior temporal lobectomy with amygdalohippocampectomy against a control group of healthy people, comparable in age, educational level and intelligence quotient (IQ). The semantic memory tests showed that only patients with left TLE had impairments, especially for low-frequency items and in both verbal expression and verbal comprehension tasks. Likewise, reaction time was longer in the left TLE group for all items, and only for low-frequency words or concepts in those with right TLE. In addition, a standard episodic memory test (RAVLT) was included in which, instead of restricting the assessment to 30 minutes, recall was assessed at 7 days in order to measure long-term forgetting. The results showed that both patient groups, those with left TLE and those with right TLE, developed long-term forgetting. Finally, the results showed that the presence of epileptic seizures did not affect the presence of accelerated long-term forgetting.