15,075 research outputs found
Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control
This paper provides an overview of the current state-of-the-art in selective
harvesting robots (SHRs) and their potential for addressing the challenges of
global food production. SHRs have the potential to increase productivity,
reduce labour costs, and minimise food waste by selectively harvesting only
ripe fruits and vegetables. The paper discusses the main components of SHRs,
including perception, grasping, cutting, motion planning, and control. It also
highlights the challenges in developing SHR technologies, particularly in the
areas of robot design, motion planning and control. The paper also discusses
the potential benefits of integrating AI and soft robots and data-driven
methods to enhance the performance and robustness of SHR systems. Finally, the
paper identifies several open research questions in the field and highlights
the need for further research and development efforts to advance SHR
technologies to meet the challenges of global food production. Overall, this
paper provides a starting point for researchers and practitioners interested in
developing SHRs and highlights the need for more research in this field.Comment: Preprint: to be appeared in Journal of Field Robotic
Security and Privacy Problems in Voice Assistant Applications: A Survey
Voice assistant applications have become omniscient nowadays. Two models that
provide the two most important functions for real-life applications (i.e.,
Google Home, Amazon Alexa, Siri, etc.) are Automatic Speech Recognition (ASR)
models and Speaker Identification (SI) models. According to recent studies,
security and privacy threats have also emerged with the rapid development of
the Internet of Things (IoT). The security issues researched include attack
techniques toward machine learning models and other hardware components widely
used in voice assistant applications. The privacy issues include technical-wise
information stealing and policy-wise privacy breaches. The voice assistant
application takes a steadily growing market share every year, but their privacy
and security issues never stopped causing huge economic losses and endangering
users' personal sensitive information. Thus, it is important to have a
comprehensive survey to outline the categorization of the current research
regarding the security and privacy problems of voice assistant applications.
This paper concludes and assesses five kinds of security attacks and three
types of privacy threats in the papers published in the top-tier conferences of
cyber security and voice domain.Comment: 5 figure
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using the virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive and allows users,
scholars, and entrepreneurs to get an in-depth understanding of the Metaverse
ecosystem to find their opportunities and potentials for contribution
Recommended from our members
Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
TransFusionOdom: Interpretable Transformer-based LiDAR-Inertial Fusion Odometry Estimation
Multi-modal fusion of sensors is a commonly used approach to enhance the
performance of odometry estimation, which is also a fundamental module for
mobile robots. However, the question of \textit{how to perform fusion among
different modalities in a supervised sensor fusion odometry estimation task?}
is still one of challenging issues remains. Some simple operations, such as
element-wise summation and concatenation, are not capable of assigning adaptive
attentional weights to incorporate different modalities efficiently, which make
it difficult to achieve competitive odometry results. Recently, the Transformer
architecture has shown potential for multi-modal fusion tasks, particularly in
the domains of vision with language. In this work, we propose an end-to-end
supervised Transformer-based LiDAR-Inertial fusion framework (namely
TransFusionOdom) for odometry estimation. The multi-attention fusion module
demonstrates different fusion approaches for homogeneous and heterogeneous
modalities to address the overfitting problem that can arise from blindly
increasing the complexity of the model. Additionally, to interpret the learning
process of the Transformer-based multi-modal interactions, a general
visualization approach is introduced to illustrate the interactions between
modalities. Moreover, exhaustive ablation studies evaluate different
multi-modal fusion strategies to verify the performance of the proposed fusion
strategy. A synthetic multi-modal dataset is made public to validate the
generalization ability of the proposed fusion strategy, which also works for
other combinations of different modalities. The quantitative and qualitative
odometry evaluations on the KITTI dataset verify the proposed TransFusionOdom
could achieve superior performance compared with other related works.Comment: Submitted to IEEE Sensors Journal with some modifications. This work
has been submitted to the IEEE for possible publication. Copyright may be
transferred without notice, after which this version may no longer be
accessibl
Offline and Online Models for Learning Pairwise Relations in Data
Pairwise relations between data points are essential for numerous machine learning algorithms. Many representation learning methods consider pairwise relations to identify the latent features and patterns in the data. This thesis, investigates learning of pairwise relations from two different perspectives: offline learning and online learning.The first part of the thesis focuses on offline learning by starting with an investigation of the performance modeling of a synchronization method in concurrent programming using a Markov chain whose state transition matrix models pairwise relations between involved cores in a computer process.Then the thesis focuses on a particular pairwise distance measure, the minimax distance, and explores memory-efficient approaches to computing this distance by proposing a hierarchical representation of the data with a linear memory requirement with respect to the number of data points, from which the exact pairwise minimax distances can be derived in a memory-efficient manner. Then, a memory-efficient sampling method is proposed that follows the aforementioned hierarchical representation of the data and samples the data points in a way that the minimax distances between all data points are maximally preserved. Finally, the thesis proposes a practical non-parametric clustering of vehicle motion trajectories to annotate traffic scenarios based on transitive relations between trajectories in an embedded space.The second part of the thesis takes an online learning perspective, and starts by presenting an online learning method for identifying bottlenecks in a road network by extracting the minimax path, where bottlenecks are considered as road segments with the highest cost, e.g., in the sense of travel time. Inspired by real-world road networks, the thesis assumes a stochastic traffic environment in which the road-specific probability distribution of travel time is unknown. Therefore, it needs to learn the parameters of the probability distribution through observations by modeling the bottleneck identification task as a combinatorial semi-bandit problem. The proposed approach takes into account the prior knowledge and follows a Bayesian approach to update the parameters. Moreover, it develops a combinatorial variant of Thompson Sampling and derives an upper bound for the corresponding Bayesian regret. Furthermore, the thesis proposes an approximate algorithm to address the respective computational intractability issue.Finally, the thesis considers contextual information of road network segments by extending the proposed model to a contextual combinatorial semi-bandit framework and investigates and develops various algorithms for this contextual combinatorial setting
A Decision Support System for Economic Viability and Environmental Impact Assessment of Vertical Farms
Vertical farming (VF) is the practice of growing crops or animals using the vertical dimension via multi-tier racks or vertically inclined surfaces. In this thesis, I focus on the emerging industry of plant-specific VF. Vertical plant farming (VPF) is a promising and relatively novel practice that can be conducted in buildings with environmental control and artificial lighting. However, the nascent sector has experienced challenges in economic viability, standardisation, and environmental sustainability. Practitioners and academics call for a comprehensive financial analysis of VPF, but efforts are stifled by a lack of valid and available data.
A review of economic estimation and horticultural software identifies a need for a decision support system (DSS) that facilitates risk-empowered business planning for vertical farmers. This thesis proposes an open-source DSS framework to evaluate business sustainability through financial risk and environmental impact assessments. Data from the literature, alongside lessons learned from industry practitioners, would be centralised in the proposed DSS using imprecise data techniques. These techniques have been applied in engineering but are seldom used in financial forecasting. This could benefit complex sectors which only have scarce data to predict business viability.
To begin the execution of the DSS framework, VPF practitioners were interviewed using a mixed-methods approach. Learnings from over 19 shuttered and operational VPF projects provide insights into the barriers inhibiting scalability and identifying risks to form a risk taxonomy. Labour was the most commonly reported top challenge. Therefore, research was conducted to explore lean principles to improve productivity.
A probabilistic model representing a spectrum of variables and their associated uncertainty was built according to the DSS framework to evaluate the financial risk for VF projects. This enabled flexible computation without precise production or financial data to improve economic estimation accuracy. The model assessed two VPF cases (one in the UK and another in Japan), demonstrating the first risk and uncertainty quantification of VPF business models in the literature. The results highlighted measures to improve economic viability and the viability of the UK and Japan case.
The environmental impact assessment model was developed, allowing VPF operators to evaluate their carbon footprint compared to traditional agriculture using life-cycle assessment. I explore strategies for net-zero carbon production through sensitivity analysis. Renewable energies, especially solar, geothermal, and tidal power, show promise for reducing the carbon emissions of indoor VPF. Results show that renewably-powered VPF can reduce carbon emissions compared to field-based agriculture when considering the land-use change.
The drivers for DSS adoption have been researched, showing a pathway of compliance and design thinking to overcome the ‘problem of implementation’ and enable commercialisation. Further work is suggested to standardise VF equipment, collect benchmarking data, and characterise risks. This work will reduce risk and uncertainty and accelerate the sector’s emergence
Visualisation of Fundamental Movement Skills (FMS): An Iterative Process Using an Overarm Throw
Fundamental Movement Skills (FMS) are precursor gross motor skills to more complex or specialised skills and are recognised as important indicators of physical competence, a key component of physical literacy. FMS are predominantly assessed using pre-defined manual methodologies, most commonly the various iterations of the Test of Gross Motor Development. However, such assessments are time-consuming and often require a minimum basic level of training to conduct. Therefore, the overall aim of this thesis was to utilise accelerometry to develop a visualisation concept as part of a feasibility study to support the learning and assessment of FMS, by reducing subjectivity and the overall time taken to conduct a gross motor skill assessment. The overarm throw, an important fundamental movement skill, was specifically selected for the visualisation development as it is an acyclic movement with a distinct initiation and conclusion. Thirteen children (14.8 ± 0.3 years; 9 boys) wore an ActiGraph GT9X Link Inertial Measurement Unit device on the dominant wrist whilst performing a series of overarm throws. This thesis illustrates how the visualisation concept was developed using raw accelerometer data, which was processed and manipulated using MATLAB 2019b software to obtain and depict key throw performance data, including the trajectory and velocity of the wrist during the throw. Overall, this thesis found that the developed visualisation concept can provide strong indicators of throw competency based on the shape of the throw trajectory. Future research should seek to utilise a larger, more diverse, population, and incorporate machine learning. Finally, further work is required to translate this concept to other gross motor skills
Learning disentangled speech representations
A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody.
The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and independent of the task at hand. The learned representations should also be able to answer counter-factual questions.
In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks.
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically
Recommended from our members
The influence of blockchains and internet of things on global value chain
Copyright © 2022 The Authors. Despite the increasing proliferation of deploying the internet of things (IoT) in the global value chain (GVC), several challenges might lead to a lack of trust among value chain partners, for example, technical challenges (i.e., confidentiality, authenticity, and privacy); and security challenges (i.e., counterfeiting, physical tampering, and data theft). In this study, we argue that blockchain technology (BT), when combined with the IoT ecosystem, will strengthen GVC and enhance value creation and capture among value chain partners. Therefore, we examine the impact of BT combined with the IoT ecosystem and how it can be utilized to enhance value creation and capture among value chain partners. We collected data through an online survey, and 265 U.K. Agri-food retailers completed the survey. Our data were analyzed using structural equation modeling. Our finding reveals that BT enhances GVC by improving IoT scalability, security, and traceability combined with the IoT ecosystem. Moreover, the combination of BT and IoT strengthens GVC and creates more value for value chain partners, which serves as a competitive advantage. Finally, our research outlines the theoretical and practical contribution of combining BT and the IoT ecosystem
- …