Radio frequency fingerprint identification for Internet of Things: A survey
Radio frequency fingerprint (RFF) identification is a promising technique for identifying Internet of Things (IoT) devices. This paper presents a comprehensive survey on RFF identification, which covers various aspects ranging from related definitions to details of each stage in the identification process, namely signal preprocessing, RFF feature extraction, further processing, and RFF identification. Specifically, three main steps of preprocessing are summarized, including carrier frequency offset estimation, noise elimination, and channel cancellation. Besides, three kinds of RFFs are categorized, comprising I/Q signal-based, parameter-based, and transformation-based features. Meanwhile, feature fusion and feature dimension reduction are elaborated as two main further processing methods. Furthermore, a novel framework is established from the perspective of closed set and open set problems, and the related state-of-the-art methodologies are investigated, including approaches based on traditional machine learning, deep learning, and generative models. Additionally, we highlight the challenges faced by RFF identification and point out future research trends in this field.
Self-supervised learning for transferable representations
Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
Goal-Conditioned Reinforcement Learning with Disentanglement-based Reachability Planning
Goal-Conditioned Reinforcement Learning (GCRL) can enable agents to
spontaneously set diverse goals to learn a set of skills. Despite strong prior
work in various fields, reaching distant goals in temporally extended
tasks remains a challenge for GCRL. Existing work tackles this problem by
using planning algorithms to generate intermediate subgoals that augment GCRL.
These methods have two crucial requirements: (i) a state representation space
in which to search for valid subgoals, and (ii) a distance function to measure the
reachability of subgoals. However, they struggle to scale to high-dimensional
state spaces due to their non-compact representations. Moreover, they cannot
collect high-quality training data through standard GC policies, which results
in an inaccurate distance function. Both affect the efficiency and performance
of planning and policy learning. In this paper, we propose a goal-conditioned RL
algorithm combined with Disentanglement-based Reachability Planning (REPlan) to
solve temporally extended tasks. In REPlan, a Disentangled Representation
Module (DRM) is proposed to learn compact representations which disentangle
robot poses and object positions from high-dimensional observations in a
self-supervised manner. A simple REachability discrimination Module (REM) is
also designed to determine the temporal distance of subgoals. Moreover, REM
computes intrinsic bonuses to encourage the collection of novel states for
training. We evaluate REPlan on three vision-based simulation tasks and one
real-world task. The experiments demonstrate that REPlan significantly
outperforms the prior state-of-the-art methods in solving temporally extended
tasks.
Comment: Accepted by 2023 RAL with ICR
Computational Approaches to Drug Profiling and Drug-Protein Interactions
Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a
long period of stagnation in drug approvals. Due to the extreme costs associated with
introducing a drug to the market, locating and understanding the reasons for clinical failure
is key to future productivity. As part of this PhD, three main contributions were made in
this respect. First, the web platform LigNFam enables users to interactively explore
similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly,
two deep-learning-based binding site comparison tools were developed, competing with
the state-of-the-art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the
open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold
relationships and has already been used in multiple projects, including integration into a
virtual screening pipeline to increase the tractability of ultra-large screening experiments.
Together, and with existing tools, the contributions made will aid in the understanding of
drug-protein relationships, particularly in the fields of off-target prediction and drug
repurposing, helping to design better drugs faster.
Enlighten-anything: When Segment Anything Model Meets Low-light Image Enhancement
Image restoration is a low-level visual task, and most CNN methods are
designed as black boxes, lacking transparency and intrinsic aesthetics. Many
unsupervised approaches ignore the degradation of visible information in
low-light scenes, which will seriously affect the aggregation of complementary
information and also make the fusion algorithm unable to produce satisfactory
fusion results under extreme conditions. In this paper, we propose
Enlighten-anything, which is able to enhance and fuse the semantic intent of
SAM segmentation with low-light images to obtain fused images with good visual
perception. The generalization ability of unsupervised learning is greatly
improved, and experiments on the LOL dataset show that our method
improves PSNR by 3 dB and SSIM by 8 over the baseline. Zero-shot learning of SAM
introduces a powerful aid for unsupervised low-light enhancement. The source
code of Rethink-Diffusion can be obtained from
https://github.com/zhangbaijin/enlighten-anythin
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring
Artificially intelligent perception is increasingly present in the lives of
every one of us. Vehicles are no exception, (...) In the near future, pattern
recognition will have an even stronger role in vehicles, as self-driving cars
will require automated ways to understand what is happening around (and within)
them and act accordingly. (...) This doctoral work focused on advancing
in-vehicle sensing through the research of novel computer vision and pattern
recognition methodologies for both biometrics and wellbeing monitoring. The
main focus has been on electrocardiogram (ECG) biometrics, a trait well-known
for its potential for seamless driver monitoring. Major efforts were devoted to
achieving improved performance in identification and identity verification in
off-the-person scenarios, well-known for increased noise and variability. Here,
end-to-end deep learning ECG biometric solutions were proposed and important
topics were addressed such as cross-database and long-term performance,
waveform relevance through explainability, and interlead conversion. Face
biometrics, a natural complement to the ECG in seamless unconstrained
scenarios, was also studied in this work. The open challenges of masked face
recognition and interpretability in biometrics were tackled in an effort to
evolve towards algorithms that are more transparent, trustworthy, and robust to
significant occlusions. Within the topic of wellbeing monitoring, improved
solutions to multimodal emotion recognition in groups of people and
activity/violence recognition in in-vehicle scenarios were proposed. Finally,
we proposed a novel way to learn template security within end-to-end
models, dismissing additional separate encryption processes, and a
self-supervised learning approach tailored to sequential data, in order to
ensure data security and optimal performance. (...)
Comment: Doctoral thesis presented and approved on the 21st of December 2022
to the University of Port
OCHID-Fi: Occlusion-Robust Hand Pose Estimation in 3D via RF-Vision
Hand Pose Estimation (HPE) is crucial to many applications, but conventional
camera-based CM-HPE methods are completely subject to Line-of-Sight (LoS), as
cameras cannot capture occluded objects. In this paper, we propose to exploit
Radio-Frequency-Vision (RF-vision) capable of bypassing obstacles for achieving
occluded HPE, and we introduce OCHID-Fi as the first RF-HPE method with 3D pose
estimation capability. OCHID-Fi employs wideband RF sensors widely available on
smart devices (e.g., iPhones) to probe 3D human hand poses and extract their
skeletons behind obstacles. To overcome the challenge in labeling RF imaging
given its human-incomprehensible nature, OCHID-Fi employs a cross-modality and
cross-domain training process. It uses a pre-trained CM-HPE network and a
synchronized CM/RF dataset, to guide the training of its complex-valued RF-HPE
network under LoS conditions. It further transfers knowledge learned from
labeled LoS domain to unlabeled occluded domain via adversarial learning,
enabling OCHID-Fi to generalize to unseen occluded scenarios. Experimental
results demonstrate the superiority of OCHID-Fi: it achieves comparable
accuracy to CM-HPE under normal conditions while maintaining such accuracy even
in occluded scenarios, with empirical evidence for its generalizability to new
domains.
Comment: Accepted to ICCV 202
Are We Using Autoencoders in a Wrong Way?
Autoencoders are certainly among the most studied and used Deep Learning
models: the idea behind them is to train a model in order to reconstruct the
same input data. The peculiarity of these models is to compress the information
through a bottleneck, creating what is called Latent Space. Autoencoders are
generally used for dimensionality reduction, anomaly detection and feature
extraction. These models have been extensively studied and updated, given their
high simplicity and power. Examples are (i) the Denoising Autoencoder, where
the model is trained to reconstruct an image from a noisy one; (ii) Sparse
Autoencoder, where the bottleneck is created by a regularization term in the
loss function; (iii) Variational Autoencoder, where the latent space is used to
generate new consistent data. In this article, we revisited the standard
training of the undercomplete Autoencoder, modifying the shape of the latent
space without using any explicit regularization term in the loss function. We
forced the model to reconstruct not the input observation itself, but another
one sampled from the same class distribution. We also explored the behaviour of
the latent space when the model reconstructs a random sample from the
whole dataset.
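The pairing step described in this abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names `sample_same_class_targets` and `reconstruction_pairs` are hypothetical, and excluding the sample itself from the candidates (with a fallback for single-member classes) is an assumption. The resulting (input, target) pairs would then feed an ordinary reconstruction loss in place of the usual (input, input) pairs.

```python
import random
from collections import defaultdict


def sample_same_class_targets(labels, rng=None):
    """For each index i, pick an index j whose label equals labels[i].

    Assumption: the sample itself is excluded whenever its class has
    another member; a class with a single member falls back to itself.
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    targets = []
    for idx, label in enumerate(labels):
        candidates = [j for j in by_class[label] if j != idx]
        targets.append(rng.choice(candidates) if candidates else idx)
    return targets


def reconstruction_pairs(inputs, labels, rng=None):
    """Build (input, target) pairs where the target shares the input's class."""
    targets = sample_same_class_targets(labels, rng)
    return [(inputs[i], inputs[j]) for i, j in enumerate(targets)]
```

For the whole-dataset variant mentioned at the end of the abstract, the same sketch would draw each target index uniformly from all indices instead of from the class bucket.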
Computer Vision and Architectural History at Eye Level: Mixed Methods for Linking Research in the Humanities and in Information Technology
Information on the history of architecture is embedded in our daily surroundings, in vernacular and heritage buildings and in physical objects, photographs and plans. Historians study these tangible and intangible artefacts and the communities that built and used them. Thus valuable insights are gained into the past and the present as they also provide a foundation for designing the future. Given that our understanding of the past is limited by the inadequate availability of data, the article demonstrates that advanced computer tools can help gain more and well-linked data from the past. Computer vision can make a decisive contribution to the identification of image content in historical photographs. This application is particularly interesting for architectural history, where visual sources play an essential role in understanding the built environment of the past, yet lack of reliable metadata often hinders the use of materials. The automated recognition contributes to making a variety of image sources usable for research.