Inpainting of long audio segments with similarity graphs
We present a novel method for the compensation of long-duration data loss in
audio signals, in particular music. The concealment of such signal defects is
based on a graph that encodes signal structure in terms of time-persistent
spectral similarity. A suitable candidate segment for the substitution of the
lost content is proposed by an intuitive optimization scheme and smoothly
inserted into the gap, i.e. the lost or distorted signal region. Extensive
listening tests show that the proposed algorithm provides highly promising
results when applied to a variety of real-world music signals.
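The candidate-selection step described above can be sketched in plain NumPy: frames of the signal are summarised by magnitude-spectral features, and each feasible source position is scored by how well its flanking context matches the context around the gap. This is an illustrative simplification, not the paper's graph-based optimization; the frame length, hop size, and context width below are arbitrary assumptions.

```python
import numpy as np

def frame_features(x, frame_len=1024, hop=512):
    # summarise each frame by its magnitude spectrum
    n = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] for i in range(n)])
    return np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def best_candidate(features, gap_start, gap_len, ctx=4):
    # score every feasible source position by how well the context
    # flanking it matches the context flanking the gap
    left = features[gap_start - ctx:gap_start]
    right = features[gap_start + gap_len:gap_start + gap_len + ctx]
    best, best_score = None, -np.inf
    for s in range(ctx, len(features) - gap_len - ctx):
        if abs(s - gap_start) < gap_len:   # never copy from the gap itself
            continue
        score = sum(cosine(left[k], features[s - ctx + k]) for k in range(ctx)) + \
                sum(cosine(right[k], features[s + gap_len + k]) for k in range(ctx))
        if score > best_score:
            best, best_score = s, score
    return best
```

In the full method, the selected segment would then be crossfaded smoothly into the gap; that insertion step is omitted here.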
Semantic Cross-View Matching
Matching cross-view images is challenging because the appearance and
viewpoints are significantly different. While low-level features based on
gradient orientations or filter responses can drastically vary with such
changes in viewpoint, the semantic information of images remains invariant in
this respect. Consequently, semantically labeled regions can
be used for performing cross-view matching. In this paper, we therefore explore
this idea and propose an automatic method for detecting and representing the
semantic information of an RGB image with the goal of performing cross-view
matching with a (non-RGB) geographic information system (GIS). A segmented
image forms the input to our system with segments assigned to semantic concepts
such as traffic signs, lakes, roads, foliage, etc. We design a descriptor to
robustly capture both the presence of semantic concepts and the spatial layout
of those segments. Pairwise distances between the descriptors extracted from
the GIS map and the query image are then used to generate a shortlist of the
most promising locations with similar semantic concepts in a consistent spatial
layout. An experimental evaluation with challenging query images and a large
urban area shows promising results.
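A toy version of such a descriptor can be built by dividing the segmented image into a coarse grid and concatenating per-cell concept histograms, which captures both concept presence and spatial layout; a shortlist then comes from ranking pairwise descriptor distances. The concept list and grid size below are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

CONCEPTS = ["road", "lake", "foliage", "building", "sign"]  # illustrative labels

def semantic_descriptor(label_map, grid=4):
    # label_map: 2D array of concept indices; the descriptor concatenates
    # one normalised concept histogram per grid cell, so it encodes both
    # which concepts are present and roughly where they are
    h, w = label_map.shape
    cells = []
    for i in range(grid):
        for j in range(grid):
            cell = label_map[i * h // grid:(i + 1) * h // grid,
                             j * w // grid:(j + 1) * w // grid]
            hist = np.bincount(cell.ravel(), minlength=len(CONCEPTS)).astype(float)
            cells.append(hist / max(hist.sum(), 1.0))
    return np.concatenate(cells)

def shortlist(query_desc, map_descs, k=5):
    # rank candidate GIS locations by Euclidean descriptor distance
    d = np.linalg.norm(map_descs - query_desc, axis=1)
    return np.argsort(d)[:k]
```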
Enhancing Representation in Radiography-Reports Foundation Model: A Granular Alignment Algorithm Using Masked Contrastive Learning
Recently, multi-modal vision-language foundation models have gained
significant attention in the medical field. While these models offer great
opportunities, they still face a number of challenges, such as the requirement
for fine-grained knowledge understanding in computer-aided diagnosis and
the capability of utilizing very limited or no task-specific labeled data in
real-world clinical applications. In this study, we present MaCo, a novel
multi-modal medical foundation model that explores masked contrastive learning
to achieve granular alignment and zero-shot learning for a variety of medical
imaging tasks. MaCo incorporates a correlation weighting mechanism to adjust
the correlation between masked image patches and their corresponding reports,
thereby enhancing the representation learning capabilities. We evaluate MaCo on
six well-known open-source X-ray datasets, and the experimental results show it
outperforms seven state-of-the-art approaches for classification, segmentation,
and zero-shot phase grounding, demonstrating its great potential to promote a
wide range of medical image analysis tasks.
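As a rough illustration of contrastive image-report alignment with per-pair weighting, one can weight a standard InfoNCE-style loss. The weights here merely stand in for MaCo's correlation weighting mechanism, whose exact form the abstract does not specify; embedding sizes and the temperature are arbitrary assumptions.

```python
import numpy as np

def weighted_contrastive_loss(img_emb, txt_emb, weights, tau=0.07):
    # img_emb, txt_emb: (N, D) L2-normalised paired embeddings
    # weights: (N,) per-pair importance, e.g. reflecting how strongly a
    # masked image region correlates with its report (illustrative)
    logits = img_emb @ txt_emb.T / tau           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    nll = -np.log(p[np.arange(len(p)), np.arange(len(p))] + 1e-12)
    return float((weights * nll).sum() / weights.sum())
```

Correctly paired embeddings should yield a much lower loss than mismatched ones, which is the property the alignment objective exploits.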
In vivo probabilistic atlas of white matter tracts of the human subthalamic area combining track density imaging and optimized diffusion tractography
The human subthalamic area is a region of high anatomical complexity, tightly packed with tiny fiber bundles. Some of them, including the pallidothalamic, cerebello-thalamic, and mammillothalamic tracts, are relevant targets in functional neurosurgery for various brain diseases. Diffusion-weighted imaging-based tractography has been suggested as a useful tool to map white matter pathways in the human brain in vivo and non-invasively, though the reconstruction of these specific fiber bundles is challenging due to their small dimensions and complex anatomy. To the best of our knowledge, a population-based, in vivo probabilistic atlas of subthalamic white matter tracts is still missing. In the present work, we devised an optimized tractography protocol for reproducible reconstruction of the tracts of the subthalamic area in a large data sample from the Human Connectome Project repository. First, we leveraged the super-resolution properties and high anatomical detail provided by short-tracks track-density imaging (stTDI) to identify the white matter bundles of the subthalamic area on a group-level template. Tract identification on the stTDI template was also aided by visualization of histological sections of human specimens. Then, we employed this anatomical information to drive tractography at the subject level, optimizing tracking parameters to maximize between-subject and within-subject similarities as well as anatomical accuracy. Finally, we gathered the subject-level tracts reconstructed with optimized tractography into a large-scale, normative population atlas. We suggest that this atlas could be useful in both clinical anatomy and functional neurosurgery settings, to improve our understanding of the complex morphology of this important brain region.
Low-count Time Series Anomaly Detection
Low-count time series describe sparse or intermittent events, which are
prevalent in large-scale online platforms that capture and monitor diverse data
types. Several distinct challenges surface when modelling low-count time
series, particularly low signal-to-noise ratios (when anomaly signatures are
provably undetectable), and non-uniform performance (when average metrics are
not representative of local behaviour). The time series anomaly detection
community currently lacks explicit tooling and processes to model and reliably
detect anomalies in these settings. We address this gap by introducing a novel
generative procedure for creating benchmark datasets comprising low-count
time series with anomalous segments. Via a mixture of theoretical and empirical
analysis, our work explains how widely-used algorithms struggle with the
distribution overlap between normal and anomalous segments. In order to
mitigate this shortcoming, we then leverage our findings to demonstrate how
anomaly score smoothing consistently improves performance. The practical
utility of our analysis and recommendation is validated on a real-world dataset
containing sales data for retail stores.
Comment: 6 pages, 7 figures, to be published in IEEE 2023 Workshop on Machine Learning for Signal Processing (MLSP).
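The recommended anomaly score smoothing can be as simple as a centred moving average over the raw scores, pooling evidence across neighbouring points where individual low-count observations overlap between normal and anomalous regimes. The window length below is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

def smooth_scores(scores, window=5):
    # centred moving average of raw per-point anomaly scores;
    # smoothing aggregates weak evidence over a neighbourhood, so a
    # sustained anomalous segment stands out even when single points do not
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="same")
```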
Citywide Estimation of Traffic Dynamics via Sparse GPS Traces
Traffic congestion is a perpetual challenge in metropolitan areas around the world. The ability to understand traffic dynamics is thus critical to effective traffic control and management. However, estimation of traffic conditions over a large-scale road network has proven to be a challenging task for two reasons: first, traffic conditions are intrinsically stochastic; second, the availability and quality of traffic data vary to a great extent. Traditional traffic monitoring systems that exist mostly on major roads and highways are insufficient to recover the traffic conditions for an entire network. Recent advances in GPS technology and the resulting rich data sets offer new opportunities to improve upon such traditional means, by providing much broader coverage of road networks. Despite that, such data are limited by their spatial-temporal sparsity in practice. To address these issues, we have developed a novel framework to estimate travel times, traversed paths, and missing values over a large-scale road network using sparse GPS traces. Our method consists of two phases. In the first phase, we adopt the shortest travel time criterion based on Wardrop's Principles in the map-matching process. With an improved travel-time allocation technique, we have achieved up to 52.5% relative error reduction in network travel times compared to a state-of-the-art method [1]. In the second phase, we estimate missing values using a compressed sensing algorithm, thereby reducing the number of required measurements by 94.64%.
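The abstract names compressed sensing without specifying the solver. As a generic illustration of the underlying idea (recovering a sparse signal from far fewer linear measurements than unknowns), a minimal Orthogonal Matching Pursuit sketch is shown below; the measurement matrix and sparsity level are made up for the demo and are not from the paper.

```python
import numpy as np

def omp(A, y, k):
    # Orthogonal Matching Pursuit: greedily recover a k-sparse x with y ≈ A x
    residual, idx = y.copy(), []
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        idx.append(j)
        # re-fit coefficients on all selected columns (least squares)
        sub = A[:, idx]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ coef
    x = np.zeros(A.shape[1])
    x[idx] = coef
    return x
```

The same principle lets a small number of probe-vehicle measurements constrain a much larger field of unknown link travel times, provided that field is sparse in a suitable basis.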
Crowdsourcing-Based Fingerprinting for Indoor Location in Multi-Storey Buildings
POCI-01-0247-FEDER-033479. The number of available indoor location solutions has been growing, yet they suffer from insufficient precision, high implementation costs, or scalability limitations. As fingerprinting-based methods rely on information that is ubiquitous in buildings, no additional infrastructure is required. Still, the time-consuming manual process of acquiring fingerprints limits their applicability in most scenarios. This paper proposes an algorithm for the automatic construction of environmental fingerprints in multi-storey buildings, leveraging the information sources available in each scenario. It relies on unlabelled crowdsourced data from users’ smartphones. With only the floor plans as input, a requirement of most applications, we apply a multimodal approach that joins inertial data, the local magnetic field and Wi-Fi signals to construct highly accurate fingerprints. Precise movement estimation is achieved regardless of smartphone usage through Deep Neural Networks, and transitions between floors are detected from barometric data. Users’ trajectories obtained with Pedestrian Dead Reckoning techniques are partitioned into clusters using Wi-Fi measurements. Straight sections from the same cluster are then compared with subsequence Dynamic Time Warping to search for similarities. From the identified overlapping sections, a particle filter fits each trajectory into the building’s floor plans. From all successfully mapped routes, fingerprints labelled with physical locations are finally obtained. Experimental results from an office and a university building show that this solution constructs fingerprints comparable to those acquired manually, thus providing a useful tool for the automatic setup of fingerprinting-based solutions.
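The subsequence Dynamic Time Warping comparison of trajectory sections can be sketched as follows. Unlike standard DTW, the alignment of the query is free to start and end anywhere along the longer trace; the one-dimensional absolute-difference cost is an illustrative simplification of comparing real multi-sensor sections.

```python
import numpy as np

def subsequence_dtw(query, trace):
    # cost of the best alignment of `query` against any contiguous
    # region of `trace` (free start and end on the trace axis)
    n, m = len(query), len(trace)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = 0.0                       # alignment may start anywhere in trace
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - trace[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    end = int(np.argmin(D[n, 1:])) + 1  # best end position in the trace
    return float(D[n, end]), end
```

A low subsequence cost flags two straight sections as likely overlapping, which is what lets the particle filter anchor independent crowdsourced trajectories to the same physical corridor.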