Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. This problem-oriented taxonomy allows us to examine how
different transfer learning approaches tackle each problem and how thoroughly
each problem has been researched to date. The comprehensive problem-oriented
review of advances in transfer learning not only reveals the challenges of
transfer learning for visual recognition, but also identifies the problems
(eight of the seventeen) that have been scarcely studied. This survey thus
offers both an up-to-date technical review for researchers and a systematic
reference for machine learning practitioners to categorise a real problem
and look up a possible solution accordingly.
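The look-up workflow the survey describes can be sketched as a simple decision over data and label attributes. The attribute names and problem-family labels below are illustrative placeholders, not the survey's actual seventeen categories:

```python
# Hypothetical sketch: categorising a cross-dataset recognition problem
# by data/label attributes, in the spirit of a problem-oriented taxonomy.
# The returned family names are illustrative, not the survey's terms.

def categorise(source_labeled, target_labeled, same_label_space):
    """Map a (source, target) dataset pair to a coarse problem family."""
    if not source_labeled:
        return "self-taught / unsupervised transfer"
    if same_label_space:
        return ("supervised domain adaptation" if target_labeled
                else "unsupervised domain adaptation")
    return ("heterogeneous transfer learning" if target_labeled
            else "zero-shot recognition")

# Example: labelled source, unlabelled target, shared label space
family = categorise(source_labeled=True, target_labeled=False,
                    same_label_space=True)
```

A practitioner would answer the same attribute questions about their own datasets and then consult the corresponding problem section of the survey.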
Multi-center anatomical segmentation with heterogeneous labels via landmark-based models
Learning anatomical segmentation from heterogeneous labels in multi-center
datasets is a common situation encountered in clinical scenarios, where certain
anatomical structures are only annotated in images coming from particular
medical centers, but not in the full database. Here we first show how
state-of-the-art pixel-level segmentation models fail in naively learning this
task due to domain memorization issues and conflicting labels. We then propose
to adopt HybridGNet, a landmark-based segmentation model which learns the
available anatomical structures using graph-based representations. By analyzing
the latent space learned by both models, we show that HybridGNet naturally
learns more domain-invariant feature representations, and provide empirical
evidence in the context of chest X-ray multiclass segmentation. We hope these
insights will shed light on the training of deep learning models with
heterogeneous labels from public and multi-center datasets.
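A common way to train on such partially annotated data is to mask the loss so that structures a given center did not annotate contribute no gradient. This is a hedged NumPy sketch of that general idea, not the paper's HybridGNet code:

```python
import numpy as np

# Hedged sketch (not the paper's implementation): per-structure loss
# masking for multi-center data where each center annotates only some
# anatomical structures. `annotated` flags which structures have labels.

def masked_bce(pred, target, annotated):
    """Mean binary cross-entropy over annotated structures only.

    pred, target: (num_structures, H, W) arrays in (0, 1) / {0, 1}
    annotated:    (num_structures,) boolean mask of available labels
    """
    eps = 1e-7
    pred = np.clip(pred, eps, 1 - eps)
    bce = -(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    # Average only over structures this center actually annotated, so
    # missing labels contribute no conflicting training signal.
    return bce[annotated].mean()

# Hypothetical center that annotates structure 0 but not structure 1
pred = np.full((2, 4, 4), 0.8)
target = np.ones((2, 4, 4))
annotated = np.array([True, False])
loss = masked_bce(pred, target, annotated)  # -log(0.8) over structure 0
```

Naive pixel-level training without such masking treats missing annotations as background, which is one source of the conflicting-labels failure the paper describes.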
TSGBench: Time Series Generation Benchmark
Synthetic Time Series Generation (TSG) is crucial in a range of applications,
including data augmentation, anomaly detection, and privacy preservation.
Although significant strides have been made in this field, existing methods
exhibit three key limitations: (1) They often benchmark against similar model
types, constraining a holistic view of performance capabilities. (2) The use of
specialized synthetic and private datasets introduces biases and hampers
generalizability. (3) Ambiguous evaluation measures, often tied to custom
networks or downstream tasks, hinder consistent and fair comparison.
To overcome these limitations, we introduce \textsf{TSGBench}, the inaugural
Time Series Generation Benchmark, designed for a unified and comprehensive
assessment of TSG methods. It comprises three modules: (1) a curated collection
of publicly available, real-world datasets tailored for TSG, together with a
standardized preprocessing pipeline; (2) a comprehensive evaluation measures
suite including vanilla measures, new distance-based assessments, and
visualization tools; (3) a pioneering generalization test rooted in Domain
Adaptation (DA), compatible with all methods. We have conducted comprehensive
experiments using \textsf{TSGBench} across a spectrum of ten real-world
datasets from diverse domains, utilizing ten advanced TSG methods and twelve
evaluation measures. The results highlight the reliability and efficacy of
\textsf{TSGBench} in evaluating TSG methods. Crucially, \textsf{TSGBench}
delivers a statistical analysis of the performance rankings of these methods,
illuminating their varying performance across different datasets and measures
and offering nuanced insights into the effectiveness of each method.
Comment: Accepted and to appear in VLDB 202
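One family of measures in such a suite compares the value distributions of real and synthetic series. The following is a hedged sketch of a distance-based measure in that spirit, not TSGBench's actual implementation: the empirical 1-D Wasserstein distance between pooled marginal distributions.

```python
import numpy as np

# Hedged sketch of one distance-based TSG evaluation measure (not
# TSGBench's code): empirical 1-D Wasserstein distance between the
# pooled value distributions of real and synthetic series.

def marginal_wasserstein(real, synthetic):
    """W1 distance between marginal value distributions.

    real, synthetic: (num_series, series_length) arrays.
    """
    a = np.sort(real.ravel())
    b = np.sort(synthetic.ravel())
    n = min(len(a), len(b))  # assume comparable sample counts
    # With equal-weight samples, W1 is the mean gap between sorted values.
    return np.abs(a[:n] - b[:n]).mean()

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(10, 100))
synth = rng.normal(0.5, 1.0, size=(10, 100))
d = marginal_wasserstein(real, synth)  # roughly the 0.5 mean shift
```

A lower distance indicates that the generator better matches the marginal distribution of the real data; temporal-structure measures would complement this.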
Multi-Scale Modelling of Cold Regions Hydrology
Numerical computer simulations are increasingly important tools for addressing both research and operational water resource issues related to the hydrological cycle. Cold regions hydrological models must calculate phase change in water via the energy balance, which has high spatial variability. This motivates the inclusion of explicit spatial heterogeneity and field-testable process representations in such models. However, standard techniques for spatial representation, such as raster discretization, can lead to prohibitively large computational costs and increased uncertainty due to the added degrees of freedom, while semi-distributed approaches may not sufficiently represent all of the spatial variability. Further, there is uncertainty about which process conceptualizations to use and the degree of complexity required, motivating modelling approaches that allow testing of multiple working hypotheses. This thesis considers two themes. The first addresses the development of improved modelling techniques to efficiently include spatial heterogeneity, investigate warranted model complexity, and select appropriate process representations in cold regions models. The second addresses the issues of non-linear process cascades, emergence, and compensatory behaviours in cold regions hydrological process representations. To address these themes, a new modelling framework, the Canadian Hydrological Model (CHM), is presented. Key design goals for CHM include the ability to: capture spatial heterogeneity in an efficient manner; include multiple process representations; change, remove, and decouple hydrological process algorithms; work at both point and spatially distributed scales; reduce computational overhead to facilitate uncertainty analysis; scale over multiple spatial extents; and utilize a variety of boundary and initial conditions.
To enable multi-scale modelling in CHM, a novel multi-objective unstructured mesh generation software, *mesher*, is presented. Mesher represents the landscape using a multi-scale, variable resolution surface mesh. This explicitly captured the spatial heterogeneity important for emergent behaviours and cold regions processes, and reduced the total number of computational elements by 50\% to 90\% relative to a uniform mesh. Four energy balance snowpack models of varying complexity and degree of coupling between the energy and mass budgets were used to simulate SWE in a forest clearing in the Canadian Rocky Mountains. It was found that 1) a compensatory response was present in the fully coupled models’ energy and mass balance that reduced their sensitivity to errors in meteorology and albedo, and 2) the weakly coupled models produced less accurate simulations and were more sensitive to errors in forcing meteorology and albedo. The results suggest that a fully coupled mass and energy budget improves prediction of snow accumulation and ablation, but that there was little additional advantage in introducing a multi-layered snowpack scheme. This helps define warranted-complexity modelling decisions for this region. Lastly, a 3-D advection-diffusion blowing snow transport and sublimation model, using a finite volume discretization on a variable resolution unstructured mesh, was developed. The blowing snow calculation was found to represent the spatial redistribution of SWE over a sub-arctic mountain basin when compared against detailed snow surveys, and the unstructured mesh provided a 62\% reduction in computational elements. Without the inclusion of blowing snow, unrealistically homogeneous snow covers were simulated, which would lead to incorrect melt rates and runoff contributions.
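The role of the energy balance in driving phase change can be illustrated with a minimal, point-scale melt update. This is a hedged sketch of the general energy-balance melt relation, not CHM code, and the flux values are purely illustrative:

```python
# Hedged sketch (not CHM code): a single-step, point-scale energy-balance
# melt update, illustrating why cold regions models resolve the energy
# budget for phase change. Flux values below are illustrative only.

RHO_W = 1000.0   # water density, kg m^-3
L_F = 0.334e6    # latent heat of fusion, J kg^-1

def melt_depth(q_net_rad, q_sensible, q_latent, dt):
    """Melt (m water equivalent) from net energy over timestep dt (s).

    Fluxes in W m^-2; only positive net energy melts snow at 0 degC.
    """
    q_melt = max(q_net_rad + q_sensible + q_latent, 0.0)
    return q_melt * dt / (RHO_W * L_F)

# One hour with 150 W m^-2 net energy available for melt
m = melt_depth(q_net_rad=100.0, q_sensible=60.0, q_latent=-10.0, dt=3600.0)
swe = 0.30 - m   # update snow water equivalent (m)
```

Because each flux term varies strongly in space (aspect, shading, wind exposure), a spatially explicit mesh is needed to capture the resulting heterogeneity in melt.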
This thesis shows that there is a need to: use fully coupled energy and mass balance models in mountain terrain; capture snow-drift resolving scales in next-generation hydrological models; employ variable resolution unstructured meshes to reduce computational time; and consider cascading process interactions.