28 research outputs found
Geometric Data Augmentations to Mitigate Distribution Shifts in Pollen Classification from Microscopic Images
Distribution shifts are characterized by differences between the training and
test data distributions. They can significantly reduce the accuracy of machine
learning models deployed in real-world scenarios. This paper explores the
distribution shift problem when classifying pollen grains from microscopic
images collected in the wild with a low-cost camera sensor. We leverage the
domain knowledge that geometric features are highly important for accurate
pollen identification and introduce two novel geometric image augmentation
techniques to significantly narrow the accuracy gap between model
performance on the training and test datasets. In particular, we show that
Tenengrad and ImageToSketch filters are highly effective at balancing shape
and texture information while leaving out unimportant details that may confuse
the model. Extensive evaluations on various model architectures demonstrate
that the geometric augmentation techniques consistently improve model
generalization to field data by up to 14% compared to a wide range of
standard image augmentations. The approach is validated through an ablation
study using pollen hydration tests to recover the shape of dry pollen grains.
The proposed geometric augmentations also receive the highest scores according
to the affinity and diversity measures from the literature.
Comment: 16 pages, 6 figures, ICPADS 202
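The Tenengrad filter named above is, in common usage, a Sobel gradient-magnitude operator. A minimal NumPy sketch of such a filter (an illustration of the general idea, not the paper's exact augmentation pipeline):

```python
import numpy as np

def tenengrad(img):
    """Sobel gradient-magnitude map of a grayscale image: sqrt(Gx^2 + Gy^2).

    Emphasizes geometric (edge/shape) information over fine texture."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    padded = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(3):          # correlate with the 3x3 Sobel kernels
        for j in range(3):
            patch = padded[i:i + h, j:j + w]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.sqrt(gx ** 2 + gy ** 2)

# A vertical step edge yields a strong response along the edge only.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = tenengrad(img)
```

In an augmentation pipeline, the resulting edge map could replace or be blended with the original image so the model sees shape-dominant views of the pollen grains.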
Representing Input Transformations by Low-Dimensional Parameter Subspaces
Deep models lack robustness to simple input transformations such as rotation,
scaling, and translation, unless they feature a particular invariant
architecture or undergo specific training, e.g., learning the desired
robustness from data augmentations. Alternatively, input transformations can be
treated as a domain shift problem, and solved by post-deployment model
adaptation. Although a large number of methods deal with transformed inputs,
the fundamental relation between input transformations and optimal model
weights is unknown. In this paper, we put forward the configuration subspace
hypothesis that model weights optimal for parameterized continuous
transformations can reside in low-dimensional linear subspaces. We introduce
subspace-configurable networks to learn these subspaces and observe their
structure and surprisingly low dimensionality on all tested transformations,
datasets and architectures from computer vision and audio signal processing
domains. Our findings enable efficient model reconfiguration, especially when
limited storage and computing resources are at stake.
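As a toy illustration of the configuration-subspace idea (with random arrays standing in for learned quantities), the weights for each value of a continuous transformation parameter are a linear combination of a few base weight vectors, so every configured model lies in a low-dimensional linear subspace:

```python
import numpy as np

rng = np.random.default_rng(0)
n_weights, D = 1000, 3                       # flattened model size, subspace dim
basis = rng.standard_normal((D, n_weights))  # stand-in for learned base weights

def configure(alpha):
    """Map a transformation parameter (e.g., rotation angle) to mixing
    coefficients -- a stand-in for a learned configuration network --
    then linearly combine the basis to get deployable weights."""
    coeffs = np.array([1.0, np.cos(alpha), np.sin(alpha)])
    return coeffs @ basis                    # w(alpha) lies in span(basis)

w = configure(0.5)
```

Only the D base weight vectors and the tiny configuration mapping need to be stored, which is what makes reconfiguration cheap under tight resource budgets.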
Generic Model and Architecture for Cooperating Objects in Sensor Network Environments
The complexity and heterogeneity of cooperating object applications in ubiquitous environments or of applications in the sensor network domain require the use of generic models and architectures. These architectures should provide support for the following three key issues: flexible installation, management and reconfiguration of components in the system; optimization strategies whose implementation usually involves the proper management of cross-layer information; and proper adaptation techniques that allow for the self-configuration of nodes and components in the system with minimal human intervention. In this paper, we present one possible instance of such a generic model and architecture and show its applicability using Sustainable Bridges, a sensor network application that requires the analysis of complex sensor data to achieve its goal of effectively monitoring bridges for the detection of structural defects.
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
In this paper we look into the conjecture of Entezari et al. (2021) which
states that if the permutation invariance of neural networks is taken into
account, then there is likely no loss barrier to the linear interpolation
between SGD solutions. First, we observe that neuron alignment methods alone
are insufficient to establish low-barrier linear connectivity between SGD
solutions due to a phenomenon we call variance collapse: interpolated deep
networks suffer a collapse in the variance of their activations, causing poor
performance. Next, we propose REPAIR (REnormalizing Permuted Activations for
Interpolation Repair) which mitigates variance collapse by rescaling the
preactivations of such interpolated networks. We explore the interaction
between our method and the choice of normalization layer, network width, and
depth, and demonstrate that using REPAIR on top of neuron alignment methods
leads to 60%-100% relative barrier reduction across a wide variety of
architecture families and tasks. In particular, we report a 74% barrier
reduction for ResNet50 on ImageNet and 90% barrier reduction for ResNet18 on
CIFAR10.
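The rescaling step can be sketched on a single linear layer. This is a minimal illustration with random weights standing in for two already-aligned SGD solutions, not the paper's full procedure: the interpolated preactivations are affinely renormalized so each neuron's mean and standard deviation match the interpolation of the endpoints' statistics.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((512, 16))   # shared inputs to one linear layer
W1 = rng.standard_normal((16, 8))    # stand-in for aligned SGD solution A
W2 = rng.standard_normal((16, 8))    # stand-in for aligned SGD solution B
t = 0.5
Wt = (1 - t) * W1 + t * W2           # naive weight interpolation

def stats(Z):
    return Z.mean(axis=0), Z.std(axis=0)

# Target statistics: per-neuron interpolation of the endpoints' stats.
m1, s1 = stats(X @ W1)
m2, s2 = stats(X @ W2)
m_goal = (1 - t) * m1 + t * m2
s_goal = (1 - t) * s1 + t * s2

# REPAIR-style correction: rescale the interpolated preactivations
# (which typically have collapsed variance) to the target mean/std.
Z = X @ Wt
m, s = stats(Z)
Z_repaired = (Z - m) / s * s_goal + m_goal
```

In a real network this correction can be folded back into the layer's weights and biases, so inference cost is unchanged.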
SCAN: Multi-hop calibration for mobile sensor arrays
Urban air pollution monitoring with mobile, portable, low-cost sensors has attracted increasing research interest thanks to its wide spatial coverage and affordability to the general public. However, low-cost air quality sensors not only drift over time but also suffer from cross-sensitivities and dependency on meteorological effects. Therefore, calibration of measurements from low-cost sensors is indispensable to guarantee the data accuracy and consistency required for quantitative studies on air pollution. In this work we propose sensor array network calibration (SCAN), a multi-hop calibration technique for dependent low-cost sensors. SCAN is applicable to sets of co-located, heterogeneous sensors, known as sensor arrays, to compensate for cross-sensitivities and dependencies on meteorological influences. SCAN minimizes error accumulation over multiple hops of sensor arrays, which is unattainable with existing multi-hop calibration techniques. We formulate SCAN as a novel constrained least-squares regression and provide a closed-form expression of its regression parameters. We theoretically prove that SCAN is free from regression dilution even in the presence of measurement noise. In-depth simulations demonstrate that SCAN outperforms various calibration techniques. Evaluations on two real-world low-cost air pollution sensor datasets comprising 66 million samples collected over three years show that SCAN yields 16% to 60% lower error than state-of-the-art calibration techniques.
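As a simplified single-hop illustration (ordinary least squares, not SCAN's constrained closed form), an array's pollutant estimate can be fitted as a linear combination of its raw channels against co-located reference measurements, which is how cross-sensitivities and meteorological dependencies are compensated; all signals below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
truth = rng.uniform(10, 60, n)   # reference pollutant concentration
temp = rng.uniform(5, 30, n)     # meteorological covariate (temperature)

# Raw array channels: a target-gas channel with a temperature
# dependency, a cross-sensitive channel, and a constant offset term.
raw = np.column_stack([
    0.8 * truth + 0.3 * temp + rng.normal(0, 0.5, n),
    0.1 * truth + 1.0 * temp + rng.normal(0, 0.5, n),
    np.ones(n),
])

# Least-squares calibration of the array against the reference.
beta, *_ = np.linalg.lstsq(raw, truth, rcond=None)
calibrated = raw @ beta
```

Because the fit uses both channels jointly, the temperature leakage in the target channel is cancelled by the cross-sensitive channel; SCAN extends this idea across multiple hops while bounding error accumulation.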
DataComp: In search of the next generation of multimodal datasets
Multimodal datasets are a critical component in recent breakthroughs such as
Stable Diffusion and GPT-4, yet their design does not receive the same research
attention as model architectures or training algorithms. To address this
shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset
experiments centered around a new candidate pool of 12.8 billion image-text
pairs from Common Crawl. Participants in our benchmark design new filtering
techniques or curate new data sources and then evaluate their new dataset by
running our standardized CLIP training code and testing the resulting model on
38 downstream test sets. Our benchmark consists of multiple compute scales
spanning four orders of magnitude, which enables the study of scaling trends
and makes the benchmark accessible to researchers with varying resources. Our
baseline experiments show that the DataComp workflow leads to better training
sets. In particular, our best baseline, DataComp-1B, enables training a CLIP
ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming
OpenAI's CLIP ViT-L/14 by 3.7 percentage points while using the same training
procedure and compute. We release DataComp and all accompanying code at
www.datacomp.ai.
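A typical filtering baseline in this setting scores each image-text pair by the cosine similarity of its image and caption embeddings and keeps the top fraction. The sketch below uses random arrays as stand-ins for precomputed embeddings; it illustrates the mechanics, not any specific DataComp baseline:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 64
img_emb = rng.standard_normal((n, d))   # stand-in image embeddings
txt_emb = rng.standard_normal((n, d))   # stand-in caption embeddings

def cosine(a, b):
    """Row-wise cosine similarity between two embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

scores = cosine(img_emb, txt_emb)
keep_frac = 0.3
threshold = np.quantile(scores, 1 - keep_frac)
kept = np.flatnonzero(scores >= threshold)  # indices of retained pairs
```

The retained indices would then select the training subset fed to the benchmark's standardized CLIP training code.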
Pollen Video Library for Benchmarking Detection, Classification, Tracking and Novelty Detection Tasks
Dataset description

This dataset contains microscopic images and videos of pollen gathered between Feb. and Aug. 2020 in Graz, Austria.

Pollen images of 16 types: ...images_16_types.zip
- Acer Pseudoplatanus
- Aesculus Carnea
- Alnus
- Anthoxanthum
- Betula Pendula
- Brassica
- Carpinus
- Corylus
- Dactylis Glomerata
- Fraxinus
- Pinus Nigra
- Platanus
- Populus Nigra
- Prunus Avium
- Sequoiadendron Giganteum
- Taxus Baccata

Pollen video library ...pollen_video_library.zip
- Each type of pollen is in a separate folder; there may be multiple videos per type.
- In each pollen folder, we included images cropped from the videos by the YOLO object detection algorithm trained on a subset of pollen images as described in [1].
- Cropped file name structure: [Video file name]_[TrackingID]_[Image index of a grain]_[Frame index in video]
  - For example, if a grain has 5 images, the file names would be:
    Anthoxanthum-grass-20200530-122652_0000000_001_00001.jpg
    Anthoxanthum-grass-20200530-122652_0000000_002_00002.jpg
    ...
    Anthoxanthum-grass-20200530-122652_0000000_005_00005.jpg

Field data over 3 days were gathered in Graz in spring 2020. ...pollen_field_data.zip

Version 2:

For experiments on mitigating the distribution shift of pollen identification on field data, 5 types were selected from the field data and manually labeled by an expert. These data are zipped in "manual_labeled_field_data_5_types.zip".

The "images_5_types_9010_train.zip" and "images_5_types_9010_val.zip" contain 5 types selected from the library data (images_16_types.zip); these correspond to the field data.

The "images_3_types_for_ablation_study.zip" contains data at 3 levels of pollen grain hydration. These data are used for the ablation study of model generalization in pollen identification.

Sample code to load the data and visualize the images is in ...plot_pollen_sample.py. Download and extract the file ...images_16_types.zip in the same folder as ...plot_pollen_sample.py to run the example.

Dependencies:
- opencv
- numpy
- matplotlib

Credit

[1] N. Cao, M. Meyer, L. Thiele, and O. Saukh. 2020. Automated Pollen Detection with an Affordable Technology. In Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN). 108-119.

@inproceedings{namcao2020pollen,
  title = {Automated Pollen Detection with an Affordable Technology},
  author = {Nam Cao and Matthias Meyer and Lothar Thiele and Olga Saukh},
  booktitle = {Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN)},
  pages = {108--119},
  month = {2},
  year = {2020},
}

Appears in the Proceedings of the 3rd Workshop on Data Acquisition To Analysis (DATA '20)
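The cropped file name structure above can be parsed by splitting off the three trailing underscore-separated fields, since the video name itself contains no underscores in the library's naming scheme. This helper is an illustration, not part of the dataset release:

```python
def parse_crop_name(filename):
    """Parse [Video file name]_[TrackingID]_[Image index]_[Frame index].jpg
    into its components."""
    stem = filename.rsplit(".", 1)[0]           # drop the extension
    video, tracking_id, image_idx, frame_idx = stem.rsplit("_", 3)
    return {
        "video": video,
        "tracking_id": tracking_id,
        "image_index": int(image_idx),
        "frame_index": int(frame_idx),
    }

info = parse_crop_name(
    "Anthoxanthum-grass-20200530-122652_0000000_001_00001.jpg"
)
```

Grouping crops by the parsed tracking ID recovers all images belonging to one tracked pollen grain.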
Deep Neural Network Pruning for Nuclei Instance Segmentation in Hematoxylin & Eosin-Stained Histological Images
Recently, pruning deep neural networks (DNNs) has received a lot of attention
for improving accuracy and generalization power, reducing network size, and
increasing inference speed on specialized hardware. Although pruning was
mainly tested on computer vision tasks, its application in the context of
medical image analysis has hardly been explored. This work investigates the
impact of well-known pruning techniques, namely layer-wise and network-wide
magnitude pruning, on the nuclei instance segmentation performance in
histological images. Our utilized instance segmentation model consists of two
main branches: (1) a semantic segmentation branch, and (2) a deep regression
branch. We investigate the impact of weight pruning on the performance of both
branches separately and on the final nuclei instance segmentation result.
Evaluated on two publicly available datasets, our results show that layer-wise
pruning delivers slightly better performance than network-wide pruning for
small compression ratios (CRs), while for large CRs, network-wide pruning
yields superior performance. For semantic segmentation, deep regression, and
final instance segmentation, 93.75%, 95%, and 80% of the model weights can be
pruned by layer-wise pruning with less than a 2% reduction in the performance
of the respective models.
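The two pruning schemes compared above differ only in where the magnitude threshold is computed. A minimal sketch on toy weight arrays (random stand-ins, not the paper's segmentation model): layer-wise pruning thresholds each layer separately, while network-wide pruning applies one global threshold, which prunes small-scale layers much harder.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two toy layers with different weight scales, as flattened arrays.
layers = [rng.normal(0, 1.0, 100), rng.normal(0, 0.1, 100)]

def prune_layerwise(layers, cr):
    """Zero the smallest-magnitude fraction `cr` of weights in each
    layer separately (layer-wise magnitude pruning)."""
    out = []
    for w in layers:
        thr = np.quantile(np.abs(w), cr)
        out.append(np.where(np.abs(w) >= thr, w, 0.0))
    return out

def prune_networkwide(layers, cr):
    """Zero the smallest-magnitude fraction `cr` of weights using a
    single threshold over all layers (network-wide magnitude pruning)."""
    thr = np.quantile(np.abs(np.concatenate(layers)), cr)
    return [np.where(np.abs(w) >= thr, w, 0.0) for w in layers]

lw = prune_layerwise(layers, 0.5)
nw = prune_networkwide(layers, 0.5)
```

Under the global threshold, the small-scale layer loses far more than half of its weights, which is one mechanism behind the differing behavior of the two schemes across compression ratios.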