28 research outputs found

    Geometric Data Augmentations to Mitigate Distribution Shifts in Pollen Classification from Microscopic Images

    Full text link
    Distribution shifts are characterized by differences between the training and test data distributions. They can significantly reduce the accuracy of machine learning models deployed in real-world scenarios. This paper explores the distribution shift problem when classifying pollen grains from microscopic images collected in the wild with a low-cost camera sensor. We leverage the domain knowledge that geometric features are highly important for accurate pollen identification and introduce two novel geometric image augmentation techniques that significantly narrow the accuracy gap between model performance on the training and test datasets. In particular, we show that the Tenengrad and ImageToSketch filters are highly effective at balancing shape and texture information while leaving out unimportant details that may confuse the model. Extensive evaluations on various model architectures demonstrate a consistent improvement in model generalization to field data of up to 14% achieved by the geometric augmentation techniques when compared to a wide range of standard image augmentations. The approach is validated through an ablation study using pollen hydration tests to recover the shape of dry pollen grains. The proposed geometric augmentations also receive the highest scores according to the affinity and diversity measures from the literature. Comment: 16 pages, 6 figures, ICPADS 202
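    The Tenengrad operator mentioned above is, in its common form, a gradient-magnitude edge measure built from Sobel filters. The following is a minimal NumPy sketch of that general technique, not the paper's exact augmentation pipeline; the kernel and threshold choices are assumptions.

```python
import numpy as np

def tenengrad(image, threshold=0.0):
    """Tenengrad-style edge map: Sobel gradient magnitude per pixel.

    A minimal sketch of the general technique; the paper's exact
    filter parameters are not reproduced here.
    """
    # Sobel kernels for horizontal and vertical gradients
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    mag = np.sqrt(gx ** 2 + gy ** 2)
    # Suppress weak responses so only shape-carrying edges survive
    return np.where(mag > threshold, mag, 0.0)
```

    Applied as an augmentation, such a filter keeps the grain's outline while discarding fine texture, which matches the shape-over-texture motivation stated in the abstract.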

    Representing Input Transformations by Low-Dimensional Parameter Subspaces

    Full text link
    Deep models lack robustness to simple input transformations such as rotation, scaling, and translation, unless they feature a particular invariant architecture or undergo specific training, e.g., learning the desired robustness from data augmentations. Alternatively, input transformations can be treated as a domain shift problem and solved by post-deployment model adaptation. Although a large number of methods deal with transformed inputs, the fundamental relation between input transformations and optimal model weights is unknown. In this paper, we put forward the configuration subspace hypothesis that model weights optimal for parameterized continuous transformations can reside in low-dimensional linear subspaces. We introduce subspace-configurable networks to learn these subspaces and observe their structure and surprisingly low dimensionality on all tested transformations, datasets, and architectures from the computer vision and audio signal processing domains. Our findings enable efficient model reconfiguration, especially when limited storage and computing resources are at stake.
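    The hypothesis above can be sketched concretely: the weights for a given transformation parameter are a linear combination of a few basis weight vectors, with mixing coefficients produced by a tiny configuration network. The sketch below is an illustration of that idea under assumptions (a hand-rolled softmax head, a rotation angle encoded as cosine/sine), not the paper's implementation.

```python
import numpy as np

def configure_weights(bases, config_vec):
    """Combine D basis weight vectors into one weight vector.

    bases: (D, P) array of D basis vectors over P model weights.
    config_vec: (D,) mixing coefficients from the configuration net.
    """
    return config_vec @ bases  # (D,) @ (D, P) -> (P,)

def config_net(angle, W, b):
    """Map a rotation angle to D mixing coefficients.

    A minimal configuration network: a linear map over the angle's
    (cos, sin) encoding followed by a softmax (an assumption for
    illustration).
    """
    z = W @ np.array([np.cos(angle), np.sin(angle)]) + b
    e = np.exp(z - z.max())  # numerically stable softmax
    return e / e.sum()
```

    Reconfiguring the model for a new transformation then costs one tiny forward pass plus a (D, P) matrix product, which is why limited-resource deployment benefits.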

    Generic Model and Architecture for Cooperating Objects in Sensor Network Environments

    Full text link
    The complexity and heterogeneity of cooperating object applications in ubiquitous environments or of applications in the sensor network domain require the use of generic models and architectures. These architectures should provide support for the following three key issues: flexible installation, management, and reconfiguration of components in the system; optimization strategies whose implementation usually involves the proper management of cross-layer information; and proper adaptation techniques that allow for the self-configuration of nodes and components in the system with minimal human intervention. In this paper, we present one possible instance of such a generic model and architecture and show its applicability using Sustainable Bridges, a sensor network application that requires the analysis of complex sensor data to achieve its goal of effectively monitoring bridges for the detection of structural defects.
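    The three key issues named above can be sketched as a minimal component interface. All names and behaviors below are illustrative assumptions, not the paper's actual API.

```python
class Node:
    """Hypothetical sensor node holding components and cross-layer hints."""
    def __init__(self):
        self.components = {}
        self.hints = {}

class Component:
    """Hypothetical component sketching the three key issues:
    installation/reconfiguration, cross-layer information, and
    self-configuration with minimal human intervention."""
    def __init__(self, name):
        self.name = name
        self.config = {}

    def install(self, node):
        # Flexible installation: register on a node at runtime
        node.components[self.name] = self
        return self

    def reconfigure(self, **params):
        # Flexible management: update parameters without reinstalling
        self.config.update(params)

    def cross_layer_hint(self, node, key, value):
        # Optimization: expose information across layer boundaries
        node.hints[key] = value

    def self_configure(self, node):
        # Adaptation: react to node state without human intervention
        if node.hints.get("battery_low"):
            self.config["duty_cycle"] = 0.1
```

    A structural-monitoring component, for instance, could publish a "battery_low" hint and let every component on the node throttle itself accordingly.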

    REPAIR: REnormalizing Permuted Activations for Interpolation Repair

    Full text link
    In this paper we look into the conjecture of Entezari et al. (2021), which states that if the permutation invariance of neural networks is taken into account, then there is likely no loss barrier to the linear interpolation between SGD solutions. First, we observe that neuron alignment methods alone are insufficient to establish low-barrier linear connectivity between SGD solutions due to a phenomenon we call variance collapse: interpolated deep networks suffer a collapse in the variance of their activations, causing poor performance. Next, we propose REPAIR (REnormalizing Permuted Activations for Interpolation Repair), which mitigates variance collapse by rescaling the preactivations of such interpolated networks. We explore the interaction between our method and the choice of normalization layer, network width, and depth, and demonstrate that using REPAIR on top of neuron alignment methods leads to 60%-100% relative barrier reduction across a wide variety of architecture families and tasks. In particular, we report a 74% barrier reduction for ResNet50 on ImageNet and a 90% barrier reduction for ResNet18 on CIFAR10.
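    The rescaling idea above can be sketched per neuron: affinely map the interpolated network's preactivations so their mean and standard deviation match the interpolation of the two endpoint networks' statistics. This is a minimal sketch of that correction on a single layer's preactivation matrix, assuming the endpoint statistics were precomputed on data; the paper's full procedure is not reproduced here.

```python
import numpy as np

def repair_rescale(h_interp, stats_a, stats_b, alpha=0.5):
    """Rescale interpolated preactivations to interpolated statistics.

    h_interp: (batch, neurons) preactivations of the interpolated net.
    stats_a, stats_b: (mean, std) per neuron for the two endpoint nets.
    alpha: interpolation coefficient between the endpoints.
    """
    # Target statistics: interpolate the endpoints' per-neuron stats
    mu_t = (1 - alpha) * stats_a[0] + alpha * stats_b[0]
    sd_t = (1 - alpha) * stats_a[1] + alpha * stats_b[1]
    # Current (collapsed) statistics of the interpolated network
    mu_i = h_interp.mean(axis=0)
    sd_i = h_interp.std(axis=0) + 1e-8
    # Affine map: standardize, then restore the target mean/std
    return (h_interp - mu_i) / sd_i * sd_t + mu_t
```

    In practice such a correction can be folded into the layer's affine parameters (or a normalization layer), so it adds no inference cost.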

    SCAN: Multi-hop calibration for mobile sensor arrays

    Get PDF
    Urban air pollution monitoring with mobile, portable, low-cost sensors has attracted increasing research interest for its wide spatial coverage and affordability to the general public. However, low-cost air quality sensors not only drift over time but also suffer from cross-sensitivities and dependency on meteorological effects. Therefore, calibration of measurements from low-cost sensors is indispensable to guarantee data accuracy and consistency fit for quantitative studies on air pollution. In this work we propose sensor array network calibration (SCAN), a multi-hop calibration technique for dependent low-cost sensors. SCAN is applicable to sets of co-located, heterogeneous sensors, known as sensor arrays, to compensate for cross-sensitivities and dependencies on meteorological influences. SCAN minimizes error accumulation over multiple hops of sensor arrays, which is unattainable with existing multi-hop calibration techniques. We formulate SCAN as a novel constrained least-squares regression and provide a closed-form expression of its regression parameters. We theoretically prove that SCAN is free from regression dilution even in the presence of measurement noise. In-depth simulations demonstrate that SCAN outperforms various calibration techniques. Evaluations on two real-world low-cost air pollution sensor datasets comprising 66 million samples collected over three years show that SCAN yields 16% to 60% lower error than state-of-the-art calibration techniques.

    DataComp: In search of the next generation of multimodal datasets

    Full text link
    Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl. Participants in our benchmark design new filtering techniques or curate new data sources and then evaluate their new dataset by running our standardized CLIP training code and testing the resulting model on 38 downstream test sets. Our benchmark consists of multiple compute scales spanning four orders of magnitude, which enables the study of scaling trends and makes the benchmark accessible to researchers with varying resources. Our baseline experiments show that the DataComp workflow leads to better training sets. In particular, our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming OpenAI's CLIP ViT-L/14 by 3.7 percentage points while using the same training procedure and compute. We release DataComp and all accompanying code at www.datacomp.ai.
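    The participant workflow described above reduces to: score each candidate image-text pair, keep the pairs that clear a threshold, then train and evaluate on the kept subset. A minimal sketch of that filtering step, where the scoring function (e.g., an image-text similarity score) and the threshold are assumptions and not the DataComp baselines themselves:

```python
def filter_pool(samples, score_fn, threshold):
    """Keep candidate pairs whose quality score clears a threshold.

    samples: iterable of candidate records (image-text pairs).
    score_fn: maps a record to a quality score (an assumption here;
    common choices score image-text similarity).
    """
    return [s for s in samples if score_fn(s) >= threshold]
```

    The benchmark's contribution is holding everything downstream of this step fixed, so differences in the 38-task evaluation are attributable to the filtering choice alone.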

    Pollen Video Library for Benchmarking Detection, Classification, Tracking and Novelty Detection Tasks

    No full text
    Dataset description

    This dataset contains microscopic images and videos of pollen gathered between Feb. and Aug. 2020 in Graz, Austria.

    Pollen images of 16 types ... images_16_types.zip

    - Acer Pseudoplatanus
    - Aesculus Carnea
    - Alnus
    - Anthoxanthum
    - Betula Pendula
    - Brassica
    - Carpinus
    - Corylus
    - Dactylis Glomerata
    - Fraxinus
    - Pinus Nigra
    - Platanus
    - Populus Nigra
    - Prunus Avium
    - Sequoiadendron Giganteum
    - Taxus Baccata

    Pollen video library ... pollen_video_library.zip

    - Each type of pollen is in a separate folder; there may be multiple videos per type.
    - Each pollen folder also includes images cropped from the videos by a YOLO object detection algorithm trained on a subset of pollen images, as described in [1].
    - Cropped file name structure: [Video file name]_[TrackingID]_[Image index of a grain]_[Frame index in video]. For example, if a grain has 5 images, the file names would be:
      Anthoxanthum-grass-20200530-122652_0000000_001_00001.jpg
      Anthoxanthum-grass-20200530-122652_0000000_002_00002.jpg
      ...
      Anthoxanthum-grass-20200530-122652_0000000_005_00005.jpg

    Field data gathered over 3 days in Graz in spring 2020 ... pollen_field_data.zip

    Version 2:

    For experiments on mitigating the distribution shift of pollen identification on field data, 5 types were selected from the field data and manually labeled by an expert. These data are zipped in manual_labeled_field_data_5_types.zip.

    images_5_types_9010_train.zip and images_5_types_9010_val.zip contain the 5 types selected from the library data (images_16_types.zip) that correspond to the field data.

    images_3_types_for_ablation_study.zip contains data at 3 levels of pollen grain hydration. These data are used for the ablation study of model generalization in pollen identification.

    Sample code to load the data and visualize the images is in ...plot_pollen_sample.py. Download and extract ...images_16_types.zip into the same folder as ...plot_pollen_sample.py to run the example.

    Dependencies:

    - opencv
    - numpy
    - matplotlib

    Credit

    [1] N. Cao, M. Meyer, L. Thiele, and O. Saukh. 2020. Automated Pollen Detection with an Affordable Technology. In Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN), 108–119.

    @inproceedings{namcao2020pollen,
      title = {Automated Pollen Detection with an Affordable Technology},
      author = {Nam Cao and Matthias Meyer and Lothar Thiele and Olga Saukh},
      booktitle = {Proceedings of the International Conference on Embedded Wireless Systems and Networks (EWSN)},
      pages = {108--119},
      month = {2},
      year = {2020},
    }

    Appears in the Proceedings of the 3rd Workshop on Data Acquisition To Analysis (DATA '20)

    Deep Neural Network Pruning for Nuclei Instance Segmentation in Hematoxylin & Eosin-Stained Histological Images

    Full text link
    Recently, pruning deep neural networks (DNNs) has received a lot of attention for improving accuracy and generalization power, reducing network size, and increasing inference speed on specialized hardware. Although pruning has mainly been tested on computer vision tasks, its application in the context of medical image analysis has hardly been explored. This work investigates the impact of well-known pruning techniques, namely layer-wise and network-wide magnitude pruning, on nuclei instance segmentation performance in histological images. Our instance segmentation model consists of two main branches: (1) a semantic segmentation branch, and (2) a deep regression branch. We investigate the impact of weight pruning on the performance of both branches separately and on the final nuclei instance segmentation result. Evaluated on two publicly available datasets, our results show that layer-wise pruning delivers slightly better performance than network-wide pruning for small compression ratios (CRs), while for large CRs, network-wide pruning yields superior performance. For semantic segmentation, deep regression, and final instance segmentation, 93.75%, 95%, and 80% of the model weights can be pruned by layer-wise pruning with less than a 2% reduction in the performance of the respective models.
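    The two pruning variants compared above differ only in where the magnitude threshold is computed: per layer, or once over all weights. A minimal NumPy sketch of both (ties at the threshold may prune slightly more than the requested fraction; framework details are omitted):

```python
import numpy as np

def layerwise_prune(weights, cr):
    """Zero the smallest-magnitude fraction `cr` of weights in each layer."""
    pruned = []
    for w in weights:
        k = int(cr * w.size)
        if k == 0:
            pruned.append(w.copy())
            continue
        # Threshold computed within this layer only
        thresh = np.sort(np.abs(w).ravel())[k - 1]
        pruned.append(np.where(np.abs(w) <= thresh, 0.0, w))
    return pruned

def networkwide_prune(weights, cr):
    """Zero the smallest-magnitude fraction `cr` of weights globally."""
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    k = int(cr * all_mags.size)
    if k == 0:
        return [w.copy() for w in weights]
    # One threshold shared by every layer
    thresh = np.sort(all_mags)[k - 1]
    return [np.where(np.abs(w) <= thresh, 0.0, w) for w in weights]
```

    The contrast is visible immediately: network-wide pruning can empty a layer whose weights are uniformly small, while layer-wise pruning sparsifies every layer equally, which is one plausible reason the two behave differently at small versus large compression ratios.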