
    A New Suite of Statistical Algorithms for Bayesian Model Fitting with Both Intrinsic and Extrinsic Uncertainties in Two Dimensions

    Fitting a statistical model to data is one of the most important tools in any scientific or data-driven field. Rigorously fitting a two-dimensional statistical model to data that has intrinsic uncertainties (error bars) in both the independent and dependent variables is a daunting task, especially if the data also have extrinsic uncertainty (sample variance) that cannot be fully accounted for by the error bars. Here, I introduce a novel statistic (the Trotter, Reichart, Konz, or TRK, statistic), developed in Trotter (2011), that is advantageous for model fitting in this "worst-case data" scenario, especially when compared to other methods. I implemented this statistic as a suite of fitting algorithms in C++ with many capabilities, including support for any nonlinear model; probability distribution generation, correlation removal, and custom priors for model parameters; asymmetric uncertainties in the data and/or model; and more. I also built an end-to-end website through which the algorithm can be used easily and generally, with a high degree of customizability. The statistic is applicable to practically any data-driven field, and I show a few examples of its usage within astronomy. This thesis, along with Trotter (2011), forms the foundation for Trotter, Reichart, and Konz (2020), in preparation. The TRK source code and web-based calculator can be found at https://github.com/nickk124/TRK and https://skynet.unc.edu/rcr/calculator/trk, respectively.
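    The TRK statistic itself is derived in Trotter (2011); as a rough illustration of the problem it addresses, below is a minimal sketch (in Python, though the TRK suite itself is C++) of the simpler, commonly used effective-variance likelihood for a line with error bars on both axes plus an extrinsic scatter term. All names and numbers are illustrative, and this is not the TRK statistic.

        import numpy as np
        from scipy.optimize import minimize

        def neg_log_likelihood(params, x, y, sx, sy):
            """Effective-variance likelihood for y = m*x + b with error bars
            on both axes plus an extrinsic scatter term sigma_ext (NOT the
            TRK statistic; a simpler, commonly used approximation)."""
            m, b, log_sigma_ext = params
            sigma_ext2 = np.exp(2 * log_sigma_ext)
            # Project the x error bars onto y through the model slope and add
            # the extrinsic (sample) variance that the error bars miss.
            var_eff = sy**2 + (m * sx)**2 + sigma_ext2
            resid = y - (m * x + b)
            return 0.5 * np.sum(resid**2 / var_eff + np.log(2 * np.pi * var_eff))

        # Toy data with scatter beyond the quoted error bars
        rng = np.random.default_rng(0)
        x_true = np.linspace(0, 10, 50)
        sx, sy = 0.3 * np.ones(50), 0.5 * np.ones(50)
        x = x_true + rng.normal(0, sx)
        y = 2.0 * x_true + 1.0 + rng.normal(0, sy) + rng.normal(0, 1.0, 50)

        fit = minimize(neg_log_likelihood, x0=[1.0, 0.0, 0.0], args=(x, y, sx, sy))
        print(fit.x)  # slope, intercept, log extrinsic scatter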

    The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images

    This paper investigates discrepancies in how neural networks learn from different imaging domains, which are commonly overlooked when adopting computer vision techniques from the domain of natural images to other specialized domains such as medical images. Recent works have found that the generalization error of a trained network typically increases with the intrinsic dimension ($d_{data}$) of its training set. Yet, the steepness of this relationship varies significantly between medical (radiological) and natural imaging domains, with no existing theoretical explanation. We address this gap in knowledge by establishing and empirically validating a generalization scaling law with respect to $d_{data}$, and propose that the substantial scaling discrepancy between the two considered domains may be at least partially attributed to the higher intrinsic "label sharpness" ($K_\mathcal{F}$) of medical imaging datasets, a metric which we propose. Next, we demonstrate an additional benefit of measuring the label sharpness of a training set: it is negatively correlated with the trained model's adversarial robustness, which notably leads to models for medical images having a substantially higher vulnerability to adversarial attack. Finally, we extend our $d_{data}$ formalism to the related metric of learned representation intrinsic dimension ($d_{repr}$), derive a generalization scaling law with respect to $d_{repr}$, and show that $d_{data}$ serves as an upper bound for $d_{repr}$. Our theoretical results are supported by thorough experiments with six models and eleven natural and medical imaging datasets over a range of training set sizes. Our findings offer insights into the influence of intrinsic dataset properties on generalization, representation learning, and robustness in deep neural networks.
    Comment: ICLR 2024. Code: https://github.com/mazurowski-lab/intrinsic-properties
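    The abstract does not say how $d_{data}$ is estimated; a common choice in this literature, and a reasonable stand-in here, is the TwoNN estimator of Facco et al. (2017), which infers dimension from the ratio of each point's two nearest-neighbor distances. A minimal sketch, assuming flattened image vectors:

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def twonn_intrinsic_dimension(X):
            """TwoNN estimator (Facco et al. 2017): the ratio mu = r2/r1 of
            each point's two nearest-neighbor distances follows a Pareto law
            whose exponent is the intrinsic dimension d."""
            nn = NearestNeighbors(n_neighbors=3).fit(X)
            dist, _ = nn.kneighbors(X)     # dist[:, 0] is the point itself
            mu = dist[:, 2] / dist[:, 1]   # r2 / r1 for every point
            mu = mu[mu > 1.0]              # guard against duplicate points
            # Maximum-likelihood estimate of the Pareto exponent
            return len(mu) / np.sum(np.log(mu))

        # Toy check: points on a 5-D plane embedded in 100-D space
        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 5)) @ rng.normal(size=(5, 100))
        print(twonn_intrinsic_dimension(X))  # ~5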

    COMPARATIVE BIOMECHANICAL ANALYSIS OF A FEMALE HAMMER THROW ATHLETE FOR BACK-TO-BACK AMERICAN RECORD YEARS

    Hammer athletes must optimize performance variables to maximize their official distance. Analysis of key performance variables might explain how the subject improved from an American record in 2018 to another record in 2019. A 3-D analysis was performed on trial videos from 2018 and 2019. Release height, release velocity, release angle, and hip-shoulder separation were compared between years and across throws, and their relationships with official distance were assessed. Release height (p < 0.01) and release angle (p < 0.01) were more consistent in 2019 than in 2018. The relationships among official distance, release height (p = 0.06), and hip-shoulder separation (p = 0.04) differed between years. The efficient use of hip-shoulder separation could be responsible for the increase in official distance between years.
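    The release variables above map directly onto the standard projectile model. As a rough illustration of how release velocity, angle, and height combine into flight distance (ignoring air drag and wire effects, which a real biomechanical analysis would not), with hypothetical numbers:

        import numpy as np

        def flight_distance(v, angle_deg, h, g=9.81):
            """Ballistic range of a projectile released at speed v (m/s),
            angle above horizontal (degrees), and height h (m). Drag and
            implement aerodynamics are ignored."""
            theta = np.radians(angle_deg)
            vx, vy = v * np.cos(theta), v * np.sin(theta)
            # Time of flight from h + vy*t - g*t^2/2 = 0 (positive root)
            t = (vy + np.sqrt(vy**2 + 2 * g * h)) / g
            return vx * t

        # Hypothetical values in the range of elite women's hammer throws
        print(flight_distance(v=25.5, angle_deg=40.0, h=1.7))  # ~67 m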

    A systematic study of the foreground-background imbalance problem in deep learning for object detection

    The class imbalance problem in deep learning has been explored in several studies, but there has yet to be a systematic analysis of this phenomenon in object detection. Here, we present comprehensive analyses and experiments on the foreground-background (F-B) imbalance problem in object detection, which is very common and is caused by small, infrequent objects of interest. We experimentally study the effects of different aspects of F-B imbalance (object size, number of objects, dataset size, object type) on detection performance. In addition, we compare 9 leading methods for addressing this problem, including Faster-RCNN, SSD, OHEM, Libra-RCNN, Focal-Loss, GHM, PISA, YOLO-v3, and GFL, on a range of datasets from different imaging domains. We conclude that (1) F-B imbalance can indeed cause a significant drop in detection performance; (2) detection performance is more affected by F-B imbalance when less training data is available; (3) in most cases, decreasing object size leads to a larger performance drop than decreasing the number of objects, given the same change in the ratio of object pixels to non-object pixels; (4) among all selected methods, Libra-RCNN and PISA demonstrate the best performance in addressing F-B imbalance; (5) when the training dataset is large, the choice of method is not impactful; and (6) soft-sampling methods, including Focal-Loss, GHM, and GFL, perform fairly well on average but are relatively unstable.
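    Of the soft-sampling methods compared above, Focal-Loss has the simplest published form (Lin et al. 2017): it scales each example's cross-entropy by (1 - p_t)^gamma so that easy, mostly-background examples contribute little. A minimal binary sketch, using the paper's default hyperparameters rather than values tuned for any dataset here:

        import torch
        import torch.nn.functional as F

        def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
            """Binary focal loss (Lin et al. 2017). Down-weights well-
            classified (mostly background) pixels/anchors so that the rare
            foreground dominates the gradient."""
            p = torch.sigmoid(logits)
            ce = F.binary_cross_entropy_with_logits(logits, targets,
                                                    reduction="none")
            p_t = p * targets + (1 - p) * (1 - targets)  # prob of true class
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            return (alpha_t * (1 - p_t) ** gamma * ce).mean()

        # Heavily imbalanced toy batch: ~1% foreground
        targets = (torch.rand(10000) < 0.01).float()
        logits = torch.randn(10000)
        print(focal_loss(logits, targets))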

    Deep Learning for Breast MRI Style Transfer with Limited Training Data

    In this work we introduce a novel medical image style transfer method, StyleMapper, that can transfer medical scans to an unseen style with access to only limited training data. This is made possible by training our model on an unlimited variety of simulated random medical imaging styles on the training set, making our work more computationally efficient than other style transfer methods. Moreover, our method enables arbitrary style transfer: transferring images to styles unseen in training. This is useful for medical imaging, where images are acquired using different protocols and different scanner models, resulting in a variety of styles that data may need to be transferred between. Methods: Our model disentangles image content from style and can modify an image's style by simply replacing the style encoding with one extracted from a single image of the target style, with no additional optimization required. This also allows the model to distinguish between different styles of images, including those unseen in training. We provide a formal description of the proposed model. Results: Experimental results on breast magnetic resonance images indicate the effectiveness of our method for style transfer. Conclusion: Our style transfer method allows for the alignment of medical images taken with different scanners into a single unified-style dataset, on which downstream tasks such as classification and object detection can then be trained.
    Comment: preprint version, accepted in the Journal of Digital Imaging (JDIM). 16 pages (+ author names + references + supplementary), 6 figures.
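    The key mechanism described above, swapping in a style encoding at inference with no re-optimization, can be sketched as follows. The encoder/decoder architectures below are placeholders, not the paper's actual networks:

        import torch
        import torch.nn as nn

        class StyleSwapNet(nn.Module):
            """Sketch of content/style disentanglement: two encoders and a
            decoder. Style transfer = decode(content of x, style of y).
            Illustrative only."""
            def __init__(self, dim=64):
                super().__init__()
                self.content_enc = nn.Sequential(
                    nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(dim, dim, 3, padding=1))
                self.style_enc = nn.Sequential(      # global style vector
                    nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(dim, dim))
                self.decoder = nn.Conv2d(dim, 1, 3, padding=1)

            def forward(self, x_content, x_style):
                c = self.content_enc(x_content)               # spatial content
                s = self.style_enc(x_style)[:, :, None, None] # broadcast style
                return self.decoder(c + s)                    # fuse and decode

        net = StyleSwapNet()
        scan = torch.randn(1, 1, 64, 64)
        target_style_image = torch.randn(1, 1, 64, 64)
        restyled = net(scan, target_style_image)  # one pass, no optimization
        print(restyled.shape)  # torch.Size([1, 1, 64, 64])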

    Medical Image Segmentation with InTEnt: Integrated Entropy Weighting for Single Image Test-Time Adaptation

    Test-time adaptation (TTA) refers to adapting a trained model to a new domain during testing. Existing TTA techniques rely on having multiple test images from the same domain, yet this may be impractical in real-world applications such as medical imaging, where data acquisition is expensive and imaging conditions vary frequently. Here, we approach the task of adapting a medical image segmentation model with only a single unlabeled test image. Most TTA approaches, which directly minimize the entropy of predictions, fail to improve performance significantly in this setting, in which we also observe the choice of batch normalization (BN) layer statistics to be a highly important yet unstable factor, since only a single test-domain example is available. To overcome this, we propose to instead integrate over predictions made with various estimates of target-domain statistics between the training and test statistics, weighted based on their entropy. Our method, validated on 24 source/target domain splits across 3 medical image datasets, surpasses the leading method by 2.9% Dice coefficient on average.
    Comment: Code and pre-trained weights: https://github.com/mazurowski-lab/single-image-test-time-adaptatio
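    A minimal sketch of the core idea as stated in the abstract: interpolate BN statistics between the stored training values and those computed from the single test image, then average the resulting predictions with entropy-derived weights. The interpolation grid and weighting form below are illustrative assumptions, not the paper's exact procedure:

        import copy
        import torch
        import torch.nn as nn

        def entropy(probs, eps=1e-8):
            """Mean pixel-wise entropy of a softmax prediction map."""
            return -(probs * (probs + eps).log()).sum(dim=1).mean()

        @torch.no_grad()
        def integrated_prediction(model, image,
                                  alphas=(0.0, 0.25, 0.5, 0.75, 1.0)):
            """Blend each BN layer's stats between training values (alpha=0)
            and the single test image's values (alpha=1), then average the
            predictions weighted by exp(-entropy). Sketch only."""
            # BN momentum=1.0 makes running stats equal the test-image stats
            test_model = copy.deepcopy(model)
            for m in test_model.modules():
                if isinstance(m, nn.BatchNorm2d):
                    m.momentum = 1.0
            test_model.train()
            test_model(image)   # one pass to capture test-image BN stats
            test_model.eval()

            preds, weights = [], []
            for a in alphas:
                blended = copy.deepcopy(model)
                for mb, mt in zip(blended.modules(), test_model.modules()):
                    if isinstance(mb, nn.BatchNorm2d):
                        mb.running_mean = (1 - a) * mb.running_mean + a * mt.running_mean
                        mb.running_var = (1 - a) * mb.running_var + a * mt.running_var
                blended.eval()
                p = blended(image).softmax(dim=1)
                preds.append(p)
                weights.append(torch.exp(-entropy(p)))

            w = torch.stack(weights)
            w = w / w.sum()
            return sum(wi * pi for wi, pi in zip(w, preds))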

    The Intrinsic Manifolds of Radiological Images and their Role in Deep Learning

    The manifold hypothesis is a core mechanism behind the success of deep learning, so understanding the intrinsic manifold structure of image data is central to studying how neural networks learn from data. Intrinsic dataset manifolds and their relationship to learning difficulty have recently begun to be studied for the common domain of natural images, but little such research has been attempted for radiological images. We address this here. First, we compare the intrinsic manifold dimensionality of radiological and natural images. We also investigate the relationship between intrinsic dimensionality and generalization ability over a wide range of datasets. Our analysis shows that natural image datasets generally have a higher number of intrinsic dimensions than radiological images. However, the relationship between generalization ability and intrinsic dimensionality is much stronger for medical images, which could be explained by radiological images having intrinsic features that are more difficult to learn. These results give a more principled underpinning for the intuition that radiological images can be more challenging to apply deep learning to than the natural image datasets common in machine learning research. We believe that rather than directly applying models developed for natural images to the radiological imaging domain, more care should be taken in developing architectures and algorithms better tailored to the specific characteristics of this domain. The research shown in our paper, demonstrating these characteristics and the differences from natural images, is an important first step in this direction.
    Comment: preprint version, accepted for MICCAI 2022 (25th International Conference on Medical Image Computing and Computer Assisted Intervention). 8 pages (+ author names + references + supplementary), 4 figures. Code available at https://github.com/mazurowski-lab/radiologyintrinsicmanifold
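    The abstract does not name the dimensionality estimator used; the Levina-Bickel maximum-likelihood estimator is a standard choice for this kind of cross-domain comparison and is sketched here as a representative example:

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def mle_intrinsic_dimension(X, k=10):
            """Levina-Bickel (2004) MLE of intrinsic dimension: for each
            point, invert the mean log-ratio of its k-NN distances, then
            average over points."""
            nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
            dist, _ = nn.kneighbors(X)
            dist = dist[:, 1:]  # drop each point's zero self-distance
            # m(x) = (k-1) / sum_{j<k} log(T_k(x) / T_j(x))
            log_ratios = np.log(dist[:, -1:] / dist[:, :-1])
            m = (k - 1) / log_ratios.sum(axis=1)
            return m.mean()

        # Datasets would be compared by running this on equally sized
        # samples of flattened (or feature-extracted) images per dataset.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 8)) @ rng.normal(size=(8, 64))
        print(mle_intrinsic_dimension(X))  # ~8 (an 8-D manifold in 64-D)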

    Understanding the Inner Workings of Language Models Through Representation Dissimilarity

    As language models are applied to an increasing number of real-world applications, understanding their inner workings has become an important issue for model trust, interpretability, and transparency. In this work we show that representation dissimilarity measures, functions that quantify the extent to which two models' internal representations differ, can be a valuable tool for gaining insight into the mechanics of language models. Among our insights are: (i) an apparent asymmetry in the internal representations of models using SoLU and GeLU activation functions, (ii) evidence that dissimilarity measures can identify and locate generalization properties of models that are invisible via in-distribution test set performance, and (iii) new evaluations of how language model features vary as width and depth are increased. Our results suggest that dissimilarity measures are a promising set of tools for shedding light on the inner workings of language models.
    Comment: EMNLP 2023 (main)
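    The abstract does not list its specific measures; linear CKA (Kornblith et al. 2019) is a widely used representation (dis)similarity measure and serves as a concrete example of the kind of function meant, sketched here:

        import numpy as np

        def linear_cka(X, Y):
            """Linear CKA (Kornblith et al. 2019) between activation
            matrices X (n, d1) and Y (n, d2) over the same n inputs.
            1 = identical geometry; dissimilarity can be taken as 1 - CKA."""
            X = X - X.mean(axis=0)
            Y = Y - Y.mean(axis=0)
            hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
            return hsic / (np.linalg.norm(X.T @ X, "fro")
                           * np.linalg.norm(Y.T @ Y, "fro"))

        # Same inputs through two hypothetical layers/models
        rng = np.random.default_rng(0)
        acts_a = rng.normal(size=(2000, 64))          # e.g., one layer's features
        acts_b = acts_a + 0.1 * rng.normal(size=(2000, 64))  # perturbed copy
        print(linear_cka(acts_a, acts_b))             # close to 1
        print(linear_cka(acts_a, rng.normal(size=(2000, 64))))  # near 0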

    Attributing Learned Concepts in Neural Networks to Training Data

    By now there is substantial evidence that deep learning models learn certain human-interpretable features as part of their internal representations of data. As having the right (or wrong) concepts is critical to trustworthy machine learning systems, it is natural to ask which inputs from the model's original training set were most important for learning a concept at a given layer. To answer this, we combine data attribution methods with methods for probing the concepts learned by a model. Training network and probe ensembles for two concept datasets on a range of network layers, we use the recently developed TRAK method for large-scale data attribution. We find some evidence of convergence: removing the 10,000 top-attributing images for a concept and retraining the model changes neither the location of the concept in the network nor the probing sparsity of the concept. This suggests that, rather than being highly dependent on a few specific examples, the features that inform the development of a concept are spread more diffusely across its exemplars, implying robustness in concept formation.
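    The probing half of this pipeline is straightforward to sketch: fit a sparse linear probe on a layer's activations to predict concept presence; the layers at which the probe succeeds locate the concept, and the probe's weight density gives the "probing sparsity" mentioned above. This is a generic sketch, not the paper's exact setup:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def probe_concept(activations, concept_labels, l1_strength=0.1):
            """Fit a sparse (L1) linear probe on one layer's activations and
            report in-sample accuracy and the fraction of nonzero weights.
            Generic sketch of concept probing."""
            probe = LogisticRegression(penalty="l1", solver="liblinear",
                                       C=1.0 / l1_strength, max_iter=1000)
            probe.fit(activations, concept_labels)
            acc = probe.score(activations, concept_labels)
            nonzero_frac = np.mean(probe.coef_ != 0)
            return acc, nonzero_frac

        # Hypothetical concept that is linearly present in a 512-d layer
        rng = np.random.default_rng(0)
        acts = rng.normal(size=(1000, 512))
        labels = (acts[:, :5].sum(axis=1) > 0).astype(int)  # uses 5 features
        print(probe_concept(acts, labels))  # high accuracy, few nonzero weights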

    Segment Anything Model for Medical Image Analysis: an Experimental Study

    Training segmentation models for medical images continues to be challenging due to the limited availability and acquisition expense of data annotations. The Segment Anything Model (SAM) is a foundation model trained on over 1 billion annotations, predominantly for natural images, that is intended to segment a user-defined object of interest in an interactive manner. Despite its impressive performance on natural images, it is unclear how the model is affected when shifted to medical image domains. Here, we perform an extensive evaluation of SAM's ability to segment medical images on a collection of 11 medical imaging datasets from various modalities and anatomies. In our experiments, we generated point prompts using a standard method that simulates interactive segmentation. Experimental results show that SAM's performance based on single prompts varies highly depending on the task and the dataset, from 0.1135 IoU for a spine MRI dataset to 0.8650 IoU for a hip X-ray dataset. Performance appears to be high for tasks involving well-circumscribed objects with unambiguous prompts, and poorer in many other scenarios, such as segmentation of tumors. When multiple prompts are provided, performance improves only slightly overall, but more so for datasets where the object is not contiguous. An additional comparison to RITM showed much better performance of SAM for one prompt, but similar performance of the two methods for larger numbers of prompts. We conclude that SAM shows impressive performance for some datasets given the zero-shot learning setup, but poor to moderate performance for multiple other datasets. While SAM as a model and as a learning paradigm might be impactful in the medical imaging domain, extensive research is needed to identify the proper ways of adapting it in this domain.
    Comment: Link to our code: https://github.com/mazurowski-lab/segment-anything-medica
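    Both halves of this evaluation, simulating a point prompt and scoring by IoU, can be sketched without SAM itself. The prompt heuristic below (the interior pixel farthest from the mask boundary) is one standard way to simulate a user's click and is an assumption here, not necessarily the paper's exact method:

        import numpy as np
        from scipy import ndimage

        def simulate_point_prompt(gt_mask):
            """Pick the foreground pixel farthest from the mask boundary via
            the distance transform, simulating an unambiguous user click."""
            dist = ndimage.distance_transform_edt(gt_mask)
            y, x = np.unravel_index(np.argmax(dist), dist.shape)
            return (x, y)  # SAM-style (x, y) point prompt

        def iou(pred_mask, gt_mask):
            """Intersection over union between binary masks."""
            inter = np.logical_and(pred_mask, gt_mask).sum()
            union = np.logical_or(pred_mask, gt_mask).sum()
            return inter / union

        # Toy ground truth and an (imagined) model prediction
        gt = np.zeros((64, 64), bool); gt[20:40, 20:40] = True
        pred = np.zeros((64, 64), bool); pred[22:42, 22:42] = True
        print(simulate_point_prompt(gt))  # (29, 29), near the square's center
        print(round(iou(pred, gt), 4))    # 0.6807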