9 research outputs found

    Deep Fast Vision: A Python Library for Accelerated Deep Transfer Learning Vision Prototyping

    Full text link
    Deep learning-based vision is characterized by intricate frameworks that often necessitate a profound understanding, presenting a barrier to newcomers and limiting broad adoption. With many researchers grappling with the constraints of smaller datasets, there is a pronounced reliance on pre-trained neural networks, especially for tasks such as image classification. This reliance is further intensified in niche imaging areas where obtaining vast datasets is challenging. Despite the widespread use of transfer learning as a remedy to the small-dataset dilemma, a conspicuous absence of tailored auto-ML solutions persists. Addressing these challenges is "Deep Fast Vision", a Python library that streamlines the deep learning process. This tool offers a user-friendly experience, enabling results through a simple nested dictionary definition, helping to democratize deep learning for non-experts. Designed for simplicity and scalability, Deep Fast Vision serves as a bridge, connecting the complexities of existing deep learning frameworks with the needs of a diverse user base. Comment: 7 pages, 1 figure
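    As a rough illustration of the nested-dictionary idea mentioned above, the sketch below drives an ordinary Keras transfer-learning model from a dictionary; the keys and the build_model helper are hypothetical, not Deep Fast Vision's actual schema or API.

```python
# Hypothetical sketch only: a nested-dictionary config driving ordinary Keras transfer
# learning. The keys and the build_model helper are illustrative assumptions, not
# Deep Fast Vision's actual schema or API.
import tensorflow as tf

config = {
    "backbone": {"name": "VGG16", "weights": "imagenet", "trainable": False},
    "head": {"dense_units": 256, "dropout": 0.3, "num_classes": 5},
    "training": {"optimizer": "adam"},
}

def build_model(cfg):
    # Look up the requested pre-trained backbone and use it as a frozen feature extractor.
    backbone_cls = getattr(tf.keras.applications, cfg["backbone"]["name"])
    base = backbone_cls(include_top=False, weights=cfg["backbone"]["weights"],
                        input_shape=(224, 224, 3), pooling="avg")
    base.trainable = cfg["backbone"]["trainable"]
    # Attach a small classification head defined entirely by the dictionary.
    head = cfg["head"]
    x = tf.keras.layers.Dense(head["dense_units"], activation="relu")(base.output)
    x = tf.keras.layers.Dropout(head["dropout"])(x)
    out = tf.keras.layers.Dense(head["num_classes"], activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=cfg["training"]["optimizer"],
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_model(config)
```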

    Adaptive Variance Thresholding: A Novel Approach to Improve Existing Deep Transfer Vision Models and Advance Automatic Knee-Joint Osteoarthritis Classification

    Full text link
    Knee-Joint Osteoarthritis (KOA) is a prevalent cause of global disability and is inherently complex to diagnose due to its subtle radiographic markers and individualized progression. One promising classification avenue involves applying deep learning methods; however, these techniques demand extensive, diversified datasets, which pose substantial challenges due to medical data collection restrictions. Existing practices typically resort to smaller datasets and transfer learning. However, this approach often inherits unnecessary pre-learned features that can clutter the classifier's vector space, potentially hampering performance. This study proposes a novel paradigm for improving post-training specialized classifiers by introducing adaptive variance thresholding (AVT) followed by Neural Architecture Search (NAS). This approach led to two key outcomes: an increase in the initial accuracy of the pre-trained KOA models and a 60-fold reduction in the NAS input vector space, thus facilitating faster inference and a more efficient hyperparameter search. We also applied this approach to an external model trained for KOA classification. Despite its initial performance, applying our methodology improved its average accuracy, making it one of the top three KOA classification models. Comment: 26 pages, 5 figures
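    A minimal sketch of the general idea of variance thresholding on pre-trained feature vectors is shown below; the percentile-based threshold and array shapes are assumptions standing in for the paper's exact AVT formulation.

```python
# Sketch of variance-based pruning of pre-trained feature vectors prior to NAS.
# The percentile-based "adaptive" threshold below is an assumption standing in for the
# paper's exact AVT rule; the array shapes are toy values.
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 4096))       # e.g. penultimate-layer CNN embeddings
features[:, :3700] *= 0.01                     # simulate many near-constant dimensions

variances = features.var(axis=0)
threshold = np.percentile(variances, 90)       # adaptive: derived from the data itself
selector = VarianceThreshold(threshold=threshold)
reduced = selector.fit_transform(features)     # keep only the most variable dimensions

print(features.shape, "->", reduced.shape)
```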

    Synthesizing Bidirectional Temporal States of Knee Osteoarthritis Radiographs with Cycle-Consistent Generative Adversarial Neural Networks

    Full text link
    Knee Osteoarthritis (KOA), a leading cause of disability worldwide, is challenging to detect early due to subtle radiographic indicators. Diverse, extensive datasets are needed but are challenging to compile because of privacy concerns, data collection limitations, and the progressive nature of KOA. However, a model capable of projecting genuine radiographs into different osteoarthritis stages could augment data pools, enhance algorithm training, and offer pre-emptive prognostic insights. In this study, we trained a CycleGAN model to synthesize past and future stages of KOA on any genuine radiograph. The model was validated using a convolutional neural network that was deceived into misclassifying disease stages in transformed images, demonstrating the CycleGAN's ability to effectively transform disease characteristics forward or backward in time. The model was particularly effective in synthesizing future disease states and showed an exceptional ability to retroactively transition late-stage radiographs to earlier stages by eliminating osteophytes and expanding knee joint space, signature characteristics of None or Doubtful KOA. These results signify promising potential for enhancing diagnostic models, data augmentation, and educational and prognostic use in healthcare. Nevertheless, further refinement, validation, and a broader evaluation process encompassing both CNN-based assessments and expert medical feedback are emphasized for future research and development. Comment: 29 pages, 10 figures
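    For readers unfamiliar with CycleGAN, the sketch below shows the cycle-consistency term that enforces round-trip fidelity between the two translation directions; the generator names are hypothetical and the adversarial losses are omitted.

```python
# Minimal sketch of the cycle-consistency term that lets a CycleGAN map radiographs both
# forward and backward in "disease time". G_early2late and G_late2early are hypothetical
# generator networks; discriminators and adversarial losses are omitted.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G_early2late, G_late2early, real_early, real_late, lam=10.0):
    # Early -> late -> back to early: the reconstruction should match the original image.
    recovered_early = G_late2early(G_early2late(real_early))
    # Late -> early -> back to late.
    recovered_late = G_early2late(G_late2early(real_late))
    return lam * (l1(recovered_early, real_early) + l1(recovered_late, real_late))
```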

    Improving Performance in Colorectal Cancer Histology Decomposition using Deep and Ensemble Machine Learning

    Full text link
    In routine colorectal cancer management, histologic samples stained with hematoxylin and eosin are commonly used. Nonetheless, their potential for defining objective biomarkers for patient stratification and treatment selection is still being explored. The current gold standard relies on expensive and time-consuming genetic tests. However, recent research highlights the potential of convolutional neural networks (CNNs) in facilitating the extraction of clinically relevant biomarkers from these readily available images. These CNN-based biomarkers can predict patient outcomes comparably to gold standards, with the added advantages of speed, automation, and minimal cost. The predictive potential of CNN-based biomarkers fundamentally relies on the ability of CNNs to accurately classify diverse tissue types from whole slide microscope images. Consequently, enhancing the accuracy of tissue class decomposition is critical to amplifying the prognostic potential of imaging-based biomarkers. This study introduces a hybrid deep and ensemble machine learning model that surpassed all preceding solutions for this classification task. Our model achieved 96.74% accuracy on the external test set and 99.89% on the internal test set. Recognizing the potential of these models in advancing the task, we have made them publicly available for further research and development. Comment: 28 pages, 9 figures
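    As one plausible reading of a "hybrid deep and ensemble" design, the sketch below feeds pre-extracted CNN features into a soft-voting ensemble; the estimators, shapes, and class count are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch of one way to pair deep features with a classical ensemble for tissue
# classification. The feature matrix, class count, and estimator choices are illustrative
# assumptions, not the paper's exact hybrid model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
deep_features = rng.normal(size=(2000, 512))   # stand-in for CNN embeddings of tissue tiles
labels = rng.integers(0, 9, size=2000)         # illustrative: nine tissue classes

X_tr, X_te, y_tr, y_te = train_test_split(deep_features, labels, test_size=0.2, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",                             # average predicted class probabilities
)
ensemble.fit(X_tr, y_tr)
print("toy accuracy:", ensemble.score(X_te, y_te))
```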

    H&E Multi-Laboratory Staining Variance Exploration with Machine Learning

    Get PDF
    In diagnostic histopathology, hematoxylin and eosin (H&E) staining is a critical process that highlights salient histological features. Staining results vary between laboratories regardless of the histopathological task, although the method does not change. This variance can impair the accuracy of algorithms and histopathologists’ time-to-insight. Investigating this variance can help calibrate stain-normalization tasks to counteract this negative potential. Using machine learning, this study evaluated the staining variance between different laboratories on three tissue types. We received H&E-stained slides from 66 different laboratories. Each slide contained kidney, skin, and colon tissue samples stained by the method routinely used in each laboratory. The samples were digitized and summarized as red, green, and blue channel histograms. Dimensions were reduced using principal component analysis. The data projected by principal components were inserted into the k-means clustering algorithm and the k-nearest neighbors classifier with the laboratories as the target. The k-means silhouette index indicated that K = 2 clusters had the best separability in all tissue types. The supervised classification results showed laboratory effects and tissue-type bias. Both supervised and unsupervised approaches suggested that tissue type also affected inter-laboratory variance. We suggest that tissue type also be considered when choosing the staining and color-normalization approach.
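    The analysis pipeline described here maps directly onto standard scikit-learn components; the sketch below mirrors it with synthetic data, so the shapes and parameter values are illustrative rather than the study's exact settings.

```python
# Sketch of the pipeline described above: per-sample RGB histograms -> PCA -> k-means
# (scored with the silhouette index) and a k-NN classifier with laboratory as the target.
# Array shapes and parameter values are illustrative, not the study's exact settings.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_labs, samples_per_lab = 66, 5
histograms = rng.random((n_labs * samples_per_lab, 3 * 256))  # concatenated R, G, B histograms
lab_ids = np.repeat(np.arange(n_labs), samples_per_lab)       # laboratory labels

projected = PCA(n_components=10).fit_transform(histograms)

for k in (2, 3, 4):                                           # the study found K = 2 best
    clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(projected)
    print(k, silhouette_score(projected, clusters))

knn = KNeighborsClassifier(n_neighbors=3).fit(projected, lab_ids)
print("training-set accuracy:", knn.score(projected, lab_ids))
```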

    Developing and testing sub-band spectral features in music genre and music mood machine learning

    No full text
    In the field of artificial intelligence, supervised machine learning enables us to develop automatic recognition systems. In music information retrieval, training and testing such systems is possible with a variety of music datasets. Two key prediction tasks are music genre recognition and music mood recognition. The focus of this study is to evaluate the classification of music into genre and mood categories from the audio content. To this end, we evaluate five novel spectro-temporal variants of sub-band musical features: sub-band entropy, sub-band flux, sub-band kurtosis, sub-band skewness, and sub-band zero-crossing rate. The choice of features is based on previous studies that highlight the potential efficacy of sub-band features. To aid our analysis, we include Mel-Frequency Cepstral Coefficients (MFCCs) as our baseline approach. The classification performances are obtained with various learning algorithms, distinct datasets, and multiple feature-selection subsets. To create and evaluate models in both tasks, we use two music datasets prelabelled with respect to music genre (GTZAN) and music mood (PandaMood), respectively. In addition, this study is the first to develop an adaptive window decomposition method for these sub-band features and one of only a handful to use artist filtering and fault filtering for the GTZAN dataset. Our results show that the vast majority of sub-band features outperformed the MFCCs in both the music genre and music mood tasks. Among individual features, sub-band entropy outperformed and outranked every other feature in both tasks and feature-selection approaches. Lastly, we find lower overfitting tendencies for the sub-band features than for the MFCCs. In summary, this study supports the use of these sub-band features for music genre and music mood classification tasks and further suggests uses in other content-based predictive tasks.
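    As an example of one of the evaluated features, the sketch below computes sub-band entropy from an STFT with a fixed equal-width band split; the band edges and summarization are assumptions, and the study's adaptive window decomposition is not reproduced.

```python
# Sketch of one sub-band feature (sub-band entropy) computed from an STFT. The fixed,
# equal-width band split and the exact entropy summarization are illustrative assumptions;
# the study's adaptive window decomposition is not reproduced here.
import numpy as np
from scipy.signal import stft

def sub_band_entropy(audio, sr, n_bands=8, nperseg=2048):
    _, _, Z = stft(audio, fs=sr, nperseg=nperseg)
    power = np.abs(Z) ** 2                               # (frequency bins, time frames)
    bands = np.array_split(power, n_bands, axis=0)       # equal-width frequency bands
    entropies = []
    for band in bands:
        p = band / (band.sum(axis=0, keepdims=True) + 1e-12)   # per-frame spectral distribution
        frame_entropy = -(p * np.log2(p + 1e-12)).sum(axis=0)  # Shannon entropy per frame
        entropies.append(frame_entropy.mean())                 # summarize over time
    return np.array(entropies)                                 # one value per sub-band

sr = 22050
audio = np.random.default_rng(0).normal(size=sr * 3)           # 3 s of noise as a stand-in signal
print(sub_band_entropy(audio, sr))
```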

    DeepFake knee osteoarthritis X-rays from generative adversarial neural networks deceive medical experts and offer augmentation potential to automatic classification

    No full text
    Recent developments in deep learning have impacted medical science. However, new privacy issues and regulatory frameworks have hindered medical data sharing and collection. Deep learning is a data-intensive process, and such regulatory limitations restrict the potential for new breakthroughs and collaborations. However, generating medically accurate synthetic data can alleviate privacy issues and potentially augment deep learning pipelines. This study presents generative adversarial neural networks capable of generating realistic images of knee joint X-rays with varying osteoarthritis severity. We offer 320,000 synthetic (DeepFake) X-ray images generated from training with 5,556 real images. We validated our models for medical accuracy with 15 medical experts and for augmentation effects with an osteoarthritis severity classification task. We devised a survey of 30 real and 30 DeepFake images for the medical experts. On average, more DeepFakes were mistaken for real than the reverse, signifying sufficient DeepFake realism to deceive the medical experts. Finally, our DeepFakes improved classification accuracy in an osteoarthritis severity classification task with scarce real data and transfer learning. In addition, in the same classification task, we replaced all real training data with DeepFakes and suffered only a 3.79% loss from baseline accuracy in classifying real osteoarthritis X-rays.
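    The augmentation setup can be sketched as pooling real and synthetic images into one training set before transfer learning; the directory names, backbone, and class count below are assumptions for illustration only.

```python
# Sketch: mixing scarce real X-rays with synthetic (DeepFake) ones before transfer
# learning. Directory names, the backbone, and the five-class output are assumptions
# for illustration, not the study's exact pipeline.
import tensorflow as tf

real_ds = tf.keras.utils.image_dataset_from_directory(
    "data/real_xrays", image_size=(224, 224), batch_size=32)
fake_ds = tf.keras.utils.image_dataset_from_directory(
    "data/deepfake_xrays", image_size=(224, 224), batch_size=32)
train_ds = real_ds.concatenate(fake_ds).shuffle(1000)     # pool real and synthetic images

base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling="avg")
base.trainable = False                                    # freeze the pre-trained backbone

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),                 # coarse normalization for the sketch
    base,
    tf.keras.layers.Dense(5, activation="softmax"),       # e.g. five severity grades
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=10)                          # placeholder paths above, so not run here
```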

    Machine learning predicts upper secondary education dropout as early as the end of primary school

    No full text
    Education plays a pivotal role in alleviating poverty, driving economic growth, and empowering individuals, thereby significantly influencing societal and personal development. However, the persistent issue of school dropout poses a significant challenge, with its effects extending beyond the individual. While previous research has employed machine learning for dropout classification, these studies often suffer from a short-term focus, relying on data collected only a few years into the study period. This study expanded the modeling horizon by utilizing a 13-year longitudinal dataset, encompassing data from kindergarten to Grade 9. Our methodology incorporated a comprehensive range of parameters, including students’ academic and cognitive skills, motivation, behavior, well-being, and officially recorded dropout data. The machine learning models developed in this study demonstrated notable classification ability, achieving a mean area under the curve (AUC) of 0.61 with data up to Grade 6 and an improved AUC of 0.65 with data up to Grade 9. Further data collection and independent correlational and causal analyses are crucial. In future iterations, such models may have the potential to proactively support educators’ processes and existing protocols for identifying at-risk students, thereby aiding in the reinvention of student retention and success strategies and ultimately contributing to improved educational outcomes.
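    The reported AUC values can be reproduced in form (not in value) with a standard cross-validated classifier; the sketch below uses placeholder features and a gradient-boosting model purely for illustration.

```python
# Sketch: estimating the AUC of a dropout classifier with cross-validation. The feature
# matrix and the gradient-boosting model are placeholders, not the study's actual
# longitudinal features or pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_students = 1000
X_grade6 = rng.normal(size=(n_students, 40))   # stand-in for skills, motivation, well-being, etc.
dropout = rng.integers(0, 2, size=n_students)  # officially recorded dropout label (0/1)

auc_scores = cross_val_score(
    GradientBoostingClassifier(random_state=0), X_grade6, dropout, cv=5, scoring="roc_auc")
print("mean AUC:", auc_scores.mean())          # the study reports 0.61 (Grade 6) and 0.65 (Grade 9)
```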

    Improved accuracy in colorectal cancer tissue decomposition through refinement of established deep learning solutions

    No full text
    Hematoxylin and eosin-stained biopsy slides are regularly available for colorectal cancer patients. These slides are often not used to define objective biomarkers for patient stratification and treatment selection. Standard biomarkers often pertain to costly and slow genetic tests. However, recent work has shown that relevant biomarkers can be extracted from these images using convolutional neural networks (CNNs). The CNN-based biomarkers predicted colorectal cancer patient outcomes comparably to gold standards. Extracting CNN-based biomarkers is fast, automatic, and of minimal cost. CNN-based biomarkers rely on the ability of CNNs to recognize distinct tissue types from microscope whole slide images. The quality of these biomarkers (coined ‘Deep Stroma’) depends on the accuracy of CNNs in decomposing all relevant tissue classes. Improving tissue decomposition accuracy is essential for improving the prognostic potential of CNN-based biomarkers. In this study, we implemented a novel training strategy to refine an established CNN model, which then surpassed all previous solutions. We obtained a 95.6% average accuracy on the external test set and 99.5% on the internal test set. Our approach reduced errors in biomarker-relevant classes, such as Lymphocytes, and was the first to include interpretability methods. These methods were used to better understand our model’s limitations and capabilities.
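    The abstract does not specify which interpretability methods were used; as one common example for CNN tissue classifiers, a Grad-CAM-style saliency map can be sketched as follows (the layer name and usage are assumptions).

```python
# Sketch of a Grad-CAM-style saliency map, one common CNN interpretability technique.
# The abstract does not name the interpretability methods actually used, so this is
# illustrative only; last_conv_layer_name must match a layer in the chosen model.
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    # Map the input to both the last convolutional feature maps and the class scores.
    grad_model = tf.keras.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(last_conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)            # sensitivity of the class score
    weights = tf.reduce_mean(grads, axis=(1, 2))            # channel weights via global averaging
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                                   # keep positively contributing regions
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()      # normalize heat map to [0, 1]
```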