9 research outputs found

    Deep Fast Vision: A Python Library for Accelerated Deep Transfer Learning Vision Prototyping

    Full text link
    Deep learning-based vision is characterized by intricate frameworks that often necessitate a profound understanding, presenting a barrier to newcomers and limiting broad adoption. With many researchers grappling with the constraints of smaller datasets, there is a pronounced reliance on pre-trained neural networks, especially for tasks such as image classification. This reliance is further intensified in niche imaging areas where obtaining vast datasets is challenging. Despite the widespread use of transfer learning as a remedy to the small-dataset dilemma, a conspicuous absence of tailored auto-ML solutions persists. Addressing these challenges is "Deep Fast Vision", a Python library that streamlines the deep learning process. This tool offers a user-friendly experience, enabling results through a simple nested dictionary definition, helping to democratize deep learning for non-experts. Designed for simplicity and scalability, Deep Fast Vision serves as a bridge, connecting the complexities of existing deep learning frameworks with the needs of a diverse user base. Comment: 7 pages, 1 figure
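    As a rough illustration of the nested-dictionary idea mentioned above, the sketch below drives an ordinary Keras transfer-learning model from a dictionary; the keys and the build_model helper are hypothetical, not Deep Fast Vision's actual schema or API.

```python
# Hypothetical sketch only: a nested-dictionary config driving ordinary Keras transfer
# learning. The keys and the build_model helper are illustrative assumptions, not
# Deep Fast Vision's actual schema or API.
import tensorflow as tf

config = {
    "backbone": {"name": "VGG16", "weights": "imagenet", "trainable": False},
    "head": {"dense_units": 256, "dropout": 0.3, "num_classes": 5},
    "training": {"optimizer": "adam"},
}

def build_model(cfg):
    # Look up the requested pre-trained backbone and use it as a frozen feature extractor.
    backbone_cls = getattr(tf.keras.applications, cfg["backbone"]["name"])
    base = backbone_cls(include_top=False, weights=cfg["backbone"]["weights"],
                        input_shape=(224, 224, 3), pooling="avg")
    base.trainable = cfg["backbone"]["trainable"]
    # Attach a small classification head defined entirely by the dictionary.
    head = cfg["head"]
    x = tf.keras.layers.Dense(head["dense_units"], activation="relu")(base.output)
    x = tf.keras.layers.Dropout(head["dropout"])(x)
    out = tf.keras.layers.Dense(head["num_classes"], activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer=cfg["training"]["optimizer"],
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_model(config)
```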

    Adaptive Variance Thresholding: A Novel Approach to Improve Existing Deep Transfer Vision Models and Advance Automatic Knee-Joint Osteoarthritis Classification

    Full text link
    Knee-Joint Osteoarthritis (KOA) is a prevalent cause of global disability and is inherently complex to diagnose due to its subtle radiographic markers and individualized progression. One promising classification avenue involves applying deep learning methods; however, these techniques demand extensive, diversified datasets, which pose substantial challenges due to medical data collection restrictions. Existing practices typically resort to smaller datasets and transfer learning. However, this approach often inherits unnecessary pre-learned features that can clutter the classifier's vector space, potentially hampering performance. This study proposes a novel paradigm for improving post-training specialized classifiers by introducing adaptive variance thresholding (AVT) followed by Neural Architecture Search (NAS). This approach led to two key outcomes: an increase in the initial accuracy of the pre-trained KOA models and a 60-fold reduction in the NAS input vector space, thus facilitating faster inference and a more efficient hyperparameter search. We also applied this approach to an external model trained for KOA classification. Despite its initial performance, applying our methodology improved its average accuracy, making it one of the top three KOA classification models. Comment: 26 pages, 5 figures
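    A minimal sketch of the general idea of variance thresholding on pre-trained feature vectors is shown below; the percentile-based threshold and array shapes are assumptions standing in for the paper's exact AVT formulation.

```python
# Sketch of variance-based pruning of pre-trained feature vectors prior to NAS.
# The percentile-based "adaptive" threshold below is an assumption standing in for the
# paper's exact AVT rule; the array shapes are toy values.
import numpy as np
from sklearn.feature_selection import VarianceThreshold

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 4096))       # e.g. penultimate-layer CNN embeddings
features[:, :3700] *= 0.01                     # simulate many near-constant dimensions

variances = features.var(axis=0)
threshold = np.percentile(variances, 90)       # adaptive: derived from the data itself
selector = VarianceThreshold(threshold=threshold)
reduced = selector.fit_transform(features)     # keep only the most variable dimensions

print(features.shape, "->", reduced.shape)
```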

    Synthesizing Bidirectional Temporal States of Knee Osteoarthritis Radiographs with Cycle-Consistent Generative Adversarial Neural Networks

    Full text link
    Knee Osteoarthritis (KOA), a leading cause of disability worldwide, is challenging to detect early due to subtle radiographic indicators. Diverse, extensive datasets are needed but are challenging to compile because of privacy concerns, data collection limitations, and the progressive nature of KOA. However, a model capable of projecting genuine radiographs into different osteoarthritis stages could augment data pools, enhance algorithm training, and offer pre-emptive prognostic insights. In this study, we trained a CycleGAN model to synthesize past and future stages of KOA on any genuine radiograph. The model was validated using a convolutional neural network that was deceived into misclassifying disease stages in transformed images, demonstrating the CycleGAN's ability to effectively transform disease characteristics forward or backward in time. The model was particularly effective in synthesizing future disease states and showed an exceptional ability to retroactively transition late-stage radiographs to earlier stages by eliminating osteophytes and expanding knee joint space, signature characteristics of None or Doubtful KOA. These results signify promising potential for enhancing diagnostic models, data augmentation, and educational and prognostic use in healthcare. Nevertheless, further refinement, validation, and a broader evaluation process encompassing both CNN-based assessments and expert medical feedback are emphasized for future research and development. Comment: 29 pages, 10 figures
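    For readers unfamiliar with CycleGAN, the sketch below shows the cycle-consistency term that enforces round-trip fidelity between the two translation directions; the generator names are hypothetical and the adversarial losses are omitted.

```python
# Minimal sketch of the cycle-consistency term that lets a CycleGAN map radiographs both
# forward and backward in "disease time". G_early2late and G_late2early are hypothetical
# generator networks; discriminators and adversarial losses are omitted.
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G_early2late, G_late2early, real_early, real_late, lam=10.0):
    # Early -> late -> back to early: the reconstruction should match the original image.
    recovered_early = G_late2early(G_early2late(real_early))
    # Late -> early -> back to late.
    recovered_late = G_early2late(G_late2early(real_late))
    return lam * (l1(recovered_early, real_early) + l1(recovered_late, real_late))
```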

    Improving Performance in Colorectal Cancer Histology Decomposition using Deep and Ensemble Machine Learning

    Full text link
    In routine colorectal cancer management, histologic samples stained with hematoxylin and eosin are commonly used. Nonetheless, their potential for defining objective biomarkers for patient stratification and treatment selection is still being explored. The current gold standard relies on expensive and time-consuming genetic tests. However, recent research highlights the potential of convolutional neural networks (CNNs) in facilitating the extraction of clinically relevant biomarkers from these readily available images. These CNN-based biomarkers can predict patient outcomes comparably to gold standards, with the added advantages of speed, automation, and minimal cost. The predictive potential of CNN-based biomarkers fundamentally relies on the ability of CNNs to accurately classify diverse tissue types from whole slide microscope images. Consequently, enhancing the accuracy of tissue class decomposition is critical to amplifying the prognostic potential of imaging-based biomarkers. This study introduces a hybrid deep and ensemble machine learning model that surpassed all preceding solutions for this classification task. Our model achieved 96.74% accuracy on the external test set and 99.89% on the internal test set. Recognizing the potential of these models in advancing the task, we have made them publicly available for further research and development. Comment: 28 pages, 9 figures
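    As one plausible reading of a "hybrid deep and ensemble" design, the sketch below feeds pre-extracted CNN features into a soft-voting ensemble; the estimators, shapes, and class count are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch of one way to pair deep features with a classical ensemble for tissue
# classification. The feature matrix, class count, and estimator choices are illustrative
# assumptions, not the paper's exact hybrid model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
deep_features = rng.normal(size=(2000, 512))   # stand-in for CNN embeddings of tissue tiles
labels = rng.integers(0, 9, size=2000)         # illustrative: nine tissue classes

X_tr, X_te, y_tr, y_te = train_test_split(deep_features, labels, test_size=0.2, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",                             # average predicted class probabilities
)
ensemble.fit(X_tr, y_tr)
print("toy accuracy:", ensemble.score(X_te, y_te))
```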

    H&E Multi-Laboratory Staining Variance Exploration with Machine Learning

    Get PDF
    In diagnostic histopathology, hematoxylin and eosin (H&E) staining is a critical process that highlights salient histological features. Staining results vary between laboratories regardless of the histopathological task, although the method does not change. This variance can impair the accuracy of algorithms and histopathologists’ time-to-insight. Investigating this variance can help calibrate stain-normalization tasks to counteract this negative potential. Using machine learning, this study evaluated the staining variance between different laboratories on three tissue types. We received H&E-stained slides from 66 different laboratories. Each slide contained kidney, skin, and colon tissue samples stained by the method routinely used in each laboratory. The samples were digitized and summarized as red, green, and blue channel histograms. Dimensions were reduced using principal component analysis. The data projected by principal components were inserted into the k-means clustering algorithm and the k-nearest neighbors classifier with the laboratories as the target. The k-means silhouette index indicated that K = 2 clusters had the best separability in all tissue types. The supervised classification results showed laboratory effects and tissue-type bias. Both supervised and unsupervised approaches suggested that tissue type also affected inter-laboratory variance. We suggest that tissue type also be considered when choosing the staining and color-normalization approach.
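    The analysis pipeline described here maps directly onto standard scikit-learn components; the sketch below mirrors it with synthetic data, so the shapes and parameter values are illustrative rather than the study's exact settings.

```python
# Sketch of the pipeline described above: per-sample RGB histograms -> PCA -> k-means
# (scored with the silhouette index) and a k-NN classifier with laboratory as the target.
# Array shapes and parameter values are illustrative, not the study's exact settings.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_labs, samples_per_lab = 66, 5
histograms = rng.random((n_labs * samples_per_lab, 3 * 256))  # concatenated R, G, B histograms
lab_ids = np.repeat(np.arange(n_labs), samples_per_lab)       # laboratory labels

projected = PCA(n_components=10).fit_transform(histograms)

for k in (2, 3, 4):                                           # the study found K = 2 best
    clusters = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(projected)
    print(k, silhouette_score(projected, clusters))

knn = KNeighborsClassifier(n_neighbors=3).fit(projected, lab_ids)
print("training-set accuracy:", knn.score(projected, lab_ids))
```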

    Developing and testing sub-band spectral features in music genre and music mood machine learning

    No full text
    In the field of artificial intelligence, supervised machine learning enables us to develop automatic recognition systems. In music information retrieval, training and testing such systems is possible with a variety of music datasets. Two key prediction tasks are music genre recognition and music mood recognition. The focus of this study is to evaluate the classification of music into genre and mood categories from the audio content. To this end, we evaluate five novel spectro-temporal variants of sub-band musical features: sub-band entropy, sub-band flux, sub-band kurtosis, sub-band skewness, and sub-band zero-crossing rate. The choice of features is based on previous studies that highlight the potential efficacy of sub-band features. To aid our analysis, we include Mel-Frequency Cepstral Coefficients (MFCCs) as our baseline approach. The classification performances are obtained with various learning algorithms, distinct datasets, and multiple feature-selection subsets. To create and evaluate models in both tasks, we use two music datasets prelabelled with respect to music genre (GTZAN) and music mood (PandaMood), respectively. In addition, this study is the first to develop an adaptive window decomposition method for these sub-band features and one of only a handful to use artist filtering and fault filtering for the GTZAN dataset. Our results show that the vast majority of sub-band features outperformed the MFCCs in both the music genre and music mood tasks. Among individual features, sub-band entropy outperformed and outranked every other feature in both tasks and feature-selection approaches. Lastly, we find lower overfitting tendencies for the sub-band features than for the MFCCs. In summary, this study supports the use of these sub-band features for music genre and music mood classification tasks and further suggests uses in other content-based predictive tasks.
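    As an example of one of the evaluated features, the sketch below computes sub-band entropy from an STFT with a fixed equal-width band split; the band edges and summarization are assumptions, and the study's adaptive window decomposition is not reproduced.

```python
# Sketch of one sub-band feature (sub-band entropy) computed from an STFT. The fixed,
# equal-width band split and the exact entropy summarization are illustrative assumptions;
# the study's adaptive window decomposition is not reproduced here.
import numpy as np
from scipy.signal import stft

def sub_band_entropy(audio, sr, n_bands=8, nperseg=2048):
    _, _, Z = stft(audio, fs=sr, nperseg=nperseg)
    power = np.abs(Z) ** 2                               # (frequency bins, time frames)
    bands = np.array_split(power, n_bands, axis=0)       # equal-width frequency bands
    entropies = []
    for band in bands:
        p = band / (band.sum(axis=0, keepdims=True) + 1e-12)   # per-frame spectral distribution
        frame_entropy = -(p * np.log2(p + 1e-12)).sum(axis=0)  # Shannon entropy per frame
        entropies.append(frame_entropy.mean())                 # summarize over time
    return np.array(entropies)                                 # one value per sub-band

sr = 22050
audio = np.random.default_rng(0).normal(size=sr * 3)           # 3 s of noise as a stand-in signal
print(sub_band_entropy(audio, sr))
```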

    DeepFake knee osteoarthritis X-rays from generative adversarial neural networks deceive medical experts and offer augmentation potential to automatic classification

    No full text
    Recent developments in deep learning have impacted medical science. However, new privacy issues and regulatory frameworks have hindered medical data sharing and collection. Deep learning is a data-intensive process, and such regulatory limitations restrict the potential for new breakthroughs and collaborations. However, generating medically accurate synthetic data can alleviate privacy issues and potentially augment deep learning pipelines. This study presents generative adversarial neural networks capable of generating realistic images of knee joint X-rays with varying osteoarthritis severity. We offer 320,000 synthetic (DeepFake) X-ray images generated from training with 5,556 real images. We validated our models for medical accuracy with 15 medical experts and for augmentation effects with an osteoarthritis severity classification task. We devised a survey of 30 real and 30 DeepFake images for the medical experts. On average, more DeepFakes were mistaken for real than the reverse, signifying sufficient DeepFake realism to deceive the medical experts. Finally, our DeepFakes improved classification accuracy in an osteoarthritis severity classification task with scarce real data and transfer learning. In addition, in the same classification task, we replaced all real training data with DeepFakes and suffered only a 3.79% loss from baseline accuracy in classifying real osteoarthritis X-rays.
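    The augmentation setup can be sketched as pooling real and synthetic images into one training set before transfer learning; the directory names, backbone, and class count below are assumptions for illustration only.

```python
# Sketch: mixing scarce real X-rays with synthetic (DeepFake) ones before transfer
# learning. Directory names, the backbone, and the five-class output are assumptions
# for illustration, not the study's exact pipeline.
import tensorflow as tf

real_ds = tf.keras.utils.image_dataset_from_directory(
    "data/real_xrays", image_size=(224, 224), batch_size=32)
fake_ds = tf.keras.utils.image_dataset_from_directory(
    "data/deepfake_xrays", image_size=(224, 224), batch_size=32)
train_ds = real_ds.concatenate(fake_ds).shuffle(1000)     # pool real and synthetic images

base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling="avg")
base.trainable = False                                    # freeze the pre-trained backbone

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),                 # coarse normalization for the sketch
    base,
    tf.keras.layers.Dense(5, activation="softmax"),       # e.g. five severity grades
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=10)                          # placeholder paths above, so not run here
```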

    Machine learning predicts upper secondary education dropout as early as the end of primary school

    No full text
    Education plays a pivotal role in alleviating poverty, driving economic growth, and empowering individuals, thereby significantly influencing societal and personal development. However, the persistent issue of school dropout poses a significant challenge, with its effects extending beyond the individual. While previous research has employed machine learning for dropout classification, these studies often suffer from a short-term focus, relying on data collected only a few years into the study period. This study expanded the modeling horizon by utilizing a 13-year longitudinal dataset, encompassing data from kindergarten to Grade 9. Our methodology incorporated a comprehensive range of parameters, including students’ academic and cognitive skills, motivation, behavior, well-being, and officially recorded dropout data. The machine learning models developed in this study demonstrated notable classification ability, achieving a mean area under the curve (AUC) of 0.61 with data up to Grade 6 and an improved AUC of 0.65 with data up to Grade 9. Further data collection and independent correlational and causal analyses are crucial. In future iterations, such models may have the potential to proactively support educators’ processes and existing protocols for identifying at-risk students, thereby aiding in the reinvention of student retention and success strategies and ultimately contributing to improved educational outcomes.
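    The reported AUC values can be reproduced in form (not in value) with a standard cross-validated classifier; the sketch below uses placeholder features and a gradient-boosting model purely for illustration.

```python
# Sketch: estimating the AUC of a dropout classifier with cross-validation. The feature
# matrix and the gradient-boosting model are placeholders, not the study's actual
# longitudinal features or pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_students = 1000
X_grade6 = rng.normal(size=(n_students, 40))   # stand-in for skills, motivation, well-being, etc.
dropout = rng.integers(0, 2, size=n_students)  # officially recorded dropout label (0/1)

auc_scores = cross_val_score(
    GradientBoostingClassifier(random_state=0), X_grade6, dropout, cv=5, scoring="roc_auc")
print("mean AUC:", auc_scores.mean())          # the study reports 0.61 (Grade 6) and 0.65 (Grade 9)
```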

    Improved accuracy in colorectal cancer tissue decomposition through refinement of established deep learning solutions

    No full text
    Hematoxylin and eosin-stained biopsy slides are regularly available for colorectal cancer patients. These slides are often not used to define objective biomarkers for patient stratification and treatment selection. Standard biomarkers often pertain to costly and slow genetic tests. However, recent work has shown that relevant biomarkers can be extracted from these images using convolutional neural networks (CNNs). The CNN-based biomarkers predicted colorectal cancer patient outcomes comparably to gold standards. Extracting CNN-based biomarkers is fast, automatic, and of minimal cost. CNN-based biomarkers rely on the ability of CNNs to recognize distinct tissue types from microscope whole slide images. The quality of these biomarkers (coined ‘Deep Stroma’) depends on the accuracy of CNNs in decomposing all relevant tissue classes. Improving tissue decomposition accuracy is essential for improving the prognostic potential of CNN-based biomarkers. In this study, we implemented a novel training strategy to refine an established CNN model, which then surpassed all previous solutions. We obtained a 95.6% average accuracy on the external test set and 99.5% on the internal test set. Our approach reduced errors in biomarker-relevant classes, such as Lymphocytes, and was the first to include interpretability methods. These methods were used to better understand our model’s limitations and capabilities.
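    The abstract does not specify which interpretability methods were used; as one common example for CNN tissue classifiers, a Grad-CAM-style saliency map can be sketched as follows (the layer name and usage are assumptions).

```python
# Sketch of a Grad-CAM-style saliency map, one common CNN interpretability technique.
# The abstract does not name the interpretability methods actually used, so this is
# illustrative only; last_conv_layer_name must match a layer in the chosen model.
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    # Map the input to both the last convolutional feature maps and the class scores.
    grad_model = tf.keras.Model(
        inputs=model.inputs,
        outputs=[model.get_layer(last_conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        if class_index is None:
            class_index = int(tf.argmax(preds[0]))
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)            # sensitivity of the class score
    weights = tf.reduce_mean(grads, axis=(1, 2))            # channel weights via global averaging
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                                   # keep positively contributing regions
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()      # normalize heat map to [0, 1]
```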