13 research outputs found

    Influence of augmentation on the performance of the double ResNet-based model for chest X-ray classification

    Purpose: The pandemic disease caused by the SARS-CoV-2 virus has become a serious health issue, having infected millions of people all over the world. Recent publications show that artificial intelligence (AI) can be used for medical diagnosis, including interpretation of X-ray images. X-ray scanning is relatively cheap, and scan processing is not computationally demanding. Material and methods: In our experiment, we implemented a baseline transfer-learning scheme for processing lung X-ray images, including augmentation, to detect COVID-19 symptoms. Seven different augmentation scenarios were proposed. The model was trained on a dataset of more than 30,000 X-ray images. Results: The obtained model was evaluated on real images from a Polish hospital using standard metrics, and it achieved accuracy = 0.9839, precision = 0.9697, recall = 1.0000, and F1-score = 0.9846. Conclusions: Our experiment showed that augmentation and masking can be important data pre-processing steps and can help improve the evaluation metrics. Because medical professionals often lack confidence in AI-based tools, we designed the proposed model so that its results are explainable and can play a supporting role for radiology specialists in their work.
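As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values quoted in the abstract. The sketch below is illustrative only; the function name `f1_score` is our own and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract above.
reported_precision = 0.9697
reported_recall = 1.0000

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 4))  # agrees with the reported F1-score of 0.9846
```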

    Integrating glycolysis, citric acid cycle, pentose phosphate pathway, and fatty acid beta‑oxidation into a single computational model

    The metabolic network of a living cell is highly intricate and involves complex interactions between various pathways. In this study, we propose a computational model that integrates glycolysis, the pentose phosphate pathway (PPP), fatty acid beta-oxidation, and the tricarboxylic acid cycle (TCA cycle) using queueing theory. The model utilizes literature data on metabolite concentrations and enzyme kinetic constants to calculate the probabilities of individual reactions occurring on a microscopic scale, which can be viewed as the reaction rates on a macroscopic scale. However, it should be noted that the model has some limitations, including not accounting for all the reactions in which the metabolites are involved. Therefore, a genetic algorithm (GA) was used to estimate the impact of these external processes. Despite these limitations, our model achieved high accuracy and stability, providing real-time observation of changes in metabolite concentrations. This type of model can help in better understanding the mechanisms of biochemical reactions in cells, which can ultimately contribute to the prevention and treatment of aging, cancer, metabolic diseases, and neurodegenerative disorders.
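To give a feel for how a per-reaction probability on the microscopic scale can reproduce a reaction rate on the macroscopic scale, here is a minimal sketch of a single enzymatic step treated as a queue. It is not the authors' model: the Michaelis-Menten rate law, the parameter values (`vmax`, `km`, `dt`) and the function names are all assumptions made for illustration.

```python
import random


def reaction_probability(substrate: int, vmax: float, km: float, dt: float) -> float:
    """Chance that one waiting substrate molecule reacts during one tick.

    Assumption: the macroscopic rate is Michaelis-Menten, v = vmax * S / (km + S),
    so the per-molecule probability is v * dt / S = vmax * dt / (km + S).
    """
    return min(1.0, vmax * dt / (km + substrate))


def simulate(substrate: int, vmax: float, km: float, dt: float,
             ticks: int, seed: int = 0) -> list:
    """Return (substrate, product) counts after every tick."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    product = 0
    history = [(substrate, product)]
    for _ in range(ticks):
        p = reaction_probability(substrate, vmax, km, dt)
        # Each molecule in the queue is "served" independently with probability p.
        served = sum(1 for _ in range(substrate) if rng.random() < p)
        substrate -= served
        product += served
        history.append((substrate, product))
    return history


# Hypothetical parameter values, chosen only to make the example run.
history = simulate(substrate=1000, vmax=50.0, km=200.0, dt=0.1, ticks=100)
print(history[-1])
```

Chaining several such steps, so that the product queue of one reaction is the substrate queue of the next, is one way a pathway could be assembled from this building block.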

    Queueing theory model of pentose phosphate pathway

    Due to its role in maintaining the proper functioning of the cell, the pentose phosphate pathway (PPP) is one of the most important metabolic pathways. It is responsible for regulating the concentration of simple sugars and provides precursors for the synthesis of amino acids and nucleotides. In addition, it plays a critical role in maintaining an adequate level of NADPH, which is necessary for the cell to fight oxidative stress. These reasons prompted the authors to develop a computational model, based on queueing theory, capable of simulating changes in the concentrations of PPP metabolites. The model has been validated with empirical data from tumor cells. The obtained results demonstrate the stability and accuracy of the model. Because it is built on queueing theory, the model can be further expanded to include successive metabolic pathways. Use of the model may accelerate research on new drugs, reduce drug costs, and reduce reliance on the laboratory animals on which new methods are tested in this type of research.

    A Machine-Learning-Based Approach to Prediction of Biogeographic Ancestry within Europe

    Data obtained with the use of massive parallel sequencing (MPS) can be valuable in population genetics studies. In particular, such data harbor the potential for distinguishing samples from different populations, especially from adjacent populations of common origin. Machine learning (ML) techniques seem to be especially well suited for analyzing large datasets obtained using MPS. The Slavic populations constitute about a third of the population of Europe and inhabit a large area of the continent, while being relatively closely related in population genetics terms. In this proof-of-concept study, various ML techniques were used to classify DNA samples from Slavic and non-Slavic individuals. The primary objective was to evaluate empirically whether the genetic provenance of genetically similar individuals of Slavic descent can be discerned, with the overarching goal of categorizing DNA specimens from representatives of diverse Slavic populations. Raw sequencing data were pre-processed to obtain a 1200-character binary vector. Three classifiers were used: Random Forest, Support Vector Machine (SVM), and XGBoost. The most promising results were obtained using SVM with a linear kernel, with 99.9% accuracy and F1-scores of 0.9846–1.000 for all classes.

    Analiza danych GWAS przy użyciu algorytmów uczenia maszynowego – przegląd literatury (English: Analysis of GWAS data using machine learning algorithms: a literature review)

    Machine learning is a field of study related to artificial intelligence (AI). The main goal of machine learning algorithms is to create an automatic system that improves itself by using its experience (the given data) to gain new knowledge. Genome-wide association studies (GWAS) compare the whole genomes of different individuals to see whether any genetic variants are correlated with a trait. Using ML for GWAS analysis can be beneficial for scientists, as has been demonstrated in various ways.

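    Algorytm do analizy danych genetycznych – porównanie częstotliwości występowania określonych mutacji wśród różnych populacji (English: An algorithm for genetic data analysis: comparing the frequency of specific mutations across populations)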

    This paper presents a novel algorithm for analyzing genomic data obtained by Next Generation Sequencing (NGS). Given geneticists' interest in the subject, there is a need for algorithms and programs for genetic data analysis that are user-friendly and accessible to people outside bioinformatics. The paper describes how to perform a comparative analysis, including data preprocessing and final data processing. The input to the algorithm is a set of annotated .vcf files. The output is a file giving the percentage of single nucleotide polymorphisms (SNPs) in the data for each loaded population.
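The abstract does not spell out how the percentages are computed, so the following is only a sketch of one plausible reading: for each biallelic SNP, report the percentage of genotyped samples in a population that carry at least one alternate allele. The function name `carrier_percentages`, the in-memory population dictionary and the toy records are hypothetical; only the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then samples) is assumed.

```python
from typing import Dict, Iterable, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)
BASES = {"A", "C", "G", "T"}


def carrier_percentages(vcf_lines: Iterable[str]) -> Dict[Variant, float]:
    """Percentage of called samples carrying the alternate allele, per SNP."""
    result: Dict[Variant, float] = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information and the header row
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        if ref not in BASES or alt not in BASES:
            continue  # keep only single-base, biallelic substitutions
        gt_index = fields[8].split(":").index("GT")
        called = carriers = 0
        for sample in fields[9:]:
            alleles = sample.split(":")[gt_index].replace("|", "/").split("/")
            if "." in alleles:
                continue  # missing genotype: not counted in the denominator
            called += 1
            if "1" in alleles:
                carriers += 1
        if called:
            result[(chrom, int(pos), ref, alt)] = 100.0 * carriers / called
    return result


# Toy data standing in for one annotated .vcf file per population.
populations = {
    "pop_a": [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4",
        "1\t1000\trs1\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1\t0/0\t0/0",
        "1\t2000\trs2\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/0\t0/1\t./.",
    ],
}
table = {name: carrier_percentages(lines) for name, lines in populations.items()}
for variant, pct in table["pop_a"].items():
    print(variant, f"{pct:.1f}%")
```

Reading real files would only require passing an open file handle instead of the list of strings; a production tool would more likely rely on a dedicated VCF parser.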

    Przewidywanie immunogenności u myszy przy użyciu klasyfikatora Random Forest (English: Predicting immunogenicity in mice using a Random Forest classifier)

    Biomedical data are difficult to interpret because of their volume. One way to cope with this problem is machine learning, which can capture previously unnoticed dependencies. The authors applied a random forest classifier with the entropy and Gini index criteria to immunogenicity data. The input data consisted of 3 columns: epitope (a peptide 8-11 amino acids long), major histocompatibility complex (MHC), and immune response. The presented model can predict the immune response from the epitope-MHC complex. The results reached an accuracy of 84% for entropy and 83% for the Gini index. The results are not fully satisfactory, but they are a fair start for more complex experiments and could serve as an indicator for further research.
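The two split criteria named in the abstract differ only in how they score the impurity of the labels that reach a tree node. The sketch below computes both for a toy set of immune-response labels; the label values and function names are illustrative and are not taken from the paper's data.

```python
import math
from collections import Counter
from typing import Sequence


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy in bits: minus the sum of p * log2(p) over classes."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


# Hypothetical node holding three positive and one negative response.
node = ["positive", "positive", "positive", "negative"]
print(f"gini={gini(node):.4f}, entropy={entropy(node):.4f}")
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is consistent with the two criteria giving similar accuracies (84% and 83%) in the abstract.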

    A Novel Method of Vein Detection with the Use of Digital Image Correlation

    Digital image correlation (DIC) may be useful in many different fields of science, one of which is medicine. In this paper, the authors present the results of research aimed at detecting skin micro-shifts caused by pulsation of the veins. They propose a novel technique that applies DIC and then filters the resulting shift map to detect pulsating veins. With the proposed method, the veins in the forearm were visualized. The technique may be used in the diagnosis of venous stenosis and may also help reduce the number of adverse events during blood collection. Its great advantage is that it needs no specialized equipment: a typical mobile phone camera is enough to perform the test.

    A Novel Lightweight Approach to COVID-19 Diagnostics Based on Chest X-ray Images

    Background: This paper presents a novel lightweight approach based on machine learning methods supporting COVID-19 diagnostics based on X-ray images. The presented schema offers effective and quick diagnosis of COVID-19. Methods: Real data (X-ray images) from hospital patients were used in this study. All labels, namely those that were COVID-19 positive and negative, were confirmed by a PCR test. Feature extraction was performed using a convolutional neural network, and the subsequent classification of samples used Random Forest, XGBoost, LightGBM and CatBoost. Results: The LightGBM model was the most effective in classifying patients on the basis of features extracted from X-ray images, with an accuracy of 1.00, a precision of 1.00, a recall of 1.00 and an F1-score of 1.00. Conclusion: The proposed schema can potentially be used as a support for radiologists to improve the diagnostic process. The presented approach is efficient and fast. Moreover, it is not excessively complex computationally
