30 research outputs found

    PSSPNN: PatchShuffle Stochastic Pooling Neural Network for an explainable diagnosis of COVID-19 with multiple-way data augmentation

    Aim. COVID-19 has caused large death tolls all over the world, and accurate diagnosis is important for early treatment. Methods. In this study, we propose a novel PSSPNN model for classifying COVID-19, secondary pulmonary tuberculosis, community-captured pneumonia, and healthy subjects. PSSPNN entails five improvements. First, we propose the n-conv stochastic pooling module. Second, we propose a novel stochastic pooling neural network. Third, PatchShuffle is introduced as a regularization term. Fourth, an improved multiple-way data augmentation is used. Fifth, Grad-CAM is used to interpret our AI model. Results. Ten runs with random seeds on the test set showed that our algorithm achieved a microaveraged F1 score of 95.79%. Moreover, our method outperforms nine state-of-the-art approaches. Conclusion. The proposed PSSPNN will help radiologists diagnose COVID-19 cases more quickly and accurately.
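The abstract names PatchShuffle as a regularizer without describing it. As a rough illustration only, the sketch below follows the general idea of PatchShuffle as commonly described (randomly permuting pixels inside small non-overlapping patches); it is not the authors' implementation. The function name `patch_shuffle`, the default `patch_size`, and the requirement that the image dimensions divide evenly are all assumptions made for this example.

```python
import numpy as np

def patch_shuffle(image, patch_size=2, rng=None):
    """Randomly permute the pixels inside each non-overlapping patch.

    Hypothetical helper for illustration; assumes image height and width
    are multiples of patch_size. Works for (H, W) and (H, W, C) arrays.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")

    out = image.copy()  # leave the input untouched
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            block = out[top:top + patch_size, left:left + patch_size]
            # Flatten the patch to a list of pixels (channels stay together).
            pixels = block.reshape(patch_size * patch_size, -1)
            shuffled = pixels[rng.permutation(pixels.shape[0])]
            out[top:top + patch_size, left:left + patch_size] = shuffled.reshape(block.shape)
    return out

# Usage on a toy 4x4 "image" (values are arbitrary, not real data).
toy = np.arange(16).reshape(4, 4)
print(patch_shuffle(toy, patch_size=2, rng=np.random.default_rng(0)))
```

Because pixels only move within their own patch, the global layout of the image is preserved while local detail is perturbed, which is the property that makes the operation usable as a regularizer.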

    Intelligent Early Diagnosis System against Strep Throat Infection Using Deep Neural Networks

    The most frequent bacterial pathogen causing acute pharyngitis is Group-A hemolytic Streptococcus (GAS), and sore throat is the second most frequent acute infection. The immunological reaction to GAS-induced pharyngitis results in Acute Rheumatic Fever (ARF), which arises when a streptococcal infection occurs in a genetically vulnerable host. ARF, which can affect various organs and cause irreparable valve damage and heart failure, is the antecedent to Rheumatic Heart Disease (RHD). RHD is a form of Cardiovascular Disease (CVD), a term that covers a range of conditions affecting the heart and blood vessels, including coronary artery disease, heart attack, heart failure, and stroke. The experimental findings indicate that the suggested detection approach for strep throat is accurate and effective: it achieves an average sensitivity of 93.1%, an average specificity of 96.7%, and an overall accuracy of 96.3%. The ROC-AUC of 0.989 suggests that the approach distinguishes well between positive and negative cases of strep throat. The results also showed that Image Synthesis-based augmentation improved the ROC-AUC scores compared to basic data augmentation. The proposed method could be a valuable tool for healthcare professionals to diagnose strep throat quickly and accurately, leading to timely treatment and improved patient outcomes. While the approach has demonstrated promising results, further studies and validation are necessary to establish its clinical feasibility and reliability, and further research can evaluate how well the model generalizes to larger and more diverse patient populations.
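For readers less familiar with the reported metrics, the following minimal sketch shows how sensitivity, specificity, and accuracy are derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage line are hypothetical and chosen only for illustration; they are not the study's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true positive cases that were detected
    specificity = tn / (tn + fp)   # share of true negative cases that were cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy

# Hypothetical counts, not taken from the paper.
sens, spec, acc = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Accuracy depends on the class balance of the test set, which is why sensitivity and specificity are reported alongside it.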

    Semantic-Constraint Matching Transformer for Weakly Supervised Object Localization

    Weakly supervised object localization (WSOL) strives to learn to localize objects with only image-level supervision. Due to the local receptive fields generated by convolution operations, previous CNN-based methods suffer from partial activation issues, concentrating on the object's most discriminative part instead of its entire extent. Benefiting from the capability of the self-attention mechanism to acquire long-range feature dependencies, the Vision Transformer has recently been applied to alleviate the local activation drawbacks. However, since the transformer lacks the inductive localization bias that is inherent in CNNs, it may cause a divergent activation problem, resulting in an uncertain distinction between foreground and background. In this work, we propose a novel Semantic-Constraint Matching Network (SCMN) based on a transformer to converge on the divergent activation. Specifically, we first propose a local patch shuffle strategy to construct image pairs, disrupting local patches while guaranteeing global consistency. The paired images, which contain the common object at corresponding spatial positions, are then fed into the Siamese network encoder. We further design a semantic-constraint matching module, which aims to mine the co-object part by matching the coarse class activation maps (CAMs) extracted from the paired images, thus implicitly guiding and calibrating the transformer network to alleviate the divergent activation. Extensive experimental results on two challenging benchmarks, the CUB-200-2011 and ILSVRC datasets, show that our method achieves new state-of-the-art performance and outperforms the previous method by a large margin.

    Deep learning in food category recognition

    Integrating artificial intelligence with food category recognition has been a field of interest for research for the past few decades. It is potentially one of the next steps in revolutionizing human interaction with food. The modern advent of big data and the development of data-oriented fields like deep learning have provided advancements in food category recognition. With increasing computational power and ever-larger food datasets, the approach's potential has yet to be realized. This survey provides an overview of methods that can be applied to various food category recognition tasks, including detecting type, ingredients, quality, and quantity. We survey the core components for constructing a machine learning system for food category recognition, including datasets, data augmentation, hand-crafted feature extraction, and machine learning algorithms. We place a particular focus on the field of deep learning, including the utilization of convolutional neural networks, transfer learning, and semi-supervised learning. We provide an overview of relevant studies to promote further developments in food category recognition for research and industrial applications. Funding: MRC (MC_PC_17171); Royal Society (RP202G0230); BHF (AA/18/3/34220); Hope Foundation for Cancer Research (RM60G0680); GCRF (P202PF11); Sino-UK Industrial Fund (RP202G0289); LIAS (P202ED10); Data Science Enhancement Fund (P202RE237); Fight for Sight (24NN201); Sino-UK Education Fund (OP202006); BBSRC (RM32G0178B8

    Feature transforms for image data augmentation

    A problem with convolutional neural networks (CNNs) is that they require large datasets to obtain adequate robustness; on small datasets, they are prone to overfitting. Many methods have been proposed to overcome this shortcoming. In cases where additional samples cannot easily be collected, a common approach is to generate more data points from existing data using an augmentation technique. In image classification, many augmentation approaches utilize simple image manipulation algorithms. In this work, we propose new methods for data augmentation based on several image transformations: the Fourier transform (FT), the Radon transform (RT), and the discrete cosine transform (DCT). These and other data augmentation methods are considered in order to quantify their effectiveness in creating ensembles of neural networks. The novelty of this research lies in considering different data augmentation strategies to generate training sets on which several classifiers are trained and then combined into an ensemble. Specifically, the idea is to create an ensemble based on a kind of bagging of the training set, where each model is trained on a different training set obtained by augmenting the original training set with different approaches. We build ensembles on the data level by adding images generated by combining fourteen augmentation approaches, three of which (based on FT, RT, and DCT) are proposed here for the first time. Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method. These networks and several fusions are evaluated and compared across eleven benchmarks. Results show that building ensembles on the data level by combining different data augmentation methods produces classifiers that not only compete with the state of the art but often surpass the best approaches reported in the literature.
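The abstract does not say how the transform-based augmentations are constructed. The sketch below shows one plausible way a Fourier-domain augmentation could work, namely jittering the magnitude spectrum and transforming back; it is an illustration under stated assumptions, not the authors' method. The function name `fourier_augment`, the `noise_scale` parameter, and the assumption of a 2-D grayscale image with values in [0, 255] are all hypothetical.

```python
import numpy as np

def fourier_augment(image, noise_scale=0.1, rng=None):
    """Create a perturbed copy of a 2-D grayscale image via its Fourier spectrum.

    Hypothetical example: the magnitude of each frequency component is scaled
    by a random factor near 1 while the phase is kept unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    spectrum = np.fft.fft2(image.astype(np.float64))
    magnitude, phase = np.abs(spectrum), np.angle(spectrum)

    # Multiplicative jitter; noise_scale=0 leaves the spectrum unchanged.
    jitter = 1.0 + noise_scale * rng.standard_normal(magnitude.shape)
    perturbed = magnitude * jitter * np.exp(1j * phase)

    # The jitter breaks conjugate symmetry, so the inverse transform is
    # complex in general; keep the real part and clip to the pixel range.
    augmented = np.real(np.fft.ifft2(perturbed))
    return np.clip(augmented, 0.0, 255.0)

# Usage on a synthetic image (random values, not a real dataset sample).
sample = np.random.default_rng(0).uniform(0, 255, size=(8, 8))
variant = fourier_augment(sample, noise_scale=0.1, rng=np.random.default_rng(1))
print(variant.shape)
```

In an ensemble setting like the one described, each classifier would be trained on the original set plus the variants produced by one such augmentation, so that the models see differently perturbed data.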

    Image database expansion tool

    This thesis is devoted to creating a tool for expanding image databases. It covers the theory of image processing as well as existing tools and approaches in this field. Based on this theory, individual methods are designed and then implemented. These methods are further provided with a user interface and batch execution of image modifications. Finally, the limitations of the resulting tool are described.