
    A Heuristic Neural Network Structure Relying on Fuzzy Logic for Images Scoring

    Traditional deep learning methods are sub-optimal at classifying ambiguous features, which often arise in noisy, hard-to-predict categories, particularly in semantic scoring. Semantic scoring, which relies on semantic logic to perform evaluation, inevitably contains fuzzy descriptions and omits some concepts; for example, the relationship between "normal" and "probably normal" presents unclear boundaries (normal - more likely normal - probably normal), so human error is common when annotating images. Unlike existing methods that focus on modifying the kernel structure of neural networks, this study proposes a dominant fuzzy fully connected layer (FFCL) for Breast Imaging Reporting and Data System (BI-RADS) scoring and validates the universality of the proposed structure. The model aims to develop complementary scoring properties for semantic paradigms, constructing fuzzy rules based on an analysis of human thought patterns, and in particular to reduce the influence of semantic conglutination. Specifically, a semantic-sensitive defuzzifier layer projects features occupied by related categories into a semantic space, and a fuzzy decoder adjusts the probabilities of the final output layer according to the global trend. Moreover, the ambiguous semantic space between two related categories shrinks during learning, as the positive and negative growth trends of a category relative to its neighbours are taken into account. We first used the Euclidean distance (ED) to magnify the gap between the real and predicted scores, and then applied a two-sample t-test to demonstrate the advantage of the FFCL architecture. Extensive experiments on the CBIS-DDSM dataset show that the FFCL structure achieves superior performance for both three-class and multiclass classification in BI-RADS scoring, outperforming state-of-the-art methods.
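A minimal sketch of the evaluation protocol described above (Euclidean distance between real and predicted scores, followed by a two-sample t-test); the score arrays and the baseline comparison are illustrative assumptions, not the authors' data or code:

```python
import numpy as np
from scipy import stats

# Hypothetical annotated and predicted BI-RADS scores (illustrative only).
y_true = np.array([3, 4, 2, 5, 4, 3])
y_pred_ffcl = np.array([3, 4, 3, 5, 4, 3])       # predictions from an FFCL-style model
y_pred_baseline = np.array([2, 4, 3, 4, 5, 3])   # predictions from a baseline model

# Euclidean distance between the real and predicted score vectors.
ed_ffcl = np.linalg.norm(y_true - y_pred_ffcl)
ed_baseline = np.linalg.norm(y_true - y_pred_baseline)

# Two-sample t-test on the per-sample absolute errors of the two models.
t_stat, p_value = stats.ttest_ind(np.abs(y_true - y_pred_ffcl),
                                  np.abs(y_true - y_pred_baseline))
print(ed_ffcl, ed_baseline, t_stat, p_value)
```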

    Uncertainty-Aware Deep Ensembles for Reliable and Explainable Predictions of Clinical Time Series

    Deep learning-based support systems have demonstrated encouraging results in numerous clinical applications involving the processing of time series data. While such systems are often very accurate, they have no inherent mechanism for explaining what influenced their predictions, which is critical for clinical tasks. However, existing explainability techniques lack an important component for trustworthy and reliable decision support, namely a notion of uncertainty. In this paper, we address this lack of uncertainty by proposing a deep ensemble approach in which a collection of DNNs is trained independently. The class activation mapping method is used to assign a relevance score to each time step in the time series. A measure of uncertainty in the relevance scores is then computed as the standard deviation across the relevance scores produced by the models in the ensemble, which in turn is used to make the explanations more reliable. Results demonstrate that the proposed ensemble is more accurate at locating relevant time steps and is more consistent across random initializations, thus making the model more trustworthy. The proposed methodology paves the way for constructing trustworthy and dependable support systems for processing clinical time series in healthcare-related tasks.
    Comment: 11 pages, 9 figures, code at https://github.com/Wickstrom/TimeSeriesXA
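The aggregation step described above can be sketched as follows; the CAM formulation, shapes, and placeholder activations are assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np

def cam_relevance(feature_maps, class_weights):
    """Class-activation-map relevance for a 1D (time series) classifier.

    feature_maps: (channels, timesteps) activations of the last conv layer.
    class_weights: (channels,) output-layer weights for the target class.
    Returns a (timesteps,) relevance score per time step.
    """
    return class_weights @ feature_maps

# Relevance maps from each independently trained ensemble member
# (random placeholders standing in for real network activations).
rng = np.random.default_rng(0)
ensemble_relevance = np.stack([
    cam_relevance(rng.normal(size=(64, 128)), rng.normal(size=64))
    for _ in range(10)                          # 10 ensemble members
])

mean_relevance = ensemble_relevance.mean(axis=0)   # the explanation
uncertainty = ensemble_relevance.std(axis=0)       # per-time-step uncertainty
```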

    Dependability of Alternative Computing Paradigms for Machine Learning: hype or hope?

    Today we observe amazing performance achieved by Machine Learning (ML); for specific tasks it even surpasses human capabilities. Unfortunately, nothing comes for free: the hidden cost behind ML performance stems from its high complexity, both in the number of operations to be computed and in the amount of data involved. For these reasons, custom Artificial Intelligence hardware accelerators based on alternative computing paradigms are attracting considerable interest. Such dedicated devices provide the data movement, computation speed, and memory resources that ML requires to realize its full potential. However, when ML is deployed in safety- or mission-critical applications, dependability becomes a concern. This paper presents the state of the art of custom Artificial Intelligence hardware architectures for ML, here Spiking and Convolutional Neural Networks, and shows best practices for evaluating their dependability.

    A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era

    Heart sound auscultation has been demonstrated to be beneficial in clinical practice for early screening of cardiovascular diseases. Because auscultation requires well-trained professionals, automatic auscultation based on signal processing and machine learning can aid diagnosis and reduce the burden of training professional clinicians. Nevertheless, classic machine learning is limited in the performance improvements it can achieve in the era of big data. Deep learning has achieved better performance than classic machine learning in many research fields, as it employs more complex model architectures with a stronger capability of extracting effective representations, and it has been successfully applied to heart sound analysis in recent years. As most reviews of heart sound analysis were published before 2017, the present survey is the first to provide a comprehensive overview of papers on heart sound analysis with deep learning over the past six years (2017-2022). We introduce both classic machine learning and deep learning for comparison, and further offer insights into the advances and future research directions of deep learning for heart sound analysis.

    Toward robust deep neural networks

    [Translated from the French abstract] In this thesis, our goal is to develop learning models that are robust and reliable yet accurate, in particular Convolutional Neural Networks (CNNs), in the presence of anomalous examples such as adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to estimate calibrated confidence for adversarial examples by encouraging diversity in an ensemble of CNNs. To this end, we devise an ensemble of diverse specialists together with a simple and computationally efficient voting mechanism that predicts adversarial examples with low confidence while keeping the predictive confidence of clean samples high. In the presence of disagreement within our ensemble, we prove that an upper bound of 0.5 + Δ0 can be established on the confidence, leading to a globally fixed detection threshold of τ = 0.5. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several standard datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can the many available OOD datasets be differentiated with respect to a given in-distribution task so as to select the most appropriate one, which in turn induces a calibrated model with a high detection rate on unseen OOD sets? To answer this question, we propose to differentiate OOD sets by how well they "protect" the in-distribution sub-manifolds. To measure this protection level, we then design three novel, computationally efficient metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented CNN (A-CNN) and of an explicitly calibrated CNN to detect a significantly larger portion of OOD examples. Interestingly, we also observe that such an A-CNN can also detect black-box FGS adversarial examples with significant perturbations. As the third contribution, we look more closely at the capacity of the A-CNN to detect a wider range of black-box adversaries (not only FGS ones). To increase the A-CNN's ability to detect a larger number of adversaries, we augment the OOD training set with inter-class interpolated samples. We then show that the A-CNN trained on all of these data has a consistent detection rate across all types of unseen adversarial examples, whereas training an A-CNN on PGD adversaries does not lead to a stable detection rate across all adversary types, particularly the unseen ones. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and of its augmented counterpart in the presence of adversarial and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with low risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples. The last contribution presents an application of the A-CNN for training a robust object detector on a partially labeled dataset, in particular a merged dataset. Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to create a large-scale dataset that covers a wider spectrum of OoIs. Moreover, merging datasets makes it possible to build a single unified object detector instead of several separate ones, reducing computational and time costs. However, merging datasets, especially from a similar context, produces many missing-label instances. With the goal of training an integrated, robust object detector on a partially labeled but large-scale dataset, we propose a self-supervised training framework to overcome the problem of missing-label instances in merged datasets. Our framework is evaluated on a merged dataset with a high rate of missing labels. The empirical results confirm the viability of our generated pseudo-labels for improving the performance of YOLO, a state-of-the-art object detector.

    In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty by encouraging diversity in an ensemble of CNNs. To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism to predict adversarial examples with low confidence while keeping the predictive confidence of clean samples high. In the presence of high entropy in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold over the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how to differentiate the many available OOD sets with respect to a given in-distribution task so as to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the "protection" level of the in-distribution sub-manifolds by each OOD set can be a good property for differentiating OOD sets. To measure the protection level, we then design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented CNN (A-CNN) and of an explicitly calibrated CNN to detect a significantly larger portion of unseen OOD samples when they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set can also detect black-box Fast Gradient Sign (FGS) adversarial examples. As the third contribution, we investigate more closely the capacity of the A-CNN to detect wider types of black-box adversaries. To increase its capability to detect a larger number of adversaries, we augment its OOD training set with inter-class interpolated samples. We then demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples, whereas training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate on all types of adversaries, particularly the unseen ones. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversarial and clean samples. By a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g. adversarial and OOD samples. The last contribution shows a use case of the A-CNN for training a robust object detector on a partially labeled dataset, particularly a merged dataset. Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to craft a large-scale dataset that covers a larger spectrum of OoIs. Moreover, merging datasets allows a single unified object detector to be built, instead of several separate ones, reducing computational and time costs. However, merging datasets, especially from a similar context, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate. The empirical results confirm the viability of our generated pseudo-labels for enhancing the performance of YOLO, the current (to date) state-of-the-art object detector.
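The ensemble rejection idea from the first contribution can be sketched roughly as below; the probability-averaging rule and the example inputs are assumptions for illustration (the abstract states only that disagreement bounds the confidence near 0.5 + Δ0, giving a fixed threshold τ = 0.5), not the thesis' actual voting mechanism:

```python
import numpy as np

def ensemble_predict(prob_list, tau=0.5):
    """prob_list: list of (num_classes,) probability vectors, one per ensemble member."""
    avg = np.mean(prob_list, axis=0)        # averaged predictive distribution
    confidence = avg.max()
    if confidence <= tau:                   # disagreement => low confidence => reject
        return "reject (suspected adversarial/OOD)", confidence
    return int(avg.argmax()), confidence

# Members agree -> confident prediction; members disagree -> rejection.
print(ensemble_predict([np.array([0.9, 0.05, 0.05]),
                        np.array([0.8, 0.10, 0.10])]))
print(ensemble_predict([np.array([0.9, 0.05, 0.05]),
                        np.array([0.1, 0.10, 0.80])]))
```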
    • 

    corecore