326 research outputs found

    Iterative Assessment and Improvement of DNN Operational Accuracy

    Deep Neural Networks (DNNs) are now widely adopted in many application domains thanks to their human-like, or even superhuman, performance in specific tasks. However, due to unpredictable or unconsidered operating conditions, unexpected failures show up in the field, making the performance of a DNN in operation very different from the one estimated prior to release. In the life cycle of DNN systems, the assessment of accuracy is typically addressed in two ways: offline, via sampling of operational inputs, or online, via pseudo-oracles. The former is considered more expensive due to the need for manual labeling of the sampled inputs. The latter is automatic but less accurate. We believe that emerging iterative industrial-strength life cycle models for Machine Learning systems, like MLOps, offer the possibility to leverage inputs observed in operation not only to provide faithful estimates of a DNN's accuracy, but also to improve it through remodeling/retraining actions. We propose DAIC (DNN Assessment and Improvement Cycle), an approach which combines "low-cost" online pseudo-oracles and "high-cost" offline sampling techniques to estimate and improve the operational accuracy of a DNN in the iterations of its life cycle. Preliminary results show the benefits of combining the two approaches and integrating them in the DNN life cycle.

    Assessing Black-box Test Case Generation Techniques for Microservices

    Testing of microservices architectures (MSA), today a popular software architectural style, demands automation in its several tasks, such as test generation, prioritization and execution. Automated black-box generation of test cases for MSA currently borrows techniques and tools from the testing of RESTful Web Services. This paper: i) proposes the uTest stateless pairwise combinatorial technique (and its automation tool) for test case generation for functional and robustness testing of microservices, and ii) experimentally compares, with three open-source MSA used as subjects, four state-of-the-art black-box tools conceived for Web Services, adopting evolutionary-, dependencies- and mutation-based generation techniques, and the proposed uTest combinatorial tool. The comparison shows small differences in coverage values; uTest pairwise testing achieves a better average failure rate with a considerably lower number of tests. Web Services tools do not perform for MSA as well as a tester might expect, highlighting the need for MSA-specific techniques.
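The abstract above names pairwise combinatorial testing without describing how uTest builds its suites. As a rough illustration of the general idea only, the sketch below uses a simple greedy all-pairs strategy; it is not the uTest algorithm, and the parameter names and values (`status`, `limit`, `sort`) are hypothetical inputs of an imagined endpoint.

```python
from itertools import combinations, product


def pairwise_tests(parameters):
    """Greedy all-pairs generation (illustrative, not the uTest algorithm).

    `parameters` maps each parameter name to its list of candidate values.
    Returns a list of test cases (dicts) such that every pair of values of
    two different parameters appears together in at least one test case.
    """
    names = list(parameters)

    def pairs_of(test):
        # All (parameter, value) pairs that a single test case exercises.
        return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}

    # Every value pair that still has to be covered.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in parameters[a]
        for vb in parameters[b]
    }
    # Exhaustive candidates; acceptable only for small illustrative inputs.
    candidates = [dict(zip(names, values))
                  for values in product(*parameters.values())]

    suite = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite


# Hypothetical input space of one endpoint (assumed for illustration).
params = {
    "status": ["open", "closed"],
    "limit": [0, 10, -1],
    "sort": ["asc", "desc"],
}
suite = pairwise_tests(params)
for case in suite:
    print(case)
```

The exhaustive input space here has 12 combinations, while a pairwise suite needs fewer, which is the kind of reduction in test count the abstract refers to. Enumerating all candidates is a simplification that would not scale to large input spaces.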

    Assessing Operational Accuracy of CNN-based Image Classifiers using an Oracle Surrogate

    Context: Assessing the accuracy in operation of a Machine Learning (ML) system for image classification on arbitrary (unlabeled) inputs is hard. This is due to the oracle problem, which impacts the ability to automatically judge the output of the classification, thus hindering the accuracy of the assessment when unlabeled, previously unseen inputs are submitted to the system. Objective: We propose the Image Classification Oracle Surrogate (ICOS), a technique to automatically evaluate the accuracy in operation of image classifiers based on Convolutional Neural Networks (CNNs). Method: To establish whether the classification of an arbitrary image is correct or not, ICOS leverages three knowledge sources: operational input data, training data, and the ML algorithm. Knowledge is expressed through likely invariants, properties which should not be violated by correct classifications. ICOS infers and filters invariants to improve the correct detection of misclassifications, reducing the number of false positives. We evaluate ICOS experimentally on twelve CNNs using the popular MNIST, CIFAR10, CIFAR100, and ImageNet datasets. We compare it to two alternative strategies, namely cross-referencing and self-checking. Results: Experimental results show that ICOS exhibits performance comparable to the other strategies in terms of accuracy, showing higher stability over a variety of CNNs and datasets with different complexity and size. Conclusions: ICOS likely invariants are shown to be effective in automatically detecting misclassifications by CNNs used in image classification tasks when the expected output is unknown; ICOS ultimately yields faithful assessments of their accuracy in operation. Knowledge about input data can also be manually incorporated into ICOS, to increase robustness against unexpected phenomena in operation, like label shift.

    Investigational drugs for the treatment of endometriosis, an update on recent developments

    Introduction: Endometriosis is a hormone-dependent benign chronic disease that requires long-term medical therapy. Although currently available drugs are efficacious in treating endometriosis-related pain, some women experience partial or no improvement. Moreover, the recurrence of symptoms is expected after discontinuation of the therapies. Currently, new drugs are under intense clinical investigation for the treatment of endometriosis. Areas covered: This review aims to offer the reader a complete and updated overview of new investigational drugs and early molecular targets for the treatment of endometriosis. The authors describe the pre-clinical and clinical development of these agents. Expert opinion: Among the drugs under investigation, late clinical trials on gonadotropin-releasing hormone antagonists (GnRH-ant) showed the most promising results for the treatment of endometriosis. Aromatase inhibitors (AIs) are efficacious in treating endometriosis-related pain symptoms but they cause significant adverse effects that limit their long-term use. New targets have been identified to produce drugs for the treatment of endometriosis, but the majority of these new compounds have only been investigated in laboratory studies or early clinical trials. Thus, further clinical research is required in order to elucidate their efficacy and safety in humans.

    Modelling the effect of SMP production and external carbon addition on S-driven autotrophic denitrification

    The aim of this study was to develop a mathematical model to assess the effect of the production of soluble microbial products (SMP) and the addition of an external carbon source on the performance of a sulfur-driven autotrophic denitrification (SdAD) process. During SdAD, the growth of autotrophic biomass (AUT) was accompanied by the proliferation of heterotrophic biomass mainly consisting of heterotrophic denitrifiers (HD) and sulfate-reducing bacteria (SRB), which are able to grow both on the SMP derived from the microbial activities and on an external carbon source. The process was assumed to occur in a sequencing batch reactor to investigate the effects of the COD injection on both heterotrophic species and to enhance the production and consumption of SMP. The mathematical model was built on mass balance considerations and consists of a system of nonlinear impulsive differential equations, which have been solved numerically. Different simulation scenarios have been investigated by varying the main operational parameters: cycle duration, day of COD injection and quantity of COD injected. For cycle durations of more than 15 days and a COD injection after the half-cycle duration, SdAD represents the prevailing process and the SRB represent the main heterotrophic family. For shorter cycle durations and COD injections earlier than the middle of the cycle, the same performance can be achieved by increasing the quantity of COD added, which results in an increased activity of HD. In all the simulations performed, even in the case of COD addition, AUT remains the prevailing microbial family in the reactor.

    Transvaginal ultrasound versus magnetic resonance imaging for diagnosing adenomyosis: A systematic review and head-to-head meta-analysis

    Background: Transvaginal ultrasound (TVS) and magnetic resonance imaging (MRI) are used for the clinical diagnosis of adenomyosis. Objectives: To compare the diagnostic accuracy of TVS and MRI for the diagnosis of adenomyosis. Search Strategy: A search of studies comparing TVS and MRI for the diagnosis of adenomyosis, published from January 1990 to May 2022, was performed in five databases. Selection Criteria: Studies were eligible if they reported on the use of TVS and MRI in the same set of patients. The reference standard had to be pathology (hysterectomy). Data Collection and Analysis: The quality of studies was assessed using the QUADAS-2 tool. Pooled sensitivity and specificity of both techniques were estimated and compared. Main Results: Six studies comprising 595 women were included. The risk of bias of patient selection was high in three studies. The risk of bias for index tests and reference test was low. Pooled estimated sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio for TVS were 75%, 81%, 3.9, and 0.31, respectively. These figures for MRI were 69%, 80%, 3.5, and 0.39, respectively. No statistically significant differences were found (p = 0.7509). Heterogeneity was high. Conclusions: MRI and TVS have similar performances for the diagnosis of adenomyosis.
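The likelihood ratios quoted in the abstract above are related to sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The minimal sketch below applies those definitions to the pooled figures as a consistency check; the function name is an assumption, and the meta-analysis itself may have pooled the ratios with a different statistical model.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    Both arguments are proportions in the open interval (0, 1).
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled sensitivity and specificity as reported in the abstract.
tvs = likelihood_ratios(0.75, 0.81)
mri = likelihood_ratios(0.69, 0.80)

print(f"TVS: LR+ = {tvs[0]:.3f}, LR- = {tvs[1]:.3f}")
print(f"MRI: LR+ = {mri[0]:.3f}, LR- = {mri[1]:.3f}")
```

Under these definitions the computed values agree with the reported 3.9 and 0.31 for TVS and 3.5 and 0.39 for MRI to within rounding.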

    Rockfall threatening Cumae archeological site fruition (Phlegraean Fields Park, Naples)

    Natural hazards threaten many archaeological sites in the world; therefore, susceptibility analysis is essential to reduce their impacts and support site fruition by visitors. In this paper, a rockfall susceptibility analysis of the western slope of the Cumae Mount in the Cumae Archaeological Site (Phlegraean Fields, Naples), already affected by rockfall events, is described in support of a management plan for fruition and site conservation. Being the first Greek settlement in southern Italy, the site has great historical importance and offers unique historical elements such as the Cumaean Sibyl’s Cave. The analysis began with 3D modeling of the slope through digital terrestrial photogrammetry, which forms the basis for a geomechanical analysis. Digital discontinuity measurements and cluster analysis provide data for kinematic analysis, which indicated the potential for planar, wedge and toppling failures. Subsequently, a propagation-based susceptibility analysis was carried out in a GIS environment: it shows that most of the western sector of the site is susceptible to rockfall, including the access course, a segment of the Cumana Railroad and its local station. The work highlights the need for specific mitigation measures to increase visitor safety and the efficacy of field-based digital reconstruction to support susceptibility analysis in rockfall-prone areas.

    Recurrence Rate and Morbidity after Ultrasound-guided Transvaginal Aspiration of Ultrasound Benign-appearing Adnexal Cystic Masses with and without Sclerotherapy: A Systematic Review and Meta-analysis

    To determine the pooled recurrence rate of benign adnexal masses/cysts (namely simple cyst, endometrioma, hydrosalpinx, peritoneal cyst) after transvaginal ultrasound-guided aspiration, with or without sclerotherapy

    Ovarian Adnexal Reporting Data System (O-RADS) for Classifying Adnexal Masses: A Systematic Review and Meta-Analysis

    In this systematic review and meta-analysis, we aimed to assess the pooled diagnostic performance of the Ovarian Adnexal Reporting Data System (O-RADS), a classification system introduced in 2020, for classifying adnexal masses using transvaginal ultrasound. We performed a search for studies reporting the use of the O-RADS system for classifying adnexal masses from January 2020 to April 2022 in several databases (Medline (PubMed), Google Scholar, Scopus, Cochrane, and Web of Science). We selected prospective and retrospective cohort studies using the O-RADS system for classifying adnexal masses, with the reference standard being histologic diagnosis or, for benign-appearing masses, conservative management demonstrating spontaneous resolution or persistence at a follow-up scan. We excluded studies not related to the topic under review, studies not addressing O-RADS classification, studies addressing MRI O-RADS classification, letters to the editor, commentaries, narrative reviews, consensus documents, and studies where data were not available for constructing a 2 × 2 table. The pooled sensitivity, specificity, positive and negative likelihood ratios, and diagnostic odds ratio (DOR) were calculated. The quality of the studies was evaluated using QUADAS-2. A total of 502 citations were identified. Ultimately, 11 studies comprising 4634 masses were included. The mean prevalence of ovarian malignancy was 32%. The risk of bias was high in eight studies for the "patient selection" domain. The risk of bias was low for the "index test" and "reference test" domains for all studies. Overall, the pooled estimated sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, and DOR of the O-RADS system for classifying adnexal masses were 97% (95% confidence interval (CI) = 94%-98%), 77% (95% CI = 68%-84%), 4.2 (95% CI = 2.9-6.0), 0.04 (95% CI = 0.03-0.07), and 96 (95% CI = 50-185), respectively. Heterogeneity was moderate for sensitivity and high for specificity. In conclusion, the O-RADS system has good sensitivity and moderate specificity for classifying adnexal masses.
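To make the pooled likelihood ratios above more concrete, the sketch below converts a pre-test probability into a post-test probability through odds (post-test odds = pre-test odds × likelihood ratio). Using the reported mean prevalence of 32% as the pre-test probability is an assumption made purely for illustration; the abstract does not report post-test probabilities, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


prevalence = 0.32  # mean prevalence from the abstract, assumed as pre-test probability

# Pooled likelihood ratios as reported in the abstract.
p_after_positive = post_test_probability(prevalence, 4.2)
p_after_negative = post_test_probability(prevalence, 0.04)

print(f"Probability of malignancy after a positive result: {p_after_positive:.3f}")
print(f"Probability of malignancy after a negative result: {p_after_negative:.3f}")
```

Under this assumption, a positive classification raises the probability of malignancy to roughly two in three, while a negative one lowers it to about 2%, which matches the abstract's reading of good sensitivity and moderate specificity.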