43 research outputs found

    Assumption-lean and Data-adaptive Post-Prediction Inference

    Full text link
    A primary challenge facing modern scientific research is the limited availability of gold-standard data which can be both costly and labor-intensive to obtain. With the rapid development of machine learning (ML), scientists have relied on ML algorithms to predict these gold-standard outcomes with easily obtained covariates. However, these predicted outcomes are often used directly in subsequent statistical analyses, ignoring imprecision and heterogeneity introduced by the prediction procedure. This will likely result in false positive findings and invalid scientific conclusions. In this work, we introduce an assumption-lean and data-adaptive Post-Prediction Inference (POP-Inf) procedure that allows valid and powerful inference based on ML-predicted outcomes. Its "assumption-lean" property guarantees reliable statistical inference without assumptions on the ML-prediction, for a wide range of statistical quantities. Its "data-adaptive'" feature guarantees an efficiency gain over existing post-prediction inference methods, regardless of the accuracy of ML-prediction. We demonstrate the superiority and applicability of our method through simulations and large-scale genomic data

    Gotcha! I Know What You are Doing on the FPGA Cloud: Fingerprinting Co-Located Cloud FPGA Accelerators via Measuring Communication Links

    Full text link
    In recent decades, due to the emerging requirements of computation acceleration, cloud FPGAs have become popular in public clouds. Major cloud service providers, e.g. AWS and Microsoft Azure have provided FPGA computing resources in their infrastructure and have enabled users to design and deploy their own accelerators on these FPGAs. Multi-tenancy FPGAs, where multiple users can share the same FPGA fabric with certain types of isolation to improve resource efficiency, have already been proved feasible. However, this also raises security concerns. Various types of side-channel attacks targeting multi-tenancy FPGAs have been proposed and validated. The awareness of security vulnerabilities in the cloud has motivated cloud providers to take action to enhance the security of their cloud environments. In FPGA security research papers, researchers always perform attacks under the assumption that attackers successfully co-locate with victims and are aware of the existence of victims on the same FPGA board. However, the way to reach this point, i.e., how attackers secretly obtain information regarding accelerators on the same fabric, is constantly ignored despite the fact that it is non-trivial and important for attackers. In this paper, we present a novel fingerprinting attack to gain the types of co-located FPGA accelerators. We utilize a seemingly non-malicious benchmark accelerator to sniff the communication link and collect performance traces of the FPGA-host communication link. By analyzing these traces, we are able to achieve high classification accuracy for fingerprinting co-located accelerators, which proves that attackers can use our method to perform cloud FPGA accelerator fingerprinting with a high success rate. As far as we know, this is the first paper targeting multi-tenant FPGA accelerator fingerprinting with the communication side-channel.Comment: To be published in ACM CCS 202

    Robust estimation of bacterial cell count from optical density

    Get PDF
    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data

    Tirofiban for Stroke without Large or Medium-Sized Vessel Occlusion

    Get PDF
    The effects of the glycoprotein IIb/IIIa receptor inhibitor tirofiban in patients with acute ischemic stroke but who have no evidence of complete occlusion of large or medium-sized vessels have not been extensively studied. In a multicenter trial in China, we enrolled patients with ischemic stroke without occlusion of large or medium-sized vessels and with a National Institutes of Health Stroke Scale score of 5 or more and at least one moderately to severely weak limb. Eligible patients had any of four clinical presentations: ineligible for thrombolysis or thrombectomy and within 24 hours after the patient was last known to be well; progression of stroke symptoms 24 to 96 hours after onset; early neurologic deterioration after thrombolysis; or thrombolysis with no improvement at 4 to 24 hours. Patients were assigned to receive intravenous tirofiban (plus oral placebo) or oral aspirin (100 mg per day, plus intravenous placebo) for 2 days; all patients then received oral aspirin until day 90. The primary efficacy end point was an excellent outcome, defined as a score of 0 or 1 on the modified Rankin scale (range, 0 [no symptoms] to 6 [death]) at 90 days. Secondary end points included functional independence at 90 days and a quality-of-life score. The primary safety end points were death and symptomatic intracranial hemorrhage. A total of 606 patients were assigned to the tirofiban group and 571 to the aspirin group. Most patients had small infarctions that were presumed to be atherosclerotic. The percentage of patients with a score of 0 or 1 on the modified Rankin scale at 90 days was 29.1% with tirofiban and 22.2% with aspirin (adjusted risk ratio, 1.26; 95% confidence interval, 1.04 to 1.53, P = 0.02). Results for secondary end points were generally not consistent with the results of the primary analysis. Mortality was similar in the two groups. The incidence of symptomatic intracranial hemorrhage was 1.0% in the tirofiban group and 0% in the aspirin group. In this trial involving heterogeneous groups of patients with stroke of recent onset or progression of stroke symptoms and nonoccluded large and medium-sized cerebral vessels, intravenous tirofiban was associated with a greater likelihood of an excellent outcome than low-dose aspirin. Incidences of intracranial hemorrhages were low but slightly higher with tirofiban

    Optimal Sensor Selection for Classifying a Set of Ginsengs Using Metal-Oxide Sensors

    No full text
    The sensor selection problem was investigated for the application of classification of a set of ginsengs using a metal-oxide sensor-based homemade electronic nose with linear discriminant analysis. Samples (315) were measured for nine kinds of ginsengs using 12 sensors. We investigated the classification performances of combinations of 12 sensors for the overall discrimination of combinations of nine ginsengs. The minimum numbers of sensors for discriminating each sample set to obtain an optimal classification performance were defined. The relation of the minimum numbers of sensors with number of samples in the sample set was revealed. The results showed that as the number of samples increased, the average minimum number of sensors increased, while the increment decreased gradually and the average optimal classification rate decreased gradually. Moreover, a new approach of sensor selection was proposed to estimate and compare the effective information capacity of each sensor

    Conformal Prediction Based on K-Nearest Neighbors for Discrimination of Ginsengs by a Home-Made Electronic Nose

    No full text
    An estimate on the reliability of prediction in the applications of electronic nose is essential, which has not been paid enough attention. An algorithm framework called conformal prediction is introduced in this work for discriminating different kinds of ginsengs with a home-made electronic nose instrument. Nonconformity measure based on k-nearest neighbors (KNN) is implemented separately as underlying algorithm of conformal prediction. In offline mode, the conformal predictor achieves a classification rate of 84.44% based on 1NN and 80.63% based on 3NN, which is better than that of simple KNN. In addition, it provides an estimate of reliability for each prediction. In online mode, the validity of predictions is guaranteed, which means that the error rate of region predictions never exceeds the significance level set by a user. The potential of this framework for detecting borderline examples and outliers in the application of E-nose is also investigated. The result shows that conformal prediction is a promising framework for the application of electronic nose to make predictions with reliability and validity

    Valid Probabilistic Predictions for Ginseng with Venn Machines Using Electronic Nose

    No full text
    In the application of electronic noses (E-noses), probabilistic prediction is a good way to estimate how confident we are about our prediction. In this work, a homemade E-nose system embedded with 16 metal-oxide semi-conductive gas sensors was used to discriminate nine kinds of ginsengs of different species or production places. A flexible machine learning framework, Venn machine (VM) was introduced to make probabilistic predictions for each prediction. Three Venn predictors were developed based on three classical probabilistic prediction methods (Platt’s method, Softmax regression and Naive Bayes). Three Venn predictors and three classical probabilistic prediction methods were compared in aspect of classification rate and especially the validity of estimated probability. A best classification rate of 88.57% was achieved with Platt’s method in offline mode, and the classification rate of VM-SVM (Venn machine based on Support Vector Machine) was 86.35%, just 2.22% lower. The validity of Venn predictors performed better than that of corresponding classical probabilistic prediction methods. The validity of VM-SVM was superior to the other methods. The results demonstrated that Venn machine is a flexible tool to make precise and valid probabilistic prediction in the application of E-nose, and VM-SVM achieved the best performance for the probabilistic prediction of ginseng samples
    corecore