12 research outputs found
Context-aware Mixture of Deep Neural Networks
The alpha-beta network is a mixture of deep neural networks that implements a mixture of experts in which each component is a neural network. It is trained using the expectation-maximization algorithm. It enables context-awareness because each component is pushed to give context-specific predictions. This structure also enables context uncertainty quantification. The effectiveness of the alpha-beta network was assessed using two real-world activity datasets: UCI OPPORTUNITY and an in-house dataset. The model showed superior performance compared to the baselines.
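As a rough illustration of the mixture-of-experts idea described above, the following sketch combines per-expert class probabilities using softmax gating weights and uses the entropy of those weights as one possible context-uncertainty signal. The function names, the softmax gate, and the entropy measure are assumptions made for illustration; the abstract does not specify how the alpha-beta network computes either quantity.

```python
import math

def softmax(scores):
    """Turn raw gating scores into mixture weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_predict(gate_scores, expert_probs):
    """Weight each expert's class distribution by its gating weight.

    gate_scores:  one raw score per expert (hypothetical gate output)
    expert_probs: one class-probability list per expert
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]

def context_entropy(gate_scores):
    """Entropy of the gating weights; near 0 when one expert dominates."""
    return -sum(w * math.log(w) for w in softmax(gate_scores) if w > 0)
```

With equal gate scores the experts are averaged uniformly and the entropy is at its maximum; a strongly dominant score drives the entropy toward zero.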
BoXHED2.0: Scalable boosting of dynamic survival analysis
Modern applications of survival analysis increasingly involve time-dependent
covariates. In healthcare settings, such covariates provide dynamic patient
histories that can be used to assess health risks in real time by tracking the
hazard function. Hazard learning is thus particularly useful in healthcare
analytics, and the open-source package BoXHED 1.0 provides the first
implementation of a gradient boosted hazard estimator that is fully
nonparametric. This paper introduces BoXHED 2.0, a quantum leap over BoXHED 1.0
in several ways. Crucially, BoXHED 2.0 can deal with survival data that goes
far beyond right-censoring and it also supports recurring events. To our
knowledge, this is the only nonparametric machine learning implementation that
is able to do so. Another major improvement is that BoXHED 2.0 is orders of
magnitude more scalable, due in part to a novel data preprocessing step that
sidesteps the need for explicit quadrature when dealing with time-dependent
covariates. BoXHED 2.0 supports the use of GPUs and multicore CPUs, and is
available from GitHub: www.github.com/BoXHED.
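To make the link between a hazard function and risk concrete, the sketch below computes a survival probability from a piecewise-constant hazard using the standard identity S(t) = exp(-cumulative hazard). The function names and the segment representation are hypothetical and are not part of the BoXHED API; in practice the hazard values would come from a fitted estimator.

```python
import math

def cumulative_hazard(segments):
    """Integrate a piecewise-constant hazard.

    segments: iterable of (start, end, hazard_rate) tuples, where the
    hazard is assumed constant on each [start, end) interval.
    """
    total = 0.0
    for start, end, rate in segments:
        if end < start or rate < 0:
            raise ValueError("segments need end >= start and rate >= 0")
        total += rate * (end - start)
    return total

def survival_probability(segments):
    """Probability of no event over the covered time span."""
    return math.exp(-cumulative_hazard(segments))
```

A patient history whose covariates change over time would simply contribute one segment per period during which the estimated hazard is held constant.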
Complete genome sequence of Trueperella pyogenes strain Arash114, isolated from the uterus of a water buffalo (Bubalus bubalis) in Iran
Objective: Trueperella pyogenes has been considered a major causative agent of metritis, abortion, and death in a broad range of domestic and wild animals, including cattle, swine, sheep, goats, camels, buffalo, deer, antelopes, reptiles, and birds.
Data description: Here, we report the complete chromosome sequence of Trueperella pyogenes strain Arash114, isolated from the uterus of a water buffalo (Bubalus bubalis) that died from the infection caused by this pathogen. The genome assembly comprised 2,338,282 bp, with a 59.5% GC content. Annotation of the genome showed 46 tRNA genes, 6 rRNA genes, 1 CRISPR and 2059 coding sequences. In addition, several genes coding for antimicrobial resistance, such as tetW, and virulence factors, including plo, nanH, nanP, cbp and 4 fimbrial proteins, were found. This study will advance our knowledge of the metabolism, virulence factors, antibiotic resistance and evolution of strain Arash114 and serve as an appropriate template for future research.
Keywords: Complete genome sequencing; Trueperella pyogenes; Uterus infection; Water buffalo
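The GC content reported above is the share of G and C bases among all called bases in the assembly. A minimal sketch of that calculation follows; the function name is hypothetical, and ignoring ambiguous bases such as N is an assumption rather than something the abstract states.

```python
def gc_content(sequence):
    """Return GC content as a percentage of unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt
```

Applied to a full chromosome sequence read from a FASTA file, the same function would give the genome-wide percentage.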
Complete genome sequence of Trueperella pyogenes strain Arash114, isolated from the uterus of a water buffalo (Bubalus bubalis) in Iran
Objective: Trueperella pyogenes has been considered a major causative agent of metritis, abortion, and death in a broad range of domestic and wild animals, including cattle, swine, sheep, goats, camels, buffalo, deer, antelopes, reptiles, and birds.
Data description: Here, we report the complete chromosome sequence of Trueperella pyogenes strain Arash114, isolated from the uterus of a water buffalo (Bubalus bubalis) that died from the infection caused by this pathogen. The genome assembly comprised 2,338,282 bp, with a 59.5% GC content. Annotation of the genome showed 46 tRNA genes, 6 rRNA genes, 1 CRISPR and 2059 coding sequences. In addition, several genes coding for antimicrobial resistance, such as tetW, and virulence factors, including plo, nanH, nanP, cbp and 4 fimbrial proteins, were found. This study will advance our knowledge of the metabolism, virulence factors, antibiotic resistance and evolution of strain Arash114 and serve as an appropriate template for future research.
Differential privacy preserved federated learning for prognostic modeling in COVID‐19 patients using large multi‐institutional chest CT dataset
Background
Notwithstanding the encouraging results of previous studies reporting on the efficiency of deep learning (DL) in COVID‐19 prognostication, clinical adoption of the developed methodology still needs to be improved. To overcome this limitation, we set out to predict the prognosis of a large multi‐institutional cohort of patients with COVID‐19 using a DL‐based model.
Purpose
This study aimed to evaluate the performance of deep privacy‐preserving federated learning (DPFL) in predicting COVID‐19 outcomes using chest CT images.
Methods
After applying inclusion and exclusion criteria, 3055 patients from 19 centers, including 1599 alive and 1456 deceased, were enrolled in this study. Data from all centers were split (randomly, with stratification by center and class) into a training/validation set (70%/10%) and a hold‐out test set (20%). For the DL model, feature extraction was performed on 2D slices, and averaging was performed at the final layer to construct a 3D model for each scan. The DenseNet model was used for feature extraction. The model was developed using centralized and FL approaches. For FL, we employed DPFL approaches. A membership inference attack was also evaluated in the FL strategy. For model evaluation, different metrics were reported on the hold‐out test sets. In addition, models trained in the two scenarios, centralized and FL, were compared using the DeLong test for statistical differences.
Results
The centralized model achieved an accuracy of 0.76, while the DPFL model had an accuracy of 0.75. Both the centralized and DPFL models achieved a specificity of 0.77. The centralized model achieved a sensitivity of 0.74, while the DPFL model had a sensitivity of 0.73. The centralized and DPFL models achieved mean AUCs of 0.82 (95% CI: 0.79–0.85) and 0.81 (95% CI: 0.77–0.84), respectively. The DeLong test did not show a statistically significant difference between the two models (p‐value = 0.98). The AUC values for the inference attacks fluctuated between 0.49 and 0.51, with an average of 0.50 ± 0.003 and a 95% CI for the mean AUC of 0.500 to 0.501.
Conclusion
The performance of the proposed model was comparable to centralized models while operating on large and heterogeneous multi‐institutional datasets. In addition, the model was resistant to inference attacks, ensuring the privacy of shared data during the training process.
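The abstract does not say which differential privacy mechanism was used. As one common possibility, the sketch below clips each center's model update to a maximum L2 norm, averages the clipped updates, and adds Gaussian noise to the average. All names and parameters here (`clip_update`, `dp_federated_average`, `max_norm`, `noise_std`) are assumptions for illustration, not the study's implementation.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale an update down so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def dp_federated_average(updates, max_norm, noise_std, rng=None):
    """Average clipped per-center updates and add Gaussian noise.

    updates:   one list of floats per center, all of the same length
    max_norm:  clipping bound on each center's contribution
    noise_std: standard deviation of the noise added per coordinate
    """
    rng = rng or random.Random()
    clipped = [clip_update(u, max_norm) for u in updates]
    n = len(clipped)
    dim = len(clipped[0])
    return [
        sum(u[i] for u in clipped) / n + rng.gauss(0.0, noise_std)
        for i in range(dim)
    ]
```

Clipping bounds how much any single center can move the shared model, and the noise scale controls the trade-off between privacy and accuracy; choosing both to meet a formal privacy budget is outside the scope of this sketch.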
COVID-19 prognostic modeling using CT radiomic features and machine learning algorithms: Analysis of a multi-institutional dataset of 14,339 patients
Background: We aimed to analyze the prognostic power of CT-based radiomics models using data of 14,339 COVID-19 patients.
Methods: Whole lung segmentations were performed automatically using a deep learning-based model to extract 107 intensity and texture radiomics features. We used four feature selection algorithms and seven classifiers. We evaluated the models using ten different splitting and cross-validation strategies, including non-harmonized and ComBat-harmonized datasets. The sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were reported.
Results: In the test dataset (4,301) consisting of CT and/or RT-PCR positive cases, AUC, sensitivity, and specificity of 0.83 ± 0.01 (CI95%: 0.81-0.85), 0.81, and 0.72, respectively, were obtained by the ANOVA feature selector + Random Forest (RF) classifier. Similar results were achieved in RT-PCR-only positive test sets (3,644). In the ComBat-harmonized dataset, the Relief feature selector + RF classifier resulted in the highest AUC, reaching 0.83 ± 0.01 (CI95%: 0.81-0.85), with a sensitivity and specificity of 0.77 and 0.74, respectively. ComBat harmonization did not yield a statistically significant improvement compared to the non-harmonized dataset. In leave-one-center-out, the combination of ANOVA feature selector and RF classifier resulted in the highest performance.
Conclusion: Lung CT radiomics features can be used for robust prognostic modeling of COVID-19. The predictive power of the proposed CT radiomics model is more reliable when using a large multicentric heterogeneous dataset, and the model may be used prospectively in clinical settings to manage COVID-19 patients.
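For reference, the sensitivity and specificity reported above follow directly from the confusion-matrix counts of a binary classifier. A minimal sketch is given below; the function name and the label convention (1 for the positive class, 0 for the negative class) are assumptions for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity
```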
COVID-19 Prognostic Modeling Using CT Radiomic Features and Machine Learning Algorithms: Analysis of a Multi-Institutional Dataset of 14,339 Patients
Objective: In this large multi-institutional study, we aimed to analyze the prognostic power of computed tomography (CT)-based radiomics models in COVID-19 patients.
Methods: CT images of 14,339 COVID-19 patients with overall survival outcome were collected from 19 medical centers. Whole lung segmentations were performed automatically using a previously validated deep learning-based model, and regions of interest were further evaluated and modified by a human observer. All images were resampled to an isotropic voxel size, intensities were discretized with a binning size of 64, and 105 radiomics features, including shape, intensity, and texture features, were extracted from the lung mask. Radiomics features were normalized using Z-score normalization. Highly correlated features (Pearson R2 > 0.99) were eliminated. We applied the Synthetic Minority Oversampling Technique (SMOTE) algorithm to the training set only, for the different models, to overcome class imbalance. We used 4 feature selection algorithms, namely Analysis of Variance (ANOVA), Kruskal-Wallis (KW), Recursive Feature Elimination (RFE), and Relief. For the classification task, we used seven classifiers, including Logistic Regression (LR), Least Absolute Shrinkage and Selection Operator (LASSO), Linear Discriminant Analysis (LDA), Random Forest (RF), AdaBoost (AB), Naïve Bayes (NB), and Multilayer Perceptron (MLP). The models were built and evaluated using training and testing sets, respectively. Specifically, we evaluated the models using 10 different splitting and cross-validation strategies, including different types of test datasets (e.g. non-harmonized vs. ComBat-harmonized datasets). The sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (AUC) were reported for model evaluation.
Results: In the test dataset (4301) consisting of CT and/or RT-PCR positive cases, AUC, sensitivity, and specificity of 0.83±0.01 (CI95%: 0.81-0.85), 0.81, and 0.72, respectively, were obtained by the ANOVA feature selector + RF classifier. In RT-PCR-only positive test sets (3644), similar results were achieved, and there was no statistically significant difference. In the ComBat-harmonized dataset, the Relief feature selector + RF classifier resulted in the highest AUC, reaching 0.83±0.01 (CI95%: 0.81-0.85), with sensitivity and specificity of 0.77 and 0.74, respectively. At the same time, ComBat harmonization did not yield a statistically significant improvement relative to the non-harmonized dataset. In leave-one-center-out, the combination of ANOVA feature selector and LR classifier resulted in the highest AUC (0.80±0.084), with sensitivity and specificity of 0.77 ± 0.11 and 0.76 ± 0.075, respectively.
Conclusion: Lung CT radiomics features can be used towards robust prognostic modeling of COVID-19 in large heterogeneous datasets gathered from multiple centers. As such, the CT radiomics-based model has significant potential for use in prospective clinical settings towards improved management of COVID-19 patients.
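The preprocessing steps named in the Methods (Z-score normalization followed by removal of highly correlated features) can be sketched as follows. This is a simplified pure-Python illustration under stated assumptions: the function names are hypothetical, features are held in a dict of equal-length lists, and of each highly correlated pair the feature seen first is the one kept, which the abstract does not specify.

```python
import math

def zscore(column):
    """Standardize a feature column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / n)
    if std == 0:
        return [0.0] * n  # constant feature carries no information
    return [(x - mean) / std for x in column]

def pearson_r(a, b):
    """Pearson correlation as the mean product of Z-scores."""
    za, zb = zscore(a), zscore(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)

def drop_correlated(features, threshold=0.99):
    """Return the feature names kept after removing near-duplicates.

    features:  dict mapping feature name -> list of values
    threshold: a feature is dropped when its squared correlation with
               an already kept feature exceeds this value
    """
    kept = []
    for name, values in features.items():
        if all(pearson_r(values, features[k]) ** 2 <= threshold for k in kept):
            kept.append(name)
    return kept
```

In a full pipeline, oversampling and feature selection would then be fitted on the training split only, as the Methods describe for SMOTE, so that no information from the test set leaks into the model.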
