A Survey of Tuning Parameter Selection for High-dimensional Regression
Penalized (or regularized) regression, as represented by Lasso and its
variants, has become a standard technique for analyzing high-dimensional data
when the number of variables substantially exceeds the sample size. The
performance of penalized regression relies crucially on the choice of the
tuning parameter, which determines the amount of regularization and hence the
sparsity level of the fitted model. The optimal choice of tuning parameter
depends on both the structure of the design matrix and the unknown random error
distribution (variance, tail behavior, etc.). This article reviews the current
literature on tuning parameter selection for high-dimensional regression from
both theoretical and practical perspectives. We discuss various strategies that
choose the tuning parameter to achieve prediction accuracy or support recovery.
We also review several recently proposed methods for tuning-free
high-dimensional regression. Comment: 28 pages, 2 figures
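As a rough illustration of how the tuning parameter controls sparsity, the sketch below uses the special case of an orthonormal design, where the Lasso solution reduces to coordinate-wise soft-thresholding of z = X^T y / n. This is a minimal sketch under that assumption, not a method taken from the survey; the function names and the input vector are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(z, lam):
    """Lasso solution when X^T X / n = I (assumed), given z = X^T y / n.

    Minimizes (1/2n)||y - X b||^2 + lam * ||b||_1 coordinate by coordinate.
    """
    return [soft_threshold(zj, lam) for zj in z]


def support_size(beta):
    """Number of nonzero coefficients in the fitted model."""
    return sum(1 for b in beta if b != 0.0)


if __name__ == "__main__":
    # Hypothetical marginal correlations, for illustration only.
    z = [3.0, -1.5, 0.4, 0.0, -0.2]
    for lam in (0.1, 0.5, 2.0, 3.0):
        beta = lasso_orthonormal(z, lam)
        print(lam, support_size(beta), beta)
```

Larger values of `lam` zero out more coefficients, which is the trade-off between regularization and sparsity that tuning parameter selection has to resolve.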
Resampling-based Confidence Intervals for Model-free Robust Inference on Optimal Treatment Regimes
We propose a new procedure for inference on optimal treatment regimes in the
model-free setting, which does not require specifying an outcome regression
model. Existing model-free estimators for optimal treatment regimes are usually
not suitable for the purpose of inference, because they either have nonstandard
asymptotic distributions or do not necessarily guarantee consistent estimation
of the parameter indexing the Bayes rule due to the use of surrogate loss. We
first study a smoothed robust estimator that directly targets the parameter
corresponding to the Bayes decision rule for optimal treatment regime
estimation. This estimator is shown to be asymptotically normal.
Furthermore, we verify that a resampling procedure provides asymptotically
accurate inference for both the parameter indexing the optimal treatment regime
and the optimal value function. A new algorithm is developed to calculate the
proposed estimator with substantially improved speed and stability. Numerical
results demonstrate the satisfactory performance of the new methods. Comment: 59 pages, 8 tables
Combining Attention-based Multiple Instance Learning and Gaussian Processes for CT Hemorrhage Detection
This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska Curie grant agreement No 860627 (CLARIFY Project) and also from the Spanish Ministry of Science and Innovation under project PID2019-105142RB-C22.
Intracranial hemorrhage (ICH) is a life-threatening emergency with high rates of mortality and morbidity. Rapid and accurate detection of ICH is crucial for patients to receive timely treatment. To automate the diagnosis of ICH, most deep learning models rely on huge numbers of slice labels for training. Unfortunately, the manual annotation of CT slices by radiologists is time-consuming and costly. To diagnose ICH, in this work, we propose an attention-based multiple instance learning (Att-MIL) approach implemented through the combination of an attention-based convolutional neural network (Att-CNN) and a variational Gaussian process for multiple instance learning (VGPMIL). Only scan-level labels are necessary for training. Our method (a) trains the model using scan labels and assigns each slice an attention weight, which can be used to provide slice-level predictions, and (b) uses the VGPMIL model based on low-dimensional features extracted by the Att-CNN to obtain improved predictions at both slice and scan levels. To analyze the performance of the proposed approach, our model has been trained on 1150 scans from an RSNA dataset and evaluated on 490 scans from an external CQ500 dataset. Our method outperforms other methods using the same scan-level training and achieves comparable or even better results than other methods relying on slice-level annotations.
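The attention step described above, in which each slice receives a weight and the weighted slices form a scan-level representation, can be sketched as follows. This is a minimal sketch of generic attention pooling over a bag of instances, not the authors' Att-CNN: the raw attention scores are assumed to be given (in practice they would come from a small learned network), and all names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw attention scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(slice_features, slice_scores):
    """Aggregate per-slice feature vectors into one scan-level vector.

    slice_features: list of equal-length feature vectors, one per CT slice.
    slice_scores:   one raw attention score per slice (assumed to be given).
    Returns (attention weights, weighted-average scan feature vector).
    """
    weights = softmax(slice_scores)
    dim = len(slice_features[0])
    scan_feature = [
        sum(w * f[d] for w, f in zip(weights, slice_features))
        for d in range(dim)
    ]
    return weights, scan_feature


if __name__ == "__main__":
    # Hypothetical 2-D features for three slices of one scan.
    feats = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]]
    scores = [0.1, 2.0, -1.0]
    weights, scan_vec = attention_pool(feats, scores)
    print(weights)   # per-slice weights, usable as slice-level evidence
    print(scan_vec)  # scan-level representation fed to a classifier
```

Because the weights sum to one, they can be read as the relative contribution of each slice to the scan-level decision, which is what makes slice-level predictions possible without slice-level labels.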
DeepCOVID-Fuse: A Multi-modality Deep Learning Model Fusing Chest X-Radiographs and Clinical Variables to Predict COVID-19 Risk Levels
Purpose: To present DeepCOVID-Fuse, a deep learning fusion model to predict
risk levels in patients with confirmed coronavirus disease 2019 (COVID-19) and
to evaluate the performance of pre-trained fusion models on full or partial
combinations of chest radiographs (CXRs) and clinical variables.
Materials and Methods: The initial CXRs, clinical variables and outcomes
(i.e., mortality, intubation, hospital length of stay, ICU admission) were
collected from February 2020 to April 2020 with reverse-transcription
polymerase chain reaction (RT-PCR) test results as the reference standard. The
risk level was determined by the outcome. The fusion model was trained on 1657
patients (Age: 58.30 +/- 17.74; Female: 807) and validated on 428 patients
(56.41 +/- 17.03; 190) from Northwestern Memorial HealthCare system and was
tested on 439 patients (56.51 +/- 17.78; 205) from a single holdout hospital.
Performance of pre-trained fusion models on full or partial modalities was
compared on the test set using the DeLong test for the area under the receiver
operating characteristic curve (AUC) and the McNemar test for accuracy,
precision, recall and F1.
Results: The accuracy of DeepCOVID-Fuse trained on CXRs and clinical
variables is 0.658, with an AUC of 0.842, which significantly outperformed (p <
0.05) models trained only on CXRs with an accuracy of 0.621 and AUC of 0.807
and only on clinical variables with an accuracy of 0.440 and AUC of 0.502. The
pre-trained fusion model with only CXRs as input increases accuracy to 0.632
and AUC to 0.813 and with only clinical variables as input increases accuracy
to 0.539 and AUC to 0.733.
Conclusion: The fusion model learns better feature representations across
different modalities during training and achieves good outcome predictions even
when only some of the modalities are used in testing.
Deep Gaussian processes for multiple instance learning: Application to CT intracranial hemorrhage detection
Background and objective: Intracranial hemorrhage (ICH) is a life-threatening emergency that can lead to brain damage or death, with high rates of mortality and morbidity. The fast and accurate detection of ICH is important for the patient to get an early and efficient treatment. To improve this diagnostic process, the application of Deep Learning (DL) models on head CT scans is an active area of research. Although promising results have been obtained, many of the proposed models require slice-level annotations by radiologists, which are costly and time-consuming.
Methods: We formulate ICH detection as a Multiple Instance Learning (MIL) problem that allows training with only scan-level annotations. We develop a new probabilistic method based on Deep Gaussian Processes (DGP) that can be trained in this MIL setting and accurately predicts ICH at both slice and scan level. The proposed DGPMIL model is able to capture complex feature relations by using multiple Gaussian Process (GP) layers, as we show experimentally.
Results: To highlight the advantages of DGPMIL in a general MIL setting, we first conduct several controlled experiments on the MNIST dataset. We show that multiple GP layers outperform one-layer GP models, especially for complex feature distributions. For ICH detection experiments, we use two public brain CT datasets (RSNA and CQ500). We first train a Convolutional Neural Network (CNN) with an attention mechanism to extract the image features, which are fed into our DGPMIL model to perform the final predictions. The results show that the DGPMIL model outperforms VGPMIL as well as the attention-based CNN for MIL and other state-of-the-art methods for this problem. The best performing DGPMIL model reaches an AUC-ROC of 0.957 (resp. 0.909) and an AUC-PR of 0.961 (resp. 0.889) on the RSNA (resp. CQ500) dataset.
Conclusion: The competitive performance at slice and scan level shows that the DGPMIL model provides an accurate diagnosis on slices without the need for slice-level annotations by radiologists during training. As MIL is a common problem setting, our model can be applied to a broader range of other tasks, especially in medical image classification, where it can help the diagnostic process.
Project P20_00286 funded by FEDER/Junta de Andalucía-Consejería de Transformación Económica, Industria, Conocimiento y Universidades; the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 860627 (CLARIFY Project). Funding for open access charge: Universidad de Granada / CBUA
The role of EGFR mutation as a prognostic factor in survival after diagnosis of brain metastasis in non-small cell lung cancer: A systematic review and meta-analysis
Background: The brain is a common site for metastasis in non-small-cell lung cancer (NSCLC). This study was designed to evaluate the relationship between the mutational status of the epidermal growth factor receptor (EGFR) and overall survival (OS) in NSCLC patients with brain metastases.
Methods: Searches were performed in PubMed, EmBase, and the Cochrane Library through September 2017 to identify studies evaluating the association of EGFR mutation with OS in NSCLC patients.
Results: A total of 4373 NSCLC patients with brain metastases from 18 studies were included. Mutated EGFR was associated with significantly improved OS compared with wild type. Subgroup analyses suggested that this relationship persisted in studies conducted in Eastern countries, with a retrospective design, with sample size ≥500, mean patient age ≥65.0 years, percentage male <50.0%, and percentage of patients receiving a tyrosine kinase inhibitor ≥30.0%. Finally, although significant publication bias was observed using the Egger test, the results were not changed after adjustment using the trim and fill method.
Conclusions: This meta-analysis suggests that EGFR mutation is an important predictive factor linked to improved OS for NSCLC patients with brain metastases. It can serve as a useful index in the prognostic assessment of NSCLC patients with brain metastases.
Effect of Covalent Conjugation with Polyphenols by Free Radical Method on Gel Properties of Soybean Protein-Stabilized Emulsion
In this study, a covalent conjugate between ferulic acid (FA) and soybean protein isolate (SPI) was prepared by the free radical method and used to prepare glucono-δ-lactone (GDL)-induced emulsion gels. The effects of covalent binding to FA on SPI structure, emulsion properties and emulsion gel characteristics were investigated. The optimum concentration of FA was determined to be 150 μmol/g protein based on the intermolecular forces, textural properties, and water-holding capacity of SPI-FA (SFA)-stabilized emulsion gels. Under this condition, spectral analysis showed that FA had a fluorescence quenching effect on SPI and that, after covalent binding to FA, the β-sheet content of SPI decreased while the α-helix, β-turn and random coil contents increased. The absolute value of the zeta potential and the interfacial protein content of SFA-stabilized emulsions increased, and the mean particle size and apparent viscosity decreased. The final storage modulus (G’) of SFA-stabilized emulsion gels increased, and the changes in relaxation times and peak ratios observed in low-field nuclear magnetic resonance (NMR) measurements indicated that the SFA-stabilized emulsion gels had better hydration properties. Moreover, they had a more uniform and dense porous network structure. These results show that SPI covalently bound to FA at 150 μmol/g protein is valuable in the preparation of emulsion gels.