111 research outputs found

A Survey of Tuning Parameter Selection for High-dimensional Regression

Author: Wang Lan
Wu Yunan
Publication date: 09/08/2019

Penalized (or regularized) regression, as represented by Lasso and its variants, has become a standard technique for analyzing high-dimensional data when the number of variables substantially exceeds the sample size. The performance of penalized regression relies crucially on the choice of the tuning parameter, which determines the amount of regularization and hence the sparsity level of the fitted model. The optimal choice of tuning parameter depends on both the structure of the design matrix and the unknown random error distribution (variance, tail behavior, etc.). This article reviews the current literature on tuning parameter selection for high-dimensional regression from both theoretical and practical perspectives. We discuss various strategies that choose the tuning parameter to achieve prediction accuracy or support recovery. We also review several recently proposed methods for tuning-free high-dimensional regression. Comment: 28 pages, 2 figures

arXiv.org e-Print Archive

University of Miami: Scholarship Miami

Resampling-based Confidence Intervals for Model-free Robust Inference on Optimal Treatment Regimes

Author: Wang Lan
Wu Yunan
Publication date: 03/07/2020

We propose a new procedure for inference on optimal treatment regimes in the model-free setting, which does not require specifying an outcome regression model. Existing model-free estimators for optimal treatment regimes are usually not suitable for the purpose of inference, because they either have nonstandard asymptotic distributions or do not necessarily guarantee consistent estimation of the parameter indexing the Bayes rule due to the use of surrogate loss. We first study a smoothed robust estimator that directly targets the parameter corresponding to the Bayes decision rule for optimal treatment regime estimation. This estimator is shown to have an asymptotically normal distribution. Furthermore, we verify that a resampling procedure provides asymptotically accurate inference for both the parameter indexing the optimal treatment regime and the optimal value function. A new algorithm is developed to calculate the proposed estimator with substantially improved speed and stability. Numerical results demonstrate the satisfactory performance of the new methods. Comment: 59 pages, 8 tables

arXiv.org e-Print Archive

University of Miami: Scholarship Miami

Combining Attention-based Multiple Instance Learning and Gaussian Processes for CT Hemorrhage Detection

Author: Hernández Sánchez Enrique
Molina Soriano Rafael
Schmidt Arne
Wu Yunan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/07/2021

This work has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sklodowska Curie grant agreement No 860627 (CLARIFY Project) and also from the Spanish Ministry of Science and Innovation under project PID2019-105142RB-C22. Intracranial hemorrhage (ICH) is a life-threatening emergency with high rates of mortality and morbidity. Rapid and accurate detection of ICH is crucial for patients to get a timely treatment. In order to achieve the automatic diagnosis of ICH, most deep learning models rely on huge amounts of slice labels for training. Unfortunately, the manual annotation of CT slices by radiologists is time-consuming and costly. To diagnose ICH, in this work, we propose to use an attention-based multiple instance learning (Att-MIL) approach implemented through the combination of an attention-based convolutional neural network (Att-CNN) and a variational Gaussian process for multiple instance learning (VGPMIL). Only labels at scan-level are necessary for training. Our method (a) trains the model using scan labels and assigns each slice with an attention weight, which can be used to provide slice-level predictions, and (b) uses the VGPMIL model based on low-dimensional features extracted by the Att-CNN to obtain improved predictions both at slice and scan levels. To analyze the performance of the proposed approach, our model has been trained on 1150 scans from an RSNA dataset and evaluated on 490 scans from an external CQ500 dataset. Our method outperforms other methods using the same scan-level training and is able to achieve comparable or even better results than other methods relying on slice-level annotations. European Commission 860627; Spanish Government PID2019-105142RB-C2

Repositorio Institucional Universidad de Granada

DeepCOVID-Fuse: A Multi-modality Deep Learning Model Fusing Chest X-Radiographs and Clinical Variables to Predict COVID-19 Risk Levels

Author: Dravid Amil
Katsaggelos Aggelos K.
Wehbe Ramsey Michael
Wu Yunan
Publication date: 20/01/2023

Purpose: To present DeepCOVID-Fuse, a deep learning fusion model to predict risk levels in patients with confirmed coronavirus disease 2019 (COVID-19) and to evaluate the performance of pre-trained fusion models on full or partial combinations of chest radiographs (chest X-rays, CXRs) and clinical variables. Materials and Methods: The initial CXRs, clinical variables and outcomes (i.e., mortality, intubation, hospital length of stay, ICU admission) were collected from February 2020 to April 2020 with reverse-transcription polymerase chain reaction (RT-PCR) test results as the reference standard. The risk level was determined by the outcome. The fusion model was trained on 1657 patients (Age: 58.30 +/- 17.74; Female: 807) and validated on 428 patients (56.41 +/- 17.03; 190) from Northwestern Memorial HealthCare system and was tested on 439 patients (56.51 +/- 17.78; 205) from a single holdout hospital. Performance of pre-trained fusion models on full or partial modalities was compared on the test set using the DeLong test for the area under the receiver operating characteristic curve (AUC) and the McNemar test for accuracy, precision, recall and F1. Results: The accuracy of DeepCOVID-Fuse trained on CXRs and clinical variables is 0.658, with an AUC of 0.842, which significantly outperformed (p < 0.05) models trained only on CXRs, with an accuracy of 0.621 and AUC of 0.807, and only on clinical variables, with an accuracy of 0.440 and AUC of 0.502. The pre-trained fusion model with only CXRs as input increases accuracy to 0.632 and AUC to 0.813, and with only clinical variables as input increases accuracy to 0.539 and AUC to 0.733. Conclusion: The fusion model learns better feature representations across different modalities during training and achieves good outcome predictions even when only some of the modalities are used in testing.

arXiv.org e-Print Archive

Deep Gaussian processes for multiple instance learning: Application to CT intracranial hemorrhage detection

Author: Katsaggelos Aggelos
López Pérez Miguel
Molina Soriano Rafael
Schmidt Arne
Wu Yunan
Publication venue: 'Elsevier BV'
Publication date: 01/06/2022

Background and objective: Intracranial hemorrhage (ICH) is a life-threatening emergency that can lead to brain damage or death, with high rates of mortality and morbidity. The fast and accurate detection of ICH is important for the patient to get an early and efficient treatment. To improve this diagnostic process, the application of Deep Learning (DL) models on head CT scans is an active area of research. Although promising results have been obtained, many of the proposed models require slice-level annotations by radiologists, which are costly and time-consuming. Methods: We formulate the ICH detection as a problem of Multiple Instance Learning (MIL) that allows training with only scan-level annotations. We develop a new probabilistic method based on Deep Gaussian Processes (DGP) that is able to train with this MIL setting and accurately predict ICH at both slice- and scan-level. The proposed DGPMIL model is able to capture complex feature relations by using multiple Gaussian Process (GP) layers, as we show experimentally. Results: To highlight the advantages of DGPMIL in a general MIL setting, we first conduct several controlled experiments on the MNIST dataset. We show that multiple GP layers outperform one-layer GP models, especially for complex feature distributions. For ICH detection experiments, we use two public brain CT datasets (RSNA and CQ500). We first train a Convolutional Neural Network (CNN) with an attention mechanism to extract the image features, which are fed into our DGPMIL model to perform the final predictions. The results show that the DGPMIL model outperforms VGPMIL as well as the attention-based CNN for MIL and other state-of-the-art methods for this problem. The best performing DGPMIL model reaches an AUC-ROC of 0.957 (resp. 0.909) and an AUC-PR of 0.961 (resp. 0.889) on the RSNA (resp. CQ500) dataset. Conclusion: The competitive performance at slice- and scan-level shows that the DGPMIL model provides an accurate diagnosis on slices without the need for slice-level annotations by radiologists during training. As MIL is a common problem setting, our model can be applied to a broader range of other tasks, especially in medical image classification, where it can help the diagnostic process. Project P20_00286 funded by FEDER/Junta de Andalucía-Consejería de Transformación Económica, Industria, Conocimiento y Universidades; the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska Curie grant agreement No 860627 (CLARIFY Project). Funding for open access charge: Universidad de Granada / CBUA

Repositorio Institucional Universidad de Granada

The role of EGFR mutation as a prognostic factor in survival after diagnosis of brain metastasis in non-small cell lung cancer: A systematic review and meta-analysis

Author: Han Yunan
Li Wen-Ya
Liu Xing-Yu
Miao Zhi-Feng
Song Yong-Xi
Wang Zhen-Ning
Wu Jian-Hua
Xu Hao
Xu Hui-Mian
Xu Ying-Ying
Yin Song-Cheng
Zhao Ting-Ting
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

Background: The brain is a common site for metastasis in non-small-cell lung cancer (NSCLC). This study was designed to evaluate the relationship between the mutational status of the epidermal growth factor receptor (EGFR) and overall survival (OS) in NSCLC patients with brain metastases. Methods: Searches were performed in PubMed, EmBase, and the Cochrane Library to identify studies evaluating the association of EGFR mutation with OS in NSCLC patients through September 2017. Results: 4373 NSCLC patients with brain metastases in 18 studies were involved. Mutated EGFR was associated with significantly improved OS compared with wild type. Subgroup analyses suggested that this relationship persisted in studies conducted in Eastern countries, with retrospective design, with sample size ≥500, mean age of patients ≥65.0 years, percentage male < 50.0%, and percentage of patients receiving tyrosine kinase inhibitor ≥30.0%. Finally, although significant publication bias was observed using the Egger test, the results were not changed after adjustment using the trim and fill method. Conclusions: This meta-analysis suggests that EGFR mutation is an important predictive factor linked to improved OS for NSCLC patients with brain metastases. It can serve as a useful index in the prognostic assessment of NSCLC patients with brain metastases.

Directory of Open Access Journals

Digital Commons@Becker

Effect of Covalent Conjugation with Polyphenols by Free Radical Method on Gel Properties of Soybean Protein-Stabilized Emulsion

Author: MENG Ganlu, CHU Yunan, WU Yi, WANG Jubing, JIN Hua, XU Jing
Publication venue: China Food Publishing Company
Publication date: 01/01/2024

In this study, a covalent conjugate between ferulic acid (FA) and soybean protein isolate (SPI) was prepared by the free radical method and was used to prepare gluconolactone (GDL)-induced emulsion gels. The effects of covalent binding to FA on SPI structure, emulsion properties and emulsion gel characteristics were investigated. The optimum concentration of FA was determined as 150 μmol/g protein based on intermolecular forces, textural properties, and water-holding capacity of SPI-FA (SFA) stabilized emulsion gels. Under this condition, spectral analysis showed that FA had a fluorescence quenching effect on SPI, and that covalent binding to FA decreased the β-sheet content and increased the α-helix, β-turn and random coil contents of SPI. The absolute value of zeta potential and interfacial protein content of SFA stabilized emulsions increased, and the mean particle size and apparent viscosity decreased. The final storage modulus (G’) of SFA stabilized emulsion gels increased, and the changes in relaxation times and peak ratios observed in low-field nuclear magnetic resonance (NMR) measurements indicated that the SFA stabilized emulsion gels had better hydration properties. Moreover, they had a more uniform and dense porous network structure. These results show that SPI covalently bound to 150 μmol/g protein of FA is valuable in the preparation of emulsion gels.

Directory of Open Access Journals