277 research outputs found

    STU-Net: Scalable and Transferable Medical Image Segmentation Models Empowered by Large-Scale Supervised Pre-training

    Large-scale models pre-trained on large-scale datasets have profoundly advanced the development of deep learning. However, the state-of-the-art models for medical image segmentation are still small-scale, with parameter counts only in the tens of millions. Scaling them up to higher orders of magnitude is rarely explored. An overarching goal of exploring large-scale models is to train them on large-scale medical segmentation datasets for better transfer capacities. In this work, we design a series of Scalable and Transferable U-Net (STU-Net) models, with parameter sizes ranging from 14 million to 1.4 billion. Notably, the 1.4B STU-Net is the largest medical image segmentation model to date. Our STU-Net is based on the nnU-Net framework due to its popularity and impressive performance. We first refine the default convolutional blocks in nnU-Net to make them scalable. Then, we empirically evaluate different scaling combinations of network depth and width, discovering that it is optimal to scale model depth and width together. We train our scalable STU-Net models on the large-scale TotalSegmentator dataset and find that increasing model size brings a stronger performance gain. This observation reveals that large models are promising in medical image segmentation. Furthermore, we evaluate the transferability of our model on 14 downstream datasets for direct inference and 3 datasets for further fine-tuning, covering various modalities and segmentation targets. We observe good performance of our pre-trained model in both direct inference and fine-tuning. The code and pre-trained models are available at https://github.com/Ziyan-Huang/STU-Net

    A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation

    Although deep learning has revolutionized abdominal multi-organ segmentation, models often struggle with generalization due to training on small, specific datasets. With the recent emergence of large-scale datasets, some important questions arise: Can models trained on these datasets generalize well on different ones? If yes/no, how to further improve their generalizability? To address these questions, we introduce A-Eval, a benchmark for the cross-dataset Evaluation ('Eval') of Abdominal ('A') multi-organ segmentation. We employ training sets from four large-scale public datasets: FLARE22, AMOS, WORD, and TotalSegmentator, each providing extensive labels for abdominal multi-organ segmentation. For evaluation, we incorporate the validation sets from these datasets along with the training set from the BTCV dataset, forming a robust benchmark comprising five distinct datasets. We evaluate the generalizability of various models using the A-Eval benchmark, with a focus on diverse data usage scenarios: training on individual datasets independently, utilizing unlabeled data via pseudo-labeling, mixing different modalities, and joint training across all available datasets. Additionally, we explore the impact of model sizes on cross-dataset generalizability. Through these analyses, we underline the importance of effective data usage in enhancing models' generalization capabilities, offering valuable insights for assembling large-scale datasets and improving training strategies. The code and pre-trained models are available at https://github.com/uni-medical/A-Eval

    SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks

    The Segment Anything Model (SAM) has achieved impressive results for natural image segmentation with input prompts such as points and bounding boxes. Its success is largely owed to massive labeled training data. However, SAM does not perform well when applied directly to medical image segmentation because it lacks medical knowledge: it was not trained on medical images. To incorporate medical knowledge into SAM, we introduce SA-Med2D-20M, a large-scale segmentation dataset of 2D medical images built upon numerous public and private datasets. It consists of 4.6 million 2D medical images and 19.7 million corresponding masks, covering almost the whole body and showing significant diversity. This paper describes all the datasets collected in SA-Med2D-20M and details how these datasets are processed. Furthermore, comprehensive statistics of SA-Med2D-20M are presented to facilitate better use of our dataset, which can help researchers build medical vision foundation models or apply their models to downstream medical applications. We hope that the large scale and diversity of SA-Med2D-20M can be leveraged to develop medical artificial intelligence for enhancing diagnosis, medical image analysis, knowledge sharing, and education. The data with the redistribution license is publicly available at https://github.com/OpenGVLab/SAM-Med2D

    Magnetic resonance imaging based deep-learning model: a rapid, high-performance, automated tool for testicular volume measurements

    Background: Testicular volume (TV) is an essential parameter for monitoring testicular functions and pathologies. Nevertheless, current measurement tools, including orchidometers and ultrasonography, encounter challenges in obtaining accurate and personalized TV measurements. Purpose: Based on magnetic resonance imaging (MRI), this study aimed to establish a deep learning model and evaluate its efficacy in segmenting the testes and measuring TV. Materials and methods: The study cohort consisted of retrospectively collected patient data (N = 200) and a prospectively collected dataset comprising 10 healthy volunteers. The retrospective dataset was divided into training and independent validation sets, with an 8:2 random distribution. Each of the 10 healthy volunteers underwent 5 scans (forming the testing dataset) to evaluate the measurement reproducibility. A ResUNet algorithm was applied to segment the testes. Volume of each testis was calculated by multiplying the voxel volume by the number of voxels. Manually determined masks by experts were used as ground truth to assess the performance of the deep learning model. Results: The deep learning model achieved a mean Dice score of 0.926 ± 0.034 (0.921 ± 0.026 for the left testis and 0.926 ± 0.034 for the right testis) in the validation cohort and a mean Dice score of 0.922 ± 0.02 (0.931 ± 0.019 for the left testis and 0.932 ± 0.022 for the right testis) in the testing cohort. There was strong correlation between the manual and automated TV (R² ranging from 0.974 to 0.987 in the validation cohort; R² ranging from 0.936 to 0.973 in the testing cohort). The volume differences between the manual and automated measurements were 0.838 ± 0.991 (0.209 ± 0.665 for LTV and 0.630 ± 0.728 for RTV) in the validation cohort and 0.815 ± 0.824 (0.303 ± 0.664 for LTV and 0.511 ± 0.444 for RTV) in the testing cohort. Additionally, the deep-learning model exhibited excellent reproducibility (intraclass correlation >0.9) in determining TV. Conclusion: The MRI-based deep learning model is an accurate and reliable tool for measuring TV
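
The volume calculation and the Dice score named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy masks, the voxel spacing, and the choice of millilitres as the output unit are all assumptions made for illustration.

```python
from itertools import product

def mask_volume_ml(voxels, spacing_mm):
    """Volume of a segmentation mask: number of voxels times single-voxel volume.

    `voxels` is a set of (z, y, x) indices inside the mask (hypothetical
    representation); `spacing_mm` is the voxel size along each axis in mm.
    The result is converted from mm^3 to mL (assumed unit).
    """
    dz, dy, dx = spacing_mm
    return len(voxels) * (dz * dy * dx) / 1000.0

def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two voxel sets."""
    if not pred and not truth:
        return 1.0  # two empty masks agree perfectly by convention
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

# Toy masks (made up): two 4x4x4 cubes shifted by one voxel along z.
truth = set(product(range(2, 6), range(2, 6), range(2, 6)))  # 64 voxels
pred = set(product(range(3, 7), range(2, 6), range(2, 6)))   # 64 voxels, 48 shared

volume = mask_volume_ml(truth, spacing_mm=(2.0, 1.0, 1.0))
dice = dice_score(pred, truth)
print(f"volume = {volume:.3f} mL, Dice = {dice:.2f}")
```

With real data the masks would come from the segmentation output and the voxel spacing from the image header.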

    Potential Diagnostic Applications of Multi-Delay Arterial Spin Labeling in Early Alzheimer’s Disease: The Chinese Imaging, Biomarkers, and Lifestyle Study

    Background: Cerebral blood flow (CBF) alterations are involved in the onset and progression of Alzheimer’s disease (AD) and can be a potential biomarker. However, CBF measured by single-delay arterial spin labeling (ASL) lacked accuracy for discriminating mild cognitive impairment (MCI, an early stage of AD). Multi-delay ASL can provide not only CBF quantification but also arterial transit time (ATT). Unfortunately, the technique has scarcely been applied to the diagnosis of AD. Here, we examined the utility of ASL with 1-delay and 7-delay in ten regions of interest (ROIs) to identify MCI and AD. Materials and Methods: Pseudocontinuous ASL (pCASL) MRI was acquired on a 3T GE scanner in adults from the Chinese Imaging, Biomarkers, and Lifestyle (CIBL) Study of AD cohort, including 26 normal cognition (NC), 37 MCI, and 39 AD. Receiver operating characteristic (ROC) analyses with 1-delay and 7-delay ASL were performed for the identification of MCI and AD. The DeLong test was used to compare ROC curves. Results: For CBF of 1-delay or 7-delay the AUCs showed moderate-high performance for the AD/NC and AD/MCI comparisons (AUC = 0.83∼0.96) (p 0.05). Conclusion: The combination of CBF and ATT with 7-delay ASL showed higher performance for identification of MCI than CBF of 1-delay. When it was added to sex, age, APOE ε4 carrier status, and education years, the diagnostic performance increased further, presenting a potential imaging biomarker in early AD
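
As a rough illustration of the AUC reported by a ROC analysis, the sketch below computes it as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half (the Mann-Whitney formulation). The scores are invented toy values, not study data, and the function name is hypothetical; the DeLong comparison mentioned in the abstract is not implemented here.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 if the case scores higher than the control,
    0.5 if the two scores tie, and 0 otherwise.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy classifier outputs (made up); higher means "more likely a case".
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]

auc = roc_auc(cases, controls)
print(f"AUC = {auc:.3f}")
```

If lower values of a measure indicate disease, the scores would need to be negated (or the groups swapped) before applying this function.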

    Drivers of cropland abandonment in mountainous areas: A household decision model on farming scale and a case study of Southwest China

    Cropland abandonment has emerged as a prevalent phenomenon in the mountainous areas of China. While there is a general understanding that this new trend is driven by the rising opportunity cost of rural labor, rigorous theoretical and empirical analyses are largely absent. This paper first develops a theoretical model to investigate household decisions on farming scale when the off-farm labor market is accessible and there is heterogeneity of farmland productivity and distribution. The model is capable of explaining the hidden reasons for cropland abandonment in sloping and agriculturally less-favored locations. The model also unveils the impacts of heterogeneity of household labor on fallow decisions and the efficiency loss due to an imperfect labor market. The model is empirically tested by applying the Probit and Logit estimators to a unique household and land-plot survey dataset which contains 5258 plots of 599 rural households in Chongqing, a provincial-level municipality in Southwest China. The survey shows that more than 30% of the sample plots have been abandoned, mainly since 1992. The econometric results are consistent with our theoretical expectations. This work would help policy-makers and stakeholders to identify areas with a high probability of land abandonment and farming practice which is less sustainable in the mountainous areas

    Cerebrospinal fluid oligoclonal bands in Chinese patients with multiple sclerosis: the prevalence and its association with clinical features

    Background: Cerebrospinal fluid oligoclonal band (CSF-OCB) is an established biomarker in diagnosing multiple sclerosis (MS). However, there are no nationwide data on CSF-OCB prevalence and its diagnostic performance in Chinese MS patients, especially data obtained under a common standard operation procedure (SOP). Methods: With a consensus SOP and the same isoelectric focusing system, we conducted a nationwide multi-center study on OCB status and consecutively recruited 483 MS patients and 880 non-MS patients, including neuro-inflammatory diseases (NID, n = 595) and non-inflammatory neurological diseases (NIND, n = 285). Using a standardized case report form (CRF) to collect the clinical, radiological, immunological, and CSF data, we explored the association of CSF-OCB positivity with patient characteristics and the diagnostic performance of CSF-OCB in Chinese MS patients. Prospective source data collection, retrospective data acquisition, and statistical data analysis were used. Findings: 369 (76.4%) MS patients were OCB-positive, while 109 NID patients (18.3%) and 6 NIND patients (2.1%) were OCB-positive. Time from symptom onset to diagnosis was significantly shorter in OCB-positive than in OCB-negative MS patients (13.2 vs 23.7 months, P = 0.020). The prevalence of CSF-OCB in Chinese MS patients was significantly higher in high-latitude regions (41°-50°N) (P = 0.016) and at high altitudes (>1000 m) (P = 0.025). The diagnostic performance of CSF-OCB in differentiating MS from non-MS patients yielded a sensitivity of 76% and a specificity of 87%. Interpretation: The nationwide prevalence of CSF-OCB was 76.4% in Chinese MS patients, and CSF-OCB demonstrated a good diagnostic performance in differentiating MS from other CNS diseases. The CSF-OCB prevalence showed a correlation with high latitude and altitude in Chinese MS patients
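
The sensitivity and specificity quoted in this abstract can be checked from the reported counts. The short sketch below does that arithmetic; the function name is hypothetical, and it assumes that all 880 non-MS patients (NID plus NIND) form the negative group, which the abstract implies but does not state as a formula.

```python
def sensitivity_specificity(true_pos, n_diseased, false_pos, n_non_diseased):
    """Return (sensitivity, specificity) from raw counts.

    sensitivity = positives among the diseased / all diseased
    specificity = negatives among the non-diseased / all non-diseased
    """
    sensitivity = true_pos / n_diseased
    specificity = (n_non_diseased - false_pos) / n_non_diseased
    return sensitivity, specificity

# Counts taken from the abstract: 369 of 483 MS patients were OCB-positive;
# 109 of 595 NID and 6 of 285 NIND patients were OCB-positive.
sens, spec = sensitivity_specificity(
    true_pos=369,
    n_diseased=483,
    false_pos=109 + 6,
    n_non_diseased=595 + 285,
)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under that assumption the counts give roughly 76.4% sensitivity and 86.9% specificity, in line with the rounded 76% and 87% reported.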