
    Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation

    This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM). We assess the model's performance across a range of OCR tasks, including scene text recognition, handwritten text recognition, handwritten mathematical expression recognition, table structure recognition, and information extraction from visually rich documents. The evaluation reveals that GPT-4V performs well in recognizing and understanding Latin content but struggles with multilingual scenarios and complex tasks. Specifically, it shows limitations when dealing with non-Latin languages and with complex tasks such as handwritten mathematical expression recognition, table structure recognition, and end-to-end semantic entity recognition and pair extraction from document images. Based on these observations, we affirm the necessity and continued research value of specialized OCR models. In general, despite its versatility in handling diverse OCR tasks, GPT-4V does not outperform existing state-of-the-art OCR models. How to fully utilize pre-trained general-purpose LMMs such as GPT-4V for OCR downstream tasks remains an open problem. The study offers a critical reference for future research in OCR with LMMs. The evaluation pipeline and results are available at https://github.com/SCUT-DLVCLab/GPT-4V_OCR

    Development and validation of an interpretable machine learning model and online web-based calculator based on social-ecosystem theory for early prediction of postpartum depression: a longitudinal study

    Background: Postpartum depression (PPD) has emerged as a global public health issue that can cause significant harm to mothers and their families. There is an urgent need for a robust early risk prediction model to enable accurate prediction of postpartum depression in hospitals. Methods: This was a longitudinal study. Using social ecosystem theory, we collected multi-dimensional and multi-angle risk factors for early postpartum depression from delivery to discharge, and conducted 42-day postpartum follow-ups using the Edinburgh Postnatal Depression Scale (EPDS). We strictly adhered to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) checklist, used 10 machine learning (ML) algorithms to construct and validate the prediction model, and employed the Shapley additive explanation (SHAP) algorithm to explain the model. Risk stratification was performed through K-Means clustering analysis, ultimately resulting in a clinical screening tool for early PPD risk prediction. Results: Among the prediction models constructed with the 10 ML algorithms, the random forest model performed best, with an area under the receiver operating characteristic curve (AUC) of 0.91 (95% CI: 0.85–0.96) in internal validation and 0.77 (95% CI: 0.70–0.85) in external validation. K-Means clustering analysis yielded low risk probability (0, 0.26], medium risk probability (0.26, 0.63), and high risk probability (0.63, 1), and the model was interpreted using SHAP values. Finally, we developed an online risk prediction calculator. Conclusion: This study developed an interpretable risk prediction model for early PPD, which may help healthcare providers identify at-risk mothers and implement intervention measures early, preventing the occurrence of PPD.
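    The risk bands reported in this abstract can be illustrated with a minimal sketch. It assumes the model outputs a probability in [0, 1]. The function name `risk_tier` is hypothetical and is not taken from the authors' calculator. The abstract does not say which band the exact values 0, 0.63, and 1 fall into, so their assignment below is an assumption.

```python
def risk_tier(probability: float) -> str:
    """Map a predicted PPD probability to a risk band.

    Cut-offs 0.26 and 0.63 come from the abstract's K-Means result.
    Assumptions (not stated in the abstract): 0 counts as low,
    and exactly 0.63 and 1 count as high.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability <= 0.26:   # low band: (0, 0.26]
        return "low"
    if probability < 0.63:    # medium band: (0.26, 0.63)
        return "medium"
    return "high"             # high band: (0.63, 1)


if __name__ == "__main__":
    for p in (0.10, 0.26, 0.50, 0.90):
        print(p, risk_tier(p))
```

    This only reproduces the published thresholds; it says nothing about how the underlying random forest computes the probability.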

    AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking

    We introduce the largest abdominal CT dataset (termed AbdomenAtlas) of 20,460 three-dimensional CT volumes sourced from 112 hospitals across diverse populations, geographies, and facilities. AbdomenAtlas provides 673K high-quality masks of anatomical structures in the abdominal region annotated by a team of 10 radiologists with the help of AI algorithms. We start by having expert radiologists manually annotate 22 anatomical structures in 5,246 CT volumes. Following this, a semi-automatic annotation procedure is performed for the remaining CT volumes, where radiologists revise the annotations predicted by AI, and in turn, AI improves its predictions by learning from revised annotations. Such a large-scale, detailed-annotated, and multi-center dataset is needed for two reasons. Firstly, AbdomenAtlas provides important resources for AI development at scale, branded as large pre-trained models, which can alleviate the annotation workload of expert radiologists to transfer to broader clinical applications. Secondly, AbdomenAtlas establishes a large-scale benchmark for evaluating AI algorithms -- the more data we use to test the algorithms, the better we can guarantee reliable performance in complex clinical scenarios. An ISBI & MICCAI challenge named BodyMaps: Towards 3D Atlas of Human Body was launched using a subset of our AbdomenAtlas, aiming to stimulate AI innovation and to benchmark segmentation accuracy, inference efficiency, and domain generalizability. We hope our AbdomenAtlas can set the stage for larger-scale clinical trials and offer exceptional opportunities to practitioners in the medical imaging community. Code, models, and datasets are available at https://www.zongweiz.com/dataset
    Comment: Published in Medical Image Analysis

    Using a specific wavelength for broadcast/multicast transmission in hybrid WDM/TDM PON


    Survival Comparisons of Hepatic Arterial Infusion Chemotherapy With mFOLFOX and Transarterial Chemoembolization in Patients With Unresectable Intrahepatic Cholangiocarcinoma

    Background: Intrahepatic cholangiocarcinoma (ICC) has a poor prognosis, and 40%-60% of patients present with advanced disease at the time of diagnosis. Transarterial chemoembolization (TACE) and hepatic arterial infusion chemotherapy (HAIC) have recently been used in unresectable ICC. The aim of this study was to compare the survival of unresectable ICC patients after TACE and HAIC treatment. Methods: Between March 2011 and October 2019, a total of 126 patients with unresectable ICC, as evident from biopsies and imaging, who had received TACE or HAIC were enrolled in this study. Baseline characteristics and survival differences were compared between the TACE and HAIC treatment groups. Results: ICC patients had significantly higher survival rates after HAIC treatment than after TACE treatment [1-year overall survival (OS) rates: 60.2% vs. 42.9%, 2-year OS rates: 38.7% vs. 29.4%, P=0.028; 1-year progression-free survival (PFS) rates: 15.0% vs. 20.0%, 2-year PFS rates: 0% vs. 0%, P=0.641; 1-year only intrahepatic PFS (OIPFS) rates: 35.0% vs. 24.4%, 2-year OIPFS rates: 13.1% vs. 14.6%, P=0.026]. Multivariate Cox regression analysis showed that HAIC was a significant and independent factor for OS and OIPFS in the study cohort. Conclusions: HAIC is superior to TACE for treatment of unresectable ICC. A new tumor response evaluation procedure for HAIC treatment in unresectable ICC patients is needed to provide better therapeutic strategies. A randomized clinical trial comparing the survival benefits of HAIC and TACE is therefore being considered.