66 research outputs found

    Joint Layout Analysis, Character Detection and Recognition for Historical Document Digitization

    In this paper, we propose an end-to-end trainable framework for restoring historical document content in the correct reading order. In this framework, two branches, a character branch and a layout branch, are added behind the feature extraction network. The character branch localizes individual characters in a document image and recognizes them simultaneously; a post-processing step then groups them into text lines. The layout branch, based on a fully convolutional network, outputs a binary mask, on which we apply a Hough transform for line detection, and we combine the character results with this layout information to restore the document content. The two branches can be trained in parallel and are easy to train. Furthermore, we propose a re-score mechanism to minimize recognition error. Experimental results on the extended Chinese historical document dataset MTHv2 demonstrate the effectiveness of the proposed framework. Comment: 6 pages, 6 figures
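The abstract does not spell out the post-processing that groups recognized characters into text lines; a minimal sketch of one plausible approach, clustering character boxes by vertical center and then ordering each line left-to-right (the box format and the overlap threshold are assumptions, not the paper's method):

```python
def group_into_lines(boxes, y_tol=0.5):
    """Group character boxes (x, y, w, h) into text lines.

    Two boxes join the same line when their vertical centers differ by
    less than y_tol times the line's mean box height.
    """
    if not boxes:
        return []
    lines = []
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):  # top-to-bottom
        cy = box[1] + box[3] / 2
        placed = False
        for line in lines:
            line_cy = sum(b[1] + b[3] / 2 for b in line) / len(line)
            mean_h = sum(b[3] for b in line) / len(line)
            if abs(cy - line_cy) < y_tol * mean_h:
                line.append(box)
                placed = True
                break
        if not placed:
            lines.append([box])
    # order characters within each line left-to-right
    return [sorted(line, key=lambda b: b[0]) for line in lines]
```

A single vertical-center threshold suffices for horizontal text; skewed or vertical layouts would need the layout branch's line estimate instead.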

    Large Language Models are Zero Shot Hypothesis Proposers

    Significant scientific discoveries have driven the progress of human civilisation. The explosion of scientific literature and data has created information barriers across disciplines that slow the pace of scientific discovery. Large Language Models (LLMs) hold a wealth of global and interdisciplinary knowledge that promises to break down these barriers and foster a new wave of scientific discovery. However, the potential of LLMs for scientific discovery has not been formally explored. In this paper, we begin by investigating whether LLMs can propose scientific hypotheses. To this end, we construct a dataset consisting of background-knowledge and hypothesis pairs from the biomedical literature. The dataset is divided into training, seen, and unseen test sets based on publication date to control visibility. We then evaluate the hypothesis-generation capabilities of various top-tier instructed models in zero-shot, few-shot, and fine-tuning settings, including both closed-source and open-source LLMs. Additionally, we introduce an LLM-based multi-agent cooperative framework with different role designs and external tools to enhance hypothesis generation. We also design four metrics, informed by a comprehensive review, to evaluate the generated hypotheses in both ChatGPT-based and human evaluations. Through experiments and analyses, we arrive at the following findings: 1) LLMs surprisingly generate valid hypotheses, unseen in their training data, from the test literature. 2) Increasing uncertainty facilitates candidate generation, potentially enhancing zero-shot hypothesis-generation capabilities. These findings strongly support the potential of LLMs as catalysts for new scientific discoveries and guide further exploration. Comment: Instruction Workshop @ NeurIPS 202
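The date-based split used to control visibility can be sketched as follows; the record fields and the two cutoff dates are hypothetical stand-ins for the paper's actual dataset schema:

```python
from datetime import date

def split_by_date(records, train_end, cutoff):
    """Split records into train / seen-test / unseen-test by publication date.

    Papers published after `cutoff` (the model's pretraining cutoff) form
    the unseen test set, so their hypotheses cannot have been memorized.
    """
    train, seen, unseen = [], [], []
    for rec in records:
        if rec["published"] < train_end:
            train.append(rec)
        elif rec["published"] < cutoff:
            seen.append(rec)    # inside the LLM's pretraining window
        else:
            unseen.append(rec)  # published after the cutoff
    return train, seen, unseen
```

Splitting on publication date rather than randomly is what lets the evaluation distinguish genuine hypothesis generation from recall of the training corpus.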

    Beyond the Obvious: Evaluating the Reasoning Ability in Real-life Scenarios of Language Models on the Life Scapes Reasoning Benchmark (LSR-Benchmark)

    This paper introduces the Life Scapes Reasoning Benchmark (LSR-Benchmark), a novel dataset targeting real-life scenario reasoning, which aims to close the gap in artificial neural networks' ability to reason in everyday contexts. In contrast to domain-knowledge reasoning datasets, LSR-Benchmark comprises free-text questions with rich information on real-life scenarios, human behaviors, and character roles. The dataset consists of 2,162 questions collected from open online sources and manually annotated to improve quality. Experiments are conducted with state-of-the-art language models, such as gpt3.5-turbo and instruction fine-tuned llama models, to test performance on LSR-Benchmark. The results reveal that humans outperform these models significantly, indicating a persistent challenge for machine learning models in comprehending daily human life.

    The combined therapeutic effects of ¹³¹iodine-labeled multifunctional copper sulfide-loaded microspheres in treating breast cancer

    Compared to conventional cancer treatment, combination therapy based on well-designed nanoscale platforms may offer an opportunity to eliminate tumors and reduce recurrence and metastasis. In this study, we prepared multifunctional microspheres loaded with 131I-labeled hollow copper sulfide nanoparticles and paclitaxel (131I-HCuSNPs-MS-PTX) for imaging and therapy of W256/B breast tumors in rats. 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography/computed tomography (PET/CT) imaging detected that expansion of the tumor volume was delayed (P<0.05) following intra-tumoral (i.t.) injection of 131I-HCuSNPs-MS-PTX plus near-infrared (NIR) irradiation. Immunohistochemical analysis further confirmed the anti-tumor effect. Single photon emission computed tomography (SPECT)/photoacoustic imaging mediated by 131I-HCuSNPs-MS-PTX demonstrated that the microspheres were distributed mainly in the tumors, with relatively low distribution in other organs. Our results revealed that 131I-HCuSNPs-MS-PTX offers combined photothermal, chemo-, and radio-therapies, eliminating tumors at a relatively low dose while allowing non-invasive SPECT/CT and photoacoustic imaging of the distribution of the injected agents. The copper sulfide-loaded microspheres, 131I-HCuSNPs-MS-PTX, can serve as a versatile theranostic agent in an orthotopic breast cancer model.

    Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

    New Natural Language Processing (NLP) benchmarks are urgently needed to keep pace with the rapid development of large language models (LLMs). We present Xiezhi, the most comprehensive evaluation suite designed to assess holistic domain knowledge. Xiezhi comprises 220,000 multiple-choice questions across 516 diverse disciplines spanning 13 subjects, accompanied by Xiezhi-Specialty and Xiezhi-Interdiscipline, each with 15k questions. We evaluate 47 cutting-edge LLMs on Xiezhi. Results indicate that LLMs exceed the average performance of humans in science, engineering, agronomy, medicine, and art, but fall short in economics, jurisprudence, pedagogy, literature, history, and management. We anticipate Xiezhi will help analyze important strengths and shortcomings of LLMs; the benchmark is released at https://github.com/MikeGu721/XiezhiBenchmark . Comment: Under review at NeurIPS 202
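Scoring a model on a multiple-choice suite like Xiezhi reduces to comparing predicted options against gold answers; a minimal sketch, where the question field names are assumptions rather than the benchmark's released format:

```python
def accuracy(predictions, questions):
    """Fraction of questions whose predicted option matches the gold answer."""
    if not questions:
        raise ValueError("no questions to score")
    correct = sum(
        1 for pred, q in zip(predictions, questions) if pred == q["answer"]
    )
    return correct / len(questions)
```

Per-discipline scores, as reported for the 516 disciplines, follow by filtering the question list before calling the same function.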

    Landau level splitting in Cd3As2 under high magnetic fields

    Three-dimensional topological Dirac semimetals (TDSs) are a new class of Dirac materials that exhibit linear energy dispersion in the bulk and can be viewed as three-dimensional analogues of graphene. It has been proposed that TDSs can be driven to other exotic phases, such as Weyl semimetals, topological insulators, and topological superconductors, by breaking certain symmetries. Here we report the first transport experiment on Landau level splitting in TDS Cd3As2 single crystals under high magnetic fields, suggesting the removal of spin degeneracy by breaking time-reversal symmetry. The detected Berry phase develops an evident angular dependence and undergoes a crossover from a nontrivial to a trivial state under high magnetic fields, a strong hint of competition between the orbit-coupled field strength and the field-generated mass term. Our results unveil the important role of symmetry breaking in TDSs and further demonstrate a feasible path to generating a Weyl semimetal phase by breaking time-reversal symmetry. Comment: 31 pages
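For context, the Berry phase in such transport experiments is conventionally extracted from the Landau fan diagram via the Lifshitz–Onsager quantization rule (this is the standard analysis, not detailed in the abstract):

```latex
% Lifshitz--Onsager quantization: the Landau index n versus inverse field,
% with F the oscillation frequency and \varphi_B the Berry phase accumulated
% on the extremal cyclotron orbit
\[
  n \;=\; \frac{F}{B} \;-\; \frac{1}{2} \;+\; \frac{\varphi_B}{2\pi}
\]
% A linear fit of n against 1/B therefore has intercept 0 for a nontrivial
% Dirac band (\varphi_B = \pi) and intercept -1/2 for a trivial parabolic
% band (\varphi_B = 0), which is how the nontrivial-to-trivial crossover
% above manifests in the data.
```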

    Two-Stage Liver and Tumor Segmentation Algorithm Based on Convolutional Neural Network

    The liver is an essential metabolic organ of the human body, and malignant liver tumors seriously affect and threaten human life. Segmentation of the liver and liver tumors is an essential branch of computer-aided diagnosis. This paper proposes a two-stage liver and tumor segmentation algorithm based on a convolutional neural network (CNN), with the two stages being liver localization and tumor segmentation. In the liver localization stage, the network segments the liver region, adopting an encoding–decoding structure with long-distance feature fusion and exploiting the spatial information of shallow features to improve liver identification. In the tumor segmentation stage, building on the liver segmentation result of the first stage, a CNN model is designed to accurately identify liver tumors using the 2D image features and 3D spatial features of the CT image slices. At the same time, an attention mechanism is used to improve the segmentation of small liver tumors. The proposed algorithm was tested on the public Liver Tumor Segmentation Challenge (LiTS) dataset, achieving a Dice coefficient of 0.967 for liver segmentation and 0.725 for tumor segmentation. The proposed algorithm can accurately segment the liver and liver tumors in CT images and, compared with other state-of-the-art algorithms, achieves the highest Dice coefficient.
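The Dice coefficient used to score these segmentations is, in set form, 2|A∩B| / (|A|+|B|); a minimal sketch on masks represented as sets of voxel indices (a simplification of the dense mask arrays used in practice):

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks given as sets of voxel
    indices: 2|A ∩ B| / (|A| + |B|). Defined as 1.0 when both are empty."""
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))
```

Because the denominator counts both masks' sizes, Dice penalizes over- and under-segmentation symmetrically, which is why tumor scores (0.725) sit well below liver scores (0.967): small structures lose proportionally more to each misclassified voxel.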