69 research outputs found

    Joint appearance and motion model for multi-class multi-object tracking

    Model-free tracking is a widely accepted approach to tracking an arbitrary object in a video using a single frame annotation, with no further prior knowledge about the object of interest. Extending this problem to multiple objects is challenging because: a) the tracker is not aware of the objects' type while trying to distinguish them from the background (detection task), and b) the tracker needs to distinguish one object from other potentially similar objects (data association task) to generate stable trajectories. To track multiple arbitrary objects, most existing model-free tracking approaches track each target individually and update each appearance model independently. As a result, they often perform poorly because of confusion between the appearance of similar objects, sudden appearance changes, and occlusion. To tackle this problem, we propose to use both appearance and motion models, and to learn them jointly using graphical models and deep neural network features. We introduce an indicator variable to predict sudden appearance change and/or occlusion. When either happens, our model does not update the appearance model, which avoids mistakenly using the background and/or an incorrect object to update the appearance of the object of interest, and relies on our motion model to track. Moreover, we consider the correlation among all targets, and seek the joint optimal locations for all targets simultaneously as a graphical model inference problem. We learn the joint parameters for both the appearance model and the motion model in an online fashion under the framework of LaRank. Experimental results show that our method outperforms the state-of-the-art. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201
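
The gating idea in the abstract above (freeze the appearance model and fall back on the motion model when the indicator variable fires) can be illustrated with a toy sketch. This is not the thesis's method: the constant-velocity motion model, the exponential moving average for appearance, and the names `Track`, `step`, `changed` and `lr` are all assumptions made for illustration. The joint learning under LaRank and the graphical-model inference over all targets are not modelled here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Track:
    position: Tuple[float, float]
    velocity: Tuple[float, float]
    appearance: List[float]  # hypothetical appearance feature vector


def step(track: Track,
         observed_pos: Tuple[float, float],
         observed_feat: List[float],
         changed: bool,
         lr: float = 0.1) -> Track:
    """Advance a track by one frame.

    `changed` plays the role of the indicator variable: True means a sudden
    appearance change or occlusion was predicted for this frame.
    """
    px, py = track.position
    vx, vy = track.velocity
    if changed:
        # Assumed constant-velocity motion model; the appearance model is
        # left untouched so background or a wrong object cannot corrupt it.
        return Track((px + vx, py + vy), track.velocity, track.appearance)
    # Otherwise accept the observation and blend it into the appearance
    # model (assumed exponential moving average with rate `lr`).
    new_velocity = (observed_pos[0] - px, observed_pos[1] - py)
    new_appearance = [(1 - lr) * a + lr * f
                      for a, f in zip(track.appearance, observed_feat)]
    return Track(observed_pos, new_velocity, new_appearance)
```

With `changed=True` the returned track keeps its old appearance vector and moves by its stored velocity; with `changed=False` both position and appearance follow the observation.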

    Datasets for Large Language Models: A Comprehensive Survey

    This paper embarks on an exploration into the Large Language Model (LLM) datasets, which play a crucial role in the remarkable advancements of LLMs. The datasets serve as the foundational infrastructure analogous to a root system that sustains and nurtures the development of LLMs. Consequently, examination of these datasets emerges as a critical topic in research. In order to address the current lack of a comprehensive overview and thorough analysis of LLM datasets, and to gain insights into their current status and future trends, this survey consolidates and categorizes the fundamental aspects of LLM datasets from five perspectives: (1) Pre-training Corpora; (2) Instruction Fine-tuning Datasets; (3) Preference Datasets; (4) Evaluation Datasets; (5) Traditional Natural Language Processing (NLP) Datasets. The survey sheds light on the prevailing challenges and points out potential avenues for future investigation. Additionally, a comprehensive review of the existing available dataset resources is provided, including statistics from 444 datasets, covering 8 language categories and spanning 32 domains. Information from 20 dimensions is incorporated into the dataset statistics. The total data size surveyed surpasses 774.5 TB for pre-training corpora and 700M instances for other datasets. We aim to present the entire landscape of LLM text datasets, serving as a comprehensive reference for researchers in this field and contributing to future studies. Related resources are available at: https://github.com/lmmlzn/Awesome-LLMs-Datasets. Comment: 181 pages, 21 figure

    UPOCR: Towards Unified Pixel-Level OCR Interface

    In recent years, the optical character recognition (OCR) field has been proliferating with plentiful cutting-edge approaches for a wide spectrum of tasks. However, these approaches are task-specifically designed with divergent paradigms, architectures, and training strategies, which significantly increases the complexity of research and maintenance and hinders fast deployment in applications. To this end, we propose UPOCR, a simple-yet-effective generalist model for Unified Pixel-level OCR interface. Specifically, UPOCR unifies the paradigm of diverse OCR tasks as image-to-image transformation and the architecture as a vision Transformer (ViT)-based encoder-decoder. Learnable task prompts are introduced to push the general feature representations extracted by the encoder toward task-specific spaces, endowing the decoder with task awareness. Moreover, the model training is uniformly aimed at minimizing the discrepancy between the generated and ground-truth images regardless of the inhomogeneity among tasks. Experiments are conducted on three pixel-level OCR tasks including text removal, text segmentation, and tampered text detection. Without bells and whistles, the experimental results show that the proposed method can simultaneously achieve state-of-the-art performance on three tasks with a unified single model, which provides valuable strategies and insights for future research on generalist OCR models. Code will be publicly available.

    Exploring OCR Capabilities of GPT-4V(ision) : A Quantitative and In-depth Evaluation

    This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM). We assess the model's performance across a range of OCR tasks, including scene text recognition, handwritten text recognition, handwritten mathematical expression recognition, table structure recognition, and information extraction from visually rich documents. The evaluation reveals that GPT-4V performs well in recognizing and understanding Latin contents, but struggles with multilingual scenarios and complex tasks. Specifically, it showed limitations when dealing with non-Latin languages and complex tasks such as handwritten mathematical expression recognition, table structure recognition, and end-to-end semantic entity recognition and pair extraction from document images. Based on these observations, we affirm the necessity and continued research value of specialized OCR models. In general, despite its versatility in handling diverse OCR tasks, GPT-4V does not outperform existing state-of-the-art OCR models. How to fully utilize pre-trained general-purpose LMMs such as GPT-4V for OCR downstream tasks remains an open problem. The study offers a critical reference for future research in OCR with LMMs. Evaluation pipeline and results are available at https://github.com/SCUT-DLVCLab/GPT-4V_OCR

    Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

    Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decouple this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignores the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction that takes a single document image as input and outputs the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues to contribute to the optimization of text spotting. Moreover, regarding the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, including a total of 15,771 Chinese handwritten or printed text instances. Compared with the state-of-the-art methods, our VIES shows significantly superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario. Comment: 8 pages, 5 figures, to be published in AAAI 202

    Prognostic Significance of Serum Cysteine-Rich Protein 61 in Patients with Acute Heart Failure

    Background/Aims: Cyr61-cysteine-rich protein 61 (CCN1/CYR61) is a multifunctional matricellular protein involved in the regulation of fibrogenesis. Animal experiments have demonstrated that CCN1 can inhibit cardiac fibrosis in cardiac hypertrophy. However, no study has been conducted to assess the relation between serum CCN1 and prognosis of acute heart failure (AHF). Methods: We measured the serum CCN1 levels of 183 patients with AHF, and the patients were followed up for 6 months. The associations between CCN1 levels and some clinical covariates, especially left ventricular ejection fraction (LVEF), estimated glomerular filtration rate (eGFR), atrial fibrillation and age, were estimated. The endpoint was all-cause mortality. Kaplan-Meier curve analysis and multivariable Cox proportional hazards analysis were employed to evaluate the prognostic ability of CCN1. We used calibration, discrimination and reclassification to assess the mortality risk prediction of adding CCN1. Results: Serum CCN1 concentrations in AHF patients were significantly increased compared with those in individuals without AHF (237 pg/ml vs. 124.8 pg/ml, p < 0.001). CCN1 level was associated with the level of NT-proBNP (r = 0.349, p < 0.001) and was not affected by LVEF, eGFR, age or atrial fibrillation in AHF patients. Importantly, Kaplan-Meier curve analysis illustrated that the AHF patients with serum CCN1 level > 260 pg/ml had a lower survival rate (p < 0.001). Multivariate Cox hazard analysis suggested that CCN1 functions as an independent predictor of mortality for AHF patients (LgCCN1, hazard ratio 5.825, 95% confidence interval: 1.828-18.566, p = 0.003). In addition, the inclusion of CCN1 in the model with NT-proBNP significantly improved the C-statistic for predicting death (0.758, p < 0.001). The integrated discrimination index was 0.019 (p < 0.001), and the net reclassification index increased significantly after addition of CCN1 (23.9%, p = 0.0179). Conclusions: CCN1 is strongly predictive of 6-month mortality in patients with AHF, suggesting serum CCN1 as a promising candidate prognostic biomarker for AHF patients.
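
For readers unfamiliar with the Kaplan-Meier analysis mentioned in this abstract, the product-limit estimator can be sketched in a few lines. This is a generic textbook illustration, not the study's analysis code. The function name `kaplan_meier` and the toy follow-up data in the usage note are assumptions and are unrelated to the 183 patients reported above.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject.
    events:    1 if the endpoint (e.g. death) was observed, 0 if censored.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve
```

Calling `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` on this made-up data gives survival steps of roughly 0.8, 0.6 and 0.3 at times 1, 2 and 3. Comparing two such curves, for example for groups above and below a biomarker cutoff, would normally be done with a log-rank test, which this sketch does not include.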

    Development and Efficacy Evaluation of an SP01-adjuvanted Inactivated Escherichia Coli Mutant Vaccine Against Bovine Coliform Mastitis

    Escherichia coli (E. coli) is one of the most common pathogens causing clinical mastitis in cattle, but no vaccine is available to prevent this disease in China. Therefore, development of an E. coli vaccine against bovine clinical mastitis is urgently needed. The candidate vaccine (Ch-O111-1) and challenge (LZ06) strains were screened from milk samples of cows with clinical mastitis. To extend the cross-protection of the Ch-O111-1 strain, we deleted the galE gene fragment of the Ch-O111-1 strain through homologous recombination between the Ch-O111-1 strain and pCVD442/ΔgalE plasmid, which was identified through conventional methods, including PCR, SDS-PAGE and sequencing. The Ch-O111-1/ΔgalE (Z9) strain was characterized by extensive cross-reactivity and attenuated virulence. We prepared inactivated Z9 vaccines with different adjuvants. Immunization of inactivated Z9 antigen induced adjuvant-, dosage- and inoculation time-dependent antibody titers in cows and mice. Furthermore, immunization with SP01-adjuvanted inactivated Z9 vaccine protected cows against severe clinical mastitis caused by LZ06 and protected mice against death due to LZ06. An SP01-adjuvanted inactivated Z9 vaccine was successfully developed and found to protect cows against severe mastitis caused by Escherichia coli.

    Construction and Evaluation of the Brucella Double Gene Knock-out Vaccine Strain MB6 Δbp26ΔwboA (RM6)

    Brucellosis is a serious zoonotic infection worldwide. To date, vaccination is the most effective measure against brucellosis. This study was aimed at obtaining a vaccine strain that has high protective efficacy and low toxicity, and allows vaccination to be differentiated from infection. Using homologous recombination, we constructed a double gene-deletion Brucella strain MB6 Δbp26ΔwboA (RM6) and evaluated its characteristics, safety and efficacy. The RM6 strain had good proliferative ability and stable biological characteristics in vivo and in vitro. Moreover, it had a favorable safety profile and elicited specific immune responses in mice and sheep. The RM6 strain may have substantial practical application value.

    Preparation of Equine Immunoglobulin F(ab′)2 against Smallpox and Evaluation of its Immunoprotective Effect

    Smallpox, a severe infectious disease caused by the smallpox virus, has a death rate as high as 30% within 15-20 days after infection. Therefore, development of an anti-smallpox product as a strategic reserve is urgently needed. We prepared and tested pepsin-digested F(ab′)2 fragments of serum IgG from horses. Transmission electron microscopy indicated that the purified virus showed morphology consistent with VVTT. The titer was above 1.0 × 10^7 PFU/mL. The purity of the antigen exceeded 90%, according to HPLC. After purification and cleavage, the yield of the purified product F(ab′)2 was approximately 1.3%, its purity exceeded 90%, and the neutralizing antibody titer exceeded 1:3200. F(ab′)2 fragments had good preventive and therapeutic effects in mice at antibody doses of 5.2 mg/mL and 2.6 mg/mL. The viral loads of the drug-treated mice were suppressed to varying degrees, and the higher dose groups (5.2 and 2.6 mg/mL) showed a 2-3 fold lower viral load than that in the control group. A process for producing equine immunoglobulin F(ab′)2 against VVTT was established. The prepared horse anti-smallpox immunoglobulin product had good neutralizing antibody effects on VVTT. The highly purified preparation may serve as a potential candidate for smallpox treatment.