    What Large Language Models Bring to Text-rich VQA?

    Text-rich VQA, namely Visual Question Answering based on recognizing text in images, is a cross-modal task that requires both image comprehension and text recognition. In this work, we investigate the advantages and bottlenecks of LLM-based approaches to this problem. To do so, we separate the vision and language modules: external OCR models recognize the text in the image, and Large Language Models (LLMs) answer the question given that text. The whole framework is training-free, benefiting from the in-context ability of LLMs. This pipeline achieves superior performance compared to the majority of existing Multimodal Large Language Models (MLLMs) on four text-rich VQA datasets. Based on the ablation study, we also find that the LLM brings stronger comprehension ability and may introduce helpful knowledge for the VQA problem. The bottleneck for LLMs in addressing text-rich VQA problems may lie primarily in the visual part. We also combine the OCR module with MLLMs and find that this combination works as well. Notably, not all MLLMs can comprehend the OCR information, which provides insights into how to train an MLLM that preserves the abilities of the LLM.
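The decoupled pipeline described in this abstract (OCR first, then an LLM prompted with the recognized text) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `OcrToken` fields, the line-grouping tolerance, and the prompt wording are hypothetical and not taken from the paper, and the actual LLM call is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass
class OcrToken:
    """One recognized word with the top-left corner of its box (hypothetical format)."""
    text: str
    x: float
    y: float


def ocr_tokens_to_text(tokens: Iterable[OcrToken], line_tol: float = 10.0) -> str:
    """Serialize OCR tokens in reading order: group by vertical position, sort by x."""
    ordered = sorted(tokens, key=lambda t: (t.y, t.x))
    lines: List[List[OcrToken]] = []
    line_y = None
    for tok in ordered:
        # Start a new line when the token is too far below the current line's first token.
        if line_y is None or abs(tok.y - line_y) > line_tol:
            lines.append([])
            line_y = tok.y
        lines[-1].append(tok)
    return "\n".join(
        " ".join(t.text for t in sorted(line, key=lambda t: t.x)) for line in lines
    )


def build_prompt(
    ocr_text: str,
    question: str,
    examples: Sequence[Tuple[str, str, str]] = (),
) -> str:
    """Build a training-free, in-context prompt: solved examples, then the query."""
    parts = [
        f"OCR text:\n{ex_ocr}\nQuestion: {ex_q}\nAnswer: {ex_a}"
        for ex_ocr, ex_q, ex_a in examples
    ]
    # The final block is left open so the LLM completes the answer.
    parts.append(f"OCR text:\n{ocr_text}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(parts)
```

The resulting string would be passed to whichever LLM is used; the image itself never reaches the language model in this sketch, which is why errors in the visual (OCR) stage cannot be recovered downstream.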

    Effect of childhood maltreatment on cognitive function and its relationship with personality development and social coping style in major depression disorder patients: A latent class model and network analysis

    Study objectives: The study aimed to (1) analyze the interrelationships among different types of childhood adversity, diverse personality dimensions, and individual coping style in an integrated way among major depressive disorder (MDD) patients and healthy participants using a network approach; and (2) explore the latent classes of child maltreatment (CM) and their relationship with cognitive function. Methods: Data were collected from the Objective Diagnostic Markers and Personalized Intervention in MDD Patients (ODMPIM) study, including 1,629 Chinese participants. We used the Childhood Trauma Questionnaire to assess CM, the Simplified Coping Style Questionnaire to measure individual coping style, the Eysenck Personality Questionnaire Revised-Short Form to assess personality characteristics, and a series of neurocognitive tests, comprising seven tests with 18 subtests, for cognitive assessment. We used the “Network Module” in Jeffreys’s Amazing Statistics Program (JASP) and an R package for network analysis. A latent class analysis was performed with SAS 9.4. Results: Child maltreatment was more common in MDD patients than in healthy controls, except for emotional abuse. Network analysis showed that emotional abuse, emotional neglect, physical abuse, and physical neglect formed quadrangle connections. Personality dimensions were associated with physical neglect and emotional abuse. All types of CM (excluding sexual abuse) showed an association with coping style. Emotional neglect showed the highest centrality measures. Physical neglect had a high level of closeness. In terms of strength, emotional and physical neglect showed the highest levels. The network structure differed between groups (M = 0.28, P = 0.04). Latent class analysis (LCA) revealed that three classes provided the best fit statistics. The neglect and abuse classes tended to perform more poorly on the five cognitive domains. Conclusion: This study provided insights into multiple types of CM. Neglect played an important role in the different routes linking CM with personality traits and social coping style. However, neglect has often been ignored in previous studies and should receive more public attention.

    MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images

    Assessment of myocardial viability is essential in the diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on the myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which was first proposed in the MyoPS challenge, held in conjunction with MICCAI 2020. The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation. In this article, we provide details of the challenge, survey the works from fifteen participants, and interpret their methods according to five aspects, i.e., preprocessing, data augmentation, learning strategy, model architecture, and post-processing. In addition, we analyze the results with respect to different factors, in order to examine the key obstacles and explore the potential of solutions, as well as to provide a benchmark for future research. We conclude that while promising results have been reported, the research is still at an early stage, and more in-depth exploration is needed before successful application in the clinic. Note that the MyoPS data and evaluation tool continue to be publicly available upon registration via the challenge homepage (www.sdspeople.fudan.edu.cn/zhuangxiahai/0/myops20/).

    Collagen/Polyethylene Oxide Nanofibrous Membranes with Improved Hemostasis and Cytocompatibility for Wound Dressing

    As a promising agent for biomedical applications, collagen has been used in nanofiber form to architecturally mimic its fibrillar structure in the extracellular matrix (ECM); however, it has to be modified by techniques such as crosslinking to overcome its limitations in structural stability, along with potential toxicity. Here, we prepared collagen/polyethylene oxide (PEO) nanofibrous membranes with varying crosslinking degrees and systematically studied their properties, including water stability, mechanical properties, blood clotting capacity, and cytocompatibility. By investigating the relationship between crosslinking degree and these properties, nanofibrous membranes with improved morphology retention, blood clotting capacity, and cytocompatibility were achieved. Circular dichroism measurements demonstrated that a triple helical fraction of around 60.5% was retained. Moreover, the electrospun collagen/PEO at crosslinking degrees above 60.6% could maintain more than 72% of its original weight, and its nanofibrous morphology under physiological conditions could be well preserved for up to 7 days. Furthermore, the crosslinked collagen/PEO membrane could provide a more friendly and suitable environment to promote cell proliferation, and about 70% of the clot could be formed in 5 min. With its superior performance in water stability, hemostasis, and cytocompatibility, we anticipate that this nanofibrous membrane has great potential for wound dressing.

    Experimental Study on the Time-Dependent Characteristics of MLPS Transparent Soil Strength

    The time-dependent strength characteristics of transparent soil composed of magnesium lithium phyllosilicate (MLPS) are important for its application as a thixotropic clay surrogate. The gas injection method was employed to obtain the strength, represented as cracking pressure, which was then correlated with variables including rest time, disturbance time, and recovery time. Three concentrations (3%, 4%, and 5%) were tested. The results show that the strength increased with the rest time, recovery time, and concentration, and decreased with the disturbance time. The calculated limit strengths for the 3%, 4%, and 5% transparent soils were 3.831 kPa, 8.849 kPa, and 12.048 kPa, respectively. Experimental data also showed that the residual strength of higher-concentration transparent soil was greater than that of lower-concentration soil. The elastic property generated partial strength recovery immediately after disturbance, while the viscous property resulted in a slow recovery stage similar to the rest stage. The strength recovery rate was also sensitive to concentration. Furthermore, the strength at 3%, 4%, and 5% concentrations could regain limit values after sufficient recovery, which were calculated as 4.303 kPa, 8.255 kPa, and 14.884 kPa, respectively.

    An Improved Single-Epoch Attitude Determination Method for Low-Cost Single-Frequency GNSS Receivers

    GNSS attitude determination has been widely used in various navigation and positioning applications, owing to its low cost and high efficiency. The navigation, positioning, and attitude determination modules in the consumer market mostly use low-cost receivers and face many problems, such as large multipath effects, frequent cycle slips, and even loss of lock. Ambiguity fixing is the key to GNSS attitude determination and faces more challenges in complex urban environments. Building on the CLAMBDA algorithm, this paper proposes a CLAMBDA-search algorithm based on the multi-baseline GNSS model. This algorithm improves the existing CLAMBDA method through a fixed geometry constraint among baselines in the vehicle coordinate system. A fixed single-baseline solution removes two degrees of freedom of the vehicle rigid body, and a global minimization search of the ambiguity objective function over the remaining degree of freedom is conducted to calculate the baseline vector and its Euler angles. In addition, to compensate for the shortcomings of short-baseline ambiguity resolution in complex environments, this paper proposes different validation strategies. Using three low-cost receivers (ublox M8T) and patch antennas, static and dynamic on-board experiments with different baseline length set-ups were carried out in different environments. Both experiments show that the proposed method greatly improves the ambiguity fixing performance and the Euler angle calculation accuracy, with an acceptable computational burden. It is a practical vehicle-mounted attitude determination algorithm.
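To make the last step of such a pipeline concrete, the sketch below converts an already-fixed baseline vector into two Euler angles. It is not the paper's CLAMBDA-search algorithm: it assumes the baseline is expressed in a local east-north-up (ENU) frame and is aligned with the vehicle's longitudinal axis, and the function name is hypothetical. Under those assumptions a single baseline yields heading and pitch only; roll is the remaining degree of freedom and needs a second baseline.

```python
import math
from typing import Tuple


def baseline_to_heading_pitch(east: float, north: float, up: float) -> Tuple[float, float]:
    """Return (heading, pitch) in degrees for a baseline given in a local ENU frame.

    Heading is measured clockwise from north in [0, 360); pitch is positive upward.
    Assumes the baseline points along the vehicle's longitudinal axis.
    """
    horizontal = math.hypot(east, north)
    if horizontal == 0.0 and up == 0.0:
        raise ValueError("baseline has zero length")
    # atan2(east, north) gives the clockwise-from-north azimuth.
    heading = math.degrees(math.atan2(east, north)) % 360.0
    pitch = math.degrees(math.atan2(up, horizontal))
    return heading, pitch
```

For example, a baseline pointing due west in the horizontal plane, `baseline_to_heading_pitch(-1.0, 0.0, 0.0)`, gives a heading of 270 degrees and zero pitch.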

    Effect of Deep Cryogenic Treatment on Microstructure and Properties of Sintered Fe–Co–Cu-Based Diamond Composites

    Metal-matrix-impregnated diamond composites are used for fabricating many kinds of diamond tools. In an effort to satisfy increasing engineering requirements, researchers have paid more attention to finding novel methods of enhancing the performance of impregnated diamond composites. In this study, deep cryogenic treatment was applied to Fe–Co–Cu-based diamond composites to improve their performance. The relative density, hardness, bending strength, and grinding ratio of matrix and diamond composite samples were measured in a series of tests. The fracture morphologies of all samples after the bending strength test were observed and analyzed by scanning electron microscopy. The results showed that the hardness and bending strength of the matrix increased slightly after deep cryogenic treatment. The grinding ratio of the impregnated diamond composites increased by 32.9% as a result of deep cryogenic treatment. The strengthening mechanism was analyzed in detail. The Fe–Co–Cu-based impregnated composites subjected to deep cryogenic treatment for 1 h exhibited the best overall performance.

    NL3DLogTNN: An Effective Hyperspectral Image Denoising Method Combined Non-Local Self-Similarity and Low-Fibered-Rank Regularization

    Hyperspectral image denoising is an important research topic in the field of remote sensing image processing. Recently, methods based on non-local low-rank tensor approximation have gained widespread attention owing to their ability to fully exploit non-local self-similarity. However, existing non-local low-rank tensor approximation methods fall short in capturing the correlations between the various modes of hyperspectral images, and thus fail to achieve the optimal approximation. To solve this issue, a novel non-local hyperspectral image denoising model, NL3DLogTNN, based on the three-directional log-based tensor nuclear norm (3DLogTNN), is proposed. The correlation between the various modes is captured by performing TNN decomposition in three directions on the extracted non-local comparable blocks, which better captures the global low-rank property of the image. To effectively solve the proposed NL3DLogTNN model, we develop an approximate methodology based on the alternating direction method of multipliers (ADMM) and offer a thorough numerical convergence proof. Extensive experiments conducted on hyperspectral image datasets with simulated and real-world noise demonstrate that the proposed NL3DLogTNN model outperforms state-of-the-art methods in terms of quantitative and visual performance evaluation.
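The abstract does not state the regularizer itself. As a hedged illustration only, one form of the three-directional log-based tensor nuclear norm used in the low-fibered-rank literature is written below; the weights $\alpha_k$, the constant $\varepsilon$, and the slice notation are assumptions and may differ from the definitions in the paper.

```latex
% Assumed notation: \mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3};
% \bar{\mathbf{X}}_k^{(i)} is the i-th mode-k slice after a DFT along mode k;
% \sigma_j(\cdot) is the j-th singular value; \varepsilon > 0 is a small constant.
\[
\mathrm{3DLogTNN}(\mathcal{X}, \varepsilon)
  = \sum_{k=1}^{3} \alpha_k \, \mathrm{LogTNN}_k(\mathcal{X}, \varepsilon),
\qquad
\mathrm{LogTNN}_k(\mathcal{X}, \varepsilon)
  = \sum_{i=1}^{n_k} \sum_{j} \log\!\left( \sigma_j\!\left( \bar{\mathbf{X}}_k^{(i)} \right) + \varepsilon \right),
\]
\[
\alpha_k \ge 0, \qquad \sum_{k=1}^{3} \alpha_k = 1 .
\]
```

Replacing the plain sum of singular values with a sum of logarithms penalizes large singular values less heavily, which is the usual motivation for log-based surrogates of rank.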