
    Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

    In this work, we investigate the potential of a large language model (LLM) to directly comprehend visual signals without fine-tuning on multi-modal datasets. The foundational concept of our method is to view an image as a linguistic entity and translate it into a set of discrete words drawn from the LLM's vocabulary. To achieve this, we present the Vision-to-Language Tokenizer, abbreviated as V2T Tokenizer, which transforms an image into a "foreign language" with the combined aid of an encoder-decoder, the LLM vocabulary, and a CLIP model. With this image encoding, the LLM gains the ability not only for visual comprehension but also for image denoising and restoration in an auto-regressive fashion, crucially without any fine-tuning. We undertake rigorous experiments to validate our method, encompassing understanding tasks such as image recognition, image captioning, and visual question answering, as well as image denoising tasks such as inpainting, outpainting, deblurring, and shift restoration. Code and models are available at https://github.com/zh460045050/V2L-Tokenizer. Accepted by CVPR 2024.
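    As a rough illustration of the core idea (not the authors' released code), the sketch below quantizes hypothetical image-patch features against a frozen LLM's token-embedding table by nearest-neighbour lookup, so that each patch becomes a discrete vocabulary word; the function names and tensor shapes are assumptions for the example only.

```python
# Hedged sketch (not the authors' implementation): map an image to "words" by
# nearest-neighbour quantisation against a frozen LLM's token embeddings.
import torch
import torch.nn.functional as F

def image_to_llm_tokens(patch_features, llm_token_embeddings):
    """
    patch_features:       (N, D) visual features for N image patches (hypothetical encoder output)
    llm_token_embeddings: (V, D) frozen embedding table of the LLM vocabulary
    returns:              (N,) indices of the nearest vocabulary tokens
    """
    patches = F.normalize(patch_features, dim=-1)
    vocab = F.normalize(llm_token_embeddings, dim=-1)
    similarity = patches @ vocab.T          # cosine similarity to every vocabulary word
    return similarity.argmax(dim=-1)        # each patch becomes a discrete "word"

# Usage with dummy tensors: 256 patches, 4096-dim features, 32k-word vocabulary.
tokens = image_to_llm_tokens(torch.randn(256, 4096), torch.randn(32000, 4096))
print(tokens.shape)  # torch.Size([256])
```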

    Scribble Hides Class: Promoting Scribble-Based Weakly-Supervised Semantic Segmentation with Its Class Label

    Scribble-based weakly-supervised semantic segmentation is gaining traction because its sparse scribble supervision reduces annotation costs compared to fully annotated alternatives. Existing methods primarily generate pseudo-labels for supervision by diffusing labeled pixels to unlabeled ones using local cues. However, this diffusion process fails to exploit global semantics and class-specific cues, which are important for semantic segmentation. In this study, we propose a class-driven scribble promotion network, which uses both scribble annotations and pseudo-labels informed by image-level classes and global semantics for supervision. Because directly adopting pseudo-labels might misguide the segmentation model, we design a localization rectification module to correct foreground representations in the feature space. To further combine the advantages of both forms of supervision, we also introduce a distance entropy loss for uncertainty reduction, which adapts per-pixel confidence weights according to the reliable region determined by the scribbles and the pseudo-label boundary. In experiments on the ScribbleSup dataset with scribble annotations of varying quality, our approach outperforms all previous methods, demonstrating its superiority and robustness. The code is available at https://github.com/Zxl19990529/Class-driven-Scribble-Promotion-Network.
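    The distance entropy loss itself is specific to the paper; the sketch below shows only a generic confidence-weighted cross-entropy of the kind such a scheme could build on, with per-pixel weights assumed to be lower near unreliable label boundaries. Names and shapes are illustrative.

```python
# Hedged sketch: one plausible way to down-weight unreliable pixels when combining
# scribble and pseudo-label supervision; NOT the paper's exact distance entropy loss.
import torch
import torch.nn.functional as F

def weighted_ce(logits, pseudo_labels, confidence, ignore_index=255):
    """
    logits:        (B, C, H, W) segmentation predictions
    pseudo_labels: (B, H, W) long tensor of pseudo-labels diffused from scribbles
    confidence:    (B, H, W) per-pixel weights in [0, 1], e.g. lower near label boundaries
    """
    loss = F.cross_entropy(logits, pseudo_labels, ignore_index=ignore_index, reduction="none")
    valid = (pseudo_labels != ignore_index).float()
    weights = confidence * valid
    return (weights * loss).sum() / weights.sum().clamp(min=1.0)
```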

    Retinal image synthesis from multiple-landmarks input with generative adversarial networks

    Background: Medical datasets, especially medical images, are often imbalanced due to the different incidences of various diseases. To address this problem, many methods have been proposed to synthesize medical images using generative adversarial networks (GANs) to enlarge training datasets and facilitate medical image analysis. For instance, conventional image-to-image translation techniques have been used to synthesize fundus images from their respective vessel trees. Methods: To improve the quality and detail of the synthetic images, we focus on three key aspects of the pipeline: the input mask, the GAN architecture, and the resolution of the paired images. We propose a new preprocessing pipeline named multiple-channels-multiple-landmarks (MCML), which aims to synthesize color fundus images from a combination of vessel tree, optic disc, and optic cup images. We compared single vessel-mask input against MCML mask input on two public fundus image datasets (DRIVE and DRISHTI-GS) with different Pix2pix and Cycle-GAN architectures. A new Pix2pix structure with a ResU-net generator is also designed and compared with the other models. Results and conclusion: The proposed MCML method outperforms the single vessel-based methods for each GAN architecture. Furthermore, our Pix2pix model with a ResU-net generator achieves better PSNR and SSIM than the other GANs, and high-resolution paired images further improve the performance of each GAN. Finally, a Pix2pix network with a ResU-net generator, using MCML and high-resolution paired images, is able to generate realistic fundus images, indicating that the MCML method has great potential for glaucoma computer-aided diagnosis based on fundus images.
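    A minimal sketch of the MCML-style input assembly and the PSNR/SSIM comparison used for evaluation is given below, assuming scikit-image (>= 0.19) for I/O and metrics; file paths, shapes, and function names are illustrative rather than the authors' pipeline.

```python
# Hedged sketch of the multiple-channels-multiple-landmarks (MCML) idea: stack the
# vessel tree, optic disc, and optic cup masks as separate conditioning channels.
import numpy as np
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def build_mcml_input(vessel_path, disc_path, cup_path):
    """Read three binary landmark masks and stack them into an (H, W, 3) conditioning image."""
    masks = [imread(p, as_gray=True) > 0 for p in (vessel_path, disc_path, cup_path)]
    return np.stack(masks, axis=-1).astype(np.float32)

def evaluate_synthesis(real_rgb, fake_rgb):
    """PSNR/SSIM comparison between a real fundus image and a synthesised one (uint8 arrays)."""
    psnr = peak_signal_noise_ratio(real_rgb, fake_rgb, data_range=255)
    ssim = structural_similarity(real_rgb, fake_rgb, channel_axis=-1, data_range=255)
    return psnr, ssim
```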

    Comprehensive prognostic modeling of locoregional recurrence after radiotherapy for patients with locoregionally advanced hypopharyngeal squamous cell carcinoma

    Purpose: To propose and evaluate a comprehensive modeling approach combining radiomics, dosiomics, and clinical components for more accurate prediction of locoregional recurrence risk after radiotherapy in patients with locoregionally advanced HPSCC. // Materials and methods: Clinical data of 77 HPSCC patients were retrospectively investigated; the median follow-up duration was 23.27 (4.83-81.40) months. From the planning CT and dose distribution, 1321 radiomics features and 1321 dosiomics features, respectively, were extracted from the planning gross tumor volume (PGTV) region of each patient. After a stability test, feature dimension was further reduced by principal component analysis (PCA), yielding Radiomic and Dosiomic Principal Components (RPCs and DPCs), respectively. Multiple Cox regression models were constructed using various combinations of RPCs, DPCs, and clinical variables as predictors. The Akaike information criterion (AIC) and C-index were used to evaluate the performance of the Cox regression models. // Results: PCA was performed on the 338 radiomic and 873 dosiomic features that tested as stable (ICC1 > 0.7 and ICC2 > 0.95), yielding 5 RPCs and 5 DPCs, respectively. Three comprehensive features (RPC0, P < 0.01; DPC0, P < 0.01; DPC3, P < 0.05) were significant in the individual radiomic or dosiomic Cox regression models. The model combining these features with the clinical variable (total stage IVB) provided the best risk stratification of locoregional recurrence (C-index, 0.815; 95% CI, 0.770-0.859) and the best balance between predictive accuracy and complexity (AIC, 143.65) among all investigated models, whether based on single factors or two combined components. // Conclusion: This study provides quantitative tools and additional evidence for personalized treatment selection and protocol optimization for HPSCC, a relatively rare cancer. By combining complementary information from radiomics, dosiomics, and clinical variables, the proposed comprehensive model provides more accurate prediction of locoregional recurrence risk after radiotherapy.
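    The sketch below mirrors the described workflow (PCA per feature block, then a Cox model scored by AIC and C-index) using scikit-learn and lifelines; the column names, the five-component choice, and the use of these libraries are assumptions for illustration, not the study's actual code.

```python
# Hedged sketch of the prognostic-modelling workflow: PCA on each feature block,
# then a Cox proportional-hazards model evaluated with AIC and C-index.
import pandas as pd
from sklearn.decomposition import PCA
from lifelines import CoxPHFitter

def fit_prognostic_model(radiomic_features, dosiomic_features, clinical_df,
                         time_col="months_to_recurrence", event_col="recurrence"):
    # Reduce each feature block to its first five principal components.
    rpcs = PCA(n_components=5).fit_transform(radiomic_features)
    dpcs = PCA(n_components=5).fit_transform(dosiomic_features)

    df = clinical_df.copy()  # numeric clinical covariates plus time/event columns
    df[[f"RPC{i}" for i in range(5)]] = rpcs
    df[[f"DPC{i}" for i in range(5)]] = dpcs

    cph = CoxPHFitter()
    cph.fit(df, duration_col=time_col, event_col=event_col)
    # Lower AIC and higher C-index indicate a better accuracy/complexity trade-off.
    return cph, cph.AIC_partial_, cph.concordance_index_
```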

    Highly Efficient Production of Soluble Proteins from Insoluble Inclusion Bodies by a Two-Step-Denaturing and Refolding Method

    The large-scale production of recombinant proteins is important for protein functional and structural studies, particularly when using Escherichia coli over-expression systems; however, approximately 70% of recombinant proteins are over-expressed as insoluble inclusion bodies. Here we present an efficient method for generating soluble proteins from inclusion bodies using two steps of denaturation and one step of refolding. We first demonstrated the advantages of this method over a conventional procedure with one denaturation step and one refolding step using three proteins with different folding properties. The refolded proteins were found to be active in in vitro tests and a bioassay. We then tested the general applicability of the method by analyzing 88 proteins from human and other organisms, all of which were expressed as inclusion bodies. About 76% of these proteins were refolded with an average yield of soluble protein above 75%. This “two-step-denaturing and refolding” (2DR) method is simple, highly efficient, and generally applicable; it can be used to obtain active recombinant proteins for both basic research and industrial purposes.

    Genomic selection to improve husk tightness based on genomic molecular markers in maize

    Introduction: Husk tightness (HTI) in maize plays a crucial role in regulating the water content of ears during the maturity stage, thereby influencing the quality of mechanical grain harvesting in China. Genomic selection (GS), which employs molecular markers, offers a promising approach for identifying and selecting inbred lines with the desired HTI trait in maize breeding. However, the effectiveness of GS depends on various factors, including the genetic architecture of the breeding population, the sequencing platform, and the statistical model. Methods: An association panel of maize inbred lines, divided into four subgroups, was grown at three sites over two years. GS analysis for HTI prediction was performed using marker data from three sequencing platforms and six marker densities with six statistical methods. Results: The findings indicate that a loosely attached husk can aid the dissipation of water from kernels in temperate maize germplasm across most environments, but not necessarily in maize of tropical origin. Balancing GS prediction accuracy against breeding cost, the optimal prediction strategy is the rrBLUP model, the 50K sequencing platform, a testing-population proportion of 30%, and a marker density of r2 = 0.1. Additionally, selecting a specific SS subgroup for sampling the testing set significantly enhances the predictive capacity for husk tightness. Discussion: The determination of the optimal GS prediction strategy for HTI provides an economically feasible reference for molecular breeding practice, and serves as a reference method for GS breeding of other agronomic traits.
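    In the same spirit, the sketch below shows an rrBLUP-like genomic-selection baseline as ridge regression of phenotype on genome-wide markers with a 30% hold-out, assuming scikit-learn; data shapes and the shrinkage parameter are illustrative, not the study's implementation.

```python
# Hedged sketch of genomic-selection prediction in the spirit of rrBLUP: ridge
# regression of phenotype on markers, evaluated on a held-out 30% of the panel.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def predict_hti(marker_matrix, hti_phenotype, test_size=0.30, seed=0):
    """
    marker_matrix: (n_lines, n_markers) genotype codes, e.g. {0, 1, 2}
    hti_phenotype: (n_lines,) observed husk-tightness values
    Returns the predictive ability (correlation between predicted and observed HTI).
    """
    X_train, X_test, y_train, y_test = train_test_split(
        marker_matrix, hti_phenotype, test_size=test_size, random_state=seed)
    model = Ridge(alpha=1.0).fit(X_train, y_train)           # shrinkage over all markers
    predictive_ability = np.corrcoef(model.predict(X_test), y_test)[0, 1]
    return predictive_ability
```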

    Recent Advances on Endocrine Disrupting Effects of UV Filters

    Ultraviolet (UV) filters are used widely in cosmetics, plastics, adhesives, and other industrial products to protect human skin or the products themselves against direct exposure to deleterious UV radiation. With their growing usage and improper disposal, UV filters now represent a new class of contaminants of emerging concern, with increasingly reported adverse effects on humans and other organisms. A growing number of toxicological studies in recent years have shown that exposure to UV filters induces various endocrine-disrupting effects, making a systematic review of the current research on these effects across different organisms necessary. We therefore summarize recent advances in evaluating the endocrine-disrupting potential and the toxicity mechanisms of major classes of UV filters, such as benzophenones, camphor derivatives, and cinnamate derivatives.

    A super-resolution method-based pipeline for fundus fluorescein angiography imaging

    Background: Fundus fluorescein angiography (FFA) imaging is a standard diagnostic tool for many retinal diseases such as age-related macular degeneration and diabetic retinopathy. High-resolution FFA images facilitate the early detection of small lesions, such as microaneurysms, and other landmark changes, which can help ophthalmologists improve patients' cure rates. However, only low-resolution images are available in most clinical cases. Super-resolution (SR), a family of methods for improving image resolution, has been successfully applied to natural and remote-sensing images, but to the best of our knowledge SR techniques have not yet been applied to FFA imaging. Methods: In this work, we propose an SR-based pipeline for FFA imaging, which aims to enhance FFA image quality using SR techniques. Several SR frameworks, including neighborhood-embedding, sparsity-based, locally linear regression, and deep-learning-based approaches, are investigated. Each SR method is implemented and evaluated within the pipeline on a clinical FFA dataset collected from the Second Affiliated Hospital of Xuzhou Medical University. Results and conclusion: Most SR algorithms have a positive impact on the enhancement of FFA images. Super-resolution forests (SRF), a random-forest-based SR method, displayed remarkably high effectiveness and outperformed the other methods. Hence, SRF is a promising way to provide ophthalmologists with high-resolution FFA images in a clinical setting.
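    As an illustration of the random-forest flavour of SR (not the SRF implementation evaluated in the paper), the sketch below learns a mapping from bicubically re-upscaled low-resolution patches to their high-resolution centre pixels; the patch size, forest size, and the scikit-learn/scikit-image APIs are assumptions for the example.

```python
# Hedged sketch of a random-forest SR baseline: predict each high-resolution centre
# pixel from the surrounding patch of a bicubically re-upscaled low-resolution image.
import numpy as np
from skimage.transform import resize
from skimage.util import view_as_windows
from sklearn.ensemble import RandomForestRegressor

def train_rf_sr(hr_image, scale=2, patch=5):
    """hr_image: 2D float array in [0, 1] (e.g. a grayscale FFA frame)."""
    lr = resize(hr_image, (hr_image.shape[0] // scale, hr_image.shape[1] // scale),
                anti_aliasing=True)
    lr_up = resize(lr, hr_image.shape, order=3)              # bicubic re-upscaling
    X = view_as_windows(lr_up, (patch, patch)).reshape(-1, patch * patch)
    r = patch // 2
    y = hr_image[r:-r, r:-r].ravel()                         # HR centre pixel per patch
    return RandomForestRegressor(n_estimators=50, n_jobs=-1).fit(X, y)
```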