14 research outputs found

    Kosmos-2.5: A Multimodal Literate Model

    We present Kosmos-2.5, a multimodal literate model for machine reading of text-intensive images. Pre-trained on large-scale text-intensive images, Kosmos-2.5 excels in two distinct yet cooperative transcription tasks: (1) generating spatially-aware text blocks, where each block of text is assigned its spatial coordinates within the image, and (2) producing structured text output that captures styles and structures in Markdown format. This unified multimodal literate capability is achieved through a shared Transformer architecture, task-specific prompts, and flexible text representations. We evaluate Kosmos-2.5 on end-to-end document-level text recognition and image-to-markdown text generation. Furthermore, the model can be readily adapted to any text-intensive image understanding task with different prompts through supervised fine-tuning, making it a general-purpose tool for real-world applications involving text-rich images. This work also paves the way for the future scaling of multimodal large language models.

    Identification of Novel Inhibitors against Coactivator Associated Arginine Methyltransferase 1 Based on Virtual Screening and Biological Assays

    Overexpression of coactivator associated arginine methyltransferase 1 (CARM1), a protein arginine N-methyltransferase (PRMT) family enzyme, is associated with various diseases including cancers. Consequently, the development of small-molecule inhibitors targeting PRMTs has significant value for both research and therapeutic purposes. In this study, by combining structure-based virtual screening with biochemical assays, two compounds, DC_C11 and DC_C66, were identified as novel inhibitors of CARM1. Cellular studies revealed that the two inhibitors are cell membrane permeable and effectively blocked proliferation of cancer cells including HeLa, K562, and MCF7. We further predicted the binding mode of these inhibitors through molecular docking analysis, which indicated that the inhibitors competitively occupied the binding site of the substrate and disrupted the protein-protein interactions between CARM1 and its substrates. Overall, this study sheds light on the development of small-molecule CARM1 inhibitors with novel scaffolds.

    A representative-based framework for parsing and summarizing events in surveillance videos

    This paper presents a novel representative-based framework for parsing and summarizing events in long surveillance videos. The proposed framework first extracts object blob sequences and uses them to represent events in a surveillance video. Then, a sequence filtering strategy is introduced which detects and eliminates noisy blob sequences based on their spatial and temporal characteristics. After clustering the blob sequences into different event types, we further introduce a representative-based model which integrates location, size, and appearance cues to select a representative blob sequence from each cluster, and creates a snapshot image for each representative blob sequence. Based on the blob-sequence clustering and representative-sequence selection results, two schemes are further proposed to summarize the contents of the input surveillance video: (1) a type-based scheme, which shows snapshot images to users and creates a summary video for a specific event cluster according to the user-selected snapshot image; (2) a representative-based scheme, which creates a summary video using only the extracted representative blob sequences. Experimental results show that our approach can create more effective and better-organized summarization results than state-of-the-art methods.

    ThiNet: Pruning CNN Filters for a Thinner Net


    FRIH: Fine-grained Region-aware Image Harmonization

    Image harmonization aims to generate a more realistic appearance of foreground and background for a composite image. Existing methods perform the same harmonization process for the whole foreground. However, the implanted foreground always contains different appearance patterns. All the existing solutions ignore the differences between color blocks and lose some specific details. Therefore, we propose a novel global-local two-stage framework for Fine-grained Region-aware Image Harmonization (FRIH), which is trained end-to-end. In the first stage, the whole input foreground mask is used to make a global coarse-grained harmonization. In the second stage, we adaptively cluster the input foreground mask into several submasks according to the corresponding pixel RGB values in the composite image. Each submask is concatenated with the coarsely adjusted image and fed into a lightweight cascaded module, which adjusts the global harmonization result according to the region-aware local features. Moreover, we design a fusion prediction module that fuses features from all the cascaded decoder layers to generate the final result, making comprehensive use of the harmonization results at different degrees. Without bells and whistles, our FRIH algorithm achieves the best performance on the iHarmony4 dataset (PSNR of 38.19 dB) with a lightweight model. Our model has only 11.98 M parameters, far fewer than existing methods.
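The second-stage step of splitting a foreground mask into submasks by pixel RGB values can be sketched as follows. This is a minimal illustration only: the abstract does not say which clustering method FRIH uses, so plain k-means with a deterministic farthest-point initialization is assumed here, and the function name `cluster_foreground` and its parameters are hypothetical.

```python
def cluster_foreground(image, mask, k=3, iters=10):
    """Split a binary foreground mask into k submasks by k-means on RGB.

    image: list of rows, each a list of (r, g, b) tuples
    mask:  list of rows of 0/1 flags marking foreground pixels
    Returns k masks with the same shape as `mask`.
    NOTE: k-means is an assumption; the source does not name the method.
    """
    def dist2(p, q):
        # Squared Euclidean distance between two RGB triples.
        return sum((a - b) ** 2 for a, b in zip(p, q))

    coords = [(y, x) for y, row in enumerate(mask)
              for x, v in enumerate(row) if v]
    if len(coords) < k:
        raise ValueError("need at least k foreground pixels")
    pixels = [image[y][x] for y, x in coords]

    # Farthest-point initialization: deterministic and spreads the centers.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(max(pixels,
                           key=lambda p: min(dist2(p, c) for c in centers)))

    labels = [0] * len(pixels)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in pixels]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))

    submasks = [[[0] * len(row) for row in mask] for _ in range(k)]
    for (y, x), lab in zip(coords, labels):
        submasks[lab][y][x] = 1
    return submasks
```

Each returned submask marks one color-coherent region of the foreground; the submasks are disjoint and together cover the original mask. In the framework described above, each submask would then be paired with the coarsely harmonized image before local refinement.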

    Identification of Selective, Cell Active Inhibitors of Protein Arginine Methyltransferase 5 through Structure-Based Virtual Screening and Biological Assays

    Protein arginine methyltransferase 5 (PRMT5), a type II PRMT enzyme, is reported as an important therapeutic target in leukemia and lymphoma. In the present study, based on the combination of virtual screening and biochemical validation, we discovered a series of small-molecule inhibitors targeting PRMT5. Among those, DC_Y134 exhibited the most potent activity, with an IC50 value of 1.7 μM, and displayed good selectivity against other methyltransferases. Further treatment with DC_Y134 inhibited the proliferation of several hematological malignancy cell lines by causing cell cycle arrest and apoptosis. Western blot assays indicated that DC_Y134 reduced the cellular levels of symmetric arginine dimethylation. In addition, we analyzed the binding mode of DC_Y134 through molecular docking, which revealed that DC_Y134 occupies the binding site of the substrate arginine and explained the selectivity of this inhibitor. Taken together, compound DC_Y134 could be used to elucidate the biological roles of PRMT5 and serve as a lead compound for the treatment of hematologic malignancies.

    Manipulation of the Electronic Transport Properties of Charge-Transfer Oxide Thin Films of NdNiO3 Using Static and Electric-Field-Controllable Dynamic Lattice Strain

    Using perovskite-type charge-transfer oxide thin films of NdNiO3 (NNO) as a model system, we demonstrate that the effects of lattice strain on the electronic transport properties can be more comprehensively understood by growing NNO films on a number of (001)-, (011)-, and (111)-cut single-crystal substrates with different lattice mismatches, including the relaxor-based 0.31Pb(In1/2Nb1/2)O3-0.35Pb(Mg1/3Nb2/3)O3-0.34PbTiO3 (PIN-PMN-PT) and 0.71Pb(Mg1/3Nb2/3)O3-0.29PbTiO3 (PMN-PT) ferroelectric (FE) single crystals. In addition to the static lattice strains from conventional substrates (e.g., SrTiO3, LaAlO3), we impose in-plane compressive or tensile strains on NNO films in situ using FE/ferroelastic domain switching of FE substrates. An unprecedented electric-field-induced large out-of-plane compressive strain (-0.53%) and in-plane tensile strain (+0.81%) are achieved in the 25-nm NNO film by switching the polarization direction of the PIN-PMN-PT substrate at T = 200 K. This value is approximately 7.4 to 45 times larger than those previously reported in FE substrate-based heterostructures. As a result of the induced large lattice strain, the resistivity of the NNO film is modulated by up to 125%. Further, taking advantage of the linear piezoelectric strain, a quantitative relationship between the resistivity and the in-plane strain of the NNO film is established, with a gauge factor of (Δρ/ρ)/δε_xx ≈ 40.8. Moreover, using the domain-engineered FE/ferroelastic switching of PMN-PT substrates, multiple stable resistance states with good retention and endurance properties can be obtained at room temperature, and the metal-to-insulator transition temperature (T_MI) of NNO films can be modified by precisely controlling the electric-field-pulse sequence as a result of the nonvolatile remnant strain transferring from the PMN-PT to the NNO film. Our results demonstrate that the electric-field-tunable ferroelastic/piezoelectric strain approach can be utilized to gain deeper insight into the intrinsic strain-property relationship of perovskite nickelate films and provide a simple and energy-efficient way to construct multistate resistive memories.
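The gauge-factor relation quoted in the abstract can be written out as a short worked example. The strain value used below (0.1%) is a hypothetical input chosen for illustration, not a figure from the paper, and the relation is taken here to apply only in the linear piezoelectric regime for which the abstract states it, not to the larger strains induced by polarization switching.

```latex
\[
  \frac{\Delta\rho}{\rho} \approx G \,\delta\epsilon_{xx},
  \qquad G \approx 40.8
\]
% Hypothetical example: an in-plane strain change of 0.1%
\[
  \delta\epsilon_{xx} = 0.1\% = 1.0\times10^{-3}
  \;\Longrightarrow\;
  \frac{\Delta\rho}{\rho} \approx 40.8 \times 1.0\times10^{-3}
  = 4.08\times10^{-2} \approx 4.1\%
\]
```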