
    A Multiagent Evolutionary Algorithm for the Resource-Constrained Project Portfolio Selection and Scheduling Problem

    A multiagent evolutionary algorithm is proposed to solve the resource-constrained project portfolio selection and scheduling problem. The algorithm has a dual-level structure. At the upper level, a set of agents makes decisions to select appropriate project portfolios, each agent selecting its portfolio independently; neighborhood competition and self-learning operators are designed to improve an agent's energy, that is, its portfolio profit. At the lower level, the selected projects are scheduled simultaneously and their completion times are computed to estimate the expected portfolio profit; each agent uses a priority rule-based heuristic to solve this multiproject scheduling problem. A set of instances was generated systematically from the widely used Patterson set, and computational experiments confirmed that the proposed evolutionary algorithm is effective for the resource-constrained project portfolio selection and scheduling problem.
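
    As a reading aid, the sketch below illustrates the dual-level idea in miniature: a small lattice of agents, each holding a candidate portfolio (upper level), evaluated by a greedy priority-rule scheduler (lower level). It is a minimal illustration, not the authors' algorithm; the project data, capacity, horizon, and operators are all hypothetical simplifications.

```python
# Minimal illustration of the dual-level structure (hypothetical data and
# operators; not the paper's algorithm).
import random

random.seed(0)

# Hypothetical projects: (profit, duration, resource demand per period).
PROJECTS = [(12, 3, 4), (9, 2, 3), (15, 5, 5), (7, 2, 2), (11, 4, 3)]
CAPACITY = 6   # renewable resource available per period (assumed)
HORIZON = 8    # scheduling horizon (assumed)
LATTICE = 4    # agents live on a LATTICE x LATTICE grid


def schedule_profit(portfolio):
    """Lower level: serial priority-rule heuristic (highest profit first).

    Projects are packed greedily under the per-period capacity; only
    projects finishing by HORIZON contribute profit -- a crude stand-in
    for the expected portfolio profit computed from completion times.
    """
    usage, profit = {}, 0
    for p in sorted(portfolio, key=lambda i: -PROJECTS[i][0]):
        gain, dur, demand = PROJECTS[p]
        start = 0
        while any(usage.get(start + d, 0) + demand > CAPACITY
                  for d in range(dur)):
            start += 1
        if start + dur <= HORIZON:
            for d in range(dur):
                usage[start + d] = usage.get(start + d, 0) + demand
            profit += gain
    return profit


def evolve_cell(grid, x, y):
    """Upper level: toy neighborhood competition plus self-learning.

    The agent adopts the best portfolio among itself and its four
    neighbors, then tries a one-project flip and keeps it only if the
    energy (portfolio profit) improves.
    """
    neighbors = [grid[(x + dx) % LATTICE][(y + dy) % LATTICE]
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    best = max(neighbors + [grid[x][y]], key=schedule_profit)
    child = set(best) ^ {random.randrange(len(PROJECTS))}  # flip one project
    return child if schedule_profit(child) > schedule_profit(best) else set(best)


# Random initial portfolios, then a short evolutionary run.
grid = [[{i for i in range(len(PROJECTS)) if random.random() < 0.5}
         for _ in range(LATTICE)] for _ in range(LATTICE)]
for _ in range(50):
    x, y = random.randrange(LATTICE), random.randrange(LATTICE)
    grid[x][y] = evolve_cell(grid, x, y)

best = max((cell for row in grid for cell in row), key=schedule_profit)
print("best portfolio:", sorted(best), "profit:", schedule_profit(best))
```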

    Turning a CLIP Model into a Scene Text Detector

    The recent large-scale Contrastive Language-Image Pretraining (CLIP) model has shown great potential in various downstream tasks by leveraging its pretrained vision and language knowledge. Scene text, which contains rich textual and visual information, has an inherent connection with a model like CLIP. Recently, pretraining approaches based on vision-language models have made effective progress in text detection. In contrast to these works, this paper proposes a new method, termed TCM, focused on Turning the CLIP Model directly into a text detector without a dedicated pretraining process. The proposed TCM has the following advantages: (1) the underlying principle of the framework can be applied to improve existing scene text detectors; (2) it facilitates few-shot training of existing methods, e.g., using 10% of the labeled data, it improves the baseline method by an average of 22% in F-measure across 4 benchmarks; (3) plugged into existing scene text detection methods, it further achieves promising domain adaptation ability. The code will be publicly released at https://github.com/wenwenyu/TCM.
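
    To make the "CLIP as a detector backbone" idea concrete, here is a minimal sketch, assuming a frozen CLIP-style image encoder feeding a small trainable segmentation head. Every module name, shape, and design choice below is an illustrative assumption, not the released TCM implementation.

```python
# Hypothetical sketch: a frozen CLIP-style image encoder reused as a
# text-detection backbone with a small trainable head. Not the TCM code;
# all modules and shapes are illustrative assumptions.
import torch
import torch.nn as nn


class FrozenClipImageEncoder(nn.Module):
    """Stand-in for a pretrained CLIP image encoder, kept frozen."""

    def __init__(self, dim=64):
        super().__init__()
        self.patchify = nn.Conv2d(3, dim, kernel_size=4, stride=4)
        for p in self.parameters():
            p.requires_grad = False  # reuse pretrained knowledge as-is

    def forward(self, images):
        return self.patchify(images)  # (B, dim, H/4, W/4) feature map


class TextSegHead(nn.Module):
    """Lightweight trainable head predicting a per-pixel text map."""

    def __init__(self, dim=64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(dim, 1, 1),
        )

    def forward(self, feats):
        return torch.sigmoid(self.head(feats))  # text probability per pixel


encoder, head = FrozenClipImageEncoder(), TextSegHead()
images = torch.randn(2, 3, 64, 64)     # dummy image batch
text_map = head(encoder(images))
print(text_map.shape)                  # torch.Size([2, 1, 16, 16])
```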

    Turning a CLIP Model into a Scene Text Spotter

    We exploit the potential of the large-scale Contrastive Language-Image Pretraining (CLIP) model to enhance scene text detection and spotting, transforming it into a robust backbone, FastTCM-CR50. This backbone uses visual prompt learning and cross-attention in CLIP to extract image- and text-based prior knowledge. Using predefined and learnable prompts, FastTCM-CR50 introduces an instance-language matching process that strengthens the synergy between image and text embeddings, thereby refining text regions. Our Bimodal Similarity Matching (BSM) module enables dynamic language prompt generation with offline computation, improving performance. FastTCM-CR50 offers several advantages: 1) it can enhance existing text detectors and spotters, improving performance by an average of 1.7% and 1.5%, respectively; 2) it outperforms the previous TCM-CR50 backbone, yielding average improvements of 0.2% and 0.56% on text detection and spotting tasks, along with a 48.5% increase in inference speed; 3) it shows robust few-shot training capability: using only 10% of the supervised data, FastTCM-CR50 improves performance by an average of 26.5% and 5.5% for text detection and spotting, respectively; 4) it consistently enhances performance on out-of-distribution text detection and spotting datasets, particularly the NightTime-ArT subset of ICDAR2019-ArT and the DOTA dataset for oriented object detection. The code is available at https://github.com/wenwenyu/TCM.
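
    The instance-language matching step can be pictured as scoring region embeddings against a prompt-conditioned text embedding. The sketch below is a hypothetical rendering of that idea, with predefined plus learnable prompt tokens and cosine-similarity matching; the names, dimensions, and temperature are assumptions, not the FastTCM-CR50 code.

```python
# Hypothetical sketch of instance-language matching with predefined plus
# learnable prompts. Names, dimensions, and the temperature are assumed;
# this is not the FastTCM-CR50 implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptedTextEmbedding(nn.Module):
    def __init__(self, dim=32, n_learnable=4):
        super().__init__()
        # "Predefined" prompt: a fixed embedding (e.g., for the word "text").
        self.register_buffer("predefined", torch.randn(1, dim))
        # Learnable prompt tokens, tuned jointly with the detector.
        self.learnable = nn.Parameter(torch.randn(n_learnable, dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self):
        tokens = torch.cat([self.predefined, self.learnable], dim=0)
        pooled = tokens.mean(dim=0, keepdim=True)      # (1, dim)
        return F.normalize(self.proj(pooled), dim=-1)  # unit-norm text embedding


def instance_language_match(region_embs, text_emb, tau=0.07):
    """Cosine similarity between region and text embeddings; higher
    scores mark regions more likely to contain text."""
    region_embs = F.normalize(region_embs, dim=-1)
    return (region_embs @ text_emb.t()).squeeze(-1) / tau


prompt = PromptedTextEmbedding()
regions = torch.randn(5, 32)       # 5 candidate region embeddings (dummy)
scores = instance_language_match(regions, prompt())
print(scores.shape)                # torch.Size([5])
```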

    FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning

    Multimodal tasks in the fashion domain have significant potential for e-commerce but involve challenging vision-and-language learning problems, e.g., retrieving a fashion item given a reference image plus text feedback from a user. Prior work on multimodal fashion tasks has either been limited by the data in individual benchmarks or has leveraged generic vision-and-language pre-training without taking advantage of the characteristics of fashion data; moreover, it has mainly been restricted to multimodal understanding tasks. To address these gaps, we make two key contributions. First, we propose a novel fashion-specific pre-training framework based on weakly supervised triplets constructed from fashion image-text pairs, and we show that the triplet-based tasks are an effective addition to standard multimodal pre-training tasks. Second, we propose a flexible decoder-based model architecture capable of both fashion retrieval and captioning. Together, our model design and pre-training approach are competitive on a diverse set of fashion tasks, including cross-modal retrieval, image retrieval with text feedback, image captioning, relative image captioning, and multimodal categorization.
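
    One simple way to picture a weakly supervised triplet signal built from image-text pairs: treat each image as an anchor, its own caption as the positive, and another item's caption as a weak negative. The sketch below illustrates this under those assumptions; it is not the FaD-VLP triplet construction.

```python
# Hypothetical triplet signal from image-text pairs: image as anchor, its
# caption as positive, another item's caption as a weak negative. Not the
# FaD-VLP construction; embeddings here are random stand-ins.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
img_embs = torch.randn(4, 16)   # image embeddings (anchors)
txt_embs = torch.randn(4, 16)   # matching caption embeddings (positives)

neg_idx = torch.roll(torch.arange(4), 1)  # shift by one: mismatched captions
loss = F.triplet_margin_loss(
    F.normalize(img_embs, dim=-1),            # anchor
    F.normalize(txt_embs, dim=-1),            # positive
    F.normalize(txt_embs[neg_idx], dim=-1),   # weak negative
    margin=0.2,
)
print(float(loss))
```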

    Looking and Listening: Audio Guided Text Recognition

    Text recognition in the wild is a long-standing problem in computer vision. Driven by end-to-end deep learning, recent studies suggest that combining vision and language processing is effective for scene text recognition. Yet, correcting edit errors such as insertions, deletions, and substitutions remains the main challenge for existing approaches. In fact, the content of text and its audio naturally correspond to each other, i.e., a single-character error may result in a clearly different pronunciation. In this paper, we propose AudioOCR, a simple yet effective probabilistic audio decoder that predicts mel-spectrogram sequences to guide scene text recognition; it participates only in the training phase and brings no extra cost during inference. The underlying principle of AudioOCR can easily be applied to existing approaches. Experiments with 7 previous scene text recognition methods on 12 existing regular, irregular, and occluded benchmarks demonstrate that the proposed method brings consistent improvements. More importantly, our experiments show that AudioOCR generalizes to more challenging scenarios, including recognizing non-English text, out-of-vocabulary words, and text with various accents. Code will be available at https://github.com/wenwenyu/AudioOCR.
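
    The "train-time-only auxiliary head" pattern described above can be sketched as follows: a small decoder predicts a mel spectrogram from the recognizer's features, its loss is added during training, and the branch is skipped at inference. All shapes, module names, and losses here are assumptions, not the AudioOCR code.

```python
# Hypothetical sketch of an auxiliary mel-spectrogram head used only at
# training time. Shapes, module names, and losses are assumptions, not
# the AudioOCR implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Recognizer(nn.Module):
    def __init__(self, dim=32, n_classes=37, n_mels=80):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)         # stand-in visual encoder
        self.char_head = nn.Linear(dim, n_classes)  # character predictions
        self.audio_head = nn.Linear(dim, n_mels)    # auxiliary mel decoder

    def forward(self, feats, train=False):
        h = torch.relu(self.backbone(feats))
        logits = self.char_head(h)
        if train:                 # audio branch participates only in training
            return logits, self.audio_head(h)
        return logits             # inference: no extra cost from the branch


model = Recognizer()
feats = torch.randn(2, 8, 32)                 # (batch, seq, dim) features
mels = torch.randn(2, 8, 80)                  # aligned mel targets (dummy)
labels = torch.randint(0, 37, (2, 8))         # character labels (dummy)

logits, mel_pred = model(feats, train=True)
loss = (F.cross_entropy(logits.flatten(0, 1), labels.flatten())
        + F.mse_loss(mel_pred, mels))         # joint train-time objective
loss.backward()
print(model(feats).shape)                     # torch.Size([2, 8, 37])
```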

    Antifungal active ingredient from the twigs and leaves of Clausena lansium Lour. Skeels (Rutaceae)

    Two novel amides, named clauphenamides A and B, and twelve other known compounds were isolated from the twigs and leaves of Clausena lansium Lour. Skeels (Rutaceae). Their structures were elucidated on the basis of extensive spectroscopic analysis and comparison with data reported in the literature. Clauphenamide A (1) features an N-2-(4,8-dimethoxyfuro[2,3-b]quinolin-7-yl)vinyl unit, and clauphenamide B (2) is an unprecedented N-phenethyl cinnamide dimer. The other known compounds comprise pyrrolidone amides (3 and 4), a lignan (5), a sesquiterpene (6), furocoumarins (7–10), and simple coumarins (11–14). Compounds 5, 6, 10 and 12 were isolated from the genus Clausena for the first time, while 13 was isolated from the species C. lansium for the first time. The isolated compounds were assayed for antifungal activity. At a concentration of 100 μg/ml, compared with the control (chlorothalonil, inhibition rate of 83.67%), compounds 1 and 2 exhibited moderate antifungal activity against B. dothidea, with inhibition rates of 68.39% and 52.05%, respectively. Compounds 11–14 also exhibited moderate activity against B. dothidea and F. oxysporum, with inhibition rates greater than 40%. In addition, compared with the control (chlorothalonil, inhibition rate of 69.02%), compounds 11–14 showed strong antifungal activity against P. oryzae, with inhibition rates greater than 55%; among them, compound 14 showed the strongest activity against P. oryzae, with an inhibition rate (65.44%) close to that of the control. The structure-activity relationships of the isolated compounds are also discussed preliminarily.

    Carrier localization and electronic phase separation in a doped spin-orbit driven Mott phase in Sr3(Ir1-xRux)2O7

    Interest in many strongly spin-orbit-coupled 5d transition-metal oxide insulators stems from mapping their electronic structures onto a J=1/2 Mott phase. One hope is to establish their Mott parent states and to explore these systems' potential for realizing novel electronic states upon carrier doping. However, once doped, little is understood about the role of their reduced Coulomb interaction U relative to their strongly correlated 3d-electron cousins. Here we show that, upon hole-doping a candidate J=1/2 Mott insulator, carriers remain localized within a nanoscale phase-separated ground state. A percolative metal-insulator transition occurs through the interplay between localized and itinerant regions, stabilizing an antiferromagnetic metallic phase beyond the critical region. Our results demonstrate a surprising parallel between doped 5d- and 3d-electron Mott systems and suggest that U is essential to the carrier response of this doped spin-orbit Mott insulator, whether through the near degeneracy of nearby electronic phases or through direct carrier localization.

    Conditionally Immortalized Mouse Embryonic Fibroblasts Retain Proliferative Activity without Compromising Multipotent Differentiation Potential

    Mesenchymal stem cells (MSCs) are multipotent cells that reside in many tissues and can give rise to multiple lineages, including bone, cartilage and adipose tissue. Although MSCs have attracted significant attention for basic and translational research, primary MSCs have a limited life span in culture, which hampers their broader application. Here, we investigate whether mouse mesenchymal progenitors can be conditionally immortalized with SV40 large T antigen and maintain long-term proliferation without compromising their multipotency. Using a system that expresses SV40 large T antigen flanked by Cre/loxP sites, we demonstrate that mouse embryonic fibroblasts (MEFs) can be efficiently immortalized by SV40 large T antigen. The conditionally immortalized MEFs (iMEFs) exhibit enhanced proliferative activity and maintain long-term proliferation, which can be reversed by Cre recombinase. The iMEFs express most MSC markers and retain multipotency, as they can differentiate into osteogenic, chondrogenic and adipogenic lineages under appropriate differentiation conditions in vitro and in vivo. Removal of SV40 large T antigen reduces the differentiation potential of iMEFs, possibly owing to decreased progenitor expansion. Furthermore, the iMEFs are apparently not tumorigenic when injected subcutaneously into athymic nude mice. Thus, the conditionally immortalized iMEFs not only maintain long-term proliferation but also retain the ability to differentiate into multiple lineages. Our results suggest that this reversible immortalization strategy using SV40 large T antigen may be an efficient and safe approach to establishing long-term cultures of primary mesenchymal progenitors for basic and translational research, as well as for potential clinical applications.