18,317 research outputs found

    Activating Latino Millennial Civic Power: A Report of the Aspen Institute Latinos and Society Convening - Unlocking Latino Millennial Civic Potential

    The report highlights the ideas and conversations that came from a June 2016 convening of 26 bright minds who developed five inspirational and impactful projects designed to increase Latino Millennial civic engagement.

    Cross-Modal Alignment Learning of Vision-Language Conceptual Systems

    Human infants learn the names of objects and develop their own conceptual systems without explicit supervision. In this study, we propose methods for learning aligned vision-language conceptual systems inspired by infants' word-learning mechanisms. The proposed model learns the associations between visual objects and words online and gradually constructs cross-modal relational graph networks. We also propose an aligned cross-modal representation learning method that learns semantic representations of visual objects and words in a self-supervised manner based on the cross-modal relational graph networks. This allows entities of different modalities with conceptually the same meaning to have similar semantic representation vectors. We evaluate our method quantitatively and qualitatively, including on object-to-word mapping and zero-shot learning tasks, showing that the proposed model significantly outperforms the baselines and that each conceptual system is topologically aligned. Comment: 19 pages, 4 figures.
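
    As a rough illustration of the online association step this abstract describes, the sketch below maintains a bipartite co-occurrence graph between visual object IDs and caption words and reads off an object-to-word mapping from it. This is a minimal sketch, not the authors' implementation: the visual IDs (e.g. `obj_dog`) stand in for whatever the paper's visual encoder would produce, and the symmetric conditional-probability score is an assumed association measure rather than the paper's.

    ```python
    from collections import defaultdict

    class CrossModalGraph:
        """Online bipartite association graph between visual object IDs and words."""

        def __init__(self):
            self.edges = defaultdict(float)          # (visual_id, word) -> co-occurrence count
            self.visual_counts = defaultdict(float)  # visual_id -> number of observations
            self.word_counts = defaultdict(float)    # word -> number of observations

        def observe(self, visual_ids, words):
            """Online update from one image-caption pair (assumed input format)."""
            for v in visual_ids:
                self.visual_counts[v] += 1.0
                for w in set(words):
                    self.edges[(v, w)] += 1.0
            for w in set(words):
                self.word_counts[w] += 1.0

        def association(self, v, w):
            """Symmetric conditional-probability score used here as the edge strength."""
            count = self.edges.get((v, w), 0.0)
            if count == 0.0:
                return 0.0
            return 0.5 * (count / self.visual_counts[v] + count / self.word_counts[w])

        def map_object_to_word(self, v):
            """Object-to-word mapping: the word most strongly associated with v."""
            candidates = {w for (vv, w) in self.edges if vv == v}
            return max(candidates, key=lambda w: self.association(v, w), default=None)

    # Toy usage with made-up visual IDs and tokenized captions:
    g = CrossModalGraph()
    g.observe(["obj_dog", "obj_ball"], ["dog", "plays", "with", "ball"])
    g.observe(["obj_dog"], ["the", "dog", "runs"])
    g.observe(["obj_ball"], ["a", "red", "ball"])
    print(g.map_object_to_word("obj_dog"))   # -> "dog"
    print(g.map_object_to_word("obj_ball"))  # -> "ball"
    ```

    The paper additionally learns self-supervised semantic representations on top of this graph so that aligned entities end up with similar vectors; that representation-learning step is omitted here.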

    Culture as history: envisioning change across and beyond "eastern" and "western" civilizations in the May Fourth era

    This essay examines an influential debate that took place during China's May Fourth era (circa 1915–1927) concerning the character of "Eastern" and "Western" civilizations. In this debate, both moderates and radicals wrestle with a growing awareness that cultures have not only a spatial existence but also a historical career, which has encouraged the development of certain institutions and attitudes and discouraged others. Spatial terms mark not only the places where knowledge circulates but also the particular pasts, and thus futures, toward which Chinese thinkers align themselves. This way of figuring "East" and "West" enables May Fourth thinkers to do more than sort civilizational characteristics into categories of the inevitably universal and the irredeemably particular, as many commentators have assumed. It also facilitates the travel of cultural products and practices across the spatial as well as temporal boundaries originally seen to contain them.

    SmallCap: Lightweight Image Captioning Prompted with Retrieval Augmentation

    Recent advances in image captioning have focused on scaling the data and model size, substantially increasing the cost of pre-training and finetuning. As an alternative to large models, we present SmallCap, which generates a caption conditioned on an input image and related captions retrieved from a datastore. Our model is lightweight and fast to train, as the only learned parameters are in newly introduced cross-attention layers between a pre-trained CLIP encoder and GPT-2 decoder. SmallCap can transfer to new domains without additional finetuning and exploit large-scale data in a training-free fashion because the contents of the datastore can be readily replaced. Our experiments show that SmallCap, trained only on COCO, has competitive performance on this benchmark, and also transfers to other domains without retraining, solely through retrieval from target-domain data. Further improvement is achieved through the training-free exploitation of diverse human-labeled and web data, which proves effective for other domains, including the nocaps image captioning benchmark, designed to test generalization to unseen visual concepts.
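
    The setup described above (a frozen CLIP vision encoder, a frozen GPT-2 decoder, and newly added cross-attention layers as the only trained weights, with retrieved captions used as a prompt) can be sketched with Hugging Face transformers roughly as follows. This is an illustrative sketch under those assumptions, not the released SmallCap code: the prompt template and loss masking are guesses, and retrieval from the datastore is mocked with a hand-written list of captions.

    ```python
    import torch
    from transformers import CLIPVisionModel, GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

    # Frozen pre-trained CLIP vision encoder (ViT-B/32; hidden size 768 matches GPT-2).
    vision = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32").eval()
    for p in vision.parameters():
        p.requires_grad = False

    # GPT-2 decoder with newly initialized cross-attention layers.
    config = GPT2Config.from_pretrained("gpt2", add_cross_attention=True)
    decoder = GPT2LMHeadModel.from_pretrained("gpt2", config=config)
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    # Only the new cross-attention blocks (and their layer norms) are trainable.
    for name, p in decoder.named_parameters():
        p.requires_grad = "crossattention" in name or "ln_cross_attn" in name

    def caption_loss(pixel_values, retrieved_captions, target_caption):
        """One training-style loss: prompt with retrieved captions, attend to CLIP features."""
        with torch.no_grad():
            visual_feats = vision(pixel_values=pixel_values).last_hidden_state

        # Prompt built from the retrieved captions; the exact template is an assumption.
        prompt = "Similar captions: " + " ".join(retrieved_captions) + " This image shows"
        enc = tokenizer(prompt + " " + target_caption, return_tensors="pt")

        # Mask the prompt tokens so only the target caption contributes to the loss.
        prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
        labels = enc.input_ids.clone()
        labels[:, :prompt_len] = -100

        out = decoder(input_ids=enc.input_ids,
                      attention_mask=enc.attention_mask,
                      encoder_hidden_states=visual_feats,
                      labels=labels)
        return out.loss

    # Toy call with a random image tensor; a real pipeline would use CLIPImageProcessor
    # for the image and nearest-neighbour search over a caption datastore for retrieval.
    loss = caption_loss(torch.randn(1, 3, 224, 224),
                        ["a dog running on grass", "a puppy playing with a toy"],
                        "a dog plays with a ball")
    loss.backward()  # gradients flow only into the cross-attention parameters
    ```

    Because the datastore is never trained against, swapping in target-domain captions at inference time changes only the prompt, which is the training-free domain transfer the abstract describes.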