45 research outputs found

    Adversarial nets with perceptual losses for text-to-image synthesis

    Full text link
    Recent approaches in generative adversarial networks (GANs) can automatically synthesize realistic images from descriptive text. Despite the overall fair quality, the generated images often expose visible flaws that lack structural definition for an object of interest. In this paper, we aim to extend state of the art for GAN-based text-to-image synthesis by improving perceptual quality of generated images. Differentiated from previous work, our synthetic image generator optimizes on perceptual loss functions that measure pixel, feature activation, and texture differences against a natural image. We present visually more compelling synthetic images of birds and flowers generated from text descriptions in comparison to some of the most prominent existing work

    Adversarial Learning of Semantic Relevance in Text to Image Synthesis

    Full text link
    We describe a new approach that improves the training of generative adversarial nets (GANs) for synthesizing diverse images from a text input. Our approach is based on the conditional version of GANs and expands on previous work leveraging an auxiliary task in the discriminator. Our generated images are not limited to certain classes and do not suffer from mode collapse while semantically matching the text input. A key to our training methods is how to form positive and negative training examples with respect to the class label of a given image. Instead of selecting random training examples, we perform negative sampling based on the semantic distance from a positive example in the class. We evaluate our approach using the Oxford-102 flower dataset, adopting the inception score and multi-scale structural similarity index (MS-SSIM) metrics to assess discriminability and diversity of the generated images. The empirical results indicate greater diversity in the generated images, especially when we gradually select more negative training examples closer to a positive example in the semantic space

    Multimodal Sparse Coding for Event Detection

    Full text link
    Unsupervised feature learning methods have proven effective for classification tasks based on a single modality. We present multimodal sparse coding for learning feature representations shared across multiple modalities. The shared representations are applied to multimedia event detection (MED) and evaluated in comparison to unimodal counterparts, as well as other feature learning methods such as GMM supervectors and sparse RBM. We report the cross-validated classification accuracy and mean average precision of the MED system trained on features learned from our unimodal and multimodal settings for a subset of the TRECVID MED 2014 dataset.Comment: Multimodal Machine Learning Workshop at NIPS 201

    A Comparative Study of Reference Reliability in Multiple Language Editions of Wikipedia

    Full text link
    Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references in multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We discover that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident with sources tailored for smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. We finally present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a discrepancy in reference reliability across cultures. Our finding highlights future challenges in coordinating global knowledge on source reliability.Comment: Conference on Information & Knowledge Management (CIKM '23

    Bidirectional Captioning for Clinically Accurate and Interpretable Models

    Full text link
    Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experiment with bidirectional captioning of radiology reports as a form of pretraining and compare the quality and utility of learned embeddings with those from contrastive pretraining methods. We optimize a CNN encoder, transformer decoder architecture named RadTex for the radiology domain. Results show that not only does captioning pretraining yield visual encoders that are competitive with contrastive pretraining (CheXpert competition multi-label AUC of 89.4%), but also that our transformer decoder is capable of generating clinically relevant reports (captioning macro-F1 score of 0.349 using CheXpert labeler) and responding to prompts with targeted, interactive outputs.Comment: 12 pages, 7 figures. Code release to follo

    Antibodies against endogenous retroviruses promote lung cancer immunotherapy

    Get PDF
    B cells are frequently found in the margins of solid tumours as organized follicles in ectopic lymphoid organs called tertiary lymphoid structures (TLS)1,2^{1,2}. Although TLS have been found to correlate with improved patient survival and response to immune checkpoint blockade (ICB), the underlying mechanisms of this association remain elusive1,2^{1,2}. Here we investigate lung-resident B cell responses in patients from the TRACERx 421 (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy) and other lung cancer cohorts, and in a recently established immunogenic mouse model for lung adenocarcinoma3^{3}. We find that both human and mouse lung adenocarcinomas elicit local germinal centre responses and tumour-binding antibodies, and further identify endogenous retrovirus (ERV) envelope glycoproteins as a dominant anti-tumour antibody target. ERV-targeting B cell responses are amplified by ICB in both humans and mice, and by targeted inhibition of KRAS(G12C) in the mouse model. ERV-reactive antibodies exert anti-tumour activity that extends survival in the mouse model, and ERV expression predicts the outcome of ICB in human lung adenocarcinoma. Finally, we find that effective immunotherapy in the mouse model requires CXCL13-dependent TLS formation. Conversely, therapeutic CXCL13 treatment potentiates anti-tumour immunity and synergizes with ICB. Our findings provide a possible mechanistic basis for the association of TLS with immunotherapy response

    Developing a Series of AI Challenges for the United States Department of the Air Force

    Full text link
    Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requirements. Several projects supported by the DAF-MIT AI Accelerator are developing public challenge problems that address numerous Federal AI research priorities. These challenges target priorities by making large, AI-ready datasets publicly available, incentivizing open-source solutions, and creating a demand signal for dual use technologies that can stimulate further research. In this article, we describe these public challenges being developed and how their application contributes to scientific advances

    Antibodies against endogenous retroviruses promote lung cancer immunotherapy

    Get PDF
    B cells are frequently found in the margins of solid tumours as organized follicles in ectopic lymphoid organs called tertiary lymphoid structures (TLS)1,2. Although TLS have been found to correlate with improved patient survival and response to immune checkpoint blockade (ICB), the underlying mechanisms of this association remain elusive1,2. Here we investigate lung-resident B cell responses in patients from the TRACERx 421 (Tracking Non-Small-Cell Lung Cancer Evolution Through Therapy) and other lung cancer cohorts, and in a recently established immunogenic mouse model for lung adenocarcinoma3. We find that both human and mouse lung adenocarcinomas elicit local germinal centre responses and tumour-binding antibodies, and further identify endogenous retrovirus (ERV) envelope glycoproteins as a dominant anti-tumour antibody target. ERV-targeting B cell responses are amplified by ICB in both humans and mice, and by targeted inhibition of KRAS(G12C) in the mouse model. ERV-reactive antibodies exert anti-tumour activity that extends survival in the mouse model, and ERV expression predicts the outcome of ICB in human lung adenocarcinoma. Finally, we find that effective immunotherapy in the mouse model requires CXCL13-dependent TLS formation. Conversely, therapeutic CXCL13 treatment potentiates anti-tumour immunity and synergizes with ICB. Our findings provide a possible mechanistic basis for the association of TLS with immunotherapy respons
    corecore