53 research outputs found

    Breaking Common Sense: WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images

    Full text link
    Weird, unusual, and uncanny images pique the curiosity of observers because they challenge common sense. For example, an image released during the 2022 World Cup depicts the famous soccer stars Lionel Messi and Cristiano Ronaldo playing chess, which playfully violates our expectation that their competition should occur on the football field. Humans can easily recognize and interpret these unconventional images, but can AI models do the same? We introduce WHOOPS!, a new dataset and benchmark for visual commonsense. The dataset comprises purposefully commonsense-defying images created by designers using publicly available image generation tools like Midjourney. We consider several tasks posed over the dataset. In addition to image captioning, cross-modal matching, and visual question answering, we introduce a difficult explanation-generation task, where models must identify and explain why a given image is unusual. Our results show that state-of-the-art models such as GPT3 and BLIP2 still lag behind human performance on WHOOPS!. We hope our dataset will inspire the development of AI models with stronger visual commonsense reasoning abilities. Data, models, and code are available at the project website: whoops-benchmark.github.io
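    A minimal sketch of how the explanation-generation task might be exercised, assuming the dataset is published on the Hugging Face Hub; the dataset id "nlphuji/whoops" and the field names "image" and "explanation" are assumptions to check against the project website, and the token-overlap score is a toy stand-in for the benchmark's actual metrics:

        # Hypothetical evaluation loop for a WHOOPS!-style explanation task.
        # Dataset id and field names are assumptions, not confirmed here.
        from datasets import load_dataset

        def evaluate_explanations(generate_fn, split="test"):
            ds = load_dataset("nlphuji/whoops", split=split)   # assumed id
            scores = []
            for example in ds:
                prediction = generate_fn(example["image"])     # model under test
                reference = example["explanation"]             # assumed field
                # Toy token-overlap score; the paper uses stronger metrics.
                pred, ref = set(prediction.split()), set(reference.split())
                scores.append(len(pred & ref) / max(len(ref), 1))
            return sum(scores) / len(scores)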

    VASR: Visual Analogies of Situation Recognition

    Full text link
    A core process in human cognition is analogical mapping: the ability to identify a similar relational structure between different situations. We introduce a novel task, Visual Analogies of Situation Recognition, adapting the classical word-analogy task into the visual domain. Given a triplet of images, the task is to select an image candidate B' that completes the analogy (A to A' is like B to what?). Unlike previous work on visual analogy that focused on simple image transformations, we tackle complex analogies requiring understanding of scenes. We leverage situation recognition annotations and the CLIP model to generate a large set of 500k candidate analogies. Crowdsourced annotations for a sample of the data indicate that humans agree with the dataset label ~80% of the time (chance level 25%). Furthermore, we use human annotations to create a gold-standard dataset of 3,820 validated analogies. Our experiments demonstrate that state-of-the-art models do well when distractors are chosen randomly (~86%), but struggle with carefully chosen distractors (~53%, compared to 90% human accuracy). We hope our dataset will encourage the development of new analogy-making models. Accepted to AAAI 2023. Website: https://vasr-dataset.github.io
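    The abstract's pairing of CLIP with the classical word-analogy task suggests a natural baseline: vector arithmetic over image embeddings. A minimal sketch using the open_clip library follows; this is an illustrative baseline, not the authors' method:

        # Pick the candidate whose CLIP embedding is closest to A' - A + B,
        # mirroring the classic word-analogy arithmetic in the visual domain.
        import torch
        import open_clip

        model, _, preprocess = open_clip.create_model_and_transforms(
            "ViT-B-32", pretrained="openai")
        model.eval()

        def embed(image):
            """Normalized CLIP image embedding for a PIL image."""
            with torch.no_grad():
                feats = model.encode_image(preprocess(image).unsqueeze(0))
            return feats / feats.norm(dim=-1, keepdim=True)

        def solve_analogy(img_a, img_a_prime, img_b, candidates):
            """Return the index of the best-matching candidate B'."""
            target = embed(img_a_prime) - embed(img_a) + embed(img_b)
            target = target / target.norm(dim=-1, keepdim=True)
            sims = [float(embed(c) @ target.T) for c in candidates]
            return max(range(len(sims)), key=sims.__getitem__)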

    Code Llama: Open Foundation Models for Code

    Full text link
    We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation models (Code Llama), Python specializations (Code Llama - Python), and instruction-following models (Code Llama - Instruct) with 7B, 13B and 34B parameters each. All models are trained on sequences of 16k tokens and show improvements on inputs with up to 100k tokens. 7B and 13B Code Llama and Code Llama - Instruct variants support infilling based on surrounding content. Code Llama reaches state-of-the-art performance among open models on several code benchmarks, with scores of up to 53% and 55% on HumanEval and MBPP, respectively. Notably, Code Llama - Python 7B outperforms Llama 2 70B on HumanEval and MBPP, and all our models outperform every other publicly available model on MultiPL-E. We release Code Llama under a permissive license that allows for both research and commercial use.
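    Since infilling is called out as a headline capability, a short sketch of how it is typically exercised may help. As I understand the Hugging Face integration, the 7B and 13B base models accept a <FILL_ME> marker at the position to complete; treat the exact marker and model id as assumptions to verify against the model card:

        # Hypothetical infilling call via Hugging Face transformers.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
        model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")

        prompt = 'def remove_non_ascii(s: str) -> str:\n    """<FILL_ME>\n    return result\n'
        input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
        output = model.generate(input_ids, max_new_tokens=128)
        # Decode only the newly generated infill tokens.
        infill = tokenizer.batch_decode(
            output[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
        print(prompt.replace("<FILL_ME>", infill))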

    Resource utilization and costs before and after total joint arthroplasty

    Get PDF
    Background: The purpose of this study was to compare pre- and post-surgical healthcare costs in commercially insured total joint arthroplasty (TJA) patients with osteoarthritis (OA) in the United States (U.S.). Methods: Using a large healthcare claims database, we identified patients over age 39 with hip or knee OA who underwent unilateral primary TJA (hip or knee) between 1/1/2006 and 9/30/2007. Utilization of healthcare services and costs were aggregated into three periods: 12 months "pre-surgery," 91 days "peri-operative," and 3 to 15 months "follow-up." Mean total pre-surgery costs were compared with follow-up costs using the Wilcoxon signed-rank test. Results: 14,912 patients met inclusion criteria for the study. The mean total number of outpatient visits declined from pre-surgery to follow-up (18.0 visits vs. 17.1), while the percentage of patients hospitalized increased (from 7.5% to 9.8%) (both p < 0.01). Mean total costs during the follow-up period were 18% higher than during pre-surgery ($11,043 vs. $9,632, p < 0.01), largely due to an increase in the costs of inpatient care associated with hospital readmissions ($3,300 vs. $1,817, p < 0.01). Pharmacotherapy costs were similar for both periods ($2,013 [follow-up] vs. $1,922 [pre-surgery], p = 0.33); outpatient care costs were slightly lower in the follow-up period ($4,338 vs. $4,571, p < 0.01). Mean total costs for the peri-operative period were $36,553. Conclusions: Mean total utilization of outpatient healthcare services declined slightly in the first year following TJA (exclusive of the peri-operative period), while mean total healthcare costs increased during the same time period, largely due to increased costs associated with hospital readmissions. Further study is necessary to determine whether healthcare costs decrease in subsequent years.

    DataComp: In search of the next generation of multimodal datasets

    Full text link
    Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl. Participants in our benchmark design new filtering techniques or curate new data sources and then evaluate their new dataset by running our standardized CLIP training code and testing the resulting model on 38 downstream test sets. Our benchmark consists of multiple compute scales spanning four orders of magnitude, which enables the study of scaling trends and makes the benchmark accessible to researchers with varying resources. Our baseline experiments show that the DataComp workflow leads to better training sets. In particular, our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming OpenAI's CLIP ViT-L/14 by 3.7 percentage points while using the same training procedure and compute. We release DataComp and all accompanying code at www.datacomp.ai.
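    A sketch of the kind of filtering baseline a participant might submit: score each candidate image-text pair with CLIP and keep only pairs whose similarity clears a threshold. The threshold and model choice here are illustrative assumptions, not DataComp's official baseline:

        # CLIP-score filtering of image-text pairs (illustrative baseline).
        import torch
        import open_clip

        model, _, preprocess = open_clip.create_model_and_transforms(
            "ViT-B-32", pretrained="openai")
        tokenize = open_clip.get_tokenizer("ViT-B-32")
        model.eval()

        def clip_score_filter(pairs, threshold=0.28):
            """pairs: iterable of (PIL image, caption); yields pairs above threshold."""
            for image, caption in pairs:
                with torch.no_grad():
                    img = model.encode_image(preprocess(image).unsqueeze(0))
                    txt = model.encode_text(tokenize([caption]))
                img = img / img.norm(dim=-1, keepdim=True)
                txt = txt / txt.norm(dim=-1, keepdim=True)
                if float(img @ txt.T) >= threshold:
                    yield image, caption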

    Drum Synthesis and Rhythmic Transformation with Adversarial Autoencoders

    Get PDF
    Creative rhythmic transformations of musical audio refer to automated methods for manipulation of temporally-relevant sounds in time. This paper presents a method for joint synthesis and rhythm transformation of drum sounds through the use of adversarial autoencoders (AAE). Users may navigate both the timbre and rhythm of drum patterns in audio recordings through expressive control over a low-dimensional latent space. The model is based on an AAE with Gaussian mixture latent distributions that introduce rhythmic pattern conditioning to represent a wide variety of drum performances. The AAE is trained on a dataset of bar-length segments of percussion recordings, along with their clustered rhythmic pattern labels. The decoder is conditioned during adversarial training for mixing of data-driven rhythmic and timbral properties. The system is trained with over 500,000 bars from 5,418 tracks in popular datasets covering various musical genres. In an evaluation using real percussion recordings, the reconstruction accuracy and latent space interpolation between drum performances are investigated for audio generation conditioned by target rhythmic patterns.
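    For readers unfamiliar with adversarial autoencoders, a compressed PyTorch sketch of the training idea follows: an encoder/decoder pair minimizes reconstruction error while a discriminator pushes the latent codes toward a chosen prior. Layer sizes, the plain Gaussian prior (the paper uses a Gaussian mixture with rhythmic-pattern conditioning), and the flattened input size are illustrative assumptions:

        # Minimal adversarial autoencoder (AAE) training step.
        import torch
        import torch.nn as nn

        LATENT, X_DIM = 16, 512  # assumed latent and input sizes

        encoder = nn.Sequential(nn.Linear(X_DIM, 256), nn.ReLU(), nn.Linear(256, LATENT))
        decoder = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, X_DIM))
        disc = nn.Sequential(nn.Linear(LATENT, 64), nn.ReLU(), nn.Linear(64, 1))

        recon_loss, adv_loss = nn.MSELoss(), nn.BCEWithLogitsLoss()
        opt_ae = torch.optim.Adam(
            list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)
        opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

        def train_step(x):
            # 1) Reconstruction: the autoencoder learns to reproduce x.
            opt_ae.zero_grad()
            loss_rec = recon_loss(decoder(encoder(x)), x)
            loss_rec.backward()
            opt_ae.step()
            # 2) Discriminator: separate prior samples from posterior codes.
            z_prior = torch.randn(x.size(0), LATENT)  # stand-in for the GMM prior
            z_post = encoder(x).detach()
            opt_d.zero_grad()
            loss_d = (adv_loss(disc(z_prior), torch.ones(x.size(0), 1))
                      + adv_loss(disc(z_post), torch.zeros(x.size(0), 1)))
            loss_d.backward()
            opt_d.step()
            # 3) Regularization: the encoder tries to fool the discriminator.
            opt_ae.zero_grad()
            loss_g = adv_loss(disc(encoder(x)), torch.ones(x.size(0), 1))
            loss_g.backward()
            opt_ae.step()
            return loss_rec.item(), loss_d.item(), loss_g.item()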

    The PICO project: aquatic exercise for knee osteoarthritis in overweight and obese individuals

    Full text link

    Large-scale sequencing identifies multiple genes and rare variants associated with Crohn's disease susceptibility

    Get PDF
    Genome-wide association studies (GWASs) have identified hundreds of loci associated with Crohn's disease (CD). However, as with all complex diseases, robust identification of the genes dysregulated by the noncoding variants typically driving GWAS discoveries has been challenging. Here, to complement GWASs and better define actionable biological targets, we analyzed sequence data from more than 30,000 patients with CD and 80,000 population controls. We directly implicate ten genes in general onset CD for the first time to our knowledge via association to coding variation, four of which lie within established CD GWAS loci. In nine instances, a single coding variant is significantly associated, and in the tenth, ATG4C, we additionally see a significantly increased burden of very rare coding variants in CD cases. In addition to reiterating the central role of innate and adaptive immune cells as well as autophagy in CD pathogenesis, these newly associated genes highlight the emerging role of mesenchymal cells in the development and maintenance of intestinal inflammation.
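    The ATG4C finding rests on a rare-variant burden signal, so a sketch of the general technique may be useful: collapse qualifying rare variants per gene into a carrier indicator and compare carrier counts between cases and controls. The Fisher test below is a simplification of what large sequencing studies actually run (which adjust for ancestry and other covariates), and the counts are made up for illustration:

        # Gene-level rare-variant burden test (simplified sketch).
        from scipy.stats import fisher_exact

        def burden_test(case_carriers, n_cases, control_carriers, n_controls):
            """Carriers = subjects with >=1 qualifying rare variant in the gene."""
            table = [
                [case_carriers, n_cases - case_carriers],
                [control_carriers, n_controls - control_carriers],
            ]
            odds_ratio, p_value = fisher_exact(table, alternative="greater")
            return odds_ratio, p_value

        # Illustrative, made-up counts (not from the study):
        print(burden_test(case_carriers=45, n_cases=30000,
                          control_carriers=40, n_controls=80000))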