Much Ado About Time: Exhaustive Annotation of Temporal Data
Large-scale annotated datasets allow AI systems to learn from and build upon
the knowledge of the crowd. Many crowdsourcing techniques have been developed
for collecting image annotations. These techniques often implicitly rely on the
fact that a new input image takes a negligible amount of time to perceive. In
contrast, we investigate and determine the most cost-effective way of obtaining
high-quality multi-label annotations for temporal data such as videos. Watching
even a short 30-second video clip requires a significant time investment from a
crowd worker; thus, requesting multiple annotations following a single viewing
is an important cost-saving strategy. But how many questions should we ask per
video? We conclude that the optimal strategy is to ask as many questions as
possible in a HIT (up to 52 binary questions after watching a 30-second video
clip in our experiments). We demonstrate that while workers may not correctly
answer all questions, the cost-benefit analysis nevertheless favors consensus
from multiple such cheap-yet-imperfect iterations over more complex
alternatives. When compared with a one-question-per-video baseline, our method
is able to achieve a 10% improvement in recall (76.7% ours versus 66.7%
baseline) at comparable precision (83.8% ours versus 83.0% baseline) in about
half the annotation time (3.8 minutes ours compared to 7.1 minutes baseline).
We demonstrate the effectiveness of our method by collecting multi-label
annotations of 157 human activities on 1,815 videos.
Comment: HCOMP 2016 Camera Read
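The consensus step described in this abstract can be illustrated with a minimal sketch. It assumes a simple majority vote over several workers' binary answers; the function name `consensus_labels`, the `min_votes` parameter, and the example questions are hypothetical and are not taken from the paper, which may aggregate answers differently.

```python
from collections import Counter


def consensus_labels(worker_answers, min_votes=None):
    """Aggregate binary answers from several workers by vote counting.

    worker_answers: list of dicts, one per worker, mapping question -> bool.
    min_votes: number of "yes" votes needed to accept a label
               (assumed default: a strict majority of the workers).
    """
    n_workers = len(worker_answers)
    if n_workers == 0:
        return {}
    threshold = min_votes if min_votes is not None else n_workers // 2 + 1

    yes_counts = Counter()
    questions = set()
    for answers in worker_answers:
        for question, answer in answers.items():
            questions.add(question)
            if answer:
                yes_counts[question] += 1

    # A label is accepted only if enough workers answered "yes".
    return {q: yes_counts[q] >= threshold for q in sorted(questions)}


# Three hypothetical workers answer the same questions after one viewing each.
votes = [
    {"opening a door": True, "drinking": False},
    {"opening a door": True, "drinking": True},
    {"opening a door": False, "drinking": False},
]
labels = consensus_labels(votes)
```

Lowering `min_votes` trades precision for recall, which is the kind of trade-off the cost-benefit comparison above is concerned with.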
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding
Computer vision has a great potential to help our daily lives by searching
for lost keys, watering flowers or reminding us to take a pill. To succeed with
such tasks, computer vision methods need to be trained from real and diverse
examples of our daily dynamic scenes. While most of such scenes are not
particularly exciting, they typically do not appear on YouTube, in movies or TV
broadcasts. So how do we collect sufficiently many diverse but boring samples
representing our lives? We propose a novel Hollywood in Homes approach to
collect such data. Instead of shooting videos in the lab, we ensure diversity
by distributing and crowdsourcing the whole process of video creation from
script writing to video recording and annotation. Following this procedure we
collect a new dataset, Charades, with hundreds of people recording videos in
their own homes, acting out casual everyday activities. The dataset is composed
of 9,848 annotated videos with an average length of 30 seconds, showing
activities of 267 people from three continents. Each video is annotated by
multiple free-text descriptions, action labels, action intervals and classes of
interacted objects. In total, Charades provides 27,847 video descriptions,
66,500 temporally localized intervals for 157 action classes and 41,104 labels
for 46 object classes. Using this rich data, we evaluate and provide baseline
results for several tasks including action recognition and automatic
description generation. We believe that the realism, diversity, and casual
nature of this dataset will present unique challenges and new opportunities for
the computer vision community.
Beyond the Camera: Neural Networks in World Coordinates
Eye movement and strategic placement of the visual field onto the retina
give animals increased resolution of the scene and suppress distracting
information. This fundamental system has been missing from video understanding
with deep networks, typically limited to 224 by 224 pixel content locked to the
camera frame. We propose a simple idea, WorldFeatures, where each feature at
every layer has a spatial transformation, and the feature map is only
transformed as needed. We show that a network built with these WorldFeatures
can be used to model eye movements, such as saccades, fixation, and smooth
pursuit, even in a batch setting on pre-recorded video. That is, the network
can for example use all 224 by 224 pixels to look at a small detail one moment,
and the whole scene the next. We show that typical building blocks, such as
convolutions and pooling, can be adapted to support WorldFeatures using
available tools. Experiments are presented on the Charades, Olympic Sports, and
Caltech-UCSD Birds-200-2011 datasets, exploring action recognition,
fine-grained recognition, and video stabilization.
Characterizing Video Question Answering with Sparsified Inputs
In Video Question Answering, videos are often processed as a full-length
sequence of frames to ensure minimal loss of information. Recent works have
demonstrated evidence that sparse video inputs are sufficient to maintain high
performance. However, they usually discuss the case of single frame selection.
In our work, we extend the setting to multiple inputs and other
modalities. We characterize the task with different input sparsity and provide
a tool for doing that. Specifically, we use a Gumbel-based learnable selection
module to adaptively select the best inputs for the final task. In this way, we
experiment over public VideoQA benchmarks and provide analysis on how
sparsified inputs affect the performance. From our experiments, we have
observed only 5.2%-5.8% loss of performance with only 10% of video lengths,
which corresponds to 2-4 frames selected from each video. Meanwhile, we also
observed the complementary behaviour between visual and textual inputs, even
under highly sparsified settings, suggesting the potential of improving data
efficiency for video-and-language tasks.
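As a rough illustration of Gumbel-based selection, the sketch below draws k frame indices with the Gumbel-top-k trick: add independent Gumbel noise to each frame's score and keep the k largest. It covers only the sampling step. A learnable module such as the one the abstract mentions would also need a differentiable relaxation, which is not shown. The function name `gumbel_top_k`, the 30-frame video, and the choice k = 3 are assumptions for illustration, not details from the paper.

```python
import math
import random


def gumbel_top_k(logits, k, rng=random):
    """Sample k distinct indices by perturbing logits with Gumbel(0, 1) noise.

    logits: per-frame selection scores (higher means more likely to be kept).
    k: number of frames to keep.
    rng: object with a random() method, e.g. random.Random(seed).
    """
    perturbed = []
    for index, logit in enumerate(logits):
        # Clamp the uniform sample away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise, index))

    # Keep the k largest perturbed scores and return frames in temporal order.
    top = sorted(perturbed, reverse=True)[:k]
    return sorted(index for _, index in top)


# Hypothetical 30-frame video with uniform scores; keep 10% of the frames.
rng = random.Random(0)
picked = gumbel_top_k([0.0] * 30, k=3, rng=rng)

# A frame with an overwhelmingly large score is always selected.
scores = [0.0] * 10
scores[4] = 1000.0
forced = gumbel_top_k(scores, k=1, rng=rng)
```

With uniform scores every subset of frames is equally likely; training the scores is what would make the selection adaptive.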
Multiple genetic loci for bone mineral density and fractures
BACKGROUND: Bone mineral density influences the risk of osteoporosis later in life and is useful in the evaluation of the risk of fracture. We aimed to identify sequence variants associated with bone mineral density and fracture. METHODS: We performed a quantitative trait analysis of data from 5861 Icelandic subjects (the discovery set), testing for an association between 301,019 single-nucleotide polymorphisms (SNPs) and bone mineral density of the hip and lumbar spine. We then tested for an association between 74 SNPs (most of which were implicated in the discovery set) at 32 loci in replication sets of Icelandic, Danish, and Australian subjects (4165, 2269, and 1491 subjects, respectively). RESULTS: Sequence variants in five genomic regions were significantly associated with bone mineral density in the discovery set and were confirmed in the replication sets (combined P values, 1.2 × 10^-7 to 2.0 × 10^-21). Three regions are close to or within genes previously shown to be important to the biologic characteristics of bone: the receptor activator of nuclear factor-kappaB ligand gene (RANKL) (chromosomal location, 13q14), the osteoprotegerin gene (OPG) (8q24), and the estrogen receptor 1 gene (ESR1) (6q25). The two other regions are close to the zinc finger and BTB domain containing 40 gene (ZBTB40) (1p36) and the major histocompatibility complex region (6p21). The 1p36, 8q24, and 6p21 loci were also associated with osteoporotic fractures, as were loci at 18q21, close to the receptor activator of the nuclear factor-kappaB gene (RANK), and loci at 2p16 and 11p11. CONCLUSIONS: We have discovered common sequence variants that are consistently associated with bone mineral density and with low-trauma fractures in three populations of European descent. Although these variants alone are not clinically useful in the prediction of risk to the individual person, they provide insight into the biochemical pathways underlying osteoporosis.
A candidate gene study of the type I interferon pathway implicates IKBKE and IL8 as risk loci for SLE
Systemic Lupus Erythematosus (SLE) is a systemic autoimmune disease in which the type I interferon pathway has a crucial role. We have previously shown that three genes in this pathway, IRF5, TYK2 and STAT4, are strongly associated with risk for SLE. Here, we investigated 78 genes involved in the type I interferon pathway to identify additional SLE susceptibility loci. First, we genotyped 896 single-nucleotide polymorphisms in these 78 genes and 14 other candidate genes in 482 Swedish SLE patients and 536 controls. Genes with P<0.01 in the initial screen were then followed up in 344 additional Swedish patients and 1299 controls. SNPs in the IKBKE, TANK, STAT1, IL8 and TRAF6 genes gave nominal signals of association with SLE in this extended Swedish cohort. To replicate these findings we extracted data from a genome-wide association study on SLE performed in a US cohort. Combined analysis of the Swedish and US data, comprising a total of 2136 cases and 9694 controls, implicates IKBKE and IL8 as SLE susceptibility loci (Pmeta=0.00010 and Pmeta=0.00040, respectively). STAT1 was also associated with SLE in this cohort (Pmeta=3.3 × 10^-5), but this association signal appears to be dependent on that previously reported for the neighbouring STAT4 gene. Our study suggests additional genes from the type I interferon system in SLE, and highlights genes in this pathway for further functional analysis.
The sequences of 150,119 genomes in the UK Biobank
Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data (1,2). Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank (3). This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.
- …