Automatic tagging and geotagging in video collections and communities
Automatically generated tags and geotags hold great promise
to improve access to video collections and online communities.
We overview three tasks offered in the MediaEval 2010
benchmarking initiative, describing for each its use scenario, definition, and released data set. For each task, a reference algorithm that was used within MediaEval 2010 is presented, along with comments on lessons learned. The Tagging Task, Professional involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task, Wild Wild Web involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.
Crowdsourcing malaria parasite quantification: an online game for analyzing images of infected thick blood smears
Background: There are 600,000 new malaria cases daily worldwide. The gold standard for estimating the parasite burden and the corresponding severity of the disease consists in manually counting the number of parasites in blood smears through a microscope, a process that can take more than 20 minutes of an expert microscopist’s time.
Objective: This research tests the feasibility of a crowdsourced approach to malaria image analysis. In particular, we investigated whether anonymous volunteers with no prior experience would be able to count malaria parasites in digitized images of thick blood smears by playing a Web-based game.
Methods: The experimental system consisted of a Web-based game where online volunteers were tasked with detecting parasites in digitized blood sample images coupled with a decision algorithm that combined the analyses from several players to produce an improved collective detection outcome. Data were collected through the MalariaSpot website. Random images of thick blood films containing Plasmodium falciparum at medium to low parasitemias, acquired by conventional optical microscopy, were presented to players. In the game, players had to find and tag as many parasites as possible in 1 minute. In the event that players found all the parasites present in the image, they were presented with a new image. In order to combine the choices of different players into a single crowd decision, we implemented an image processing pipeline and a quorum algorithm that judged a parasite tagged when a group of players agreed on its position.
Results: Over 1 month, anonymous players from 95 countries played more than 12,000 games and generated a database of more than 270,000 clicks on the test images. Results revealed that combining 22 games from nonexpert players achieved a parasite counting accuracy higher than 99%. This performance could be obtained also by combining 13 games from players trained for 1 minute. Exhaustive computations measured the parasite counting accuracy for all players as a function of the number of games considered and the experience of the players. In addition, we propose a mathematical equation that accurately models the collective parasite counting performance.
Conclusions: This research validates the online gaming approach for crowdsourced counting of malaria parasites in images of thick blood films. The findings support the conclusion that nonexperts are able to rapidly learn how to identify the typical features of malaria parasites in digitized thick blood samples and that combining the analyses of several users provides parasite counting accuracy similar to that of expert microscopists. This experiment illustrates the potential of the crowdsourced gaming approach for performing routine malaria parasite quantification, and more generally for solving biomedical image analysis problems, with future potential for telediagnosis related to global health challenges.
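The quorum idea described in the Methods of this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the greedy distance-based grouping, the function name `quorum_detect`, and the `radius` and `quorum` values are all assumptions made for illustration, and the image processing stage mentioned in the abstract is not reproduced.

```python
from math import hypot


def quorum_detect(games, radius=10.0, quorum=2):
    """Return positions that at least `quorum` distinct games agree on.

    games:  list of games, each a list of (x, y) click coordinates.
    radius: hypothetical pixel distance within which clicks count as the
            same candidate parasite.
    quorum: hypothetical minimum number of distinct games that must agree.
    """
    clusters = []  # each cluster keeps a running mean position and its voters
    for game_id, clicks in enumerate(games):
        for x, y in clicks:
            for c in clusters:
                if hypot(x - c["x"], y - c["y"]) <= radius:
                    # Join the first cluster in range and update its mean.
                    c["n"] += 1
                    c["x"] += (x - c["x"]) / c["n"]
                    c["y"] += (y - c["y"]) / c["n"]
                    c["games"].add(game_id)
                    break
            else:
                # No cluster in range: start a new candidate position.
                clusters.append(
                    {"x": float(x), "y": float(y), "n": 1, "games": {game_id}}
                )
    # Keep only candidates supported by enough distinct games.
    return [(c["x"], c["y"]) for c in clusters if len(c["games"]) >= quorum]
```

Counting distinct games rather than raw clicks means that repeated clicks by one player on the same spot cannot reach the quorum on their own.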
A Data-Driven Approach for Tag Refinement and Localization in Web Videos
Tagging of visual content is becoming more and more widespread as web-based
services and social networks have popularized tagging functionalities among
their users. These user-generated tags are used to ease browsing and
exploration of media collections, e.g. using tag clouds, or to retrieve
multimedia content. However, not all media are equally tagged by users. With
current systems it is easy to tag a single photo, and even tagging a part of a
photo, like a face, has become common in sites like Flickr and Facebook. On the
other hand, tagging a video sequence is more complicated and time consuming, so
that users just tag the overall content of a video. In this paper we present a
method for automatic video annotation that increases the number of tags
originally provided by users, and localizes them temporally, associating tags
to keyframes. Our approach exploits collective knowledge embedded in
user-generated tags and web sources, and visual similarity of keyframes and
images uploaded to social sites like YouTube and Flickr, as well as web sources
like Google and Bing. Given a keyframe, our method is able to select on the fly
from these visual sources the training exemplars that should be the most
relevant for this test sample, and proceeds to transfer labels across similar
images. Compared to existing video tagging approaches that require training
classifiers for each tag, our system has few parameters, is easy to implement
and can deal with an open vocabulary scenario. We demonstrate the approach on
tag refinement and localization on DUT-WEBV, a large dataset of web videos, and
show state-of-the-art results. Comment: Preprint submitted to Computer Vision and Image Understanding (CVIU)
A picture is worth a thousand words: The perplexing problem of indexing images
Indexing images has always been problematic due to their richness of content and innate subjectivity. Three traditional approaches to indexing images are described and analyzed. An introduction to the contemporary use of social tagging is presented along with its limitations. Traditional practices can continue to be used as a stand-alone solution; however, deficiencies limit retrieval. A collaborative technique is supported by current research and a model created by the authors for its inception is explored. CONTENTdm® is used as an example to illustrate tools that can help facilitate this process. Another potential solution discussed is the expansion of algorithms used in computer extraction to include the input and influence of human indexer intelligence. Further research is recommended in each area to discern the most effective method.
Automatic Concept Discovery from Parallel Text and Visual Corpora
Humans connect language and vision to perceive the world. How can a
similar connection be built for computers? One possible way is via visual concepts,
which are text terms that relate to visually discriminative entities. We
propose an automatic visual concept discovery algorithm using parallel text and
visual corpora; it filters text terms based on the visual discriminative power
of the associated images, and groups them into concepts using visual and
semantic similarities. We illustrate the applications of the discovered
concepts on a bidirectional image and sentence retrieval task and an image
tagging task, and show that the discovered concepts not only significantly
outperform several large sets of manually selected concepts, but also achieve
state-of-the-art performance in the retrieval task. Comment: To appear in ICCV 201
Blindspot: Indistinguishable Anonymous Communications
Communication anonymity is a key requirement for individuals under targeted
surveillance. Practical anonymous communications also require
indistinguishability - an adversary should be unable to distinguish between
anonymised and non-anonymised traffic for a given user. We propose Blindspot, a
design for high-latency anonymous communications that offers
indistinguishability and unobservability under a (qualified) global active
adversary. Blindspot creates anonymous routes between sender-receiver pairs by
subliminally encoding messages within the pre-existing communication behaviour
of users within a social network, specifically their organic image sharing
behaviour. Channel bandwidth therefore depends on the intensity of image
sharing along a route. A major challenge we successfully
overcome is that routing must be accomplished in the face of significant
restrictions - channel bandwidth is stochastic. We show that conventional
social network routing strategies do not work. To solve this problem, we
propose a novel routing algorithm. We evaluate Blindspot using a real-world
dataset. We find that it delivers reasonable results for applications requiring
low-volume unobservable communication. Comment: 13 pages
Notes on the Margins of Metadata; Concerning the Undecidability of the Digital Image
This paper considers the significance of metadata in relation to the image economy of the web. Social practices such as keywording, tagging, rating and viewing increasingly influence the modes of navigation and hence the utility of images in online environments. To a user faced with an avalanche of images, metadata promises to make photographs machine-readable in order to mobilize new knowledge, in a continuation of the archival paradigm. At the same time, metadata enables new topologies of the image, new temporalities and multiplicities which present a challenge to historical models of representation. As photography becomes an encoded discourse, we suggest that the turning away from the visual towards the mathematical and the algorithmic establishes undecidability as a key property of the networked image.