362 research outputs found
Machine learning from crowds: a systematic review of its applications
Crowdsourcing opens the door to solving a wide variety of problems that previously were unfeasible in the field of machine learning, allowing us to obtain relatively low-cost labeled data in a small amount of time. However, due to the uncertain quality of labelers, the data to deal with are sometimes unreliable, forcing practitioners to collect information redundantly, which poses new challenges in the field. Despite these difficulties, many applications of machine learning using crowdsourced data have recently been published that achieved state-of-the-art results in relevant problems. We have analyzed these applications following a systematic methodology, classifying them into different fields of study, highlighting several of their characteristics and showing the recent interest in the use of crowdsourcing for machine learning. We also identify several exciting research lines based on the problems that remain unsolved to foster future research in this field.
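As an illustration of why redundant collection helps with unreliable labelers, the following is a minimal sketch of majority-vote aggregation, one simple baseline for combining several crowd labels per item. It is not a method taken from the review itself; the function name `majority_vote`, the item identifiers, and the alphabetical tie-breaking rule are assumptions made for this example.

```python
from collections import Counter


def majority_vote(labels_per_item):
    """Aggregate redundant crowd labels by keeping the most frequent label per item.

    labels_per_item: dict mapping an item id to the list of labels it received.
    Ties are broken alphabetically (an arbitrary choice made for this sketch).
    """
    aggregated = {}
    for item_id, labels in labels_per_item.items():
        counts = Counter(labels)
        # Order by descending count, then by label text, so ties are deterministic.
        best_label, _ = min(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
        aggregated[item_id] = best_label
    return aggregated


# Hypothetical annotations: three items, each labeled by two or three workers.
crowd_labels = {
    "img_1": ["cat", "cat", "dog"],
    "img_2": ["dog", "dog", "dog"],
    "img_3": ["cat", "dog"],  # a tie, resolved alphabetically
}
consensus = majority_vote(crowd_labels)
print(consensus)
```

Majority voting treats every labeler as equally reliable; approaches that estimate per-labeler quality are generally preferred when worker reliability varies widely.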
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
Beautiful and damned. Combined effect of content quality and social ties on user engagement
User participation in online communities is driven by the intertwinement of
the social network structure with the crowd-generated content that flows along
its links. These aspects are rarely explored jointly and at scale. By looking
at how users generate and access pictures of varying beauty on Flickr, we
investigate how the production of quality impacts the dynamics of online social
systems. We develop a deep learning computer vision model to score images
according to their aesthetic value and we validate its output through
crowdsourcing. By applying it to over 15B Flickr photos, we study for the first
time how image beauty is distributed over a large-scale social system.
Beautiful images are evenly distributed in the network, although only a small
core of people get social recognition for them. To study the impact of exposure
to quality on user engagement, we set up matching experiments aimed at
detecting causality from observational data. Exposure to beauty is
double-edged: following people who produce high-quality content increases one's
probability of uploading better photos; however, an excessive imbalance between
the quality generated by a user and the user's neighbors leads to a decline in
engagement. Our analysis has practical implications for improving link
recommender systems.
Comment: 13 pages, 12 figures, final version published in IEEE Transactions on
Knowledge and Data Engineering (Volume: PP, Issue: 99
ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition
Sign languages are used as a primary language by approximately 70 million
D/deaf people world-wide. However, most communication technologies operate in
spoken and written languages, creating inequities in access. To help tackle
this problem, we release ASL Citizen, the first crowdsourced Isolated Sign
Language Recognition (ISLR) dataset, collected with consent and containing
83,399 videos for 2,731 distinct signs filmed by 52 signers in a variety of
environments. We propose that this dataset be used for sign language dictionary
retrieval for American Sign Language (ASL), where a user demonstrates a sign to
their webcam to retrieve matching signs from a dictionary. We show that
training supervised machine learning classifiers with our dataset advances the
state-of-the-art on metrics relevant for dictionary retrieval, achieving 63%
accuracy and a recall-at-10 of 91%, evaluated entirely on videos of users who
are not present in the training or validation sets. An accessible PDF of this
article is available at the following link:
https://aashakadesai.github.io/research/ASLCitizen_arxiv_updated.pd
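The two retrieval metrics quoted above (top-1 accuracy and recall-at-10) can be computed as in the sketch below. This is a generic illustration rather than the authors' evaluation code; the function name `recall_at_k` and the toy sign glosses are assumptions made for this example.

```python
def recall_at_k(ranked_predictions, true_labels, k):
    """Fraction of queries whose true label appears among the top-k ranked candidates.

    ranked_predictions: one list per query, ordered from best to worst match.
    true_labels: the correct label for each query, in the same order.
    With k=1 this is ordinary top-1 accuracy.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("Need exactly one true label per ranked list")
    if not true_labels:
        raise ValueError("Cannot compute recall on an empty evaluation set")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical dictionary-retrieval results for three query videos.
ranked = [
    ["HELLO", "THANKS", "YES"],
    ["NO", "YES", "HELLO"],
    ["THANKS", "NO", "YES"],
]
truths = ["HELLO", "YES", "HELLO"]

top1 = recall_at_k(ranked, truths, k=1)
top2 = recall_at_k(ranked, truths, k=2)
print(top1, top2)
```

For dictionary retrieval, recall-at-k reflects whether the correct sign appears in the short list a user would scan, which is why it is reported alongside top-1 accuracy.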
It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
We address the problem of 3D human pose estimation from 2D input images using
only weakly supervised training data. Despite showing considerable success for
2D pose estimation, the application of supervised machine learning to 3D pose
estimation in real world images is currently hampered by the lack of varied
training images with corresponding 3D poses. Most existing 3D pose estimation
algorithms train on data that has either been collected in carefully controlled
studio settings or has been generated synthetically. Instead, we take a
different approach, and propose a 3D human pose estimation algorithm that only
requires relative estimates of depth at training time. Such a training signal,
although noisy, can be easily collected from crowd annotators, and is of
sufficient quality for enabling successful training and evaluation of 3D pose
algorithms. Our results are competitive with fully supervised regression based
approaches on the Human3.6M dataset, despite using significantly weaker
training data. Our proposed algorithm opens the door to using existing
widespread 2D datasets for 3D pose estimation by allowing fine-tuning with
noisy relative constraints, resulting in more accurate 3D poses.
Comment: BMVC 2018. Project page available at
http://www.vision.caltech.edu/~mronchi/projects/RelativePos
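To illustrate how relative depth estimates can act as a training signal, the sketch below implements a common pairwise ranking loss over the predicted depths of two joints. It is an assumption-laden example and not necessarily the loss used in the paper; the function name `relative_depth_loss` and the sign convention for the label `r` are hypothetical.

```python
import math


def relative_depth_loss(z_i, z_j, r):
    """Pairwise loss for one crowd-provided relative depth label.

    z_i, z_j: predicted depths of two joints (larger means farther from the camera).
    r: +1 if annotators say joint i is farther than joint j,
       -1 if they say it is closer,
        0 if they judge the two depths to be roughly equal.
    (This labeling convention is an assumption made for the sketch.)
    """
    diff = z_i - z_j
    if r == 0:
        # Penalize any gap when the joints are labeled as equally deep.
        return diff ** 2
    # Logistic ranking loss: small when the predicted order agrees with the label.
    return math.log1p(math.exp(-r * diff))


# The prediction that agrees with the label receives the lower loss.
agree = relative_depth_loss(1.0, 0.0, r=1)
disagree = relative_depth_loss(0.0, 1.0, r=1)
print(agree, disagree)
```

Because the loss depends only on the ordering of the two depths, it needs no metric ground truth, which is what makes noisy crowd annotations usable for this kind of supervision.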