2,281 research outputs found

Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers

Author: Szepesvari Csaba
Ma Yao
Olshevsky Alexander
Saligrama Venkatesh
Publication date: 10/07/2018

We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers. We show that the correlation matrix can be successfully recovered and the skills are identifiable if and only if the sampling matrix (observed components) is irreducible and aperiodic. We then propose an efficient gradient descent scheme and show that the skill estimates converge to the desired global optimum for such sampling matrices. Our proof is original and the results are surprising in light of the fact that even the weighted rank-one matrix factorization problem is NP-hard in general. Next, we derive sample complexity bounds for the noisy case in terms of spectral properties of the signless Laplacian of the sampling matrix. Our proposed scheme achieves state-of-the-art performance on a number of real-world datasets.

Boston University Institutional Repository (OpenBU)
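
The formulation in the abstract above can be illustrated with a small noiseless example. The sketch below is not the authors' implementation: it assumes each worker i has a skill s_i in [-1, 1] so that the correlation between workers i and j is s_i * s_j, and it runs plain gradient descent on the squared error over the observed pairs only. The function name `estimate_skills`, the step size, the iteration count, and the toy graph are all hypothetical choices made for illustration.

```python
def estimate_skills(corr, n_workers, steps=5000, lr=0.05, init=0.5):
    """Fit skills s so that s[i] * s[j] matches each observed correlation.

    corr maps an observed worker pair (i, j) to its label correlation.
    Minimizes sum over observed pairs of (s[i] * s[j] - corr[i, j]) ** 2.
    """
    # Positive initialization picks the solution with positive skills;
    # s and -s fit the data equally well (assumption: workers beat chance).
    s = [init] * n_workers
    for _ in range(steps):
        grad = [0.0] * n_workers
        for (i, j), c in corr.items():
            residual = s[i] * s[j] - c
            grad[i] += 2.0 * residual * s[j]
            grad[j] += 2.0 * residual * s[i]
        s = [si - lr * g for si, g in zip(s, grad)]
    return s


# Toy data (hypothetical): four workers, only four of six pairs observed.
# The observed pairs form a connected graph containing an odd cycle (0-1-2),
# which is the irreducible and aperiodic condition on the sampling matrix.
true_skills = [0.9, 0.6, 0.3, 0.7]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
corr = {(i, j): true_skills[i] * true_skills[j] for i, j in edges}

est = estimate_skills(corr, n_workers=4)
print([round(x, 3) for x in est])
```

With noisy correlations the same loop applies, but the recovered skills are only approximate; the abstract's sample complexity bounds address that case.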

A survey of spatial crowdsourcing

Author: Gummidi Srinivasa Raghavendra Bhuvan
Pedersen Torben Bach
Xie Xike
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/04/2019

VBN

Multi-modal Spatial Crowdsourcing for Enriching Spatial Datasets

Author: Gummidi Srinivasa Raghavendra Bhuvan
Publication venue: Aalborg Universitetsforlag
Publication date: 01/01/2021

VBN

Uncovering spatiotemporal biases in place-based social sensing

Author: Janowicz Krzysztof
Keßler Carsten
McKenzie Grant
Publication venue: 'Copernicus GmbH'
Publication date: 01/01/2020

VBN

Conflating point of interest (POI) data: A systematic review of matching methods

Author: Hu Yingjie
Ma Yue
Sun Kai
Zhou Ryan Zhenqi
Zhu Yunqiang
Publication date: 23/10/2023

Point of interest (POI) data provide digital representations of places in the real world, and have been increasingly used to understand human-place interactions, support urban management, and build smart cities. Many POI datasets have been developed, which often differ in geographic coverage, attribute focus, and data quality. From time to time, researchers may need to conflate two or more POI datasets in order to build a better representation of the places in their study areas. While various POI conflation methods have been developed, a systematic review is lacking, and consequently it is difficult for researchers new to POI conflation to quickly grasp and use these existing methods. This paper fills that gap. Following the protocol of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), we conduct a systematic review by searching three bibliographic databases using reproducible syntax to identify related studies. We then focus on a main step of POI conflation, i.e., POI matching, and systematically summarize and categorize the identified methods. Current limitations and future opportunities are discussed afterwards. We hope that this review can provide some guidance for researchers interested in conflating POI datasets for their research.

arXiv.org e-Print Archive

Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data

Author: Balasubramanian Vineeth N
Pal Arghya
Publication date: 01/01/2018

The paucity of large, curated, hand-labeled training data for every domain of interest forms a major bottleneck in the deployment of machine learning models in computer vision and other fields. Recent work (Data Programming) has shown how distant supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data as well as a curated aggregated label, given a set of weak labeling functions. We validated our method on the MNIST, Fashion MNIST, CIFAR 10 and SVHN datasets, and it outperformed many state-of-the-art models. We conducted extensive experiments to study its usefulness, and showed how the proposed ADP framework can be used for transfer learning as well as multi-task learning, where data from two domains are generated simultaneously using the framework along with the label information. Our future work will involve understanding the theoretical implications of this new framework from a game-theoretic perspective, as well as exploring the performance of the method on more complex datasets. Comment: CVPR 2018 main conference paper

arXiv.org e-Print Archive
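
For readers unfamiliar with the labeling functions mentioned in the abstract above, the following sketch shows the general idea only. It is not ADP: the aggregation here is a simple majority vote, swapped in for the adversarial aggregation the paper proposes, and the toy text task, the function names, and the example strings are all hypothetical.

```python
from collections import Counter

ABSTAIN = None  # a labeling function may decline to vote


# Hypothetical weak labeling functions: each encodes one noisy heuristic.
def lf_mentions_free(text):
    return "spam" if "free" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return "spam" if text.count("!") >= 2 else ABSTAIN


def lf_greeting(text):
    return "ham" if text.lower().startswith(("hi", "hello")) else ABSTAIN


def majority_label(text, labeling_functions):
    """Aggregate the non-abstaining votes by simple majority."""
    votes = [lf(text) for lf in labeling_functions]
    votes = [v for v in votes if v is not ABSTAIN]
    if not votes:
        return ABSTAIN
    return Counter(votes).most_common(1)[0][0]


lfs = [lf_mentions_free, lf_many_exclamations, lf_greeting]
label_a = majority_label("Hello!! Claim your FREE prize!!", lfs)
label_b = majority_label("Hi, are we still meeting at noon?", lfs)
label_c = majority_label("See attached report.", lfs)
print(label_a, label_b, label_c)
```

Majority voting treats every labeling function as equally reliable; learned aggregation schemes exist precisely because that assumption rarely holds.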