Blind Multiclass Ensemble Classification
The rising interest in pattern recognition and data analytics has spurred the
development of innovative machine learning algorithms and tools. However, as
each algorithm has its strengths and limitations, one is motivated to
judiciously fuse multiple algorithms in order to find the "best" performing
one, for a given dataset. Ensemble learning aims at such a high-performance
meta-algorithm by combining the outputs from multiple algorithms. The present
work introduces a blind scheme for learning from ensembles of classifiers,
using a moment matching method that leverages joint tensor and matrix
factorization. Here, "blind" means that the combiner has no knowledge of the
ground-truth labels that each classifier has been trained on. A rigorous
performance analysis is derived and the proposed scheme is evaluated on
synthetic and real datasets.
Comment: To appear in IEEE Transactions in Signal Processin
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing system (CSP) is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that compared to a pure stream processing system,
AIDR can achieve a higher data classification accuracy, while compared to a
pure crowdsourcing solution, the system makes better use of human workers by
requiring much less manual work effort.
A Stochastic Team Formation Approach for Collaborative Mobile Crowdsourcing
Mobile Crowdsourcing (MCS) is the generalized act of outsourcing sensing
tasks, traditionally performed by employees or contractors, to a large group of
smart-phone users by means of an open call. With the increasing complexity of
the crowdsourcing applications, requesters find it essential to harness the
power of collaboration among the workers by forming teams of skilled workers
satisfying their complex tasks' requirements. This type of MCS is called
Collaborative MCS (CMCS). Previous CMCS approaches have focused mainly on
maximizing team skills, while team formation studies on social networks (SNs)
have focused only on maximizing social relationships. In this
paper, we present a hybrid approach in which requesters can hire a team
that not only has the required expertise but is also socially connected and
can accomplish tasks collaboratively. Because team formation in CMCS is proven
to be NP-hard, we develop a stochastic algorithm that exploits workers' knowledge
about their SN neighbors and asks a designated leader to recruit a suitable
team. The proposed algorithm is inspired by optimal stopping strategies
and uses the odds-algorithm to compute its output. Experimental results show
that, compared to the benchmark exponential optimal solution, the proposed
approach reduces computation time and produces reasonable performance results.
Comment: This paper is accepted for publication in 2019 31st International
Conference on Microelectronics (ICM
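The abstract above names the odds-algorithm as its optimal stopping rule. As a
rough illustration only, the generic odds-algorithm (not the paper's team
formation procedure) can be sketched as follows. The function name
odds_strategy and the input list of per-step success probabilities are
assumptions made for this sketch; how the paper derives those probabilities
from workers' skills and social ties is not stated in the abstract.

```python
def odds_strategy(probs):
    """Generic odds-algorithm for a sequence of independent events.

    probs[k-1] is the (assumed known) probability that step k is a
    "success". Returns (s, win_prob): stop at the first success seen at
    step s or later (1-based); win_prob is the probability that this rule
    stops on the last success of the sequence.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    if not all(0.0 <= p < 1.0 for p in probs):
        raise ValueError("this sketch assumes 0 <= p < 1 for every step")

    odds_sum = 0.0   # running sum of odds r_k = p_k / (1 - p_k)
    q_prod = 1.0     # running product of failure probabilities q_k = 1 - p_k
    s = 1
    # Accumulate odds from the last step backwards until the sum reaches 1.
    for k in range(len(probs), 0, -1):
        p = probs[k - 1]
        q_prod *= 1.0 - p
        odds_sum += p / (1.0 - p)
        s = k
        if odds_sum >= 1.0:
            break
    return s, q_prod * odds_sum


if __name__ == "__main__":
    threshold, win_prob = odds_strategy([0.5, 0.5, 1 / 3, 0.25])
    print(threshold, round(win_prob, 4))
```

In a recruitment setting, a "success" could stand for a candidate who is the
best seen so far, so that the rule says from which candidate onward a leader
should accept the first such candidate; that mapping is an assumption here.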
HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation
Recently, crowdsourcing has emerged as an effective paradigm for
human-powered, large-scale problem solving in various domains. However, a task
requester usually has a limited budget, so it is desirable to have
a policy that allocates the budget wisely to achieve better quality. In this
paper, we study the principle of information maximization for active sampling
strategies in the framework of HodgeRank, an approach based on Hodge
Decomposition of pairwise ranking data with multiple workers. The principle
exhibits two scenarios of active sampling: Fisher information maximization that
leads to unsupervised sampling based on a sequential maximization of graph
algebraic connectivity without considering labels; and Bayesian information
maximization that selects samples with the largest information gain from prior
to posterior, which gives a supervised sampling involving the labels collected.
Experiments show that the proposed methods boost the sampling efficiency as
compared to traditional sampling schemes and are thus valuable to practical
crowdsourcing experiments.
Comment: Accepted by AAAI201
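The unsupervised scenario described above selects comparisons by sequentially
maximizing the algebraic connectivity of the comparison graph, without looking
at labels. A minimal sketch of that idea, assuming a plain greedy search over
all candidate pairs with unit edge weights, is given below. The function names
algebraic_connectivity and next_pair are hypothetical, and the exhaustive
search is a simplification chosen for clarity, not necessarily the procedure
used in the paper.

```python
import itertools

import numpy as np


def algebraic_connectivity(n_items, pairs):
    """Second-smallest eigenvalue of the comparison graph Laplacian.

    pairs is a list of (i, j) item pairs already compared; a repeated pair
    simply adds weight to that edge.
    """
    lap = np.zeros((n_items, n_items))
    for i, j in pairs:
        lap[i, i] += 1.0
        lap[j, j] += 1.0
        lap[i, j] -= 1.0
        lap[j, i] -= 1.0
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(lap)[1])


def next_pair(n_items, pairs):
    """Greedy step: the pair whose addition maximizes algebraic connectivity."""
    candidates = itertools.combinations(range(n_items), 2)
    return max(
        candidates,
        key=lambda pair: algebraic_connectivity(n_items, pairs + [pair]),
    )


if __name__ == "__main__":
    # Four items compared along a path 0-1-2-3; the greedy step picks the
    # pair that improves connectivity the most.
    path = [(0, 1), (1, 2), (2, 3)]
    print(next_pair(4, path))
```

Because no collected labels enter the selection, this step can be planned
before any worker answers arrive, which is what distinguishes it from the
supervised, information-gain-based scenario.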