    Blind Multiclass Ensemble Classification

    The rising interest in pattern recognition and data analytics has spurred the development of innovative machine learning algorithms and tools. However, as each algorithm has its strengths and limitations, one is motivated to judiciously fuse multiple algorithms in order to find the "best" performing one for a given dataset. Ensemble learning aims at such a high-performance meta-algorithm by combining the outputs from multiple algorithms. The present work introduces a blind scheme for learning from ensembles of classifiers, using a moment matching method that leverages joint tensor and matrix factorization. Blind refers to the combiner, which has no knowledge of the ground-truth labels that each classifier has been trained on. A rigorous performance analysis is derived and the proposed scheme is evaluated on synthetic and real datasets. Comment: To appear in IEEE Transactions in Signal Processin

    Engineering Crowdsourced Stream Processing Systems

    A crowdsourced stream processing system (CSP) is a system that incorporates crowdsourced tasks in the processing of a data stream. This can be seen as enabling crowdsourcing work to be applied to a sample of large-scale data at high speed, or equivalently, enabling stream processing to employ human intelligence. It also leads to a substantial expansion of the capabilities of data processing systems. Engineering a CSP system requires the combination of human and machine computation elements. From a general systems theory perspective, this means taking into account inherited as well as emerging properties from both these elements. In this paper, we position CSP systems within a broader taxonomy, outline a series of design principles and evaluation metrics, present an extensible framework for their design, and describe several design patterns. We showcase the capabilities of CSP systems by performing a case study that applies our proposed framework to the design and analysis of a real system (AIDR) that classifies social media messages during time-critical crisis events. Results show that, compared to a pure stream processing system, AIDR can achieve a higher data classification accuracy, while compared to a pure crowdsourcing solution, the system makes better use of human workers by requiring much less manual work effort.

    A Stochastic Team Formation Approach for Collaborative Mobile Crowdsourcing

    Mobile Crowdsourcing (MCS) is the generalized act of outsourcing sensing tasks, traditionally performed by employees or contractors, to a large group of smartphone users by means of an open call. With the increasing complexity of crowdsourcing applications, requesters find it essential to harness the power of collaboration among the workers by forming teams of skilled workers that satisfy their complex tasks' requirements. This type of MCS is called Collaborative MCS (CMCS). Previous CMCS approaches have mainly focused on maximizing team skills. Other team formation studies on social networks (SNs) have focused only on maximizing social relationships. In this paper, we present a hybrid approach where requesters are able to hire a team that not only has the required expertise but is also socially connected and can accomplish tasks collaboratively. Because team formation in CMCS is proven to be NP-hard, we develop a stochastic algorithm that exploits workers' knowledge about their SN neighbors and asks a designated leader to recruit a suitable team. The proposed algorithm is inspired by optimal stopping strategies and uses the odds-algorithm to compute its output. Experimental results show that, compared to the benchmark exponential optimal solution, the proposed approach reduces computation time and produces reasonable performance results. Comment: This paper is accepted for publication in 2019 31st International Conference on Microelectronics (ICM
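
The abstract above mentions the odds-algorithm from optimal stopping theory. As a minimal sketch of that generic building block (not the paper's team formation procedure, whose details are not given here), the following computes the stopping threshold for a sequence of independent events. The function name `odds_algorithm` and the input list of success probabilities are hypothetical, chosen for illustration only.

```python
from typing import List, Tuple


def odds_algorithm(success_probs: List[float]) -> Tuple[int, float]:
    """Generic odds-algorithm for stopping on the last success.

    success_probs[k-1] is the (assumed known) probability that the k-th
    independent event is a "success". Returns (start, win_prob):
    the rule is "stop at the first success observed at position >= start"
    (1-based), and win_prob is the probability that this rule stops
    exactly on the last success of the sequence.
    """
    if not success_probs:
        raise ValueError("need at least one event")
    if any(p < 0.0 or p > 1.0 for p in success_probs):
        raise ValueError("probabilities must lie in [0, 1]")

    odds_sum = 0.0   # running sum of odds p/(1-p), accumulated from the end
    none_prob = 1.0  # P(no success among the events visited so far)
    one_prob = 0.0   # P(exactly one success among the events visited so far)
    start = 1

    # Walk backwards from the last event until the summed odds reach 1.
    for k in range(len(success_probs), 0, -1):
        p = success_probs[k - 1]
        q = 1.0 - p
        one_prob = p * none_prob + q * one_prob
        none_prob *= q
        odds_sum += p / q if q > 0.0 else float("inf")
        start = k
        if odds_sum >= 1.0:
            break

    # The rule wins exactly when there is one success in positions start..n.
    return start, one_prob


if __name__ == "__main__":
    # Classic secretary-style example with four candidates, p_k = 1/k.
    print(odds_algorithm([1.0, 1 / 2, 1 / 3, 1 / 4]))
```

In a recruiting setting one could read each event as "the next candidate contacted is the best seen so far", but that mapping is an assumption made here for illustration, not something the abstract states.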

    HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

    Recently, crowdsourcing has emerged as an effective paradigm for human-powered, large-scale problem solving in various domains. However, a task requester usually has a limited budget, so it is desirable to have a policy that allocates the budget wisely to achieve better quality. In this paper, we study the principle of information maximization for active sampling strategies in the framework of HodgeRank, an approach based on the Hodge decomposition of pairwise ranking data with multiple workers. The principle exhibits two scenarios of active sampling: Fisher information maximization, which leads to unsupervised sampling based on a sequential maximization of graph algebraic connectivity without considering labels; and Bayesian information maximization, which selects samples with the largest information gain from prior to posterior and thus gives a supervised sampling scheme involving the labels collected. Experiments show that the proposed methods boost the sampling efficiency as compared to traditional sampling schemes and are thus valuable to practical crowdsourcing experiments. Comment: Accepted by AAAI201
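
To make the two ingredients named in the abstract above more concrete, here is a rough sketch, assuming NumPy is available: a least-squares global ranking from pairwise comparisons (the gradient component that HodgeRank recovers), and a greedy, label-free choice of the next pair that maximizes the algebraic connectivity of the comparison graph. All function names and the `(i, j, y)` data layout are hypothetical, and the greedy step is a simplified illustration rather than the authors' exact procedure.

```python
import numpy as np


def hodgerank_scores(n_items, comparisons):
    """Least-squares scores from comparisons (i, j, y), with y ~ s_i - s_j."""
    A = np.zeros((len(comparisons), n_items))
    y = np.zeros(len(comparisons))
    for row, (i, j, value) in enumerate(comparisons):
        A[row, i] = 1.0   # +1 for the first item of the pair
        A[row, j] = -1.0  # -1 for the second item of the pair
        y[row] = value
    scores, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Scores are only defined up to a constant shift; center them.
    return scores - scores.mean()


def algebraic_connectivity(n_items, edges):
    """Second-smallest eigenvalue of the unweighted graph Laplacian."""
    L = np.zeros((n_items, n_items))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    return float(np.linalg.eigvalsh(L)[1])  # eigenvalues come back sorted


def next_pair_unsupervised(n_items, edges, candidates):
    """Pick the candidate pair whose addition maximizes connectivity."""
    return max(
        candidates,
        key=lambda e: algebraic_connectivity(n_items, list(edges) + [e]),
    )


if __name__ == "__main__":
    comparisons = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
    print(hodgerank_scores(3, comparisons))
    path = [(0, 1), (1, 2), (2, 3)]
    print(next_pair_unsupervised(4, path, [(0, 3), (0, 2)]))
```

The selection step never looks at the comparison values, which is what makes it unsupervised in the sense the abstract describes; a Bayesian variant would additionally use the collected labels.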