102 research outputs found

    Deliverable Π.3.2: User Modeling and Preference Management

    This deliverable, Π.3.2, comprises the results of sub-action ΥΔ3.2: Regulating the evolution of hyperspaces and information ecosystems. Section 1 presents the context and motivation of our research. Section 2 presents the approach we propose for modeling user profiles, together with an application of that model. Section 3 presents how the profiles are managed and which algorithms can operate under such a model, as well as the selective, efficient solution of important personalization algorithms.

    On optimality of jury selection in crowdsourcing

    Recent advances in crowdsourcing technologies enable computationally challenging tasks (e.g., sentiment analysis and entity resolution) to be performed by Internet workers, driven mainly by monetary incentives. A fundamental question is: how should workers be selected, so that the tasks at hand can be accomplished successfully and economically? In this paper, we study the Jury Selection Problem (JSP): given a monetary budget and a set of decision-making tasks (e.g., “Is Bill Gates still the CEO of Microsoft now?”), return the set of workers (called a jury) whose answers yield the highest “Jury Quality” (JQ). Existing JSP solutions use the Majority Voting (MV) strategy, which returns the answer chosen by the largest number of workers. We show that MV does not yield the best solution for JSP. We further prove that among all voting strategies (both deterministic and randomized), Bayesian Voting (BV) solves JSP optimally. We then examine how to solve JSP based on BV. This is technically challenging, since computing the JQ with BV is NP-hard. We address this by proposing a computationally efficient approximation algorithm. Our approximate JQ computation is also highly accurate, with an error proved to be bounded within 1%. We extend our solution by considering the task owner’s “belief” (or prior) on the answers to the tasks. Experiments on synthetic and real datasets show that our new approach is consistently better than the best known JSP solution.
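The gap between MV and BV described in this abstract can be illustrated with a small brute-force sketch. It is not the paper's approximation algorithm. It assumes a single binary task, independent workers with known accuracies, and that JQ means the probability that the aggregated answer is correct. The function names, the `prior0` parameter, and the example accuracies are hypothetical. Enumerating all vote vectors takes exponential time, so this only works for tiny juries.

```python
from itertools import product

def joint(votes, qualities, truth, prior0=0.5):
    """P(votes and truth), assuming independent workers (illustrative model).

    qualities[i] is the assumed probability that worker i answers correctly;
    prior0 is the assumed prior probability that the true answer is 0.
    """
    p = prior0 if truth == 0 else 1.0 - prior0
    for v, q in zip(votes, qualities):
        p *= q if v == truth else 1.0 - q
    return p

def majority_voting(votes, p0, p1):
    """Probability mass on which MV is correct for this vote vector."""
    ones = sum(votes)
    zeros = len(votes) - ones
    if ones > zeros:
        return p1
    if zeros > ones:
        return p0
    return 0.5 * (p0 + p1)  # tie: pick either answer uniformly at random

def bayesian_voting(votes, p0, p1):
    """BV returns the answer with the larger posterior, so it keeps the max."""
    return max(p0, p1)

def jury_quality(qualities, strategy, prior0=0.5):
    """Sum, over all 2^n vote vectors, the probability of a correct outcome."""
    total = 0.0
    for votes in product((0, 1), repeat=len(qualities)):
        p0 = joint(votes, qualities, 0, prior0)
        p1 = joint(votes, qualities, 1, prior0)
        total += strategy(votes, p0, p1)
    return total

# Hypothetical jury: one strong worker and two weak ones.
jury = [0.9, 0.6, 0.6]
jq_mv = jury_quality(jury, majority_voting)
jq_bv = jury_quality(jury, bayesian_voting)
print(f"JQ under MV: {jq_mv:.3f}, JQ under BV: {jq_bv:.3f}")
```

With these assumed accuracies, BV effectively follows the strong worker, while MV lets the two weaker workers outvote that worker, which lowers the resulting quality.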

    Investigation of Database Models for Evolving Graphs

    We deal with the efficient implementation of storage models for time-varying graphs. To this end, we present an improved approach for the HiNode vertex-centric model based on MongoDB. Apart from its inherent space optimality, this approach significantly improves execution times for global queries, the most challenging query type for entity-centric approaches. Compared to an implementation based on Cassandra, it not only achieves significant speedups but also makes more expensive queries feasible, owing to its ability to exploit indices to a larger extent and to benefit from in-database query processing.

    Finite Automata Algorithms in Map-Reduce

    In this thesis, we study the intersection of several large nondeterministic finite automata (NFAs) and the minimization of a large deterministic finite automaton (DFA) in map-reduce. We derive a lower bound on the replication rate for computing NFA intersections and provide three concrete algorithms for the problem. Our investigation of the replication rate of each of the three algorithms, through detailed experiments on large datasets of finite automata, shows where each algorithm could be applied. Denoting by n the number of states in a DFA A, we propose an algorithm that minimizes A in n map-reduce rounds in the worst case. Our experiments, however, indicate that in practice the number of rounds is much smaller than n for all the DFAs we examined. In other words, this algorithm converges in d iterations by computing the equivalence classes of each state, where d is the diameter of the input DFA.
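The round-based computation of state equivalence classes mentioned in this abstract can be sketched sequentially with Moore-style partition refinement. This is a swapped-in, single-machine illustration, not the thesis's map-reduce algorithm. The function name, the dictionary encoding of the transition function, and the example DFA are assumptions. Each loop iteration plays the role of one round, and the final counted round is the one that confirms the partition no longer changes.

```python
def equivalence_classes(states, alphabet, delta, accepting):
    """Refine state blocks round by round until the partition is stable.

    states: list of states; alphabet: list of symbols;
    delta: dict mapping (state, symbol) -> state (a complete DFA is assumed);
    accepting: set of accepting states.
    Returns (block id per state, number of rounds executed).
    """
    # Round 0 partition: accepting versus non-accepting states.
    block = {s: int(s in accepting) for s in states}
    rounds = 0
    while True:
        # A state's signature is its own block plus the blocks of its successors.
        signature = {
            s: (block[s],) + tuple(block[delta[s, a]] for a in alphabet)
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(signature[s], len(ids)) for s in states}
        rounds += 1
        # Refinement only ever splits blocks, so an unchanged count means stable.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block, rounds
        block = new_block

# Hypothetical chain DFA over {'a'}: 0 -> 1 -> 2 -> 3 -> 3, accepting {3}.
chain_delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 3}
classes, rounds = equivalence_classes([0, 1, 2, 3], ["a"], chain_delta, {3})
print(len(set(classes.values())), "classes after", rounds, "rounds")
```

In a map-reduce setting, each refinement round would presumably be one job that groups states by signature. The sketch only shows why the number of rounds depends on how far apart distinguishing information lies in the automaton.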

    On the Parameterized Complexity of Learning Monadic Second-Order Formulas

    Within the model-theoretic framework for supervised learning introduced by Grohe and Turán (TOCS 2004), we study the parameterized complexity of learning concepts definable in monadic second-order logic (MSO). We show that the problem of learning a consistent MSO-formula is fixed-parameter tractable on structures of bounded tree-width and on graphs of bounded clique-width in the 1-dimensional case, that is, if the instances are single vertices (and not tuples of vertices). This generalizes previous results on strings and on trees. Moreover, in the agnostic PAC-learning setting, we show that the result also holds in higher dimensions. Finally, via a reduction to the MSO-model-checking problem, we show that learning a consistent MSO-formula is para-NP-hard on general structures.

    Characterizing XML Twig Queries with Examples

    Typically, a (Boolean) query is a finite formula that defines a possibly infinite set of database instances that satisfy it (positive examples) and, implicitly, the set of instances that do not satisfy the query (negative examples). We investigate the following natural question: for a given class of queries, is it possible to characterize every query with a finite set of positive and negative examples that no other query is consistent with? We study this question for twig queries and XML databases. We show that while twig queries are characterizable, they generally require exponential sets of examples. Consequently, we focus on a practical subclass of anchored twig queries and show that they are not only characterizable but characterizable with polynomially-sized sets of examples. This result is obtained with the use of generalization operations on twig queries, whose application to an anchored twig query yields a properly contained and minimally different query. Our results illustrate further interesting and strong connections between the structure and the semantics of anchored twig queries that the class of arbitrary twig queries does not enjoy. Finally, we show that the class of unions of twig queries is not characterizable.