1,277 research outputs found

    On the Complexity of Mining Itemsets from the Crowd Using Taxonomies

    We study the problem of frequent itemset mining in domains where data is not recorded in a conventional database but only exists in human knowledge. We provide examples of such scenarios, and present a crowdsourcing model for them. The model uses the crowd as an oracle to find out whether an itemset is frequent or not, and relies on a known taxonomy of the item domain to guide the search for frequent itemsets. In the spirit of data mining with oracles, we analyze the complexity of this problem in terms of (i) crowd complexity, which measures the number of crowd questions required to identify the frequent itemsets; and (ii) computational complexity, which measures the computational effort required to choose the questions. We provide lower and upper complexity bounds in terms of the size and structure of the input taxonomy, as well as the size of a concise description of the output itemsets. We also provide constructive algorithms that achieve the upper bounds, and consider more efficient variants for practical situations. Comment: 18 pages, 2 figures. To be published at ICDT'13. Added missing acknowledgemen

    Managing big data experiments on smartphones

    The explosive growth in the number of smartphones with ever-growing sensing and computing capabilities has brought a paradigm shift to many traditional domains of the computing field. Re-programming smartphones and instrumenting them for application testing and data gathering at scale is currently a tedious and time-consuming process that poses significant logistical challenges. Next-generation smartphone applications are expected to be much larger in scale and more complex, demanding that they undergo evaluation and testing under different real-world datasets, devices and conditions. In this paper, we present an architecture for managing such large-scale data management experiments on real smartphones. In particular, we present the building blocks of our architecture, which encompass smartphone sensor data collected by the crowd and organized in our big data repository. The given datasets can then be replayed on our testbed, comprising real and simulated smartphones accessible to developers through a web-based interface. We demonstrate the applicability of our architecture through a case study that involves the evaluation of individual components of a complex indoor positioning system for smartphones, coined Anyplace, which we have developed over the years. The study shows how our architecture allows us to derive novel insights into the performance of our algorithms and applications by simplifying the management of large-scale data on smartphones.

    Report on the Information Retrieval Festival (IRFest2017)

    The Information Retrieval Festival took place in April 2017 in Glasgow. The focus of the workshop was to bring together IR researchers from the various Scottish universities and beyond in order to facilitate more awareness, increased interaction and reflection on the status of the field and its future. The program included an industry session, research talks, demos and posters as well as two keynotes. The first keynote was delivered by Prof. Jaana Kekäläinen, who provided a historical, critical reflection on realism in Interactive Information Retrieval Experimentation, while the second keynote was delivered by Prof. Maarten de Rijke, who argued for more Artificial Intelligence usage in IR solutions and deployments. The workshop was followed by a "Tour de Scotland" where delegates were taken from Glasgow to Aberdeen for the European Conference in Information Retrieval (ECIR 2017).