145,236 research outputs found

    Multi-Sorted Inverse Frequent Itemsets Mining: On-Going Research

    Get PDF
    Inverse frequent itemset mining (IFM) consists of generating artificial transactional databases reflecting patterns of real ones, in particular, satisfying given frequency constraints on the itemsets. An extension of IFM called many-sorted IFM, is introduced where the schemes for the datasets to be generated are those typical of Big Tables, as required in emerging big data applications, e.g., social network analytics

    A planetary nervous system for social mining and collective awareness

    Get PDF
    We present a research roadmap of a Planetary Nervous System (PNS), capable of sensing and mining the digital breadcrumbs of human activities and unveiling the knowledge hidden in the big data for addressing the big questions about social complexity. We envision the PNS as a globally distributed, self-organizing, techno-social system for answering analytical questions about the status of world-wide society, based on three pillars: social sensing, social mining and the idea of trust networks and privacy-aware social mining. We discuss the ingredients of a science and a technology necessary to build the PNS upon the three mentioned pillars, beyond the limitations of their respective state-of-art. Social sensing is aimed at developing better methods for harvesting the big data from the techno-social ecosystem and make them available for mining, learning and analysis at a properly high abstraction level. Social mining is the problem of discovering patterns and models of human behaviour from the sensed data across the various social dimensions by data mining, machine learning and social network analysis. Trusted networks and privacy-aware social mining is aimed at creating a new deal around the questions of privacy and data ownership empowering individual persons with full awareness and control on own personal data, so that users may allow access and use of their data for their own good and the common good. The PNS will provide a goal-oriented knowledge discovery framework, made of technology and people, able to configure itself to the aim of answering questions about the pulse of global society. Given an analytical request, the PNS activates a process composed by a variety of interconnected tasks exploiting the social sensing and mining methods within the transparent ecosystem provided by the trusted network. The PNS we foresee is the key tool for individual and collective awareness for the knowledge society. We need such a tool for everyone to become fully aware of how powerful is the knowledge of our society we can achieve by leveraging our wisdom as a crowd, and how important is that everybody participates both as a consumer and as a producer of the social knowledge, for it to become a trustable, accessible, safe and useful public good. Graphical abstrac

    A planetary nervous system for social mining and collective awareness

    Get PDF
    We present a research roadmap of a Planetary Nervous System (PNS), capable of sensing and mining the digital breadcrumbs of human activities and unveiling the knowledge hidden in the big data for addressing the big questions about social complexity. We envision the PNS as a globally distributed, self-organizing, techno-social system for answering analytical questions about the status of world-wide society, based on three pillars: social sensing, social mining and the idea of trust networks and privacy-aware social mining. We discuss the ingredients of a science and a technology necessary to build the PNS upon the three mentioned pillars, beyond the limitations of their respective state-of-art. Social sensing is aimed at developing better methods for harvesting the big data from the techno-social ecosystem and make them available for mining, learning and analysis at a properly high abstraction level. Social mining is the problem of discovering patterns and models of human behaviour from the sensed data across the various social dimensions by data mining, machine learning and social network analysis. Trusted networks and privacy-aware social mining is aimed at creating a new deal around the questions of privacy and data ownership empowering individual persons with full awareness and control on own personal data, so that users may allow access and use of their data for their own good and the common good. The PNS will provide a goal-oriented knowledge discovery framework, made of technology and people, able to configure itself to the aim of answering questions about the pulse of global society. Given an analytical request, the PNS activates a process composed by a variety of interconnected tasks exploiting the social sensing and mining methods within the transparent ecosystem provided by the trusted network. The PNS we foresee is the key tool for individual and collective awareness for the knowledge society. We need such a tool for everyone to become fully aware of how powerful is the knowledge of our society we can achieve by leveraging our wisdom as a crowd, and how important is that everybody participates both as a consumer and as a producer of the social knowledge, for it to become a trustable, accessible, safe and useful public good.Seventh Framework Programme (European Commission) (grant agreement No. 284709

    Counting Causal Paths in Big Times Series Data on Networks

    Full text link
    Graph or network representations are an important foundation for data mining and machine learning tasks in relational data. Many tools of network analysis, like centrality measures, information ranking, or cluster detection rest on the assumption that links capture direct influence, and that paths represent possible indirect influence. This assumption is invalidated in time-stamped network data capturing, e.g., dynamic social networks, biological sequences or financial transactions. In such data, for two time-stamped links (A,B) and (B,C) the chronological ordering and timing determines whether a causal path from node A via B to C exists. A number of works has shown that for that reason network analysis cannot be directly applied to time-stamped network data. Existing methods to address this issue require statistics on causal paths, which is computationally challenging for big data sets. Addressing this problem, we develop an efficient algorithm to count causal paths in time-stamped network data. Applying it to empirical data, we show that our method is more efficient than a baseline method implemented in an OpenSource data analytics package. Our method works efficiently for different values of the maximum time difference between consecutive links of a causal path and supports streaming scenarios. With it, we are closing a gap that hinders an efficient analysis of big time series data on complex networks.Comment: 10 pages, 2 figure

    Co-Following on Twitter

    Full text link
    We present an in-depth study of co-following on Twitter based on the observation that two Twitter users whose followers have similar friends are also similar, even though they might not share any direct links or a single mutual follower. We show how this observation contributes to (i) a better understanding of language-agnostic user classification on Twitter, (ii) eliciting opportunities for Computational Social Science, and (iii) improving online marketing by identifying cross-selling opportunities. We start with a machine learning problem of predicting a user's preference among two alternative choices of Twitter friends. We show that co-following information provides strong signals for diverse classification tasks and that these signals persist even when (i) the most discriminative features are removed and (ii) only relatively "sparse" users with fewer than 152 but more than 43 Twitter friends are considered. Going beyond mere classification performance optimization, we present applications of our methodology to Computational Social Science. Here we confirm stereotypes such as that the country singer Kenny Chesney (@kennychesney) is more popular among @GOP followers, whereas Lady Gaga (@ladygaga) enjoys more support from @TheDemocrats followers. In the domain of marketing we give evidence that celebrity endorsement is reflected in co-following and we demonstrate how our methodology can be used to reveal the audience similarities between Apple and Puma and, less obviously, between Nike and Coca-Cola. Concerning a user's popularity we find a statistically significant connection between having a more "average" followership and having more followers than direct rivals. Interestingly, a \emph{larger} audience also seems to be linked to a \emph{less diverse} audience in terms of their co-following.Comment: full version of a short paper at Hypertext 201

    Pattern languages in HCI: A critical review

    Get PDF
    This article presents a critical review of patterns and pattern languages in human-computer interaction (HCI). In recent years, patterns and pattern languages have received considerable attention in HCI for their potential as a means for developing and communicating information and knowledge to support good design. This review examines the background to patterns and pattern languages in HCI, and seeks to locate pattern languages in relation to other approaches to interaction design. The review explores four key issues: What is a pattern? What is a pattern language? How are patterns and pattern languages used? and How are values reflected in the pattern-based approach to design? Following on from the review, a future research agenda is proposed for patterns and pattern languages in HCI
    • …
    corecore