Multi-Sorted Inverse Frequent Itemsets Mining: On-Going Research
Inverse frequent itemset mining (IFM) consists of generating artificial transactional databases that reflect the patterns of real ones, in particular by satisfying given frequency constraints on the itemsets. An extension of IFM, called many-sorted IFM, is introduced, in which the schemes of the datasets to be generated are those typical of Big Tables, as required in emerging big data applications, e.g., social network analytics.
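As a rough illustration of the frequency constraints mentioned in this abstract, the following Python sketch checks whether a candidate synthetic transactional database keeps the support of given itemsets within lower and upper bounds. The function names, the constraint format and the toy data are assumptions made for this example and are not taken from the paper; the sketch only verifies a database and does not generate one.

```python
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def support(database: List[Itemset], itemset: Itemset) -> int:
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for transaction in database if itemset <= transaction)


def satisfies_constraints(
    database: List[Itemset],
    constraints: Dict[Itemset, Tuple[int, int]],
) -> bool:
    """Return True if each itemset's support lies within its (min, max) bounds."""
    return all(
        low <= support(database, itemset) <= high
        for itemset, (low, high) in constraints.items()
    )


# Hypothetical synthetic database: each transaction is a set of items.
db = [
    frozenset({"a", "b"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a"}),
    frozenset({"c"}),
]

# Hypothetical frequency constraints: itemset -> (min support, max support).
constraints = {
    frozenset({"a", "b"}): (2, 2),
    frozenset({"c"}): (1, 3),
}

print(satisfies_constraints(db, constraints))
```

Generating a database that meets such constraints is the hard part of IFM; a check like this one would only serve as the acceptance test for a generator.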
A planetary nervous system for social mining and collective awareness
We present a research roadmap of a Planetary Nervous System (PNS), capable of sensing and mining the digital breadcrumbs of human activities and unveiling the knowledge hidden in big data in order to address the big questions about social complexity. We envision the PNS as a globally distributed, self-organizing, techno-social system for answering analytical questions about the status of world-wide society, based on three pillars: social sensing, social mining, and the idea of trust networks and privacy-aware social mining. We discuss the ingredients of a science and a technology necessary to build the PNS upon these three pillars, beyond the limitations of their respective state of the art. Social sensing aims at developing better methods for harvesting big data from the techno-social ecosystem and making them available for mining, learning and analysis at a suitably high abstraction level. Social mining is the problem of discovering patterns and models of human behaviour from the sensed data across the various social dimensions by means of data mining, machine learning and social network analysis. Trusted networks and privacy-aware social mining aim at creating a new deal around the questions of privacy and data ownership, empowering individuals with full awareness of and control over their own personal data, so that users may allow access to and use of their data for their own good and the common good. The PNS will provide a goal-oriented knowledge discovery framework, made of technology and people, able to configure itself to answer questions about the pulse of global society. Given an analytical request, the PNS activates a process composed of a variety of interconnected tasks exploiting the social sensing and mining methods within the transparent ecosystem provided by the trusted network. The PNS we foresee is the key tool for individual and collective awareness in the knowledge society.
We need such a tool so that everyone becomes fully aware of how powerful the knowledge of our society can be when we leverage our wisdom as a crowd, and of how important it is that everybody participates both as a consumer and as a producer of social knowledge, for it to become a trustable, accessible, safe and useful public good.
Seventh Framework Programme (European Commission) (grant agreement No. 284709)
Counting Causal Paths in Big Times Series Data on Networks
Graph or network representations are an important foundation for data mining
and machine learning tasks in relational data. Many tools of network analysis,
like centrality measures, information ranking, or cluster detection rest on the
assumption that links capture direct influence, and that paths represent
possible indirect influence. This assumption is invalidated in time-stamped
network data capturing, e.g., dynamic social networks, biological sequences or
financial transactions. In such data, for two time-stamped links (A,B) and
(B,C) the chronological ordering and timing determine whether a causal path
from node A via B to C exists. A number of works have shown that, for this reason,
network analysis cannot be directly applied to time-stamped network data.
Existing methods to address this issue require statistics on causal paths,
which is computationally challenging for big data sets.
Addressing this problem, we develop an efficient algorithm to count causal
paths in time-stamped network data. Applying it to empirical data, we show that
our method is more efficient than a baseline method implemented in an
open-source data analytics package. Our method works efficiently for different
values of the maximum time difference between consecutive links of a causal
path and supports streaming scenarios. With it, we are closing a gap that
hinders an efficient analysis of big time series data on complex networks.
Comment: 10 pages, 2 figures
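To make the notion of a causal path in this abstract concrete, here is a minimal Python sketch that counts causal paths of length two in a list of time-stamped links. It assumes that links (A,B) at time t1 and (B,C) at time t2 form a causal path when 0 < t2 - t1 <= delta, where delta stands for the maximum time difference between consecutive links. The function name, the parameter name `delta` and the toy data are hypothetical; this is a plain illustration of the definition, not the efficient algorithm developed in the paper.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def count_causal_paths_len2(edges, delta):
    """Count causal paths a -> b -> c with 0 < t2 - t1 <= delta.

    `edges` is a list of (source, target, timestamp) tuples.
    """
    # Group outgoing links by source node and sort them by timestamp.
    out_links = defaultdict(list)
    for source, target, t in edges:
        out_links[source].append((t, target))
    for links in out_links.values():
        links.sort(key=lambda link: link[0])

    # Keep the sorted timestamps separately so they can be binary-searched.
    times = {node: [t for t, _ in links] for node, links in out_links.items()}

    counts = Counter()
    for a, b, t1 in edges:
        if b not in out_links:
            continue
        # Links leaving b strictly after t1 and no later than t1 + delta.
        first = bisect_right(times[b], t1)
        last = bisect_right(times[b], t1 + delta)
        for _, c in out_links[b][first:last]:
            counts[(a, b, c)] += 1
    return counts


# Hypothetical toy data: (source, target, timestamp).
edges = [
    ("A", "B", 1),
    ("B", "C", 2),
    ("B", "C", 5),
    ("B", "D", 3),
    ("C", "A", 3),
]

paths = count_causal_paths_len2(edges, delta=2)
for path, n in sorted(paths.items()):
    print(path, n)
```

Raising `delta` admits more link pairs: with `delta=4`, the link (B,C) at time 5 also continues the link (A,B) at time 1, so the path A, B, C is counted twice.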
Co-Following on Twitter
We present an in-depth study of co-following on Twitter based on the
observation that two Twitter users whose followers have similar friends are
also similar, even though they might not share any direct links or a single
mutual follower. We show how this observation contributes to (i) a better
understanding of language-agnostic user classification on Twitter, (ii)
eliciting opportunities for Computational Social Science, and (iii) improving
online marketing by identifying cross-selling opportunities.
We start with a machine learning problem of predicting a user's preference
among two alternative choices of Twitter friends. We show that co-following
information provides strong signals for diverse classification tasks and that
these signals persist even when (i) the most discriminative features are
removed and (ii) only relatively "sparse" users with fewer than 152 but more
than 43 Twitter friends are considered.
Going beyond mere classification performance optimization, we present
applications of our methodology to Computational Social Science. Here we
confirm stereotypes such as that the country singer Kenny Chesney
(@kennychesney) is more popular among @GOP followers, whereas Lady Gaga
(@ladygaga) enjoys more support from @TheDemocrats followers.
In the domain of marketing we give evidence that celebrity endorsement is
reflected in co-following and we demonstrate how our methodology can be used to
reveal the audience similarities between Apple and Puma and, less obviously,
between Nike and Coca-Cola. Concerning a user's popularity we find a
statistically significant connection between having a more "average"
followership and having more followers than direct rivals. Interestingly, a
larger audience also seems to be linked to a less diverse
audience in terms of their co-following.
Comment: full version of a short paper at Hypertext 201
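The core observation of this abstract, that two accounts are similar when their followers have similar friends, can be sketched in Python as follows. Each account gets a profile that counts which other accounts its followers follow, and two profiles are compared with cosine similarity. The account names, the helper functions and the choice of cosine similarity are assumptions for this illustration; the paper's actual features and classifiers are not reproduced here.

```python
import math
from collections import Counter


def cofollowing_profile(account, followers_of, friends_of):
    """Count the friends of `account`'s followers, excluding `account` itself."""
    profile = Counter()
    for follower in followers_of.get(account, ()):
        for friend in friends_of.get(follower, ()):
            if friend != account:
                profile[friend] += 1
    return profile


def cosine_similarity(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[key] * q[key] for key in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


# Hypothetical toy graph: user -> set of accounts the user follows ("friends").
friends_of = {
    "u1": {"x", "m", "n"},
    "u2": {"x", "m"},
    "u3": {"y", "m", "n"},
    "u4": {"z", "k"},
}

# Invert the graph: account -> set of users who follow it.
followers_of = {}
for user, friends in friends_of.items():
    for friend in friends:
        followers_of.setdefault(friend, set()).add(user)

profile_x = cofollowing_profile("x", followers_of, friends_of)
profile_y = cofollowing_profile("y", followers_of, friends_of)
profile_z = cofollowing_profile("z", followers_of, friends_of)

print(cosine_similarity(profile_x, profile_y))
print(cosine_similarity(profile_x, profile_z))
```

In this toy graph the accounts `x` and `y` share no follower, yet their profiles are close because their followers also follow `m` and `n`; `z` has an unrelated audience and scores zero against `x`.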
Pattern languages in HCI: A critical review
This article presents a critical review of patterns and pattern languages in human-computer interaction (HCI). In recent years, patterns and pattern languages have received considerable attention in HCI for their potential as a means of developing and communicating information and knowledge to support good design. This review examines the background to patterns and pattern languages in HCI, and seeks to locate pattern languages in relation to other approaches to interaction design. The review explores four key issues: What is a pattern? What is a pattern language? How are patterns and pattern languages used? How are values reflected in the pattern-based approach to design? Following on from the review, a future research agenda is proposed for patterns and pattern languages in HCI.