Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed.
A multi-class approach for ranking graph nodes: models and experiments with incomplete data
After the phenomenal success of the PageRank algorithm, many researchers have
extended the PageRank approach to ranking graphs with richer structures besides
the simple linkage structure. In some scenarios we have to deal with
multi-parameter data where each node has additional features and there are
relationships between such features.
This paper stems from the need for a systematic approach when dealing with
multi-parameter data. We propose models and ranking algorithms which can be
used with minor adjustments for a large variety of networks (bibliographic
data, patent data, Twitter and social data, healthcare data). In this paper we
focus on several aspects which have not been addressed in the literature: (1)
we propose different models for ranking multi-parameter data and a class of
numerical algorithms for efficiently computing the ranking score of such
models, (2) by analyzing the stability and convergence properties of the
numerical schemes we tune a fast and stable technique for the ranking problem,
(3) we consider the issue of the robustness of our models when data are
incomplete. The comparison of the rank on the incomplete data with the rank on
the full structure shows that our models compute consistent rankings whose
correlation is up to 60% when just 10% of the links of the attributes are
maintained, suggesting that our model is suitable even when the data are
incomplete.
Visual analysis of sensor logs in smart spaces: Activities vs. situations
Models of human habits in smart spaces can be expressed using a multitude of representations whose readability influences whether they can be validated by human experts. Our research focuses on developing a visual analysis pipeline (service) that, starting from the sensor log of a smart space, graphically visualizes human habits. The basic assumption is to apply techniques borrowed from the area of business process automation and mining to a version of the sensor log preprocessed in order to translate raw sensor measurements into human actions. The proposed pipeline is employed to automatically extract models to be reused for ambient intelligence. In this paper, we present a user evaluation aimed at demonstrating the effectiveness of the approach by comparing it with a relevant state-of-the-art visual tool, namely SITUVIS.
User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy
Recommender systems have become an integral part of many social networks and
extract knowledge from a user's personal and sensitive data both explicitly,
with the user's knowledge, and implicitly. This trend has created major privacy
concerns as users are mostly unaware of what data and how much data is being
used and how securely it is used. In this context, several works have sought
to address privacy concerns about data usage in online social networks and by
recommender systems. This paper surveys the main privacy concerns, measurements
and privacy-preserving techniques used in large-scale online social networks
and recommender systems. It draws on historical work on security,
privacy preservation, statistical modeling, and datasets to provide an overview
of the technical difficulties and problems associated with privacy preservation
in online social networks.
Comment: 26 pages, IET book chapter on big data recommender system
On the role of pre and post-processing in environmental data mining
The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data often contain noise, uncertainty, errors, redundancies, or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions based on KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved and discusses the role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems.