192 research outputs found

    Recommender Systems for Online and Mobile Social Networks: A survey

    Full text link
    Recommender Systems (RS) currently represent a fundamental tool in online services, especially with the advent of Online Social Networks (OSN). In this case, users generate huge amounts of contents and they can be quickly overloaded by useless information. At the same time, social media represent an important source of information to characterize contents and users' interests. RS can exploit this information to further personalize suggestions and improve the recommendation process. In this paper we present a survey of Recommender Systems designed and implemented for Online and Mobile Social Networks, highlighting how the use of social context information improves the recommendation task, and how standard algorithms must be enhanced and optimized to run in a fully distributed environment, as opportunistic networks. We describe advantages and drawbacks of these systems in terms of algorithms, target domains, evaluation metrics and performance evaluations. Eventually, we present some open research challenges in this area

    Analyzing Granger causality in climate data with time series classification methods

    Get PDF
    Attribution studies in climate science aim for scientifically ascertaining the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested

    Advances in knowledge discovery and data mining Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II</p

    Fast Similarity Graph Construction via Data Sketching Techniques

    Get PDF
    Graphs are mathematical structures used to model objects and their pairwise relationships. Due to their simple but expressive abstract representation, they are commonly used to model various types of relations and processes in technological, social or biological systems and have found numerous applications. A special type of graph is the similarity graph in which nodes represent entities and there is an edge connecting two nodes if the two entities are similar based on some similarity measure. In a typical scenario, raw data of entities are provided in the form of a relational dataset, matrix or a tensor and a similarity graph is built to facilitate graph-based analysis like node importance, node classification, link prediction, community detection, outlier detection, and more. The ability to construct similarity graphs fast is important and with a potential for high impact, thus several approximation techniques have been proposed. In this work, we propose data sketching based methods for fast approximate similarity graph construction. Data sketching techniques are applied on the raw data and are designed to achieve desired error guarantees. They can drastically reduce the size of raw data on which we operate, allowing for faster construction and analysis of similarity graphs, but with approximate results. This is a desirable tradeoff for many applications in diverse domains. Through a thorough experimental evaluation, we demonstrate that our sketching methods outperform sensible baselines and competitor methods proposed for the problem. First, they are much faster than exact methods while maintaining high accuracy in constructing the similarity graph. Furthermore, our methods demonstrate significantly higher accuracy than competitive methods on generic graph analysis tasks. We demonstrate the effectiveness of our methods on different real-world graph applications

    Towards Predictive Rendering in Virtual Reality

    Get PDF
    The strive for generating predictive images, i.e., images representing radiometrically correct renditions of reality, has been a longstanding problem in computer graphics. The exactness of such images is extremely important for Virtual Reality applications like Virtual Prototyping, where users need to make decisions impacting large investments based on the simulated images. Unfortunately, generation of predictive imagery is still an unsolved problem due to manifold reasons, especially if real-time restrictions apply. First, existing scenes used for rendering are not modeled accurately enough to create predictive images. Second, even with huge computational efforts existing rendering algorithms are not able to produce radiometrically correct images. Third, current display devices need to convert rendered images into some low-dimensional color space, which prohibits display of radiometrically correct images. Overcoming these limitations is the focus of current state-of-the-art research. This thesis also contributes to this task. First, it briefly introduces the necessary background and identifies the steps required for real-time predictive image generation. Then, existing techniques targeting these steps are presented and their limitations are pointed out. To solve some of the remaining problems, novel techniques are proposed. They cover various steps in the predictive image generation process, ranging from accurate scene modeling over efficient data representation to high-quality, real-time rendering. A special focus of this thesis lays on real-time generation of predictive images using bidirectional texture functions (BTFs), i.e., very accurate representations for spatially varying surface materials. The techniques proposed by this thesis enable efficient handling of BTFs by compressing the huge amount of data contained in this material representation, applying them to geometric surfaces using texture and BTF synthesis techniques, and rendering BTF covered objects in real-time. Further approaches proposed in this thesis target inclusion of real-time global illumination effects or more efficient rendering using novel level-of-detail representations for geometric objects. Finally, this thesis assesses the rendering quality achievable with BTF materials, indicating a significant increase in realism but also confirming the remainder of problems to be solved to achieve truly predictive image generation

    Enhancing explainability and scrutability of recommender systems

    Get PDF
    Our increasing reliance on complex algorithms for recommendations calls for models and methods for explainable, scrutable, and trustworthy AI. While explainability is required for understanding the relationships between model inputs and outputs, a scrutable system allows us to modify its behavior as desired. These properties help bridge the gap between our expectations and the algorithm’s behavior and accordingly boost our trust in AI. Aiming to cope with information overload, recommender systems play a crucial role in filtering content (such as products, news, songs, and movies) and shaping a personalized experience for their users. Consequently, there has been a growing demand from the information consumers to receive proper explanations for their personalized recommendations. These explanations aim at helping users understand why certain items are recommended to them and how their previous inputs to the system relate to the generation of such recommendations. Besides, in the event of receiving undesirable content, explanations could possibly contain valuable information as to how the system’s behavior can be modified accordingly. In this thesis, we present our contributions towards explainability and scrutability of recommender systems: • We introduce a user-centric framework, FAIRY, for discovering and ranking post-hoc explanations for the social feeds generated by black-box platforms. These explanations reveal relationships between users’ profiles and their feed items and are extracted from the local interaction graphs of users. FAIRY employs a learning-to-rank (LTR) method to score candidate explanations based on their relevance and surprisal. • We propose a method, PRINCE, to facilitate provider-side explainability in graph-based recommender systems that use personalized PageRank at their core. PRINCE explanations are comprehensible for users, because they present subsets of the user’s prior actions responsible for the received recommendations. PRINCE operates in a counterfactual setup and builds on a polynomial-time algorithm for finding the smallest counterfactual explanations. • We propose a human-in-the-loop framework, ELIXIR, for enhancing scrutability and subsequently the recommendation models by leveraging user feedback on explanations. ELIXIR enables recommender systems to collect user feedback on pairs of recommendations and explanations. The feedback is incorporated into the model by imposing a soft constraint for learning user-specific item representations. We evaluate all proposed models and methods with real user studies and demonstrate their benefits at achieving explainability and scrutability in recommender systems.Unsere zunehmende Abhängigkeit von komplexen Algorithmen für maschinelle Empfehlungen erfordert Modelle und Methoden für erklärbare, nachvollziehbare und vertrauenswürdige KI. Zum Verstehen der Beziehungen zwischen Modellein- und ausgaben muss KI erklärbar sein. Möchten wir das Verhalten des Systems hingegen nach unseren Vorstellungen ändern, muss dessen Entscheidungsprozess nachvollziehbar sein. Erklärbarkeit und Nachvollziehbarkeit von KI helfen uns dabei, die Lücke zwischen dem von uns erwarteten und dem tatsächlichen Verhalten der Algorithmen zu schließen und unser Vertrauen in KI-Systeme entsprechend zu stärken. Um ein Übermaß an Informationen zu verhindern, spielen Empfehlungsdienste eine entscheidende Rolle um Inhalte (z.B. Produkten, Nachrichten, Musik und Filmen) zu filtern und deren Benutzern eine personalisierte Erfahrung zu bieten. Infolgedessen erheben immer mehr In- formationskonsumenten Anspruch auf angemessene Erklärungen für deren personalisierte Empfehlungen. Diese Erklärungen sollen den Benutzern helfen zu verstehen, warum ihnen bestimmte Dinge empfohlen wurden und wie sich ihre früheren Eingaben in das System auf die Generierung solcher Empfehlungen auswirken. Außerdem können Erklärungen für den Fall, dass unerwünschte Inhalte empfohlen werden, wertvolle Informationen darüber enthalten, wie das Verhalten des Systems entsprechend geändert werden kann. In dieser Dissertation stellen wir unsere Beiträge zu Erklärbarkeit und Nachvollziehbarkeit von Empfehlungsdiensten vor. • Mit FAIRY stellen wir ein benutzerzentriertes Framework vor, mit dem post-hoc Erklärungen für die von Black-Box-Plattformen generierten sozialen Feeds entdeckt und bewertet werden können. Diese Erklärungen zeigen Beziehungen zwischen Benutzerprofilen und deren Feeds auf und werden aus den lokalen Interaktionsgraphen der Benutzer extrahiert. FAIRY verwendet eine LTR-Methode (Learning-to-Rank), um die Erklärungen anhand ihrer Relevanz und ihres Grads unerwarteter Empfehlungen zu bewerten. • Mit der PRINCE-Methode erleichtern wir das anbieterseitige Generieren von Erklärungen für PageRank-basierte Empfehlungsdienste. PRINCE-Erklärungen sind für Benutzer verständlich, da sie Teilmengen früherer Nutzerinteraktionen darstellen, die für die erhaltenen Empfehlungen verantwortlich sind. PRINCE-Erklärungen sind somit kausaler Natur und werden von einem Algorithmus mit polynomieller Laufzeit erzeugt , um präzise Erklärungen zu finden. • Wir präsentieren ein Human-in-the-Loop-Framework, ELIXIR, um die Nachvollziehbarkeit der Empfehlungsmodelle und die Qualität der Empfehlungen zu verbessern. Mit ELIXIR können Empfehlungsdienste Benutzerfeedback zu Empfehlungen und Erklärungen sammeln. Das Feedback wird in das Modell einbezogen, indem benutzerspezifischer Einbettungen von Objekten gelernt werden. Wir evaluieren alle Modelle und Methoden in Benutzerstudien und demonstrieren ihren Nutzen hinsichtlich Erklärbarkeit und Nachvollziehbarkeit von Empfehlungsdiensten

    Temporal graph mining and distributed processing

    Get PDF
    With the recent growth of social media platforms and the human desire to interact with the digital world a lot of human-human and human-device interaction data is getting generated every second. With the boom of the Internet of Things (IoT) devices, a lot of device-device interactions are also now on the rise. All these interactions are nothing but a representation of how the underlying network is connecting different entities over time. These interactions when modeled as an interaction network presents a lot of unique opportunities to uncover interesting patterns and to understand the dynamics of the network. Understanding the dynamics of the network is very important because it encapsulates the way we communicate, socialize, consume information and get influenced. To this end, in this PhD thesis, we focus on analyzing an interaction network to understand how the underlying network is being used. We define interaction network as a sequence of time-stamped interactions E over edges of a static graph G=(V, E). Interaction networks can be used to model many real-world networks for example, in a social network or a communication network, each interaction over an edge represents an interaction between two users, e.g., emailing, making a call, re-tweeting, or in case of the financial network an interaction between two accounts to represent a transaction. We analyze interaction network under two settings. In the first setting, we study interaction network under a sliding window model. We assume a node could pass information to other nodes if they are connected to them using edges present in a time window. In this model, we study how the importance or centrality of a node evolves over time. In the second setting, we put additional constraints on how information flows between nodes. We assume a node could pass information to other nodes only if there is a temporal path between them. To restrict the length of the temporal paths we consider a time window in this approach as well. We apply this model to solve the time-constrained influence maximization problem. By analyzing the interaction network data under our model we find the top-k most influential nodes. We test our model both on human-human interaction using social network data as well as on location-location interaction using location-based social network(LBSNs) data. In the same setting, we also mine temporal cyclic paths to understand the communication patterns in a network. Temporal cycles have many applications and appear naturally in communication networks where one person posts a message and after a while reacts to a thread of reactions from peers on the post. In financial networks, on the other hand, the presence of a temporal cycle could be indicative of certain types of fraud. We provide efficient algorithms for all our analysis and test their efficiency and effectiveness on real-world data. Finally, given that many of the algorithms we study have huge computational demands, we also studied distributed graph processing algorithms. An important aspect of distributed graph processing is to correctly partition the graph data between different machine. A lot of research has been done on efficient graph partitioning strategies but there is no one good partitioning strategy for all kind of graphs and algorithms. Choosing the best partitioning strategy is nontrivial and is mostly a trial and error exercise. To address this problem we provide a cost model based approach to give a better understanding of how a given partitioning strategy is performing for a given graph and algorithm.Con el reciente crecimiento de las redes sociales y el deseo humano de interactuar con el mundo digital, una gran cantidad de datos de interacción humano-a-humano o humano-a-dispositivo se generan cada segundo. Con el auge de los dispositivos IoT, las interacciones dispositivo-a-dispositivo también están en alza. Todas estas interacciones no son más que una representación de como la red subyacente conecta distintas entidades en el tiempo. Modelar estas interacciones en forma de red de interacciones presenta una gran cantidad de oportunidades únicas para descubrir patrones interesantes y entender la dinamicidad de la red. Entender la dinamicidad de la red es clave ya que encapsula la forma en la que nos comunicamos, socializamos, consumimos información y somos influenciados. Para ello, en esta tesis doctoral, nos centramos en analizar una red de interacciones para entender como la red subyacente es usada. Definimos una red de interacciones como una sequencia de interacciones grabadas en el tiempo E sobre aristas de un grafo estático G=(V, E). Las redes de interacción se pueden usar para modelar gran cantidad de aplicaciones reales, por ejemplo en una red social o de comunicaciones cada interacción sobre una arista representa una interacción entre dos usuarios (correo electrónico, llamada, retweet), o en el caso de una red financiera una interacción entre dos cuentas para representar una transacción. Analizamos las redes de interacción bajo múltiples escenarios. En el primero, estudiamos las redes de interacción bajo un modelo de ventana deslizante. Asumimos que un nodo puede mandar información a otros nodos si estan conectados utilizando aristas presentes en una ventana temporal. En este modelo, estudiamos como la importancia o centralidad de un nodo evoluciona en el tiempo. En el segundo escenario añadimos restricciones adicionales respecto como la información fluye entre nodos. Asumimos que un nodo puede mandar información a otros nodos solo si existe un camino temporal entre ellos. Para restringir la longitud de los caminos temporales también asumimos una ventana temporal. Aplicamos este modelo para resolver este problema de maximización de influencia restringido temporalmente. Analizando los datos de la red de interacción bajo nuestro modelo intentamos descubrir los k nodos más influyentes. Examinamos nuestro modelo en interacciones humano-a-humano, usando datos de redes sociales, como en ubicación-a-ubicación usando datos de redes sociales basades en localización (LBSNs). En el mismo escenario también minamos camínos cíclicos temporales para entender los patrones de comunicación en una red. Existen múltiples aplicaciones para cíclos temporales y aparecen naturalmente en redes de comunicación donde una persona envía un mensaje y después de un tiempo reacciona a una cadena de reacciones de compañeros en el mensaje. En redes financieras, por otro lado, la presencia de un ciclo temporal puede indicar ciertos tipos de fraude. Proponemos algoritmos eficientes para todos nuestros análisis y evaluamos su eficiencia y efectividad en datos reales. Finalmente, dado que muchos de los algoritmos estudiados tienen una gran demanda computacional, también estudiamos los algoritmos de procesado distribuido de grafos. Un aspecto importante de procesado distribuido de grafos es el de correctamente particionar los datos del grafo entre distintas máquinas. Gran cantidad de investigación se ha realizado en estrategias para particionar eficientemente un grafo, pero no existe un particionamento bueno para todos los tipos de grafos y algoritmos. Escoger la mejor estrategia de partición no es trivial y es mayoritariamente un ejercicio de prueba y error. Con tal de abordar este problema, proporcionamos un modelo de costes para dar un mejor entendimiento en como una estrategia de particionamiento actúa dado un grafo y un algoritmo

    Temporal graph mining and distributed processing

    Get PDF
    Cotutela Universitat Politècnica de Catalunya i Université Libre de BruxellesWith the recent growth of social media platforms and the human desire to interact with the digital world a lot of human-human and human-device interaction data is getting generated every second. With the boom of the Internet of Things (IoT) devices, a lot of device-device interactions are also now on the rise. All these interactions are nothing but a representation of how the underlying network is connecting different entities over time. These interactions when modeled as an interaction network presents a lot of unique opportunities to uncover interesting patterns and to understand the dynamics of the network. Understanding the dynamics of the network is very important because it encapsulates the way we communicate, socialize, consume information and get influenced. To this end, in this PhD thesis, we focus on analyzing an interaction network to understand how the underlying network is being used. We define interaction network as a sequence of time-stamped interactions E over edges of a static graph G=(V, E). Interaction networks can be used to model many real-world networks for example, in a social network or a communication network, each interaction over an edge represents an interaction between two users, e.g., emailing, making a call, re-tweeting, or in case of the financial network an interaction between two accounts to represent a transaction. We analyze interaction network under two settings. In the first setting, we study interaction network under a sliding window model. We assume a node could pass information to other nodes if they are connected to them using edges present in a time window. In this model, we study how the importance or centrality of a node evolves over time. In the second setting, we put additional constraints on how information flows between nodes. We assume a node could pass information to other nodes only if there is a temporal path between them. To restrict the length of the temporal paths we consider a time window in this approach as well. We apply this model to solve the time-constrained influence maximization problem. By analyzing the interaction network data under our model we find the top-k most influential nodes. We test our model both on human-human interaction using social network data as well as on location-location interaction using location-based social network(LBSNs) data. In the same setting, we also mine temporal cyclic paths to understand the communication patterns in a network. Temporal cycles have many applications and appear naturally in communication networks where one person posts a message and after a while reacts to a thread of reactions from peers on the post. In financial networks, on the other hand, the presence of a temporal cycle could be indicative of certain types of fraud. We provide efficient algorithms for all our analysis and test their efficiency and effectiveness on real-world data. Finally, given that many of the algorithms we study have huge computational demands, we also studied distributed graph processing algorithms. An important aspect of distributed graph processing is to correctly partition the graph data between different machine. A lot of research has been done on efficient graph partitioning strategies but there is no one good partitioning strategy for all kind of graphs and algorithms. Choosing the best partitioning strategy is nontrivial and is mostly a trial and error exercise. To address this problem we provide a cost model based approach to give a better understanding of how a given partitioning strategy is performing for a given graph and algorithm.Con el reciente crecimiento de las redes sociales y el deseo humano de interactuar con el mundo digital, una gran cantidad de datos de interacción humano-a-humano o humano-a-dispositivo se generan cada segundo. Con el auge de los dispositivos IoT, las interacciones dispositivo-a-dispositivo también están en alza. Todas estas interacciones no son más que una representación de como la red subyacente conecta distintas entidades en el tiempo. Modelar estas interacciones en forma de red de interacciones presenta una gran cantidad de oportunidades únicas para descubrir patrones interesantes y entender la dinamicidad de la red. Entender la dinamicidad de la red es clave ya que encapsula la forma en la que nos comunicamos, socializamos, consumimos información y somos influenciados. Para ello, en esta tesis doctoral, nos centramos en analizar una red de interacciones para entender como la red subyacente es usada. Definimos una red de interacciones como una sequencia de interacciones grabadas en el tiempo E sobre aristas de un grafo estático G=(V, E). Las redes de interacción se pueden usar para modelar gran cantidad de aplicaciones reales, por ejemplo en una red social o de comunicaciones cada interacción sobre una arista representa una interacción entre dos usuarios (correo electrónico, llamada, retweet), o en el caso de una red financiera una interacción entre dos cuentas para representar una transacción. Analizamos las redes de interacción bajo múltiples escenarios. En el primero, estudiamos las redes de interacción bajo un modelo de ventana deslizante. Asumimos que un nodo puede mandar información a otros nodos si estan conectados utilizando aristas presentes en una ventana temporal. En este modelo, estudiamos como la importancia o centralidad de un nodo evoluciona en el tiempo. En el segundo escenario añadimos restricciones adicionales respecto como la información fluye entre nodos. Asumimos que un nodo puede mandar información a otros nodos solo si existe un camino temporal entre ellos. Para restringir la longitud de los caminos temporales también asumimos una ventana temporal. Aplicamos este modelo para resolver este problema de maximización de influencia restringido temporalmente. Analizando los datos de la red de interacción bajo nuestro modelo intentamos descubrir los k nodos más influyentes. Examinamos nuestro modelo en interacciones humano-a-humano, usando datos de redes sociales, como en ubicación-a-ubicación usando datos de redes sociales basades en localización (LBSNs). En el mismo escenario también minamos camínos cíclicos temporales para entender los patrones de comunicación en una red. Existen múltiples aplicaciones para cíclos temporales y aparecen naturalmente en redes de comunicación donde una persona envía un mensaje y después de un tiempo reacciona a una cadena de reacciones de compañeros en el mensaje. En redes financieras, por otro lado, la presencia de un ciclo temporal puede indicar ciertos tipos de fraude. Proponemos algoritmos eficientes para todos nuestros análisis y evaluamos su eficiencia y efectividad en datos reales. Finalmente, dado que muchos de los algoritmos estudiados tienen una gran demanda computacional, también estudiamos los algoritmos de procesado distribuido de grafos. Un aspecto importante de procesado distribuido de grafos es el de correctamente particionar los datos del grafo entre distintas máquinas. Gran cantidad de investigación se ha realizado en estrategias para particionar eficientemente un grafo, pero no existe un particionamento bueno para todos los tipos de grafos y algoritmos. Escoger la mejor estrategia de partición no es trivial y es mayoritariamente un ejercicio de prueba y error. Con tal de abordar este problema, proporcionamos un modelo de costes para dar un mejor entendimiento en como una estrategia de particionamiento actúa dado un grafo y un algoritmo.Postprint (published version
    corecore