416 research outputs found

    Providing awareness, explanation and control of personalized stream filtering in a P2P social network

    Get PDF
    In Online Social Networks (OSNs), users are often overwhelmed with a huge amount of social data, most of which are irrelevant to their interest. Filtering of the social data stream is the common way to deal with this problem, and it has already been applied by OSNs, such as Facebook and Google+. Unfortunately, personalized filtering leads to “the filter bubble” problem where the user is trapped inside a world within the limited boundaries of her interests and cannot be exposed to any surprising, desirable information. Moreover, these OSNs are black boxes, providing no transparency for the user about how the filtering mechanism decides what is to be shown in the activity stream. As a result, the user trust in the system can decline. This thesis presents an interactive method to visualize the personalized stream filtering in OSNs. The proposed visualization helps to create awareness, explanation, and control of personalized stream filtering to alleviate “the filter bubble” problem and increase the users’ trust in the system. The visualization is implemented in MADMICA – a new privacy-aware decentralized OSN, based on the Friendica P2P protocol, which filters the social updates stream of users based on their interests. The results of three user evaluations are presented in this thesis: small-scale pilot study, qualitative study and large-scale quantitative study with 326 participants. The results of the small-scale study show that the filter bubble visualization makes the users aware of the filtering mechanism, engages them in actions to correct and change it, and as a result, increases the users’ trust in the system. The qualitative study reveals a generally higher proportion of desirable user perceptions for the awareness, explanation and control of the filter bubble provided by the visualization. Moreover, the results of the quantitative study demonstrate that the visualization leads to increased users’ awareness of the filter bubble, understandability of the filtering mechanism and to a feeling of control over the data stream they are seeing

    Providing awareness, explanation and control of personalized filtering in a social networking site

    Get PDF
    Social networking sites (SNSs) have applied personalized filtering to deal with overwhelmingly irrelevant social data. However, due to the focus of accuracy, the personalized filtering often leads to “the filter bubble” problem where the users can only receive information that matches their pre-stated preferences but fail to be exposed to new topics. Moreover, these SNSs are black boxes, providing no transparency for the user about how the filtering mechanism decides what is to be shown in the activity stream. As a result, the user’s usage experience and trust in the system can decline. This paper presents an interactive method to visualize the personalized filtering in SNSs. The proposed visualization helps to create awareness, explanation, and control of personalized filtering to alleviate the “filter bubble” problem and increase the users’ trust in the system. Three user evaluations are presented. The results show that users have a good understanding about the filter bubble visualization, and the visualization can increase users’ awareness of the filter bubble, understandability of the filtering mechanism and to a feeling of control over the data stream they are seeing. The intuitiveness of the design is overall good, but a context sensitive help is also preferred. Moreover, the visualization can provide users with better usage experience and increase users’ trust in the system

    Load shedding in network monitoring applications

    Get PDF
    Monitoring and mining real-time network data streams are crucial operations for managing and operating data networks. The information that network operators desire to extract from the network traffic is of different size, granularity and accuracy depending on the measurement task (e.g., relevant data for capacity planning and intrusion detection are very different). To satisfy these different demands, a new class of monitoring systems is emerging to handle multiple and arbitrary monitoring applications. Such systems must inevitably cope with the effects of continuous overload situations due to the large volumes, high data rates and bursty nature of the network traffic. These overload situations can severely compromise the accuracy and effectiveness of monitoring systems, when their results are most valuable to network operators. In this thesis, we propose a technique called load shedding as an effective and low-cost alternative to over-provisioning in network monitoring systems. It allows these systems to handle efficiently overload situations in the presence of multiple, arbitrary and competing monitoring applications. We present the design and evaluation of a predictive load shedding scheme that can shed excess load in front of extreme traffic conditions and maintain the accuracy of the monitoring applications within bounds defined by end users, while assuring a fair allocation of computing resources to non-cooperative applications. The main novelty of our scheme is that it considers monitoring applications as black boxes, with arbitrary (and highly variable) input traffic and processing cost. Without any explicit knowledge of the application internals, the proposed scheme extracts a set of features from the traffic streams to build an on-line prediction model of the resource requirements of each monitoring application, which is used to anticipate overload situations and control the overall resource usage by sampling the input packet streams. This way, the monitoring system preserves a high degree of flexibility, increasing the range of applications and network scenarios where it can be used. Since not all monitoring applications are robust against sampling, we then extend our load shedding scheme to support custom load shedding methods defined by end users, in order to provide a generic solution for arbitrary monitoring applications. Our scheme allows the monitoring system to safely delegate the task of shedding excess load to the applications and still guarantee fairness of service with non-cooperative users. We implemented our load shedding scheme in an existing network monitoring system and deployed it in a research ISP network. We present experimental evidence of the performance and robustness of our system with several concurrent monitoring applications during long-lived executions and using real-world traffic traces.Postprint (published version

    AI Applications to Power Systems

    Get PDF
    Today, the flow of electricity is bidirectional, and not all electricity is centrally produced in large power plants. With the growing emergence of prosumers and microgrids, the amount of electricity produced by sources other than large, traditional power plants is ever-increasing. These alternative sources include photovoltaic (PV), wind turbine (WT), geothermal, and biomass renewable generation plants. Some renewable energy resources (solar PV and wind turbine generation) are highly dependent on natural processes and parameters (wind speed, wind direction, temperature, solar irradiation, humidity, etc.). Thus, the outputs are so stochastic in nature. New data-science-inspired real-time solutions are needed in order to co-develop digital twins of large intermittent renewable plants whose services can be globally delivered

    A Critical Look at Decentralized Personal Data Architectures

    Full text link
    While the Internet was conceived as a decentralized network, the most widely used web applications today tend toward centralization. Control increasingly rests with centralized service providers who, as a consequence, have also amassed unprecedented amounts of data about the behaviors and personalities of individuals. Developers, regulators, and consumer advocates have looked to alternative decentralized architectures as the natural response to threats posed by these centralized services. The result has been a great variety of solutions that include personal data stores (PDS), infomediaries, Vendor Relationship Management (VRM) systems, and federated and distributed social networks. And yet, for all these efforts, decentralized personal data architectures have seen little adoption. This position paper attempts to account for these failures, challenging the accepted wisdom in the web community on the feasibility and desirability of these approaches. We start with a historical discussion of the development of various categories of decentralized personal data architectures. Then we survey the main ideas to illustrate the common themes among these efforts. We tease apart the design characteristics of these systems from the social values that they (are intended to) promote. We use this understanding to point out numerous drawbacks of the decentralization paradigm, some inherent and others incidental. We end with recommendations for designers of these systems for working towards goals that are achievable, but perhaps more limited in scope and ambition

    Efficient algorithms for passive network measurement

    Get PDF
    Network monitoring has become a necessity to aid in the management and operation of large networks. Passive network monitoring consists of extracting metrics (or any information of interest) by analyzing the traffic that traverses one or more network links. Extracting information from a high-speed network link is challenging, given the great data volumes and short packet inter-arrival times. These difficulties can be alleviated by using extremely efficient algorithms or by sampling the incoming traffic. This work improves the state of the art in both these approaches. For one-way packet delay measurement, we propose a series of improvements over a recently appeared technique called Lossy Difference Aggregator. A main limitation of this technique is that it does not provide per-flow measurements. We propose a data structure called Lossy Difference Sketch that is capable of providing such per-flow delay measurements, and, unlike recent related works, does not rely on any model of packet delays. In the problem of collecting measurements under the sliding window model, we focus on the estimation of the number of active flows and in traffic filtering. Using a common approach, we propose one algorithm for each problem that obtains great accuracy with significant resource savings. In the traffic sampling area, the selection of the sampling rate is a crucial aspect. The most sensible approach involves dynamically adjusting sampling rates according to network traffic conditions, which is known as adaptive sampling. We propose an algorithm called Cuckoo Sampling that can operate with a fixed memory budget and perform adaptive flow-wise packet sampling. It is based on a very simple data structure and is computationally extremely lightweight. The techniques presented in this work are thoroughly evaluated through a combination of theoretical and experimental analysis.Postprint (published version

    Presenting tiered recommendations in social activity streams

    Get PDF
    Modern social networking sites offer node-centralized streams that display recent updates from the other nodes in one's network. While such social activity streams are convenient features that help alleviate information overload, they can often become overwhelming themselves, especially high-throughput streams like Twitter’s home timelines. In these cases, recommender systems can help guide users toward the content they will find most important or interesting. However, current efforts to manipulate social activity streams involve hiding updates predicted to be less engaging or reordering them to place new or more engaging content first. These modifications can lead to decreased trust in the system and an inability to consume each update in its chronological context. Instead, I propose a three-tiered approach to displaying recommendations in social activity streams that hides nothing and preserves original context by highlighting updates predicted to be most important and de-emphasizing updates predicted to be least important. This presentation design allows users easily to consume different levels of recommended items chronologically, is able to persuade users to agree with its positive recommendations more than 25% more often than the baseline, and shows no significant loss of perceived accuracy or trust when compared with a filtered stream, possibly even performing better when extreme recommendation errors are intentionally introduced. Numerous directions for future research follow from this work that can shed light on how users react to different recommendation presentation designs and explain how study of an emphasis-based approach might help improve the state of the art

    Contributions to security and privacy protection in recommendation systems

    Get PDF
    A recommender system is an automatic system that, given a customer model and a set of available documents, is able to select and offer those documents that are more interesting to the customer. From the point of view of security, there are two main issues that recommender systems must face: protection of the users' privacy and protection of other participants of the recommendation process. Recommenders issue personalized recommendations taking into account not only the profile of the documents, but also the private information that customers send to the recommender. Hence, the users' profiles include personal and highly sensitive information, such as their likes and dislikes. In order to have a really useful recommender system and improve its efficiency, we believe that users shouldn't be afraid of stating their preferences. The second challenge from the point of view of security involves the protection against a new kind of attack. Copyright holders have shifted their targets to attack the document providers and any other participant that aids in the process of distributing documents, even unknowingly. In addition, new legislation trends such as ACTA or the ¿Sinde-Wert law¿ in Spain show the interest of states all over the world to control and prosecute these intermediate nodes. we proposed the next contributions: 1.A social model that captures user's interests into the users' profiles, and a metric function that calculates the similarity between users, queries and documents. This model represents profiles as vectors of a social space. Document profiles are created by means of the inspection of the contents of the document. Then, user profiles are calculated as an aggregation of the profiles of the documents that the user owns. Finally, queries are a constrained view of a user profile. This way, all profiles are contained in the same social space, and the similarity metric can be used on any pair of them. 2.Two mechanisms to protect the personal information that the user profiles contain. The first mechanism takes advantage of the Johnson-Lindestrauss and Undecomposability of random matrices theorems to project profiles into social spaces of less dimensions. Even if the information about the user is reduced in the projected social space, under certain circumstances the distances between the original profiles are maintained. The second approach uses a zero-knowledge protocol to answer the question of whether or not two profiles are affine without leaking any information in case of that they are not. 3.A distributed system on a cloud that protects merchants, customers and indexers against legal attacks, by means of providing plausible deniability and oblivious routing to all the participants of the system. We use the term DocCloud to refer to this system. DocCloud organizes databases in a tree-shape structure over a cloud system and provide a Private Information Retrieval protocol to avoid that any participant or observer of the process can identify the recommender. This way, customers, intermediate nodes and even databases are not aware of the specific database that answered the query. 4.A social, P2P network where users link together according to their similarity, and provide recommendations to other users in their neighborhood. We defined an epidemic protocol were links are established based on the neighbors similarity, clustering and randomness. Additionally, we proposed some mechanisms such as the use SoftDHT to aid in the identification of affine users, and speed up the process of creation of clusters of similar users. 5.A document distribution system that provides the recommended documents at the end of the process. In our view of a recommender system, the recommendation is a complete process that ends when the customer receives the recommended document. We proposed SCFS, a distributed and secure filesystem where merchants, documents and users are protectedEste documento explora c omo localizar documentos interesantes para el usuario en grandes redes distribuidas mediante el uso de sistemas de recomendaci on. Se de fine un sistema de recomendaci on como un sistema autom atico que, dado un modelo de cliente y un conjunto de documentos disponibles, es capaz de seleccionar y ofrecer los documentos que son m as interesantes para el cliente. Las caracter sticas deseables de un sistema de recomendaci on son: (i) ser r apido, (ii) distribuido y (iii) seguro. Un sistema de recomendaci on r apido mejora la experiencia de compra del cliente, ya que una recomendaci on no es util si es que llega demasiado tarde. Un sistema de recomendaci on distribuido evita la creaci on de bases de datos centralizadas con informaci on sensible y mejora la disponibilidad de los documentos. Por ultimo, un sistema de recomendaci on seguro protege a todos los participantes del sistema: usuarios, proveedores de contenido, recomendadores y nodos intermedios. Desde el punto de vista de la seguridad, existen dos problemas principales a los que se deben enfrentar los sistemas de recomendaci on: (i) la protecci on de la intimidad de los usuarios y (ii) la protecci on de los dem as participantes del proceso de recomendaci on. Los recomendadores son capaces de emitir recomendaciones personalizadas teniendo en cuenta no s olo el per l de los documentos, sino tambi en a la informaci on privada que los clientes env an al recomendador. Por tanto, los per les de usuario incluyen informaci on personal y altamente sensible, como sus gustos y fobias. Con el n de desarrollar un sistema de recomendaci on util y mejorar su e cacia, creemos que los usuarios no deben tener miedo a la hora de expresar sus preferencias. Para ello, la informaci on personal que est a incluida en los per les de usuario debe ser protegida y la privacidad del usuario garantizada. El segundo desafi o desde el punto de vista de la seguridad implica un nuevo tipo de ataque. Dado que la prevenci on de la distribuci on ilegal de documentos con derechos de autor por medio de soluciones t ecnicas no ha sido efi caz, los titulares de derechos de autor cambiaron sus objetivos para atacar a los proveedores de documentos y cualquier otro participante que ayude en el proceso de distribuci on de documentos. Adem as, tratados y leyes como ACTA, la ley SOPA de EEUU o la ley "Sinde-Wert" en España ponen de manfi esto el inter es de los estados de todo el mundo para controlar y procesar a estos nodos intermedios. Los juicios recientes como MegaUpload, PirateBay o el caso contra el Sr. Pablo Soto en España muestran que estas amenazas son una realidad

    Contributions to presence-based systems for deploying ubiquitous communication services

    Get PDF
    Next-Generation Networks (NGNs) will converge the existing fixed and wireless networks. These networks rely on the IMS (IP Multimedia Subsystem), introduced by the 3GPP. The presence service came into being in instant messaging applications. A user¿s presence information consists in any context that is necessary for applications to handle and adapt the user's communications. The presence service is crucial in the IMS to deploy ubiquitous services. SIMPLE is the standard protocol for handling presence and instant messages. This protocol disseminates users' presence information through subscriptions, notifications and publications. SIMPLE generates much signaling traffic for constantly disseminating presence information and maintaining subscriptions, which may overload network servers. This issue is even more harmful to the IMS due to its centralized servers. A key factor in the success of NGNs is to provide users with always-on services that are seamlessly part of their daily life. Personalizing these services according to the users' needs is necessary for the success of these services. To this end, presence information is considered as a crucial tool for user-based personalization. This thesis can be briefly summarized through the following contributions: We propose filtering and controlling the rate of presence publications so as to reduce the information sent over access links. We probabilistically model presence information through Markov chains, and analyzed the efficiency of controlling the rate of publications that are modeled by a particular Markov chain. The reported results show that this technique certainly reduces presence overload. We mathematically study the amount of presence traffic exchanged between domains, and analyze the efficiency of several strategies for reducing this traffic. We propose an strategy, which we call Common Subscribe (CS), for reducing the presence traffic exchanged between federated domains. We compare this strategy traffic with that generated by other optimizations. The reported results show that CS is the most efficient at reducing presence traffic. We analyze the load in the number of messages that several inter-domain traffic optimizations cause to the IMS centralized servers. Our proposed strategy, CS, combined with an RLS (i.e., a SIMPLE optimization) is the only optimization that reduces the IMS load; the others increase this load. We estimate the efficiency of the RLS, thereby concluding that the RLS is not efficient under certain circumstances, and hence this optimization is discouraged. We propose a queuing system for optimizing presence traffic on both the network core and access link, which is capable to adapt the publication and notification rate based on some quality conditions (e.g, maximum delay). We probabilistically model this system, and validate it in different scenarios. We propose, and implement a prototype of, a fully-distributed platform for handling user presence information. This approach allows integrating Internet Services, such as HTTP or VoIP, and optimizing these services in an easy, user-personalized way. We have developed SECE (Sense Everything, Control Everything), a platform for users to create rules that handle their communications and Internet Services proactively. SECE interacts with multiple third-party services for obtaining as much user context as possible. We have developed a natural-English-like formal language for SECE rules. We have enhanced SECE for discovering web services automatically through the Web Ontology Language (OWL). SECE allows composing web services automatically based on real-world events, which is a significant contribution to the Semantic Web. The research presented in this thesis has been published through 3 book chapters, 4 international journals (3 of them are indexed in JCR), 10 international conference papers, 1 demonstration at an international conference, and 1 national conferenceNext-Generation Networks (NGNs) son las redes de próxima generación que soportaran la convergencia de redes de telecomunicación inalámbricas y fijas. La base de NGNs es el IMS (IP Multimedia Subsystem), introducido por el 3GPP. El servicio de presencia nació de aplicaciones de mesajería instantánea. La información de presencia de un usuario consiste en cualquier tipo de información que es de utilidad para manejar las comunicaciones con el usuario. El servicio de presencia es una parte esencial del IMS para el despliegue de servicios ubicuos. SIMPLE es el protocolo estándar para manejar presencia y mensajes instantáneos en el IMS. Este protocolo distribuye la información de presencia de los usuarios a través de suscripciones, notificaciones y publicaciones. SIMPLE genera mucho tráfico por la diseminación constante de información de presencia y el mantenimiento de las suscripciones, lo cual puede saturar los servidores de red. Este problema es todavía más perjudicial en el IMS, debido al carácter centralizado de sus servidores. Un factor clave en el éxito de NGNs es proporcionar a los usuarios servicios ubicuos que esten integrados en su vida diaria y asi interactúen con los usuarios constantemente. La personalización de estos servicios basado en los usuarios es imprescindible para el éxito de los mismos. Para este fin, la información de presencia es considerada como una herramienta base. La tesis realizada se puede resumir brevemente en los siguientes contribuciones: Proponemos filtrar y controlar el ratio de las publicaciones de presencia para reducir la cantidad de información enviada en la red de acceso. Modelamos la información de presencia probabilísticamente mediante cadenas de Markov, y analizamos la eficiencia de controlar el ratio de publicaciones con una cadena de Markov. Los resultados muestran que este mecanismo puede efectivamente reducir el tráfico de presencia. Estudiamos matemáticamente la cantidad de tráfico de presencia generada entre dominios y analizamos el rendimiento de tres estrategias para reducir este tráfico. Proponemos una estrategia, la cual llamamos Common Subscribe (CS), para reducir el tráfico de presencia entre dominios federados. Comparamos el tráfico generado por CS frente a otras estrategias de optimización. Los resultados de este análisis muestran que CS es la estrategia más efectiva. Analizamos la carga en numero de mensajes introducida por diferentes optimizaciones de tráfico de presencia en los servidores centralizados del IMS. Nuestra propuesta, CS, combinada con un RLS (i.e, una optimización de SIMPLE), es la unica optimización que reduce la carga en el IMS. Estimamos la eficiencia del RLS, deduciendo que un RLS no es eficiente en ciertas circunstancias, en las que es preferible no usar esta optimización. Proponemos un sistema de colas para optimizar el tráfico de presencia tanto en el núcleo de red como en la red de acceso, y que puede adaptar el ratio de publicación y notificación en base a varios parametros de calidad (e.g., maximo retraso). Modelamos y analizamos este sistema de colas probabilísticamente en diferentes escenarios. Proponemos una arquitectura totalmente distribuida para manejar las información de presencia del usuario, de la cual hemos implementado un prototipo. Esta propuesta permite la integracion sencilla y personalizada al usuario de servicios de Internet, como HTTP o VoIP, asi como la optimizacón de estos servicios. Hemos desarrollado SECE (Sense Everything, Control Everything), una plataforma donde los usuarios pueden crear reglas para manejar todas sus comunicaciones y servicios de Internet de forma proactiva. SECE interactúa con una multitud de servicios para conseguir todo el contexto possible del usuario. Hemos desarollado un lenguaje formal que parace como Ingles natural para que los usuarios puedan crear sus reglas. Hemos mejorado SECE para descubrir servicios web automaticamente a través del lenguaje OWL (Web Ontology Language)

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
    corecore