12 research outputs found

    Modeling, simulations, and experiments to balance performance and fairness in P2P file-sharing systems

    Doctor of Philosophy; Department of Electrical and Computer Engineering; Don Gruenbacher; Caterina Scoglio. In this dissertation, we investigate two research gaps that remain in P2P file-sharing systems: the need to maintain fairness during the content information publishing/retrieving process, and the effect of stranger policies on P2P fairness. First, through a wide range of measurements in the KAD network, we present the impact of a poorly designed incentive fairness policy on the performance of content information lookup. The KAD network, designed to help peers publish and retrieve sharing information, adopts distributed hash table (DHT) technology and is integrated into the aMule/eMule P2P file-sharing network. We develop a distributed measurement framework that employs multiple test nodes running on the PlanetLab testbed. During the measurements, the routing tables of around 20,000 peers are crawled and analyzed. More than 3,000,000 pieces of source location information from the publishing tables of multiple peers are retrieved and contacted. Based on these measurements, we show that the routing table is well maintained, while the maintenance policy for the source-location-information publishing table is not well designed. Both the current maintenance schedule for the publishing table and the poor incentive policy for publishing peers result in low availability of the publishing table, which in turn causes low lookup performance in the KAD network. We propose three possible solutions to address these issues: a self-maintenance scheme with a short renewal interval, a chunk-based publishing/retrieving scheme, and a fairness scheme. Second, using both numerical analyses and agent-based simulations, we evaluate the impact of different stranger policies on system performance and fairness. We find that an extremely restrictive stranger policy yields the best fairness at the cost of degraded performance. Performance and fairness do not vary consistently across stranger policies: a trade-off exists between controlling free-riding and maintaining system performance. Thus, P2P designers need to handle strangers carefully according to their individual design goals. We also show that BitTorrent prefers to maintain fairness with an extremely restrictive stranger policy, while aMule/eMule's fully rewarding stranger policy benefits free-riders.
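The abstract above says only that KAD adopts DHT technology. KAD is based on Kademlia, in which a lookup ranks known peers by the XOR distance between their identifiers and the target key. The following is a minimal sketch of that ranking step, not the dissertation's measurement framework; the function names, the tiny 4-bit identifiers, and the parameter `k` are illustrative assumptions.

```python
from typing import List


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of two identifiers."""
    return a ^ b


def closest_peers(target: int, peers: List[int], k: int = 2) -> List[int]:
    """Return the k known peers whose IDs are closest to the target key.

    Hypothetical helper: a real KAD client keeps peers in routing-table
    buckets and queries them iteratively; here we only rank a flat list.
    """
    return sorted(peers, key=lambda p: xor_distance(p, target))[:k]


if __name__ == "__main__":
    # Toy 4-bit IDs (assumed for illustration; KAD uses 128-bit IDs).
    known = [0b1000, 0b0010, 0b1011, 0b1110]
    print(closest_peers(0b1010, known, k=2))  # prints [11, 8]
```

In a lookup of this style, the peers returned here would be the next ones contacted for the publishing-table entries stored under the target key.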

    IDEAS-1997-2021-Final-Programs

    This document records the final program for each of the 26 meetings of the International Database and Engineering Application Symposium from 1997 through 2021. These meetings were held in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).

    New Fundamental Technologies in Data Mining

    The progress of data mining technology and its wide public popularity establish a need for a comprehensive text on the subject. The series of books entitled "Data Mining" addresses this need by presenting in-depth descriptions of novel mining algorithms and many useful applications. In addition to helping readers understand each section deeply, the two books present useful hints and strategies for solving problems in the following chapters. The contributing authors have highlighted many future research directions that will foster multi-disciplinary collaborations and hence lead to significant development in the field of data mining.

    Intelligent Systems

    This book is dedicated to intelligent systems with a broad spectrum of applications, such as personal and social biosafety or the use of intelligent sensory micro-nanosystems such as "e-nose", "e-tongue" and "e-eye". In addition, effective information acquisition, knowledge management and improved knowledge transfer in any media, as well as modeling of information content using meta- and hyper-heuristics and semantic reasoning, all benefit from the systems covered in this book. Intelligent systems can also be applied in education, in generating intelligent distributed eLearning architectures, and in a large number of technical fields, such as industrial design, manufacturing and utilization, e.g., in precision agriculture, cartography, electric power distribution systems, intelligent building management systems and drilling operations. Furthermore, decision making using fuzzy logic models, computational recognition of comprehension uncertainty, the joint synthesis of goals and means of intelligent behavior in biosystems, and diagnostic and human support in the healthcare environment have also been made easier.

    A new frontier for the study of the commons:Open-source hardware

    Statistical models for the analysis of short user-generated documents: author identification for conversational documents

    In recent years, short user-generated documents have been gaining popularity on the Internet and attention in the research community. Documents of this kind are generated by users of various online services: platforms for instant messaging, real-time status posting, discussion, and writing reviews. Each of these services lets users generate written texts with particular properties, which might require specific algorithms for their analysis. In this dissertation we present our work on analysing documents of this kind. We conducted qualitative and quantitative studies to identify the properties that might allow for characterising them. We compared the properties of these documents with those of standard documents employed in the literature, such as newspaper articles, and defined a set of characteristics that are distinctive of documents generated online. We also observed two classes within online user-generated documents: conversational documents and those involving group discussions. We then focused on the class of conversational documents, which are short and spontaneous. We created a novel collection of real conversational documents retrieved online (e.g. Internet Relay Chat) and distributed it as part of an international competition (PAN @ CLEF'12). The competition was about author characterisation, which is one of the possible studies of authorship attribution documented in the literature. Another field of study is authorship identification, which became our main topic of research. We approached the authorship identification problem in its closed-class variant. For each problem we employed documents from the collection we released and from a collection of Twitter messages, as representative of conversational or short user-generated documents. We demonstrated the unsuitability of standard authorship identification techniques for conversational documents and proposed novel methods capable of reaching better accuracy rates. As opposed to standard methods, which worked well only for a few authors, the proposed technique reached significant results even for hundreds of users.

    A Data-driven Statistical Approach to Customer Behaviour Analysis and Modelling in Online Freemium Games

    The video games industry is one of the most attractive and lucrative segments of entertainment and digital media, worth more than $150 billion worldwide. A popular approach in this industry is the online freemium model, wherein the game is downloadable free of cost, while advanced and bonus content carries optional charges. Monetisation is through micro payments by customers, and the focus is on maintaining average revenue per user and lifetime value of players. The overall aim of this research is to develop suitable data-driven methods to gain insight into customer behaviour in online freemium games, with a view to providing recommendations for successful business in this industry. Three important aspects of user behaviour are modelled in this research: engagement, time until defection, and number of micro transactions made. A multiple logistic regression using a penalised likelihood approach is found to be most suitable for modelling engagement and demonstrates good fit and accuracy in assigning observations to engaged and non-engaged categories. Cox's proportional hazards model is adopted to analyse time to defection, and a zero-inflated negative binomial model gives the best fit to the data on micro payments. Cluster analysis techniques are used to classify the wide variety of customers based on their gameplay styles, and social network models are developed to identify prominent 'actors' based on social interactions. Some of the significant predictors of engagement and monetisation are the amount of premium in-game currency, success in missions and competency in virtual fights, and the quantity of virtual resources used in the game. This research offers extensive insight into what drives the reputation, virality and commercial viability of freemium games. In particular, it helps to fill a gap in understanding the behaviour of online game players by demonstrating the effectiveness of a data analytic approach. It gives more insight into the determinants of player behaviour than observational studies or survey-based research. Additionally, it refines statistical models and demonstrates their implementation in R for new and complex data types representing online customer behaviours.
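To make the zero-inflated negative binomial model mentioned above concrete, here is a minimal sketch of its probability mass function in its standard textbook form. The thesis reports an R implementation; Python is used here purely for illustration, and the parameter names (`mu` for the mean of the count component, `r` for its dispersion, `pi` for the extra-zero probability) are assumptions rather than the author's notation.

```python
import math


def zinb_pmf(y: int, mu: float, r: float, pi: float) -> float:
    """P(Y = y) under a zero-inflated negative binomial model.

    With probability pi the count is a structural zero (e.g. a player who
    never pays); otherwise it follows a negative binomial distribution
    with mean mu > 0 and dispersion r > 0. All names are illustrative.
    """
    # Negative binomial log-probability, computed via log-gamma for stability.
    log_nb = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )
    nb = math.exp(log_nb)
    if y == 0:
        return pi + (1.0 - pi) * nb  # structural zeros plus sampling zeros
    return (1.0 - pi) * nb


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the function.
    for y in range(4):
        print(y, zinb_pmf(y, mu=2.0, r=1.5, pi=0.3))
```

The extra mass at zero is what lets such a model fit data in which many players make no micro payments at all.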