15 research outputs found

    Distributed Information Retrieval using Keyword Auctions

    This report motivates the need for large-scale distributed approaches to information retrieval and proposes solutions based on keyword auctions.

    An Empirically Validated Framework for Limiting Free-Riding in P2P Networks through the Use of Social Network Information

    To overcome the problem of free-riding in current P2P systems, we suggest applying social network theory. Based on our exploration of the overlapping research fields of social networks and peer-to-peer networks, we propose a new P2P framework in this paper. It specifies the social network information that can be used in a P2P system to avoid performance inefficiencies caused by free-riding or by policies intended to overcome free-riding. To identify this specific social network information, we conduct a survey among a small group of students who use Skype, a popular P2P system. We use descriptive analysis and multiple regression analysis to analyze the survey data. The results of the analyses indicate that the idea of using social network information in P2P systems is valid and that it is supported by P2P users. Based on our findings, we make recommendations for a successful implementation of social-network-information-based P2P systems that can overcome free-riding issues and, consequently, improve the performance of P2P systems.

    An Ising-based approach for tracking illegal P2P content distributors

    This thesis focuses on the problem of tracking illegal P2P content distributors. By viewing the collection of files held by a peer as a relatively precise reflection of its owner, we use the Ising model, which originates from statistical physics, to mathematically model the behavior of P2P networks and identify the relationships between peers. Based on this model, we develop an effective approach to track the behavior-based structures of P2P networks and use it as guidance to narrow down the search scope for illegal P2P content distributors. The sum-product algorithm and the mean field algorithm, both based on the Ising model, are then used to efficiently compute the marginal distribution of peers that hold or have held a particular file of known contraband. Experimental results show that this behavior-based approach significantly outperforms several tracking algorithms that ignore the relationships between peers in P2P networks.

    On exploiting social relationship and personal background for content discovery in P2P networks

    Content discovery is a critical issue in unstructured Peer-to-Peer (P2P) networks, as nodes maintain only local network information. Similarly, however, in human networks one can still find a specific person through friends by using social information, even without global information. Therefore, in this paper, we investigate how social information (i.e., friends and background information) could benefit content discovery in P2P networks. We collect the social information of 384,494 user profiles from Facebook and build a social P2P network model based on the empirical analysis. In this model, we enrich nodes in P2P networks with social information and link nodes via their friendships. Each node extracts two types of social features (Knowledge and Similarity) and assigns more weight to the friends that have higher similarity and more knowledge. Furthermore, we present a novel content discovery algorithm that can explore the latent relationships among a node's friends. A node computes stable scores for all its friends based on their weights and the latent relationships. It then selects the friends with the highest scores to query for content. Extensive experiments validate the performance of the proposed mechanism. In particular, for personal-interest searches, the proposed mechanism achieves a 100% Search Success Rate by selecting the top 20 friends within two hops. It also achieves 6.5 Hits on average, an 8x improvement over the compared methods.
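
The friend-selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption: `stable_scores`, `top_friends`, the `damping` parameter, and the damped averaging over a hypothetical `relations` map are stand-ins for the paper's "stable scores" and "latent relationships", not the authors' algorithm.

```python
from typing import Dict, List


def stable_scores(
    weights: Dict[str, float],
    relations: Dict[str, Dict[str, float]],
    damping: float = 0.5,
    iterations: int = 50,
) -> Dict[str, float]:
    """Blend each friend's own weight with the scores of related friends.

    weights:   friend -> weight (assumed to already combine Similarity and Knowledge)
    relations: friend -> {other friend: relationship strength}; every friend
               named here must also appear in `weights` (hypothetical input format)
    """
    scores = dict(weights)
    for _ in range(iterations):
        updated = {}
        for friend, base in weights.items():
            neighbours = relations.get(friend, {})
            total = sum(neighbours.values())
            if total > 0:
                # Strength-weighted average of the related friends' current scores.
                spread = sum(scores[o] * s for o, s in neighbours.items()) / total
            else:
                # A friend with no known relationships keeps its own weight.
                spread = base
            updated[friend] = (1 - damping) * base + damping * spread
        scores = updated
    return scores


def top_friends(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring friends; ties are broken by name."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]
```

In this sketch a low-weight friend who is closely related to a high-weight friend can move up the ranking, which is one plausible reading of "exploring latent relationships"; the abstract reports selecting the top 20 friends, which would correspond to `k=20`.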

    An Efficient Holistic Data Distribution and Storage Solution for Online Social Networks

    In the past few years, Online Social Networks (OSNs) have spread dramatically across the world. Facebook [4], one of the largest worldwide OSNs, has 1.35 billion users, 82.2% of whom are outside the US [36]. The browsing and posting interactions (text content) between OSN users lead to user data reads (visits) and writes (updates) in OSN datacenters, and Facebook now serves a billion reads and tens of millions of writes per second [37]. Besides that, Facebook has become one of the top Internet traffic sources [36] by sharing a tremendous number of large multimedia files, including photos and videos. The servers in datacenters have limited resources (e.g., bandwidth) to supply latency-efficient service for multimedia file sharing among the rapidly growing number of users worldwide. Most online applications operate under soft real-time constraints (e.g., ≤ 300 ms latency) for good user experience, and an application's income falls as its service latency rises. Thus, service latency is a very important Quality of Service (QoS) requirement for the OSN as a web service, since it affects the OSN's revenue and user experience. Also, to increase OSN revenue, OSN service providers need to constrain capital investment, operation costs, and resource (bandwidth) usage costs. Therefore, it is critical for the OSN to supply a guaranteed QoS for both text and multimedia contents to users while minimizing its costs. To achieve this goal, in this dissertation, we address three problems:
i) Data distribution among datacenters: how to allocate data (text contents) among data servers with low service latency and minimized inter-datacenter network load; ii) Efficient multimedia file sharing: how to help the servers in datacenters efficiently share multimedia files among users; iii) Cost-minimized data allocation among cloud storages: how to save infrastructure (datacenter) capital investment and operation costs by leveraging commercial cloud storage services. Data distribution among datacenters. To serve the text content, the new OSN model, which deploys datacenters globally, helps reduce service latency for users distributed worldwide and relieves the load on the existing datacenters. However, it causes higher inter-datacenter communication load. In the OSN, each datacenter has a full copy of all data, and the master datacenter updates all other datacenters, generating tremendous load in this new model. Distributed data storage, which stores a user's data only in his/her geographically closest datacenters, merely mitigates the problem: frequent interactions between distant users still lead to frequent inter-datacenter communication and hence long service latencies. Therefore, OSNs need a data allocation algorithm among datacenters with minimized network load and low service latency. Efficient multimedia file sharing. To serve multimedia file sharing for a rapidly growing user population, the file distribution method should be scalable and cost-efficient, e.g., minimizing the bandwidth usage of the centralized servers. P2P networks have been widely used for file sharing among a large number of users [58, 131], and they meet both the scalability and cost-efficiency requirements.
However, without fully utilizing the altruism and trust among friends in the OSNs, current P2P-assisted file sharing systems depend on strangers or anonymous users to distribute files, which degrades their performance due to selfish and malicious user behaviors. Therefore, OSNs need a cost-efficient and trustworthy P2P-assisted file sharing system to serve multimedia content distribution. Cost-minimized data allocation among cloud storages. The new trend requires OSNs to build worldwide datacenters, which introduces large capital investment and maintenance costs. To save the capital expenditures of building and maintaining the hardware infrastructure, OSNs can leverage the storage services of multiple Cloud Service Providers (CSPs) with existing worldwide distributed datacenters [30, 125, 126]. These datacenters provide different Get/Put latencies and unit prices for resource utilization and reservation. Thus, when selecting different CSPs' datacenters, an OSN, as a cloud customer running a globally distributed application, faces two challenges: i) how to allocate data to worldwide datacenters to satisfy application SLA (service level agreement) requirements, including both data retrieval latency and availability, and ii) how to allocate data and reserve resources in datacenters belonging to different CSPs to minimize the payment cost. Therefore, OSNs need a data allocation system that distributes data among CSPs' datacenters with cost minimization and an SLA guarantee. In all, the OSN needs an efficient holistic data distribution and storage solution that minimizes its network load and cost while supplying a guaranteed QoS for both text and multimedia contents. In this dissertation, we propose methods to solve each of the aforementioned challenges in OSNs. Firstly, we verify the benefits of the new trend of OSNs and present typical OSN properties that lay the basis of our design.
We then propose the Selective Data replication mechanism in Distributed Datacenters (SD3) to allocate user data among geographically distributed datacenters. In SD3, a datacenter jointly considers update rate and visit rate to select user data for replication, and further atomizes a user's different types of data (e.g., status update, friend post) for replication, making sure that a replica always reduces inter-datacenter communication. Secondly, we analyze a BitTorrent file sharing trace, which proves the necessity of proximity- and interest-aware clustering. Based on the trace study and OSN properties, to address the second problem, we propose a SoCial Network integrated P2P file sharing system for enhanced Efficiency and Trustworthiness (SOCNET) to fully and cooperatively leverage the common-interest, geographically-close and trust properties of OSN friends. SOCNET uses a hierarchical distributed hash table (DHT) to cluster common-interest nodes, then further clusters geographically close nodes into a subcluster, and connects the nodes in a subcluster with social links. Thus, when queries travel along trustable social links, they also gain a higher probability of being successfully resolved by nearby nodes, simultaneously enhancing efficiency and trustworthiness. Thirdly, to handle the third problem, we model the cost minimization problem under the SLA constraints using integer programming. According to the system model, we propose an Economical and SLA-guaranteed cloud Storage Service (ES3), which finds a data allocation and resource reservation schedule with cost minimization and an SLA guarantee.
ES3 incorporates (1) a data allocation and reservation algorithm, which allocates each data item to a datacenter and determines the reservation amount on datacenters by leveraging all the pricing policies; (2) a genetic-algorithm-based data allocation adjustment approach, which makes data Get/Put rates stable in each datacenter to maximize the reservation benefit; and (3) a dynamic request redirection algorithm, which dynamically redirects a data request from an over-utilized datacenter to an under-utilized datacenter with sufficient reserved resources when the request rate varies greatly, to further reduce the payment. Finally, we conducted trace-driven experiments on a distributed testbed, PlanetLab, and on real commercial cloud storage (Amazon S3, Windows Azure Storage and Google Cloud Storage) to demonstrate the efficiency and effectiveness of our proposed systems in comparison with other systems. The results show that our systems outperform others in network savings and data distribution efficiency.
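
As a rough illustration of component (3), the dynamic request redirection idea can be sketched as below. This is not the ES3 algorithm: the `Datacenter` fields, the "most spare reserved capacity" selection rule, and the fallback of serving at the source are assumptions made only for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Datacenter:
    name: str
    reserved: float  # reserved request rate (hypothetical unit: requests/s)
    load: float      # request rate currently being served

    @property
    def spare(self) -> float:
        """Reserved capacity that is not yet used."""
        return self.reserved - self.load


def redirect(rate: float, source: Datacenter, candidates: List[Datacenter]) -> Datacenter:
    """Pick the datacenter that serves a request stream of the given rate.

    The source keeps the request if it stays within its reservation.
    Otherwise the request goes to the candidate with the most spare
    reserved capacity that can absorb it. If no candidate qualifies,
    the source serves it anyway (assumed fallback).
    """
    if source.spare >= rate:
        target = source
    else:
        eligible = [dc for dc in candidates if dc.spare >= rate]
        target = max(eligible, key=lambda dc: dc.spare) if eligible else source
    target.load += rate
    return target
```

For example, with a source at 95 of 100 reserved units, a request of rate 10 would be redirected to whichever candidate has the most unused reservation, so that already-paid-for capacity is used before any over-reservation charge is incurred.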

    Towards Efficient File Sharing and Packet Routing in Mobile Opportunistic Networks

    With the increasing popularity of portable digital devices (e.g., smartphones, laptops, and tablets), mobile opportunistic networks (MONs) [40, 90] consisting of portable devices have attracted much attention recently. MONs are also known as pocket switched networks (PSNs) [52]. MONs can be regarded as a special form of mobile ad hoc networks (MANETs) [7] or delay tolerant networks (DTNs) [35, 56]. In such networks, mobile nodes (devices) move continuously and meet opportunistically. Two mobile nodes can communicate with each other only when they are within each other's communication range, in a peer-to-peer (P2P) manner (i.e., without the need for infrastructure). Therefore, such a network structure can potentially provide file sharing or packet routing services among portable devices without the support of network infrastructure. On the other hand, mobile opportunistic networks often experience frequent network partitions, and no contemporaneous end-to-end path can be ensured in the network. These distinctive properties make it a formidable challenge to apply traditional file sharing or packet routing algorithms, designed for the Internet or mobile networks, in MONs. In summary, it is essential to achieve efficient file sharing and packet routing in MONs, which are the key to providing practical and novel services and applications over such networks. In this dissertation, we develop several methods to resolve the aforementioned challenges. Firstly, we propose two methods to enhance file sharing efficiency in MONs by creating replicas and by leveraging social network properties, respectively. In the first method, we investigate how to create file replicas to optimize file availability for file sharing in MONs. We introduce a new concept of resource for file replication, which considers both node storage and meeting frequency with other nodes.
We theoretically study the influence of resource allocation on the average file access delay and derive a resource allocation rule that minimizes the average file access delay. We also propose a distributed file replication protocol to realize the derived optimal file replication rule. In the second method, we leverage social network properties to improve file searching efficiency in MONs. This method groups common-interest nodes that frequently meet each other into a community. It takes advantage of node mobility by designating stable nodes, which have the most frequent contact with community members, as community coordinators for intra-community file request forwarding, and highly mobile nodes, which visit other communities frequently, as community ambassadors for inter-community file request forwarding. Based on this community structure, an interest-oriented file searching scheme is proposed that first searches the local community and then searches the community most likely to contain the requested file, leading to highly efficient file sharing in MONs. Secondly, we propose two methods to realize efficient packet routing among mobile nodes and among different landmarks in MONs, respectively. The first method uses a distributed social map to route packets to mobile nodes efficiently and at low cost in MONs. Each node builds its own social map, consisting of the nodes it has met and their frequently encountered nodes, in a distributed manner. Based on both the encounter frequency and the social closeness of two linked nodes in the social map, we decide the weight of each link to reflect the packet delivery ability between the two nodes. The social map enables more accurate forwarder selection through a broader view and reduces the cost of information exchange. The second method realizes high-throughput packet routing among different landmarks in MONs.
It selects popular places that nodes visit frequently as landmarks and divides the entire MON area into sub-areas represented by landmarks. Nodes transiting between two landmarks relay packets between the two landmarks. The frequency of node transits between two landmarks is measured to represent the forwarding capacity between them, based on which routing tables are built on each landmark to guide packet routing. Finally, packets are routed landmark by landmark to reach their destination landmarks. Extensive analysis and real-trace-based experiments are conducted to support the designs in this dissertation and demonstrate the effectiveness of the proposed methods in comparison with state-of-the-art methods. In the future, we plan to further enhance file sharing and packet routing efficiency by considering more realistic scenarios or including more useful information. We will also investigate the security and privacy issues in the proposed methods.
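
The landmark routing table described in this abstract can be sketched as follows. The abstract only says that transit frequency represents forwarding capacity; treating the cost of a link as the inverse of its transit count and running Dijkstra's algorithm over those costs is an assumption made for this example, as are the function name `build_routing_table` and the input format.

```python
import heapq
from typing import Dict, List, Tuple


def build_routing_table(
    transits: Dict[Tuple[str, str], int], source: str
) -> Dict[str, str]:
    """Return {destination landmark: next-hop landmark} for one source landmark.

    transits maps a directed pair (from_landmark, to_landmark) to the number
    of observed node transits. Link cost is assumed to be 1 / transit count,
    so frequently travelled links are preferred.
    """
    graph: Dict[str, List[Tuple[str, float]]] = {}
    for (a, b), count in transits.items():
        if count > 0:
            graph.setdefault(a, []).append((b, 1.0 / count))

    dist: Dict[str, float] = {source: 0.0}
    next_hop: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, cost in graph.get(node, []):
            candidate = d + cost
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                # The first landmark on the path is inherited from the parent,
                # unless the parent is the source itself.
                next_hop[neighbour] = neighbour if node == source else next_hop[node]
                heapq.heappush(heap, (candidate, neighbour))
    return next_hop
```

With transit counts of 10 for A to B, 10 for B to C, and 1 for A to C, the table at landmark A sends packets for C through B, since the two busy links together cost less than the rarely used direct link under the assumed cost model.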