38 research outputs found

    Context-Aware Telco Outdoor Localization

    Get PDF
    Recent years have witnessed the fast growth in telecommunication (Telco) techniques from 2G to upcoming 5G. Precise outdoor localization is important for Telco operators to manage, operate and optimize Telco networks. Differing from GPS, Telco localization is a technique employed by Telco operators to localize outdoor mobile devices by using measurement report (MR) data. When given MR samples containing noisy signals (e.g., caused by Telco signal interference and attenuation), Telco localization often suffers from high errors. To this end, the main focus of this paper is how to improve Telco localization accuracy via the algorithms to detect and repair outlier positions with high errors. Specifically, we propose a context-aware Telco localization technique, namely RLoc, which consists of three main components: a machine-learning-based localization algorithm, a detection algorithm to find flawed samples, and a repair algorithm to replace outlier localization results by better ones (ideally ground truth positions). Unlike most existing works to detect and repair every flawed MR sample independently, we instead take into account spatio-temporal locality of MR locations and exploit trajectory context to detect and repair flawed positions. Our experiments on the real MR data sets from 2G GSM and 4G LTE Telco networks verify that our work RLoc can greatly improve Telco location accuracy. For example, RLoc on a large 4G MR data set can achieve 32.2 meters of median errors, around 17.4 percent better than state-of-the-art.Peer reviewe

    Explaining the Power-Law Distribution of Human Mobility Through Transportation Modality Decomposition

    Get PDF
    Human mobility has been empirically observed to exhibit Lévy flight characteristics and behaviour with power-law distributed jump size. The fundamental mechanisms behind this behaviour has not yet been fully explained. In this paper, we propose to explain the Lévy walk behaviour observed in human mobility patterns by decomposing them into different classes according to the different transportation modes, such as Walk/Run, Bike, Train/Subway or Car/Taxi/Bus. Our analysis is based on two real-life GPS datasets containing approximately 10 and 20 million GPS samples with transportation mode information. We show that human mobility can be modelled as a mixture of different transportation modes, and that these single movement patterns can be approximated by a lognormal distribution rather than a power-law distribution. Then, we demonstrate that the mixture of the decomposed lognormal flight distributions associated with each modality is a power-law distribution, providing an explanation to the emergence of Lévy Walk patterns that characterize human mobility patterns.Peer reviewe

    Optimal Proactive Caching in Peer-to-Peer Network: Analysis and Application ABSTRACT

    No full text
    As a promising new technology with the unique properties like high efficiency, scalability and fault tolerance, Peer-to-Peer (P2P) technology is used as the underlying network to build new Internet-scale applications. However, one of the well known issues in such an application (for example WWW) is that the distribution of data popularities is heavily tailed with a Zipf-like distribution. With consideration of the skewed popularity we adopt a proactive caching approach to handle the challenge, and focus on two key problems: where(i.e. the placement strategy: where to place the replicas) and how(i.e. the degree problem: how many replicas are assigned to one specific content)? For the where problem, we propose a novel approach which can be generally applied to structured P2P networks. Next, we solve two optimization objectives related to the how problem: MAX PERF and MIN COST. Our solution is called PoP-Cache, and we discover two interesting properties: (1) the number of replicas assigned to each content is proportional to its popularity; (2) the derived optimal solutions are related to the entropy of popularity. To our knowledge, none of the previous works has mentioned such results. Finally, we apply the results of PoPCache to propose a P2P base web caching, called as Web-PoPCache. By means of web cache trace driven simulation, our extensive evaluation results demonstrate the advantages of PoPCache and Web-PoPCache

    Distributed top-k full-text content dissemination

    No full text
    The World Wide Web (WWW) is experiencing an explosive growth of live content. To timely disseminate end users with fresh content that is relevant to users' interests is a very useful but challenging task. In this paper we design a distributed top-k document dissemination scheme: by inputting query terms and a number k as subscription conditions, end users subscribe to the top-k most relevant documents. When the documents and subscription conditions are distributed, the difficulties are how to forward the personalized top-k documents to the needed users with low forwarding cost. To this end, we propose document forwarding and filtering algorithms on a set of dedicate servers. Experiments based on real query logs and document datasets show the efficiency of our proposed algorithms

    A distributed full-text top-k document dissemination system in distributed hash tables

    No full text
    Recent years witnessed the explosive growth of 'live' web content in the World Wide Web like Weblogs, RSS feeds, and real-time news, etc. The popular usage of RSS feeds/readers enables end users to subscribe for favorite contents via input RSS URLs. However, the RSS feeds/readers architecture suffers from (i) the high bandwidth consumption issue, and (ii) limited filtering semantics. In this paper, we proposed a stateful full text dissemination scheme over structured P2Ps to address both issues. Specifically, for the semantic side, end users are allowed to subscribe for favorite contents via input keywords; for the network bandwidth side, the cooperative content polling, filtering and disseminating via DHT-based P2P overlay networks save the network bandwidth consumption. Our contributions include the novel techniques to (i) reduce the unit-publishing cost by pruning irreverent documents during the forwarding path towards destinations, and (ii) reduce the publication amount by selecting a very small number of meaningful terms. Based on real data sets, our experimental results show that the proposed scheme can significantly reduce the publishing cost with low maintenance overhead and a high document quality

    Toward Efficient Filter Privacy-Aware Content-Based Pub/Sub Systems

    No full text
    In recent years, the content-based publish/subscribe [12], [22] has become a popular paradigm to decouple information producers and consumers with the help of brokers. Unfortunately, when users register their personal interests to the brokers, the privacy pertaining to filters defined by honest subscribers could be easily exposed by untrusted brokers, and this situation is further aggravated by the collusion attack between untrusted brokers and compromised subscribers. To protect the filter privacy, we introduce an anonymizer engine to separate the roles of brokers into two parts, and adapt the k-anonymity and l-diversity models to the content-based pub/sub. When the anonymization model is applied to protect the filter privacy, there is an inherent tradeoff between the anonymization level and the publication redundancy. By leveraging partial-order-based generalization of filters to track filters satisfying k-anonymity and l-diversity, we design algorithms to minimize the publication redundancy. Our experiments show the proposed scheme, when compared with studied counterparts, has smaller forwarding cost while achieving comparable attack resilience

    Detecting Malware Based on DNS Graph Mining

    No full text
    Malware remains a major threat to nowadays Internet. In this paper, we propose a DNS graph mining-based malware detection approach. A DNS graph is composed of DNS nodes, which represent server IPs, client IPs, and queried domain names in the process of DNS resolution. After the graph construction, we next transform the problem of malware detection to the graph mining task of inferring graph nodes' reputation scores using the belief propagation algorithm. The nodes with lower reputation scores are inferred as those infected by malwares with higher probability. For demonstration, we evaluate the proposed malware detection approach with real-world dataset. Our real-world dataset is collected from campus DNS servers for three months and we built a DNS graph consisting of 19,340,820 vertices and 24,277,564 edges. On the graph, we achieve a true positive rate 80.63% with a false positive rate 0.023%. With a false positive of 1.20%, the true positive rate was improved to 95.66%. We detected 88,592 hosts infected by malware or C&C servers, accounting for the percentage of 5.47% among all hosts. Meanwhile, 117,971 domains are considered to be related to malicious activities, accounting for 1.5% among all domains. The results indicate that our method is efficient and effective in detecting malwares
    corecore