1,524 research outputs found

    HLOC: Hints-Based Geolocation Leveraging Multiple Measurement Frameworks

    Full text link
    Geographically locating an IP address is of interest for many purposes. There are two major ways to obtain the location of an IP address: querying commercial databases or conducting latency measurements. For structural Internet nodes, such as routers, commercial databases are limited by low accuracy, while current measurement-based approaches overwhelm users with setup overhead and scalability issues. In this work we present our system HLOC, aiming to combine the ease of database use with the accuracy of latency measurements. We evaluate HLOC on a comprehensive router data set of 1.4M IPv4 and 183k IPv6 routers. HLOC first extracts location hints from rDNS names, and then conducts multi-tier latency measurements. Configuration complexity is minimized by using publicly available large-scale measurement frameworks such as RIPE Atlas. Using this measurement, we can confirm or disprove the location hints found in domain names. We publicly release HLOC's ready-to-use source code, enabling researchers to easily increase geolocation accuracy with minimum overhead.Comment: As published in TMA'17 conference: http://tma.ifip.org/main-conference

    Measuring the Relationships between Internet Geography and RTT

    Get PDF
    When designing distributed systems and Internet protocols, designers can benefit from statistical models of the Internet that can be used to estimate their performance. However, it is frequently impossible for these models to include every property of interest. In these cases, model builders have to select a reduced subset of network properties, and the rest will have to be estimated from those available. In this paper we present a technique for the analysis of Internet round trip times (RTT) and its relationship with other geographic and network properties. This technique is applied on a novel dataset comprising ∼19 million RTT measurements derived from ∼200 million RTT samples between ∼54 thousand DNS servers. Our main contribution is an information-theoretical analysis that allows us to determine the amount of information that a given subset of geographic or network variables (such as RTT or great circle distance between geolocated hosts) gives about other variables of interest. We then provide bounds on the error that can be expected when using statistical estimators for the variables of interest based on subsets of other variables

    Passport: Enabling Accurate Country-Level Router Geolocation using Inaccurate Sources

    Full text link
    When does Internet traffic cross international borders? This question has major geopolitical, legal and social implications and is surprisingly difficult to answer. A critical stumbling block is a dearth of tools that accurately map routers traversed by Internet traffic to the countries in which they are located. This paper presents Passport: a new approach for efficient, accurate country-level router geolocation and a system that implements it. Passport provides location predictions with limited active measurements, using machine learning to combine information from IP geolocation databases, router hostnames, whois records, and ping measurements. We show that Passport substantially outperforms existing techniques, and identify cases where paths traverse countries with implications for security, privacy, and performance

    Passport: enabling accurate country-level router geolocation using inaccurate sources

    Full text link
    When does Internet traffic cross international borders? This question has major geopolitical, legal and social implications and is surprisingly difficult to answer. A critical stumbling block is a dearth of tools that accurately map routers traversed by Internet traffic to the countries in which they are located. This paper presents Passport: a new approach for efficient, accurate country-level router geolocation and a system that implements it. Passport provides location predictions with limited active measurements, using machine learning to combine information from IP geolocation databases, router hostnames, whois records, and ping measurements. We show that Passport substantially outperforms existing techniques, and identify cases where paths traverse countries with implications for security, privacy, and performance.First author draf

    Smartphone-based geolocation of Internet hosts

    Get PDF
    The location of Internet hosts is frequently used in distributed applications and networking services. Examples include customized advertising, distribution of content, and position-based security. Unfortunately the relationship between an IP address and its position is in general very weak. This motivates the study of measurement-based IP geolocation techniques, where the position of the target host is actively estimated using the delays between a number of landmarks and the target itself. This paper discusses an IP geolocation method based on crowdsourcing where the smartphones of users operate as landmarks. Since smartphones rely on wireless connections, a specific delay-distance model was derived to capture the characteristics of this novel operating scenario

    Beyond Counting: New Perspectives on the Active IPv4 Address Space

    Full text link
    In this study, we report on techniques and analyses that enable us to capture Internet-wide activity at individual IP address-level granularity by relying on server logs of a large commercial content delivery network (CDN) that serves close to 3 trillion HTTP requests on a daily basis. Across the whole of 2015, these logs recorded client activity involving 1.2 billion unique IPv4 addresses, the highest ever measured, in agreement with recent estimates. Monthly client IPv4 address counts showed constant growth for years prior, but since 2014, the IPv4 count has stagnated while IPv6 counts have grown. Thus, it seems we have entered an era marked by increased complexity, one in which the sole enumeration of active IPv4 addresses is of little use to characterize recent growth of the Internet as a whole. With this observation in mind, we consider new points of view in the study of global IPv4 address activity. Our analysis shows significant churn in active IPv4 addresses: the set of active IPv4 addresses varies by as much as 25% over the course of a year. Second, by looking across the active addresses in a prefix, we are able to identify and attribute activity patterns to network restructurings, user behaviors, and, in particular, various address assignment practices. Third, by combining spatio-temporal measures of address utilization with measures of traffic volume, and sampling-based estimates of relative host counts, we present novel perspectives on worldwide IPv4 address activity, including empirical observation of under-utilization in some areas, and complete utilization, or exhaustion, in others.Comment: in Proceedings of ACM IMC 201

    Teleoperation of passivity-based model reference robust control over the internet

    Get PDF
    This dissertation offers a survey of a known theoretical approach and novel experimental results in establishing a live communication medium through the internet to host a virtual communication environment for use in Passivity-Based Model Reference Robust Control systems with delays. The controller which is used as a carrier to support a robust communication between input-to-state stability is designed as a control strategy that passively compensates for position errors that arise during contact tasks and strives to achieve delay-independent stability for controlling of aircrafts or other mobile objects. Furthermore the controller is used for nonlinear systems, coordination of multiple agents, bilateral teleoperation, and collision avoidance thus maintaining a communication link with an upper bound of constant delay is crucial for robustness and stability of the overall system. For utilizing such framework an elucidation can be formulated by preparing site survey for analyzing not only the geographical distances separating the nodes in which the teleoperation will occur but also the communication parameters that define the virtual topography that the data will travel through. This survey will first define the feasibility of the overall operation since the teleoperation will be used to sustain a delay based controller over the internet thus obtaining a hypothetical upper bound for the delay via site survey is crucial not only for the communication system but also the delay is required for the design of the passivity-based model reference robust control. Following delay calculation and measurement via site survey, bandwidth tests for unidirectional and bidirectional communication is inspected to ensure that the speed is viable to maintain a real-time connection. Furthermore from obtaining the results it becomes crucial to measure the consistency of the delay throughout a sampled period to guarantee that the upper bound is not breached at any point within the communication to jeopardize the robustness of the controller. Following delay analysis a geographical and topological overview of the communication is also briefly examined via a trace-route to understand the underlying nodes and their contribution to the delay and round-trip consistency. To accommodate the communication channel for the controller the input and output data from both nodes need to be encapsulated within a transmission control protocol via a multithreaded design of a robust program within the C language. The program will construct a multithreaded client-server relationship in which the control data is transmitted. For added stability and higher level of security the channel is then encapsulated via an internet protocol security by utilizing a protocol suite for protecting the communication by authentication and encrypting each packet of the session using negotiation of cryptographic keys during each session

    Confounds and Consequences in Geotagged Twitter Data

    Full text link
    Twitter is often used in quantitative studies that identify geographically-preferred topics, writing styles, and entities. These studies rely on either GPS coordinates attached to individual messages, or on the user-supplied location field in each profile. In this paper, we compare these data acquisition techniques and quantify the biases that they introduce; we also measure their effects on linguistic analysis and text-based geolocation. GPS-tagging and self-reported locations yield measurably different corpora, and these linguistic differences are partially attributable to differences in dataset composition by age and gender. Using a latent variable model to induce age and gender, we show how these demographic variables interact with geography to affect language use. We also show that the accuracy of text-based geolocation varies with population demographics, giving the best results for men above the age of 40.Comment: final version for EMNLP 201
    • …
    corecore