89 research outputs found

    Large-Scale Geolocation for NetFlow

    Get PDF
    Current approaches perform geolocation mostly on-demand and in a small-scale fashion. As soon as geolocation needs to be performed in real time in high-speed and large-scale networks, these approaches no longer scale. To solve this problem, we propose two approaches to large-scale geolocation. Firstly, we present an exporter-based approach, which adds geolocation data to flow records in a way that is transparent to any flow collector. Secondly, we present a flow collector-based approach, which adds native geolocation to NetFlow data from any flow exporter. After presenting prototypes for both approaches, we demonstrate the applicability of large-scale geolocation by means of use cases.
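    As an illustration of the collector-based idea, the sketch below enriches flow records with country and city fields. It assumes a MaxMind GeoLite2 City database and the geoip2 Python package, neither of which is prescribed by the paper; the flow field names are also assumptions.

```python
# Illustrative sketch of collector-side geolocation enrichment (not the paper's
# prototype). Assumes a local GeoLite2 City database and the geoip2 package.
import geoip2.database
import geoip2.errors

def enrich_flows(flow_records, db_path="GeoLite2-City.mmdb"):
    """Attach src/dst country and city to flow records given as dicts."""
    with geoip2.database.Reader(db_path) as reader:
        for flow in flow_records:
            for side in ("src", "dst"):
                try:
                    resp = reader.city(flow[f"{side}_ip"])
                    flow[f"{side}_country"] = resp.country.iso_code
                    flow[f"{side}_city"] = resp.city.name
                except geoip2.errors.AddressNotFoundError:
                    flow[f"{side}_country"] = flow[f"{side}_city"] = None
            yield flow

# Example: enrich a single flow record.
# flows = [{"src_ip": "8.8.8.8", "dst_ip": "192.0.2.1", "bytes": 1200}]
# print(list(enrich_flows(flows)))
```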

    SSHCure: a flow-based SSH intrusion detection system

    Get PDF
    SSH attacks are a main area of concern for network managers, due to the danger associated with a successful compromise. Detecting these attacks, and possibly compromised victims, is therefore a crucial activity. Most existing network intrusion detection systems designed for this purpose rely on the inspection of individual packets and, hence, do not scale to today's high-speed networks. To overcome this issue, this paper proposes SSHCure, a flow-based intrusion detection system for SSH attacks. It employs an efficient algorithm for the real-time detection of ongoing attacks and allows identification of compromised attack targets. A prototype of the algorithm, including a graphical user interface, is implemented as a plugin for the popular NfSen monitoring tool. Finally, the detection performance of the system is validated with empirical traffic data.
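    To illustrate what flow-based detection of SSH attacks can look like, the sketch below flags sources that open many short flows to TCP port 22. This is a simplified, hypothetical heuristic, not the SSHCure algorithm, and the flow field names and thresholds are assumptions.

```python
# Simplified brute-force heuristic over flow records (NOT the SSHCure algorithm):
# a source opening many short flows to port 22 is considered suspicious.
from collections import defaultdict

def suspect_ssh_sources(flows, min_flows=50, max_packets_per_flow=20):
    """Return source IPs whose port-22 flows resemble brute-force attempts."""
    counts = defaultdict(int)
    for flow in flows:
        if flow["dst_port"] == 22 and flow["packets"] <= max_packets_per_flow:
            counts[flow["src_ip"]] += 1
    return {ip for ip, n in counts.items() if n >= min_flows}
```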

    An Overview of Internet Measurements: Fundamentals, Techniques, and Trends

    Full text link
    The Internet presents great challenges to the characterization of its structure and behavior. Different reasons contribute to this situation, including a huge user community, a large range of applications, equipment heterogeneity, distributed administration, vast geographic coverage, and the dynamism that is typical of the current Internet. In order to deal with these challenges, several measurement-based approaches have recently been proposed to estimate and better understand the behavior, dynamics, and properties of the Internet. The set of these measurement-based techniques composes the Internet Measurements area of research. This overview paper covers the Internet Measurements area by presenting measurement-based tools and methods that directly influence other conventional areas, such as network design and planning, traffic engineering, quality of service, and network management.

    Addressing practical challenges for anomaly detection in backbone networks

    Get PDF
    Network monitoring has always been a topic of foremost importance for both network operators and researchers, for multiple reasons ranging from anomaly detection to traffic classification or capacity planning. Nowadays, as networks become more and more complex, traffic increases and security threats multiply, so achieving a deeper understanding of what is happening in the network has become an essential necessity. In particular, due to the considerable growth of cybercrime, research in the field of anomaly detection has drawn significant attention in recent years and numerous proposals have been made. All the same, when it comes to deploying solutions in real environments, some of them fail to meet crucial requirements. Taking this into account, this thesis focuses on filling this gap between the research and the non-research world. Prior to the start of this work, we identified several problems. First, there is a clear lack of detailed and updated information on the most common anomalies and their characteristics. Second, unawareness of sampled data is still common, although the performance of anomaly detection algorithms is severely affected by it. Third, operators currently need to invest many work-hours to manually inspect and classify detected anomalies in order to act accordingly and take the appropriate mitigation measures. This is further exacerbated by the high number of false positives and false negatives and by the fact that anomaly detection systems are often perceived as extremely complex black boxes.
Analysing an issue is essential to fully comprehend the problem space and to be able to tackle it properly. Accordingly, the first block of this thesis seeks to obtain detailed and updated real-world information on the most frequent anomalies occurring in backbone networks. It first reports on the performance of different commercial systems for anomaly detection and analyses the types of network anomalies detected. Afterwards, it focuses on further investigating the characteristics of the anomalies found in a backbone network, using one of the tools for more than half a year. Among other results, this block confirms the need to apply sampling in an operational environment, as well as the unacceptably high number of false positives and false negatives still reported by current commercial tools.
On the whole, the presence of sampling in large networks for monitoring purposes has become almost mandatory and, therefore, all anomaly detection algorithms that do not take it into account might report incorrect results. In the second block of this thesis, the dramatic impact of sampling on the performance of well-known anomaly detection techniques is analysed and confirmed. However, we show that the results change significantly depending on the sampling technique used and also on the metric selected to perform the comparison. In particular, we show that Packet Sampling outperforms Flow Sampling, unlike previously reported. Furthermore, we observe that Selective Sampling (SES), a sampling technique that focuses on small flows, obtains much better results than traditional sampling techniques for scan detection. Consequently, we propose Online Selective Sampling, a sampling technique that obtains the same good performance for scan detection as SES but works on a per-packet basis instead of keeping all flows in memory. We validate and evaluate our proposal and show that it can operate online and uses far fewer resources than SES.
Although the literature offers plenty of techniques for detecting anomalous events, research on anomaly classification and extraction (e.g., to further investigate what happened or to share evidence with third parties involved) is rather marginal. This makes it harder for network operators to analyse reported anomalies, because they depend solely on their experience to do the job. Furthermore, this task is an extremely time-consuming and error-prone process. The third block of this thesis targets this issue and brings it together with the knowledge acquired in the previous blocks. In particular, it presents a system for automatic anomaly detection, extraction and classification with high accuracy and very low false positives. We deploy the system in an operational environment and show its usefulness in practice.
The fourth and last block of this thesis presents a generalisation of our system that focuses on analysing all the traffic, not only network anomalies. This new system seeks to further help network operators by summarising the most significant traffic patterns in their network. In particular, we generalise our system to deal with big network traffic data: it handles src/dst IPs, src/dst ports, protocol, src/dst Autonomous Systems, layer-7 application, and src/dst geolocation. We first deploy a prototype in the European backbone network of GÉANT and show that it can process large amounts of data quickly and build highly informative and compact reports that are very useful for comprehending what is happening in the network. Second, we deploy it in a completely different scenario and show how it can also be successfully used in a real-world use case, where we analyse the behaviour of highly distributed devices related to a critical infrastructure sector.
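    The selective-sampling idea described above (preferentially keeping the small flows that matter most for scan detection) can be sketched roughly as follows; the threshold and probabilities are illustrative assumptions, not the parameters of SES or Online Selective Sampling.

```python
# Rough sketch of flow sampling biased towards small flows (illustrative only,
# not the SES / Online Selective Sampling algorithms from the thesis).
import random

def selective_sample(flows, small_flow_packets=5, p_small=1.0, p_large=0.01):
    """Yield a sample that keeps small flows with high probability."""
    for flow in flows:
        p = p_small if flow["packets"] <= small_flow_packets else p_large
        if random.random() < p:
            yield flow
```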

    Insights into the issue in IPv6 adoption: a view from the Chinese IPv6 Application mix

    Get PDF
    Although IPv6 was standardized more than 15 years ago, its deployment is still very limited. China has been strongly pushing IPv6, especially due to its limited IPv4 address space. In this paper, we describe measurements from a large Chinese academic network, serving a significant population of IPv6 hosts. We show that despite its expected strength, China is struggling as much as the western world to increase the share of IPv6 traffic. To understand the reasons behind this, we examine the IPv6 applicative ecosystem. We observe a significant IPv6 traffic growth over the past 3 years, with P2P file transfers responsible for more than 80% of the IPv6 traffic, compared with only 15% for IPv4 traffic. Checking the top websites for IPv6 explains the dominance of P2P, with popular P2P trackers appearing systematically among the top visited sites, followed by popular Chinese services (e.g., Tencent), as well as surprisingly popular third-party analytics, including Google. Finally, we compare the throughput of IPv6 and IPv4 flows. We find that a larger share of IPv4 flows get high throughput compared with IPv6 flows, despite IPv6 traffic not being rate limited. We explain this through the limited amount of HTTP traffic in IPv6 and the presence of Web caches in IPv4. Our findings highlight the main issue in IPv6 adoption, that is, the lack of commercial content, which biases the geographic pattern and flow throughput of IPv6 traffic.
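    The per-protocol accounting behind such comparisons can be illustrated with a small sketch that computes the IPv6/IPv4 byte share and per-flow throughput from flow records; the record field names are assumptions, not the paper's data format.

```python
# Illustrative accounting of IPv4 vs IPv6 traffic share and per-flow throughput.
def share_and_throughput(flows):
    """flows: iterable of dicts with 'ip_version', 'bytes', 'duration_s'."""
    total_bytes = {4: 0, 6: 0}
    throughputs = {4: [], 6: []}
    for flow in flows:
        v = flow["ip_version"]
        total_bytes[v] += flow["bytes"]
        if flow["duration_s"] > 0:
            throughputs[v].append(flow["bytes"] / flow["duration_s"])
    grand_total = sum(total_bytes.values()) or 1
    share = {v: total_bytes[v] / grand_total for v in total_bytes}
    return share, throughputs
```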

    VELOCITY: A NetFlow-Based Optimized Geo-IP Lookup Tool

    Get PDF
    Thesis (M.S.)--School of Computing and Engineering, University of Missouri--Kansas City, 2016. Thesis advisor: Deepankar Medhi. Includes bibliographical references (pages 36-38). It is a challenging task for network administrators to monitor their institution's network against undesirable behavior. While NetFlow is useful for gathering flow-level data for any Internet connection, its features are limited to traditional flow-level information such as source IP address, destination IP address, source port number, destination port number, and protocol type. Thus, understanding the geographic dynamics of flows between hosts at an institution and the outside world is not currently possible with NetFlow alone. To provide geolocation information for such flows, we developed the tool VELOCITY. The tool correlates IP addresses with geolocation information to visualize the geolocation of incoming and outgoing flows. VELOCITY comprises four methods of increasing efficiency. We found that Method 3 outperforms Methods 1 and 2 when filling the database with geographical data for the first time. Method 4, an extension of Method 3, finds geographical information for IP addresses that are not present in the currently populated database, thereby providing a more optimized approach than Method 3 for incremental flow data. Furthermore, for visualization and a near-real-time experience, we also developed a web application that displays the geographical location of flow IP addresses on Google Maps. Contents: Introduction -- Literature survey -- Methods -- Web application -- Results -- Conclusion -- Appendix A. Xidel -- Appendix B. GNU parallel
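    The incremental behaviour described for Method 4 (resolving only addresses that are not yet in the local database) can be sketched as a simple cache-aside lookup; resolve_geo below is a hypothetical placeholder for whatever external geolocation source is used, and the whole sketch is not the VELOCITY implementation.

```python
# Cache-aside geo-IP lookup sketch (illustrative of the incremental idea in
# Method 4, not the VELOCITY implementation).
def lookup_with_cache(ip, geo_table, resolve_geo):
    """Return geolocation for ip, resolving and storing it only on a miss."""
    if ip not in geo_table:
        geo_table[ip] = resolve_geo(ip)  # only cache misses hit the external source
    return geo_table[ip]

# Example with a stub resolver:
# table = {}
# print(lookup_with_cache("203.0.113.7", table, lambda ip: {"country": "US"}))
```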

    Steering hyper-giants' traffic at scale

    Get PDF
    Large content providers, known as hyper-giants, are responsible for sending the majority of the content traffic to consumers. These hyper-giants operate highly distributed infrastructures to cope with the ever-increasing demand for online content. To achieve commercial-grade performance of Web applications, enhanced end-user experience, improved reliability, and scaled network capacity, hyper-giants are increasingly interconnecting with eyeball networks at multiple locations. This poses new challenges for both (1) the eyeball networks, which have to perform complex inbound traffic engineering, and (2) the hyper-giants, which have to map end-user requests to appropriate servers. We report on our multi-year experience in designing, building, rolling out, and operating the first-ever large-scale system, the Flow Director, which enables automated cooperation between one of the largest eyeball networks and a leading hyper-giant. We use empirical data collected at the eyeball network to evaluate its impact over two years of operation. We find very high compliance of the hyper-giant with the Flow Director's recommendations, resulting in (1) close to optimal user-server mapping, and (2) a 15% reduction of the hyper-giant's traffic overhead on the ISP's long-haul links, i.e., benefits for both parties and end-users alike.
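    The kind of recommendation such a cooperative system computes can be illustrated with a toy example that picks, per client prefix, the interconnect with the lowest measured latency; this is a hypothetical simplification and not the Flow Director's actual logic.

```python
# Toy ingress-selection example (not the Flow Director's algorithm): map each
# client prefix to the interconnect point with the lowest measured latency.
def recommend_ingress(latency_ms):
    """latency_ms: {prefix: {ingress: latency_ms}} -> {prefix: best ingress}."""
    return {
        prefix: min(per_ingress, key=per_ingress.get)
        for prefix, per_ingress in latency_ms.items()
    }

# Example:
# recommend_ingress({"198.51.100.0/24": {"fra": 12.0, "ams": 9.5}})
# -> {"198.51.100.0/24": "ams"}
```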

    The Dark Menace: Characterizing Network-based Attacks in the Cloud

    Get PDF
    As the cloud computing market continues to grow, the cloud platform is becoming an attractive target for attackers to disrupt services and steal data, and to compromise resources to launch attacks. In this paper, using three months of NetFlow data in 2013 from a large cloud provider, we present the first large-scale characterization of inbound attacks towards the cloud and outbound attacks from the cloud. We investigate nine types of attacks ranging from network-level attacks such as DDoS to application-level attacks such as SQL injection and spam. Our analysis covers the complexity, intensity, duration, and distribution of these attacks, highlighting the key challenges in defending against attacks in the cloud. By characterizing the diversity of cloud attacks, we aim to motivate the research community towards developing future security solutions for cloud systems.
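    One concrete example of a flow-level signature used in this kind of characterization is the volumetric pattern of a destination being contacted by unusually many distinct sources; the sketch below is illustrative, not the paper's methodology, and the field names and threshold are assumptions.

```python
# Illustrative flow-level DDoS signature (not the paper's methodology): flag
# destinations contacted by at least `min_sources` distinct source IPs.
from collections import defaultdict

def ddos_suspects(flows, min_sources=1000):
    """Return destination IPs receiving traffic from many distinct sources."""
    sources = defaultdict(set)
    for flow in flows:
        sources[flow["dst_ip"]].add(flow["src_ip"])
    return {dst for dst, srcs in sources.items() if len(srcs) >= min_sources}
```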