418 research outputs found

    MALICIOUS TRAFFIC DETECTION IN DNS INFRASTRUCTURE USING DECISION TREE ALGORITHM

    Get PDF
    Domain Name System (DNS) is an essential component in internet infrastructure to direct domains to IP addresses or conversely. Despite its important role in delivering internet services, attackers often use DNS as a bridge to breach a system. A DNS traffic analysis system is needed for early detection of attacks. However, the available security tools still have many shortcomings, for example broken authentication, sensitive data exposure, injection, etc. This research uses DNS analysis to develop anomaly-based techniques to detect malicious traffic on the DNS infrastructure. To do this, We look for network features that characterize DNS traffic. Features obtained will then be processed using the Decision Tree algorithm to classifyincoming DNS traffic. We experimented with 2.291.024 data traffic data matches the characteristics of BotNet and normal traffic. By dividing the data into 80% training and 20% testing data, our experimental results showed high detection aacuracy (96.36%) indicating the robustness of our method

    Why (and How) Networks Should Run Themselves

    Full text link
    The proliferation of networked devices, systems, and applications that we depend on every day makes managing networks more important than ever. The increasing security, availability, and performance demands of these applications suggest that these increasingly difficult network management problems be solved in real time, across a complex web of interacting protocols and systems. Alas, just as the importance of network management has increased, the network has grown so complex that it is seemingly unmanageable. In this new era, network management requires a fundamentally new approach. Instead of optimizations based on closed-form analysis of individual protocols, network operators need data-driven, machine-learning-based models of end-to-end and application performance based on high-level policy goals and a holistic view of the underlying components. Instead of anomaly detection algorithms that operate on offline analysis of network traces, operators need classification and detection algorithms that can make real-time, closed-loop decisions. Networks should learn to drive themselves. This paper explores this concept, discussing how we might attain this ambitious goal by more closely coupling measurement with real-time control and by relying on learning for inference and prediction about a networked application or system, as opposed to closed-form analysis of individual protocols

    SIEM Network Behaviour Monitoring Framework using Deep Learning Approach for Campus Network Infrastructure

    Get PDF
    One major problem faced by network users is an attack on the security of the network especially if the network is vulnerable due to poor security policies. Network security is largely an exercise to protect not only the network itself but most importantly, the data. This exercise involves hardware and software technology. Secure and effective access management falls under the purview of network security. It focuses on threats both internally and externally, intending to protect and stop the threats from entering or spreading into the network. A specialized collection of physical devices, such as routers, firewalls, and anti-malware tools, is required to address and ensure a secure network. Almost all agencies and businesses employ highly qualified information security analysts to execute security policies and validate the policies’ effectiveness on regular basis. This research paper presents a significant and flexible way of providing centralized log analysis between network devices. Moreover, this paper proposes a novel method for compiling and displaying all potential threats and alert information in a single dashboard using a deep learning approach for campus network infrastructure

    Padding Ain't Enough: Assessing the Privacy Guarantees of Encrypted DNS

    Get PDF
    DNS over TLS (DoT) and DNS over HTTPS (DoH) encrypt DNS to guard user privacy by hiding DNS resolutions from passive adversaries. Yet, past attacks have shown that encrypted DNS is still sensitive to traffic analysis. As a consequence, RFC 8467 proposes to pad messages prior to encryption, which heavily reduces the characteristics of encrypted traffic. In this paper, we show that padding alone is insufficient to counter DNS traffic analysis. We propose a novel traffic analysis method that combines size and timing information to infer the websites a user visits purely based on encrypted and padded DNS traces. To this end, we model DNS sequences that capture the complexity of websites that usually trigger dozens of DNS resolutions instead of just a single DNS transaction. A closed world evaluation based on the Alexa top-10k websites reveals that attackers can deanonymize at least half of the test traces in 80.2% of all websites, and even correctly label all traces for 32.0% of the websites. Our findings undermine the privacy goals of state-of-the-art message padding strategies in DoT/DoH. We conclude by showing that successful mitigations to such attacks have to remove the entropy of inter-arrival timings between query responses

    Dynamic monitoring of Android malware behavior: a DNS-based approach

    Get PDF
    The increasing technological revolution of the mobile smart devices fosters their wide use. Since mobile users rely on unofficial or thirdparty repositories in order to freely install paid applications, lots of security and privacy issues are generated. Thus, at the same time that Android phones become very popular and growing rapidly their market share, so it is the number of malicious applications targeting them. Yet, current mobile malware detection and analysis technologies are very limited and ineffective. Due to the particular traits of mobile devices such as the power consumption constraints that make unaffordable to run traditional PC detection engines on the device; therefore mobile security faces new challenges, especially on dynamic runtime malware detection. This approach is import because many instructions or infections could happen after an application is installed or executed. On the one hand, recent studies have shown that the network-based analysis, where applications could be also analyzed by observing the network traffic they generate, enabling us to detect malicious activities occurring on the smart device. On the other hand, the aggressors rely on DNS to provide adjustable and resilient communication between compromised client machines and malicious infrastructure. So, having rich DNS traffic information is very important to identify malevolent behavior, then using DNS for malware detection is a logical step in the dynamic analysis because malicious URLs are common and the present danger for cybersecurity. Therefore, the main goal of this thesis is to combine and correlate two approaches: top-down detection by identifying malware domains using DNS traces at the network level, and bottom-up detection at the device level using the dynamic analysis in order to capture the URLs requested on a number of applications to pinpoint the malware. For malware detection and visualization, we propose a system which is based on dynamic analysis of API calls. Thiscan help Android malware analysts in visually inspecting what the application under study does, easily identifying such malicious functions. Moreover, we have also developed a framework that automates the dynamic DNS analysis of Android malware where the captured URLs at the smartphone under scrutiny are sent to a remote server where they are: collected, identified within the DNS server records, mapped the extracted DNS records into this server in order to classify them either as benign or malicious domain. The classification is done through the usage of machine learning. Besides, the malicious URLs found are used in order to track and pinpoint other infected smart devices, not currently under monitoring

    IoT device ïŹngerprinting with sequence-based features

    Get PDF
    Exponential growth of Internet of Things complicates the network management in terms of security and device troubleshooting due to the heterogeneity of IoT devices. In the absence of a proper device identification mechanism, network administrators are unable to limit unauthorized accesses, locate vulnerable/rogue devices or assess the security policies applicable to these devices. Hence identifying the devices connected to the network is essential as it provides important insights about the devices that enable proper application of security measures and improve the efficiency of device troubleshooting. Despite the fact that active device fingerprinting reveals in depth information about devices, passive device fingerprinting has gained focus as a consequence of the lack of cooperation of devices in active fingerprinting. We propose a passive, feature based device identification technique that extracts features from a sequence of packets during the initial startup of a device and then uses machine learning for classification. Proposed system improves the average device prediction F1-score up to 0.912 which is a 14% increase compared with the state-of-the-art technique. In addition, We have analyzed the impact of confidence threshold on device prediction accuracy when a previously unknown device is detected by the classifier. As future work we suggest a feature-based approach to detect anomalies in devices by comparing long-term device behaviors

    Machine Learning and Big Data Methodologies for Network Traffic Monitoring

    Get PDF
    Over the past 20 years, the Internet saw an exponential grown of traffic, users, services and applications. Currently, it is estimated that the Internet is used everyday by more than 3.6 billions users, who generate 20 TB of traffic per second. Such a huge amount of data challenge network managers and analysts to understand how the network is performing, how users are accessing resources, how to properly control and manage the infrastructure, and how to detect possible threats. Along with mathematical, statistical, and set theory methodologies machine learning and big data approaches have emerged to build systems that aim at automatically extracting information from the raw data that the network monitoring infrastructures offer. In this thesis I will address different network monitoring solutions, evaluating several methodologies and scenarios. I will show how following a common workflow, it is possible to exploit mathematical, statistical, set theory, and machine learning methodologies to extract meaningful information from the raw data. Particular attention will be given to machine learning and big data methodologies such as DBSCAN, and the Apache Spark big data framework. The results show that despite being able to take advantage of mathematical, statistical, and set theory tools to characterize a problem, machine learning methodologies are very useful to discover hidden information about the raw data. Using DBSCAN clustering algorithm, I will show how to use YouLighter, an unsupervised methodology to group caches serving YouTube traffic into edge-nodes, and latter by using the notion of Pattern Dissimilarity, how to identify changes in their usage over time. By using YouLighter over 10-month long races, I will pinpoint sudden changes in the YouTube edge-nodes usage, changes that also impair the end users’ Quality of Experience. I will also apply DBSCAN in the deployment of SeLINA, a self-tuning tool implemented in the Apache Spark big data framework to autonomously extract knowledge from network traffic measurements. By using SeLINA, I will show how to automatically detect the changes of the YouTube CDN previously highlighted by YouLighter. Along with these machine learning studies, I will show how to use mathematical and set theory methodologies to investigate the browsing habits of Internauts. By using a two weeks dataset, I will show how over this period, the Internauts continue discovering new websites. Moreover, I will show that by using only DNS information to build a profile, it is hard to build a reliable profiler. Instead, by exploiting mathematical and statistical tools, I will show how to characterize Anycast-enabled CDNs (A-CDNs). I will show that A-CDNs are widely used either for stateless and stateful services. That A-CDNs are quite popular, as, more than 50% of web users contact an A-CDN every day. And that, stateful services, can benefit of A-CDNs, since their paths are very stable over time, as demonstrated by the presence of only a few anomalies in their Round Trip Time. Finally, I will conclude by showing how I used BGPStream an open-source software framework for the analysis of both historical and real-time Border Gateway Protocol (BGP) measurement data. By using BGPStream in real-time mode I will show how I detected a Multiple Origin AS (MOAS) event, and how I studies the black-holing community propagation, showing the effect of this community in the network. Then, by using BGPStream in historical mode, and the Apache Spark big data framework over 16 years of data, I will show different results such as the continuous growth of IPv4 prefixes, and the growth of MOAS events over time. All these studies have the aim of showing how monitoring is a fundamental task in different scenarios. In particular, highlighting the importance of machine learning and of big data methodologies

    Network Traffic Measurements, Applications to Internet Services and Security

    Get PDF
    The Internet has become along the years a pervasive network interconnecting billions of users and is now playing the role of collector for a multitude of tasks, ranging from professional activities to personal interactions. From a technical standpoint, novel architectures, e.g., cloud-based services and content delivery networks, innovative devices, e.g., smartphones and connected wearables, and security threats, e.g., DDoS attacks, are posing new challenges in understanding network dynamics. In such complex scenario, network measurements play a central role to guide traffic management, improve network design, and evaluate application requirements. In addition, increasing importance is devoted to the quality of experience provided to final users, which requires thorough investigations on both the transport network and the design of Internet services. In this thesis, we stress the importance of users’ centrality by focusing on the traffic they exchange with the network. To do so, we design methodologies complementing passive and active measurements, as well as post-processing techniques belonging to the machine learning and statistics domains. Traffic exchanged by Internet users can be classified in three macro-groups: (i) Outbound, produced by users’ devices and pushed to the network; (ii) unsolicited, part of malicious attacks threatening users’ security; and (iii) inbound, directed to users’ devices and retrieved from remote servers. For each of the above categories, we address specific research topics consisting in the benchmarking of personal cloud storage services, the automatic identification of Internet threats, and the assessment of quality of experience in the Web domain, respectively. Results comprise several contributions in the scope of each research topic. In short, they shed light on (i) the interplay among design choices of cloud storage services, which severely impact the performance provided to end users; (ii) the feasibility of designing a general purpose classifier to detect malicious attacks, without chasing threat specificities; and (iii) the relevance of appropriate means to evaluate the perceived quality of Web pages delivery, strengthening the need of users’ feedbacks for a factual assessment
    • 

    corecore