122 research outputs found

    Harnessing Predictive Models for Assisting Network Forensic Investigations of DNS Tunnels

    Get PDF
    In recent times, DNS tunneling techniques have been used for malicious purposes, however network security mechanisms struggle to detect them. Network forensic analysis has been proven effective, but is slow and effort intensive as Network Forensics Analysis Tools struggle to deal with undocumented or new network tunneling techniques. In this paper, we present a machine learning approach, based on feature subsets of network traffic evidence, to aid forensic analysis through automating the inference of protocols carried within DNS tunneling techniques. We explore four network protocols, namely, HTTP, HTTPS, FTP, and POP3. Three features are extracted from the DNS tunneled traffic: IP packet length, DNS Query Name Entropy, and DNS Query Name Length. We benchmark the performance of four classification models, i.e., decision trees, support vector machines, k-nearest neighbours, and neural networks, on a data set of DNS tunneled traffic. Classification accuracy of 95% is achieved and the feature set reduces the original evidence data size by a factor of 74%. More importantly, our findings provide strong evidence that predictive modeling machine learning techniques can be used to identify network protocols within DNS tunneled traffic in real-time with high accuracy from a relatively small-sized feature-set, without necessarily infringing on privacy from the outset, nor having to collect complete DNS Tunneling sessions

    A hybrid method of genetic algorithm and support vector machine for DNS tunneling detection

    Get PDF
    With the expansion of the business over the internet, corporations nowadays are investing numerous amounts of money in the web applications. However, there are different threats could make the corporations vulnerable for potential attacks. One of these threats is harnessing the domain name protocol for passing harmful information, this kind of threats is known as DNS tunneling. As a result, confidential information would be exposed and violated. Several studies have investigated the machine learning in order to propose a detection approach. In their approaches, authors have used different and numerous types of features such as domain length, number of bytes, content, volume of DNS traffic, number of hostnames per domain, geographic location and domain history. Apparently, there is a vital demand to accommodate feature selection task in order to identify the best features. This paper proposes a hybrid method of genetic algorithm feature selection approach with the support vector machine classifier for the sake of identifying the best features that have the ability to optimize the detection of DNS tunneling. To evaluate the proposed method, a benchmark dataset of DNS tunneling has been used. Results showed that the proposed method has outperformed the conventional SVM by achieving 0.946 of f-measur

    Comparing the Effectiveness of Different Classification Techniques in Predicting DNS Tunnels

    Get PDF
    DNS is one of the most widely used protocols on the internet and is used in the translation of domain names into IP address in order to correctly route messages between computers. It presents an attractive attack vector for criminals as the service is not as closely monitored by security experts as other protocols such as HTTP or FTP. Its use as a covert means of communication has increased with the availability of tools that allow for the creation of DNS tunnels using the protocol. One of the primary motivations for using DNS tunnels is the illegal extraction of information from a company’s network. This can lead to reputational damage for the organisation and result in significant fines – particularly with the introduction of General Data Protection Regulations in the EU. Most of the research into the detection of DNS tunnels has used anomalies in the relationship between DNS requests and other protocols, or anomalies in the rate of DNS requests made over specific time periods. This study will look at the characteristics of an individual DNS requests to see how effective different classification techniques are at identifying tunnels. The different techniques selected are Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM). The effectiveness of the different techniques will be measured and compared to see if there are statistically significant differences between them using a Cochran’s Q test. The results will indicate that DT, RF and SVM, are the most effective techniques at categorising DNS requests, and that they are significantly different to the other models. Key Words: DNS Tunnel, Logistic Regression, Support Vector Machine, Decision Tree, Random Forest, Cochran’s Q Test

    Performance Evaluation of a Field Programmable Gate Array-Based System for Detecting and Tracking Peer-to-Peer Protocols on a Gigabit Ethernet Network

    Get PDF
    Recent years have seen a massive increase in illegal, suspicious, and malicious traffic traversing government and military computer networks. Some examples include illegal file distribution and disclosure of sensitive information using the BitTorrent file sharing protocol, criminals and terrorists using Voice over Internet Protocol (VoIP) technologies to communicate, and foreign entities exfiltrating sensitive data from government, military, and Department of Defense contractor networks. As a result of these growing threats, the TRacking and Analysis for Peer-to-Peer (TRAPP) system was developed in 2008 to detect BitTorrent and VoIP traffic of interest. The TRAPP system, designed on a Xilinx Virtex-II Pro Field Programmable Gate Array (FPGA) proved valuable and effective in detecting traffic of interest on a 100 Mbps network. Using concepts and technology developed for the TRAPP system, the TRAPP-2 system is developed on a Xilinx ML510 FPGA. The goals of this research are to evaluate the performance of the TRAPP-2 system as a solution to detect and track malicious packets traversing a gigabit Ethernet network. The TRAPP-2 system detects a BitTorrent, Session Initiation Protocol (SIP), or Domain Name System (DNS) packet, extracts the payload, compares the data against a hash list, and if the packet is suspicious, logs the entire packet for future analysis. Results show that the TRAPP-2 system captures 95.56% of BitTorrent, 20.78% of SIP INVITE, 37.11% of SIP BYE, and 91.89% of DNS packets of interest while under a 93.7% network utilization (937 Mbps). For another experiment, the contraband hash list size is increased from 1,000 to 131,072,000 unique items. The experiment reveals that each doubling of the hash list size results in a mean increase of approximately 16 central processing unit cycles. These results demonstrate the TRAPP-2 system’s ability to detect traffic of interest under a saturated network utilization while maintaining large contraband hash lists

    Practical Analysis of Encrypted Network Traffic

    Get PDF
    The growing use of encryption in network communications is an undoubted boon for user privacy. However, the limitations of real-world encryption schemes are still not well understood, and new side-channel attacks against encrypted communications are disclosed every year. Furthermore, encrypted network communications, by preventing inspection of packet contents, represent a significant challenge from a network security perspective: our existing infrastructure relies on such inspection for threat detection. Both problems are exacerbated by the increasing prevalence of encrypted traffic: recent estimates suggest that 65% or more of downstream Internet traffic will be encrypted by the end of 2016. This work addresses these problems by expanding our understanding of the properties and characteristics of encrypted network traffic and exploring new, specialized techniques for the handling of encrypted traffic by network monitoring systems. We first demonstrate that opaque traffic, of which encrypted traffic is a subset, can be identified in real-time and how this ability can be leveraged to improve the capabilities of existing IDS systems. To do so, we evaluate and compare multiple methods for rapid identification of opaque packets, ultimately pinpointing a simple hypothesis test (which can be implemented on an FPGA) as an efficient and effective detector of such traffic. In our experiments, using this technique to “winnow”, or filter, opaque packets from the traffic load presented to an IDS system significantly increased the throughput of the system, allowing the identification of many more potential threats than the same system without winnowing. Second, we show that side channels in encrypted VoIP traffic enable the reconstruction of approximate transcripts of conversations. Our approach leverages techniques from linguistics, machine learning, natural language processing, and machine translation to accomplish this task despite the limited information leaked by such side channels. Our ability to do so underscores both the potential threat to user privacy which such side channels represent and the degree to which this threat has been underestimated. Finally, we propose and demonstrate the effectiveness of a new paradigm for identifying HTTP resources retrieved over encrypted connections. Our experiments demonstrate how the predominant paradigm from prior work fails to accurately represent real-world situations and how our proposed approach offers significant advantages, including the ability to infer partial information, in comparison. We believe these results represent both an enhanced threat to user privacy and an opportunity for network monitors and analysts to improve their own capabilities with respect to encrypted traffic.Doctor of Philosoph
    • …
    corecore