2 research outputs found

    Analyzing malware log files for internet access investigation using Hadoop

    On the Internet, malicious software (malware) is one of the most serious threats to system security. Many complex problems in software systems are caused by malware. Malware can infect any computer software that is connected to Internet infrastructure. There are many types of malware; popular ones include botnets, trojans, viruses, spyware and adware. Internet users with less knowledge of malware threats are more susceptible to them. To protect computer and Internet users from exposure to malware attacks, identifying the attacks by investigating malware log files is an essential step in curbing this threat. The log file exposes crucial information for identifying the malware, such as its algorithm and functional characteristics, the network interaction between the source and the destination, and the type of malware. By nature, log files are very large, so the investigation must be executed on a fast and stable platform such as a big data environment. In this study, the authors adopted Hadoop, an open source software framework, to process and extract information from malware log files obtained from a university’s security equipment. A Python program was used for data transformation, and the data were then analyzed in a Hadoop simulation environment. The analysis covers the reduction in log file size, execution time performance, and data visualization using Microsoft Power BI (Business Intelligence). Log processing reduced the original log file size by 50%, while the total execution time did not increase linearly with the size of the data. The information will be used for further prevention of and protection from malware threats in the university’s network.

    A Survey on Big Data for Network Traffic Monitoring and Analysis

    Network Traffic Monitoring and Analysis (NTMA) represents a key component for network management, especially to guarantee the correct operation of large-scale networks such as the Internet. As the complexity of Internet services and the volume of traffic continue to increase, it becomes difficult to design scalable NTMA applications. Applications such as traffic classification and policing require real-time and scalable approaches. Anomaly detection and security mechanisms must quickly identify and react to unpredictable events while processing millions of heterogeneous events. Lastly, the system has to collect, store, and process massive sets of historical data for post-mortem analysis. Those are precisely the challenges faced by general big data approaches: Volume, Velocity, Variety, and Veracity. This survey brings together NTMA and big data. We catalog previous work on NTMA that adopts big data approaches to understand to what extent the potential of big data is being explored in NTMA. This survey mainly focuses on approaches and technologies to manage big NTMA data, and briefly discusses big data analytics (e.g., machine learning) for NTMA. Finally, we provide guidelines for future work, discussing lessons learned and research directions.