    Web Log Data Analysis: Converting Unstructured Web Log Data into Structured Data Using Apache Pig

    Data extraction and analysis have recently received significant attention due to the growth of social media and the large volume of data available in unstructured form. Hadoop and MapReduce are widely used to process and analyze large amounts of data. In this paper, Apache Pig, a high-level platform for analyzing large volumes of data that runs on top of Hadoop, is used to analyze unstructured log files and extract information. Web server log files are analyzed, and meaningful information is extracted by converting the data from an unstructured form to a structured form in the Apache Pig framework. The main purpose of this paper is to extract, transform, and load unstructured data in the Apache Pig framework and to analyze the data and its performance in local mode as well as MapReduce mode. The paper also briefly explains the steps required to analyze unstructured web server log files in Apache Pig and compares the efficiency of processing a large volume of data in MapReduce mode and in local mode.
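
    To make the unstructured-to-structured step concrete, here is a minimal sketch in plain Python rather than Apache Pig, so it does not reproduce the paper's scripts. It assumes the logs follow the Apache Common Log Format, which the abstract does not specify. The names LOG_PATTERN, parse_line, parse_log, and SAMPLE_LINES are hypothetical, and the sample lines are made-up illustrations rather than data from the paper.

```python
import re
from collections import Counter

# Assumption: entries follow the Apache Common Log Format.
# The abstract does not name the log format actually used.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Turn one raw log line into a dict of named fields, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was returned; store it as 0 bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def parse_log(lines):
    """Parse every line and keep only the ones that matched the pattern."""
    parsed = (parse_line(line) for line in lines)
    return [record for record in parsed if record is not None]


# Made-up sample input for illustration only.
SAMPLE_LINES = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '192.168.1.5 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 -',
    'this line is not a valid log entry',
]

records = parse_log(SAMPLE_LINES)

# A simple analysis over the structured records: requests per status code.
status_counts = Counter(record["status"] for record in records)
```

    Each parsed record is a dictionary keyed by field name, so later steps such as the per-status count can work with named columns instead of raw text. In Pig the same idea would typically be written as a LOAD of the raw lines followed by a regular-expression extraction into a tuple of fields.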