    Public Opinion Analysis Using Hadoop

    Recent advances in devices, computing, and social networking have transformed the world, but they have also greatly increased the amount of data people produce. Stored on disks, this data could fill an entire football field. According to studies, 2.5 billion gigabytes of new data are generated every day and 2.5 petabytes of data are collected every hour, and the rate is still growing. Although this information is meaningful and could be useful once processed, much of it is neglected. Social media has become massively popular. Twitter makes it easy for users to express, share, and discuss trending topics, but these public opinions are hard to analyze because of the sheer volume of data Twitter generates. Analyzing and making predictions about society's trending topics therefore requires modern technologies, and the most popular solution is Hadoop. Hadoop is an open-source framework for developing and executing distributed applications that process very large amounts of data. It stores and processes big data in a distributed fashion on large clusters of commodity hardware, and so provides a suitable environment for processing huge data sets. The risk in running on commodity machines is hardware failure; Hadoop is built on the assumption that hardware will fail, and as such it can easily handle most failures. Our job is to extract the data, store it in Hadoop's file system, and query it according to the desired output. We propose to analyze public opinion expressed on Twitter about society's trending topics using the Apache Hadoop framework along with its services Apache Flume and Apache Hive.
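
The "store, then query" step described above can be illustrated with a minimal sketch. This is not the authors' pipeline: in the proposed setup Apache Flume would ingest the tweets and Apache Hive would run the query on a cluster. The plain-Python stand-in below only mimics the map and reduce stages of a trending-hashtag count on a single machine. The function names are hypothetical, and the input layout (one JSON tweet per line, with hashtags under `entities.hashtags[].text`) is an assumption, not something the abstract specifies.

```python
import json
from collections import Counter
from itertools import chain


def map_hashtags(line):
    """Map step: emit (hashtag, 1) pairs for one JSON-encoded tweet.

    Assumes hashtags live under entities.hashtags[].text (an assumption).
    """
    try:
        tweet = json.loads(line)
    except json.JSONDecodeError:
        return []  # skip malformed records instead of failing the whole job
    if not isinstance(tweet, dict):
        return []
    entities = tweet.get("entities") or {}
    tags = entities.get("hashtags") or []
    # Lowercase so that "#Hadoop" and "#hadoop" count as the same topic.
    return [(tag["text"].lower(), 1) for tag in tags if "text" in tag]


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per hashtag."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def top_hashtags(lines, n=3):
    """Run map then reduce over all lines and return the n most frequent tags."""
    pairs = chain.from_iterable(map_hashtags(line) for line in lines)
    return reduce_counts(pairs).most_common(n)


if __name__ == "__main__":
    sample = [
        '{"entities": {"hashtags": [{"text": "Hadoop"}, {"text": "BigData"}]}}',
        '{"entities": {"hashtags": [{"text": "hadoop"}]}}',
        "not json",
    ]
    print(top_hashtags(sample, n=1))
```

On a real cluster the same grouping and counting would be expressed as a query over the stored tweets rather than as in-memory Python; the sketch is only meant to make the shape of the computation concrete.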