A MapReduce architecture for web site user behaviour monitoring in real time

Author: Karakostas B., Theodoulidis B.

Monitoring the behaviour of large numbers of web site users in real time poses significant performance challenges, due to the decentralised location and volume of the generated data. This paper proposes a MapReduce-style architecture in which the event series from Web users are processed by a number of cascading mappers, reducers and rereducers, local to the event origin. Using static analysis and a prototype implementation, we show how this architecture is capable of carrying out time series analysis in real time for very large web data sets, based on the actual events, instead of resorting to sampling or other extrapolation techniques.

City Research Online
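
The cascade of mappers, reducers and rereducers described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the event layout (user, timestamp in seconds, page), the 60-second window, and the function names are all assumptions made for illustration. Each site reduces its own events locally, and a rereducer merges the partial results.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical event layout: (user_id, timestamp_seconds, page).
Event = Tuple[str, int, str]
Key = Tuple[int, str]


def map_event(event: Event, window: int = 60) -> Tuple[Key, int]:
    """Mapper: emit ((window_start, page), 1) for a single event."""
    _user, timestamp, page = event
    window_start = (timestamp // window) * window
    return (window_start, page), 1


def reduce_local(events: Iterable[Event], window: int = 60) -> Counter:
    """Reducer: sum mapper output close to where the events originate."""
    counts: Counter = Counter()
    for event in events:
        key, value = map_event(event, window)
        counts[key] += value
    return counts


def rereduce(partials: Iterable[Counter]) -> Counter:
    """Rereducer: merge partial counts produced by several local reducers."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total
```

Because the counts are additive, the rereduce step can be repeated at any number of levels without changing the final totals, which is what makes a cascade possible.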

A Novel Approach for Generic log analyser

Author: Reshma Chaudhari, Naykar Savita Dilip, Vandhan Himali Jaywant, Shelke Pratibha Ashok, Prof. Kavita S. Kumavat
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/10/2015

The term big data was formulated to capture the meaning of this emerging trend. In addition to its sheer volume, big data also shows other unique characteristics compared with traditional data. For instance, big data requires more real-time analysis and is commonly unstructured. This development calls for new system architectures for data acquisition, transmission, storage, and large-scale data processing. All databases have log files that keep records of changes in the database, which can include tracking distinct user events. Apache Hadoop is used for log processing. Log files are a standard part of large applications and are essential in operating systems, computer networks and distributed systems. Log files are often the only way to identify and locate an error in software, because log file analysis is not affected by the time-based issues known as the probe effect. This is in contrast to the analysis of a running program, where the investigative process can interfere with time-critical or resource-critical conditions within the analyzed program. The overall goal of this project is to design a generic log analyzer using the Hadoop MapReduce framework. Different kinds of log files, such as email logs, web logs, firewall logs, server logs, and call data logs, are analyzed using the generic log analyzer.

International Journal on Recent and Innovation Trends in Computing and Communication
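
The idea of one analyzer handling several log formats through a map and a reduce step can be sketched as follows. This is an in-memory illustration in plain Python, not the Hadoop job from the paper; the log formats, the regular expressions, and the choice of key (HTTP status for web logs, action for firewall logs) are assumptions made for the example.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical per-format patterns; each exposes one named group, "key".
PATTERNS = {
    # Common Log Format style web log: the key is the HTTP status code.
    "web": re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<key>\d{3}) '),
    # key=value style firewall log: the key is the action taken.
    "firewall": re.compile(r"\baction=(?P<key>\w+)"),
}

Pair = Tuple[Tuple[str, str], int]


def map_line(log_type: str, line: str) -> Iterator[Pair]:
    """Map step: emit ((log_type, key), 1) when the line matches its pattern."""
    pattern = PATTERNS.get(log_type)
    match = pattern.search(line) if pattern else None
    if match:
        yield (log_type, match.group("key")), 1


def reduce_counts(pairs: Iterable[Pair]) -> Counter:
    """Reduce step: sum the emitted values per key."""
    totals: Counter = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def analyse(records: Iterable[Tuple[str, str]]) -> Counter:
    """Run the map and reduce steps over (log_type, line) records."""
    return reduce_counts(
        pair for log_type, line in records for pair in map_line(log_type, line)
    )
```

Supporting another log type in this sketch only requires adding a pattern to the table; lines of an unknown type or lines that do not match are skipped.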

Emerging Phishing Trends and Effectiveness of the Anti-Phishing Landing Page

Author: Gupta Srishti, Kumaraguru Ponnurangam
Publication date: 14/06/2014

Each month, more attacks are launched with the aim of making web users believe that they are communicating with a trusted entity, which compels them to share their personal and financial information. Phishing costs Internet users billions of dollars every year. Researchers at Carnegie Mellon University (CMU) created an anti-phishing landing page, supported by the Anti-Phishing Working Group (APWG), with the aim of training users to protect themselves from phishing attacks. It is used by financial institutions, phish site take-down vendors, government organizations, and online merchants. When a potential victim clicks on a phishing link that has been taken down, he or she is redirected to the landing page. In this paper, we present a comparative analysis of two datasets that we obtained from APWG's landing page log files: one from September 7, 2008 to November 11, 2009, and the other from January 1, 2014 to April 30, 2014. We found that the landing page has been successful in training users against phishing. Forty-six percent of users clicked fewer phishing URLs from January 2014 to April 2014, which shows that training from the landing page helped users not to fall for phishing attacks. Our analysis shows that phishers have started to modify their techniques by creating more legitimate-looking URLs and buying large numbers of domains to increase their activity. We observed that phishers are exploiting ICANN-accredited registrars to launch their attacks even under strict surveillance. We saw that phishers are trying to exploit free subdomain registration services to carry out attacks. In this paper, we also compared the phishing e-mails used by phishers to lure victims in 2008 and 2014. We found that the phishing e-mails have changed considerably over time. Phishers have adopted new techniques such as sending promotional e-mails and emotionally targeting users into clicking phishing URLs.

arXiv.org e-Print Archive

BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

Author: Banos Vangelis, Kasioumis Nikolaos, Kim Yunhyong, Kopidaki Stella, Ross Seamus, Rynning Morten, Stepanyan Karen
Publication venue: BlogForever
Publication date: 25/10/2013

This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam, focusing on spam that appears in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

ZENODO

Enlighten