Emerging Phishing Trends and Effectiveness of the Anti-Phishing Landing Page
Each month, more attacks are launched with the aim of making web users
believe that they are communicating with a trusted entity, compelling them to
share their personal and financial information. Phishing costs Internet users
billions of dollars every year. Researchers at Carnegie Mellon University (CMU)
created an anti-phishing landing page, supported by the Anti-Phishing Working
Group (APWG), with the aim of training users to protect themselves from
phishing attacks. It is used by financial institutions, phishing-site take-down
vendors, government organizations, and online merchants. When a potential
victim clicks on a phishing link that has been taken down, they are redirected
to the landing page. In this paper, we present a comparative analysis of two
datasets obtained from APWG's landing page log files: one from September 7,
2008 to November 11, 2009, and the other from January 1, 2014 to April 30,
2014. We found that the landing page has been successful in training users
against phishing. Forty-six percent of users clicked fewer phishing URLs from
January 2014 to April 2014, which indicates that training from the landing
page helped users avoid falling for phishing attacks. Our analysis shows that
phishers have started to modify their techniques by creating more
legitimate-looking URLs and buying large numbers of domains to increase their
activity. We observed that phishers are exploiting ICANN-accredited registrars
to launch their attacks even under strict surveillance, and that they are
trying to exploit free subdomain registration services to carry out attacks.
In this paper, we also compare the phishing e-mails used to lure victims in
2008 and 2014. We found that phishing e-mails have changed considerably over
time: phishers have adopted new techniques such as sending promotional e-mails
and emotionally manipulating users into clicking phishing URLs.
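The free-subdomain abuse pattern described above can be illustrated with a toy heuristic that flags URLs where a known brand name sits in the subdomain rather than the registered domain. This is a minimal sketch: the brand watch list, host names, and the `suspicious_subdomain` function are invented for illustration and are not the paper's actual analysis method.

```python
from urllib.parse import urlparse

BRANDS = {"paypal", "bankofamerica"}  # illustrative watch list, not from the paper

def suspicious_subdomain(url):
    """Flag URLs where a brand name appears in a subdomain label but the
    registered (second-level) domain is not the brand itself -- the pattern
    associated with phishing via free subdomain registration services."""
    host = urlparse(url).netloc.lower()
    parts = host.split(".")
    if len(parts) < 3:          # no subdomain present
        return False
    registered = parts[-2]      # e.g. "freehost" in paypal.freehost.example
    subdomains = parts[:-2]
    brand_in_sub = any(b in label for label in subdomains for b in BRANDS)
    return brand_in_sub and registered not in BRANDS

print(suspicious_subdomain("http://paypal.freehost.example"))  # → True
print(suspicious_subdomain("http://www.paypal.com"))           # → False
```

A real take-down pipeline would of course combine many such signals; this only shows the single subdomain check in isolation.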
Determining citizens’ opinions about stories in the news media: analysing Google, Facebook and Twitter
We describe a method whereby a governmental policy maker can discover citizens’ reactions to news stories. This is particularly relevant in the political world, where governments’ policy statements are reported by the news media and discussed by citizens. The work here addresses two main questions: where are citizens discussing a news story, and what are they saying? Our strategy for answering the first question is to find news articles pertaining to the policy statements, then perform internet searches for references to the news articles’ headlines and URLs. We have created a software tool that schedules repeating Google searches for the news articles and collects the results in a database, enabling the user to aggregate and analyse them to produce ranked tables of sites that reference the news articles. Using data mining techniques we can analyse the data so that the resultant ranking reflects an overall aggregate score, taking into account multiple datasets; this shows the most relevant places on the internet where the story is discussed. To answer the second question, we introduce the WeGov toolbox as a tool for analysing citizens’ comments and behaviour pertaining to news stories. We first use the tool to identify social network discussions, using different strategies for Facebook and Twitter. We then apply different analysis components to the data to distil the essence of the social network users’ comments, determine influential users, and identify important comments.
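The aggregation step, in which repeated search runs are combined into a ranked table of referencing sites, could be sketched as follows. This is a minimal illustration: the function name, the example URLs, and the scoring (a simple count of appearances across runs) are assumptions, not details taken from the WeGov tool.

```python
from collections import Counter
from urllib.parse import urlparse

def rank_referencing_sites(result_sets):
    """Aggregate URLs returned by repeated scheduled searches into a
    ranked table of sites that reference the news articles.

    result_sets: iterable of URL lists, one list per search run.
    Returns (domain, score) pairs, highest score first.
    """
    scores = Counter()
    for results in result_sets:
        for url in results:
            scores[urlparse(url).netloc] += 1
    return scores.most_common()

# Hypothetical results from two scheduled searches for one headline
runs = [
    ["http://forum.example.org/t/123", "http://news-blog.example.com/post"],
    ["http://forum.example.org/t/456", "http://forum.example.org/t/789"],
]
print(rank_referencing_sites(runs))
# → [('forum.example.org', 3), ('news-blog.example.com', 1)]
```

In the paper's setting the scores come from multiple datasets, so a weighted combination would replace the plain count used here.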
BLOG INFORMATION CLASSIFICATION
Information Classification is the categorization of huge amounts of data in an efficient and useful way. In the current scenario, data is growing exponentially due to the rise of internet-rich applications. One such source of information is blogs. Blogs are web logs maintained by their authors that contain information related to a certain topic, along with the author's views about that topic. Microblogs, on the other hand, are variations of blogs that contain less data than blogs; nevertheless, they also contain rich information. In this project, Twitter, a microblogging website, has been targeted to gather information on certain trending topics. The information is in the form of tweets; a tweet is a post or status update on the Twitter website. These tweets are extracted using the Twitter Search APIs. The data is then classified into different classes based on its content. Using the classified data, features are extracted from the tweets and suggestions are given to users based on the trending topics.
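A content-based classifier of the kind the project describes could, at its simplest, assign each tweet to the class whose keyword list it overlaps most. This sketch is purely illustrative: the class labels, keyword lists, and example tweet are invented, and the project itself may use a different classification technique.

```python
def classify_tweet(tweet, class_keywords):
    """Assign a tweet to the class whose keyword list it matches most;
    fall back to 'other' when no keyword matches."""
    words = set(tweet.lower().split())
    best_label, best_hits = "other", 0
    for label, keywords in class_keywords.items():
        hits = len(words & set(keywords))
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label

# Hypothetical classes and keyword lists
classes = {
    "sports": ["match", "goal", "team"],
    "tech": ["phone", "app", "update"],
}
print(classify_tweet("New app update released for the phone", classes))  # → tech
```

Real systems would normalise hashtags and mentions and use a trained model rather than fixed keyword lists, but the control flow is the same.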
Retrieving information from heterogeneous freight data sources to answer natural language queries
The ability to retrieve accurate information from databases without extensive knowledge of the contents and organization of each database is extremely beneficial to the dissemination and utilization of freight data. The challenges, however, are: 1) correctly identifying only the relevant information and keywords from questions when dealing with multiple sentence structures, and 2) automatically retrieving, preprocessing, and understanding multiple data sources to determine the best answer to a user's query. Current named entity recognition systems can identify entities but require an annotated corpus for training, which does not currently exist in the field of transportation planning. A hybrid approach that combines multiple models to classify specific named entities was therefore proposed as an alternative. The retrieval and classification of freight-related keywords facilitated the process of finding which databases are capable of answering a question. Values in data dictionaries can be queried by mapping keywords to data element fields in various freight databases using ontologies. A number of challenges still arise as a result of different entities sharing the same name, the same entity having multiple names, and differences in classification systems. Resolving these ambiguities is required to accurately determine which database provides the best answer from the list of applicable sources. This dissertation 1) develops an approach to identifying and classifying keywords from freight-related natural language queries, 2) develops a standardized knowledge representation of freight data sources using an ontology that both computer systems and domain experts can use to identify relevant freight data sources, and 3) provides recommendations for addressing ambiguities in freight-related named entities.
Finally, the use of knowledge-based expert systems to intelligently sift through data sources to determine which ones provide the best answer to a user's question is proposed.
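The mapping from query keywords to data element fields via an ontology could be sketched as a lookup table. This toy example is an assumption for illustration only: the keyword entries and the database/field names are placeholders and do not reproduce the dissertation's actual ontology.

```python
# Toy ontology mapping freight keywords to (database, field) pairs.
# All names here are illustrative placeholders, not real schema.
ONTOLOGY = {
    "tonnage":   [("FAF", "tons")],
    "commodity": [("FAF", "sctg2"), ("Waybill", "stcc")],
    "rail":      [("Waybill", "carrier")],
}

def candidate_sources(query):
    """Return databases (and their matched fields) whose ontology entries
    match keywords found in a natural language query."""
    matches = {}
    for word in query.lower().split():
        for db, field in ONTOLOGY.get(word, []):
            matches.setdefault(db, set()).add(field)
    return matches

print(candidate_sources("What rail commodity tonnage moved in 2016?"))
```

The disambiguation step the abstract describes would then rank these candidate databases to pick the single best source; here every match is returned unranked.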
ADIOS LDA: When Grammar Induction Meets Topic Modeling
We explore the interplay between grammar induction and topic modeling approaches to unsupervised text processing. These two methods complement each other: one allows for the identification of local structures centered around certain key terms, while the other generates a document-wide context of expressed topics. This combination allows us to access and identify semantic structures that would otherwise hardly be discovered using only one of the two methods. Using our approach, we are able to provide a deeper understanding of the topic structure by examining inferred information structures characteristic of given topics, as well as to capture differences in word usage that would be hard to detect using standard disambiguation methods. We perform our exploration on an extensive corpus of blog posts centered around the surveillance discussion, where we focus on the debate around the Snowden affair. We show how our approach can be used for (semi-)automated content classification and the extraction of semantic features from large textual corpora.
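The "local structures centered around certain key terms" can be crudely approximated by counting the words that occur within a small window of a key term. This is only a stand-in sketch: ADIOS-style grammar induction is far more sophisticated, and the example posts and window size are invented for illustration.

```python
from collections import Counter

def local_contexts(docs, key_term, window=1):
    """Count words appearing within `window` positions of a key term,
    a crude proxy for the local structures that grammar induction
    identifies around key terms."""
    ctx = Counter()
    for doc in docs:
        words = doc.lower().split()
        for i, w in enumerate(words):
            if w == key_term:
                lo = max(0, i - window)
                hi = min(len(words), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        ctx[words[j]] += 1
    return ctx

# Invented example posts echoing the corpus's theme
posts = [
    "mass surveillance programs revealed by snowden",
    "snowden leaked surveillance documents",
]
print(local_contexts(posts, "surveillance"))
```

In the paper's pipeline, a topic model (e.g. LDA) supplies the document-wide topic context, and structures like these are then examined per topic; this sketch shows only the local half of that pairing.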