Search CORE

2,722 research outputs found

A Brief History of Web Crawlers

Author: Bochmann Gregor V.
Dinçktürk Mustafa Emre
Hooshmand Salman
Jourdan Guy-Vincent
Mirtaheri Seyed M.
Onut Iosif Viorel
Publication venue
Publication date: 04/05/2014
Field of study

Web crawlers visit internet applications, collect data, and learn about new web pages from visited pages. Web crawlers have a long and interesting history. Early web crawlers collected statistics about the web. In addition to collecting statistics about the web and indexing the applications for search engines, modern crawlers can be used to perform accessibility and vulnerability checks on the application. Quick expansion of the web, and the complexity added to web applications have made the process of crawling a very challenging one. Throughout the history of web crawling many researchers and industrial groups addressed different issues and challenges that web crawlers face. Different solutions have been proposed to reduce the time and cost of crawling. Performing an exhaustive crawl is a challenging question. Additionally capturing the model of a modern web application and extracting data from it automatically is another open question. What follows is a brief history of different technique and algorithms used from the early days of crawling up to the recent days. We introduce criteria to evaluate the relative performance of web crawlers. Based on these criteria we plot the evolution of web crawlers and compare their performanc

arXiv.org e-Print Archive

CiteSeerX

Structural Learning of Attack Vectors for Generating Mutated XSS Attacks

Author: Adam Kieyzun
Andreas Stolcke
Andreas Stolcke
Andrew James Viterbi
Ching-Hao Mao
Davide Balzarotti
Fourthdimension
Gary Wassermann
Geeknet Inc.
Gwen Salaün
Hahn-Ming Lee
Jason Bau
Jeff Offutt
Kevin Fernandez
Lawrence R. Rabiner
Mrmunkey22
Nenad Jovanovic
Niels Provos
Niels Provos
OWASP
Pierre Dupont
PortSwigger
Prahlad Fogla
Roflo1
Roy T. Fielding Tim Berners-Lee
RSnake
Sean McAllister
SearchSecurity.com
Simon Hansman
Stefan Kals
Sylvain Hallé
Xiang Fu
Yi-Hsun Wang
Publication venue: 'Open Publishing Association'
Publication date: 01/09/2010
Field of study

Web applications suffer from cross-site scripting (XSS) attacks that resulting from incomplete or incorrect input sanitization. Learning the structure of attack vectors could enrich the variety of manifestations in generated XSS attacks. In this study, we focus on generating more threatening XSS attacks for the state-of-the-art detection approaches that can find potential XSS vulnerabilities in Web applications, and propose a mechanism for structural learning of attack vectors with the aim of generating mutated XSS attacks in a fully automatic way. Mutated XSS attack generation depends on the analysis of attack vectors and the structural learning mechanism. For the kernel of the learning mechanism, we use a Hidden Markov model (HMM) as the structure of the attack vector model to capture the implicit manner of the attack vector, and this manner is benefited from the syntax meanings that are labeled by the proposed tokenizing mechanism. Bayes theorem is used to determine the number of hidden states in the model for generalizing the structure model. The paper has the contributions as following: (1) automatically learn the structure of attack vectors from practical data analysis to modeling a structure model of attack vectors, (2) mimic the manners and the elements of attack vectors to extend the ability of testing tool for identifying XSS vulnerabilities, (3) be helpful to verify the flaws of blacklist sanitization procedures of Web applications. We evaluated the proposed mechanism by Burp Intruder with a dataset collected from public XSS archives. The results show that mutated XSS attack generation can identify potential vulnerabilities.Comment: In Proceedings TAV-WEB 2010, arXiv:1009.330

arXiv.org e-Print Archive

Directory of Open Access Journals