A Brief History of Web Crawlers
Web crawlers visit internet applications, collect data, and learn about new
web pages from visited pages. Web crawlers have a long and interesting history.
Early web crawlers collected statistics about the web. In addition to
collecting statistics about the web and indexing the applications for search
engines, modern crawlers can be used to perform accessibility and vulnerability
checks on the application. The rapid expansion of the web and the complexity added
to web applications have made crawling a very challenging process.
Throughout the history of web crawling many researchers and industrial groups
addressed different issues and challenges that web crawlers face. Different
solutions have been proposed to reduce the time and cost of crawling.
Performing an exhaustive crawl remains a challenging task. Additionally,
capturing the model of a modern web application and extracting data from it
automatically is another open question. What follows is a brief history of the
different techniques and algorithms used from the early days of crawling up to
the present. We introduce criteria to evaluate the relative performance of
web crawlers. Based on these criteria, we plot the evolution of web crawlers and
compare their performance.
Reverse Engineering and Testing of Rich Internet Applications
The World Wide Web experiences a continuous and constant evolution, where new initiatives, standards, approaches and technologies are continuously proposed for developing more effective and higher quality Web applications.
To satisfy the growing market demand for Web applications, new technologies, frameworks, tools and environments that allow Web and mobile applications to be developed with minimal effort and in a very short time have been introduced in recent years.
These new technologies have made possible the dawn of a new generation of Web applications, named Rich Internet Applications (RIAs), that offer greater usability and interactivity than traditional ones. This evolution has been accompanied by some drawbacks that are mostly due to the lack of applying well-known software engineering practices and approaches. As a consequence, new research questions and challenges have emerged in the field of web and mobile applications maintenance and testing.
The research activity described in this thesis has addressed some of these topics with the specific aim of proposing new and effective solutions to the problems of modelling, reverse engineering, comprehending, re-documenting and testing existing RIAs.
Due to the growing relevance of mobile applications in the renewed Web scenario, the problem of testing mobile applications developed for the Android operating system has been addressed too, in an attempt to explore and propose new techniques of test automation for this type of application.
Browser-based Analysis of Web Framework Applications
Although web applications evolved to mature solutions providing sophisticated
user experience, they also became complex for the same reason. Complexity
primarily affects the server-side generation of dynamic pages as they are
aggregated from multiple sources and as there are lots of possible processing
paths depending on parameters. Browser-based tests are an adequate instrument
to detect errors within generated web pages considering the server-side process
and path complexity a black box. However, these tests do not detect the cause
of an error which has to be located manually instead. This paper proposes to
generate metadata on the paths and parts involved during server-side processing
to facilitate backtracking origins of detected errors at development time.
While there are several possible points of interest to observe for
backtracking, this paper focuses on the user interface components of web frameworks.
Comment: In Proceedings TAV-WEB 2010, arXiv:1009.330
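The backtracking idea can be sketched as follows. This is a hypothetical illustration, not the paper's actual implementation: each server-side component wraps its output in origin-marking comments, so a browser-based test that flags a broken fragment can trace it back to the component that generated it.

```python
import re

# Hypothetical sketch: tag each server-side component's output with
# metadata comments so that errors found in the rendered page can be
# backtracked to the generating component.

def render_component(name, body):
    """Wrap a component's HTML in origin-marking comments."""
    return f"<!-- begin:{name} -->{body}<!-- end:{name} -->"

def backtrack_origin(page, offset):
    """Return the innermost component enclosing a character offset."""
    origin = None
    for m in re.finditer(r"<!-- begin:(\w+) -->", page):
        end = page.find(f"<!-- end:{m.group(1)} -->")
        if m.end() <= offset < end:
            origin = m.group(1)  # later matching begin = deeper nesting
    return origin

page = render_component("layout",
         render_component("menu", "<ul><li>Home</li></ul>") +
         render_component("cart", "<span>ERROR</span>"))
bad = page.index("ERROR")
print(backtrack_origin(page, bad))  # -> cart
```

In a real framework the markers would carry richer metadata (template path, parameters, processing branch), but the lookup principle is the same.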
A Practical T-P3R2 Model to Test Dynamic Websites
Present-day web applications are very complex as they employ more objects (controls) on a web page than traditional web applications. This results in more memory leaks, higher CPU utilization and longer test executions. Furthermore, today's websites are dynamic, meaning that web pages are loaded according to the user's input. Higher complexity of web software means a less secure website, which increases the attack surface. In this paper, it is proposed to use both Test-Driven Development (TDD) and white-box testing together to handle the dynamic aspects of web applications. It also proposes a new practical T-P3R2 model to cope with the dynamism of websites.
Keywords: dynamic website testing, TDD, Web Application Trees (WAT), path testing
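The combination of TDD and white-box path testing that the abstract describes can be illustrated with a minimal sketch (the handler and its paths are invented for illustration; the T-P3R2 model itself is not reproduced here): enumerate the processing paths of a dynamic page handler and write one test per path.

```python
# Illustrative sketch of TDD plus white-box path coverage for a dynamic
# page: the handler below has three processing paths, and each path gets
# its own test, written alongside the implementation.

def product_page(user, product_id, catalog):
    """Dynamic page handler whose output depends on user input."""
    if product_id not in catalog:               # path 1: unknown product
        return "404 Not Found"
    if user is None:                            # path 2: anonymous visitor
        return f"{catalog[product_id]} (login to buy)"
    return f"{catalog[product_id]} (buy now)"   # path 3: logged-in user

catalog = {"p1": "Widget"}
# One white-box test per processing path:
assert product_page(None, "missing", catalog) == "404 Not Found"
assert product_page(None, "p1", catalog) == "Widget (login to buy)"
assert product_page("alice", "p1", catalog) == "Widget (buy now)"
```

Covering every path this way is exactly what becomes hard as the number of controls and input-dependent branches grows, which is the complexity the paper targets.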
A folk song retrieval system with a gesture-based interface
This article describes how a folk song retrieval system uses a gesture-based interface to recognize Kodály hand signs and formulate search queries.
Automock: Automated Mock Backend Generation for JavaScript-based Applications
Modern web development is an intensely collaborative process. Frontend developers, backend developers and quality assurance engineers are integral cogs of a development machine. Frontend developers constantly juggle developing new features, fixing bugs and writing good unit test cases. Achieving this is sometimes difficult, as frontend developers are not able to use their time fully: they have to wait for the backend to be ready and for pages to load during iterations. This paper proposes an approach that enables frontend developers to quickly generate a mock backend that behaves exactly like their actual backend. This generated mock backend minimizes the dependency between frontend and backend developers, since both teams can now use the entire sprint duration efficiently. The approach also helps the frontend developer perform quicker iterations and modifications to his or her code.
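One common way to build such a mock backend is record-and-replay. The sketch below is a hedged illustration of that general idea, not Automock's actual API (class names like `RecordingClient` are invented): real responses are recorded once, then replayed so the frontend can iterate without a live backend.

```python
# Record-and-replay sketch of a generated mock backend (illustrative
# names, not Automock's implementation).

class RecordingClient:
    """Proxies requests to the real backend and records the responses."""
    def __init__(self, real_backend):
        self.real_backend = real_backend
        self.tape = {}                        # (method, path) -> response

    def request(self, method, path):
        resp = self.real_backend(method, path)
        self.tape[(method, path)] = resp      # record for later replay
        return resp

class MockBackend:
    """Serves previously recorded responses without a live backend."""
    def __init__(self, tape):
        self.tape = tape

    def request(self, method, path):
        # Replay the recorded response; 501 for anything never recorded.
        return self.tape.get((method, path), {"status": 501})

def real_backend(method, path):               # stand-in for the live service
    return {"status": 200, "body": f"{method} {path}"}

rec = RecordingClient(real_backend)
rec.request("GET", "/users/1")                # recording pass
mock = MockBackend(rec.tape)                  # generated mock backend
print(mock.request("GET", "/users/1"))   # -> {'status': 200, 'body': 'GET /users/1'}
print(mock.request("GET", "/unknown"))   # -> {'status': 501}
```

The frontend team points its HTTP client at the mock during development; only requests never seen during recording fall through.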
DPMbox: An interactive user-friendly web interface for a disk-based grid storage system
Disk Pool Manager (DPM) is a lightweight storage management system for grid sites. It has been developed at CERN (European Organization for Nuclear Research), and it is the most widely adopted solution in the Worldwide LHC Computing Grid infrastructure.
Attracting less technical users has been an objective for several years; thus, as an effort to move towards standard protocols that remove the need for special tools, DPM started offering a WebDAV (an extension of the HTTP protocol) interface, facilitating access through commonly available tools, e.g. web browsers or WebDAV clients. However, this interface only provides basic functionality, especially when accessed from a web browser, making it still necessary to use some specific tools. DPMbox is a project for a friendly web interface that allows both technical and non-technical users to manage their data from and into the grid through their web browsers.
The project has been built taking advantage of the implemented WebDAV front-end, and as a web application it uses standard and mature web technologies such as HTML, CSS and JavaScript/ECMAScript as its core language. As a collaboration with CERN, the development has been focused on the functionality required by DPM, but one of the objectives is to make DPMbox easily expandable and flexible, enabling its use with other systems that offer the WebDAV protocol.
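The kind of WebDAV interaction a client like DPMbox builds on can be sketched briefly: a PROPFIND request lists a collection, and the client parses the multistatus XML reply. The sample response below is hand-written for illustration, not captured from a real DPM endpoint.

```python
# Sketch of parsing a WebDAV PROPFIND multistatus reply (RFC 4918).
# The sample XML is illustrative, not output from a real DPM server.
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # WebDAV XML namespace as ElementTree qualifies it

sample_multistatus = """<?xml version="1.0"?>
<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/dpm/home/</D:href></D:response>
  <D:response><D:href>/dpm/home/results.root</D:href></D:response>
</D:multistatus>"""

def hrefs(multistatus_xml):
    """Extract the entries listed in a PROPFIND multistatus reply."""
    root = ET.fromstring(multistatus_xml)
    return [r.find(f"{DAV}href").text for r in root.iter(f"{DAV}response")]

print(hrefs(sample_multistatus))
# -> ['/dpm/home/', '/dpm/home/results.root']
```

A browser-based client issues the PROPFIND over HTTP (e.g. via XMLHttpRequest) and renders the extracted entries as a directory listing; the parsing step is the same.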
Ten Years of Rich Internet Applications: A Systematic Mapping Study, and Beyond
BACKGROUND: The term Rich Internet Applications (RIAs) is generally associated with Web applications that provide the features and functionality of traditional desktop applications. Ten years after the introduction of the term, an ample amount of research has been carried out to study various aspects of RIAs. It has thus become essential to summarize this research and provide an adequate overview.
OBJECTIVE: The objective of our study is to assemble, classify and analyze all RIA research performed in the scientific community, thus providing a consolidated overview thereof, and to identify well-established topics, trends and open research issues. Additionally, we provide a qualitative discussion of the most interesting findings. This work therefore serves as a reference work for beginning and established RIA researchers alike, as well as for industrial actors that need an introduction to the field, or seek pointers to (a specific subset of) the state of the art.
METHOD: A systematic mapping study is performed in order to identify all RIA-related publications, define a classification scheme, and categorize, analyze, and discuss the identified research according to it.
RESULTS: Our source identification phase resulted in 133 relevant, peer-reviewed publications, published between 2002 and 2011 in a wide variety of venues. They were subsequently classified according to four facets: development activity, research topic, contribution type and research type. Pie, stacked bar and bubble charts were used to visualize and analyze the results. A deeper analysis is provided for the most interesting and/or remarkable results.
CONCLUSION: Analysis of the results shows that, although the RIA term was coined in 2002, the first RIA-related research appeared in 2004. From 2007 there was a significant increase in research activity, peaking in 2009 and decreasing to pre-2009 levels afterwards. All development phases are covered in the identified research, with emphasis on "design" (33%) and "implementation" (29%). The majority of research proposes a "method" (44%), followed by "model" (22%), "methodology" (18%) and "tools" (16%); no publications in the category "metrics" were found. The preponderant research topic is "models, methods and methodologies" (23%) and, to a lesser extent, "usability & accessibility" and "user interface" (11% each). On the other hand, the topic "localization, internationalization & multi-linguality" received no attention at all, and topics such as "deep web" (under 1%), "business processing", "usage analysis", "data management", "quality & metrics" (all under 2%), "semantics" and "performance" (slightly above 2%) received very little attention. Finally, there is a large majority of "solution proposals" (66%), few "evaluation research" (14%) and even fewer "validation" (6%), although the latter are increasing in recent years.