
    A Brief History of Web Crawlers

    Web crawlers visit internet applications, collect data, and discover new web pages from the pages they visit. Web crawlers have a long and interesting history. Early web crawlers collected statistics about the web. In addition to collecting statistics about the web and indexing applications for search engines, modern crawlers can be used to perform accessibility and vulnerability checks on the application. The rapid expansion of the web and the complexity added to web applications have made crawling a very challenging process. Throughout the history of web crawling, many researchers and industrial groups have addressed the different issues and challenges that web crawlers face, and different solutions have been proposed to reduce the time and cost of crawling. Performing an exhaustive crawl remains a challenging problem. Additionally, capturing the model of a modern web application and automatically extracting data from it is another open question. What follows is a brief history of the different techniques and algorithms used from the early days of crawling up to recent days. We introduce criteria to evaluate the relative performance of web crawlers; based on these criteria, we plot the evolution of web crawlers and compare their performance.
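
    The basic crawl loop described above can be sketched in a few lines: maintain a frontier of URLs, fetch each page, extract its links, and enqueue the ones not yet seen. The sketch below is a minimal illustration, not any specific crawler from this survey; the regex-based link extraction and the maxPages cap are simplifications, and production crawlers also honor robots.txt, throttle requests per host, and deduplicate content.

    ```typescript
    // Minimal breadth-first crawler: visit pages, extract links, and
    // enqueue unseen URLs until a page budget is exhausted.
    const HREF_RE = /href="(https?:\/\/[^"#]+)"/g;

    async function crawl(seed: string, maxPages = 50): Promise<Set<string>> {
      const visited = new Set<string>();
      const frontier: string[] = [seed];
      while (frontier.length > 0 && visited.size < maxPages) {
        const url = frontier.shift()!;
        if (visited.has(url)) continue;
        visited.add(url);
        try {
          const html = await (await fetch(url)).text();
          for (const match of html.matchAll(HREF_RE)) {
            if (!visited.has(match[1])) frontier.push(match[1]);
          }
        } catch {
          // Unreachable or non-HTML resource: skip it.
        }
      }
      return visited;
    }

    crawl("https://example.com").then((pages) =>
      console.log(pages.size, "pages visited"));
    ```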

    Reverse Engineering and Testing of Rich Internet Applications

    The World Wide Web experiences a continuous and constant evolution, where new initiatives, standards, approaches and technologies are continuously proposed for developing more effective and higher quality Web applications. To satisfy the market's growing demand for Web applications, new technologies, frameworks, tools and environments that make it possible to develop Web and mobile applications with the least effort and in a very short time have been introduced in recent years. These new technologies have made possible the dawn of a new generation of Web applications, named Rich Internet Applications (RIAs), that offer greater usability and interactivity than traditional ones. This evolution has been accompanied by some drawbacks, mostly due to the lack of application of well-known software engineering practices and approaches. As a consequence, new research questions and challenges have emerged in the field of web and mobile application maintenance and testing. The research activity described in this thesis has addressed some of these topics with the specific aim of proposing new and effective solutions to the problems of modelling, reverse engineering, comprehending, re-documenting and testing existing RIAs. Due to the growing relevance of mobile applications in the renewed Web scenario, the problem of testing mobile applications developed for the Android operating system has been addressed too, in an attempt to explore and propose new test automation techniques for this type of application.

    A Practical Blended Analysis for Dynamic Features in JavaScript

    The JavaScript Blended Analysis Framework is designed to perform a general-purpose, practical combined static/dynamic analysis of JavaScript programs, while handling dynamic features such as run-time generated code and variadic functions. The idea of blended analysis is to focus static analysis on a dynamic calling structure collected at runtime in a lightweight manner, and to refine the static analysis using additional dynamic information. We perform blended points-to analysis of JavaScript with our framework and compare results with those computed by a pure static points-to analysis. Using JavaScript codes from actual webpages as benchmarks, we show that optimized blended analysis for JavaScript obtains good coverage (86.6% on average per website) of the pure static analysis solution and finds additional points-to pairs (7.0% on average per website) contributed by dynamically generated/loaded code.
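
    To make the division of labor concrete, here is a minimal sketch of the blended scheme the abstract describes: a lightweight dynamic phase records which functions actually executed, and the static points-to phase is then restricted to that observed calling structure. All names below (traceCall, analyzeFunction, the PointsToPair shape) are hypothetical placeholders, not the framework's real API.

    ```typescript
    type FunctionId = string;

    // Phase 1: a lightweight probe like this would be injected at each
    // function entry to record the dynamic calling structure.
    const executed = new Set<FunctionId>();
    function traceCall(id: FunctionId): void {
      executed.add(id);
    }

    interface PointsToPair {
      variable: string;
      object: string;
    }

    // Phase 2: per-function static points-to analysis (stubbed out here;
    // a real analysis would walk the function's AST).
    function analyzeFunction(id: FunctionId): PointsToPair[] {
      return [];
    }

    // Blended analysis: static analysis focused on the observed call
    // structure, skipping code that never ran, while still covering
    // dynamically generated code that did execute.
    function blendedPointsTo(): PointsToPair[] {
      const pairs: PointsToPair[] = [];
      for (const fn of executed) {
        pairs.push(...analyzeFunction(fn));
      }
      return pairs;
    }
    ```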

    MT-WAVE: Profiling multi-tier web applications

    The web is evolving: what was once primarily used for sharing static content has now evolved into a platform for rich client-side applications. These applications do not run exclusively on the client; while the client is responsible for presentation and some processing, a significant amount of processing and persistence happens server-side. This has advantages and disadvantages. The biggest advantage is that the user’s data is accessible from anywhere: it doesn’t matter which device you sign into a web application from, everything you’ve been working on is instantly accessible. The largest disadvantage is that large numbers of servers are required to support a growing user base; unlike traditional client applications, an organization building a web application needs to provision compute and storage resources for each expected user. This infrastructure is designed in tiers that are responsible for different aspects of the application, and these tiers may not even be run by the same organization. As these systems grow in complexity, it becomes progressively more challenging to identify and solve performance problems. While there are many measures of software system performance, web application users only care about response latency. This “fingertip-to-eyeball performance” is the only metric that users directly perceive: when a button is clicked in a web application, how long does it take for the desired action to complete? MT-WAVE is a system for solving fingertip-to-eyeball performance problems in web applications. The system is designed for multi-tier tracing: each piece of the application is instrumented, execution traces are collected, and the system merges these traces into a single coherent snapshot of system latency at every tier. To ensure that user-perceived latency is accurately captured, tracing begins in the web browser. The application developer then uses the MT-WAVE Visualization System to explore the execution traces, first identifying which tier contributes the most latency and then zooming in on the specific function calls in that tier to find optimization candidates. After fixing an identified problem, the system is used to verify that the changes had the intended effect. This optimization methodology and toolset is explained through a series of case studies that identify and solve performance problems in open-source and commercial applications. These case studies demonstrate both the utility of the MT-WAVE system and the unintuitive nature of system optimization.
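
    The multi-tier tracing idea can be illustrated with a toy span model: the browser opens a trace, a trace id travels with every request, each tier records timed spans under that id, and a merge step stitches them into one end-to-end timeline. The names below (Span, record, mergeTrace) are illustrative assumptions, not MT-WAVE's actual interfaces.

    ```typescript
    interface Span {
      traceId: string; // propagated from the browser through every tier
      tier: string;    // e.g. "browser", "app-server", "database"
      name: string;    // the instrumented operation
      startMs: number;
      endMs: number;
    }

    // In practice each tier would ship its spans to a central collector.
    const spans: Span[] = [];

    function record<T>(traceId: string, tier: string, name: string, fn: () => T): T {
      const startMs = Date.now();
      const result = fn();
      spans.push({ traceId, tier, name, startMs, endMs: Date.now() });
      return result;
    }

    // Merging: collect all spans for one user interaction and order them,
    // yielding a single coherent snapshot of latency at every tier.
    function mergeTrace(traceId: string): Span[] {
      return spans
        .filter((s) => s.traceId === traceId)
        .sort((a, b) => a.startMs - b.startMs);
    }
    ```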

    Test Generation and Dependency Analysis for Web Applications

    In web application testing, existing model-based web test generators derive test paths from a navigation model of the web application, completed with either manually or randomly generated inputs. Test path extraction and input generation are handled separately, ignoring the fact that generating inputs for test paths is difficult or even impossible if such paths are infeasible. In this thesis, we propose three directions to mitigate the path infeasibility problem. The first direction uses a search-based approach, defining a novel set of genetic operators that support the joint generation of test inputs and feasible test paths. Results show that such a search-based approach can achieve a higher level of model coverage than existing approaches. Secondly, we propose a novel web test generation algorithm that pre-selects the most promising candidate test cases based on their diversity from previously generated tests. Results of our empirical evaluation show that promoting diversity is beneficial not only to a thorough exploration of the web application behaviours, but also to the feasibility of automatically generated test cases. Moreover, the diversity-based approach achieves higher coverage of the navigation model significantly faster than crawling-based and search-based approaches. The third approach we propose uses a web crawler as a test generator. As such, the generated tests are concrete, hence their navigations among the web application states are feasible by construction. However, the crawling trace cannot easily be turned into a minimal test suite that achieves the same coverage, due to test dependencies. Indeed, test dependencies are undesirable in the context of regression testing, preventing the adoption of testing optimization techniques that assume tests to be independent. In this thesis, we propose the first approach to detect test dependencies in a given web test suite by leveraging the information available both in the web test code and on the client side of the web application. Results of our empirical validation show that our approach can effectively and efficiently detect test dependencies, enabling dependency-aware formulations of test parallelization and test minimization.
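
    The diversity-based pre-selection can be sketched as follows: among a pool of candidate tests, pick the one whose minimum distance to the already generated tests is largest. Here tests are modeled as event sequences and distance as Jaccard dissimilarity over their events; the thesis's actual test encoding and distance metric may differ.

    ```typescript
    type TestCase = string[]; // e.g. ["openLogin", "typeUser", "clickSubmit"]

    // Jaccard dissimilarity between the event sets of two tests:
    // 0 means identical event sets, 1 means completely disjoint.
    function jaccardDistance(a: TestCase, b: TestCase): number {
      const sa = new Set(a);
      const sb = new Set(b);
      const inter = [...sa].filter((x) => sb.has(x)).length;
      const union = new Set([...sa, ...sb]).size;
      return union === 0 ? 0 : 1 - inter / union;
    }

    // Pre-select the candidate maximizing the distance to its nearest
    // neighbor among the previously generated tests.
    function mostDiverse(candidates: TestCase[], generated: TestCase[]): TestCase {
      let best = candidates[0];
      let bestScore = -1;
      for (const c of candidates) {
        const score = generated.length === 0
          ? 1 // nothing generated yet: every candidate is maximally novel
          : Math.min(...generated.map((g) => jaccardDistance(c, g)));
        if (score > bestScore) {
          bestScore = score;
          best = c;
        }
      }
      return best;
    }
    ```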

    Semantics of RxJS

    RxJS is a popular JavaScript library for reactive programming in Web applications. It provides numerous operators to create, combine, transform, and filter discrete events and to handle errors. These operators may be stateful and have side effects, which makes it difficult to understand the precise meaning of the resulting computation. In this paper, we define a formal model for RxJS programs by formalizing a selected subset of RxJS operators using a small-step operational semantics. We present several debugging-related applications using the semantics as a model. We also implemented a subset of RxJS based on this semantics, which provides convenient access to the runtime representation of the RxJS program to help debugging.
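
    A three-operator pipeline shows why such a semantics is useful: scan carries an accumulator across events (state) and tap performs a side effect, so the meaning of the composed stream is not apparent from the individual operators alone. The snippet uses the standard RxJS 7 API; the specific values are just an illustration, not an example from the paper.

    ```typescript
    import { of, map, scan, tap } from "rxjs";

    of(1, 2, 3)
      .pipe(
        map((x) => x * 2),            // stateless transform: 2, 4, 6
        scan((acc, x) => acc + x, 0), // stateful running sum: 2, 6, 12
        tap((x) => console.log("side effect, saw", x))
      )
      .subscribe((x) => console.log("result:", x));
    ```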

    An Extensible User Interface for Lean 4

    Contemporary proof assistants rely on complex automation and process libraries with millions of lines of code. At these scales, understanding the emergent interactions between components can be a serious challenge. One way of managing complexity, long established in informal practice, is through varying external representations. For instance, algebraic notation facilitates term-based reasoning whereas geometric diagrams invoke spatial intuition. Objects viewed one way become much simpler than when viewed differently. In contrast, modern general-purpose ITP systems usually only support limited, textual representations. Treating this as a problem of human-computer interaction, we aim to demonstrate that presentations - UI elements that store references to the objects they are displaying - are a fruitful way of thinking about ITP interface design. They allow us to make headway on two fronts: introspection of prover internals and support for diagrammatic reasoning. To this end we have built an extensible user interface for the Lean 4 prover with an associated ProofWidgets 4 library of presentation-based UI components. We demonstrate the system with several examples, including type information popups, structured traces, contextual suggestions, a display for algebraic reasoning, and visualizations of red-black trees. Our interface is already part of the core Lean distribution.