
    Reverse Engineering and Testing of Rich Internet Applications

    The World Wide Web experiences continuous evolution, with new initiatives, standards, approaches and technologies constantly being proposed for developing more effective and higher-quality Web applications. To satisfy the market's growing demand for Web applications, recent years have seen the introduction of new technologies, frameworks, tools and environments that allow Web and mobile applications to be developed with minimal effort and in a very short time. These new technologies have enabled the emergence of a new generation of Web applications, named Rich Internet Applications (RIAs), that offer greater usability and interactivity than traditional ones. This evolution has been accompanied by some drawbacks, mostly due to the failure to apply well-known software engineering practices and approaches. As a consequence, new research questions and challenges have emerged in the field of Web and mobile application maintenance and testing. The research activity described in this thesis addresses some of these topics, with the specific aim of proposing new and effective solutions to the problems of modelling, reverse engineering, comprehending, re-documenting and testing existing RIAs. Given the growing relevance of mobile applications in the renewed Web scenario, the problem of testing mobile applications developed for the Android operating system is also addressed, in an attempt to explore and propose new test automation techniques for this type of application.
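    As a rough illustration of the kind of dynamic GUI exploration this line of work automates (a sketch, not the thesis's own tooling), the Python snippet below drives a RIA with Selenium, fires click events, and records DOM-level state transitions. The target URL, the DOM-hash state abstraction, and the event-selection strategy are all simplifying assumptions.

```python
# Minimal dynamic-exploration sketch: click elements, hash the DOM to
# identify client-side states, and collect a transition graph.
# Assumes Selenium and a local ChromeDriver; the URL is a placeholder.
import hashlib
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException

def dom_state(driver):
    """Identify a client-side state by hashing the current DOM."""
    return hashlib.sha1(driver.page_source.encode()).hexdigest()

def explore(url, max_events=20):
    driver = webdriver.Chrome()
    driver.get(url)
    graph = set()  # (from_state, element_label, to_state) transitions
    for _ in range(max_events):
        before = dom_state(driver)
        clickables = driver.find_elements(By.CSS_SELECTOR, "a, button")
        if not clickables:
            break
        target = clickables[0]
        label = target.text or target.tag_name
        try:
            target.click()
        except WebDriverException:
            continue  # element vanished or is not interactable; move on
        after = dom_state(driver)
        if after != before:
            graph.add((before, label, after))
    driver.quit()
    return graph

if __name__ == "__main__":
    for edge in explore("http://localhost:8000"):  # placeholder target
        print(edge)
```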

    Scripts in a Frame: A Framework for Archiving Deferred Representations

    Web archives provide a view of the Web as seen by Web crawlers. Because of rapid advancements and adoption of client-side technologies like JavaScript and Ajax, coupled with the inability of crawlers to execute these technologies effectively, Web resources become harder to archive as they become more interactive. At Web scale, we cannot capture client-side representations using the current state-of-the-art toolsets because of the migration from Web pages to Web applications. Web applications increasingly rely on JavaScript and other client-side programming languages to load embedded resources and change client-side state. We demonstrate that Web crawlers and other automatic archival tools are unable to archive the resulting JavaScript-dependent representations (what we term deferred representations), resulting in missing or incorrect content in the archives and the general inability to replay the archived resource as it existed at the time of capture. Building on prior studies on Web archiving, client-side monitoring of events and embedded resources, and studies of the Web, we establish an understanding of the trends contributing to the increasing unarchivability of deferred representations. We show that JavaScript leads to lower-quality mementos (archived Web resources) due to the archival difficulties it introduces. We measure the historical impact of JavaScript on mementos, demonstrating that the increased adoption of JavaScript and Ajax correlates with the increase in missing embedded resources. To measure memento and archive quality, we propose and evaluate a metric that assesses memento quality closer to Web users’ perception. We propose a two-tiered crawling approach that enables crawlers to capture embedded resources dependent upon JavaScript. After measuring the performance differences between crawling approaches, we propose a classification method that mitigates the performance impacts of the two-tiered crawling approach, and we measure the frontier size improvements observed with the two-tiered approach. Using the two-tiered crawling approach, we measure the number of client-side states associated with each URI-R and propose a mechanism for storing the mementos of deferred representations. In short, this dissertation details a body of work that explores the following: why JavaScript and deferred representations are difficult to archive (establishing the term deferred representation to describe JavaScript-dependent representations); the extent to which JavaScript impacts archivability along with its impact on current archival tools; a metric for measuring the quality of mementos, which we use to describe the impact of JavaScript on archival quality; the performance trade-offs between traditional archival tools and technologies that better archive JavaScript; and a two-tiered crawling approach for discovering and archiving currently unarchivable descendants (representations generated by client-side user events) of deferred representations to mitigate the impact of JavaScript on our archives. In summary, what we archive is increasingly different from what we as interactive users experience. Using the approaches detailed in this dissertation, archives can create mementos closer to what users experience rather than archiving the crawlers’ experiences on the Web.
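    To make the notion of a deferred representation concrete, here is a small sketch (not one of the dissertation's tools) contrasting the embedded resources a non-JavaScript crawler sees in raw HTML with those present after a headless browser executes the page. The URL and the regex-based resource extraction are illustrative simplifications.

```python
# Compare resources in the crawler-visible raw HTML against the DOM
# after client-side JavaScript has run; the difference approximates the
# resources a non-executing crawler would fail to archive.
import re
import requests
from selenium import webdriver

RESOURCE_ATTRS = re.compile(r'(?:src|href)="([^"]+)"')

def crawler_view(url):
    """Resources a non-JavaScript crawler sees in the raw HTML."""
    html = requests.get(url, timeout=10).text
    return set(RESOURCE_ATTRS.findall(html))

def browser_view(url):
    """Resources present after client-side JavaScript has executed."""
    opts = webdriver.ChromeOptions()
    opts.add_argument("--headless")
    driver = webdriver.Chrome(options=opts)
    driver.get(url)
    found = set(RESOURCE_ATTRS.findall(driver.page_source))
    driver.quit()
    return found

if __name__ == "__main__":
    url = "https://example.org/"  # placeholder target
    deferred = browser_view(url) - crawler_view(url)
    print(f"{len(deferred)} resources load only via JavaScript:")
    for resource in sorted(deferred):
        print(" ", resource)
```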

    Crawling AJAX by Inferring User Interface State Changes


    Web Data Extraction, Applications and Techniques: A Survey

    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains; others heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, offering unprecedented opportunities to analyze human behavior at very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for one domain in other domains.
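    As a concrete, minimal illustration of the wrapper-style extraction the survey covers (a sketch, not any surveyed system), the snippet below maps repeated HTML structure to records; the HTML fragment and CSS class names are invented.

```python
# Wrapper-based extraction sketch: one record per repeated element.
from bs4 import BeautifulSoup

HTML = """
<div class="product"><span class="name">Widget</span>
  <span class="price">9.99</span></div>
<div class="product"><span class="name">Gadget</span>
  <span class="price">19.50</span></div>
"""

def extract_products(html):
    """Apply the wrapper: map each .product element to a record."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for item in soup.select("div.product"):
        records.append({
            "name": item.select_one(".name").get_text(strip=True),
            "price": float(item.select_one(".price").get_text(strip=True)),
        })
    return records

print(extract_products(HTML))
# [{'name': 'Widget', 'price': 9.99}, {'name': 'Gadget', 'price': 19.5}]
```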

    The approaches to quantify web application security scanners quality: A review

    The web application security scanner is a computer program that assesses web application security using penetration testing techniques. The benefits of automated web application penetration testing are substantial: a scanner not only reduces the time, cost, and resources required for penetration testing but also reduces reliance on the test engineer's expertise. Nevertheless, web application security scanners suffer from low test coverage and can generate inaccurate test results. Consequently, experiments are frequently conducted to quantitatively assess a scanner's quality and to investigate its strengths and limitations. However, neither a standard methodology nor a standard criterion is available for quantifying a scanner's quality. Hence, this paper conducts a systematic review of the methodologies and criteria used for quantifying web application security scanners' quality. The experimental methodologies and criteria are classified and reviewed using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) protocol. The objectives are to give practitioners an understanding of the methodologies and criteria available for measuring web application security scanners' test coverage, attack coverage, and vulnerability detection rate, and to provide guidance for the development of future testing frameworks, models, methodologies, or criteria for measuring scanner quality.
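    A minimal sketch of the kind of quantification such experiments perform, assuming a benchmark application with known seeded vulnerabilities; the finding identifiers below are invented for illustration.

```python
# Score a scanner's reported findings against a benchmark's ground truth.

def scanner_scores(reported, ground_truth):
    """Detection rate and false-positive rate for one scanner run."""
    true_pos = reported & ground_truth
    false_pos = reported - ground_truth
    return {
        "detection_rate": len(true_pos) / len(ground_truth),
        "false_positive_rate": len(false_pos) / len(reported) if reported else 0.0,
        "missed": sorted(ground_truth - reported),
    }

# Benchmark seeds four vulnerabilities; the scanner finds three plus one spurious.
known = {"sqli:/login", "xss:/search", "xss:/profile", "csrf:/transfer"}
found = {"sqli:/login", "xss:/search", "xss:/profile", "sqli:/logout"}
print(scanner_scores(found, known))
# {'detection_rate': 0.75, 'false_positive_rate': 0.25, 'missed': ['csrf:/transfer']}
```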

    Reverse engineering of web applications

    The MAP-i Doctoral Program of the Universities of Minho, Aveiro and Porto.
    Even so many years after its genesis, the Internet is still growing. Not only is the number of users increasing, so is the number of programming languages and frameworks for building Web applications. This plethora of technologies, however, makes Web applications’ source code hard to comprehend and understand, complicating their debugging and increasing their maintenance costs. In this context, a number of proposals have been put forward to solve this problem. On the one hand, there are techniques that analyze the entire source code of a Web application, but the diversity of available implementation technologies makes these techniques return unsatisfactory results. On the other hand, there are techniques that dynamically (but blindly) explore applications by running them and analyzing the results of random exploration. In this case the results are better, but there is always the chance that some part of the application is left unexplored. This thesis investigates whether a hybrid approach combining static analysis and dynamic exploration of the user interface can provide better results. FREIA, a framework developed in the context of this thesis, is capable of analyzing Web applications automatically, deriving structural and behavioral interface models from them. This research was funded by the Fundação para a CiĂȘncia e Tecnologia through a doctoral grant (SFRH/BD/71136/2010) under the Programa Operacional Potencial Humano (POPH), co-financed by the European Social Fund and by national QREN funds.
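    To illustrate the hybrid idea in miniature (a sketch, not FREIA itself), the snippet below statically mines candidate interaction points from the page source and then exercises only those dynamically, instead of exploring blindly. The onclick-based heuristic, the placeholder URL, and the use of the page title as the observed behavior are all assumptions.

```python
# Hybrid reverse-engineering sketch: a static pass finds declared event
# handlers, and a dynamic pass drives exactly those elements.
# Assumes Selenium/ChromeDriver and BeautifulSoup are available.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

def static_candidates(html):
    """Static pass: ids of elements whose markup declares an onclick handler."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get("id") for el in soup.find_all(attrs={"onclick": True})
            if el.get("id")]

def hybrid_explore(url):
    driver = webdriver.Chrome()
    driver.get(url)
    observed = {}
    for element_id in static_candidates(driver.page_source):
        try:
            driver.find_element(By.ID, element_id).click()  # dynamic pass
        except NoSuchElementException:
            continue
        observed[element_id] = driver.title  # record the resulting behavior
        driver.get(url)                      # reset to the start page
    driver.quit()
    return observed

print(hybrid_explore("http://localhost:8000"))  # placeholder URL
```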

    Topic driven testing

    Modern interactive applications offer so many interaction opportunities that automated exploration and testing become practically impossible without some domain-specific guidance towards relevant functionality. In this dissertation, we present a novel fundamental graphical user interface testing method called topic-driven testing. We mine the semantic meaning of interactive elements, guide testing, and identify core functionality of applications. The semantic interpretation is close to human understanding and allows us to learn specifications and transfer knowledge across multiple applications independent of the underlying device, platform, programming language, or technology stack; to the best of our knowledge, this is a unique feature of our technique. Our tool ATTABOY is able to take an existing Web application test suite, say from Amazon, execute it on ebay, and thus guide testing to relevant core functionality. Tested on different application domains such as eCommerce, news pages, and mail clients, it can transfer on average sixty percent of the tested application behavior to new apps, without any human intervention. On top of that, topic-driven testing can work with even vaguer instructions from how-to descriptions or use-case descriptions. Given an instruction, say “add item to shopping cart”, it tests the specified behavior in an application, both in a browser and in mobile apps. It thus improves state-of-the-art UI testing frameworks, creates change-resilient UI tests, and lays the foundation for learning, transferring, and enforcing common application behavior. The prototype is up to five times faster than existing random testing frameworks and tests functions that are hard to cover by non-trained approaches.
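    As a toy illustration of the semantic-matching step such a method needs (real systems would use learned embeddings rather than this stand-in), the sketch below ranks on-screen widget labels against a natural-language instruction by token overlap; all labels are invented.

```python
# Rank widgets by similarity between their visible labels and an
# instruction; the best match is the widget the test should trigger.

def similarity(instruction, label):
    """Jaccard overlap between instruction tokens and widget-label tokens."""
    a, b = set(instruction.lower().split()), set(label.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_widgets(instruction, labels):
    return sorted(labels, key=lambda lab: similarity(instruction, lab),
                  reverse=True)

widgets = ["Add to cart", "Proceed to checkout", "Save for later", "Sign in"]
print(rank_widgets("add item to shopping cart", widgets))
# ['Add to cart', 'Proceed to checkout', 'Save for later', 'Sign in']
```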

    Upper Tag Ontology (UTO) For Integrating Social Tagging Data

    Data integration and mediation have become central concerns of information technology over the past few decades. With the advent of the Web and the rapid increases in the amount of data and the number of Web documents and users, researchers have focused on enhancing the interoperability of data through the development of metadata schemes. Other researchers have looked to the wealth of metadata generated by bookmarking sites on the Social Web. While several existing ontologies have capitalized on the semantics of metadata created by tagging activities, the Upper Tag Ontology (UTO) emphasizes the structure of tagging activities to facilitate modeling of tagging data and the integration of data from different bookmarking sites, as well as the alignment of tagging ontologies. UTO is described and its utility in modeling, harvesting, integrating, searching, and analyzing data is demonstrated with metadata harvested from three major social tagging systems (Delicious, Flickr, and YouTube).
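    A small, hedged sketch of how one tagging event might be modeled as RDF in the spirit of UTO's structural view (a tagging links a tagger, a tag, an object, a date, and a source). The namespace URI and property names below are placeholders approximating the ontology's vocabulary, not its exact published terms.

```python
# Model a single Flickr tagging event as RDF triples with rdflib.
from rdflib import Graph, Literal, Namespace, RDF, URIRef

UTO = Namespace("http://example.org/uto#")  # placeholder namespace
g = Graph()
g.bind("uto", UTO)

tagging = URIRef("http://example.org/tagging/1")
g.add((tagging, RDF.type, UTO.Tagging))
g.add((tagging, UTO.hasTagger, URIRef("https://www.flickr.com/people/alice")))
g.add((tagging, UTO.hasTag, Literal("sunset")))
g.add((tagging, UTO.hasObject, URIRef("https://www.flickr.com/photos/alice/42")))
g.add((tagging, UTO.hasDate, Literal("2009-05-01")))
g.add((tagging, UTO.hasSource, Literal("Flickr")))

print(g.serialize(format="turtle"))
```

    Because every bookmarking site's tagging event maps onto the same small set of triples, data harvested from different systems can be merged into one graph and queried uniformly.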

    Advanced Automated Web Application Vulnerability Analysis

    Web applications are an integral part of our lives and culture. We use web applications to manage our bank accounts, interact with friends, and file our taxes. A single vulnerability in one of these web applications could allow a malicious hacker to steal your money, to impersonate you on Facebook, or to access sensitive information, such as tax returns. It is vital that we develop new approaches to discover and fix these vulnerabilities before the cybercriminals exploit them. In this dissertation, I will present my research on securing the web against current threats and future threats. First, I will discuss my work on improving black-box vulnerability scanners, which are tools that can automatically discover vulnerabilities in web applications. Then, I will describe a new type of web application vulnerability: Execution After Redirect, or EAR, and an approach to automatically detect EARs in web applications. Finally, I will present deDacota, a first step in the direction of making web applications secure by construction.
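    To make the EAR vulnerability class concrete, here is a minimal, self-contained sketch of the bug pattern in a Flask handler (the dissertation analyzed real frameworks; the route and function names here are invented): the redirect response is created but never returned, so the privileged code after it still executes.

```python
# Execution After Redirect (EAR) sketch: the handler issues a redirect
# for unauthorized users but forgets to return it, so execution falls
# through to the destructive action anyway.
from flask import Flask, redirect, session, url_for

app = Flask(__name__)
app.secret_key = "dev"  # needed for session access; illustrative only

def wipe_all_users():
    print("all users deleted")  # stands in for a destructive action

@app.route("/login")
def login():
    return "please log in"

@app.route("/admin/delete-users")
def delete_users():
    if not session.get("is_admin"):
        redirect(url_for("login"))  # BUG: response discarded; execution continues
    wipe_all_users()                # runs even for unauthenticated visitors
    return "users deleted"

# The fix is `return redirect(url_for("login"))`, which halts the handler.
```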
    • 

    corecore