11,157 research outputs found
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provided a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques allow
to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We discuss also the potential of cross-fertilization, i.e., on the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.Comment: Knowledge-based System
Automated Functional Testing based on the Navigation of Web Applications
Web applications are becoming more and more complex. Testing such
applications is an intricate hard and time-consuming activity. Therefore,
testing is often poorly performed or skipped by practitioners. Test automation
can help to avoid this situation. Hence, this paper presents a novel approach
to perform automated software testing for web applications based on its
navigation. On the one hand, web navigation is the process of traversing a web
application using a browser. On the other hand, functional requirements are
actions that an application must do. Therefore, the evaluation of the correct
navigation of web applications results in the assessment of the specified
functional requirements. The proposed method to perform the automation is done
in four levels: test case generation, test data derivation, test case
execution, and test case reporting. This method is driven by three kinds of
inputs: i) UML models; ii) Selenium scripts; iii) XML files. We have
implemented our approach in an open-source testing framework named Automatic
Testing Platform. The validation of this work has been carried out by means of
a case study, in which the target is a real invoice management system developed
using a model-driven approach.Comment: In Proceedings WWV 2011, arXiv:1108.208
Scripted GUI Testing of Android Apps: A Study on Diffusion, Evolution and Fragility
Background. Evidence suggests that mobile applications are not thoroughly
tested as their desktop counterparts. In particular GUI testing is generally
limited. Like web-based applications, mobile apps suffer from GUI test
fragility, i.e. GUI test classes failing due to minor modifications in the GUI,
without the application functionalities being altered.
Aims. The objective of our study is to examine the diffusion of GUI testing
on Android, and the amount of changes required to keep test classes up to date,
and in particular the changes due to GUI test fragility. We define metrics to
characterize the modifications and evolution of test classes and test methods,
and proxies to estimate fragility-induced changes.
Method. To perform our experiments, we selected six widely used open-source
tools for scripted GUI testing of mobile applications previously described in
the literature. We have mined the repositories on GitHub that used those tools,
and computed our set of metrics.
Results. We found that none of the considered GUI testing frameworks achieved
a major diffusion among the open-source Android projects available on GitHub.
For projects with GUI tests, we found that test suites have to be modified
often, specifically 5\%-10\% of developers' modified LOCs belong to tests, and
that a relevant portion (60\% on average) of such modifications are induced by
fragility.
Conclusions. Fragility of GUI test classes constitute a relevant concern,
possibly being an obstacle for developers to adopt automated scripted GUI
tests. This first evaluation and measure of fragility of Android scripted GUI
testing can constitute a benchmark for developers, and the basis for the
definition of a taxonomy of fragility causes, and actionable guidelines to
mitigate the issue.Comment: PROMISE'17 Conference, Best Paper Awar
Simplicity-oriented lifelong learning of web applications
Nowadays, web applications are ubiquitous. Entire business models revolve around making their services available over the Internet, anytime, anywhere in the world. Due to todayâs rapid development practices, software changes are released faster than ever before, creating the risk of losing control over the quality of the delivered products. To counter this, appropriate testing methodologies must be deeply integrated into each phase of the development cycle to identify potential defects as early as possible and to ensure that the product operates as expected in production. The use of low- and no-code tools and code generation technologies can drastically reduce the implementation effort by using well-tailored (graphical) Domain-Specific Languages (DSLs) to focus on what is important: the product. DSLs and corresponding Integrated Modeling Environments (IMEs) are a key enabler for quality control because many system properties can already be verified at a pre-product level. However, to verify that the product fulfills given functional requirements at runtime, end-to-end testing is still a necessity.
This dissertation describes the implementation of a lifelong learning framework for the continuous quality control of web applications. In this framework, models representing user-level behavior are mined from running systems using active automata learning, and system properties are verified using model checking. All this is achieved in a continuous and fully automated manner. Code changes trigger testing, learning, and verification processes which generate feedback that can be used for model refinement or product improvement. The main focus of this framework is simplicity. On the one hand, it allows Quality Assurance (QA) engineers to apply learning-based testing techniques to web applications with minimal effort, even without writing code; on the other hand, it allows automation engineers to easily implement these techniques in modern development workflows driven by Continuous Integration and Continuous Deployment (CI/CD).
The effectiveness of this framework is leveraged by the Language-Driven Engineering (LDE) approach to web development. Key to this is the text-based DSL iHTML, which enables the instrumentation of user interfaces to make web applications learnable by design, i.e., they adhere to practices that allow fully automated inference of behavioral models without prior specification of an input alphabet. By designing code generators to generate instrumented web-based products, the effort for quality control in the LDE ecosystem is minimized and reduced to formulating runtime properties in temporal logic and verifying them against learned models
Topic driven testing
Modern interactive applications offer so many interaction opportunities that automated exploration and testing becomes practically impossible without some domain specific guidance towards relevant functionality. In this dissertation, we present a novel fundamental graphical user interface testing method called topic-driven testing. We mine the semantic meaning of interactive elements, guide testing, and identify core functionality of applications. The semantic interpretation is close to human understanding and allows us to learn specifications and transfer knowledge across multiple applications independent of the underlying device, platform, programming language, or technology stackâto the best of our knowledge a unique feature of our technique. Our tool ATTABOY is able to take an existing Web application test suite say from Amazon, execute it on ebay, and thus guide testing to relevant core functionality. Tested on different application domains such as eCommerce, news pages, mail clients, it can trans- fer on average sixty percent of the tested application behavior to new appsâwithout any human intervention. On top of that, topic-driven testing can go with even more vague instructions of how-to descriptions or use-case descriptions. Given an instruction, say âadd item to shopping cartâ, it tests the specified behavior in an applicationâboth in a browser as well as in mobile apps. It thus improves state-of-the-art UI testing frame- works, creates change resilient UI tests, and lays the foundation for learning, transfer- ring, and enforcing common application behavior. The prototype is up to five times faster than existing random testing frameworks and tests functions that are hard to cover by non-trained approaches.Moderne interaktive Anwendungen bieten so viele Interaktionsmöglichkeiten, dass eine vollstĂ€ndige automatische Exploration und das Testen aller Szenarien praktisch unmöglich ist. Stattdessen muss die Testprozedur auf relevante KernfunktionalitĂ€t ausgerichtet werden. Diese Arbeit stellt ein neues fundamentales Testprinzip genannt thematisches Testen vor, das beliebige Anwendungen u Ìber die graphische OberflĂ€che testet. Wir untersuchen die semantische Bedeutung von interagierbaren Elementen um die Kernfunktionenen von Anwendungen zu identifizieren und entsprechende Tests zu erzeugen. Statt typischen starren Testinstruktionen orientiert sich diese Art von Tests an menschlichen AnwendungsfĂ€llen in natĂŒrlicher Sprache. Dies erlaubt es, Software Spezifikationen zu erlernen und Wissen von einer Anwendung auf andere zu ĂŒbertragen unabhĂ€ngig von der Anwendungsart, der Programmiersprache, dem TestgerĂ€t oder der -Plattform. Nach unserem Kenntnisstand ist unser Ansatz der Erste dieser Art. Wir prĂ€sentieren ATTABOY, ein Programm, das eine existierende Testsammlung fĂŒr eine Webanwendung (z.B. fĂŒr Amazon) nimmt und in einer beliebigen anderen Anwendung (sagen wir ebay) ausfĂŒhrt. Dadurch werden Tests fĂŒr Kernfunktionen generiert. Bei der ersten AusfĂŒhrung auf Anwendungen aus den DomĂ€nen Online Shopping, Nachrichtenseiten und eMail, erzeugt der Prototyp sechzig Prozent der Tests automatisch. Ohne zusĂ€tzlichen manuellen Aufwand. DarĂŒber hinaus interpretiert themen- getriebenes Testen auch vage Anweisungen beispielsweise von How-to Anleitungen oder Anwendungsbeschreibungen. Eine Anweisung wie "FĂŒgen Sie das Produkt in den Warenkorb hinzu" testet das entsprechende Verhalten in der Anwendung. Sowohl im Browser, als auch in einer mobilen Anwendung. Die erzeugten Tests sind robuster und effektiver als vergleichbar erzeugte Tests. Der Prototyp testet die ZielfunktionalitĂ€t fĂŒnf mal schneller und testet dabei Funktionen die durch nicht spezialisierte AnsĂ€tze kaum zu erreichen sind
Neural Embeddings for Web Testing
Web test automation techniques employ web crawlers to automatically produce a
web app model that is used for test generation. Existing crawlers rely on
app-specific, threshold-based, algorithms to assess state equivalence. Such
algorithms are hard to tune in the general case and cannot accurately identify
and remove near-duplicate web pages from crawl models. Failing to retrieve an
accurate web app model results in automated test generation solutions that
produce redundant test cases and inadequate test suites that do not cover the
web app functionalities adequately. In this paper, we propose WEBEMBED, a novel
abstraction function based on neural network embeddings and threshold-free
classifiers that can be used to produce accurate web app models during
model-based test generation. Our evaluation on nine web apps shows that
WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates
more accurately, inferring better web app models that exhibit 22% more
precision, and 24% more recall on average. Consequently, the test suites
generated from these models achieve higher code coverage, with improvements
ranging from 2% to 59% on an app-wise basis and averaging at 23%.Comment: 12 pages; in revisio
- âŠ