2,676 research outputs found

A Custom Browser Architecture to Execute Web Navigation Sequences

Author: Losada Pérez José
Montoto Paula
Pan Bermúdez Carlos Alberto
Raposo Santiago Juan
Álvarez Díaz Manuel
Publication venue: Springer
Publication date: 18/12/2015

This version of the article has been accepted for publication, after peer review and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/978-3-319-26187-4_11.

Abstract: Web automation applications are widely used for purposes such as B2B integration and automated testing of web applications. Most current systems build the automatic web navigation component on the APIs of conventional browsers. This approach suffers from performance problems in intensive web automation tasks that require real-time responses and/or a high degree of parallelism. Other systems create custom browsers to avoid some of the tasks of conventional browsers, but they still work like conventional browsers when building the internal representation of web pages. In this paper, we present a complete architecture for a custom browser able to efficiently execute web navigation sequences. The proposed architecture supports some novel automatic optimization techniques that can be applied when loading pages and building their internal representation. Tests performed using real web sources show that the reference implementation of the proposed architecture runs significantly faster than other navigation components.

Repositorio da Universidade da Coruña

Sistema multidimensional de armazenamento e classificação de dados

Author: Filipe Ricardo Pimentel
Publication date: 03/12/2018

With the advancement of technology and its wide availability, information is now generated digitally, whether as documents, photos, and videos created by people or as data files produced by electronic devices. This creates a huge amount of available data, which makes it harder to access the desired information and to relate different pieces of data. This dissertation aims to create an online repository capable of storing any type of digital file and associating information with it, both automatically and manually, so that files can be found more easily.

Master's in Computer and Telematics Engineering

Repositório Institucional da Universidade de Aveiro

Web Data Extraction, Applications and Techniques: A Survey

Author: Emilio Ferrara
Pasquale De Meo
Giacomo Fiumara
Robert Baumgartner
Publication venue: Elsevier BV
Publication date: 09/06/2014

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims to provide a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather the large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users, which offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of reusing Web Data Extraction techniques originally designed for a given domain in other domains.

Comment: Knowledge-based System

arXiv.org e-Print Archive

Crossref

Design and Implementation of the UniProt Website

Author: Amos Bairoch
Elisabeth Gasteiger
Eric Jain
Isabelle Phan
Maria J. Martin
Nicole Redaschi
Peter McGarvey
Severine Duvaud
Publication date: 06/12/2008

The UniProt consortium is the main provider of protein sequence and annotation data for much of the life sciences community. The www.uniprot.org website (http://www.uniprot.org) is the primary access point to this data and to its documentation and basic tools. This paper discusses the design and implementation of the new website, which was released in July 2008, and shows how it improves data access for users with different levels of experience, as well as for machines that access the data programmatically.

Nature Precedings

Web browsing automation for applications quality control

Author: Dueñas Juan Carlos
Garcia Gutierrez Boni
Publication venue: Rinton Press
Publication date: 01/11/2015

Context: Quality control comprises the set of activities aimed at evaluating whether software meets its specification and delivers the functionality expected by its consumers. These activities are often removed from the development process and, as a result, the final software product usually lacks quality. Objective: We propose a set of techniques to automate quality control for web applications from the client side, guiding the process by functional and nonfunctional requirements (performance, security, compatibility, usability and accessibility). Method: The first step towards automation is to define the structure of the web navigation, reusing existing software artifacts from the analysis and design phases. Then, the independent navigation paths are found, and each path is traversed automatically using real browsers while different kinds of assessments are carried out. Results: The processes and methods proposed in this paper have been implemented by means of a reference architecture and open source tools. A laboratory experiment and an industrial case study have been performed in order to validate the proposal. Conclusion: The definition of navigation paths is a rich approach to modeling web applications. Grey-box (black-box and white-box) methods have proved to be very valuable for web assessment. The Chinese Postman Problem (CPP) is an optimal way to find the independent paths in a web navigation modeled as a directed graph.

Universidad Carlos III de Madrid e-Archivo

Temporal meta-model framework for Enterprise Information Systems (EIS) development

Author: Davis Jon Edward
Publication venue: Curtin University
Publication date: 01/01/2014

This thesis has developed a Temporal Meta-Model Framework for semi-automated Enterprise System Development, which can help drastically reduce the time and cost to develop, deploy and maintain Enterprise Information Systems throughout their lifecycle. It proposes that analysis and requirements gathering can also accomplish the bulk of the design phase, with the results stored in a suitable model that can then be executed automatically, given a set of specific runtime components.

espace@Curtin

Test Generation and Dependency Analysis for Web Applications

Author: Biagiola Matteo
Publication venue: Università degli studi di Genova
Publication date: 15/01/2020

In web application testing, existing model-based web test generators derive test paths from a navigation model of the web application, completed with either manually or randomly generated inputs. Test path extraction and input generation are handled separately, ignoring the fact that generating inputs for test paths is difficult or even impossible if such paths are infeasible. In this thesis, we propose three directions to mitigate the path infeasibility problem. The first direction uses a search-based approach, defining a novel set of genetic operators that support the joint generation of test inputs and feasible test paths. Results show that this search-based approach can achieve a higher level of model coverage than existing approaches. Secondly, we propose a novel web test generation algorithm that pre-selects the most promising candidate test cases based on their diversity from previously generated tests. Results of our empirical evaluation show that promoting diversity is beneficial not only to a thorough exploration of the web application behaviours, but also to the feasibility of automatically generated test cases. Moreover, the diversity-based approach achieves higher coverage of the navigation model significantly faster than crawling-based and search-based approaches. The third approach we propose uses a web crawler as a test generator. As such, the generated tests are concrete, hence their navigations among the web application states are feasible by construction. However, the crawling trace cannot be easily turned into a minimal test suite that achieves the same coverage, due to test dependencies. Indeed, test dependencies are undesirable in the context of regression testing, preventing the adoption of testing optimization techniques that assume tests to be independent.
In this thesis, we propose the first approach to detect test dependencies in a given web test suite by leveraging the information available both in the web test code and on the client side of the web application. Results of our empirical validation show that our approach can effectively and efficiently detect test dependencies and that it enables dependency-aware formulations of test parallelization and test minimization.

Archivio istituzionale della ricerca - Università di Genova

Enabling Customization of Discussion Forums for Blind Users

Author: Ashok Vikas
Jayarathna Sampath
Lee Hae-Na
Prakash Yash
Sunkara Mohan
Publication venue: ODU Digital Commons
Publication date: 01/01/2023

Online discussion forums have become an integral component of news, entertainment, information, and video-streaming websites, where people all over the world actively engage in discussions on a wide range of topics including politics, sports, music, business, health, and world affairs. Yet, little is known about their usability for blind users, who aurally interact with the forum conversations using screen reader assistive technology. In an interview study, blind users stated that they often had an arduous and frustrating interaction experience while consuming conversation threads, mainly due to the highly redundant content and the absence of customization options to selectively view portions of the conversations. As an initial step towards addressing these usability concerns, we designed PView, a browser extension that enables blind users to customize the content of forum threads in real time as they interact with these threads. Specifically, PView allows blind users to explicitly hide any post that is irrelevant to them; PView then automatically detects and filters out all subsequent posts that are substantially similar to the hidden post in real time, before the users navigate to those portions of the thread. In a user study with blind participants, we observed that, compared to the status quo, PView significantly improved the usability, workload, and satisfaction of the participants while interacting with the forums.

Old Dominion University