128,468 research outputs found

    Web Data Extraction, Applications and Techniques: A Survey

    Full text link
    Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad-hoc domains. Other approaches, instead, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provided a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool to perform data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques allow to gather a large amount of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users and this offers unprecedented opportunities to analyze human behavior at a very large scale. We discuss also the potential of cross-fertilization, i.e., on the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain, in other domains.Comment: Knowledge-based System

    A Survey on Software Testing Techniques using Genetic Algorithm

    Full text link
    The overall aim of the software industry is to ensure delivery of high quality software to the end user. To ensure high quality software, it is required to test software. Testing ensures that software meets user specifications and requirements. However, the field of software testing has a number of underlying issues like effective generation of test cases, prioritisation of test cases etc which need to be tackled. These issues demand on effort, time and cost of the testing. Different techniques and methodologies have been proposed for taking care of these issues. Use of evolutionary algorithms for automatic test generation has been an area of interest for many researchers. Genetic Algorithm (GA) is one such form of evolutionary algorithms. In this research paper, we present a survey of GA approach for addressing the various issues encountered during software testing.Comment: 13 Page

    Identifying and Modelling Complex Workflow Requirements in Web Applications

    Get PDF
    Workflow plays a major role in nowadays business and therefore its requirement elicitation must be accurate and clear for achieving the solution closest to business’s needs. Due to Web applications popularity, the Web is becoming the standard platform for implementing business workflows. In this context, Web applications and their workflows must be adapted to market demands in such a way that time and effort are minimize. As they get more popular, they must give support to different functional requirements but also they contain tangled and scattered behaviour. In this work we present a model-driven approach for modelling workflows using a Domain Specific Language for Web application requirement called WebSpec. We present an extension to WebSpec based on Pattern Specifications for modelling crosscutting workflow requirements identifying tangled and scattered behaviour and reducing inconsistencies early in the cycle

    BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking

    Full text link
    Data generation is a key issue in big data benchmarking that aims to generate application-specific data sets to meet the 4V requirements of big data. Specifically, big data generators need to generate scalable data (Volume) of different types (Variety) under controllable generation rates (Velocity) while keeping the important characteristics of raw data (Veracity). This gives rise to various new challenges about how we design generators efficiently and successfully. To date, most existing techniques can only generate limited types of data and support specific big data systems such as Hadoop. Hence we develop a tool, called Big Data Generator Suite (BDGS), to efficiently generate scalable big data while employing data models derived from real data to preserve data veracity. The effectiveness of BDGS is demonstrated by developing six data generators covering three representative data types (structured, semi-structured and unstructured) and three data sources (text, graph, and table data)

    Ontology based Scene Creation for the Development of Automated Vehicles

    Full text link
    The introduction of automated vehicles without permanent human supervision demands a functional system description, including functional system boundaries and a comprehensive safety analysis. These inputs to the technical development can be identified and analyzed by a scenario-based approach. Furthermore, to establish an economical test and release process, a large number of scenarios must be identified to obtain meaningful test results. Experts are doing well to identify scenarios that are difficult to handle or unlikely to happen. However, experts are unlikely to identify all scenarios possible based on the knowledge they have on hand. Expert knowledge modeled for computer aided processing may help for the purpose of providing a wide range of scenarios. This contribution reviews ontologies as knowledge-based systems in the field of automated vehicles, and proposes a generation of traffic scenes in natural language as a basis for a scenario creation.Comment: Accepted at the 2018 IEEE Intelligent Vehicles Symposium, 8 pages, 10 figure
    • …
    corecore