Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provide a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques make it
possible to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users, and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We also discuss the potential of cross-fertilization, i.e., the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.
Comment: Knowledge-based System
A Survey on Software Testing Techniques using Genetic Algorithm
The overall aim of the software industry is to ensure delivery of high
quality software to the end user. To ensure high quality software, it is
required to test software. Testing ensures that software meets user
specifications and requirements. However, the field of software testing has a
number of underlying issues, such as effective generation of test cases and
prioritisation of test cases, which need to be tackled. These issues add to the
effort, time and cost of testing. Different techniques and methodologies
have been proposed for taking care of these issues. Use of evolutionary
algorithms for automatic test generation has been an area of interest for many
researchers. The Genetic Algorithm (GA) is one such evolutionary
algorithm. In this research paper, we present a survey of GA approaches for
addressing the various issues encountered during software testing.
Comment: 13 Pages
Identifying and Modelling Complex Workflow Requirements in Web Applications
Workflows play a major role in today’s business, and therefore their
requirements elicitation must be accurate and clear to achieve the solution
closest to the business’s needs. Due to the popularity of Web applications, the Web is becoming
the standard platform for implementing business workflows. In this
context, Web applications and their workflows must be adapted to market demands
in such a way that time and effort are minimized. As they become more popular,
they must support different functional requirements, but they also
contain tangled and scattered behaviour. In this work we present a model-driven
approach for modelling workflows using a Domain Specific Language for Web
application requirements called WebSpec. We present an extension to WebSpec
based on Pattern Specifications for modelling crosscutting workflow requirements,
identifying tangled and scattered behaviour and reducing inconsistencies
early in the cycle.
BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking
Data generation is a key issue in big data benchmarking that aims to generate
application-specific data sets to meet the 4V requirements of big data.
Specifically, big data generators need to generate scalable data (Volume) of
different types (Variety) under controllable generation rates (Velocity) while
keeping the important characteristics of raw data (Veracity). This gives rise
to various new challenges in designing generators efficiently and
successfully. To date, most existing techniques can only generate limited types
of data and support specific big data systems such as Hadoop. Hence we develop
a tool, called Big Data Generator Suite (BDGS), to efficiently generate
scalable big data while employing data models derived from real data to
preserve data veracity. The effectiveness of BDGS is demonstrated by developing
six data generators covering three representative data types (structured,
semi-structured and unstructured) and three data sources (text, graph, and
table data).
Ontology based Scene Creation for the Development of Automated Vehicles
The introduction of automated vehicles without permanent human supervision
demands a functional system description, including functional system boundaries
and a comprehensive safety analysis. These inputs to the technical development
can be identified and analyzed by a scenario-based approach. Furthermore, to
establish an economical test and release process, a large number of scenarios
must be identified to obtain meaningful test results. Experts are good at
identifying scenarios that are difficult to handle or unlikely to happen. However,
experts are unlikely to identify all possible scenarios based on the knowledge
they have on hand. Expert knowledge modeled for computer-aided processing may
help to provide a wide range of scenarios. This contribution
reviews ontologies as knowledge-based systems in the field of automated
vehicles, and proposes generating traffic scenes in natural language as a
basis for scenario creation.
Comment: Accepted at the 2018 IEEE Intelligent Vehicles Symposium, 8 pages, 10
figures