1,731 research outputs found
Web Data Extraction, Applications and Techniques: A Survey
Web Data Extraction is an important problem that has been studied by means of
different scientific tools and in a broad range of applications. Many
approaches to extracting data from the Web have been designed to solve specific
problems and operate in ad-hoc domains. Other approaches, instead, heavily
reuse techniques and algorithms developed in the field of Information
Extraction.
This survey aims at providing a structured and comprehensive overview of the
literature in the field of Web Data Extraction. We provided a simple
classification framework in which existing Web Data Extraction applications are
grouped into two main classes, namely applications at the Enterprise level and
at the Social Web level. At the Enterprise level, Web Data Extraction
techniques emerge as a key tool to perform data analysis in Business and
Competitive Intelligence systems as well as for business process
re-engineering. At the Social Web level, Web Data Extraction techniques allow
to gather a large amount of structured data continuously generated and
disseminated by Web 2.0, Social Media and Online Social Network users and this
offers unprecedented opportunities to analyze human behavior at a very large
scale. We discuss also the potential of cross-fertilization, i.e., on the
possibility of re-using Web Data Extraction techniques originally designed to
work in a given domain, in other domains.Comment: Knowledge-based System
CXQuery: A novel XML query language
XML is becoming the data exchange standard on the Internet. Previously proposed XML query languages, such as XQuery, Quilt, YALT, Lorel, and XML-QL, lack schema definition of the query result; therefore, they are limited for defining views, integrating data, updating, and further querying, all of which are often needed in e-Business applications. We propose a novel XML query language called CXQuery, which defines the schema of the query results explicitly and can easily define views, and integrate, update, and query XML data. In addition, CXQuery can express spatial and spatio-temporal queries using a constraint-based querying approach
Knowledge Rich Natural Language Queries over Structured Biological Databases
Increasingly, keyword, natural language and NoSQL queries are being used for
information retrieval from traditional as well as non-traditional databases
such as web, document, image, GIS, legal, and health databases. While their
popularity are undeniable for obvious reasons, their engineering is far from
simple. In most part, semantics and intent preserving mapping of a well
understood natural language query expressed over a structured database schema
to a structured query language is still a difficult task, and research to tame
the complexity is intense. In this paper, we propose a multi-level
knowledge-based middleware to facilitate such mappings that separate the
conceptual level from the physical level. We augment these multi-level
abstractions with a concept reasoner and a query strategy engine to dynamically
link arbitrary natural language querying to well defined structured queries. We
demonstrate the feasibility of our approach by presenting a Datalog based
prototype system, called BioSmart, that can compute responses to arbitrary
natural language queries over arbitrary databases once a syntactic
classification of the natural language query is made
A virtual environment to support the distributed design of large made-to-order products
An overview of a virtual design environment (virtual platform) developed as part of the European Commission funded VRShips-ROPAX (VRS) project is presented. The main objectives for the development of the virtual platform are described, followed by the discussion of the techniques chosen to address the objectives, and finally a description of a use-case for the platform. Whilst the focus of the VRS virtual platform was to facilitate the design of ROPAX (roll-on passengers and cargo) vessels, the components within the platform are entirely generic and may be applied to the distributed design of any type of vessel, or other complex made-to-order products
- …