28,447 research outputs found
Efficient Incremental Breadth-Depth XML Event Mining
Many applications log a large amount of events continuously. Extracting
interesting knowledge from logged events is an emerging active research area in
data mining. In this context, we propose an approach for mining frequent events
and association rules from logged events in XML format. This approach is
composed of two-main phases: I) constructing a novel tree structure called
Frequency XML-based Tree (FXT), which contains the frequency of events to be
mined; II) querying the constructed FXT using XQuery to discover frequent
itemsets and association rules. The FXT is constructed with a single-pass over
logged data. We implement the proposed algorithm and study various performance
issues. The performance study shows that the algorithm is efficient, for both
constructing the FXT and discovering association rules
IVOA Recommendation: Simple Spectral Access Protocol Version 1.1
The Simple Spectral Access (SSA) Protocol (SSAP) defines a uniform interface
to remotely discover and access one dimensional spectra. SSA is a member of an
integrated family of data access interfaces altogether comprising the Data
Access Layer (DAL) of the IVOA. SSA is based on a more general data model
capable of describing most tabular spectrophotometric data, including time
series and spectral energy distributions (SEDs) as well as 1-D spectra; however
the scope of the SSA interface as specified in this document is limited to
simple 1-D spectra, including simple aggregations of 1-D spectra. The form of
the SSA interface is simple: clients first query the global resource registry
to find services of interest and then issue a data discovery query to selected
services to determine what relevant data is available from each service; the
candidate datasets available are described uniformly in a VOTable format
document which is returned in response to the query. Finally, the client may
retrieve selected datasets for analysis. Spectrum datasets returned by an SSA
spectrum service may be either precomputed, archival datasets, or they may be
virtual data which is computed on the fly to respond to a client request.
Spectrum datasets may conform to a standard data model defined by SSA, or may
be native spectra with custom project-defined content. Spectra may be returned
in any of a number of standard data formats. Spectral data is generally stored
externally to the VO in a format specific to each spectral data collection;
currently there is no standard way to represent astronomical spectra, and
virtually every project does it differently. Hence spectra may be actively
mediated to the standard SSA-defined data model at access time by the service,
so that client analysis programs do not have to be familiar with the
idiosyncratic details of each data collection to be accessed
Survey over Existing Query and Transformation Languages
A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability
of many current Semantic Web approaches to cope with data available in such diverging
representation formalisms as XML, RDF, or Topic Maps. A common query language is the first
step to allow transparent access to data in any of these formats. To further the understanding
of the requirements and approaches proposed for query languages in the conventional as well
as the Semantic Web, this report surveys a large number of query languages for accessing
XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from
all these areas. From the detailed survey of these query languages, a common classification
scheme is derived that is useful for understanding and differentiating languages within and
among all three areas
Data integration through service-based mediation for web-enabled information systems
The Web and its underlying platform technologies have often been used to integrate existing software and information systems. Traditional techniques for data representation and transformations between documents are not sufficient to support a flexible and maintainable data integration solution that meets the requirements of modern complex Web-enabled software and information systems. The difficulty
arises from the high degree of complexity of data structures, for example in business and technology applications, and from the constant change of data and its
representation. In the Web context, where the Web platform is used to integrate different organisations or software systems, additionally the problem of heterogeneity
arises. We introduce a specific data integration solution for Web applications such as Web-enabled information systems. Our contribution is an integration technology
framework for Web-enabled information systems comprising, firstly, a data integration technique based on the declarative specification of transformation rules and the construction of connectors that handle the integration and, secondly, a mediator architecture based on information services and the constructed connectors to handle the integration process
- …