Querying standardized EHRs by a Search Ontology XML extension (SOX).
Motivation: The previously developed Search Ontology (SO) allows domain experts to formally specify domain concepts, search terms associated with a domain, and rules describing domain concepts. So far, Lucene search queries can be generated from the information contained in the SO and used for querying literature databases or PubMed. However, these queries are not well suited for querying XML documents because they do not follow the structure of those documents. In the medical domain, many information items are coded in XML, so querying structured XML documents is crucial for retrieving similar cases or for identifying potential study participants. For example, information items of patients with a similar tumor classification documented in a certain section of the respective pathology report need to be retrieved. This requires a precise definition of queries. In this paper, we introduce a concept for the generation of such queries using a Search Ontology XML extension to enable semantic searches on structured data. Results: To gain precision, the paragraph of a document in which a specific information item expressed in a query is expected to appear needs to be specified. The Search Ontology XML Extension (SOX) connects search terms to certain sections in XML documents. The extension consists of a class that represents the XML structure and a relation between search terms and this XML structure. This enables the automatic generation of XPath expressions, which makes an efficient and precise search of structured pathology reports in XML databases possible. The combination of standardized Electronic Health Records with an ontology-based query method promises a gain of precision, a high degree of interoperability, and long-term durability of both XML documents and queries on XML documents.
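The idea of connecting search terms to a document section and generating an XPath expression from that link can be illustrated with a minimal Python sketch. It is not the SOX implementation: the element name `section`, the attribute `name`, the root element `report`, and the helper names `build_xpath` and `find_matching_sections` are all assumptions made for illustration, and the search terms are assumed to contain no quote characters.

```python
import xml.etree.ElementTree as ET


def build_xpath(section_name, terms):
    """Build an XPath 1.0 string that restricts term matches to one section.

    Assumes a hypothetical layout <report><section name="...">...</section></report>
    and terms without quote characters.
    """
    conditions = " or ".join(f"contains(., '{t}')" for t in terms)
    return f"/report/section[@name='{section_name}'][{conditions}]"


def find_matching_sections(root, section_name, terms):
    """Evaluate the same restriction with the standard library.

    ElementTree supports only a subset of XPath (no contains()), so the
    section is selected by path and the terms are checked in Python.
    """
    hits = []
    for section in root.findall(f"./section[@name='{section_name}']"):
        text = "".join(section.itertext())
        if any(term in text for term in terms):
            hits.append(section)
    return hits


doc = ET.fromstring(
    "<report>"
    "<section name='tumor_classification'>Stage pT2 N0</section>"
    "<section name='history'>pT2 mentioned earlier</section>"
    "</report>"
)
query = build_xpath("tumor_classification", ["pT2", "pT3"])
matches = find_matching_sections(doc, "tumor_classification", ["pT2", "pT3"])
```

In this sketch only the `tumor_classification` section is returned, even though the term also occurs in the `history` section, which is the gain in precision the abstract describes. A full XPath engine would be needed to evaluate the generated string directly.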
Survey over Existing Query and Transformation Languages
A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability
of many current Semantic Web approaches to cope with data available in such diverging
representation formalisms as XML, RDF, or Topic Maps. A common query language is the first
step to allow transparent access to data in any of these formats. To further the understanding
of the requirements and approaches proposed for query languages in the conventional as well
as the Semantic Web, this report surveys a large number of query languages for accessing
XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from
all these areas. From the detailed survey of these query languages, a common classification
scheme is derived that is useful for understanding and differentiating languages within and
among all three areas.
Investigation into Indexing XML Data Techniques
The rapid development of XML technology has improved the WWW, since XML data has many advantages and has become a common technology for transferring data across the Internet. The objective of this research is therefore to investigate XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues.
Furthermore, this research considers the most common XML indexing techniques and compares them. Subsequently, this work presents an argument that identifies these limitations. To conclude, the main problem of all the XML indexing techniques is the trade-off between the size and the efficiency of the indexes: all the indexes become large in order to perform well, and none of them is suitable for all users' requirements. However, each of these techniques has some advantages.
Querying a regulatory model for compliant building design audit
The ingredients for an effective automated audit of a building design include a BIM model containing the design information, an electronic regulatory knowledge model, and a practical method of processing these computerised representations. There have been numerous approaches to computer-aided compliance audit in the AEC/FM domain over the last four decades, but none has yet evolved into a practical solution. One reason is that they have all been isolated attempts that lack any form of standardisation. The current research project therefore focuses on using open-standard regulatory knowledge and BIM representations in conjunction with open-standard executable compliant design workflows to automate the compliance audit process. This paper provides an overview of different approaches to accessing information from a regulatory model representation. The paper then describes the use of a purpose-built high-level domain-specific query language to extract regulatory information as part of the effort to automate manual design procedures for compliance audit.
Content-Aware DataGuides for Indexing Large Collections of XML Documents
XML is well-suited for modelling structured data with
textual content. However, most indexing approaches perform
structure and content matching independently, combining
the retrieved path and keyword occurrences in a third
step. This paper shows that retrieval in XML documents can
be accelerated significantly by processing text and structure
simultaneously during all retrieval phases. To this end,
the Content-Aware DataGuide (CADG) enhances the well-known
DataGuide with (1) simultaneous keyword and path
matching and (2) a precomputed content/structure join. Extensive
experiments prove the CADG to be 50-90% faster
than the DataGuide for various sorts of query and document,
including difficult cases such as poorly structured
queries and recursive document paths. A new query classification
scheme identifies precise query characteristics with
a predominant influence on the performance of the individual
indices. The experiments show that the CADG is applicable
to many real-world applications, in particular large
collections of heterogeneously structured XML documents.
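The principle of matching keywords and paths simultaneously, rather than retrieving them independently and joining afterwards, can be illustrated with a small Python sketch. This is a deliberately simplified stand-in and not the CADG data structure from the paper: it keys a dictionary by (label path, keyword) pairs so that one lookup answers a combined structure and content query. The function names `build_index` and `lookup` are hypothetical.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_index(root):
    """Map (label path, lowercase keyword) to the elements containing it.

    Simplification: only the text directly before an element's first child
    is indexed; tail text is ignored.
    """
    index = defaultdict(list)

    def visit(node, parent_path):
        path = parent_path + "/" + node.tag
        for word in (node.text or "").lower().split():
            index[(path, word)].append(node)
        for child in node:
            visit(child, path)

    visit(root, "")
    return index


def lookup(index, path, keyword):
    """Answer a combined path and keyword query with a single lookup."""
    return index.get((path, keyword.lower()), [])


doc = ET.fromstring(
    "<lib>"
    "<book><title>XML indexing</title></book>"
    "<article><title>XML queries</title></article>"
    "</lib>"
)
index = build_index(doc)
book_hits = lookup(index, "/lib/book/title", "xml")
no_hits = lookup(index, "/lib/article/title", "indexing")
```

Because the path is part of the key, a keyword that occurs only under a different path never produces a candidate that has to be filtered out in a later join step.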