108 research outputs found
Optimization of Regular Path Queries in Graph Databases
Regular path queries offer a powerful navigational mechanism in graph databases. Recently, there has been renewed interest in such queries in the context of the Semantic Web. The extension of SPARQL in version 1.1 with property paths offers a type of regular path query for RDF graph databases. While eminently useful, such queries are difficult to optimize and evaluate efficiently. We design and implement a cost-based optimizer, which we call Waveguide, for SPARQL queries with property paths. Waveguide builds a query plan, which we call a waveplan (WP), that guides the query evaluation. There are numerous choices in the construction of a plan, and a number of optimization methods, so the space of plans for a query can be quite large. Execution costs of plans for the same query can vary by orders of magnitude, with the best plan often offering excellent performance. A WP's costs can be estimated, which opens the way to cost-based optimization. We demonstrate that Waveguide properly subsumes existing techniques and that the new plans it adds are relevant. We analyze the effective plan space which is enabled by Waveguide and design an efficient enumerator for it. We implement a prototype of a Waveguide cost-based optimizer on top of an open-source relational RDF store. Finally, we perform a comprehensive performance study of the state of the art for evaluation of SPARQL property paths and demonstrate the significant performance gains that Waveguide offers.
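To make the notion of a regular path query concrete, the following is a minimal sketch of evaluating a one-or-more path (written `predicate+` in SPARQL 1.1 property paths) by breadth-first search over a set of triples. It is not Waveguide or its waveplans; the function name `eval_plus` and the sample triples are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

def eval_plus(triples, start, predicate):
    """Return every node reachable from `start` via one or more `predicate` edges."""
    # Build an adjacency list restricted to the predicate of interest.
    adj = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            adj[s].add(o)

    reached = set()
    frontier = deque(adj[start])
    while frontier:
        node = frontier.popleft()
        if node in reached:
            continue  # already expanded; this also guards against cycles
        reached.add(node)
        frontier.extend(adj[node] - reached)
    return reached

# Hypothetical sample data: a small cycle of "knows" edges plus one unrelated edge.
triples = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("c", "likes", "d"),
]
print(eval_plus(triples, "a", "knows"))
```

This sketch always expands forward from the start node. A cost-based optimizer of the kind the abstract describes would choose among many such evaluation strategies, which is where the large differences in execution cost arise.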
Accessing Information Based on a Combination of Document Structure and Content: Exploiting XML tags in indexing and searching to enhance content retrieval of online document-centric XML encoded texts
This study explores the challenges of using traditional information retrieval methods to retrieve document-centric XML encoded text. It demonstrates how coupling structure and content in query and index formulation improves retrieval performance. Native XML database (NXD) and search engine technologies were evaluated in a baseline experiment, and in a second test after alterations were made to their respective indexes. Documents were retrieved for simple and complex forms of 30 XPath and keyword queries from a corpus of 95 XML/TEI encoded texts. Overall results indicated that query augmentation using document structure improves retrieval performance. Complex queries submitted to the NXD produced the most satisfying results, with an average precision of 93.3% and an average recall of 86.3%. Performance improvements were also achieved using complex, structured queries and indexes in the search engine. Study findings suggest that effective XML retrieval models might result from a combination of unstructured and structured retrieval techniques.
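As an illustration of coupling structure and content in a query, the sketch below restricts a keyword match to elements selected by a path expression, using the limited XPath subset in Python's standard `xml.etree.ElementTree`. The sample markup, the `search` function, and the queries are hypothetical; they are not the study's corpus, queries, or tools.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like sample with two structurally different divisions.
SAMPLE = """<text><body>
<div type="letter"><head>To a friend</head><p>The harvest was poor this year.</p></div>
<div type="poem"><head>Harvest</head><l>Golden fields at dusk</l></div>
</body></text>"""

def search(xml_text, path, keyword):
    """Return the text of elements matching `path` whose content contains `keyword`."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    hits = []
    for element in root.findall(path):
        content = "".join(element.itertext()).strip()
        if needle in content.lower():
            hits.append(content)
    return hits

# Content only: any division mentioning the keyword.
broad = search(SAMPLE, ".//div", "harvest")
# Structure plus content: only paragraphs inside letters.
narrow = search(SAMPLE, ".//div[@type='letter']/p", "harvest")
print(len(broad), narrow)
```

The structured query returns only the paragraph from the letter, while the keyword-only query also matches the poem, which shows in miniature how structure can raise precision.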
Searching without SQL: Re-engineering a database-centric web application with open-source information retrieval software
This paper describes the process by which a database-centric web application was redesigned and rewritten to take advantage of Apache's Lucene, an open-source information retrieval software library written in the Java programming language. After the implementation of a Lucene-based text index of "semi-structured data", a college radio station's card catalog application was able to deliver higher-quality search results in significantly less time than it could using a relational database alone. Additionally, the dramatic improvements in speed and performance allowed the search results interface to be redesigned and enhanced with an improved pagination system and new features such as faceted search/filtering.
A Practical Framework for Storing and Searching Encrypted Data on Cloud Storage
Security has become a significant concern with the increased popularity of
cloud storage services, because data stored in the cloud is vulnerable to
access by third parties. Security is one of the major hurdles for users when
data that resides in local storage is outsourced to the cloud. Outsourcing
has given rise to concerns about data confidentiality that persist even after
the data is deleted from cloud storage. A serious problem also arises when
encrypted data needs to be shared with more people than the data owner
initially designated. In addition, searching over encrypted data is a
fundamental issue in cloud storage and represents a significant challenge.
Searchable encryption (SE) allows a cloud server to conduct a search over
encrypted data on behalf of the data users without learning the underlying
plaintexts. While many academic SE schemes show provable security, they usually
expose some query information, making them less practical, weak in usability,
and challenging to deploy. Also, sharing encrypted data with other authorized
users requires providing each document's secret key. This approach has many
limitations due to the difficulty of key management and distribution.
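The idea of a server searching without seeing plaintext keywords can be sketched with keyed keyword tokens. The example below is a deliberately simple illustration using HMAC-SHA256 from the Python standard library; it is not the CryptoSearch design or any specific published SE scheme, and all function names and sample documents are hypothetical. It covers only the keyword index, not encryption of the documents themselves.

```python
import hashlib
import hmac

def keyword_token(key: bytes, keyword: str) -> str:
    """Derive a deterministic search token for a keyword under a secret key."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key: bytes, docs: dict) -> dict:
    """Client side: map each keyword token to the ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(keyword_token(key, word), set()).add(doc_id)
    return index

def server_search(index: dict, token: str) -> set:
    """Server side: match an opaque token; the server never sees the keyword."""
    return index.get(token, set())

# Hypothetical key and documents, for illustration only.
key = b"k" * 32
docs = {"d1": "cloud storage security", "d2": "encrypted cloud search"}
index = build_index(key, docs)
print(server_search(index, keyword_token(key, "cloud")))
```

Because the tokens are deterministic, the server can tell when the same keyword is queried twice. This is one example of the query information that, as the abstract notes, such schemes usually expose.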
We have designed a system that uses existing cryptographic approaches to
enable search on encrypted data in the cloud. The primary focus of our
proposed model is to ensure user privacy and security through a less
computationally intensive, user-friendly system with a trusted third-party
entity. To demonstrate our proposed model, we have implemented a web
application called CryptoSearch as an overlay system on top of a well-known
cloud storage domain. It exhibits secure search on encrypted data with no
compromise to user-friendliness or to the scheme's functional performance in
real-world applications.
Comment: 146 pages, Master's thesis, 6 chapters, 96 figures, 11 tables