
    Analysis of multiple update techniques on a RDF keyword search system

    Keyword search is a technology that allows non-expert users to explore and retrieve information, and it is traditionally used for unstructured data, such as in Web page searches. In the last decade, this search method has also become popular for exploring structured data, such as relational databases or graphs. Instead of writing complex SQL or SPARQL queries, which requires knowing the underlying schema, the user types a series of words (keywords) describing the information need and receives as answers the results that best match the query. Keyword search systems are evaluated along two fundamental dimensions, efficiency and effectiveness: like a well-formed SPARQL or SQL query, a good system should return accurate answers quickly, even when operating on large amounts of data. The "virtual documents" method allows keyword search systems to work on large databases by generating answers to keyword queries in reasonable time. This paper aims to replicate the keyword search systems based on "virtual documents", TSA+BM25 and TSA+VDP, for RDF graphs. In addition, two methods of update processing in a keyword search system are presented and analyzed: BruteForce and semiTSA. Although keyword search is a growing research area, the topic of updates on structured data, such as RDF data, has not yet been addressed in the literature.
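
The abstract does not describe how semiTSA works internally. As a rough, hedged illustration of the design space it explores, the sketch below contrasts a brute-force strategy (rebuild every virtual document after an update) with a generic incremental strategy that rebuilds only the documents whose subject is touched by the inserted or deleted triples; the function and variable names are hypothetical and the incremental variant is not claimed to be the paper's semiTSA algorithm.

```python
# Hedged sketch of two update strategies for a virtual-document index over RDF
# triples. "Brute force" rebuilds every document after an update; the
# incremental variant rebuilds only documents whose subject was touched.
# This is a generic illustration, not the paper's semiTSA method.
from collections import defaultdict

def build_virtual_documents(triples):
    """Flatten each subject's triples into one 'virtual document' (a token list)."""
    docs = defaultdict(list)
    for s, p, o in triples:
        docs[s].extend(str(p).split())
        docs[s].extend(str(o).split())
    return docs

def brute_force_update(triples, inserted, deleted):
    """Apply the update, then rebuild every virtual document from scratch."""
    updated = (set(triples) | set(inserted)) - set(deleted)
    return build_virtual_documents(updated), updated

def incremental_update(docs, triples, inserted, deleted):
    """Apply the update, then rebuild only the documents of touched subjects."""
    updated = (set(triples) | set(inserted)) - set(deleted)
    touched = {s for s, _, _ in list(inserted) + list(deleted)}
    for subj in touched:
        remaining = [t for t in updated if t[0] == subj]
        docs[subj] = build_virtual_documents(remaining).get(subj, [])
    return docs, updated
```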

    EAGLE—A Scalable Query Processing Engine for Linked Sensor Data

    Recently, many approaches have been proposed to manage sensor data using semantic web technologies for effective heterogeneous data integration. However, our empirical observations revealed that these solutions primarily focused on semantic relationships and unfortunately paid less attention to spatio-temporal correlations. Most semantic approaches do not have spatio-temporal support. Some of them have attempted to provide full spatio-temporal support, but have poor performance for complex spatio-temporal aggregate queries. In addition, while the volume of sensor data is rapidly growing, the challenge of querying and managing the massive volumes of data generated by sensing devices still remains unsolved. In this article, we introduce EAGLE, a spatio-temporal query engine for querying sensor data based on the linked data model. The ultimate goal of EAGLE is to provide an elastic and scalable system which allows fast searching and analysis with respect to the relationships of space, time and semantics in sensor data. We also extend SPARQL with a set of new query operators in order to support spatio-temporal computing in the linked sensor data context.
    Funding: EC/H2020/732679/EU, ACTivating InnoVative IoT smart living environments for AGEing well (ACTIVAGE); EC/H2020/661180/EU, A Scalable and Elastic Platform for Near-Realtime Analytics for The Graph of Everything (SMARTE)
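
The abstract mentions new SPARQL operators for spatio-temporal computing but does not list them, so the example below only illustrates the kind of query such an engine is meant to answer, written with the standard SOSA and GeoSPARQL vocabularies and issued through the SPARQLWrapper client. The endpoint URL, the polygon and the timestamp are placeholders, and the query does not use EAGLE's own operator syntax.

```python
# Example spatio-temporal query over linked sensor data, expressed with the
# standard SOSA and GeoSPARQL vocabularies (EAGLE's own operators are not
# listed in the abstract). Endpoint URL and filter values are placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:8890/sparql")  # placeholder endpoint
sparql.setReturnFormat(JSON)
sparql.setQuery("""
PREFIX sosa: <http://www.w3.org/ns/sosa/>
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

SELECT ?obs ?value ?time WHERE {
  ?obs a sosa:Observation ;
       sosa:hasSimpleResult ?value ;
       sosa:resultTime ?time ;
       sosa:hasFeatureOfInterest ?foi .
  ?foi geo:hasGeometry/geo:asWKT ?wkt .
  FILTER(?time >= "2020-01-01T00:00:00Z"^^xsd:dateTime)   # temporal window
  FILTER(geof:sfWithin(?wkt,                              # spatial region
         "POLYGON((-6.4 53.2, -6.1 53.2, -6.1 53.4, -6.4 53.4, -6.4 53.2))"^^geo:wktLiteral))
}
LIMIT 100
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["obs"]["value"], row["value"]["value"], row["time"]["value"])
```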

    A Linked Data Approach to Sharing Workflows and Workflow Results

    A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the ‘Materials and Methods’ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics ‘materials and methods’ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively ‘light-weight’ and unobtrusive to bioinformatics users.
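
The abstract describes linking a workflow, its run-time provenance, and a personal interpretation via RDF. A minimal sketch of that linking pattern, using rdflib and the W3C PROV-O vocabulary, might look like the code below; all resource URIs and literals are made up for illustration and are not the paper's actual data.

```python
# Minimal sketch of linking a workflow, one run, its result, and a personal
# interpretation in RDF with PROV-O. All URIs and literals are illustrative.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, RDFS

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/")            # hypothetical namespace for local IDs

g = Graph()
g.bind("prov", PROV)

workflow = URIRef("http://www.myexperiment.org/workflows/1234")  # placeholder workflow URI
run = EX["run/42"]
result = EX["result/42"]
note = EX["interpretation/42"]

g.add((run, RDF.type, PROV.Activity))            # one execution of the workflow
g.add((run, PROV.used, workflow))                # link the run to its workflow definition
g.add((result, RDF.type, PROV.Entity))
g.add((result, PROV.wasGeneratedBy, run))        # run-time provenance of the result
g.add((note, PROV.wasDerivedFrom, result))       # the researcher's interpretation
g.add((note, RDFS.comment, Literal("Candidate genes appear enriched for pathway X.")))

print(g.serialize(format="turtle"))
```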

    Role of Semantic web in the changing context of Enterprise Collaboration

    In order to compete with the global giants, enterprises are concentrating on their core competencies and collaborating with organizations that complement their skills and core activities. The current trend is to develop temporary alliances of independent enterprises, in which companies can come together to share skills, core competencies and resources. However, knowledge sharing and communication among multidiscipline companies is a complex and challenging problem. In a collaborative environment, the meaning of knowledge is drastically affected by the context in which it is viewed and interpreted, thus necessitating the treatment of both the structure and the semantics of the data stored in enterprise repositories. Keeping the present market and technological scenario in mind, this research aims to propose tools and techniques that can enable companies to assimilate distributed information resources and achieve their business goals.

    Approximating expressive queries on graph-modeled data: The GeX approach

    We present the GeX (Graph-eXplorer) approach for the approximate matching of complex queries on graph-modeled data. GeX generalizes existing approaches and provides a highly expressive graph-based query language that supports queries ranging from keyword-based to structured ones. The GeX query answering model gracefully blends label approximation with structural relaxation, under the primary objective of delivering only meaningfully approximated results. GeX implements ad-hoc data structures that are exploited by a top-k retrieval algorithm which enhances the approximate matching of complex queries. An extensive experimental evaluation on real-world datasets demonstrates the efficiency of GeX query answering.
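
GeX's actual data structures and scoring model are not given in this abstract. As a rough illustration of top-k retrieval that blends a label-similarity score with a structural-relaxation penalty (the general pattern such engines rely on), a generic Python sketch using a bounded heap might look like this; the scoring functions are stand-ins, not GeX's model.

```python
# Generic top-k retrieval sketch: combine a label-similarity score with a
# structural-relaxation penalty and keep only the k best candidates.
# The scoring functions are illustrative, not GeX's actual model.
import heapq
from difflib import SequenceMatcher

def label_score(query_label: str, node_label: str) -> float:
    """Approximate label match in [0, 1] (stand-in for label approximation)."""
    return SequenceMatcher(None, query_label.lower(), node_label.lower()).ratio()

def top_k_matches(query_label, candidates, k=10):
    """candidates: iterable of (node_label, relaxation_penalty) pairs."""
    heap = []  # min-heap of (score, label) holding the current best k
    for node_label, penalty in candidates:
        score = label_score(query_label, node_label) - penalty
        if len(heap) < k:
            heapq.heappush(heap, (score, node_label))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, node_label))
    return sorted(heap, reverse=True)

print(top_k_matches("director", [("Director", 0.0), ("Direction", 0.2), ("Actor", 0.0)], k=2))
```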

    Search Text to Retrieve Graphs: A Scalable RDF Keyword-Based Search System

    Keyword-based access to structured data has been gaining traction both in research and industry as a means to facilitate access to information. In recent years, the research community and big data technology vendors have put much effort into developing new approaches for keyword search over structured data. Accessing these data through structured query languages, such as SQL or SPARQL, can be hard for end-users accustomed to Web-based search systems. To overcome this issue, keyword search in databases is becoming the technology of choice, although its efficiency and effectiveness problems still prevent large-scale adoption. In this work, we focus on graph data, and we propose the TSA+BM25 and TSA+VDP keyword search systems over RDF datasets, based on the "virtual documents" approach. This approach enables high scalability because it moves most of the computational complexity off-line and then exploits highly efficient text retrieval techniques and data structures to carry out the on-line phase. Text retrieval techniques scale well to large datasets but need to be adapted to the complexity of structured data. The new approaches we propose are more efficient and effective than state-of-the-art systems. In particular, we show that our systems scale to RDF datasets composed of hundreds of millions of triples and obtain competitive results in terms of effectiveness.
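
The off-line/on-line split described above can be sketched in a simplified form: off-line, the triples around each subject are flattened into a "virtual document"; on-line, keyword queries are ranked against those documents with a standard text-retrieval function such as BM25. The snippet below is an illustration of that idea, not the TSA+BM25 or TSA+VDP implementation; it assumes the rdflib and rank_bm25 packages and a hypothetical data.nt file.

```python
# Simplified sketch of the "virtual documents" approach: off-line, each
# subject's triples are flattened into a text document; on-line, keyword
# queries are ranked against those documents with BM25.
# "data.nt" is a hypothetical file; rdflib and rank_bm25 are assumed installed.
from collections import defaultdict

from rdflib import Graph
from rank_bm25 import BM25Okapi

g = Graph()
g.parse("data.nt", format="nt")

# Off-line phase: build one "virtual document" per subject node.
docs = defaultdict(list)
for s, p, o in g:
    docs[s].append(str(p).rsplit("/", 1)[-1].lower())   # predicate local name
    docs[s].extend(str(o).lower().split())               # object text

subjects = list(docs)
bm25 = BM25Okapi([docs[s] for s in subjects])

# On-line phase: score a keyword query and return the best-matching subjects.
query = "keyword search rdf".split()
scores = bm25.get_scores(query)
for subj, score in sorted(zip(subjects, scores), key=lambda x: -x[1])[:5]:
    print(f"{score:6.2f}  {subj}")
```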

    SEMANTIC HYPERCAT

    The rapidly increasing number of sensor networks and smart devices has led to the generation of huge amounts of information. Information that is generated by several sources and is available in different formats highlights interoperability as one of the key preconditions for the success of the Internet of Things (IoT). Hypercat is a specification defining a JSON-based catalogue, designed to serve the needs of industry. In this thesis, I extend the existing work on semantic enrichment of Hypercat by defining a JSON-LD based catalogue. The proposed JSON-LD specification offers a mapping mechanism between JSON and JSON-LD catalogues, while highlighting the fact that JSON-LD could be seamlessly adopted by the Hypercat community.
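
Neither the Hypercat field names nor the proposed @context are reproduced in this abstract, so the snippet below only illustrates the general idea: attaching a JSON-LD @context to an existing JSON catalogue so that its keys map to RDF terms. The catalogue structure, key names and vocabulary URIs are assumptions, not the thesis specification.

```python
# Illustrative only: turn a plain JSON catalogue entry into JSON-LD by adding
# a @context that maps its keys to vocabulary terms. The key names and the
# vocabulary URIs below are assumptions, not the Hypercat/thesis specification.
import json

catalogue = {
    "items": [
        {"href": "http://example.org/sensors/temp-1",
         "description": "Temperature sensor, building A"}
    ]
}

context = {
    "@vocab": "http://example.org/vocab#",      # hypothetical base vocabulary
    "href": "@id",                              # item URLs become node identifiers
    "description": "http://purl.org/dc/terms/description",
    "items": "@graph"
}

jsonld_catalogue = {"@context": context, **catalogue}
print(json.dumps(jsonld_catalogue, indent=2))
```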