2 research outputs found

    Semantic querying and search in distributed ontologies

    We have observed in recent years a continuous growth in the quantity of RDF data accessible on the web. This growth is driven largely by different sectors, such as governments, life-science researchers, and academic institutes, publishing data on the web. RDF data is created mainly by converting existing data resources, in particular relational databases, into RDF. These RDF datasets are typically published as linked-data URIs and SPARQL endpoints. With the continuing growth in the number of SPARQL endpoints, accessing sets of distributed RDF data repositories is gaining popularity. This research offers an extensive analysis of accessing RDF data across distributed ontologies. Existing approaches lack a broad combination of RDF indexing and retrieval of distributed RDF data in a single package. In addition, current methods are not very dynamic and depend mainly on fixed, manually configured strategies for accessing RDF data in a distributed environment. The literature review established the need for a robust, reliable, dynamic, and comprehensive access mechanism for distributed RDF data based on RDF indexing.

    This thesis presents a conceptual framework for the SPARQL query execution process, which accesses data within distributed RDF datasets through a stored index. It introduces the semantic algebra involved in converting a traditional SPARQL query through a series of phases, and elaborates the selection, projection, join, specialisation, and generalisation operators that assist in processing and converting a SPARQL query. The thesis then presents the algorithms behind the proposed framework, which convert the main SPARQL query into sub-queries, send each sub-query to the relevant distributed repository to fetch the data, and merge the sub-query results.

    The research demonstrates the testing of the proposed framework using unit and functional testing strategies. The author developed a Museum ontology to test and evaluate the system, exercising the complete processing pipeline. Different tests were performed, covering the algebraic operators (e.g., the select, join, outer join, generalisation, and specialisation operators) as well as the proposed algorithms. Comprehensive testing showed that all units of the developed system worked as expected, with no errors found in any phase of the framework. Finally, the thesis evaluates the implemented framework's performance and accuracy by comparing it with similar systems. The evaluation demonstrated that the proposed framework handles distributed SPARQL queries effectively. The author selected the existing FedX, ANAPSID, and ADERIS frameworks for comparison with the developed system and presented the results graphically to illustrate the performance and accuracy of all systems.
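    As an illustration of what the specialisation operator might do during query conversion, the sketch below rewrites a query over a general class into a UNION over its subclasses. This is a minimal, hypothetical reading of the operator: the toy class hierarchy, the ex: prefix, and the rewriting rule are assumptions for illustration, not the thesis's formal definition.

        # Hypothetical illustration of the specialisation operator: expand a
        # query over a general class into a UNION over its subclasses. The
        # class hierarchy and the rewriting rule are assumed for illustration.
        SUBCLASSES = {
            "ex:Artifact": ["ex:Painting", "ex:Sculpture"],
            "ex:Painting": [],
            "ex:Sculpture": [],
        }

        def specialise(cls):
            """Return cls and every class it transitively subsumes."""
            found = [cls]
            for sub in SUBCLASSES.get(cls, []):
                found.extend(specialise(sub))
            return found

        def rewrite(var, cls):
            """Rewrite `?var a cls` as a UNION over the specialised classes."""
            branches = [f"{{ {var} a {c} . }}" for c in specialise(cls)]
            return " UNION ".join(branches)

        print(rewrite("?item", "ex:Artifact"))
        # { ?item a ex:Artifact . } UNION { ?item a ex:Painting . }
        #   UNION { ?item a ex:Sculpture . }

    Generalisation would work in the opposite direction, replacing a class with its superclasses.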
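    The decompose-dispatch-merge pipeline described above can also be sketched in Python. This is not the thesis's implementation: the endpoint URLs, the ROUTING table (standing in for the stored RDF index), and the museum predicates are all hypothetical, and the merge step is a naive in-memory nested-loop join on shared variables.

        from SPARQLWrapper import SPARQLWrapper, JSON

        # Hypothetical routing table: which repository can answer which
        # predicate. In the thesis this role is played by the stored index.
        ROUTING = {
            "<http://example.org/ont/exhibits>": "http://museum-a.example.org/sparql",
            "<http://example.org/ont/locatedIn>": "http://museum-b.example.org/sparql",
        }

        def run_subquery(endpoint, triple_pattern):
            """Send one sub-query (a single triple pattern) to one repository."""
            sparql = SPARQLWrapper(endpoint)
            sparql.setQuery(f"SELECT * WHERE {{ {triple_pattern} }}")
            sparql.setReturnFormat(JSON)
            rows = sparql.query().convert()["results"]["bindings"]
            # Flatten the SPARQL JSON bindings into plain {variable: value} dicts.
            return [{v: cell["value"] for v, cell in row.items()} for row in rows]

        def join(left, right):
            """Naive nested-loop join of two result sets on shared variables."""
            if not left:
                return right
            shared = set(left[0]) & set(right[0]) if right else set()
            return [{**l, **r}
                    for l in left for r in right
                    if all(l[v] == r[v] for v in shared)]

        # Decompose the main query into one sub-query per triple pattern,
        # dispatch each to the repository selected by the routing table,
        # and merge the partial results.
        patterns = [
            "?museum <http://example.org/ont/exhibits> ?artifact .",
            "?museum <http://example.org/ont/locatedIn> ?city .",
        ]
        results = []
        for pattern in patterns:
            results = join(results, run_subquery(ROUTING[pattern.split()[1]], pattern))
        print(results)

    Routing by predicate alone is the simplest possible source-selection rule; an index-driven selector would consult per-repository statistics instead.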

    Automated Extraction of Behaviour Model of Applications

    Highly replicated cloud applications are deployed only when they are deemed to be functional. That is, they generally perform their task and their failure rate is relatively low. However, even though failure is rare, it does occur and is very difficult to diagnose. We devise a tool for failure diagnosis which learns the normal behaviour of an application in terms of the statistical properties of variables used throughout its execution, and then monitors it for deviation from these statistical properties. Our study reveals that many variables have unique statistical characteristics that amount to an invariant of the program. Therefore, any significant deviation from these characteristics reflects abnormal behaviour of the application, which may be caused by a program error. Such invariants are difficult to obtain from static analysis of the application's code alone. For example, the name of a person usually does not include a semicolon; however, an intruder may attempt a SQL injection (which will include a semicolon) through the 'name' field while entering his information, and succeed if there is no check for this case. This scenario can only be captured at runtime and may not be tested by the application developer. The character range of the 'name' variable is one of its statistical properties; by learning this range from the execution of the application, it is possible to detect the abnormal input described above. Hence, monitoring the statistics of values taken by the different variables of an application is an effective way to detect anomalies that can help diagnose the failure of the application.

    We build a tool that collects frequent snapshots of the application's heap and builds a statistical model solely from the extensional knowledge of the application. The extensional knowledge is obtainable only from the application's runtime data, without any description or explanation of the application's execution flow. The model characterizes the application's normal behaviour. Collecting snapshots in the form of memory dumps, and determining the application's behaviour model from them without code instrumentation, makes our tool applicable in cases where instrumentation is computationally expensive. Our approach allows a behaviour model to be built automatically and efficiently from the monitoring data alone.

    We evaluate the utility of our approach by applying it to an e-commerce application and an online bidding system, deriving different statistical properties of variables from the values they exhibit at runtime. Our experimental results demonstrate 96% accuracy in the generated statistical model with a maximum 1% performance overhead. This accuracy is measured on the basis of generating few false-positive alerts when the application is running without any anomaly. The high accuracy and low performance overhead indicate that our tool can determine the application's normal behaviour without affecting the application's performance, and can be used to monitor it in production. Moreover, our tool also correctly detected two anomalous conditions while monitoring the application with a small number of injected faults. In addition to anomaly detection, our tool logs all the variables of the application that violate the learned model. The log file can help to diagnose any failure caused by these variables and gives our tool source-code granularity in fault localization.
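    The character-range idea can be made concrete with a small sketch. The following Python fragment is a hypothetical simplification, not the tool itself: it learns only a character set and a length range per variable from an explicit list of training values, whereas the actual tool derives its observations from heap snapshots and models many more statistical properties.

        class VariableModel:
            """Learned invariants of one variable: character set, length range."""

            def __init__(self):
                self.charset = set()
                self.min_len = float("inf")
                self.max_len = 0

            def learn(self, value):
                self.charset |= set(value)
                self.min_len = min(self.min_len, len(value))
                self.max_len = max(self.max_len, len(value))

            def is_anomalous(self, value):
                # A deviation is any character never seen during training,
                # or a length outside the observed range.
                return bool(set(value) - self.charset) or not (
                    self.min_len <= len(value) <= self.max_len)

        # Training phase: values the 'name' variable takes during normal runs.
        model = VariableModel()
        for name in ["Alice", "Bob", "Maria-Luisa O'Neill"]:
            model.learn(name)

        # Monitoring phase: a SQL-injection attempt introduces characters
        # (such as ';') that were never observed in 'name' during training.
        print(model.is_anomalous("Bella"))                     # False
        print(model.is_anomalous("x'; DROP TABLE users; --"))  # True

    Flagging a value as anomalous would, in the tool's terms, produce a log entry naming the violating variable, which is what gives the approach its fault-localization granularity.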