Search CORE

2 research outputs found

Recommended from our members

A Comparative Study of Data Transformations for Efficient XML and JSON Data Compression. An In-Depth Analysis of Data Transformation Techniques, including Tag and Capital Conversions, Character and Word N-Gram Transformations, and Domain-Specific Data Transforms using SMILES Data as a Case Study

Author: Scanlon Shagufta A.
Publication venue: School of Electrical Engineering and Computer Science
Publication date: 01/01/2015
Field of study

XML is a widely used data exchange format. The verbose nature of XML leads to the requirement to efficiently store and process this type of data using compression. Various general-purpose transforms and compression techniques exist that can be used to transform and compress XML data. More compact alternatives to XML data have been developed, namely JSON due to the verbosity of XML data. Similarly, there is a requirement to efficiently store and process SMILES data used in Chemoinformatics. General-purpose transforms and compressors can be used to compress this type of data to a certain extent, however, these techniques are not specific to SMILES data. The primary contribution of this research is to provide developers that use XML, JSON or SMILES data, with key knowledge of the best transformation techniques to use with certain types of data, and which compression techniques would provide the best compressed output size and processing times, depending on their requirements. The main study in this thesis, investigates the extent of which using data transforms prior to data compression can further improve the compression of XML and JSON data. It provides a comparative analysis of applying a variety of data transform and data transform variations, to a number of different types of XML and JSON equivalent datasets of various sizes, and applying different general-purpose compression techniques over the transformed data. A case study is also conducted, to investigate data transforms prior to compression to improve the compression of data within a data-specific domain.The files of software accompanying this thesis are unable to be presented online with the thesis

Bradford Scholars

Enhancing Recommendations in Specialist Search Through Semantic-based Techniques and Multiple Resources

Author: Almuhaimeed Abdullah
Publication venue
Publication date: 15/09/2016
Field of study

Information resources abound on the Internet, but mining these resources is a non-trivial task. Such abundance has raised the need to enhance services provided to users, such as recommendations. The purpose of this work is to explore how better recommendations can be provided to specialists in specific domains such as bioinformatics by introducing semantic techniques that reason through different resources and using specialist search techniques. Such techniques exploit semantic relations and hidden associations that occur as a result of the information overlapping among various concepts in multiple bioinformatics resources such as ontologies, websites and corpora. Thus, this work introduces a new method that reasons over different bioinformatics resources and then discovers and exploits different relations and information that may not exist in the original resources. Such relations may be discovered as a consequence of the information overlapping, such as the sibling and semantic similarity relations, to enhance the accuracy of the recommendations provided on bioinformatics content (e.g. articles). In addition, this research introduces a set of semantic rules that are able to extract different semantic information and relations inferred among various bioinformatics resources. This project introduces these semantic-based methods as part of a recommendation service within a content-based system. Moreover, it uses specialists' interests to enhance the provided recommendations by employing a method that is collecting user data implicitly. Then, it represents the data as adaptive ontological user profiles for each user based on his/her preferences, which contributes to more accurate recommendations provided to each specialist in the field of bioinformatics

University of Essex Research Repository