Search CORE

149 research outputs found

Designing a resource-efficient data structure for mobile data systems

Author: Gourlay Richard Scott
Publication venue
Publication date: 01/07/2006
Field of study

Designing data structures for use in mobile devices requires attention on optimising data volumes with associated benefits for data transmission, storage space and battery use. For semi-structured data, tree summarisation techniques can be used to reduce the volume of structured elements while dictionary compression can efficiently deal with value-based predicates. This project seeks to investigate and evaluate an integration of the two approaches. The key strength of this technique is that both structural and value predicates could be resolved within one graph while further allowing for compression of the resulting data structure. As the current trend is towards the requirement for working with larger semi-structured data sets this work would allow for the utilisation of much larger data sets whilst reducing requirements on bandwidth and minimising the memory necessary both for the storage and querying of the data

University of Strathclyde Institutional Repository

A web services based framework for efficient monitoring and event reporting.

Author: Chourmouziadis Aimilios P.
Publication venue
Publication date: 15/03/2018
Field of study

Network and Service Management (NSM) is a research discipline with significant research contributions the last 25 years. Despite the numerous standardised solutions that have been proposed for NSM, the quest for an "all encompassing technology" still continues. A new technology introduced lately to address NSM problems is Web Services (WS). Despite the research effort put into WS and their potential for addressing NSM objectives, there are efficiency, interoperability, etc issues that need to be solved before using WS for NSM. This thesis looks at two techniques to increase the efficiency of WS management applications so that the latter can be used for efficient monitoring and event reporting. The first is a query tool we built that can be used for efficient retrieval of management state data close to the devices where they are hosted. The second technique is policies used to delegate a number of tasks from a manager to an agent to make WS-based event reporting systems more efficient. We tested the performance of these mechanisms by incorporating them in a custom monitoring and event reporting framework and supporting systems we have built, against other similar mechanisms (XPath) that have been proposed for the same tasks, as well as previous technologies such as SNMP. Through these tests we have shown that these mechanisms are capable of allowing us to use WS efficiently in various monitoring and event reporting scenarios. Having shown the potential of our techniques we also present the design and implementation challenges for building a GUI tool to support and enhance the above systems with extra capabilities. In summary, we expect that other problems WS face will be solved in the near future, making WS a capable platform for it to be used for NSM

University of Surrey

Using multi-locators to increase the robustness of web test cases

Author: LEOTTA MAURIZIO
RICCA FILIPPO
STOCCO ANDREA
Tonella Paolo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

The main reason for the fragility of web test cases is the inability of web element locators to work correctly when the web page DOM evolves. Web elements locators are used in web test cases to identify all the GUI objects to operate upon and eventually to retrieve web page content that is compared against some oracle in order to decide whether the test case has passed or not. Hence, web element locators play an extremely important role in web testing and when a web element locator gets broken developers have to spend substantial time and effort to repair it. While algorithms exist to produce robust web element locators to be used in web test scripts, no algorithm is perfect and different algorithms are exposed to different fragilities when the software evolves. Based on such observation, we propose a new type of locator, named multi-locator, which selects the best locator among a candidate set of locators produced by different algorithms. Such selection is based on a voting procedure that assigns different voting weights to different locator generation algorithms. Experimental results obtained on six web applications, for which a subsequent release was available, show that the multi-locator is more robust than the single locators (about -30% of broken locators w.r.t. the most robust kind of single locator) and that the execution overhead required by the multiple queries done with different locators is negligible (2-3% at most)

Archivio della ricerca - Fondazione Bruno Kessler

Archivio istituzionale della ricerca - Università di Genova

Frontiers of tractability for typechecking simple XML transformations

Author: Martens Wim
Neven Frank
Publication venue: Elsevier Inc.
Publication date: 31/05/2007
Field of study

AbstractTypechecking consists of statically verifying whether the output of an XML transformation is always conform to an output type for documents satisfying a given input type. We focus on complete algorithms which always produce the correct answer. We consider top–down XML transformations incorporating XPath expressions and abstract document types by grammars and tree automata. By restricting schema languages and transformations, we identify several practical settings for which typechecking can be done in polynomial time. Moreover, the resulting framework provides a rather complete picture as we show that most scenarios cannot be enlarged without rendering the typechecking problem intractable. So, the present research sheds light on when to use fast complete algorithms and when to reside to sound but incomplete ones

Elsevier - Publisher Connector

Information extraction from multimedia web documents: an open-source platform and testbed

Author: Blanco Roi
Boato Giulia
Costanzo Andrea
Demidova Elena
Dupplaw David
Fontani Marco
Griffiths Thomas
Hare Jonathon
Johansson Richard
Lewis Paul H.
Matthews Michael
Minack Enrico
Moschitti Alessandro
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/06/2014
Field of study

The LivingKnowledge project aimed to enhance the current state of the art in search, retrieval and knowledge management on the web by advancing the use of sentiment and opinion analysis within multimedia applications. To achieve this aim, a diverse set of novel and complementary analysis techniques have been integrated into a single, but extensible software platform on which such applications can be built. The platform combines state-of-the-art techniques for extracting facts, opinions and sentiment from multimedia documents, and unlike earlier platforms, it exploits both visual and textual techniques to support multimedia information retrieval. Foreseeing the usefulness of this software in the wider community, the platform has been made generally available as an open-source project. This paper describes the platform design, gives an overview of the analysis algorithms integrated into the system and describes two applications that utilise the system for multimedia information retrieval

Southampton (e-Prints Soton)

A node partitioning strategy for optimising the performance of XML queries

Author: Marks Gerard
Publication venue: Dublin City University. School of Computing
Publication date: 01/11/2011
Field of study

For ease of communication between heterogeneous systems, the eXtensible Markup Language (XML) has been widely adopted as a data storage format. However, XML query processing presents issues both in terms of query performance and updatability. Thus, many are choosing to shred XML data into relational databases in order to benet from its mature technology. The problem with this approach is that (often complex and time consuming) data transformation processes are required to transform XML data to relational tables and vice versa. Additionally, many of the benets of XML data can be lost during these processes. In this dissertation, we present a process that partitions nodes within an XML document into disjoint subsets. Briefly, as there are fewer partitions than there are nodes, a more efficient join operation can be performed between partitions, thus reducing the number of inefficient node comparisons. The number and size of partitions varies depending on the structure and layout in the XML document, and the number of partitions impacts query performance. Therefore, we also provide a partition classication process, which signicantly reduces the number of partitions because each partition class represents many equivalent partitions within the XML document. In this dissertation, we will demonstrate that our approach outperforms similar approaches for a large subset of XML queries by eliminating complex join operations (where possible) during the query process

DCU Online Research Access Service

BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pincent E.
Stepanyan K.
Publication venue
Publication date: 25/10/2013
Field of study

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Processing techniques for partial tree-pattern queries on XML data

Author: Placek Pawel
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2009
Field of study

In recent years, eXtensible Markup Language (XML) has become a de facto standard for exporting and exchanging data on the Web. XML structures data as trees. Querying capabilities are provided through patterns matched against the XML trees. Research on the processing of XML queries has focused mainly on tree-pattern queries. Tree-pattern queries are not appropriate for querying XML data sources whose structure is not fully known to the user, or for querying multiple data sources which structure information differently. Recently, a class of queries, called Partial Tree-Pattern Queries (PTPQs) was identified. A central feature of PTPQs is that the structure can be specified fully, partially, or not at all in a query. For this reason. PTPQs can be used for flexibly querying XML data sources. This thesis deals with processing techniques for PTPQs. In particular, it addresses the satisfiability, containment and minimization problems for PTPQs. In order to cope with structural expression derivation issues and to compare PTPQs, a set of inference rules is suggested and a canonical form for PTPQs that comprises all derived structural expressions is defined. This canonical form is used for determining necessary and sufficient conditions for PTPQ satisfiability. The containment problem is studied both in the absence and in the presence of structural summaries of data called dimension graphs. It is shown that this problem cannot be characterized by homomorphisms between PTPQs, even when PTPQs are put in canonical form. In both cases of the problem, necessary and sufficient conditions for PTPQ containment are provided in terms of homomorphisms between PTPQs and (a possibly exponential number of) tree-pattern queries. This result is used to identify a subclass of PTPQs that strictly contains tree-pattern queries for which the containment problem can be fully characterized through the existence of homomorphisms. To cope with the high complexity of PTPQ containment, heuristic approaches for this problem are designed that trade accuracy for speed. The heuristic approaches equivalently add structural expressions to PTPQs in order to increase the possibility for a homomorphism between two contained PTPQs to exist. An implementation and extensive experimental evaluation of these heuristics shows that they are useful in practice, and that they can be efficiently implemented in a query optimizer. The goal of PTPQ minimization is to produce an equivalent PTPQ which is syntactically smaller in size. This problem is studied in the absence of structural summaries. It is shown that PTPQs cannot be minimized by removing redundant parts as is the case with certain classes of tree-pattern queries. It is also shown that, in general, a PTPQ does not have a unique minimal equivalent PTPQ. Finally, sound, but not complete, heuristic approaches for PTPQ minimization are presented. These approaches gradually trade execution time for accuracy

Digital Commons @ New Jersey Institute of Technology (NJIT)

Taming XPath Queries by Minimizing Wildcard Steps

Author: C CHAN
W FAN
Y ZENG
Publication venue: 'Elsevier BV'
Publication date: 01/01/2007
Field of study

Crossref