547 research outputs found

Feeds as Query Result Serializations

Author: Wilde Erik
Publication date: 08/04/2009

Many Web-based data sources and services are available as feeds, a model that provides consumers with a loosely coupled way of interacting with providers. The current feed model is limited in its capabilities, however. Though it is simple to implement and scales well, it cannot be transferred to a wider range of application scenarios. This paper conceptualizes feeds as a way to serialize query results, describes the current hardcoded query semantics of such a perspective, and surveys the ways in which extensions of this hardcoded model have been proposed or implemented. Our generalized view of feeds as query result serializations has implications for the applicability of feeds as a generic Web service for any collection that is providing access to individual information items. As one interesting and compelling class of applications, we describe a simple way in which a query-based approach to feeds can be used to support location-based services.

arXiv.org e-Print Archive

Ezid

eScholarship - University of California

Efficient And Scalable Evaluation Of Continuous, Spatio-temporal Queries In Mobile Computing Environments

Author: Cazalas Jonathan M
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2012

A variety of research exists for the processing of continuous queries in large, mobile environments. Each method tries, in its own way, to address the computational bottleneck of constantly processing so many queries. For this research, we present a two-pronged approach to addressing this problem. Firstly, we introduce an efficient and scalable system for monitoring traditional, continuous queries by leveraging the parallel processing capability of the Graphics Processing Unit. We examine a naive CPU-based solution for continuous range-monitoring queries, and we then extend this system using the GPU. Additionally, with mobile communication devices becoming commodity, location-based services will become ubiquitous. To cope with the very high intensity of location-based queries, we propose a view-oriented approach to the location database, thereby reducing computation costs by exploiting computation sharing amongst queries requiring the same view. Our studies show that by exploiting the parallel processing power of the GPU, we are able to significantly scale the number of mobile objects, while maintaining an acceptable level of performance. Our second approach was to view this research problem as one belonging to the domain of data streams. Several works have convincingly argued that the two research fields of spatio-temporal data streams and the management of moving objects can naturally come together [IlMI10, ChFr03, MoXA04]. For example, the output of a GPS receiver, monitoring the position of a mobile object, is viewed as a data stream of location updates. This data stream of location updates, along with those from the plausibly many other mobile objects, is received at a centralized server, which processes the streams upon arrival, effectively updating the answers to the currently active queries in real time. For this second approach, we present GEDS, a scalable, Graphics Processing Unit (GPU)-based framework for the evaluation of continuous spatio-temporal queries over spatio-temporal data streams. Specifically, GEDS employs the computation sharing and parallel processing paradigms to deliver scalability in the evaluation of continuous, spatio-temporal range queries and continuous, spatio-temporal kNN queries. The GEDS framework utilizes the parallel processing capability of the GPU, a stream processor by trade, to handle the computation required in this application. Experimental evaluation shows promising performance and demonstrates the scalability and efficacy of GEDS in spatio-temporal data streaming environments. Additional performance studies demonstrate that, even in light of the costs associated with memory transfers, the parallel processing power provided by GEDS clearly counters and outweighs any associated costs. Finally, in an effort to move beyond the analysis of specific algorithms over the GEDS framework, we take a broader approach in our analysis of GPU computing. What algorithms are appropriate for the GPU? What types of applications can benefit from the parallel and stream processing power of the GPU? And can we identify a class of algorithms that are best suited for GPU computing? To answer these questions, we develop an abstract performance model, detailing the relationship between the CPU and the GPU. From this model, we are able to extrapolate a list of attributes common to successful GPU-based applications, thereby providing insight into which algorithms and applications are best suited for the GPU and also providing an estimated theoretical speedup for said GPU-based application.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Exploiting Graphic Card Processor Technology to Accelerate Data Mining Queries in SAP NetWeaver BIA

Author: Faerber Franz
Lehner Wolfgang
Mindnich Tobias
Weyerhaeuser Christoph
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/06/2022

Within business intelligence contexts, the importance of data mining algorithms is continuously increasing, particularly from the perspective of applications and users that demand novel algorithms on the one hand and an efficient implementation exploiting novel system architectures on the other hand. Within this paper, we focus on the latter issue and report our experience with the exploitation of graphic card processor technology within the SAP NetWeaver business intelligence accelerator (BIA). The BIA represents a highly distributed analytical engine that supports OLAP and data mining processing primitives. The system organizes data entities in column-wise fashion and its operation is completely main-memory-based. Since case studies have shown that classic data mining queries spend a large portion of their runtime on scanning and filtering the data as a necessary prerequisite to the actual mining step, our main goal was to speed up this expensive scanning and filtering process. In a first step, the paper outlines the basic data mining processing techniques within SAP NetWeaver BIA and illustrates the implementation of scans and filters. In a second step, we give insight into the main features of a hybrid system architecture design exploiting graphic card processor technology. Finally, we sketch the implementation and give details of our extensive evaluations.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Automata learning algorithms and processes for providing more complete systems requirements specification by scenario generation, CSP-based syntax-oriented model construction, and R2D2C system requirements transformation

Author: Hinchey Michael G.
Margaria Tiziana
Rash James L.
Rouff Christopher A.
Steffen Bernard
Publication date: 23/02/2010

Systems, methods and apparatus are provided through which, in some embodiments, automata learning algorithms and techniques are implemented to generate a more complete set of scenarios for requirements-based programming. More specifically, a CSP-based, syntax-oriented model construction, which requires the support of a theorem prover, is complemented by model extrapolation via automata learning. This may support the systematic completion of the requirements, the nature of the requirement being partial, which provides focus on the most prominent scenarios. This may generalize requirement skeletons by extrapolation and may indicate, by way of automatically generated traces, where the requirement specification is too loose and additional information is required.

NASA Technical Reports Server

Temporal latent topic user profiles for search personalisation

Author: Song Dawei
Tran Son N.
Vu Thanh
Willis Alistair
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

The performance of search personalisation largely depends on how effectively user profiles are built. Many approaches have been developed to build user profiles using topics discussed in relevant documents, where the topics are usually obtained from a human-generated online ontology such as the Open Directory Project. The limitation of these approaches is that many documents may not contain the topics covered in the ontology. Moreover, the human-generated topics require expensive manual effort to determine the correct categories for each document. This paper addresses these problems by using Latent Dirichlet Allocation for unsupervised extraction of the topics from documents. With the learned topics, we observe that the search intent and user interests are dynamic, i.e., they change from time to time. In order to evaluate the effectiveness of temporal aspects in personalisation, we apply three typical time scales for building a long-term profile, a daily profile and a session profile. In the experiments, we utilise the profiles to re-rank search results returned by a commercial web search engine. Our experimental results demonstrate that our temporal profiles can significantly improve the ranking quality. The results further show a promising effect of temporal features in correlation with click entropy and query position in a search session.

CiteSeerX

Crossref

Open Research Online (The Open University)

XM-Tree, a new index for web information retrieval

Author: Bender Cristina
Deco Claudia
Pierángeli Guillermo
Reyes Nora Susana
Publication date: 01/07/2008

Web Information Retrieval is another instance of the problem of searching for the elements of a set that are closest to a given query under a certain similarity criterion. It is of interest to take advantage of metric spaces in order to solve a search in an effective and efficient way. In this article, we present an extension of the M-Tree index, called XM-Tree, in order to improve search results. This index allows dynamic insertion of new data, reduces search costs using pruning and precalculated distances, and uses a tolerable amount of space, which makes this index apt for the extensive and dynamic Web. The proposed extension indexes Web documents, uses L2 as indexing distance and L∞ as similarity criterion to solve queries. We also present experiments validating the results.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Servicio de Difusión de la Creación Intelectual

Survey over Existing Query and Transformation Languages

Author: Bolzer Oliver
Bry François
Furche Tim
Horrocks Ian
Kraus Michael
Orsini Renzo
Schaffert Sebastian
Publication date: 01/01/2004

A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such diverging representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step to allow transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from all these areas. From the detailed survey of these query languages, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas.

CiteSeerX

Open Access LMU

Exploring early vocal music and its lute arrangements: Using F-TEMPO as a musicological tool

Author: Crawford Tim
Lewis David
Porter Alastair
Publication venue: Association for Computing Machinery (ACM)
Publication date: 10/11/2023

In its earliest state, F-TEMPO (Full-Text searching of Early Music Prints Online) enabled searching in the musical content of about 30,000 page-images of early printed music from the British Library's Early Music Online collection (GB-Lbl). The images were processed using the Optical Music Recognition (OMR) program, Aruspix, whose output is saved in the MEI (Music Encoding Initiative) format. To enable fast searches of the MEI, we adopted an indexing strategy that is both scalable and substantially robust to the inevitable errors in the process. In this paper we show how searches using these indexes may be used as a first step in two useful musicological tasks without exhaustively processing the full encodings. The F-TEMPO resource has subsequently been augmented to about 500,000 images including a large number from the Bavarian State Library in Munich (D-Mbs), and other libraries (D-Bsb, PL-Wn and F-Pn). Most recently, a new and more robust system architecture is in development, together with a new interface conforming better to modern web standards. The simple, yet robust, indexing method we use can be applied to scores encoded in any format from which strings of pitches each corresponding to a voice or instrument in the score can be derived. In addition to page-images, in its current form F-TEMPO now includes a collection of over 10,000 scores encoded in MusicXML, largely of early music, from the online Choral Public Domain Library (CPDL). To show the potential for F-TEMPO as a tool for musicologists to explore the full-text content of the collections, we look at two simple tasks: (a) finding pages which contain similar music to a given query page; and (b) given a query representing an approximation to the highest-sounding voice from a lute arrangement of a popular vocal item from the 16th century, finding a likely vocal model within the F-TEMPO index.

Goldsmiths Research Online