547 research outputs found

    Feeds as Query Result Serializations

    Many Web-based data sources and services are available as feeds, a model that provides consumers with a loosely coupled way of interacting with providers. The current feed model is limited in its capabilities, however. Though it is simple to implement and scales well, it cannot be transferred to a wider range of application scenarios. This paper conceptualizes feeds as a way to serialize query results, describes the current hardcoded query semantics of such a perspective, and surveys the ways in which extensions of this hardcoded model have been proposed or implemented. Our generalized view of feeds as query result serializations has implications for the applicability of feeds as a generic Web service for any collection that is providing access to individual information items. As one interesting and compelling class of applications, we describe a simple way in which a query-based approach to feeds can be used to support location-based services

    Efficient And Scalable Evaluation Of Continuous, Spatio-temporal Queries In Mobile Computing Environments

    A variety of research exists for the processing of continuous queries in large, mobile environments. Each method tries, in its own way, to address the computational bottleneck of constantly processing so many queries. For this research, we present a two-pronged approach to addressing this problem. Firstly, we introduce an efficient and scalable system for monitoring traditional, continuous queries by leveraging the parallel processing capability of the Graphics Processing Unit (GPU). We examine a naive CPU-based solution for continuous range-monitoring queries, and we then extend this system using the GPU. Additionally, with mobile communication devices becoming commodities, location-based services will become ubiquitous. To cope with the very high intensity of location-based queries, we propose a view-oriented approach to the location database, thereby reducing computation costs by exploiting computation sharing amongst queries requiring the same view. Our studies show that by exploiting the parallel processing power of the GPU, we are able to significantly scale the number of mobile objects, while maintaining an acceptable level of performance. Our second approach was to view this research problem as one belonging to the domain of data streams. Several works have convincingly argued that the two research fields of spatiotemporal data streams and the management of moving objects can naturally come together [IlMI10, ChFr03, MoXA04]. For example, the output of a GPS receiver, monitoring the position of a mobile object, is viewed as a data stream of location updates. This data stream of location updates, along with those from the plausibly many other mobile objects, is received at a centralized server, which processes the streams upon arrival, effectively updating the answers to the currently active queries in real time.
    For this second approach, we present GEDS, a scalable, Graphics Processing Unit (GPU)-based framework for the evaluation of continuous spatio-temporal queries over spatio-temporal data streams. Specifically, GEDS employs the computation sharing and parallel processing paradigms to deliver scalability in the evaluation of continuous, spatio-temporal range queries and continuous, spatio-temporal kNN queries. The GEDS framework utilizes the parallel processing capability of the GPU, a stream processor by trade, to handle the computation required in this application. Experimental evaluation shows promising performance and demonstrates the scalability and efficacy of GEDS in spatio-temporal data streaming environments. Additional performance studies demonstrate that, even in light of the costs associated with memory transfers, the parallel processing power provided by GEDS clearly outweighs those costs. Finally, in an effort to move beyond the analysis of specific algorithms over the GEDS framework, we take a broader approach in our analysis of GPU computing. What algorithms are appropriate for the GPU? What types of applications can benefit from the parallel and stream processing power of the GPU? And can we identify a class of algorithms that are best suited for GPU computing? To answer these questions, we develop an abstract performance model, detailing the relationship between the CPU and the GPU. From this model, we are able to extrapolate a list of attributes common to successful GPU-based applications, thereby providing insight into which algorithms and applications are best suited for the GPU and also providing an estimated theoretical speedup for said GPU-based application.
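
The abstract above mentions a naive CPU-based baseline for continuous range-monitoring queries. A minimal sketch of what such a baseline might look like follows. It is not the authors' implementation: the names `RangeQuery` and `NaiveRangeMonitor`, the rectangular query shape, and the update interface are all assumptions made for illustration. Every location update is tested against every registered query, which is the per-update cost that computation sharing and GPU parallelism are meant to reduce.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RangeQuery:
    """Hypothetical continuous range query: an axis-aligned rectangle."""
    query_id: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


@dataclass
class NaiveRangeMonitor:
    """Keeps the answer set of every query current as updates arrive."""
    queries: list
    answers: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every registered query starts with an empty answer set.
        for query in self.queries:
            self.answers.setdefault(query.query_id, set())

    def on_update(self, object_id: str, x: float, y: float) -> None:
        # Naive step: check the new position against all queries,
        # so each update costs O(number of queries).
        for query in self.queries:
            if query.contains(x, y):
                self.answers[query.query_id].add(object_id)
            else:
                self.answers[query.query_id].discard(object_id)


monitor = NaiveRangeMonitor([RangeQuery("downtown", 0, 0, 10, 10)])
monitor.on_update("car-1", 5, 5)    # car-1 enters the range
monitor.on_update("car-2", 20, 20)  # car-2 is outside the range
monitor.on_update("car-1", 11, 5)   # car-1 leaves the range
print(monitor.answers)
```

Because the checks for different queries are independent of one another, this inner loop is the kind of work that could plausibly be spread across GPU threads, although the sketch itself runs on the CPU only.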

    Exploiting Graphic Card Processor Technology to Accelerate Data Mining Queries in SAP NetWeaver BIA

    Within business intelligence contexts, the importance of data mining algorithms is continuously increasing, particularly from the perspective of applications and users that demand novel algorithms on the one hand and an efficient implementation exploiting novel system architectures on the other hand. Within this paper, we focus on the latter issue and report our experience with the exploitation of graphic card processor technology within the SAP NetWeaver business intelligence accelerator (BIA). The BIA represents a highly distributed analytical engine that supports OLAP and data mining processing primitives. The system organizes data entities in column-wise fashion and its operation is completely main-memory-based. Since case studies have shown that classic data mining queries spend a large portion of their runtime on scanning and filtering the data as a necessary prerequisite to the actual mining step, our main goal was to speed up this expensive scanning and filtering process. In a first step, the paper outlines the basic data mining processing techniques within SAP NetWeaver BIA and illustrates the implementation of scans and filters. In a second step, we give insight into the main features of a hybrid system architecture design exploiting graphic card processor technology. Finally, we sketch the implementation and give details of our extensive evaluations.

    Automata learning algorithms and processes for providing more complete systems requirements specification by scenario generation, CSP-based syntax-oriented model construction, and R2D2C system requirements transformation

    Systems, methods and apparatus are provided through which, in some embodiments, automata learning algorithms and techniques are implemented to generate a more complete set of scenarios for requirements-based programming. More specifically, a CSP-based, syntax-oriented model construction, which requires the support of a theorem prover, is complemented by model extrapolation, via automata learning. This may support the systematic completion of the requirements, the nature of the requirement being partial, which provides focus on the most prominent scenarios. This may generalize requirement skeletons by extrapolation and may indicate by way of automatically generated traces where the requirement specification is too loose and additional information is required.

    Temporal latent topic user profiles for search personalisation

    The performance of search personalisation largely depends on how effectively user profiles are built. Many approaches have been developed to build user profiles using topics discussed in relevant documents, where the topics are usually obtained from a human-generated online ontology such as the Open Directory Project. The limitation of these approaches is that many documents may not contain the topics covered in the ontology. Moreover, the human-generated topics require expensive manual effort to determine the correct categories for each document. This paper addresses these problems by using Latent Dirichlet Allocation for unsupervised extraction of the topics from documents. With the learned topics, we observe that the search intent and user interests are dynamic, i.e., they change from time to time. In order to evaluate the effectiveness of temporal aspects in personalisation, we apply three typical time scales for building a long-term profile, a daily profile and a session profile. In the experiments, we utilise the profiles to re-rank search results returned by a commercial web search engine. Our experimental results demonstrate that our temporal profiles can significantly improve the ranking quality. The results further show a promising effect of temporal features in correlation with click entropy and query position in a search session.

    XM-Tree, a new index for web information retrieval

    Web Information Retrieval is another instance of the problem of searching for the elements of a set that are closest to a given query under a certain similarity criterion. It is of interest to take advantage of metric spaces in order to solve a search in an effective and efficient way. In this article, we present an extension of the M-Tree index, called XM-Tree, in order to improve search results. This index allows dynamic insertion of new data, reduces search costs using pruning and precalculated distances, and uses a tolerable amount of space, which makes this index apt for the extensive and dynamic Web. The proposed extension indexes Web documents, uses L2 as indexing distance and L∞ as similarity criterion to solve queries. We also present experiments validating the results.
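
The abstract above credits pruning and precalculated distances for the reduced search cost. The sketch below illustrates the general idea behind that kind of pruning in metric indexes, using the triangle inequality with a single pivot. It is a flat filter, not the M-Tree or XM-Tree structure, and it uses L2 for both indexing and querying; the abstract's L∞ similarity criterion is not modelled. The function names and the single-pivot layout are assumptions made for illustration.

```python
import math


def l2(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pivot_table(points, pivot):
    """Precalculate the distance from every indexed point to the pivot."""
    return [l2(point, pivot) for point in points]


def range_search(points, pivot, pivot_dists, query, radius):
    """Return (matches, number of point distances actually computed)."""
    d_query_pivot = l2(query, pivot)
    matches, computed = [], 0
    for point, d_point_pivot in zip(points, pivot_dists):
        # Triangle inequality: d(q, o) >= |d(q, p) - d(o, p)|.
        # If that lower bound already exceeds the radius, the point
        # cannot match, so its distance is never computed.
        if abs(d_query_pivot - d_point_pivot) > radius:
            continue
        computed += 1
        if l2(query, point) <= radius:
            matches.append(point)
    return matches, computed


points = [(0, 0), (1, 0), (10, 0), (20, 0)]
pivot = (0, 0)
table = build_pivot_table(points, pivot)
print(range_search(points, pivot, table, query=(1, 1), radius=1.5))
```

The precalculated table is paid for once at indexing time. At query time, only one distance to the pivot is needed before candidates can be discarded by simple subtraction.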

    Survey over Existing Query and Transformation Languages

    A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability of many current Semantic Web approaches to cope with data available in such diverging representation formalisms as XML, RDF, or Topic Maps. A common query language is the first step to allow transparent access to data in any of these formats. To further the understanding of the requirements and approaches proposed for query languages in the conventional as well as the Semantic Web, this report surveys a large number of query languages for accessing XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from all these areas. From the detailed survey of these query languages, a common classification scheme is derived that is useful for understanding and differentiating languages within and among all three areas

    Exploring early vocal music and its lute arrangements: Using F-TEMPO as a musicological tool

    In its earliest state, F-TEMPO (Full-Text searching of Early Music Prints Online) enabled searching in the musical content of about 30,000 page-images of early printed music from the British Library's Early Music Online collection (GB-Lbl). The images were processed using the Optical Music Recognition (OMR) program, Aruspix, whose output is saved in the MEI (Music Encoding Initiative) format. To enable fast searches of the MEI, we adopted an indexing strategy that is both scalable and substantially robust to the inevitable errors in the process. In this paper we show how searches using these indexes may be used as a first step in two useful musicological tasks without exhaustively processing the full encodings. The F-TEMPO resource has subsequently been augmented to about 500,000 images including a large number from the Bavarian State Library in Munich (D-Mbs), and other libraries (D-Bsb, PL-Wn and F-Pn). Most recently, a new and more robust system architecture is in development, together with a new interface conforming better to modern web standards. The simple, yet robust, indexing method we use can be applied to scores encoded in any format from which strings of pitches each corresponding to a voice or instrument in the score can be derived. In addition to page-images, in its current form F-TEMPO now includes a collection of over 10,000 scores encoded in MusicXML, largely of early music, from the online Choral Public Domain Library (CPDL). To show the potential for F-TEMPO as a tool for musicologists to explore the full-text content of the collections, we look at two simple tasks: (a) finding pages which contain similar music to a given query page; and (b), given a query representing an approximation to the highest-sounding voice from a lute arrangement of a popular vocal item from the 16th century, finding a likely vocal model within the F-TEMPO index
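
The abstract above says the indexing method works on any encoding from which strings of pitches can be derived, and that it tolerates recognition errors. It does not spell out the method here, so the following is only one plausible sketch of an error-tolerant pitch-string index: n-grams over successive pitch intervals, compared by Jaccard similarity. The use of n-grams, the Jaccard score, the MIDI-style pitch numbers, and all function names are assumptions and should not be read as the F-TEMPO implementation.

```python
def interval_ngrams(pitches, n=3):
    """Set of n-grams over successive pitch intervals.

    Working on intervals rather than absolute pitches makes the
    representation independent of transposition.
    """
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return {tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_index(pages, n=3):
    """Map each page identifier to its set of interval n-grams."""
    return {page_id: interval_ngrams(p, n) for page_id, p in pages.items()}


def rank(index, query_pitches, n=3):
    """Return (score, page_id) pairs, best match first."""
    query_grams = interval_ngrams(query_pitches, n)
    scored = [(jaccard(query_grams, grams), page_id)
              for page_id, grams in index.items()]
    return sorted(scored, reverse=True)


# Hypothetical pages, with pitches given as MIDI-style note numbers.
pages = {
    "page-a": [60, 62, 64, 65, 67, 65, 64, 62],
    "page-b": [60, 60, 67, 67, 69, 69, 67],
}
index = build_index(pages)

# page-a transposed up a tone, with the last note misread.
query = [62, 64, 66, 67, 69, 67, 66, 65]
print(rank(index, query))
```

A single misread note changes only the few n-grams that overlap it, so the remaining n-grams still match. That is the sense in which a scheme of this kind degrades gracefully under recognition errors.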