6 research outputs found

    LibraRing: An Architecture for Distributed Digital Libraries Based on DHTs

    Abstract. We present a digital library architecture based on distributed hash tables. We discuss the main components of this architecture and the protocols for offering information retrieval and information filtering functionality. We present an experimental evaluation of our proposals.

    Searching in peer-to-peer networks

    As peer-to-peer networks prove capable of handling huge volumes of data, the need for effective search tools is lasting and imperative. In recent years, a number of research studies have been published that attempt to address the problem of search in large, decentralized networks. In this article, we mainly focus on content and concept-based retrieval. After providing a useful discussion on terminology, we introduce a representative sample of such studies and categorize them according to basic functional and non-functional characteristics. Following our analysis and discussion, we conclude that future work should focus on information filtering, re-ranking and merging of results, relevance feedback and content replication, as well as on related user-centric aspects of the problem.

    Temporal pseudo-relevance feedback in microblog retrieval

    Twitter has become a major outlet for news, discussion and commentary on ongoing events and trends. Effective searching of Twitter collections poses a number of issues for traditional document-based information retrieval (IR) approaches, such as limited document term statistics and spam. In this paper we propose a novel approach to pseudo-relevance feedback, based upon the temporal profiles of n-grams extracted from the top N relevance feedback tweets. A weighted graph is used to model temporal correlation between n-grams, with a PageRank variant employed to combine both pseudo-relevant document term distribution and temporal collection evidence. Preliminary experiments with the TREC Microblogging 2011 Twitter corpus indicate that, through parameter optimisation, retrieval effectiveness can be improved.

    Towards addressing CPU-intensive seismological applications in Europe

    Advanced application environments for seismic analysis help geoscientists to execute complex simulations to predict the behaviour of a geophysical system and potential surface observations. At the same time, data collected from seismic stations must be processed by comparing recorded signals with predictions. The EU-funded project VERCE (http://verce.eu/) aims to enable specific seismological use-cases and, on the basis of requirements elicited from the seismology community, provide a service-oriented infrastructure to deal with such challenges. In this paper we present VERCE's architecture, in particular relating to forward and inverse modelling of Earth models and how the largely file-based HPC model can be combined with data streaming operations to enhance the scalability of experiments. We posit that the integration of services and HPC resources in an open, collaborative environment is an essential medium for the advancement of sciences of critical importance, such as seismology.

    The BigDataEurope platform - supporting the variety dimension of big data

    The management and analysis of large-scale datasets, described with the term Big Data, involves the three classic dimensions: volume, velocity and variety. While the former two are well supported by a plethora of software components, the variety dimension is still rather neglected. We present the BDE platform, an easy-to-deploy, easy-to-use and adaptable (cluster-based and standalone) platform for the execution of big data components and tools like Hadoop, Spark, Flink, Flume and Cassandra. The BDE platform was designed based upon the requirements gathered from seven of the societal challenges put forward by the European Commission in the Horizon 2020 programme and targeted by the BigDataEurope pilots. As a result, the BDE platform allows users to perform a variety of Big Data flow tasks like message passing, storage, analysis or publishing. To facilitate the processing of heterogeneous data, a particular innovation of the platform is the Semantic Layer, which allows RDF data to be processed directly and arbitrary data to be mapped and transformed into RDF. The advantages of the BDE platform are demonstrated through seven pilots, each focusing on a major societal challenge.