36 research outputs found

    Improving Information Retrieval Effectiveness in Peer-to-Peer Networks through Query Piggybacking

    This work describes an algorithm which aims at increasing the quantity of relevant documents retrieved from a Peer-To-Peer (P2P) network. The algorithm is based on a statistical model used for ranking documents, peers and ultra-peers, and on a “piggybacking” technique performed when the query is routed across the network. The algorithm “amplifies” the statistical information about the neighborhood stored in each ultra-peer. The preliminary experiments provided encouraging results, as the quantity of relevant documents retrieved through the network almost doubles once query piggybacking is exploited.

    A Curated Database for Linguistic Research: The Test Case of Cimbrian Varieties

    In this paper we present the definition of a conceptual approach for the information space entailed by a multidisciplinary and collaborative project, “Cimbrian as a test case for synchronic and diachronic language variation”, which provides linguists with a test bed for formal hypotheses concerning human language. The aims of the project are to collect, digitize and tag linguistic data from the German variety of Cimbrian, spoken in three areas of northern Italy (Giazza (VR), Luserna (TN), and Roana (VI)), and to make available online a valuable and innovative linguistic resource for the in-depth study of Cimbrian. The task is addressed by a multidisciplinary team of linguists and computer scientists who, combining their competence, aim to make available new tools for linguistic analysis.

    Character-angle based video annotation

    A video annotation system includes clip organization, feature description and pattern determination. This paper presents a system for basketball zone-defence detection. In particular, a character-angle based descriptor for feature description is proposed. The experimental results in basketball zone-defence detection demonstrate that the descriptor is robust in both simulations and real-life cases, with less sensitivity to the distribution caused by local translation of subprime defenders. Such a framework can be easily applied to other team sports.

    The OpenAIRE Research Community Dashboard: On blending scientific workflows and scientific publishing

    First Online: 30 August 2019. Despite the hype, the effective implementation of Open Science is hindered by several cultural and technical barriers. Researchers have embraced digital science and use “digital laboratories” (e.g. research infrastructures, thematic services) to conduct their research and publish research data, but practices and tools are still far from achieving the expectations of transparency and reproducibility of Open Science. The places where science is performed and the places where science is published are still regarded as different realms. Publishing is still a post-experimental, tedious, manual process, too often limited to articles, in some contexts semantically linked to datasets, rarely to software, generally disregarding digital representations of experiments. In this work we present the OpenAIRE Research Community Dashboard (RCD), designed to overcome some of these barriers for a given research community, minimizing the technical efforts and without renouncing any of the community services or practices. The RCD flanks digital laboratories of research communities with scholarly communication tools for discovering and publishing interlinked scientific products such as literature, datasets, and software. The benefits of the RCD are showcased by means of two real-case scenarios: the European Marine Science community and the European Plate Observing System (EPOS) research infrastructure. This work is partly funded by the OpenAIRE-Advance H2020 project (grant number: 777541; call: H2020-EINFRA-2017) and the OpenAIREConnect H2020 project (grant number: 731011; call: H2020-EINFRA-2016-1). Moreover, we would like to thank our colleagues Michele Manunta, Francesco Casu, and Claudio De Luca (Institute for the Electromagnetic Sensing of the Environment, CNR, Italy) for their work on the EPOS infrastructure RCD, and Stephane Pesant (University of Bremen, Germany) for his work on the European Marine Science RCD.

    Archeologia e Calcolatori. Accessibilità e diffusione della cultura scientifica

    Based on the case study of the journal ‘Archeologia e Calcolatori’, the authors investigate specific issues related to the promotion of Open Science in archaeology. The first part analyses the initiatives undertaken to foster the dissemination of the journal’s digital resources on the web, such as the use of descriptive metadata (Dublin Core), the attribution of unique identifiers (DOI), the uploading of the full texts to institutional repositories for long-term preservation (CNR-SOLAR), and the collaboration with initiatives aiming at the aggregation of cultural and scientific digital contents (MiBACT-CulturaItalia). The second part illustrates the many initiatives and projects promoted by the editorial committee to spread the principles of the ‘open access’ philosophy, nationally and internationally. The journal has thus become a record and memory of the progress in the theoretical, as well as applied, aspects of the Open Access movement. This study shows the relevance of continuous experimentation with practices for publishing scientific initiatives, adhering to and promoting Open Access and facilitating accessibility to the journal’s own resources.

    The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

    The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry. Comment: SIGIR 2023 resource paper, 13 pages

    From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences

    We describe the state of the art in performance modeling and prediction for Information Retrieval (IR), Natural Language Processing (NLP) and Recommender Systems (RecSys), along with its shortcomings and strengths. We present a framework for further research, identifying five major problem areas: understanding measures, performance analysis, making underlying assumptions explicit, identifying application features determining performance, and the development of prediction models describing the relationship between assumptions, features and resulting performance.