    Fast Locality-Sensitive Hashing Frameworks for Approximate Near Neighbor Search

    The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a general technique for constructing a data structure to answer approximate near neighbor queries by using a distribution $\mathcal{H}$ over locality-sensitive hash functions that partition space. For a collection of $n$ points, after preprocessing, the query time is dominated by $O(n^{\rho} \log n)$ evaluations of hash functions from $\mathcal{H}$ and $O(n^{\rho})$ hash table lookups and distance computations, where $\rho \in (0,1)$ is determined by the locality-sensitivity properties of $\mathcal{H}$. It follows from a recent result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive hash functions can be reduced to $O(\log^2 n)$, leaving the query time to be dominated by $O(n^{\rho})$ distance computations and $O(n^{\rho} \log n)$ additional word-RAM operations. We state this result as a general framework and provide a simpler analysis showing that the number of lookups and distance computations closely match the Indyk-Motwani framework, making it a viable replacement in practice. Using ideas from another locality-sensitive hashing framework by Andoni and Indyk (SODA 2006) we are able to reduce the number of additional word-RAM operations to $O(n^{\rho})$. Comment: 15 pages, 3 figures
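
    For orientation, the classic Indyk-Motwani scheme that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the faster framework the paper proposes. It assumes binary vectors under Hamming distance and the bit-sampling hash family; the names `build_index`, `query`, `k`, `L`, and `cr` are hypothetical choices made for this example. In the standard analysis one would pick `L` on the order of $n^{\rho}$ tables and `k` on the order of $\log n$ sampled bits per table, which is where the $O(n^{\rho} \log n)$ hash evaluations per query come from.

```python
import random
from collections import defaultdict


def build_index(points, k, L, seed=0):
    """Build L hash tables; each bucket key concatenates k sampled bits."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Bit sampling: every table draws its own k coordinates at random.
    projections = [[rng.randrange(dim) for _ in range(k)] for _ in range(L)]
    tables = [defaultdict(list) for _ in range(L)]
    for idx, p in enumerate(points):
        for proj, table in zip(projections, tables):
            table[tuple(p[i] for i in proj)].append(idx)
    return projections, tables


def hamming(a, b):
    """Number of coordinates in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def query(q, points, projections, tables, cr):
    """Return the index of a colliding point within distance cr, or None."""
    seen = set()
    for proj, table in zip(projections, tables):
        # One lookup per table: k hash evaluations, then scan the bucket.
        for idx in table.get(tuple(q[i] for i in proj), []):
            if idx in seen:
                continue
            seen.add(idx)
            # One distance computation per distinct candidate.
            if hamming(q, points[idx]) <= cr:
                return idx
    return None
```

    Each query here evaluates `k * L` hash functions, one batch of `k` per table. That count is the cost the abstract reports reducing to $O(\log^2 n)$ hash functions.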

    The DiSCmap project : overview and first results

    Traditionally, digitisation of cultural and scientific heritage material for use by the scholarly community has been led by supply rather than demand. The DiSCmap project, commissioned by JISC in 2008, aimed to study what refocussing of digitisation efforts would best suit the users of digitised materials, especially in the context of research and teaching in the higher education institutions in the UK. The paper presents some of its initial outcomes, based on quantitative and qualitative analysis of 945 special collections nominated for digitisation by intermediary users (librarians, archivists and museum curators), as well as an end-user study involving a combination of an online survey, focus groups and in-depth interviews. The criteria for prioritising digitisation advanced by intermediaries and end users were analysed and cross-mapped to a range of existing digitisation frameworks. A user-driven prioritisation framework which synthesises the findings of the project is presented.

    Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines

    A cross-disciplinary examination of the user behaviours involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data. Two analytical frameworks rooted in information retrieval and science technology studies are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.

    Conservation and use of genetic resources of underutilized crops in the Americas - A continental analysis

    Latin America is home to dramatically diverse agroecological regions which harbor a high concentration of underutilized plant species, whose genetic resources hold the potential to address challenges such as sustainable agricultural development, food security and sovereignty, and climate change. This paper examines the status of an expert-informed list of underutilized crops in Latin America and analyses how the most common features of underuse apply to these. The analysis pays special attention to whether and how existing international policy and legal frameworks on biodiversity and plant genetic resources effectively support the conservation and sustainable use of underutilized crops. Results show that not all minor crops are affected by the same degree of neglect, and that the aspects under which any crop is underutilized vary greatly, calling for specific analyses and interventions. We also show that current international policy and legal instruments have so far provided limited stimulus and funding for the conservation and sustainable use of the genetic resources of these crops. Finally, the paper proposes an analytical framework for identifying and evaluating a crop’s underutilization, in order to define the most appropriate type and levels of intervention (international, national, local) for improving its status.

    SLIS Student Research Journal, Vol.7, Iss.1

    Art Museum as a Purveyor of Culture

    The DiSCmap project : digitisation of special collections: mapping, assessment, prioritisation

    The paper presents the outcomes of DiSCmap, a JISC and RIN-funded project which aimed to study users' priorities for digitisation of special collections within the context of the higher education institutions in the UK. The project produced a "long list" of 945 collections nominated for digitisation by intermediaries and end users and a user-driven prioritisation framework. Web surveys were used as a tool to gather data; in combination with these, focus groups and telephone interviews with end users helped to gain additional insights into their views in particular domains. The project developed an online forum and a Facebook group in order to find out to what extent social networking technologies can be used to sustain an informal professional community, but this did not prove to be successful. Over 1000 specialists took part in the different forms used to gather intermediaries' and end users' nominations of collections for the "long list" and opinions about digitisation priorities. The long list of 945 special collections nominated for digitisation can be useful as evidence of identified user interest; this list is not seen as a "snapshot" but as an outcome which needs to be sustained and further developed in the future. A user-driven framework for prioritising digitisation was produced; it fits well with the current JISC digitisation strategy, providing a further level of detail on user priorities. The project also suggests a flexible approach for prioritising collections for digitisation based on the use of the framework in combination with the long list of collections. The project did not make a representative study; the participation of intermediaries and end users was a matter of good will. Yet, special collections from 44% of the higher education institutions in the UK were nominated to the long list. The work on the project provided new insights and evidence on the user priorities in digitisation of special collections. It also suggests a user-driven digitisation prioritisation framework which would be of benefit in future decision making.