Fast Locality-Sensitive Hashing Frameworks for Approximate Near Neighbor Search
The Indyk-Motwani Locality-Sensitive Hashing (LSH) framework (STOC 1998) is a
general technique for constructing a data structure to answer approximate near
neighbor queries by using a distribution $\mathcal{H}$ over locality-sensitive
hash functions that partition space. For a collection of $n$ points, after
preprocessing, the query time is dominated by $O(n^{\rho} \log n)$ evaluations
of hash functions from $\mathcal{H}$ and $O(n^{\rho})$ hash table lookups and
distance computations, where $\rho \in (0,1)$ is determined by the
locality-sensitivity properties of $\mathcal{H}$. It follows from a recent
result by Dahlgaard et al. (FOCS 2017) that the number of locality-sensitive
hash functions can be reduced to $O(\log^2 n)$, leaving the query time to be
dominated by $O(n^{\rho})$ distance computations and $O(n^{\rho} \log n)$
additional word-RAM operations. We state this result as a general framework and
provide a simpler analysis showing that the number of lookups and distance
computations closely matches the Indyk-Motwani framework, making it a viable
replacement in practice. Using ideas from another locality-sensitive hashing
framework by Andoni and Indyk (SODA 2006), we are able to reduce the number of
additional word-RAM operations to $O(n^{\rho})$.
Comment: 15 pages, 3 figures
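The classical multi-table scheme that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the paper's reduced-hash-function framework: it assumes points are bit tuples under Hamming distance and uses bit sampling as the locality-sensitive family. The class name `BitSamplingLSH` and the parameters `k` (sampled bits per table) and `L` (number of tables) are hypothetical names chosen for this example; in the Indyk-Motwani analysis `L` would be on the order of n^ρ.

```python
import random
from collections import defaultdict


def hamming(a, b):
    """Number of positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class BitSamplingLSH:
    """L hash tables, each keyed by k randomly sampled bit positions."""

    def __init__(self, dim, k, L, seed=0):
        rng = random.Random(seed)
        # One hash function per table: a concatenation of k sampled coordinates.
        self.projections = [
            [rng.randrange(dim) for _ in range(k)] for _ in range(L)
        ]
        self.tables = [defaultdict(list) for _ in range(L)]

    def _key(self, point, positions):
        return tuple(point[i] for i in positions)

    def insert(self, point):
        # Preprocessing: store the point in one bucket per table.
        for positions, table in zip(self.projections, self.tables):
            table[self._key(point, positions)].append(point)

    def query(self, q, radius):
        """Return a stored point within `radius` of q, or None if no
        probed bucket contains one."""
        for positions, table in zip(self.projections, self.tables):
            # One table lookup, then one distance computation per candidate.
            for candidate in table.get(self._key(q, positions), ()):
                if hamming(candidate, q) <= radius:
                    return candidate
        return None
```

A query evaluates `k * L` sampled coordinates, performs `L` lookups, and computes a distance for each colliding candidate, which mirrors the three cost terms the abstract lists. A near point can still be missed when it collides with the query in none of the `L` tables, so the result is approximate.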
The DiSCmap project : overview and first results
Traditionally, digitisation of cultural and scientific heritage material for use by the scholarly community has been led by supply rather than demand. The DiSCmap project, commissioned by JISC in 2008, aimed to study what refocussing of digitisation efforts would best suit the users of digitised materials, especially in the context of research and teaching in higher education institutions in the UK. The paper presents some of its initial outcomes, based on quantitative and qualitative analysis of 945 special collections nominated for digitisation by intermediary users (librarians, archivists and museum curators), as well as an end-user study combining an online survey, focus groups and in-depth interviews. The criteria for prioritising digitisation advanced by intermediaries and end users were analysed and cross-mapped to a range of existing digitisation frameworks. A user-driven prioritisation framework which synthesises the findings of the project is presented.
Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines
A cross-disciplinary examination of the user behaviours involved in seeking
and evaluating data is surprisingly absent from the research data discussion.
This review explores the data retrieval literature to identify commonalities in
how users search for and evaluate observational research data. Two analytical
frameworks rooted in information retrieval and science technology studies are
used to identify key similarities in practices as a first step toward
developing a model describing data retrieval.
Conservation and use of genetic resources of underutilized crops in the Americas - A continental analysis
Latin America is home to dramatically diverse agroecological regions which harbor a high concentration of underutilized plant species, whose genetic resources hold the potential to address challenges such as sustainable agricultural development, food security and sovereignty, and climate change. This paper examines the status of an expert-informed list of underutilized crops in Latin America and analyses how the most common features of underuse apply to them. The analysis pays special attention to whether and how existing international policy and legal frameworks on biodiversity and plant genetic resources effectively support the conservation and sustainable use of underutilized crops. Results show that not all minor crops are affected by the same degree of neglect, and that the aspects under which any crop is underutilized vary greatly, calling for specific analyses and interventions. We also show that current international policy and legal instruments have so far provided limited stimulus and funding for the conservation and sustainable use of the genetic resources of these crops. Finally, the paper proposes an analytical framework for identifying and evaluating a crop’s underutilization, in order to define the most appropriate type and levels of intervention (international, national, local) for improving its status.
The DiSCmap project : digitisation of special collections: mapping, assessment, prioritisation
The paper presents the outcomes of DiSCmap, a JISC and RIN-funded project which aimed to study users' priorities for digitisation of special collections within the context of the higher education institutions in the UK. The project produced a "long list" of 945 collections nominated for digitisation by intermediaries and end users, and a user-driven prioritisation framework. Web surveys were used to gather data, in combination with focus groups; telephone interviews with end users helped to gain additional insights into their views in particular domains. The project developed an online forum and a Facebook group in order to find out to what extent social networking technologies can be used to sustain an informal professional community, but this did not prove successful. Over 1000 specialists took part in the different forms used to gather intermediaries' and end users' nominations of collections for the "long list" and opinions about digitisation priorities. The long list of 945 special collections nominated for digitisation can be useful as evidence of identified user interest; this list is not seen as a "snapshot" but as an outcome which needs to be sustained and further developed in the future. A user-driven framework for prioritising digitisation was produced; it fits well with the current JISC digitisation strategy, providing a further level of detail on user priorities. The project also suggests a flexible approach for prioritising collections for digitisation, based on the use of the framework in combination with the long list of collections. The project did not conduct a representative study; the participation of intermediaries and end users was a matter of goodwill. Yet special collections from 44% of the higher education institutions in the UK were nominated to the long list. The work on the project provided new insights and evidence on user priorities in the digitisation of special collections. It also suggests a user-driven digitisation prioritisation framework which would be of benefit in future decision making.