Throwing Out the Baby with the Bathwater: The Undesirable Effects of National Research Assessment Exercises on Research
The evaluation of the quality of research at a national level has become increasingly common. The UK has been at the forefront of this trend, having undertaken many assessments since 1986, the latest being the "Research Excellence Framework" in 2014. The argument of this paper is that, whatever the intended results in terms of evaluating and improving research, there have been many, presumably unintended, results that are highly undesirable for research and the university community more generally. We situate our analysis using Bourdieu's theory of cultural reproduction and then focus on the peculiarities of the 2008 RAE and the 2014 REF, the rules of which allowed for, and indeed encouraged, significant game-playing on the part of striving universities. We conclude with practical recommendations to maintain the general intention of research assessment without the undesirable side-effects.
Opportunistic linked data querying through approximate membership metadata
Between URI dereferencing and the SPARQL protocol lies a largely unexplored axis of possible interfaces to Linked Data, each with its own combination of trade-offs. One of these interfaces is Triple Pattern Fragments, which allows clients to execute SPARQL queries against low-cost servers, at the cost of higher bandwidth. Increasing a client's efficiency means lowering the number of requests, which can be achieved, among other means, through additional metadata in responses. We noted that typical SPARQL query evaluations against Triple Pattern Fragments involve a significant portion of membership subqueries, which check the presence of a specific triple rather than a variable pattern. This paper studies the impact of providing approximate membership functions, i.e., Bloom filters and Golomb-coded sets, as extra metadata. In addition to reducing HTTP requests, such functions make it possible to achieve full result recall earlier when temporarily allowing lower precision. Half of the tested queries from a WatDiv benchmark test set could be executed with up to a third fewer HTTP requests with only marginally higher server cost. Query times, however, did not improve, likely due to slower metadata generation and transfer. This indicates that approximate membership functions can partly improve the client-side query process with minimal impact on the server and its interface.
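As an illustration of the approximate membership idea in the abstract above, here is a minimal Bloom filter sketch in Python. It is not the paper's implementation: the class `BloomFilter`, the helpers `triple_key` and `may_contain_triple`, the SHA-256 double-hashing scheme, and the example triples are all assumptions made for this example. It shows only the general principle that a client holding such a filter can answer "definitely absent" locally and skip the corresponding membership request.

```python
import hashlib
import math


class BloomFilter:
    """Approximate set: no false negatives, tunable false-positive rate."""

    def __init__(self, capacity: int, error_rate: float = 0.01):
        # Standard sizing formulas for the number of bits (m) and hashes (k).
        self.m = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def triple_key(subject: str, predicate: str, obj: str) -> str:
    # Hypothetical serialization of a triple into a single filter key.
    return f"{subject} {predicate} {obj}"


def may_contain_triple(bloom: BloomFilter, triple) -> bool:
    # False means "definitely absent": the membership request can be skipped.
    # True means "possibly present": a request is still needed to confirm.
    return triple_key(*triple) in bloom


# Hypothetical data: the server would ship the filter as response metadata.
known = [("ex:alice", "ex:knows", f"ex:person{i}") for i in range(100)]
bloom = BloomFilter(capacity=len(known), error_rate=0.01)
for t in known:
    bloom.add(triple_key(*t))
```

A Bloom filter never reports a stored triple as absent, so skipping requests on a negative answer loses no results; positive answers may be false and still need confirmation when exact precision is required.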
Balancing clusters to reduce response time variability in large scale image search
Many algorithms for approximate nearest neighbor search in high-dimensional
spaces partition the data into clusters. At query time, in order to avoid
exhaustive search, an index selects the few (or a single) clusters nearest to
the query point. Clusters are often produced by the well-known k-means
approach since it has several desirable properties. On the downside, it tends
to produce clusters having quite different cardinalities. Imbalanced clusters
negatively impact both the variance and the expectation of query response
times. This paper proposes to modify k-means centroids to produce clusters
with more comparable sizes without sacrificing the desirable properties.
Experiments with a large scale collection of image descriptors show that our
algorithm significantly reduces the variance of response times without
seriously impacting the search quality.
Aid Selectivity According to Augmented Criteria
A dominant trend in the literature maintains that donor assistance should be targeted to poor countries with sound institutions and policies. In this context, donor selectivity refers to the extent to which aid is allocated according to the principles of this "canonical" model. This paper shows that it is legitimate for donors to simultaneously use other selectivity criteria corresponding either to expected factors of aid effectiveness or to handicaps to development. It is notably argued that vulnerability to exogenous shocks and a low level of human capital should be considered as selectivity criteria. Taking these other criteria into account dramatically changes the assessment of donor selectivity.
Keywords: aid selectivity, aid effectiveness, vulnerability, handicaps, least developed
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
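For readers unfamiliar with count featurization, the following Python sketch shows the general technique in its simplest form: each categorical feature value is replaced by the number of times it was seen with each label, plus the resulting positive rate. This is a generic illustration, not Pyramid's implementation; the function names `build_count_tables` and `featurize`, the binary 0/1 label, and the toy rows are assumptions made for this example.

```python
def build_count_tables(rows, label_key="label"):
    """Build per-feature tables mapping value -> [negative_count, positive_count]."""
    tables = {}
    for row in rows:
        y = row[label_key]  # assumed to be 0 or 1
        for feature, value in row.items():
            if feature == label_key:
                continue
            counts = tables.setdefault(feature, {}).setdefault(value, [0, 0])
            counts[y] += 1
    return tables


def featurize(row, tables, label_key="label"):
    """Replace each categorical value by (negatives, positives, positive rate)."""
    vector = []
    for feature in sorted(k for k in row if k != label_key):
        neg, pos = tables.get(feature, {}).get(row[feature], [0, 0])
        total = neg + pos
        rate = pos / total if total else 0.0
        vector.extend([neg, pos, rate])
    return vector


# Hypothetical click log: the count tables summarize the history, so a model
# can be trained on a small sample of rows featurized against those tables.
history = [
    {"ad": "a", "user": "u1", "label": 1},
    {"ad": "a", "user": "u2", "label": 0},
    {"ad": "b", "user": "u1", "label": 1},
]
tables = build_count_tables(history)
features = featurize({"ad": "a", "user": "u1"}, tables)
```

Because the tables are aggregates, the featurized vectors stay low-dimensional regardless of how many distinct values a feature has, which is what allows training on a small fraction of the raw rows.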