Universal generalization and universal inter-item confusability
We argue that confusability between items should be distinguished from generalization between items. Shepard's data concern confusability, but the theories proposed by Shepard and by Tenenbaum & Griffiths concern generalization, indicating a gap between theory and data. We consider the empirical and theoretical work involved in bridging this gap
Scalar Casimir-Polder forces for uniaxial corrugations
We investigate the Dirichlet-scalar equivalent of Casimir-Polder forces
between an atom and a surface with arbitrary uniaxial corrugations. The
complexity of the problem can be reduced to a one-dimensional Green's function
equation along the corrugation which can be solved numerically. Our technique
is fully nonperturbative in the height profile of the corrugation. We present
explicit results for experimentally relevant sinusoidal and sawtooth
corrugations. Parameterizing the deviations from the planar limit in terms of
an anomalous dimension which measures the power-law deviation from the planar
case, we observe up to order-one anomalous dimensions at small and intermediate
scales and a universal regime at larger distances. This large-distance
universality can be understood from the fact that the relevant fluctuations
average over corrugation structures smaller than the atom-wall distance.
Comment: 25 pages, 7 figures
Search strategies of Wikipedia readers
The quest for information is one of the most common activities of human beings. Despite the impressive progress of search engines, finding the needed piece of information can still be very hard, as can acquiring specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to address it effectively, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users’ clicks on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well-defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated with tasks with predefined search goals, namely the case of the Wikispeedia game. In this case users first move from the ‘particular’ to the ‘universal’ before focusing down again to the required target. The clear picture offered here represents a very important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths.
Towards Automated Boundary Value Testing with Program Derivatives and Search
A natural and often used strategy when testing software is to use input
values at boundaries, i.e. where behavior is expected to change the most, an
approach often called boundary value testing or analysis (BVA). Even though
this has long been a key testing idea, it has been hard to clearly define
and formalize. Consequently, it has also been hard to automate.
In this research note we propose one such formalization of BVA by, in a
similar way as to how the derivative of a function is defined in mathematics,
considering (software) program derivatives. Critical to our definition is the
notion of distance between inputs and outputs which we can formalize and then
quantify based on ideas from information theory.
However, for our (black-box) approach to be practical one must search for
test inputs with specific properties. Coupling it with search-based software
engineering is thus required and we discuss how program derivatives can be used
as and within fitness functions.
This brief note does not allow a deeper, empirical investigation but we use a
simple illustrative example throughout to introduce the main ideas. By
combining program derivatives with search, we thus propose a practical as well
as theoretically interesting technique for automated boundary value (analysis
and) testing
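The idea of a program derivative can be illustrated with a minimal sketch. It assumes a difference quotient of the form "output distance divided by input distance". The two distance functions, the `classify_age` example program, and the exhaustive scan that stands in for a real search are all hypothetical choices made for illustration; they are not the information-theoretic distances or the example used in the note.

```python
from typing import Callable, List, Tuple


def input_distance(a: int, b: int) -> float:
    """Assumed input distance: absolute difference of two integers."""
    return float(abs(a - b))


def output_distance(x: str, y: str) -> float:
    """Assumed output distance: 0 if the outputs are equal, 1 otherwise."""
    return 0.0 if x == y else 1.0


def program_difference_quotient(
    program: Callable[[int], str], a: int, b: int
) -> float:
    """Ratio of output change to input change for one pair of inputs."""
    d_in = input_distance(a, b)
    if d_in == 0:
        raise ValueError("inputs must differ")
    return output_distance(program(a), program(b)) / d_in


def classify_age(age: int) -> str:
    """Hypothetical program under test with two behavioural boundaries."""
    if age < 0:
        return "invalid"
    if age < 18:
        return "minor"
    return "adult"


def find_boundaries(
    program: Callable[[int], str], lo: int, hi: int
) -> List[Tuple[int, int]]:
    """Scan neighbouring inputs and keep pairs with a non-zero quotient.

    A search-based approach would use the quotient as a fitness value
    instead of enumerating every pair.
    """
    return [
        (a, a + 1)
        for a in range(lo, hi)
        if program_difference_quotient(program, a, a + 1) > 0
    ]


boundaries = find_boundaries(classify_age, -5, 30)
print(boundaries)
```

Neighbouring inputs whose outputs differ yield the largest quotient, so they are the natural candidates for boundary value tests; pairs that are far apart give a smaller quotient even when their outputs differ.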
An investigation into the perspectives of providers and learners on MOOC accessibility
An effective open eLearning environment should consider the target learner’s abilities, learning goals, where learning takes place, and which specific device(s) the learner uses. MOOC platforms struggle to take these factors into account and typically are not accessible, inhibiting access to environments that are intended to be open to all. A series of research initiatives are described that are intended to help MOOC providers achieve greater accessibility and to help disabled learners improve their lifelong learning and re-skilling. In this paper, we first outline the rationale, the research questions, and the methodology. The research approach includes interviews, online surveys and a MOOC accessibility audit; we also include factors such as the risk management of the research programme and ethical considerations when conducting research with vulnerable learners. Preliminary results are presented from interviews with providers and experts and from analysis of surveys of learners. Finally, we outline the future research opportunities. This paper is framed within the context of the Doctoral Consortium organised at the TEEM'17 conference.
Normalized Web Distance and Word Similarity
There is a great deal of work in cognitive psychology, linguistics, and
computer science, about using word (or phrase) frequencies in context in text
corpora to develop measures for word similarity or word association, going back
to at least the 1960s. The goal of this chapter is to introduce the
normalized web distance (NWD) method to determine similarity between words and
phrases. It is a general way to tap the amorphous low-grade knowledge available
for free on the Internet, typed in by local users aiming at personal
gratification of diverse objectives, and yet globally achieving what is
effectively the largest semantic electronic database in the world. Moreover,
this database is available for all by using any search engine that can return
aggregate page-count estimates for a large range of search-queries. In the
paper introducing the NWD it was called `normalized Google distance (NGD),' but
since Google doesn't allow computer searches anymore, we opt for the more
neutral and descriptive NWD.
Comment: Latex, 20 pages, 7 figures, to appear in: Handbook of Natural
Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau
Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010, ISBN
978-142008592
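The abstract does not state the formula, so the sketch below relies on the commonly cited definition of the distance: NWD(x, y) = (max{log f(x), log f(y)} - log f(x, y)) / (log N - min{log f(x), log f(y)}), where f(x) is the page count for term x, f(x, y) is the count of pages containing both terms, and N is the total number of indexed pages. The function name and the page counts in the usage line are invented for illustration; no real search engine is queried.

```python
import math


def normalized_web_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """Compute NWD from aggregate page counts.

    fx, fy : page counts for each term alone
    fxy    : page count for pages containing both terms
    n      : total number of indexed pages (assumed larger than fx and fy)
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur on at least one page")
    if n <= max(fx, fy):
        raise ValueError("n must exceed the individual page counts")
    if fxy <= 0:
        # Terms never co-occur: treat the distance as infinite.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Hypothetical counts, chosen only to exercise the function.
example = normalized_web_distance(fx=100, fy=100, fxy=10, n=10_000)
print(example)
```

Terms that always appear together get a distance of 0, and the value grows as the co-occurrence count shrinks relative to the individual counts.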
Negative Statements Considered Useful
Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities
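The peer-based idea in (i) can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's method: it assumes a KB represented as a dictionary from entity to a set of (property, value) pairs, takes the peer list as given, and ranks candidates only by the fraction of peers that hold the statement. The entity names and statements are made up.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Statement = Tuple[str, str]  # (property, value)


def candidate_negatives(
    entity: str,
    peers: List[str],
    kb: Dict[str, Set[Statement]],
) -> List[Tuple[Statement, float]]:
    """Statements held by peers but absent for `entity`, ranked by peer share."""
    if not peers:
        return []
    own = kb.get(entity, set())
    counts: Counter = Counter()
    for peer in peers:
        # Only statements the target entity lacks are candidates.
        for statement in kb.get(peer, set()) - own:
            counts[statement] += 1
    scored = [(s, c / len(peers)) for s, c in counts.items()]
    # Highest peer share first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy knowledge base with hypothetical entities and statements.
toy_kb: Dict[str, Set[Statement]] = {
    "peer_a": {("award", "prize_x"), ("field", "physics")},
    "peer_b": {("award", "prize_x"), ("field", "physics"),
               ("member_of", "society_y")},
    "peer_c": {("award", "prize_x"), ("field", "physics"),
               ("member_of", "society_y")},
    "target": {("field", "physics")},
}

ranked = candidate_negatives("target", ["peer_a", "peer_b", "peer_c"], toy_kb)
print(ranked)
```

A statement shared by every peer but missing for the target ranks highest, reflecting the intuition that its absence is the most surprising. Because the KB is incomplete, such a candidate is only a potential negative statement, which is why the abstract stresses ranking.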