
    Universal generalization and universal inter-item confusability

    We argue that confusability between items should be distinguished from generalization between items. Shepard's data concern confusability, but the theories proposed by Shepard and by Tenenbaum & Griffiths concern generalization, indicating a gap between theory and data. We consider the empirical and theoretical work involved in bridging this gap.

    Scalar Casimir-Polder forces for uniaxial corrugations

    We investigate the Dirichlet-scalar equivalent of Casimir-Polder forces between an atom and a surface with arbitrary uniaxial corrugations. The complexity of the problem can be reduced to a one-dimensional Green's function equation along the corrugation which can be solved numerically. Our technique is fully nonperturbative in the height profile of the corrugation. We present explicit results for experimentally relevant sinusoidal and sawtooth corrugations. Parameterizing the deviations from the planar limit in terms of an anomalous dimension which measures the power-law deviation from the planar case, we observe up to order-one anomalous dimensions at small and intermediate scales and a universal regime at larger distances. This large-distance universality can be understood from the fact that the relevant fluctuations average over corrugation structures smaller than the atom-wall distance. Comment: 25 pages, 7 figures
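
    The "anomalous dimension" mentioned above can be read as the local power-law exponent of the deviation from the planar result. The definition below is a plausible reconstruction under that reading, not a formula quoted from the paper, with E(H) the corrugated-surface energy and E_planar(H) the planar one at atom-wall distance H.

    % Assumed reading (not taken from the paper): eta measures the local
    % power-law deviation of the corrugated result from the planar one.
    \eta(H) = \frac{\mathrm{d}\,\ln\!\left[ E(H)/E_{\mathrm{planar}}(H) \right]}{\mathrm{d}\,\ln H}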

    Search strategies of Wikipedia readers

    The quest for information is one of the most common activities of human beings. Despite the impressive progress of search engines, finding the needed piece of information can still be very tough, as can acquiring specific competences and knowledge by shaping and following the proper learning paths. Indeed, the need to find sensible paths in information networks is one of the biggest challenges of our societies and, to address it effectively, it is important to investigate the strategies adopted by human users to cope with the cognitive bottleneck of finding their way in a growing sea of information. Here we focus on the case of Wikipedia and investigate a recently released dataset about users' clicks on the English Wikipedia, namely the English Wikipedia Clickstream. We perform a semantically charged analysis to uncover the general patterns followed by information seekers in the multi-dimensional space of Wikipedia topics/categories. We discover the existence of well-defined strategies in which users tend to start from very general, i.e., semantically broad, pages and progressively narrow down the scope of their navigation, while keeping a growing semantic coherence. This is unlike strategies associated with tasks that have predefined search goals, namely the case of the Wikispeedia game, in which users first move from the 'particular' to the 'universal' before focusing down again on the required target. The clear picture offered here represents an important stepping stone towards a better design of information networks and recommendation strategies, as well as the construction of radically new learning paths.
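
    As a rough illustration of what working with this dataset involves, the sketch below aggregates per-article transition counts from a Clickstream dump. The tab-separated prev/curr/type/count layout matches the current public dumps (older releases use a different column layout), and the file name in the usage comment is a hypothetical placeholder, not a detail from the paper.

    import csv
    from collections import defaultdict

    def load_clickstream(path):
        # Aggregate (prev -> curr) transition counts from a Clickstream TSV dump.
        # Assumes the 4-column layout prev, curr, type, n of the current public
        # dumps; only internal "link" transitions are kept, and the aggregated
        # "other-*" pseudo-sources are skipped.
        transitions = defaultdict(lambda: defaultdict(int))
        with open(path, encoding="utf-8") as fh:
            for row in csv.reader(fh, delimiter="\t"):
                if len(row) != 4:
                    continue
                prev, curr, link_type, n = row
                if link_type != "link" or prev.startswith("other-"):
                    continue
                transitions[prev][curr] += int(n)
        return transitions

    # Hypothetical file name; prints the most common destinations from one page.
    # clicks = load_clickstream("clickstream-enwiki.tsv")
    # print(sorted(clicks["Philosophy"].items(), key=lambda kv: -kv[1])[:5])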

    Towards Automated Boundary Value Testing with Program Derivatives and Search

    A natural and often used strategy when testing software is to use input values at boundaries, i.e. where behavior is expected to change the most, an approach often called boundary value testing or analysis (BVA). Even though this has been a key testing idea for long it has been hard to clearly define and formalize. Consequently, it has also been hard to automate. In this research note we propose one such formalization of BVA by, in a similar way as to how the derivative of a function is defined in mathematics, considering (software) program derivatives. Critical to our definition is the notion of distance between inputs and outputs which we can formalize and then quantify based on ideas from Information theory. However, for our (black-box) approach to be practical one must search for test inputs with specific properties. Coupling it with search-based software engineering is thus required and we discuss how program derivatives can be used as and within fitness functions. This brief note does not allow a deeper, empirical investigation but we use a simple illustrative example throughout to introduce the main ideas. By combining program derivatives with search, we thus propose a practical as well as theoretically interesting technique for automated boundary value (analysis and) testing
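
    A minimal sketch of the idea, under assumptions of our own: a program derivative is approximated as output distance over input distance for a pair of inputs, and a fitness function rewards pairs with a large derivative (i.e. pairs that straddle a behavioral boundary). The concrete distance functions, the toy program and the naive random search are illustrative choices, not the note's definitions.

    import random

    def program_derivative(program, x1, x2, in_dist, out_dist):
        # Finite-difference analogue of a derivative for a black-box program:
        # how much the output changes relative to how much the input changes.
        d_in = in_dist(x1, x2)
        if d_in == 0:
            return 0.0
        return out_dist(program(x1), program(x2)) / d_in

    # Illustrative program with a behavioral boundary at x == 100.
    def classify(x):
        return "big" if x > 100 else "small"

    in_dist = lambda a, b: abs(a - b)               # distance between inputs
    out_dist = lambda a, b: 0.0 if a == b else 1.0  # crude output distance

    def fitness(pair):
        # Large derivative -> the pair straddles a behavioral boundary tightly.
        return program_derivative(classify, pair[0], pair[1], in_dist, out_dist)

    # Naive random search; a real setup would plug this fitness function into a
    # proper search-based software engineering tool.
    best = max(((random.randint(0, 200), random.randint(0, 200))
                for _ in range(20000)), key=fitness)
    print(best, fitness(best))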

    An investigation into the perspectives of providers and learners on MOOC accessibility

    An effective open eLearning environment should consider the target learner's abilities, learning goals, where learning takes place, and which specific device(s) the learner uses. MOOC platforms struggle to take these factors into account and typically are not accessible, inhibiting access to environments that are intended to be open to all. A series of research initiatives are described that are intended to help MOOC providers achieve greater accessibility and to help disabled learners improve their lifelong learning and re-skilling. In this paper, we first outline the rationale, the research questions, and the methodology. The research approach includes interviews, online surveys and a MOOC accessibility audit; we also cover factors such as the risk management of the research programme and ethical considerations when conducting research with vulnerable learners. Preliminary results are presented from interviews with providers and experts and from analysis of surveys of learners. Finally, we outline future research opportunities. This paper is framed within the context of the Doctoral Consortium organised at the TEEM'17 conference.

    Normalized Web Distance and Word Similarity

    There is a great deal of work in cognitive psychology, linguistics, and computer science, about using word (or phrase) frequencies in context in text corpora to develop measures for word similarity or word association, going back to at least the 1960s. The goal of this chapter is to introduce the normalized web distance (NWD) method to determine similarity between words and phrases. It is a general way to tap the amorphous low-grade knowledge available for free on the Internet, typed in by local users aiming at personal gratification of diverse objectives, and yet globally achieving what is effectively the largest semantic electronic database in the world. Moreover, this database is available for all by using any search engine that can return aggregate page-count estimates for a large range of search-queries. In the paper introducing the NWD it was called 'normalized Google distance (NGD),' but since Google doesn't allow computer searches anymore, we opt for the more neutral and descriptive NWD. Comment: LaTeX, 20 pages, 7 figures; to appear in: Handbook of Natural Language Processing, Second Edition, Nitin Indurkhya and Fred J. Damerau, Eds., CRC Press, Taylor and Francis Group, Boca Raton, FL, 2010, ISBN 978-142008592
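
    For concreteness, the standard NWD/NGD formula can be computed directly from aggregate page counts, as in the sketch below; the counts used in the example are made-up numbers for illustration, not real search-engine figures.

    from math import log

    def nwd(fx, fy, fxy, N):
        # Normalized web distance from aggregate page counts:
        #   fx, fy - number of pages containing each term,
        #   fxy    - number of pages containing both terms,
        #   N      - (an estimate of) the total number of indexed pages.
        lx, ly, lxy = log(fx), log(fy), log(fxy)
        return (max(lx, ly) - lxy) / (log(N) - min(lx, ly))

    # Made-up counts purely for illustration.
    print(nwd(fx=8_000_000, fy=5_000_000, fxy=2_500_000, N=50_000_000_000))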

    Negative Statements Considered Useful

    Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
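
    To make the peer-based idea concrete, the sketch below derives candidate negative statements for an entity from properties that most of its highly related peers have but the entity itself lacks, ranked by peer frequency; the toy entity/peer data and the frequency-based ranking are illustrative assumptions, not the paper's exact features or datasets.

    from collections import Counter

    def candidate_negatives(entity_props, peer_props_list, min_peer_ratio=0.5):
        # Peer-based sketch: properties common among an entity's peers but absent
        # for the entity are candidate negative statements, ranked by the fraction
        # of peers that have them (a stand-in for a proper ranking model).
        counts = Counter(p for peer in peer_props_list for p in set(peer))
        n_peers = len(peer_props_list)
        candidates = [(prop, cnt / n_peers) for prop, cnt in counts.items()
                      if prop not in entity_props and cnt / n_peers >= min_peer_ratio]
        return sorted(candidates, key=lambda pc: -pc[1])

    # Toy example with hypothetical (property, value) pairs.
    entity = {("occupation", "physicist"), ("award", "Nobel Prize in Physics")}
    peers = [
        {("occupation", "physicist"), ("award", "Nobel Prize in Physics"), ("member_of", "Royal Society")},
        {("occupation", "physicist"), ("member_of", "Royal Society")},
        {("occupation", "physicist"), ("award", "Copley Medal"), ("member_of", "Royal Society")},
    ]
    print(candidate_negatives(entity, peers))
    # -> e.g. the entity is (probably) not a member of the Royal Society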