5 research outputs found

    De las tarjetas perforadas a Google: una historia esquemática de la ciencia de la información (From punched cards to Google: a schematic history of information science)

    This paper reviews the history of information retrieval (IR) from punched cards and the first programmable computer (the ENIAC of 1945) to the present-day Web search engine Google and Microsoft’s “cognitive technology” Watson. The review is based on three major factors in the development of IR: (1) the enormous increase in computing power over the last 72 years, (2) the “competition” between statistical analysis of text and Natural Language Processing (NLP), in which the two have finally to a large extent converged, and (3) the corresponding changes in human intervention in the IR process.

    Combining word semantics within complex Hilbert space for information retrieval

    Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics often overlooked the role of complex numbers. Specifically, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space, and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR, which explicitly uses complex-valued Hilbert spaces, and in which terms, documents and queries are represented as complex-valued vectors. The proposal consists of integrating distributional semantics evidence within the real component of a term vector, whereas ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.
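    The representation described in this abstract can be illustrated with a minimal sketch, assuming each term has two real-valued feature lists of equal length: one from distributional semantics (real part) and one from an ontology (imaginary part). The function names and the scoring rule (squared magnitude of the normalized Hermitian inner product) are illustrative assumptions, not the authors' published method.

```python
import math

def complex_term_vector(distributional, ontological):
    """Pack two real-valued feature lists into one complex-valued vector.

    Real part: distributional evidence. Imaginary part: ontological evidence.
    (Hypothetical helper; the feature extraction itself is not shown.)
    """
    if len(distributional) != len(ontological):
        raise ValueError("feature lists must have the same length")
    return [complex(d, o) for d, o in zip(distributional, ontological)]

def inner(u, v):
    """Hermitian inner product <u|v> = sum(conj(u_i) * v_i)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm induced by the Hermitian inner product."""
    return math.sqrt(inner(u, u).real)

def score(query, doc):
    """Assumed scoring rule: |<q|d>|^2 over normalized vectors, in [0, 1]."""
    nq, nd = norm(query), norm(doc)
    if nq == 0 or nd == 0:
        return 0.0
    return abs(inner(query, doc)) ** 2 / (nq * nd) ** 2

# Usage: a query and a document sharing both kinds of evidence.
q = complex_term_vector([0.2, 0.5], [1.0, 0.0])
d = complex_term_vector([0.1, 0.4], [1.0, 1.0])
print(score(q, d))
```

    Using the conjugate in the inner product keeps the score real and non-negative, so both kinds of evidence contribute to a single ranking value.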

    Implications of Computational Cognitive Models for Information Retrieval

    This dissertation explores the implications of computational cognitive modeling for information retrieval. The parallel between information retrieval and human memory is that the goal of an information retrieval system is to find the set of documents most relevant to the query, whereas the goal of the human memory system is to assess the relevance of items stored in memory given a memory probe (Steyvers & Griffiths, 2010). The two major topics of this dissertation are desirability and information scent. Desirability is the context-independent probability of an item receiving attention (Recker & Pitkow, 1996). Desirability has been widely utilized in numerous experiments to model the probability that a given memory item would be retrieved (Anderson, 2007). Information scent is a context-dependent measure defined as the utility of an information item (Pirolli & Card, 1996b). Information scent has been widely utilized to predict the memory item that would be retrieved given a probe (Anderson, 2007) and to predict the browsing behavior of humans (Pirolli & Card, 1996b). In this dissertation, I proposed the theory that desirability observed in human memory is caused by preferential attachment in networks. Additionally, I showed that documents accessed in large repositories mirror the observed statistical properties in human memory and that these properties can be used to improve document ranking. Finally, I showed that the combination of information scent and desirability improves document ranking over existing well-established approaches.