7 research outputs found

    SABIO: Soft Agent for Extended Information Retrieval

    In the current study, an integrated system called SABIO is presented. The system applies Information Retrieval (IR) techniques developed for collections of textual documents to nontextual corpora. Its search algorithm improves IR efficiency and decreases the computational burden by using a fuzzy logic-based procedure for IR. This procedure is integrated in a flexible and fault-tolerant, human-reasoning-based search algorithm. The Accumulated Knowledge Set (AKS) of the system is sorted in a hierarchical, multilevel, tree-structured ontology. The objects in the AKS are represented using a novel human-reasoning-based method. This representation takes into account the occurrence of related terms. The system uses a novel fuzzy logic-based term-weighting (TW) method, which improves on the classical term frequency–inverse document frequency (TF-IDF) method generally used for TW. The system is the core of a wizard for searching the website of the University of Seville, www.us.es, which is currently in testing.
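For orientation, the classical TF-IDF weighting that the abstract names as its baseline can be sketched as follows. This is a minimal illustration of the standard formula only, not SABIO's fuzzy logic-based method, which the abstract does not specify. The function name `tf_idf` and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Classical weighting: tf = count / document length,
    idf = log(N / number of documents containing the term).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample corpus, already tokenized.
docs = [
    "enrolment dates for new students".split(),
    "library opening hours for students".split(),
    "enrolment fees and grants".split(),
]
weights = tf_idf(docs)
```

A term that occurs in only one document ("dates") receives a higher weight than one shared by several documents ("students"), and a term present in every document receives a weight of zero.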

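    Bag of Textual Graphs: an accurate, efficient, and general-purpose graph-based text representation model (Sacola de grafos textuais : um modelo de representação de textos baseado em grafos, preciso, eficiente e de propósito geral)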

    Advisor: Ricardo da Silva Torres. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação. Abstract: Text representation models are the fundamental basis for Information Retrieval and Text Mining tasks. Although different text models have been proposed, they are not at the same time efficient, accurate, and flexible enough to be used in a variety of applications. Here we present Bag of Textual Graphs, a text representation model that addresses these three requirements by combining a graph-based representation model with a generic framework for graph-to-vector synthesis. We evaluate our method in experiments on four well-known text collections: Reuters-21578, 20-newsgroups, 4-universities, and K-series. Experimental results demonstrate that our model is generic enough to handle different collections and is more efficient than widely used state-of-the-art methods in text classification and retrieval tasks, without loss of accuracy. Degree: Master's in Computer Science (Mestrado, Ciência da Computação).

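    Agent for automatic information retrieval in diverse environments based on computational intelligence techniques (Agente para recuperación automática de información en diversos entornos basado en técnicas de inteligencia computacional)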

    This thesis addresses the problem of information retrieval, understood as the automatic search, within a collection of diverse documents, for all documents related with a certain degree of relevance to a query formulated by a user. In particular, it presents a novel fuzzy logic-based system for implementing intelligent agents that solve real information retrieval problems in diverse environments. The information retrieval and term-weighting methods developed gave rise to the publications included in the compendium of this thesis, and their application led to an entry in the intellectual property registry office. In the collaborations with companies listed in Chapter 5, several prototypes of intelligent agents were implemented by applying the computational intelligence techniques that underpin the information retrieval methods developed, with the aim of creating intelligent agents that solve real problems requiring information retrieval. The intelligent agents use the information retrieval method, the term-weighting method, and the information storage structure developed in the publications that make up the compendium of this thesis. These publications justify the sound performance of these methods, as well as the improved performance in retrieving information contained in web portals compared with the Vector Space Model (VSM) and the tf-idf term-weighting method.
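The Vector Space Model used as the comparison baseline ranks documents by the cosine of the angle between the query vector and each document vector. The sketch below illustrates that baseline with raw term counts as vector components. It is not the thesis's fuzzy logic-based method, and the names `cosine`, `rank_documents`, and the sample pages are assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty vector is similar to nothing.
    return dot / (norm_a * norm_b)


def rank_documents(query, docs):
    """Return (score, document) pairs, best match first."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical web-portal page titles.
pages = [
    "sports centre opening times",
    "library opening hours",
    "enrolment fees and grants",
]
ranking = rank_documents("library hours", pages)
```

In a fuller baseline the raw counts would be replaced by tf-idf weights before computing the cosine.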

    Combining shape and color. A bottom-up approach to evaluate object similarities

    The objective of the present work is to develop a bottom-up approach to estimate the similarity between two unknown objects. Given a set of digital images, we want to identify the main objects and to determine whether they are similar or not. In recent decades many object recognition and classification strategies, driven by higher-level activities, have been successfully developed. The peculiarity of this work, instead, is the attempt to work without any training phase or a priori knowledge about the objects or their context. Indeed, if we suppose we are in an unstructured and completely unknown environment, we usually have to deal with novel objects never seen before; under these hypotheses, it would be very useful to define some kind of similarity among the instances under analysis (even if we do not know which category they belong to). To obtain this result, we start by observing that human beings use a lot of information and analyze very different aspects to achieve object recognition: shape, position, color and so on. Hence we try to reproduce part of this process, combining different methodologies (each working on a specific characteristic) to obtain a more meaningful idea of similarity. Mainly inspired by the human conception of representation, we identify two main characteristics, which we call the implicit and explicit models. The term "explicit" is used to account for the main traits of what, in the human representation, connotes a principal source of information regarding a category, a sort of visual synecdoche (corresponding to the shape); the term "implicit", on the other hand, accounts for the object rendered by shadows and lights, colors and volumetric impression, a sort of visual metonymy (corresponding to the chromatic characteristics). During the work, we had to face several problems and we tried to define specific solutions. In particular, our contributions are: - defining a bottom-up approach for image segmentation (which does not rely on any a priori knowledge); - combining different features to evaluate object similarity (particularly focusing on shape and color); - defining a generic distance (similarity) measure between objects (without any attempt to identify the possible category they belong to); - analyzing the consequences of using the number of modes as an estimate of the number of mixture components (in the Expectation-Maximization algorithm).

    The neuro-cognitive representation of word meaning resolved in space and time.

    One of the core human abilities is that of interpreting symbols. Prompted with a perceptual stimulus devoid of any intrinsic meaning, such as a written word, our brain can access a complex multidimensional representation, called semantic representation, which corresponds to its meaning. Notwithstanding decades of neuropsychological and neuroimaging work on the cognitive and neural substrate of semantic representations, many questions are left unanswered. The research in this dissertation attempts to unravel one of them: are the neural substrates of different components of concrete word meaning dissociated? In the first part, I review the different theoretical positions and empirical findings on the cognitive and neural correlates of semantic representations. I highlight how recent methodological advances, namely the introduction of multivariate methods for the analysis of distributed patterns of brain activity, broaden the set of hypotheses that can be empirically tested. In particular, they allow the exploration of the representational geometries of different brain areas, which is instrumental to the understanding of where and when the various dimensions of the semantic space are activated in the brain. Crucially, I propose an operational distinction between motor-perceptual dimensions (i.e., those attributes of the objects referred to by the words that are perceived through the senses) and conceptual ones (i.e., the information that is built via a complex integration of multiple perceptual features). In the second part, I present the results of the studies I conducted in order to investigate the automaticity of retrieval, topographical organization, and temporal dynamics of motor-perceptual and conceptual dimensions of word meaning. 
First, I show how the representational spaces retrieved with different behavioral and corpus-based methods (i.e., Semantic Distance Judgment, Semantic Feature Listing, WordNet) appear to be highly correlated and overall consistent within and across subjects. Second, I present the results of four priming experiments suggesting that perceptual dimensions of word meaning (such as implied real-world size and sound) are recovered in an automatic but task-dependent way during reading. Third, thanks to a functional magnetic resonance imaging experiment, I show a representational shift along the ventral visual path: from perceptual features, preferentially encoded in primary visual areas, to conceptual ones, preferentially encoded in mid and anterior temporal areas. This result indicates that complementary dimensions of the semantic space are encoded in a distributed yet partially dissociated way across the cortex. Fourth, by means of a study conducted with magnetoencephalography, I present evidence of an early (around 200 ms after stimulus onset) simultaneous access to both motor-perceptual and conceptual dimensions of the semantic space thanks to different aspects of the signal: inter-trial phase coherence appears to be key for the encoding of perceptual dimensions, while spectral power changes appear to support the encoding of conceptual dimensions. These observations suggest that the neural substrates of different components of symbol meaning can be dissociated in terms of localization and of the feature of the signal encoding them, while sharing a similar temporal evolution.
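Inter-trial phase coherence is commonly defined as the length of the mean unit phase vector across trials at a given time and frequency. The sketch below shows that standard definition only; the dissertation's actual analysis pipeline is not described in the abstract, and the function name `itpc` and the sample phases are assumptions made for the example.

```python
import cmath
import math


def itpc(phases):
    """Inter-trial phase coherence for one time-frequency point.

    `phases` holds one phase angle in radians per trial. The result
    lies between 0 (phases spread uniformly) and 1 (identical phases).
    """
    if not phases:
        raise ValueError("at least one trial is required")
    # Average the unit vectors exp(i * phase), then take the length.
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


# Hypothetical phase values for illustration.
aligned = itpc([0.3] * 10)      # every trial has the same phase
opposed = itpc([0.0, math.pi])  # two trials in exact anti-phase
```

Because only the phase angles enter the computation, the measure is independent of signal amplitude, which is what lets it carry information separate from spectral power changes.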