16 research outputs found

    A Useful Framework for Identification and Analysis of Different Query Expansion Approaches based on the Candidate Expansion Terms Extraction Methods

    Query expansion is a method for improving retrieval performance by supplementing an original query with additional terms. This process improves the quality of search engine results and helps users find the information they need. In recent years, many methods have been proposed in this area. Given this variety of approaches and the need to study their characteristics, two gaps make it difficult for researchers to study, compare, and evaluate query expansion methods precisely and to choose a method suited to their needs: the lack of a comprehensive classification based on candidate expansion term extraction methods, and the lack of suitable and complete criteria for evaluating them. This paper therefore presents a new framework. The framework identifies three basic approaches to query expansion, distinguished by how candidate expansion terms are extracted, describes their properties, and defines criteria for qualitative evaluation. The approaches are then evaluated qualitatively against these criteria. The systematic and structured framework gives researchers a platform for comparing existing methods in the field, investigating their features, especially their drawbacks, in order to improve them, and choosing an appropriate method based on their needs.

    Enhancing Clinical Decision Support Systems with Public Knowledge Bases

    With the vast amount of biomedical literature available online, doctors have the benefit of consulting the literature before making clinical decisions, but they face the daunting task of finding needles in haystacks. In this situation, it would help doctors if an effective clinical decision support system could generate accurate queries and return a manageable number of highly useful articles. Existing studies have shown the usefulness of patients’ diagnosis information in such a scenario, but the diagnosis is often missing. Furthermore, existing diagnosis prediction systems mainly focus on predicting a small range of diseases from well-formatted features, and it is still a great challenge to perform large-scale automatic diagnosis prediction based on noisy patient medical records. In this paper, we propose automatic diagnosis prediction methods for enhancing retrieval in a clinical decision support system, where the prediction is based on evidence automatically collected from publicly accessible online knowledge bases such as Wikipedia and the Semantic MEDLINE Database (SemMedDB). The assumption is that relevant diseases and their corresponding symptoms co-occur more frequently in these knowledge bases. The performance of our methods was evaluated using test collections from the Clinical Decision Support (CDS) track in TREC 2014, 2015 and 2016. The results show that our best method can automatically predict diagnoses with about 65.56% usefulness, and that such predictions can significantly improve biomedical literature retrieval. Our methods generate retrieval results comparable to those of state-of-the-art methods, which use much more complicated techniques and some manually crafted medical knowledge. One possible direction for future work is to apply these methods in collaboration with real doctors.

    Data fusion techniques for biomedical informatics and clinical decision support

    Data fusion can be used to combine multiple data sources or modalities to facilitate enhanced visualization, analysis, detection, estimation, or classification. Data fusion can be applied at the raw-data, feature-based, and decision-based levels. Data fusion applications of various kinds have been developed in areas such as statistics, computer vision, and other branches of machine learning. It has been employed in a variety of realistic scenarios such as medical diagnosis, clinical decision support, and structural health monitoring. This dissertation includes the investigation and development of methods to perform data fusion for cervical cancer intraepithelial neoplasia (CIN) and a clinical decision support system. The general framework for these applications includes image processing followed by feature development and classification of the detected region of interest (ROI). Image processing methods such as k-means clustering based on color information, dilation, erosion, and centroid location were used for ROI detection. The extracted features include texture, color, nuclei-based, and triangle features. Analysis and classification were performed using feature- and decision-level data fusion techniques such as support vector machines, statistical methods such as logistic regression and linear discriminant analysis, and voting algorithms --Abstract, page iv

    #Precision: An Exploration of the Utility of User-Generated Metadata for the Creation of Precise Microblog Query-Expansion Systems

    Twitter research provides a unique opportunity to answer fundamental questions regarding the best methods for the large-scale retrieval of extremely sparse documents. This study examines the utility of candidate expansion terms drawn from user-generated metadata for the creation of precise microblog search engines. To explore this issue, several search engines were created using different genres of candidate expansion terms, confidence thresholds, and document parameters. This study demonstrates that user-generated metadata has utility for the precise retrieval of terse queries with high levels of associated conversation, such as movie awards or current events, but performs poorly on textually rich queries with lower levels of perceived conversation. Master of Science in Information Science

    Algoritmos de expansión de consulta basados en una nueva función discreta de relevancia

    It has been shown that query expansion in the vector space model of document representation in an information retrieval system is a useful technique for improving relevance, as measured by the precision of the results delivered to users. This paper presents a new algorithm and a variation of it for performing query expansion in information retrieval systems. These algorithms are based on a new discrete function that defines the relative importance of a term in a document collection. The algorithm and its variation were evaluated against cosine similarity search and the query expansion algorithm proposed by Rocchio, with excellent results on the CACM data collection (articles published in the Communications of the ACM journal).
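
    The two baselines named in this abstract, cosine similarity search and Rocchio's query expansion, can be sketched as below. This is only a minimal illustration of the well-known baseline, not the paper's new discrete relevance function, which the abstract does not specify. The function names, the sparse dictionary representation of term weights, and the default values of alpha, beta, and gamma are assumptions made for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(docs):
    """Average term weights over a list of sparse document vectors."""
    if not docs:
        return {}
    total = Counter()
    for doc in docs:
        for term, weight in doc.items():
            total[term] += weight
    return {term: weight / len(docs) for term, weight in total.items()}


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: move the query toward relevant documents and away
    from non-relevant ones. The default weights are illustrative
    assumptions, not values taken from the paper."""
    rel_c = centroid(relevant)
    non_c = centroid(nonrelevant)
    expanded = {}
    for term in set(query) | set(rel_c) | set(non_c):
        weight = (alpha * query.get(term, 0.0)
                  + beta * rel_c.get(term, 0.0)
                  - gamma * non_c.get(term, 0.0))
        if weight > 0:  # negative weights are conventionally dropped
            expanded[term] = weight
    return expanded


# Hypothetical toy vectors, for illustration only.
query = {"query": 1.0}
relevant = [{"query": 1.0, "expansion": 1.0}]
nonrelevant = [{"cooking": 1.0}]
expanded = rocchio(query, relevant, nonrelevant)
```

    In this toy case the expanded query gains the term "expansion" from the relevant document and ends up closer, by cosine similarity, to that document than the original query was.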

    Query expansion algorithms based on a new discrete relevance function

    It has been shown that query expansion in the vector space model of document representation in an information retrieval system is a useful technique for improving relevance, as measured by the precision of the results delivered to users. This paper presents a new algorithm and a variation of it for performing query expansion in information retrieval systems. These algorithms are based on a new discrete function that defines the relative importance of a term in a document collection. The algorithm and its variation were evaluated against cosine similarity search and the query expansion algorithm proposed by Rocchio, with excellent results on the CACM data collection (articles published in the Communications of the ACM journal).

    Web search model based on user context information and collaborative filtering techniques

    Despite the continuous development of modern Web search engines, they still do not fully satisfy user needs, and the relevance of the retrieved documents is one of the main factors affecting search quality. This article proposes a Web meta search engine model that integrates item-based collaborative filtering with Massimo Melucci’s proposal, which is based on projectors onto planes derived from the user’s context information. The model was implemented in a Web meta search engine that retrieves documents from traditional search engines such as Google and Bing and presents the results as a list of documents sorted by relevance, based on the user’s context information and on collaborative feedback from the community. The proposed model is a contribution to the field of information retrieval, since it shows promising results in tests on closed collections and with users.
