2 research outputs found

    Interactive Malayalam Question Answering System: A Neural Word Embedding And Similarity Measure Based Approach.

    This system is an automated, domain-specific knowledge repository designed to give reliable Malayalam answers to questions about COVID-19. Both the Malayalam documents and the questions are processed with Natural Language Processing (NLP) algorithms. The semantic modelling and document conversion stages use a word embedding approach, specifically Continuous Bag of Words (CBOW), to capture the nuances of the language. The results retrieved for a given query are then ranked using the cosine similarity measure, so that the most relevant information is presented to the user first. Integral to the system is our own Malayalam question-answering dataset, curated from reliable and publicly accessible sources related to COVID-19, which serves as the basis for experimentation. The system's performance is quantified using the F1 score, a metric that combines precision and recall. In our experiments, the F1 score of the Semantic Malayalam Question-Answering System is 76%, which indicates that it is effective at delivering trustworthy information in Malayalam within the context of COVID-19.
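The ranking step described in this abstract can be illustrated with a minimal sketch. It assumes that word vectors have already been trained elsewhere (for example with a CBOW model) and that a document or question is represented by the average of its word vectors; the averaging step, the function names, and the toy vectors and placeholder tokens below are all assumptions for illustration, not details taken from the paper.

```python
import math


def average_vector(tokens, embeddings):
    """Average the vectors of the tokens that have an embedding (assumed representation)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_tokens, documents, embeddings):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    query_vec = average_vector(query_tokens, embeddings)
    if query_vec is None:
        return []
    scored = []
    for doc_id, tokens in documents.items():
        doc_vec = average_vector(tokens, embeddings)
        if doc_vec is not None:
            scored.append((doc_id, cosine_similarity(query_vec, doc_vec)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy two-dimensional vectors and placeholder tokens (hypothetical, not from the paper).
toy_embeddings = {
    "fever": [1.0, 0.0],
    "cough": [0.9, 0.1],
    "vaccine": [0.0, 1.0],
}
toy_documents = {
    "d1": ["fever", "cough"],
    "d2": ["vaccine"],
}
ranking = rank_documents(["fever"], toy_documents, toy_embeddings)
```

With these toy vectors, the document about symptoms ranks above the one about vaccines for the query token `fever`. A real system would replace the toy dictionary with vectors learned from the Malayalam corpus.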

    A New Approach in Query Expansion Methods for Improving Information Retrieval

    This research develops a new approach to query expansion by integrating Association Rules (AR) and Ontology. The proposed approach expands the query in three steps: (1) document retrieval; (2) query expansion using AR; and (3) query expansion using Ontology. In the first step, the system retrieves the top documents for the user's initial query. These documents are then preprocessed (stopword removal, POS tagging, TF-IDF). Next, a Frequent Itemset (FI) search is run with FP-Growth over the list of terms produced by the previous step, and association rules are derived from the resulting FIs. The output of the AR step is then expanded using Ontology, and the results of the Ontology expansion are used as new queries. The dataset is a collection of learning documents. Ten queries were used for testing, and the results were measured with three metrics: recall, precision, and f-measure. Based on the testing and analysis, integrating AR and Ontology can increase the relevance of retrieved documents, with recall, precision, and f-measure values of 87.28, 79.07, and 82.85, respectively.
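The two expansion steps in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it replaces FP-Growth with brute-force counting of co-occurring term pairs, treats each retrieved document as one transaction, and models the ontology as a plain dictionary of related terms. All names, thresholds, and sample terms are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Count terms and term pairs; keep pairs whose support meets the threshold.

    Brute-force stand-in for FP-Growth, limited to itemsets of size two.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for terms in transactions:
        unique = sorted(set(terms))
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    frequent = {pair: c for pair, c in pair_counts.items() if c / n >= min_support}
    return item_counts, frequent


def association_rules(item_counts, pairs, min_confidence):
    """Derive rules lhs -> rhs with confidence = count(lhs, rhs) / count(lhs)."""
    rules = []
    for (a, b), count in pairs.items():
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, confidence))
    return rules


def expand_query(query_terms, rules, ontology):
    """Add rule consequents first, then ontology neighbours of every term so far."""
    expanded = list(query_terms)
    for lhs, rhs, _ in rules:
        if lhs in query_terms and rhs not in expanded:
            expanded.append(rhs)
    for term in list(expanded):
        for related in ontology.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


# Hypothetical preprocessed term lists from the top retrieved documents.
top_documents = [
    ["neural", "network", "training"],
    ["neural", "network"],
    ["network", "protocol"],
]
# Hypothetical ontology fragment mapping a term to related concepts.
toy_ontology = {"network": ["graph"]}

counts, pairs = frequent_pairs(top_documents, min_support=0.5)
rules = association_rules(counts, pairs, min_confidence=0.8)
new_query = expand_query(["neural"], rules, toy_ontology)
```

Here the rule `neural -> network` has confidence 1.0 and is kept, while `network -> neural` (confidence 2/3) is dropped, so the query `neural` grows to include `network` and then its ontology neighbour `graph`.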