621 research outputs found

    A Mobile-Health Information Access System

    Patients using the Mobile-Health Information System can send SMS requests to a Frequently Asked Questions (FAQ) web server and expect appropriate feedback on issues relating to their health. The accuracy of such feedback is paramount to the mobile search user. However, automating SMS-based information search and retrieval poses significant challenges because of the inherent noise in SMS communication. This paper first proposes an architecture for implementing the retrieval process and then develops an algorithm for retrieving the best-ranked question-answer pair. The algorithm assists in selecting the best FAQ-query after the query-answer pairs have been ranked, and results are generated from that ranking. Our algorithm achieves better average precision and recall than the naïve retrieval algorithm. Southern Africa Telecommunication Networks and Applications Conference (SATNAC). Department of HE and Training approved lis

    Text messaging and retrieval techniques for a mobile health information system

    Mobile phones have been identified as one of the technologies that can be used to overcome the challenges of information dissemination regarding serious diseases. Short message services, a much-used function of cell phones, can for example be turned into a major tool for accessing databases. This paper focuses on the design and development of a short message services-based information access algorithm to carefully screen information on human immunodeficiency virus/acquired immune deficiency syndrome within the context of a frequently asked questions system. However, automating short message services-based information search and retrieval poses significant challenges because of the inherent noise in its communications. The developed algorithm was used to retrieve the best-ranked question–answer pair. Results were evaluated using three metrics: average precision, recall and computational time. The retrieval efficacy was measured, and it was confirmed that the proposed algorithm gave a significant improvement in results when compared with similar retrieval algorithms.

    A semi-automated FAQ retrieval system for HIV/AIDS

    This thesis describes a semi-automated FAQ retrieval system that can be queried by users through short text messages on low-end mobile phones to provide answers on HIV/AIDS related queries. First we address the issue of result presentation on low-end mobile phones by proposing an iterative interaction retrieval strategy where the user engages with the FAQ retrieval system in the question answering process. At each iteration, the system returns only one question-answer pair to the user and the iterative process terminates after the user's information need has been satisfied. Since the proposed system is iterative, this thesis attempts to reduce the number of iterations (search length) between the users and the system so that users do not abandon the search process before their information need has been satisfied. Moreover, we conducted a user study to determine the number of iterations that users are willing to tolerate before abandoning the iterative search process. We subsequently used the bad abandonment statistics from this study to develop an evaluation measure for estimating the probability that any random user will be satisfied when using our FAQ retrieval system. In addition, we used a query log and its click-through data to address three main FAQ document collection deficiency problems in order to improve the retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system. 
Conclusions are derived concerning whether we can reduce the rate at which users abandon their search before their information need has been satisfied by using information from previous searches to: address the term mismatch problem between the users' SMS queries and the relevant FAQ documents in the collection; selectively rank the FAQ documents according to how often they have been previously identified as relevant by users for a particular query term; and identify those queries that do not have a relevant FAQ document in the collection. In particular, we proposed a novel template-based approach that uses queries from a query log for which the true relevant FAQ documents are known to enrich the FAQ documents with additional terms in order to alleviate the term mismatch problem. These terms are added as a separate field in a field-based model using two different proposed enrichment strategies, namely the Term Frequency and the Term Occurrence strategies. This thesis thoroughly investigates the effectiveness of the aforementioned FAQ document enrichment strategies using three different field-based models. Our findings suggest that we can improve the overall recall and the probability that any random user will be satisfied by enriching the FAQ documents with additional terms from queries in our query log. Moreover, our investigation suggests that it is important to use an FAQ document enrichment strategy that takes into consideration the number of times a term occurs in the query when enriching the FAQ documents. We subsequently show that our proposed enrichment approach for alleviating the term mismatch problem generalises well to other datasets. 
Through the evaluation of our proposed approach for selectively ranking the FAQ documents, we show that we can improve the retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system by incorporating the click popularity score of a query term t on an FAQ document d into the scoring and ranking process. Our results generalised well on a new dataset. However, when we deployed the click popularity score of a query term t on an FAQ document d on an enriched FAQ document collection, we saw a decrease in the retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system. Furthermore, we used our query log to build a binary classifier for detecting those queries that do not have a relevant FAQ document in the collection (Missing Content Queries (MCQs)). Before building such a classifier, we empirically evaluated several feature sets in order to determine the best combination of features for building a model that yields the best classification accuracy in identifying the MCQs and the non-MCQs. Using a different dataset, we show that we can improve the overall retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system by deploying an MCQ detection subsystem in our FAQ retrieval system to filter out the MCQs. Finally, this thesis demonstrates that correcting spelling errors can help improve the retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system. We tested our FAQ retrieval system with two different testing sets, one containing the original SMS queries and the other containing the SMS queries which were manually corrected for spelling errors. Our results show a significant improvement in the retrieval performance and the probability that any random user will be satisfied when using our FAQ retrieval system.
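
The abstract above names two enrichment strategies but does not define them. The following is a minimal sketch of one plausible reading, not the thesis's implementation. It assumes that "Term Frequency" keeps how many times each term occurs in the logged queries, while "Term Occurrence" records each distinct term once. The function name `enrichment_field` and the strategy labels are hypothetical.

```python
from collections import Counter


def enrichment_field(logged_queries, strategy):
    """Build the extra enrichment field for one FAQ document.

    logged_queries: queries from the log whose known relevant FAQ is this one.
    strategy: "term_frequency" or "term_occurrence" (hypothetical labels).
    The definitions of both strategies are assumptions; the abstract only
    names them.
    """
    counts = Counter(
        token for query in logged_queries for token in query.lower().split()
    )
    if strategy == "term_frequency":
        # Assumption: keep the number of times each term was used.
        return dict(counts)
    if strategy == "term_occurrence":
        # Assumption: record each distinct term once, ignoring repeats.
        return {term: 1 for term in counts}
    raise ValueError(f"unknown strategy: {strategy}")


queries = ["hiv test cost", "where hiv test"]
tf_field = enrichment_field(queries, "term_frequency")
to_field = enrichment_field(queries, "term_occurrence")
```

Under these assumptions, a field-based ranking model would then weight the enrichment field alongside the original question and answer fields.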

    Short message service normalization for communication with a health information system

    Philosophiae Doctor - PhD. Short Message Service (SMS) is one of the most widely used services for communication between mobile phone users. In recent times it has also been proposed as a means for information access. However, there are several challenges to be overcome in order to process an SMS, especially when it is used as a query in an information retrieval system. SMS users often deliberately use compacted and grammatically incorrect writing that makes the message difficult to process with conventional information retrieval systems. To overcome this, a pre-processing step known as normalization is required. In this thesis an investigation of SMS normalization algorithms is carried out. To this end, studies have been conducted into the design of algorithms for translating and normalizing SMS text. Character-based, unsupervised and rule-based techniques are presented. An investigation was also undertaken into the design and development of a system for information access via SMS. A specific system was designed to access information related to a Frequently Asked Questions (FAQ) database in healthcare, using a case study. This study secures SMS communication, especially for healthcare information systems. The proposed technique is to encipher the messages using the secure shell (SSH) protocol
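
As a rough illustration of what rule-based normalization can look like, the sketch below replaces SMS shorthand through a lookup table before the text is used as a query. It is not the thesis's algorithm. The lexicon `SMS_LEXICON` and the function `normalize_sms` are hypothetical, and the character-based and unsupervised techniques mentioned in the abstract are not shown.

```python
import re

# Hypothetical shorthand lexicon; a real system would need a far larger one.
SMS_LEXICON = {
    "u": "you",
    "r": "are",
    "wat": "what",
    "b4": "before",
    "pls": "please",
    "gr8": "great",
}


def normalize_sms(message, lexicon=SMS_LEXICON):
    """Lowercase the message and expand every token found in the lexicon.

    Punctuation and spacing are left untouched because only runs of
    letters, digits and apostrophes are treated as tokens.
    """
    def expand(match):
        token = match.group(0)
        return lexicon.get(token, token)

    return re.sub(r"[a-z0-9']+", expand, message.lower())


normalized = normalize_sms("Wat r the symptoms b4 treatment?")
```

A plain lookup cannot resolve ambiguous shorthand (for example "2" as "to" or "two"), which is one reason context-aware techniques are studied alongside rule-based ones.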

    Customisable chatbot as a research instrument

    Chatbots are proliferating rapidly online for a variety of different purposes. This thesis presents a customisable chatbot that was designed and developed as a research instrument for online customer interaction research. The developed chatbot facilitates creation of different bot personas, data management tools, and a fully functional online chat user interface. Customer-facing bots in the system are rule-based, with basic input processing and text response selection based on best match. The system uses its own database to store user-chatbot dialogue history. Further, bots can be assigned unique dialogue scripts and their profiles can be customised concerning name, description and profile image. In the presented validation studies, participants first completed a task by taking part in a conversation with different bots, as hosted by the system and invoked through distinct URL parameters. Second, the participants filled in a questionnaire on their experience with the bot, designed to reveal differences in how the bots were perceived. Our results suggest that the chatbot's personality impacted how customers experienced the interactions. Therefore, the developed system can facilitate research scenarios that deal with investigating participant responses to different chatbot personas. Future work is necessary for a wider range of applications and enhanced response control. Finnish abstract (translated): Customisable chatbot as a research tool. Chatbots are rapidly becoming more common on the Internet and are increasingly used for many different purposes. This master's thesis presents a customisable chatbot developed as a research tool for interaction research conducted over the network. The developed chatbot includes the creation of different bot personas, tools for data handling, and the bot's user interface itself. 
The bot personas that respond to the system's users are rule-based; their inputs are processed in a straightforward manner, and the best-matching response is selected by comparison according to a predefined script. The system uses its own database to store the user-bot conversation history. In addition, bots can be assigned a unique dialogue model, and the name, description and profile image in their profile can be customised with a URL parameter. The chatbot's technical operation was verified in a study in which participants completed a given task by following a partly prepared script with different bots. The participants then filled in a user questionnaire about their experience with the bot. The questionnaire was designed to reveal possible differences in how the bot's behaviour was perceived during the conversation. The results of the user test suggest that the chatbot's persona affected the users' experience. The developed system can therefore enable research designs that study participants' reactions to the personas of different chatbots. Future work on the developed chatbot focuses on adding more complex use cases and improving the bot's responses through more advanced natural language processing
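
The abstract describes response selection "based on best match" without naming a similarity measure. The sketch below assumes Jaccard token overlap purely for illustration. The function `select_response`, the `script` structure and the fallback text are hypothetical and are not taken from the thesis.

```python
def select_response(user_input, script, fallback="Sorry, I did not understand that."):
    """Return the scripted reply whose pattern best matches the user input.

    script: list of (pattern, reply) pairs from a bot's dialogue script.
    Similarity is Jaccard overlap of lowercase tokens (an assumption).
    """
    tokens = set(user_input.lower().split())
    best_score, best_reply = 0.0, fallback
    for pattern, reply in script:
        pattern_tokens = set(pattern.lower().split())
        union = tokens | pattern_tokens
        score = len(tokens & pattern_tokens) / len(union) if union else 0.0
        if score > best_score:
            best_score, best_reply = score, reply
    return best_reply


script = [
    ("what are your opening hours", "We are open 9 to 5."),
    ("where is my order", "Let me check your order."),
]
reply = select_response("where is my order please", script)
```

Giving each bot persona its own `script` list would mirror the per-bot dialogue scripts the abstract mentions.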

    Improving Retrieval of Information from the Internet

    To improve the quality of Internet search results, which force users to look through a huge number of links to find real answers, we used the high-quality links Google produces together with Information Retrieval technology to implement a Question Answering (QA) system. This system downloads and analyzes the text content of the relevant web pages that Google returns for the users' questions to build a dynamic knowledge collection, retrieves the relevant passages from the collection, and sends the ranked passages back. Users can further refine their questions in the query refinement step to obtain better answers. A novel search strategy was designed to detect the semantic connections between the question and the documents. Answer retrieval also uses the TF-IDF algorithm and the Vector Space Model for document indexing. We modified the original Cosine Coefficient Similarity Measurement to rank the candidate answers
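
For readers unfamiliar with the building blocks named above, here is a minimal sketch of TF-IDF weighting with standard cosine similarity in a vector space model. It shows the unmodified cosine measure, not the authors' modified one, which the abstract does not specify. The smoothed IDF formula, whitespace tokenization and all function names are assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tf_idf_vectors(passages):
    """Return one sparse TF-IDF vector (dict) per passage, plus the IDF table."""
    tokenized = [tokenize(p) for p in passages]
    n = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumption); terms in every passage still get weight 1.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Standard cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(question, passages):
    """Return (score, passage) pairs sorted from most to least similar."""
    vectors, idf = tf_idf_vectors(passages)
    # Question terms unseen in the collection get zero weight.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(question)).items()}
    scored = [(cosine(q, v), p) for v, p in zip(vectors, passages)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


passages = [
    "hiv is transmitted through blood",
    "elections are held every four years",
    "malaria is spread by mosquitoes",
]
ranking = rank_passages("how is hiv transmitted", passages)
```

In a system like the one described, `passages` would come from the downloaded web pages rather than a fixed list.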

    Election Data Visualisation

    Visualisations of election data produced by the mass media, other organisations and even individuals are becoming increasingly available across a wide variety of platforms and in many different forms. As more data become available digitally and as improvements to computer hardware and software are made, these visualisations have become more ambitious in scope and more user-friendly. Research has shown that visualising data is an extremely powerful method of communicating information to specialists and non-specialists alike. This amounts to a democratisation of access to political and electoral data. To some extent political science lags behind the progress that has been made in the field of data visualisation. Much of the academic output remains committed to the paper format and much of the data presentation is in the form of simple text and tables. In the digital and information age there is a danger that political science will fall behind. This thesis reports on a number of case studies where efforts were made to visualise election data in order to clarify its structure and to present its meaning. The first case study demonstrates the value of data visualisation to the research process itself, facilitating the understanding of effects produced by different ways of estimating missing data. A second study sought to use visualisation to explain complex aspects of voting systems to the wider public. Three further case studies demonstrate the value of collaboration between political scientists and others possessing a range of skills embracing data management, software engineering, broadcasting and graphic design. These studies also demonstrate some of the problems that are encountered when trying to distil complex data into a form that can be easily viewed and interpreted by non-expert users. More importantly, these studies suggest that when the skills balance is correct then visualisation is both viable and necessary for communicating information on elections