657 research outputs found

    Word order variation and string similarity algorithm to reduce pattern scripting in pattern matching conversational agents

    This paper presents a novel sentence similarity algorithm designed to tackle the issue of free word order in the Urdu language. Free word order poses many challenges when a language is implemented in a conversational agent, primarily because it increases the time needed to script the domain knowledge. In a language with free word order, such as Urdu, a single phrase or utterance can be expressed in many different ways using the same words and still be grammatically correct. This led to research into a novel string similarity algorithm, which was used in the development of an Urdu conversational agent. The algorithm was evaluated through black-box testing, which involved processing different variations of scripted patterns through the system to gauge the performance and accuracy of the algorithm in recognizing word order variations of the related scripted patterns. Initial testing has highlighted that the algorithm is able to recognize legal word order variations and significantly reduce the scripting required for a conversational agent’s knowledge base, saving considerable time and effort.
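
As a rough illustration of why an order-insensitive comparison reduces scripting, the sketch below scores two utterances by the words they share, ignoring position. It is not the algorithm from the paper, whose details the abstract does not give. The function name `bag_of_words_similarity`, the choice of the Dice coefficient, and the English placeholder utterances are all assumptions made for the example.

```python
from collections import Counter


def bag_of_words_similarity(a: str, b: str) -> float:
    """Dice coefficient over word multisets (hypothetical helper).

    Word order is ignored, so any reordering of the same words scores 1.0.
    """
    words_a = Counter(a.lower().split())
    words_b = Counter(b.lower().split())
    total = sum(words_a.values()) + sum(words_b.values())
    if total == 0:
        return 1.0  # two empty utterances are treated as identical
    # Counter & Counter keeps the minimum count of each shared word.
    shared = sum((words_a & words_b).values())
    return 2 * shared / total


scripted = "my id card is lost"  # placeholder pattern, not from the paper
print(bag_of_words_similarity(scripted, "lost is my id card"))             # 1.0
print(round(bag_of_words_similarity(scripted, "my passport is lost"), 2))  # 0.67
```

With a measure of this kind, one scripted pattern can stand in for every legal reordering of its words, rather than each ordering needing its own pattern.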

    Methodology and algorithms for Urdu language processing in a conversational agent

    This thesis presents the research and development of a novel text-based goal-orientated conversational agent (CA) for the Urdu language called UMAIR (Urdu Machine for Artificially Intelligent Recourse). A CA is a computer program that emulates a human in order to facilitate a conversation with the user. The aim is to investigate the Urdu language and its lexical and grammatical features in order to design a novel engine that handles the language-unique features of Urdu. The weakness of current CA engines is that they are not suited to implementation in languages whose grammar rules and structure are totally different from those of English. Historically, CAs, including the design of scripting engines, scripting methodologies, resources and implementation procedures, have been implemented for the most part in English and other Western languages (e.g. German and Spanish). The development of an Urdu conversational agent has required the research and development of a new CA framework which incorporates methodologies and components to overcome the language-unique features of Urdu, such as free word order, inconsistent use of space, diacritical marks and spelling. The new CA framework was utilised to implement UMAIR. UMAIR is a customer service agent for the National Database and Registration Authority (NADRA), designed to answer user queries related to ID card and passport applications. UMAIR answers user queries related to the domain through discourse with the user, leading the conversation with questions and offering appropriate advice with the intention of steering the discourse to a pre-determined goal. The research and development of UMAIR led to the creation of several novel CA components, namely a new rule-based Urdu CA engine which combines pattern matching and sentence/string similarity techniques, along with new algorithms to process user utterances.
Furthermore, a CA evaluation framework has been researched and tested, which addresses a gap in research on the evaluation of natural language systems in general. Empirical end user evaluation has validated the new algorithms and components implemented in UMAIR. The results show that UMAIR is effective as an Urdu CA, with the majority of conversations leading to the goal of the conversation. Moreover, the results also revealed that the components of the framework work well to mitigate the challenges of free word order and inconsistent word segmentation.

    Development of UMAIR the Urdu Conversational Agent for Customer Service

    This paper outlines the development of UMAIR, an Urdu conversational agent developed as a customer service representative. UMAIR’s architecture includes a novel engine, a scripting language and the WOW (Word Order Wizard) string similarity algorithm, which are combined to tackle the language-unique challenges of Urdu. Initial testing of the new architecture has yielded positive results, indicating that UMAIR is able to cope with the inherent differences of the Urdu language, such as word order.

    LANA-I: An Arabic Conversational Intelligent Tutoring System for Children with ASD

    © 2019, Springer Nature Switzerland AG. Children with Autism Spectrum Disorder (ASD) share certain difficulties, but being autistic affects them in different ways in terms of their level of intellectual ability. Children with high functioning autism or Asperger syndrome are very intelligent academically but still have difficulties with social and communication skills. Many of these children are taught within mainstream schools, but there is a shortage of specialised teachers to deal with their specific needs. One solution is to use a virtual tutor to supplement the education of children with ASD in mainstream schools. This paper describes research to develop a novel Arabic Conversational Intelligent Tutoring System, called LANA-I, for children with ASD that adapts to the Visual, Auditory and Kinaesthetic (VAK) learning styles model to enhance learning. This paper also proposes an evaluation methodology and describes an experimental evaluation of LANA-I. The evaluation was conducted with neurotypical children and indicated promising results, with a statistically significant difference between users’ scores with and without adapting to learning style. Moreover, the results show that LANA-I is effective as an Arabic Conversational Agent (CA), with the majority of conversations leading to the goal of completing the tutorial and the majority of the correct responses (89%).

    Development of an Arabic conversational intelligent tutoring system for education of children with autism spectrum disorder

    Children with Autism Spectrum Disorder (ASD) are affected to different degrees in terms of their level of intellectual ability. Some people with Asperger syndrome or high functioning autism are very intelligent academically but still have difficulties with social and communication skills. In recent years, many of these pupils have been taught within mainstream schools. However, the process of facilitating their learning and participation remains a complex and poorly understood area of education. Although many teachers in mainstream schools are firmly committed to the principles of inclusive education, they do not feel that they have the necessary training and support to provide adequately for pupils with ASD. One solution to this problem is to use a virtual tutor to supplement the education of pupils with ASD in mainstream schools. This thesis describes research to develop a novel Arabic Conversational Intelligent Tutoring System (CITS), called LANA, for children with ASD, which delivers topics related to the science subject by engaging with the user in the Arabic language. The Visual, Auditory, and Kinaesthetic (VAK) learning style model is used in LANA to adapt to the children’s learning style by personalising the tutoring session. Developing an Arabic Conversational Agent presents many challenges. Part of the challenge in building such a system is the requirement to deal with the grammatical features and the morphological nature of the Arabic language. The proposed novel architecture for LANA uses both pattern matching (PM) and a new Arabic short text similarity (STS) measure to extract facts from users’ responses and match them to rules in a scripted conversation in a particular domain (science). In this research, two prototypes of an Arabic CITS were developed: LANA-I and LANA-II. LANA-I was developed and evaluated with 24 neurotypical children to evaluate the effectiveness and robustness of the system engine.
LANA-II was developed to enhance LANA-I by addressing spelling mistakes and word variations with prefixes and suffixes. In LANA-II, the TEACCH method was also added to the user interface to adapt the tutorial environment to autistic students’ learning, and the knowledge base was expanded by adding a new tutorial. An evaluation methodology and experiment were designed to evaluate the enhanced components of the LANA-II architecture. The results illustrated a statistically significant impact on the effectiveness of the LANA-II engine when compared to LANA-I. In addition, the results indicated a statistically significant improvement in the autistic students’ learning gain when the system adapted to their learning styles, indicating that LANA-II can be adapted to autistic children’s learning styles and enhance their learning.

    Interpreting Human Responses in Dialogue Systems using Fuzzy Semantic Similarity Measures

    Dialogue systems are automated systems that interact with humans using natural language. Much work has been done on dialogue management and learning using a range of computational intelligence based approaches; however, the complexity of human dialogue in different contexts still presents many challenges. The key impact of the work presented in this paper is to use fuzzy semantic similarity measures embedded within a dialogue system to allow a machine to semantically comprehend human utterances in a given context and thus communicate more effectively with a human in a specific domain using natural language. To achieve this, perception-based words should be understood by a machine in the context of the dialogue. In this work, a simple question and answer dialogue system is implemented for a café customer satisfaction feedback survey. Both fuzzy and crisp semantic similarity measures are used within the dialogue engine to assess the accuracy and robustness of rule firing. Results from a 32-participant study show that the fuzzy measure improves rule matching within the dialogue system by 21.88% compared with the crisp measure known as STASIS, thus providing a more natural and fluid dialogue exchange.

    Arabic goal-oriented conversational agents using semantic similarity techniques

    Conversational agents (CAs) are computer programs used to interact with humans in conversation. Goal-oriented conversational agents (GO-CAs) are programs that interact with humans to serve a specific domain of interest; their importance has increased recently, and they now cover fields of technology, science and marketing. Several types of CA are used in industry: some are simple with limited usage, others are sophisticated. Most CAs were built to serve English speakers, and only a few were built for the Arabic language, owing to the complexity of Arabic and the lack of researchers in both linguistics and computing. This thesis covers two types of GO-CA. The first is the traditional pattern matching goal-oriented CA (PMGO-CA), and the other is the semantic goal-oriented CA (SGO-CA). PMGO-CA techniques are widely used in industry due to their flexibility and high performance. However, they are labour intensive, difficult to maintain or update, and need continuous housekeeping to manage users’ utterances (especially when instructions or knowledge change). In addition, they lack any machine intelligence. SGO-CA techniques utilise human-constructed knowledge bases such as WordNet to measure word and sentence similarity. Such measurement has been researched extensively for English, and very little for Arabic. In this thesis, the researcher developed a new methodology for Arabic conversational agents (using both pattern matching and semantic CAs), covering scripting, knowledge engineering, architecture, implementation and evaluation. New tools to measure word and sentence similarity were also constructed. To test the performance of these CAs, a domain representing the Iraqi passport services was built. Both CAs were evaluated and tested by domain experts using special evaluation metrics.
The evaluation showed very promising results and indicated the viability of the system for real-life use.

    Aneesah: a novel methodology and algorithms for sustained dialogues and query refinement in natural language interfaces to databases

    This thesis presents the research undertaken to develop a novel approach towards the development of a text-based Conversational Natural Language Interface to Databases, known as ANEESAH. Natural Language Interfaces to Databases (NLIDBs) are computer applications that use natural language to remove the need for an end user to commission a skilled programmer to query a database. The aim of the proposed research is to investigate the use of an NLIDB capable of conversing with users to automate the query formulation process for database information retrieval. Historical challenges and limitations have prevented the wider use of NLIDB applications in real-life environments. The challenges relevant to the scope of the proposed research include the absence of flexible conversation between NLIDB applications and users, automated database query building from multiple dialogues, and the flexibility to sustain dialogues for information refinement. The areas of research explored include NLIDBs, conversational agents (CAs), natural language processing (NLP) techniques, artificial intelligence (AI), knowledge engineering, and relational databases. Current NLIDBs do not have the conversational abilities to sustain dialogues, especially with regard to the information required for dynamic query formulation. A novel approach, ANEESAH, is introduced to deal with these challenges. ANEESAH was developed to allow users to communicate using natural language to retrieve information from a relational database. ANEESAH can interact with users conversationally and sustain dialogues to automate the query formulation and information refinement process.
The research and development of ANEESAH led to the engineering of several novel NLIDB components, such as a CA-implemented NLIDB framework, a rule-based CA that combines pattern matching and sentence similarity techniques, and algorithms to engage users in conversation and support sustained dialogues for information refinement. Additional components of the proposed framework include a novel SQL query engine for the dynamic formulation of queries to extract database information and perform querying the query operations to support the information refinement. Furthermore, a generic evaluation methodology combining subjective and objective measures was introduced to evaluate the implemented conversational NLIDB framework. Empirical end user evaluation was also used to validate the components of the implemented framework. The evaluation results demonstrated that ANEESAH produced the desired database information for users over a set of test scenarios. The evaluation results also revealed that the proposed framework components can overcome the challenges of sustaining dialogues, information refinement and querying the query operations.

    Arabic conversational agent for modern Islamic education

    This thesis presents research that combines the benefits of intelligent tutoring systems (ITS), Arabic conversational agents (CA) and learning theories by constructing a novel Arabic conversational intelligent tutoring system (CITS) called Abdullah. Abdullah CITS is a software program intended to deliver a tutorial to students aged between 10 and 12 years old that covers the essential topics in Islam using natural language. The CITS aims to mimic a human Arabic tutor by engaging the students in dialogue using Modern Standard Arabic (MSA), whilst also allowing conversation and discussion in Classical Arabic language (CAL). Developing a CITS for the Arabic language faces many challenges due to the complexity of the morphological system, non-standardization of the written text, ambiguity, and lack of resources. However, the main challenge for the developed Arabic CITS is how user utterances are recognized and responded to by the CA, as well as how the domain is scripted and maintained. This research presents a novel Arabic CA and an accompanying scripting language that use a form of pattern matching to handle users’ conversations when the user converses in MSA. A short text similarity measure is used within Abdullah CITS to extract responses from CAL resources such as the Quran, Hadith, and Tafsir if there are no matching patterns in the Arabic conversational agent’s scripts. Abdullah CITS is able to capture the user’s level of knowledge and adapt the tutoring session and tutoring style to suit that particular learner’s level of knowledge. This is achieved through the inclusion of several learning theories and methods, such as Gagne’s learning theory, Piaget’s learning theory, and the storytelling method. These learning theories and methods, implemented within Abdullah CITS’s architecture, are applied to personalise a tutorial to an individual learner.
This research presents the first Arabic CITS which utilises established learning theories typically employed in a classroom environment. The system was evaluated through end user testing with the target age group in schools in both Jordan and the UK. Empirical experimentation has produced some positive results, indicating that Abdullah CITS gauges the individual learner’s knowledge level and adapts the tutoring session to ensure learning gain is achieved.
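
The control flow described in this abstract (try the scripted patterns first, then fall back to a short text similarity search over resource passages when nothing matches) can be sketched as below. This is a minimal sketch under stated assumptions, not Abdullah’s engine: the wildcard syntax of `fnmatch`, the character-based `difflib` ratio standing in for the short text similarity measure, the 0.6 threshold, and the placeholder rules and passages are all hypothetical.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase
from typing import Optional


def respond(utterance: str,
            rules: list[tuple[str, str]],
            resources: list[str],
            threshold: float = 0.6) -> Optional[str]:
    """Return a scripted response, else the most similar resource passage."""
    text = utterance.lower().strip()

    # Step 1: scripted patterns, using shell-style wildcards as a stand-in.
    for pattern, response in rules:
        if fnmatchcase(text, pattern):
            return response

    # Step 2: fallback, pick the passage with the highest similarity score.
    best_passage, best_score = None, 0.0
    for passage in resources:
        score = SequenceMatcher(None, text, passage.lower()).ratio()
        if score > best_score:
            best_passage, best_score = passage, score

    # Reject weak matches so the agent can ask the user to rephrase.
    return best_passage if best_score >= threshold else None


# Placeholder script and resource passages, invented for the example.
rules = [("*hello*", "scripted greeting")]
resources = ["a passage about honesty", "a passage about patience"]

print(respond("Hello there", rules, resources))             # scripted greeting
print(respond("passage about patience", rules, resources))  # a passage about patience
print(respond("zzz", rules, resources))                     # None
```

Keeping the similarity search as a fallback means the hand-written scripts stay in control of the tutorial wherever they apply, while unscripted utterances can still be answered from the resources.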

    Fuzzy natural language similarity measures through computing with words

    A vibrant area of research is the understanding of human language by machines to engage in conversation with humans to achieve set goals. Human language is fuzzy by nature, with words meaning different things to different people, depending on the context. Fuzzy words are words with a subjective meaning, typically used in everyday human natural language dialogue; they are often ambiguous and vague in meaning and dependent on an individual’s perception. Fuzzy Sentence Similarity Measures (FSSM) are algorithms that can compare two or more short texts which contain fuzzy words and return a numeric measure of similarity of meaning between them. The motivation for this research is to create a new FSSM called FUSE (FUzzy Similarity mEasure). FUSE is an ontology-based similarity measure that uses Interval Type-2 Fuzzy Sets to model relationships between categories of human perception-based words. Four versions of FUSE (FUSE_1.0 – FUSE_4.0) have been developed, investigating the presence of linguistic hedges, the expansion of fuzzy categories and their use in natural language, the incorporation of logical operators such as ‘not’ and the introduction of the fuzzy influence factor. FUSE has been compared to several state-of-the-art, traditional semantic similarity measures (SSMs) which do not consider the presence of fuzzy words. FUSE has also been compared to the only published FSSM, FAST (Fuzzy Algorithm for Similarity Testing), which has a limited dictionary of fuzzy words and uses Type-1 Fuzzy Sets to model relationships between categories of human perception-based words. Results have shown that FUSE improves on the limitations of traditional SSMs and the FAST algorithm by achieving a higher correlation with the average human rating (AHR) on several published and gold-standard datasets.
To validate FUSE, in the context of a real-world application, versions of the algorithm were incorporated into a simple Question & Answer (Q&A) dialogue system (DS), referred to as FUSION, to evaluate the improvement of natural language understanding. FUSION was tested on two different scenarios using human participants and results compared to a traditional SSM known as STASIS. Results of the DS experiments showed a True rating of 88.65% compared to STASIS with an average True rating of 61.36%. Results showed that the FUSE algorithm can be used within real world applications and evaluation of the DS showed an improvement of natural language understanding, allowing semantic similarity to be calculated more accurately from natural user responses. The key contributions of this work can be summarised as follows: The development of a new methodology to model fuzzy words using Interval Type-2 fuzzy sets; leading to the creation of a fuzzy dictionary for nine fuzzy categories, a useful resource which can be used by other researchers in the field of natural language processing and Computing with Words with other fuzzy applications such as semantic clustering. The development of a FSSM known as FUSE, which was expanded over four versions, investigating the incorporation of linguistic hedges, the expansion of fuzzy categories and their use in natural language, inclusion of logical operators such as ‘not’ and the introduction of the fuzzy influence factor. Integration of the FUSE algorithm into a simple Q&A DS referred to as FUSION demonstrated that FSSM can be used in a real-world practical implementation, therefore making FUSE and its fuzzy dictionary generalisable to other applications
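
Comparing a similarity measure against the average human rating (AHR) amounts to correlating two lists of scores over the same sentence pairs. The sketch below uses the Pearson coefficient, which is an assumption, since the abstract does not name the coefficient used. The function name and the score lists are hypothetical illustrations, not data from the thesis.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for four sentence pairs (illustrative only).
human_ratings = [0.10, 0.40, 0.50, 0.90]   # average human rating per pair
measure_scores = [0.20, 0.35, 0.60, 0.80]  # output of some similarity measure
print(pearson(human_ratings, measure_scores))
```

A measure whose scores rise and fall with the human ratings gives a coefficient near 1, which is the sense in which one measure can be said to correlate more highly with the AHR than another.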