16 research outputs found

    Semi-Automatic Method to Assist Expert for Association Rules Validation

    Abstract: In order to help the expert validate association rules extracted from data, several quality measures have been proposed in the literature. We distinguish two categories: objective and subjective measures. The former depend on a fixed threshold and on the quality of the data from which the rules are extracted. The latter consist in providing the expert with tools to explore and visualize the rules during the evaluation step. However, the number of extracted rules to validate remains high, so mining the rules manually is a very hard task. To solve this problem, we propose in this paper a semi-automatic method to assist the expert during association rule validation. Our method uses rule-based classification as follows: (i) we transform association rules into classification rules (classifiers); (ii) we use the generated classifiers for data classification; (iii) we visualize the association rules together with their classification quality to give the expert an overview and assist him during the validation process.
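    Steps (i)-(iii) of the abstract can be sketched as follows. This is a minimal illustrative assumption about the method, not the paper's implementation: the rule format, the toy transactions, and the accuracy score used as the "classification quality" are all hypothetical.

    ```python
    def rule_to_classifier(antecedent, consequent):
        """Step (i): an association rule A -> B becomes a classifier that
        predicts the consequent whenever the antecedent items are present."""
        def classify(transaction):
            return consequent if antecedent <= transaction else None
        return classify

    def accuracy(classifier, labeled_transactions):
        """Step (ii): apply the classifier to labeled data and score it on
        the transactions it fires on (a proxy for the rule's quality)."""
        hits = total = 0
        for items, label in labeled_transactions:
            pred = classifier(items)
            if pred is not None:
                total += 1
                hits += (pred == label)
        return hits / total if total else 0.0

    # Toy labeled transactions: (items bought, true class) -- illustrative only.
    data = [({"milk", "bread"}, "butter"),
            ({"milk", "bread"}, "jam"),
            ({"milk"}, "butter")]

    clf = rule_to_classifier({"milk", "bread"}, "butter")
    print(accuracy(clf, data))  # 0.5 -- step (iii) would display this score to the expert
    ```

    A score like this, attached to each rule, is the kind of signal step (iii) could visualize so the expert validates high-quality rules first.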

    Trends and challenges of Arabic Chatbots: Literature review

    No full text
    A conversational system is a natural language processing application that has recently attracted increasing attention with the advancements in Large Language Models (LLMs) and Language Models for Dialogue Applications (LaMDA). However, Conversational Artificial Intelligence (AI) research has mainly been carried out in English. Despite the growing popularity of Arabic as one of the most widely used languages on the Internet, only a few studies have concentrated on Arabic conversational dialogue systems thus far. In this study, we conduct a comprehensive qualitative analysis of the key research works in this domain, examining the limitations and strengths of existing approaches. We start with chatbot history and classification. Then, we examine the approaches used for Arabic chatbots: rule-based/retrieval-based and deep-learning-based. In particular, we survey the evolution of generative conversational AI alongside the evolution of deep-learning techniques. Next, we look at the different metrics used to assess conversational systems. Finally, we outline the language challenges of building generative Arabic conversational AI. [JJCIT 2023; 9(3): 261-286]

    Extraction of Concepts and Relations Between Concepts from Multilingual Documents (a Statistical and Ontological Approach)

    No full text
    The research work of this thesis addresses the problem of document indexing for retrieval, and more specifically the extraction of semantic descriptors for document indexing. The goal of Information Retrieval (IR) is to provide a set of models and systems for selecting the documents that satisfy a user's information need, expressed as a query. An Information Retrieval System (IRS) is composed mainly of two processes: a representation process and a retrieval process. The representation process, called indexing, represents the documents and the query by descriptors, or indexes, which reflect the content of the documents as faithfully as possible. The retrieval process compares the document representations with the query representation. In classical IRSs, the descriptors used are words (simple or compound): these systems treat a document as a set of words, often called a "bag of words". Words are considered as character strings without semantics, and the only information exploited about them is their frequency of occurrence in the documents. Such systems do not take the semantic relationships between words into account. For example, it is impossible to find documents represented by a word M1 that is a synonym of a word M2 when the query is represented by M2. Likewise, in a classical IRS a document indexed by the term "bus" will never be retrieved by a query indexed by the term "taxi", even though the two terms relate to the same topic, "means of transport". To address these limitations, several studies have taken the semantics of indexing terms into account. This type of indexing is called semantic or conceptual indexing. These works use the notion of concept in place of the notion of word. In this work, the terms denoting concepts are extracted from the documents using statistical techniques; these terms are then projected onto semantic resources such as ontologies and thesauri to extract the corresponding concepts.
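    The bus/taxi limitation described above can be sketched in a few lines. The concept table below is an illustrative stand-in for an ontology or thesaurus lookup, not the resource the thesis actually uses:

    ```python
    # Hypothetical concept table standing in for an ontology/thesaurus lookup.
    CONCEPTS = {"bus": "means_of_transport",
                "taxi": "means_of_transport",
                "apple": "fruit"}

    def index(terms):
        """Conceptual indexing: represent a document or query by the concepts
        its terms denote, falling back to the raw word when unknown."""
        return {CONCEPTS.get(t, t) for t in terms}

    def matches(doc_terms, query_terms):
        """A document matches a query if they share at least one concept."""
        return bool(index(doc_terms) & index(query_terms))

    print(matches({"bus"}, {"taxi"}))  # True: both map to "means_of_transport"
    print({"bus"} & {"taxi"})          # set(): the classic bag-of-words misses it
    ```

    The raw word-set intersection is empty, so a classical IRS returns nothing, while the concept-level comparison retrieves the document.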

    QoS-Aware Scheduling of Workflows in Cloud Computing Environments

    No full text

    Cloud Services Orchestration: A Comparative Study of Existing Approaches

    No full text

    A Two Level Architecture for Data Warehousing and OLAP Over Big Data

    No full text
    Over the last two decades, the architecture of traditional Data Warehouses (DW) has played a key role in decision support systems. A typical traditional DW architecture involves source systems, components implementing the data collection process, stores (a central repository and data marts), and consumers: reporting and analytic applications. However, the emergence of new Internet services, mobile applications, web applications, social media (Facebook, Twitter, Instagram, and so on), devices, and sensors calls for new data warehousing systems and suitable architectures, able to handle the large volume and varied formats of collected datasets, data source variety, streaming, the integration of unstructured data, and powerful analytical processing. In this paper, we propose a two-level architecture for data warehousing and OLAP over Big Data. The first level, the Platform Independent Architecture (PIA), identifies and specifies the main components of the architecture to collect, store, transform, and process the different kinds of data. This level is technology independent and focuses only on the requirements of the data features and the processes needed to design data warehousing and OLAP over Big Data. The second level, the Platform Specific Architecture (PSA), specifies the platforms and technologies used to carry out the different steps, from data collection to the reporting and analytic applications.
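    The PIA/PSA relationship can be sketched as a binding step: the platform-independent level names abstract components, and a platform-specific level assigns a concrete technology to each one. The component names and technology choices below are illustrative assumptions, not the bindings proposed in the paper:

    ```python
    # Hypothetical platform-independent components (PIA level).
    PIA = ["collection", "storage", "transformation", "processing", "analytics"]

    def bind(pia_components, technology_map):
        """Derive a PSA by assigning a concrete technology to every
        platform-independent component; unbound components are reported."""
        psa = {c: technology_map[c] for c in pia_components if c in technology_map}
        missing = [c for c in pia_components if c not in technology_map]
        return psa, missing

    # One possible (assumed) PSA binding for the PIA above.
    psa, missing = bind(PIA, {"collection": "Kafka",
                              "storage": "HDFS",
                              "transformation": "Spark",
                              "processing": "Spark SQL",
                              "analytics": "OLAP client"})
    print(missing)  # [] -- every PIA component has a platform-specific binding
    ```

    Keeping the PIA as plain component names and the PSA as a separate mapping mirrors the paper's point: the same technology-independent design can be re-bound to different platforms without changing the first level.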