6 research outputs found

    Information extraction and management from Arabic documents (Extraction et gestion de l'information à partir des documents arabes)

    This thesis addresses information extraction and management for Arabic, an oriental Semitic language. Arabic differs from Western languages, especially in its morphology and its spelling variations, and the performance of information extraction systems for Arabic remains problematic. We therefore studied the performance of the best-known search engines between 2006 and 2010 on a corpus of one thousand Arabic documents, and found that these engines do not take morphological analysis into account. Morphological analysis of an Arabic word consists of identifying its morphemes, its affixes, its pattern, and its root. We propose a comparative study of methods for extracting morphological features from an Arabic word. The study was carried out on the iSPEDAL corpus using the Eval system, which was also proposed as part of this thesis. iSPEDAL is a structured, extensible dictionary of the Arabic language that is easy to exploit with an appropriate query language. It is enriched automatically from conventional dictionaries or from arbitrary corpora. The Eval system makes it possible to implement the methods for extracting the morphological features of an Arabic word in a single environment while respecting the specifics of each method. The study identified a group of methods that perform well in this domain. Integrating this group into search engines can improve the performance of information extraction in Arabic.
    This thesis was carried out within the Franco-Lebanese scientific research cooperation program CEDRE, under the RIMA project (Recherche intelligente d'information multimédia multilingue arabe).
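
As a rough illustration of what identifying the affixes of an Arabic word involves, the sketch below strips one prefix and one suffix from a word. This is plain light affix stripping, a much simpler technique than the root and pattern extraction methods the thesis compares, and it is not taken from the thesis. The names `PREFIXES`, `SUFFIXES`, and `strip_affixes`, the tiny affix lists, and the minimum stem length are all assumptions made for the example.

```python
# Hypothetical, deliberately tiny affix lists; a real analyzer would use
# much larger inventories and handle diacritics and spelling variants.
PREFIXES = ("وال", "بال", "ال", "و")   # longer prefixes are tried first
SUFFIXES = ("ات", "ون", "ين", "ة")


def strip_affixes(word, min_stem=3):
    """Return (prefix, stem, suffix) after removing at most one of each.

    An affix is removed only if at least `min_stem` letters remain,
    which is an assumed safeguard against over-stripping short words.
    """
    prefix = suffix = ""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= min_stem:
            prefix, word = p, word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= min_stem:
            suffix, word = s, word[:-len(s)]
            break
    return prefix, word, suffix
```

Calling `strip_affixes("المعلمون")` separates the definite article and the plural ending from the stem. Recovering the root and the pattern from that stem is the harder step, and it is the one the compared methods address.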

    Determinant characteristics in EEG signal based on bursts amplitude segmentation for predicting pathological outcomes of a premature newborn

    Date Added to IEEE Xplore: 01 December 2017. Electronic ISBN: 978-1-5090-6011-5. Print on Demand (PoD) ISBN: 978-1-5090-6012-2. INSPEC Accession Number: 17379658.
    EEG signals contain specific patterns that predict neurodevelopmental impairments in premature newborns. Extracting these patterns from a set of EEG records provides a dataset for machine learning, with the aim of implementing an intelligent classification system that predicts the baby's prognosis. In previous work we proved that inter-burst intervals (IBI) found in EEG records predict abnormal outcomes in premature newborns. A bibliographic study of EEG signal amplitude, together with the annotations of neuro-pediatricians, showed that low amplitudes in the EEG signal are strongly correlated with an abnormal prognosis, as is the case for IBI. Based on these hypotheses, this paper presents a methodology that segments the burst intervals of the EEG signal by amplitude into 3 segments (low, medium, and high), in addition to the inter-burst intervals. We create a new algorithm that detects 6 important parameters in each interval of these 4 segments. Applying this methodology yields a new classified dataset containing 24 parameters extracted from these 4 segments. Together with the gestational age of the preterm newborn and the day of recording, this gives 26 input attributes and one output, the class (normal, sick, or risky).

    Determinant characteristics in EEG signal based on bursts amplitude segmentation for predicting pathological outcomes of a premature newborn, with validation using ANN

    EEG signals contain specific patterns that predict neurodevelopmental impairments in premature newborns. Extracting these patterns from a set of EEG records provides a dataset for machine learning, with the aim of implementing an intelligent classification system that predicts the baby's prognosis. In previous work we proved that inter-burst intervals (IBI) found in EEG records predict abnormal outcomes in premature newborns. A bibliographic study of EEG signal amplitude, together with the annotations of neuro-pediatricians, showed that low amplitudes in the EEG signal are strongly correlated with an abnormal prognosis, as is the case for IBI. Based on these hypotheses, this paper presents a methodology that segments the burst intervals of the EEG signal by amplitude into 3 segments (low, medium, and high), in addition to the inter-burst intervals. We create a new algorithm that detects 6 important parameters in each interval of these 4 segments. Applying this methodology yields a new classified dataset containing 24 parameters extracted from these 4 segments. Together with the gestational age of the preterm newborn and the day of recording, this gives 26 input attributes and one output, the class (normal, sick, or risky). Finally, we validate the relevance of these attributes using an artificial neural network.
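
The attribute count in the abstract (4 segments × 6 parameters + gestational age + recording day = 26 inputs) can be made concrete with a short sketch. It is not the authors' algorithm. The amplitude thresholds, the six parameter names, the interval dictionary layout, and the function names are hypothetical, since the abstract gives neither threshold values nor the parameter definitions.

```python
from statistics import mean

# Hypothetical amplitude thresholds in microvolts (the abstract gives none).
LOW_MAX_UV = 30.0
HIGH_MIN_UV = 100.0

SEGMENTS = ("ibi", "low", "medium", "high")


def segment_of(interval):
    """Assign an interval to one of the 4 segments.

    `interval` is an assumed dict with keys "is_burst", "amplitude"
    and "duration"; non-burst intervals are inter-burst intervals.
    """
    if not interval["is_burst"]:
        return "ibi"
    if interval["amplitude"] < LOW_MAX_UV:
        return "low"
    if interval["amplitude"] < HIGH_MIN_UV:
        return "medium"
    return "high"


def segment_params(intervals):
    """Six illustrative parameters for one segment (stand-ins only)."""
    if not intervals:
        return [0.0] * 6
    durations = [i["duration"] for i in intervals]
    amplitudes = [i["amplitude"] for i in intervals]
    return [
        float(len(intervals)),   # number of intervals
        sum(durations),          # total duration
        mean(durations),         # mean duration
        max(durations),          # longest interval
        min(durations),          # shortest interval
        mean(amplitudes),        # mean amplitude
    ]


def build_attributes(intervals, gestational_age_weeks, recording_day):
    """Return the 26 input attributes: 4 x 6 parameters plus 2 clinical values."""
    grouped = {name: [] for name in SEGMENTS}
    for interval in intervals:
        grouped[segment_of(interval)].append(interval)
    features = []
    for name in SEGMENTS:
        features.extend(segment_params(grouped[name]))
    features.extend([float(gestational_age_weeks), float(recording_day)])
    return features
```

Each record's vector would then be paired with its class label (normal, sick, or risky) to form one row of the training set for a classifier such as the neural network mentioned above.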