9 research outputs found

    A novel hybrid algorithm for morphological analysis: artificial Neural-Net-XMOR

    In this study, we present a novel algorithm that combines a rule-based approach and an artificial neural network-based approach to morphological analysis. The use of hybrid models that include both techniques is evaluated for performance improvements. The proposed hybrid algorithm is based on the dynamic generation of an artificial neural network according to two-level phonological rules. A combination of linguistic parsing, a neural network-based error correction model, and statistical filtering is used to increase the coverage of pure morphological analysis. We experimented with a hybrid algorithm that applies rule-based and long short-term memory-based (LSTM-based) techniques, and the results show improved morphological analysis performance on optical character recognition (OCR) and social media data. With the new hybrid algorithm using LSTM, accuracy reached 99.91% on the OCR dataset and 99.82% on the social media data. © TÜBİTAK

    Undergraduates’ interest towards learning genetics concepts through integrated STEM problem-based learning approach

    A scientific and innovative society can be fostered by prioritizing Science, Technology, Engineering, and Mathematics (STEM), as emphasized by the Malaysian Higher Education Blueprint (2015-2025). STEM needs to be implemented in higher education because universities must produce competent graduates to support economic growth and sustainable development. Incorporating problem-based instruction into STEM, through teamwork and problem-solving techniques, might make first-year undergraduates more enthusiastic and engage them fully with their learning. This study investigated whether an Integrated STEM Problem Based Learning module could enhance and retain interest in genetics concepts among first-year undergraduates. Topics in genetics are considered difficult both to teach and to learn. To overcome these difficulties, genetics-related topics were chosen to introduce STEM through a problem-based learning approach, which might help first-year undergraduates acquire deep genetics content knowledge. This is vital for first-year undergraduates, as the knowledge gained in their first semester will be applied in later courses throughout their undergraduate programs of study. A pre-experimental research design with a one-group posttest design was applied. A total of 50 first-year undergraduates from the Faculty of Biology at one of the public universities in Malaysia participated. The Genetics Interest Questionnaire was used to study whether the STEM Problem Based Learning module could enhance and retain interest in genetics concepts. The study found that the Integrated STEM problem-based learning approach could enhance and retain interest in learning genetics concepts among first-year undergraduates.

    Paris dans les récits de voyage d’écrivains arabes : repérage, analyse sémantique et cartographie de toponymes

    We present in this paper an automated method to map out positive or negative semantic modalities associated with place names in Arabic travelogue literature. This research sits at the crossroads of Natural Language Processing, Literary Studies, and Digital Humanities. Our pipeline identifies place named entities, analyzes their semantic context (with regard to opinions, sentiments, and emotions), and locates the place names on geographic maps. Our corpus includes six travel writings on Paris from some of the most influential Arab writers of the 19th and 20th centuries. We evaluate rule-based and machine-learning approaches for their efficacy in named entity recognition and semantic analysis. The results of our automated analysis confirm, to a great extent, the judgements and interpretations of traditional critical scholarship on these Arabic literary texts. They also confirm the value of this automated method for literary research, as it supports semantic study on a large scale.

    A Named Entity Recognition System Applied to Arabic Text in the Medical Domain

    Currently, 30-35% of the global population uses the Internet. Furthermore, the number of non-English-language Internet users is increasing rapidly, accompanied by a growing amount of unstructured text online. One area replete with underexploited online text is the Arabic medical domain, and one method that can be used to extract valuable data from Arabic medical texts is Named Entity Recognition (NER). NER is the process by which a system automatically detects and categorises Named Entities (NEs). NER has numerous applications in many domains, and medical texts are no exception. Applied to the medical domain, NER could, for example, assist in detecting patterns in medical records, allow doctors to make better diagnoses and treatment decisions, enable medical staff to quickly assess a patient's records, and ensure that patients are informed about their data. However, all these applications would require a very high level of accuracy. To improve the accuracy of NER in this domain, new approaches need to be developed that are tailored to the types of named entities to be extracted and categorised. To address this problem, this research applied Bayesian Belief Networks (BBN) to the process. BBN, a probabilistic model for prediction of random variables and their dependencies, can be used to detect and predict entities. The aim of this research is to apply BBN to the NER task to extract relevant medical entities such as disease names, symptoms, treatment methods, and diagnosis methods from modern Arabic texts in the medical domain. To achieve this aim, a new corpus related to the medical domain has been built and annotated. Our BBN approach achieved 96.60% precision, 90.79% recall, and 93.60% F-measure for the disease entity, and 69.33% precision, 70.99% recall, and 70.15% F-measure for the treatment method entity. For the diagnosis method and symptom categories, our system achieved 84.91% and 71.34% precision, 53.36% and 49.34% recall, and 65.53% and 58.33% F-measure, respectively. Our BBN strategy achieved good accuracy for NEs in the categories of disease and treatment method. However, the average word length of the other two NE categories observed, diagnosis method and symptom, may have had a negative effect on their accuracy. Overall, the application of BBN to Arabic medical NER is successful, but more development is needed to improve accuracy to a standard at which the results can be applied to real medical systems.
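
The precision, recall, and F-measure figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is F1 (beta = 1).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) in percent, as quoted in the abstract.
reported = {
    "disease": (96.60, 90.79, 93.60),
    "treatment method": (69.33, 70.99, 70.15),
    "diagnosis method": (84.91, 53.36, 65.53),
    "symptom": (71.34, 49.34, 58.33),
}

for entity, (p, r, f_reported) in reported.items():
    f = f_measure(p, r)
    print(f"{entity}: computed {f:.2f}, reported {f_reported:.2f}")
```

Under that assumption, each computed value agrees with the reported one to within rounding. The low recall for the diagnosis method and symptom categories is what pulls their F-measure well below their precision.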