Ensemble methods for instance-based Arabic language authorship attribution
Authorship attribution (AA) is a subfield of authorship analysis, and it has become an important problem as the volume of anonymous information has grown with the rapid worldwide increase in internet usage. In other languages, such as English, Spanish and Chinese, the problem is quite well studied. In Arabic, however, AA has received less attention from the research community because of the complexity and nature of Arabic sentences. The paper presents an intensive review of previous studies on Arabic. Based on that review, the study employs the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to choose the base classifier for the ensemble methods. For attribution features, hundreds of stylometric features and distinct words were extracted using several tools. AdaBoost and Bagging ensemble methods were then applied to a dataset of Arabic enquiries (fatwa). The findings show an improvement in the effectiveness of authorship attribution for Arabic.
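The abstract does not say which criteria or weights were fed to TOPSIS, so the following is only a minimal sketch of the general method, not the authors' setup. The candidate classifiers, the two criteria (accuracy as a benefit, training time as a cost), the weights and all numbers are hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each alternative (row).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is
    Assumes no column is all zeros and the alternatives are not all identical.
    """
    n = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    # Ideal and anti-ideal value for each criterion.
    cols = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)   # distance to the ideal solution
        d_minus = math.dist(row, anti)   # distance to the anti-ideal solution
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Hypothetical candidates: [accuracy, training time in seconds].
candidates = ["clf_a", "clf_b", "clf_c"]
matrix = [[0.90, 10.0], [0.80, 20.0], [0.70, 30.0]]
scores = topsis(matrix, weights=[0.5, 0.5], benefit=[True, False])
best = candidates[scores.index(max(scores))]
```

The alternative with the highest closeness score would be taken as the base classifier in this sketch.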
Phonotactic and Acoustic Language Recognition
This thesis deals with phonotactic and acoustic techniques for automatic language recognition (LRE). The first part covers phonotactic language recognition based on co-occurrences of phone sequences in speech. Phone recognition is studied thoroughly as a tokenization technique for LRE, with a focus on the amount of training data for the phone recognizer and on combining phone recognizers trained on several languages (Parallel Phone Recognition followed by Language Model, PPRLM). The thesis also presents a novel technique of anti-models in PPRLM and investigates the use of phone lattices instead of strings. The work on the phonotactic approach concludes with a comparison of classical n-gram modeling techniques and binary decision trees. Acoustic LRE is addressed too, with the main focus on discriminative techniques for training target-language acoustic models and on initial (but successful) experiments with removing channel dependencies. The fusion of phonotactic and acoustic approaches is also investigated. All experiments were performed on standard data from the NIST 2003, 2005 and 2007 evaluations, so the results are directly comparable to those of other laboratories in the LRE community. With these techniques, the fused systems defined the state of the art in the LRE field and reached excellent results in NIST evaluations.
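To make the PPRLM idea concrete, here is a toy sketch, assuming phone strings are already available from each recognizer. It uses add-one-smoothed bigram models and sums log-likelihoods across recognizers; the recognizer names, language labels, phone symbols and training strings are all hypothetical, and the thesis's actual models, anti-models and lattice-based counts are not reproduced here.

```python
import math
from collections import Counter

class BigramLM:
    """Add-one-smoothed phone bigram model (a deliberately simple stand-in)."""

    def __init__(self, sequences, vocab):
        self.vocab_size = len(vocab)
        self.bigrams = Counter()
        self.context = Counter()
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                self.bigrams[(a, b)] += 1
                self.context[a] += 1

    def log_prob(self, seq):
        # Sum of log P(b | a) over consecutive phone pairs.
        total = 0.0
        for a, b in zip(seq, seq[1:]):
            num = self.bigrams[(a, b)] + 1
            den = self.context[a] + self.vocab_size
            total += math.log(num / den)
        return total

def identify_language(phone_strings, models):
    """phone_strings: {recognizer: phone list}; models: {recognizer: {language: BigramLM}}."""
    totals = {}
    for recognizer, seq in phone_strings.items():
        for language, lm in models[recognizer].items():
            # Each recognizer's stream is scored by that recognizer's own language models.
            totals[language] = totals.get(language, 0.0) + lm.log_prob(seq)
    return max(totals, key=totals.get)

vocab = {"a", "b"}
train = {
    "X": [["a", "b", "a", "b", "a", "b"]],
    "Y": [["a", "a", "b", "b", "a", "a", "b", "b"]],
}
# Two hypothetical parallel recognizers; here they share the same toy training data.
models = {
    rec: {lang: BigramLM(seqs, vocab) for lang, seqs in train.items()}
    for rec in ("rec1", "rec2")
}
utterance = {"rec1": ["a", "b", "a", "b"], "rec2": ["a", "b", "a", "b"]}
decision = identify_language(utterance, models)
```

In a real system each recognizer would use its own phone set and produce a different tokenization of the same audio, which is what makes the combination useful.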
Selected Computing Research Papers Volume 7 June 2018
Contents
Critical Evaluation of Arabic Sentimental Analysis and Their Accuracy on Microblogs (Maha Al-Sakran)
Evaluating Current Research on Psychometric Factors Affecting Teachers in ICT Integration (Daniel Otieno Aoko)
A Critical Analysis of Current Measures for Preventing Use of Fraudulent Resources in Cloud Computing (Grant Bulman)
An Analytical Assessment of Modern Human Robot Interaction Systems (Dominic Button)
Critical Evaluation of Current Power Management Methods Used in Mobile Devices (One Lekula)
A Critical Evaluation of Current Face Recognition Systems Research Aimed at Improving Accuracy for Class Attendance (Gladys B. Mogotsi)
Usability of E-commerce Website Based on Perceived Homepage Visual Aesthetics (Mercy Ochiel)
An Overview Investigation of Reducing the Impact of DDOS Attacks on Cloud Computing within Organisations (Jabed Rahman)
Critical Analysis of Online Verification Techniques in Internet Banking Transactions (Fredrick Tshane)
Introducing Phonetic Information to Speaker Embedding for Speaker Verification
Phonetic information is one of the most essential components of a speech signal and plays an important role in many speech processing tasks. However, it is difficult to integrate phonetic information into speaker verification systems, since it occurs primarily at the frame level while speaker characteristics typically reside at the segment level. In deep neural network-based speaker verification, existing methods only apply phonetic information to frame-wise trained speaker embeddings. To address this weakness, this paper proposes phonetic adaptation and hybrid multi-task learning, and further combines them into c-vector and simplified c-vector architectures. Experiments on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2010 show that the four proposed speaker embeddings achieve better performance than the baseline. The c-vector system performs best, providing over 30% and 15% relative improvements in equal error rate (EER) for the core-extended and 10 s–10 s conditions, respectively. On the NIST SRE 2016, 2018, and VoxCeleb datasets, the proposed c-vector approach improves performance even when there is a language mismatch within the training sets or between the training and evaluation sets. Extensive experimental results demonstrate the effectiveness and robustness of the proposed methods.
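For readers unfamiliar with the metric, the sketch below shows one simple way to estimate EER from verification scores and to compute a relative improvement between two systems. It is a threshold sweep without interpolation, not the evaluation tooling used in the paper, and the scores are made up for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The estimate is
    taken where the false acceptance and false rejection rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer

def relative_improvement(baseline_eer, new_eer):
    """Fraction by which the new EER is lower than the baseline EER."""
    return (baseline_eer - new_eer) / baseline_eer

# Made-up scores: same-speaker (target) and different-speaker (nontarget) trials.
targets = [0.9, 0.6, 0.4]
nontargets = [0.5, 0.3, 0.1]
eer = equal_error_rate(targets, nontargets)
```

A relative improvement of 30% means the new EER is 0.7 times the baseline EER, regardless of the absolute values involved.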