Ensemble methods for instance-based Arabic language authorship attribution

Author: Al-Hadhrami T
Al-Sarem M
Alsaeedi A
Boulila W
Saeed F
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/01/2020

Authorship Attribution (AA) is a subfield of authorship analysis, and it has become an important problem as the volume of anonymous information has grown with the rapid worldwide expansion of internet usage. In other languages such as English, Spanish and Chinese, the problem is quite well studied. In Arabic, however, AA has received less attention from the research community owing to the complexity and nature of Arabic sentences. The paper presents an intensive review of previous studies for the Arabic language. Based on that review, the study employs the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to choose the base classifier of the ensemble methods. For attribution features, hundreds of stylometric features and distinct words were extracted using several tools. AdaBoost and Bagging ensemble methods were then applied to an Arabic enquiries (Fatwa) dataset. The findings showed an improvement in the effectiveness of the authorship attribution task in the Arabic language.

Nottingham Trent Institutional Repository (IRep)
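
The abstract above names TOPSIS as the way the base classifier was chosen but gives no details. As a rough illustration only, the standard TOPSIS procedure can be sketched as below. The candidate names, criteria, scores and weights are hypothetical placeholders, not values from the paper, and the function name `topsis` is likewise an assumption made for this sketch.

```python
import math


def topsis(matrix, weights, benefit):
    """Return one TOPSIS closeness score per alternative (row of `matrix`).

    matrix  -- rows are alternatives, columns are criteria
    weights -- one weight per criterion
    benefit -- True where larger is better, False where smaller is better
    Assumes no criterion column is all zeros and the alternatives are not
    all identical (either case would cause a division by zero).
    """
    n_criteria = len(weights)
    # Step 1: vector-normalise each criterion column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    # Step 2: apply the criterion weights.
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in matrix]
    # Step 3: build the ideal and anti-ideal alternatives column by column.
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    # Step 4: relative closeness = distance to anti-ideal / total distance.
    scores = []
    for row in weighted:
        d_best = math.dist(row, ideal)
        d_worst = math.dist(row, anti)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical candidates scored on accuracy, F1 and training time (seconds).
candidates = ["classifier_a", "classifier_b", "classifier_c"]
decision_matrix = [
    [0.90, 0.88, 30.0],
    [0.85, 0.80, 12.0],
    [0.80, 0.79, 40.0],
]
scores = topsis(decision_matrix, weights=[0.4, 0.4, 0.2], benefit=[True, True, False])
# Higher closeness means nearer the ideal; print candidates best first.
for name, score in sorted(zip(candidates, scores), key=lambda p: p[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The candidate with the highest closeness score would be taken as the base classifier. Which criteria and weights the authors actually used is not stated in the abstract.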

PHONOTACTIC AND ACOUSTIC LANGUAGE RECOGNITION

Author: Matějka Pavel
Publication venue: Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií
Publication date: 01/01/2009

This thesis deals with phonotactic and acoustic techniques for automatic language recognition (LRE). The first part deals with phonotactic language recognition based on co-occurrences of phone sequences in speech. It presents a thorough study of phone recognition as a tokenization technique for LRE, focusing on the amount of training data for the phone recognizer and on the combination of phone recognizers trained on several languages (Parallel Phone Recognition followed by Language Model, PPRLM). The thesis also introduces a novel technique of anti-models in PPRLM and investigates the use of phone lattices instead of strings. The work on the phonotactic approach concludes with a comparison of classical n-gram modeling techniques and binary decision trees. Acoustic LRE is addressed too, with the main focus on discriminative techniques for training target-language acoustic models and on initial (but successful) experiments with removing channel dependencies. The fusion of phonotactic and acoustic approaches is also investigated. All experiments were performed on standard data from the NIST 2003, 2005 and 2007 evaluations, so the results are directly comparable to those of other laboratories in the LRE community. With the above-mentioned techniques, the fused systems defined the state of the art in the LRE field and reached excellent results in two NIST evaluations.

Digital library of Brno University of Technology

National Repository of Grey Literature

Selected Computing Research Papers Volume 7 June 2018

Author: Al-Sakran Maha
Aoko Daniel Otieno
Bulman Grant
Button Dominic
Kendal Simon
Lekula One
Mogotsi Gladys B.
Ochiel Mercy
Rahman Jabed
Tshane Fredrick
Publication venue: University of Sunderland
Publication date: 01/06/2018

Contents:
Critical Evaluation of Arabic Sentimental Analysis and Their Accuracy on Microblogs (Maha Al-Sakran)
Evaluating Current Research on Psychometric Factors Affecting Teachers in ICT Integration (Daniel Otieno Aoko)
A Critical Analysis of Current Measures for Preventing Use of Fraudulent Resources in Cloud Computing (Grant Bulman)
An Analytical Assessment of Modern Human Robot Interaction Systems (Dominic Button)
Critical Evaluation of Current Power Management Methods Used in Mobile Devices (One Lekula)
A Critical Evaluation of Current Face Recognition Systems Research Aimed at Improving Accuracy for Class Attendance (Gladys B. Mogotsi)
Usability of E-commerce Website Based on Perceived Homepage Visual Aesthetics (Mercy Ochiel)
An Overview Investigation of Reducing the Impact of DDOS Attacks on Cloud Computing within Organisations (Jabed Rahman)
Critical Analysis of Online Verification Techniques in Internet Banking Transactions (Fredrick Tshane)

Sunderland University Institutional Repository

Introducing Phonetic Information to Speaker Embedding for Speaker Verification

Author: He Liang
Johnson Michael T.
Liu Yi
Publication venue: UKnowledge
Publication date: 05/12/2019

Phonetic information is one of the most essential components of a speech signal and plays an important role in many speech processing tasks. However, it is difficult to integrate phonetic information into speaker verification systems, since it occurs primarily at the frame level while speaker characteristics typically reside at the segment level. In deep neural network-based speaker verification, existing methods only apply phonetic information to the frame-wise trained speaker embeddings. To address this weakness, this paper proposes phonetic adaptation and hybrid multi-task learning, and further combines these into c-vector and simplified c-vector architectures. Experiments on the National Institute of Standards and Technology (NIST) speaker recognition evaluation (SRE) 2010 show that the four proposed speaker embeddings achieve better performance than the baseline. The c-vector system performs best, providing over 30% and 15% relative improvements in equal error rate (EER) for the core-extended and 10 s–10 s conditions, respectively. On the NIST SRE 2016, 2018, and VoxCeleb datasets, the proposed c-vector approach improves performance even when there is a language mismatch within the training sets or between the training and evaluation sets. Extensive experimental results demonstrate the effectiveness and robustness of the proposed methods.

University of Kentucky

Data analytics 2016: proceedings of the fifth international conference on data analytics

Author: Bhulai Sandjai
Semanjski Ivana
Publication venue: The International Academy, Research and Industry Association
Publication date: 01/01/2016

VU Research Portal

Ghent University Academic Bibliography

A systematic literature review of factor analytic and mixture models of ICD-11 PTSD and CPTSD using the International Trauma Questionnaire

Author: Cloitre Marylène
Hyland Philip
Karatzias Thanos
McBride Orla
Murphy Jamie
Nolan Emma
Redican Enya
Shevlin Mark
Publication venue: 'Elsevier BV'
Publication date: 30/04/2021

Ulster University's Research Portal