66 research outputs found

    Automatic Recognition of Arabic Poetry Meter from Speech Signal using Long Short-term Memory and Support Vector Machine

    Recognizing the poetry meter in spoken lines is a natural language processing application that aims to identify the pattern of stressed and unstressed syllables in a line of a poem. State-of-the-art studies include few works on the automatic recognition of Arud meters, all of which are text-based models; none is voice-based. Poetry meter recognition is not easy for an ordinary reader and is very difficult for a listener, and it is usually performed manually by experts. This paper proposes a model to detect the poetry meter from a single spoken line (“Bayt”) of an Arabic poem. The data used in this work comprise 230 samples collected from 10 Arabic poems, covering three meters and read by two speakers. The work extracts linear prediction cepstral coefficient and Mel frequency cepstral coefficient (MFCC) features as time-series input to the proposed long short-term memory (LSTM) classifier, and in addition computes a global feature set, using statistics of the features across all frames, to feed the support vector machine (SVM) classifier. The results show that the SVM model achieves the highest accuracy in the speaker-dependent approach, improving results by 3% compared to state-of-the-art studies, whereas in the speaker-independent approach the LSTM with MFCC features outperforms the other proposed models.
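
The idea of collapsing per-frame features into one fixed-length "global" vector for an SVM can be sketched as follows. This is only an illustration: the abstract does not say which statistics are used, so the choice of mean, standard deviation, minimum and maximum, and the function name `global_features`, are assumptions.

```python
from statistics import mean, pstdev


def global_features(frames):
    """Collapse a variable-length list of per-frame feature vectors
    (e.g. MFCCs) into one fixed-length vector.

    Assumption: the statistics chosen here (mean, population standard
    deviation, min, max per coefficient) are illustrative only.
    """
    n_coeffs = len(frames[0])
    # Gather the trajectory of each coefficient across all frames.
    columns = [[frame[k] for frame in frames] for k in range(n_coeffs)]
    feats = []
    for col in columns:
        feats.extend([mean(col), pstdev(col), min(col), max(col)])
    return feats


# Two frames with two coefficients each yield a vector of 2 * 4 values,
# regardless of how many frames the utterance has.
vector = global_features([[1.0, 2.0], [3.0, 6.0]])
```

The output length depends only on the number of coefficients, which is what lets utterances of different durations share one SVM input space.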

    Enhancing physical layer security of cognitive radio transceiver via chaotic OFDM

    Given the enormous potential of Cognitive Radio (CR) for improving spectral utilization, designing an adaptive access system and addressing its physical layer security are among the most important and challenging issues in CR networks. Since CR transceivers need to transmit over multiple non-contiguous frequency holes, a multi-carrier system is one of the best candidates for the CR physical layer design. In this paper, we propose a combined chaotic scrambling (CS) and chaotic shift keying (CSK) scheme in Orthogonal Frequency Division Multiplexing (OFDM) based CR to enhance its physical layer security. By employing a chaos-based third-order Chebyshev map, which allows optimum bit error rate (BER) performance of CSK modulation, the proposed combined scheme outperforms the traditional OFDM system in an overlay scenario with a Rayleigh fading channel. Importantly, with two layers of encryption based on chaotic scrambling and CSK modulation, a large key size can be generated to resist brute-force attacks, leading to a significantly improved level of security.
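
A minimal sketch of how a third-order Chebyshev map can drive a scrambler is given below. The map itself, x -> 4x^3 - 3x on [-1, 1], is the standard third-order Chebyshev polynomial; the permutation-by-sorting construction and the function names are assumptions made for illustration, not the scheme from the paper, and the CSK modulation layer is not shown.

```python
def chebyshev_sequence(x0, length):
    """Iterate the third-order Chebyshev map x -> 4x^3 - 3x on [-1, 1]."""
    seq = []
    x = x0
    for _ in range(length):
        x = 4.0 * x ** 3 - 3.0 * x
        # Guard against floating-point drift slightly outside [-1, 1].
        x = max(-1.0, min(1.0, x))
        seq.append(x)
    return seq


def _order(key, n):
    # Assumed construction: the sort order of the chaotic sequence
    # defines a key-dependent permutation of n positions.
    chaos = chebyshev_sequence(key, n)
    return sorted(range(n), key=lambda i: chaos[i])


def scramble(symbols, key):
    """Permute symbols with the permutation derived from the key."""
    return [symbols[i] for i in _order(key, len(symbols))]


def descramble(scrambled, key):
    """Invert scramble() given the same key (initial condition)."""
    original = [None] * len(scrambled)
    for pos, i in enumerate(_order(key, len(scrambled))):
        original[i] = scrambled[pos]
    return original


data = list(range(16))
mixed = scramble(data, key=0.3)
restored = descramble(mixed, key=0.3)  # equals data when the key matches
```

Here the initial condition plays the role of the secret key: a receiver that iterates the map from a different starting value derives a different permutation.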

    Enhancing secrecy rate in cognitive radio networks via multilevel Stackelberg game

    In this letter, physical layer (PHY) security is investigated for both primary and secondary transmissions of a cognitive radio network (CRN) that is exposed to malicious attempts by an eavesdropper (ED). In our proposed system, the secondary transmitter (ST) acts as a trusted relay (TR) for primary transmission, and PHY security is facilitated by cooperation between the primary transmitter (PT) and the ST using a multilevel Stackelberg game. In particular, we formulate and solve the optimization problem of maximizing secrecy rates in the different phases of primary and secondary transmissions. Finally, numerical examples are provided to demonstrate that spectrum leasing based on trading secondary access for cooperation is a promising framework for enhancing the secrecy rate in CRNs.

    Enhancing secrecy rate in cognitive radio networks via stackelberg game

    In this paper, a game-theory-based cooperation scheme is investigated to enhance the physical layer security of both primary and secondary transmissions in a cognitive radio network (CRN). In CRNs, the primary network may decide to lease its own spectrum for a fraction of time to the secondary nodes in exchange for appropriate remuneration. We consider the secondary transmitter node as a trusted relay that forwards primary messages in a decode-and-forward (DF) fashion and, at the same time, allows part of its available power to be used to transmit artificial noise (i.e., a jamming signal) to enhance the primary and secondary secrecy rates. To allocate power between the message and jamming signals, we formulate and solve the optimization problem of maximizing the secrecy rates under malicious attempts from eavesdroppers (EDs). We then analyse the cooperation between the primary and secondary nodes from a game-theoretic perspective, modelling their interaction as a Stackelberg game with a theoretically proved and computed Stackelberg equilibrium. We show that spectrum leasing based on trading secondary access for cooperation, by means of relaying and jamming, is a promising framework for enhancing security in CRNs.

    Efficient Kinect Sensor-based Kurdish Sign Language Recognition Using Echo System Network

    Sign language helps build communication and bridge gaps in understanding. Automatic sign language recognition (ASLR) has recently been studied for various sign languages. However, Kurdish sign language (KuSL) is relatively new, so research on it and datasets designed for it are limited. This paper proposes a model to translate KuSL into text and designs a dataset using the Kinect V2 sensor. It investigates the computational complexity of the feature extraction and classification steps, which is a serious problem for ASLR. The paper proposes a feature engineering approach that uses the skeleton positions alone, providing a better representation of the features and avoiding the use of all of the image information. In addition, the proposed model makes use of recurrent neural network (RNN)-based models. Training RNNs is inherently difficult, which motivates the investigation of alternatives. Besides the trainable long short-term memory (LSTM), this study proposes the untrained, low-complexity echo system network (ESN) classifier. The accuracies of both the LSTM and the ESN indicate that they can outperform those in state-of-the-art studies. In addition, the ESN, which has not previously been proposed for ASLR, exhibits accuracy comparable to the LSTM with a significantly lower training time.

    Automatic Speech Emotion Recognition- Feature Space Dimensionality and Classification Challenges

    In the last decade, research in Speech Emotion Recognition (SER) has become a major endeavour in Human Computer Interaction (HCI) and speech processing. Accurate SER is essential for many applications, such as assessing customer satisfaction with quality of services and detecting or assessing the emotional state of children in care. The large number of studies published on SER reflects the demand for its use. The main concern of this thesis is the investigation of SER from the pattern recognition and machine learning points of view. In particular, we aim to identify appropriate mathematical models of SER and examine the process of designing automatic emotion recognition schemes. There are major challenges to automatic SER, including ambiguity about the list and definition of emotions, the lack of agreement on a manageable set of uncorrelated, emotion-relevant speech features, and the difficulty of collecting emotion-related datasets under natural circumstances. We begin by identifying appropriate sets of emotion-related features extractable from speech signals, considered from psychological and computational points of view. We investigate the use of pattern-recognition approaches to remove redundancies and achieve a compact digital representation of the extracted data with minimal loss of information. The thesis includes the design of new SER schemes, or the complementing of existing ones, and large sets of experiments to test their performance empirically on different databases and to identify the advantages and shortcomings of using speech alone for emotion recognition. Existing SER studies seem to deal with the ambiguity and disagreement over a “limited” number of emotion-related features by expanding the list from the same speech signal sources and applying various feature selection procedures as a means of reducing redundancies. Attempts are made to discover features from speech that are more relevant to emotion.
One of our investigations proposes a new set of features for SER, extracted from Linear Predictive (LP)-residual speech. We demonstrate the usefulness of this relatively small set by testing the performance of an SER scheme that fuses it with the existing set of thousands of features, using the common machine learning schemes of Support Vector Machine (SVM) and Artificial Neural Network (ANN). The challenge of the growing dimensionality of the SER feature space, and its impact on model complexity, is another major focus of our research project. By studying the pros and cons of the commonly used feature selection approaches, we argue in favour of meta-feature selection and develop various methods in this direction, not only to reduce dimension but also to adapt and de-correlate emotional feature spaces for improved SER recognition accuracy. We use Principal Component Analysis (PCA) and propose Data Independent PCA (DIPCA), obtained by training on independent emotional and non-emotional datasets. The DIPCA projections, especially when extracted from speech data coloured with different emotions or from neutral speech data, had SER performance comparable to that of PCA. Another approach adopted in this thesis for dimension reduction is Random Projection (RP) matrices, which are independent of the training data. We show that some versions of RP with an SVM classifier can offer an adaptation space for speaker-independent SER that avoids over-fitting and hence improves recognition accuracy. Using PCA trained on one set of data while testing on emotional data features has significant implications for machine learning in general. The other major contribution of the thesis concerns the classification aspects of SER. We investigate the drawbacks of the well-known SVM classifier when applied to data preprocessed by PCA and RP.
We demonstrate the advantages of using the Linear Discriminant Classifier (LDC) instead, especially for PCA de-correlated meta-features. We introduce a variety of LDC-based ensemble classification schemes and test their performance using a new form of bagging over different subsets of meta-features extracted by PCA, with encouraging results. The experiments were conducted on two benchmark datasets (Emo-Berlin and FAU-Aibo) and an in-house dataset in the Kurdish language. The recognition accuracies achieved are significantly higher than the state-of-the-art results on all datasets. The results, however, revealed a difficult challenge in the form of a persistent wide gap in accuracy across different datasets, which cannot be explained entirely by the differences in the nature of the datasets. We conducted various pilot studies, based on visualizations of the confusion matrices for the “difficult” databases, to build multi-level SER schemes. These studies provide initial evidence of the presence of more than one “emotion” in the same portion of speech. A possible solution may be to present recognition accuracy in a score-based measurement such as the spider chart. Such an approach may also reveal the presence of the Doddington zoo phenomenon in SER.

    The Visual Effectiveness of the Techniques of the Graphic Display (An Experimental study)

    The present study, titled "The Visual Effectiveness of the Techniques of the Graphic Display", consists of four chapters. The first chapter addresses the research problem, which concludes with the following questions: Do all graphic techniques achieve equal visual effectiveness at the level of the print's display field? Does combining several techniques in a single work give the visual field greater effectiveness? Which graphic techniques achieve higher visual effectiveness than others? The chapter then sets out the importance of the research and the need for it, states its aim of identifying the visual effectiveness of graphic techniques, and defines some of the terms used in the research title. The second chapter, titled "Mechanisms of the visual system in the workings of graphic techniques", discusses the operating mechanisms of graphic display techniques in terms of relief, intaglio and stencil printing. The third chapter covers the research procedures, which define the research population and its sample: five graphic works produced by the researcher using different display techniques, chosen to suit the aim of the study.
    The fourth chapter presents the researcher's findings, including the effectiveness of the technique of pasting scraps (collagraph, "Collegraphy"), as in sample (1): (1) Texture: highly effective in conveying the sense of texture of the printed scraps. (2) Color: low effectiveness in its color inputs (chromatic austerity), because the parts are difficult to ink. (3) Line: high visual effectiveness in showing lines. (4) Point: low effectiveness in showing point compositions. (5) Color gradients: cannot be produced with this technique (low effectiveness). (6) Reliefs (protrusions): high visual effectiveness in showing protrusions on the surface of the printing paper, as with the blind printing technique. (7) Number of prints: limited, because parts of the scraps are damaged at each printing (low effectiveness). (8) Cost: low. (9) Availability of materials: widely available. (10) Speed of completion: fast (high effectiveness). (11) Possibility of color use: no colors other than printing ink can be used (low effectiveness). (12) Size: limited by the size of the printing machine, the Litho press (low effectiveness).
    The conclusions of the research include the following: each graphic display technique has its own effectiveness, which differs slightly or greatly from that of other techniques, and the visual effectiveness of a single graphic work increases with the number of techniques employed. These are followed by the proposals and recommendations, and then by the list of figures, sources and references that the researcher drew on in presenting the research in its current form.

    Emotion recognition from speech: tools and challenges

    Human emotion recognition from speech is studied frequently because of its importance in many applications, e.g. human-computer interaction. There is wide diversity and disagreement about the basic emotions or emotion-related states on the one hand, and about where the emotion-related information lies in the speech signal on the other. This diversity motivates our investigation into extracting meta-features using the PCA approach, or using a non-adaptive random projection (RP), both of which significantly reduce the high-dimensional speech feature vectors that may contain a wide range of emotion-related information. Subsets of meta-features are fused to increase the performance of the recognition model, which adopts the score-based LDC classifier. We demonstrate that our scheme outperforms the state-of-the-art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between the accuracy rates achieved on the different types of speech datasets raises questions about the way emotions modulate speech. In particular, we argue that emotion recognition from speech should not be dealt with as a classification problem. We demonstrate the presence of a spectrum of different emotions in the same speech portion, especially in the non-prompted datasets, which tend to be more “natural” than the acted datasets, where the subjects attempt to suppress all but one emotion. © (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
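
A non-adaptive random projection of the kind mentioned above can be sketched in a few lines. The Gaussian entries, the 1/sqrt(out_dim) scaling and the function names are assumptions chosen for illustration; the abstract does not state which RP variant is used.

```python
import random


def random_projection_matrix(in_dim, out_dim, seed=0):
    """Build an out_dim x in_dim matrix of scaled Gaussian entries.

    The matrix is drawn once from a seeded generator and never fitted
    to data, which is what makes the projection non-adaptive.
    """
    rng = random.Random(seed)
    scale = 1.0 / (out_dim ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(vector, matrix):
    """Map a high-dimensional feature vector to out_dim meta-features."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Reduce a 6-dimensional feature vector to 3 meta-features.
matrix = random_projection_matrix(in_dim=6, out_dim=3, seed=1)
meta = project([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], matrix)
```

Unlike PCA, nothing here depends on the training set, so the same matrix can be reused across speakers and databases.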

    Kurdish Dialects and Neighbor Languages Automatic Recognition

    Dialect recognition is one of the hottest topics in speech analysis. In this study, a system for dialect and language recognition is developed using phonetic and style-based features. The study proposes a new feature set based on the one-dimensional LBP feature, and the results show that it helps improve dialect and language recognition accuracy. The data acquired for this study cover three Kurdish dialects (Sorani, Badini and Hawrami) and three neighbor languages (Arabic, Persian and Turkish). The study also proposes a new method to interpret the closeness of the Kurdish dialects and their neighbor languages using the confusion matrix and a non-metric multi-dimensional visualization technique. The results show that the Kurdish dialects can be clustered and linearly separated from the neighbor languages.
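
For readers unfamiliar with one-dimensional LBP (local binary pattern) features, a generic version can be sketched as below. This is the textbook idea applied to a 1-D signal, not necessarily the exact variant used in the study; the neighbourhood radius, the bit ordering and the function names are assumptions.

```python
def lbp_1d_codes(signal, radius=4):
    """Compute a 1-D LBP code for each sample that has `radius`
    neighbours on both sides. `signal` is a list of numbers."""
    codes = []
    for c in range(radius, len(signal) - radius):
        neighbours = signal[c - radius:c] + signal[c + 1:c + radius + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            # A neighbour at least as large as the centre sets its bit.
            if value >= signal[c]:
                code |= 1 << bit
        codes.append(code)
    return codes


def lbp_1d_histogram(signal, radius=4):
    """Normalised histogram of LBP codes, usable as a feature vector."""
    hist = [0] * (2 ** (2 * radius))
    codes = lbp_1d_codes(signal, radius)
    for code in codes:
        hist[code] += 1
    total = len(codes)
    return [h / total for h in hist] if total else hist


# With radius=1 each code has 2 bits, so the histogram has 4 bins.
features = lbp_1d_histogram([1, 3, 2, 5, 4], radius=1)
```

The histogram has a fixed length for a given radius, so signals of different durations map to feature vectors of the same size.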