
    Classification of Visemes Using Visual Cues

    Studies have shown that visual features extracted from the lips of a speaker (visemes) can be used to automatically classify the visual representation of phonemes. Different visual features were extracted from audio-visual recordings of a set of phonemes and used to define Linear Discriminant Analysis (LDA) functions that classify the phonemes. Audio-visual recordings from 18 native speakers of American English were obtained for 12 Vowel-Consonant-Vowel (VCV) sounds formed from the consonants /b,v,w,ð,d,z/ and the vowels /ɑ,i/. The visual features used in this study were related to the lip height, lip width, motion in the upper lips, and the rate at which the lips move while producing the VCV sequences. Features extracted from half of the speakers were used to design the classifiers, and features extracted from the other half were used to test them. When each VCV sound was treated as an independent class, resulting in 12 classes, the percentage of correct recognition was 55.3% on the training set and 43.1% on the testing set. This percentage increased as classes were merged based on the level of confusion between them in the results. When the same consonants with different vowels were treated as one class, resulting in 6 classes, the percentage of correct classification was 65.2% on the training set and 61.6% on the testing set. This is consistent with psycho-visual experiments in which subjects were unable to distinguish between visemes associated with VCV words that share the same consonant but differ in vowel. When the VCV sounds were grouped into 3 classes, the percentage of correct classification was 84.4% on the training set and 81.1% on the testing set. In the second part of the study, linear discriminant functions were developed for every speaker, resulting in 18 different sets of LDA functions. For every speaker, five VCV utterances were used to design the LDA functions, and 3 different VCV utterances were used to test these functions. On the training data, the range of correct classification across the 18 speakers was 90-100%, with an average of 96.2%. On the testing data, the range of correct classification was 50-86%, with an average of 68%. A step-wise linear discriminant analysis evaluated the contribution of the different features to the discrimination problem. The analysis indicated that classifiers using only the top 7 features suffered a performance drop of only 2-5%. The top 7 features were related to the shape of the mouth and the rate of motion of the lips while the consonant in the VCV sequence was being produced. The results of this work show that visual features extracted from the lips can separate the visual representation of phonemes into different classes.
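
    The speaker-independent experiment described above amounts to fitting LDA on the lip features of half of the speakers and scoring it on the held-out half. The sketch below illustrates that setup with scikit-learn; the array names, the random stand-in data, and the even split by speaker index are illustrative assumptions, not the study's data or code.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.metrics import accuracy_score

    def speaker_independent_lda(X, y, speaker_id, n_train_speakers=9):
        # X: (n_utterances, n_features) lip features (height, width, motion, rate)
        # y: VCV class labels, e.g. 0..11 for the 12-class experiment
        # speaker_id: integer speaker index (0..17) for each utterance
        train = speaker_id < n_train_speakers
        lda = LinearDiscriminantAnalysis().fit(X[train], y[train])
        return (accuracy_score(y[train], lda.predict(X[train])),
                accuracy_score(y[~train], lda.predict(X[~train])))

    # Random stand-in data; real features would come from the lip measurements.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(18 * 12 * 8, 10))
    y = rng.integers(0, 12, size=X.shape[0])
    speaker_id = rng.integers(0, 18, size=X.shape[0])
    train_acc, test_acc = speaker_independent_lda(X, y, speaker_id)
    print(train_acc, test_acc)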

    The University as Technology-Focused Center on Entrepreneurship in the Middle East: Case of Jordan

    A stringent business ecosystem raises the demand for novel ways of operating in order to survive and innovate. Such change, and the improved capabilities it requires, call for reconsidering the role of the university in filling the gaps. This research offers first steps and insights regarding the added value of the university in one of the Middle East countries. The main objective is to foster the economic development of the local community in Aqaba city by establishing a technology-focused center on entrepreneurship. Such a center will strive to utilize the various resources at hand in order to deliver high-quality outcomes for the development of the local community. Expected benefits include providing various services to a wide range of clients. These services assist the local community in overcoming challenges such as scarce employment opportunities and the expanding number of higher-education graduates. Keywords: Center on entrepreneurship, Entrepreneurial learning center, Regional innovation ecosystem, Technology based business, University entrepreneurship

    Illumination removal and text segmentation for Al-Quran using binary representation

    The process of segmenting Al-Quran pages needs to be studied carefully. This is because Al-Quran is the book of Allah swt. Any incorrect segmentation will affect the holiness of Al-Quran. A major difficulty is the appearance of illumination around text areas as well as of noisy black stripes. In this study, we propose a novel algorithm for detecting the illumination on an Al-Quran page. Our aim is to segment Al-Quran pages into pages without illumination, and to segment those pages into text-line images without any change to the content. First, we apply a pre-processing step that includes binarization. Then, we detect the illumination of the Al-Quran pages. In this stage, we introduce the vertical and horizontal white percentages, which have proved effective for detecting the illumination. Finally, the new images are segmented into text lines. The experimental results on several Al-Quran pages from different Al-Quran styles demonstrate the effectiveness of the proposed technique.
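
    The vertical and horizontal white percentages mentioned above can be pictured as column-wise and row-wise profiles of a binarized page. The sketch below shows one way such profiles could be used to separate the text block from the decorative illumination; the Otsu binarization, the 0.95 threshold, the longest-run heuristic, and the file name are illustrative assumptions, not the paper's actual procedure.

    import cv2
    import numpy as np

    def content_span(white_ratio, thresh=0.95):
        # Contiguous runs of rows/columns that are NOT almost entirely white;
        # the decorative frame and the text block form separate runs, and the
        # text block is assumed here to be the longest one.
        content = white_ratio < thresh
        spans, start = [], None
        for i, c in enumerate(content):
            if c and start is None:
                start = i
            if not c and start is not None:
                spans.append((start, i))
                start = None
        if start is not None:
            spans.append((start, len(content)))
        return max(spans, key=lambda s: s[1] - s[0])

    page = cv2.imread("quran_page.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file
    _, binary = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    row_white = (binary == 255).mean(axis=1)   # horizontal white percentage
    col_white = (binary == 255).mean(axis=0)   # vertical white percentage

    r0, r1 = content_span(row_white)
    c0, c1 = content_span(col_white)
    text_block = binary[r0:r1, c0:c1]          # page with illumination removed

    # Text-line segmentation: lines are runs of rows that still contain ink.
    line_rows = (text_block == 255).mean(axis=1) < 0.99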

    Modified Genetic Algorithm for Feature Selection and Hyper Parameter Optimization: Case of XGBoost in Spam Prediction

    Recently, spam on online social networks has attracted attention in the research and business world. Twitter has become the preferred medium to spread spam content. Many research efforts have attempted to counter social network spam. Twitter brings extra challenges, represented by the size of the feature space and imbalanced data distributions. Usually, the related research works focus on part of these main challenges or produce black-box models. In this paper, we propose a modified genetic algorithm for simultaneous dimensionality reduction and hyperparameter optimization over imbalanced datasets. The algorithm initializes an eXtreme Gradient Boosting classifier and reduces the feature space of the tweets dataset to generate a spam prediction model. The model is validated using 50-times-repeated 10-fold stratified cross-validation and analyzed using nonparametric statistical tests. The resulting prediction model attains on average 82.32% and 92.67% in terms of geometric mean and accuracy respectively, utilizing less than 10% of the total feature space. The empirical results show that the modified genetic algorithm outperforms Chi-squared and PCA feature selection methods. In addition, eXtreme Gradient Boosting outperforms many machine learning algorithms, including a BERT-based deep learning model, in spam prediction. Furthermore, the proposed approach is applied to SMS spam modeling and compared to related works.
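
    As a rough illustration of simultaneous feature selection and hyperparameter tuning, the sketch below encodes an individual as a binary feature mask plus a few XGBoost hyperparameters and scores it by cross-validated geometric mean. The parameter ranges, the 3-fold inner CV, and the synthetic data are assumptions, and the genetic operators (selection, crossover, mutation) are omitted; this is not the authors' algorithm.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.metrics import make_scorer
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from imblearn.metrics import geometric_mean_score
    from xgboost import XGBClassifier

    rng = np.random.default_rng(0)
    gmean = make_scorer(geometric_mean_score)

    def random_individual(n_features):
        # Chromosome: binary feature mask + a few illustrative hyperparameters.
        mask = rng.integers(0, 2, n_features)
        params = {"max_depth": int(rng.integers(2, 10)),
                  "learning_rate": float(rng.uniform(0.01, 0.3)),
                  "n_estimators": int(rng.integers(50, 400))}
        return mask, params

    def fitness(individual, X, y):
        mask, params = individual
        if mask.sum() == 0:                      # an empty feature mask is invalid
            return 0.0
        clf = XGBClassifier(**params)
        cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
        return cross_val_score(clf, X[:, mask.astype(bool)], y,
                               scoring=gmean, cv=cv).mean()

    # Example on synthetic imbalanced data.
    X, y = make_classification(n_samples=300, n_features=30,
                               weights=[0.9, 0.1], random_state=0)
    print(fitness(random_individual(X.shape[1]), X, y))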

    Modeling the Telemarketing Process using Genetic Algorithms and Extreme Boosting: Feature Selection and Cost-Sensitive Analytical Approach

    Currently, almost all direct marketing activities take place virtually rather than in person, weakening interpersonal skills at an alarming pace. Furthermore, businesses have been striving to sense and foster the tendency of their clients to accept a marketing offer. The digital transformation and the increased virtual presence forced firms to seek novel marketing research approaches. This research aims to leverage the power of telemarketing data in modeling the willingness of clients to make a term deposit and in finding the most significant characteristics of the clients. Real-world data from a Portuguese bank and national socio-economic metrics are used to model the telemarketing decision-making process. This research makes two key contributions. First, we propose a novel genetic algorithm-based classifier to select the best discriminating features and tune classifier parameters simultaneously. Second, we build an explainable prediction model. The best generated classification models were intensively validated using 50-times-repeated 10-fold stratified cross-validation, and the selected features were analyzed. The models significantly outperform the related works in terms of class-of-interest accuracy, attaining on average a geometric mean of 89.07% and a type I error of 0.059. The model is expected to maximize the potential profit margin at the least possible cost and to provide more insights to support marketing decision-making.
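
    The reported class-of-interest metrics can be read off a confusion matrix in which a client who accepts the term deposit is the positive class. The sketch below shows one way to compute them; the function name and toy labels are illustrative, not the authors' evaluation code.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    def class_of_interest_metrics(y_true, y_pred):
        # Positive class (label 1): client subscribes to the term deposit.
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        sensitivity = tp / (tp + fn)              # recall on the class of interest
        specificity = tn / (tn + fp)
        return {"geometric_mean": np.sqrt(sensitivity * specificity),
                "type_I_error": fp / (fp + tn)}   # false-positive rate

    print(class_of_interest_metrics([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))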

    Context Modeler for Wavelet Compression of Hyperspectral Images

    A context-modeling subalgorithm has been developed as part of an algorithm that effects three-dimensional (3D) wavelet-based compression of hyperspectral image data. The context-modeling subalgorithm, hereafter denoted the context modeler, provides estimates of probability distributions of the wavelet-transformed data being encoded. These estimates are utilized by an entropy-coding subalgorithm that is another major component of the compression algorithm. The estimates make it possible to compress the image data more effectively than would otherwise be possible. The following background discussion is prerequisite to a meaningful summary of the context modeler. This discussion is presented relative to ICER-3D, which is the name attached to a particular compression algorithm and the software that implements it. The ICER-3D software is summarized briefly in the preceding article, ICER-3D Hyperspectral Image Compression Software (NPO-43238). Some aspects of this algorithm were previously described, in a slightly more general context than the ICER-3D software, in "Improving 3D Wavelet-Based Compression of Hyperspectral Images" (NPO-41381), NASA Tech Briefs, Vol. 33, No. 3 (March 2009), page 7a. In turn, ICER-3D is a product of generalization of ICER, another previously reported algorithm and computer program that can perform both lossless and lossy wavelet-based compression and decompression of gray-scale-image data. In ICER-3D, hyperspectral image data are decomposed using a 3D discrete wavelet transform (DWT). Following wavelet decomposition, mean values are subtracted from spatial planes of spatially low-pass subbands prior to encoding. The resulting data are converted to sign-magnitude form and compressed. In ICER-3D, compression is progressive, in that compressed information is ordered so that as more of the compressed data stream is received, successive reconstructions of the hyperspectral image data are of successively higher overall fidelity.
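
    To make the role of a context modeler concrete, the sketch below shows a generic adaptive context model for a binary entropy coder: each context keeps counts of previously coded bits and hands the coder a probability estimate before every new bit. The context definition, the counting scheme, and the toy bit stream are placeholders for illustration, not ICER-3D's actual context modeler.

    class BinaryContextModel:
        def __init__(self, n_contexts):
            # Start every context with counts (1, 1), i.e. probability 0.5.
            self.counts = [[1, 1] for _ in range(n_contexts)]

        def probability_of_zero(self, ctx):
            zeros, ones = self.counts[ctx]
            return zeros / (zeros + ones)

        def update(self, ctx, bit):
            self.counts[ctx][bit] += 1

    # The context index would be derived from already-coded neighbors of the
    # current coefficient bit, both spatially and in the spectral dimension.
    model = BinaryContextModel(n_contexts=16)
    for ctx, bit in [(3, 0), (3, 0), (3, 1)]:
        p0 = model.probability_of_zero(ctx)   # estimate passed to the entropy coder
        model.update(ctx, bit)
        print(ctx, bit, round(p0, 3))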

    Fast and Adaptive Lossless Onboard Hyperspectral Data Compression System

    Modern hyperspectral imaging systems are able to acquire far more data than can be downlinked from a spacecraft. Onboard data compression helps to alleviate this problem, but requires a system capable of power efficiency and high throughput. Software solutions have limited throughput performance and are power-hungry. Dedicated hardware solutions can provide both high throughput and power efficiency, while taking the load off of the main processor. Thus, a hardware compression system was developed. The implementation uses a field-programmable gate array (FPGA). The implementation is based on the fast lossless (FL) compression algorithm reported in "Fast Lossless Compression of Multispectral-Image Data" (NPO-42517), NASA Tech Briefs, Vol. 30, No. 8 (August 2006), page 26, which achieves excellent compression performance and has low complexity. This algorithm performs predictive compression using an adaptive filtering method, and uses adaptive Golomb coding. The implementation also packetizes the coded data. The FL algorithm is well suited for implementation in hardware. In the FPGA implementation, one sample is compressed every clock cycle, which makes for a fast and practical real-time solution for space applications. The benefits of this implementation are: 1) The underlying algorithm achieves a combination of low complexity and compression effectiveness that exceeds that of techniques currently in use. 2) The algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. 3) Hardware acceleration provides a throughput improvement of 10 to 100 times vs. the software implementation. A prototype of the compressor is available in software, but it runs at a speed that does not meet spacecraft requirements. The hardware implementation targets the Xilinx Virtex IV FPGAs, and makes the use of this compressor practical for Earth satellites as well as beyond-Earth missions with hyperspectral instruments.
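
    The two ingredients named above, predictive compression and Golomb coding, can be illustrated with a toy encoder: predict each sample, map the signed residual to a non-negative integer, and emit a Golomb-Rice codeword. The previous-band predictor and the fixed Rice parameter below are simplifications for illustration; the FL algorithm itself uses adaptive filtering and adaptive Golomb coding.

    def rice_encode(value, k):
        # Map the signed residual to a non-negative integer, then emit a
        # unary quotient followed by a k-bit binary remainder.
        mapped = 2 * value if value >= 0 else -2 * value - 1
        q, r = mapped >> k, mapped & ((1 << k) - 1)
        return "1" * q + "0" + format(r, f"0{k}b")

    def compress_band(current_band, previous_band, k=3):
        # Predict each sample from the co-located sample in the previous band.
        bits = []
        for cur_row, prev_row in zip(current_band, previous_band):
            for cur, prev in zip(cur_row, prev_row):
                bits.append(rice_encode(cur - prev, k))
        return "".join(bits)

    band0 = [[10, 12], [11, 13]]
    band1 = [[11, 12], [12, 15]]
    print(compress_band(band1, band0))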

    Improving 3D Wavelet-Based Compression of Hyperspectral Images

    Two methods of increasing the effectiveness of three-dimensional (3D) wavelet-based compression of hyperspectral images have been developed. (As used here, "images" signifies both images and digital data representing images.) The methods are oriented toward reducing or eliminating detrimental effects of a phenomenon, referred to as spectral ringing, that is described below. In 3D wavelet-based compression, an image is represented by a multiresolution wavelet decomposition consisting of several subbands obtained by applying wavelet transforms in the two spatial dimensions corresponding to the two spatial coordinate axes of the image plane, and by applying wavelet transforms in the spectral dimension. Spectral ringing is named after the more familiar spatial ringing (spurious spatial oscillations) that can be seen parallel to and near edges in ordinary images reconstructed from compressed data. These ringing phenomena are attributable to effects of quantization. In hyperspectral data, the individual spectral bands play the role of edges, causing spurious oscillations to occur in the spectral dimension. In the absence of such corrective measures as the present two methods, spectral ringing can manifest itself as systematic biases in some reconstructed spectral bands and can reduce the effectiveness of compression of spatially low-pass subbands. One of the two methods is denoted mean subtraction. The basic idea of this method is to subtract mean values from spatial planes of spatially low-pass subbands prior to encoding, because (a) such spatial planes often have mean values that are far from zero and (b) zero-mean data are better suited for compression by methods that are effective for subbands of two-dimensional (2D) images. In this method, after the 3D wavelet decomposition is performed, mean values are computed for and subtracted from each spatial plane of each spatially low-pass subband. The resulting data are converted to sign-magnitude form and compressed in a manner similar to that of a baseline hyperspectral-image-compression method. The mean values are encoded in the compressed bit stream and added back to the data at the appropriate decompression step. The overhead incurred by encoding the mean values (only a few bits per spectral band) is negligible with respect to the huge size of a typical hyperspectral data set. The other method is denoted modified decomposition. This method is so named because it involves a modified version of a commonly used multiresolution wavelet decomposition, known in the art as the 3D Mallat decomposition, in which (a) the first of multiple stages of a 3D wavelet transform is applied to the entire dataset and (b) subsequent stages are applied only to the horizontally-, vertically-, and spectrally-low-pass subband from the preceding stage. In the modified decomposition, in stages after the first, not only is the spatially-low-pass, spectrally-low-pass subband further decomposed, but also the spatially-low-pass, spectrally-high-pass subbands are further decomposed spatially. Either method can be used alone to improve the quality of a reconstructed image (see figure). Alternatively, the two methods can be combined by first performing modified decomposition, then subtracting the mean values from spatial planes of spatially low-pass subbands.
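
    The mean-subtraction method reduces, in code, to computing one mean per spatial plane of a spatially low-pass subband, subtracting it before encoding, and adding it back after decoding. The sketch below assumes the subband is held as a NumPy array indexed (spectral band, row, column); the array layout and random test data are assumptions for illustration, not the actual implementation.

    import numpy as np

    def subtract_plane_means(subband):
        # One mean per spatial plane, i.e. per spectral band of the subband.
        means = subband.mean(axis=(1, 2), keepdims=True)
        return subband - means, means.squeeze()     # means go into the bit stream

    def add_plane_means(zero_mean_subband, means):
        # Decoder side: restore the means at the appropriate decompression step.
        return zero_mean_subband + means[:, None, None]

    subband = np.random.default_rng(0).normal(50.0, 5.0, size=(4, 8, 8))
    centered, means = subtract_plane_means(subband)
    restored = add_plane_means(centered, means)
    assert np.allclose(restored, subband)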

    ICER-3D Hyperspectral Image Compression Software

    Software has been developed to implement the ICER-3D algorithm. ICER-3D effects progressive, three-dimensional (3D), wavelet-based compression of hyperspectral images. If a compressed data stream is truncated, the progressive nature of the algorithm enables reconstruction of hyperspectral data at fidelity commensurate with the given data volume. The ICER-3D software is capable of providing either lossless or lossy compression, and incorporates an error-containment scheme to limit the effects of data loss during transmission. The compression algorithm, which was derived from the ICER image compression algorithm, includes wavelet-transform, context-modeling, and entropy-coding subalgorithms. The 3D wavelet decomposition structure used by ICER-3D exploits correlations in all three dimensions of sets of hyperspectral image data, while facilitating elimination of spectral ringing artifacts, using a technique summarized in "Improving 3D Wavelet-Based Compression of Hyperspectral Images" (NPO-41381), NASA Tech Briefs, Vol. 33, No. 3 (March 2009), page 7a. Correlation is further exploited by a context-modeling subalgorithm, which exploits spectral dependencies in the wavelet-transformed hyperspectral data, using an algorithm that is summarized in "Context Modeler for Wavelet Compression of Hyperspectral Images" (NPO-43239), which follows this article. An important feature of ICER-3D is a scheme for limiting the adverse effects of loss of data during transmission. In this scheme, as in the similar scheme used by ICER, the spatial-frequency domain is partitioned into rectangular error-containment regions. In ICER-3D, the partitions extend through all the wavelength bands. The data in each partition are compressed independently of those in the other partitions, so that loss or corruption of data from any partition does not affect the other partitions. Furthermore, because compression is progressive within each partition, when data are lost, any data from that partition received prior to the loss can be used to reconstruct that partition at lower fidelity. By virtue of the compression improvement it achieves relative to previous means of onboard data compression, this software enables (1) increased return of hyperspectral scientific data in the presence of limits on the rates of transmission of data from spacecraft to Earth via radio communication links and/or (2) reduction in spacecraft radio-communication power and/or cost through reduction in the amounts of data required to be downlinked and stored onboard prior to downlink. The software is also suitable for compressing hyperspectral images for ground storage or archival purposes.
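
    The error-containment scheme can be pictured as a spatial grid of rectangular partitions, each extending through every wavelength band and compressed independently, so that a transmission loss affects only its own partition. The sketch below illustrates the partitioning step only; the 2x2 grid, the cube dimensions, and the placeholder compressor are assumptions for illustration, not ICER-3D's actual parameters.

    import numpy as np

    def partition_cube(cube, rows=2, cols=2):
        # cube is indexed (band, row, column); split only the two spatial axes,
        # so each partition extends through all wavelength bands.
        parts = []
        for row_block in np.array_split(cube, rows, axis=1):
            parts.extend(np.array_split(row_block, cols, axis=2))
        return parts

    def compress_partition(partition):
        return partition.tobytes()            # placeholder for the real coder

    cube = np.zeros((224, 512, 512), dtype=np.int16)   # bands x rows x columns
    streams = [compress_partition(p) for p in partition_cube(cube)]
    # Losing one stream leaves the other partitions fully reconstructable.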