28 research outputs found

    LDEB -- Label Digitization with Emotion Binarization and Machine Learning for Emotion Recognition in Conversational Dialogues

    Emotion recognition in conversations (ERC) is vital to the advancement of conversational AI and its applications. Therefore, the development of an automated ERC model using the concepts of machine learning (ML) would be beneficial. However, conversational dialogues present a unique problem: each dialogue depicts nested emotions that entangle the association between the emotional feature descriptors and the emotion type (or label). This entanglement, which can be compounded by data paucity, is an obstacle for an ML model. To overcome this problem, we proposed a novel approach called Label Digitization with Emotion Binarization (LDEB) that disentangles the twists by utilizing text normalization and 7-bit digital encoding techniques and constructs a meaningful feature space on which an ML model can be trained. We also utilized the publicly available FETA-DailyDialog dataset for feature learning and developed a hierarchical ERC model using random forest (RF) and artificial neural network (ANN) classifiers. Simulations showed that the ANN-based ERC model was able to predict emotions with best accuracy and precision scores of about 74% and 76%, respectively. Simulations also showed that the ANN-based model could reach a training accuracy score of about 98% with 60 epochs. On the other hand, the RF-based ERC model was able to predict emotions with best accuracy and precision scores of about 78% and 75%, respectively. Comment: 10 pages, 3 figures, 4 tables

    A Software Engineering Schema for Data Intensive Applications

    The features developed by a software engineer (system specification) for a software system may differ significantly from the features required by a user (user requirements) for their envisioned system. These discrepancies generally result from the complexity of the system, the vagueness of the user requirements, or the lack of knowledge and experience of the software engineer. The principles of software engineering and the recommendations of the ACM's Software Engineering Education Knowledge (SEEK) document can provide solutions to minimize these discrepancies and, in turn, improve the quality of a software system and increase user satisfaction. In this paper, a software development framework, called SETh, is presented. The SETh framework consists of a set of visual models that support software engineering education and practices in a systematic manner. It also enables backward and forward tracking/tracing capabilities, two important concepts that can facilitate greenfield and evolutionary software engineering projects. The SETh framework tightly connects every step of the development of a software system; hence, learners and experienced software engineers can study, understand, and build efficient software systems for emerging data science applications.

    Optimization: A Journal of Mathematical Programming and Operations Research

    In this article, we study support vector machine (SVM) classifiers in the face of uncertain knowledge sets and show how data uncertainty in knowledge sets can be treated in SVM classification by employing robust optimization. We present knowledge-based SVM classifiers with uncertain knowledge sets using convex quadratic optimization duality. We show that the knowledge-based SVM, where prior knowledge is in the form of uncertain linear constraints, results in an uncertain convex optimization problem with a set containment constraint. Using a new extension of Farkas' lemma, we reformulate the robust counterpart of the uncertain convex optimization problem in the case of interval uncertainty as a convex quadratic optimization problem. We then reformulate the resulting convex optimization problem as a simple quadratic optimization problem with non-negativity constraints using Lagrange duality. We obtain the solution of the converted problem by a fixed-point iterative algorithm and establish the convergence of the algorithm. Finally, we present some preliminary results of our computational experiments with the method.

    Logistic Map-Based Fragile Watermarking for Pixel Level Tamper Detection and Resistance

    An efficient fragile image watermarking technique for pixel-level tamper detection and resistance is proposed. It uses the five most significant bits of each pixel to generate watermark bits and embeds them in the three least significant bits. The proposed technique uses a logistic map and takes advantage of its sensitivity to small changes in the initial condition. At the same time, it incorporates the confusion/diffusion and hashing techniques used in many cryptographic systems to resist tampering at the pixel level as well as at the block level. This paper also presents two new approaches called the nonaggressive and aggressive tamper detection algorithms. Simulations show that the proposed technique can provide more than 99.39% tamper detection capability with less than 2.31% false-positive detection and less than 0.61% false-negative detection responses.
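
    The general idea of deriving watermark bits from the five most significant bits and a logistic-map keystream, then storing them in the three least significant bits, can be sketched as follows. This is only an illustrative toy under stated assumptions, not the paper's algorithm: the mixing function, the key values `x0` and `r`, and all function names are hypothetical, and the confusion/diffusion and hashing stages mentioned in the abstract are omitted.

```python
def logistic_stream(x0, r, n):
    """Return n successive values of the logistic map x -> r * x * (1 - x)."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def watermark_bits(msb5, chaos):
    """Derive 3 watermark bits from a pixel's 5 MSBs and one chaotic value.

    The mixing below is an arbitrary illustrative choice (an assumption),
    not the generator described in the paper.
    """
    k = int(chaos * 256) & 0xFF          # quantize the chaotic value to a byte
    return ((msb5 * 7 + k) ^ (k >> 3)) & 0b111


def embed(pixels, x0, r):
    """Overwrite the 3 LSBs of each 8-bit pixel with its watermark bits."""
    stream = logistic_stream(x0, r, len(pixels))
    return [(p & 0xF8) | watermark_bits(p >> 3, c)
            for p, c in zip(pixels, stream)]


def detect(pixels, x0, r):
    """Return indices of pixels whose stored LSBs do not match the expected bits."""
    stream = logistic_stream(x0, r, len(pixels))
    return [i for i, (p, c) in enumerate(zip(pixels, stream))
            if (p & 0b111) != watermark_bits(p >> 3, c)]
```

    In this toy version, `embed` leaves the five most significant bits untouched, and `detect` flags any pixel whose three least significant bits no longer match what the key (`x0`, `r`) and the pixel's upper bits predict. A change confined to the upper bits can still go unnoticed when the regenerated bits happen to coincide, which is one source of false negatives in schemes of this kind.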

    No-reference visually significant blocking artifact metric for natural scene images

    Quantifying visually annoying blocking artifacts is essential for image and video quality assessment. This paper presents a no-reference technique that uses the multiple neural channels of the human visual system (HVS) to quantify visual impairment by altering the outputs of these sensory channels independently using the statistical “standard score” formula in the Fourier domain. It also uses the bit patterns of the least significant bits (LSB) to extract blocking artifacts. Simulation results show that the blocking artifacts extracted using this approach follow the subjective visual interpretation of blocking artifacts. This paper also presents a visually significant blocking artifact metric (VSBAM) along with some experimental results.

    Characterization of Differentially Private Logistic Regression

    The purpose of this paper is to present an approach that can help data owners select suitable values for the privacy parameter of a differentially private logistic regression (DPLR), whose main intention is to achieve a balance between privacy strength and classification accuracy. The proposed approach implements a supervised learning technique and a feature extraction technique to address this challenging problem and generate solutions. The supervised learning technique selects subspaces from a training data set and generates DPLR classifiers for a range of values of the privacy parameter. The feature extraction technique transforms an original subspace into a differentially private subspace by querying the original subspace multiple times using the DPLR model and the privacy parameter values that were selected by the supervised learning module. The proposed approach then employs a signal processing measure called the signal-interference-ratio to quantify the privacy level of the differentially private subspaces; hence, it allows data owners to learn the privacy level that the DPLR models can provide for a given subspace and a given classification accuracy.

    Modeling of class imbalance using an empirical approach with spambase dataset and random forest classification

    Classification of imbalanced data is an important research problem, as most of the data encountered in real-world systems is imbalanced. Recently, a representation learning technique called Synthetic Minority Over-sampling Technique (SMOTE) has been proposed to handle the imbalanced data problem. The Random Forest (RF) algorithm with SMOTE has previously been used to improve classification performance in the minority class over the majority class. Although RF with SMOTE demonstrates improved classification performance, the relationship between the classification performance and the imbalance ratio between the majority and minority classes is not well defined. Therefore, mathematical models that describe this relationship are useful, especially in the big data environment, which suffers from imbalanced data. In this paper, we propose a mathematical model using an empirical approach applied to the well-known Spambase dataset and the Random Forest classification approach, including its adoption with the SMOTE representation learning technique. We present a linear model that describes the relationship between the true positive classification rate and the imbalance ratio between the majority and minority classes. This model can help IT researchers develop better spam filter algorithms.
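
    A linear model of this kind is typically obtained by an ordinary least-squares fit of measured true positive rates against imbalance ratios. The sketch below shows such a fit in plain Python. The function names are hypothetical, and the abstract does not give the paper's coefficients or fitting procedure, so the inputs are left to the reader.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Evaluate the fitted line at x."""
    return slope * x + intercept
```

    Here `xs` would hold the imbalance ratios at which a classifier was evaluated and `ys` the corresponding true positive rates; `predict` then estimates the rate at an unseen ratio, under the assumption that the linear trend holds there.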