28 research outputs found

LDEB -- Label Digitization with Emotion Binarization and Machine Learning for Emotion Recognition in Conversational Dialogues

Authors: Dey Amitabha, Suthaharan Shan
Publication date: 03/06/2023

Emotion recognition in conversations (ERC) is vital to the advancement of conversational AI and its applications. Therefore, the development of an automated ERC model using the concepts of machine learning (ML) would be beneficial. However, conversational dialogues present a unique problem: each dialogue depicts nested emotions that entangle the association between the emotional feature descriptors and the emotion type (or label). This entanglement, which can be compounded by data paucity, is an obstacle for an ML model. To overcome this problem, we proposed a novel approach called Label Digitization with Emotion Binarization (LDEB) that disentangles the twists by utilizing text normalization and 7-bit digital encoding techniques and constructs a meaningful feature space on which an ML model can be trained. We also utilized the publicly available FETA-DailyDialog dataset for feature learning and developed a hierarchical ERC model using random forest (RF) and artificial neural network (ANN) classifiers. Simulations showed that the ANN-based ERC model was able to predict emotions with best accuracy and precision scores of about 74% and 76%, respectively. Simulations also showed that the ANN model could reach a training accuracy score of about 98% with 60 epochs. On the other hand, the RF-based ERC model was able to predict emotions with best accuracy and precision scores of about 78% and 75%, respectively.

Comment: 10 pages, 3 figures, 4 tables
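
The abstract above does not spell out the 7-bit encoding, so the following is only a guess at what digitizing an emotion label could look like: a one-hot string of seven bits, one per emotion class. The label names and their order follow the original DailyDialog emotion annotation and are an assumption here, as are the function names; the scheme used in the paper may differ.

```python
# Hypothetical label set: the seven emotion classes of DailyDialog, in an assumed order.
EMOTIONS = ["no emotion", "anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def digitize_label(emotion: str) -> str:
    """Return a 7-character bit string with a single '1' at the emotion's index."""
    index = EMOTIONS.index(emotion)  # raises ValueError for an unknown label
    return "".join("1" if i == index else "0" for i in range(len(EMOTIONS)))


def decode_label(bits: str) -> str:
    """Invert digitize_label; reject strings that are not valid one-hot codes."""
    if len(bits) != len(EMOTIONS) or bits.count("1") != 1 or set(bits) - {"0", "1"}:
        raise ValueError(f"not a one-hot code of length {len(EMOTIONS)}: {bits!r}")
    return EMOTIONS[bits.index("1")]
```

A one-hot code keeps the classes equidistant from one another, which is one common reason to prefer it over integer labels when the classes have no natural order.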

arXiv.org e-Print Archive

A Software Engineering Schema for Data Intensive Applications

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2018

The features developed by a software engineer (system specification) for a software system may differ significantly from the features required by a user (user requirements) for their envisioned system. These discrepancies generally result from the complexity of the system, the vagueness of the user requirements, or the lack of knowledge and experience of the software engineer. The principles of software engineering and the recommendations of the ACM's Software Engineering Education Knowledge (SEEK) document can provide solutions that minimize these discrepancies and, in turn, improve the quality of a software system and increase user satisfaction. In this paper, a software development framework called SETh is presented. The SETh framework consists of a set of visual models that support software engineering education and practice in a systematic manner. It also enables backward and forward tracking/tracing, two important capabilities that can facilitate greenfield and evolutionary software engineering projects. The SETh framework tightly connects every step of the development of a software system; hence, learners and experienced software engineers can study, understand, and build efficient software systems for emerging data science applications.

The University of North Carolina at Greensboro

Optimization: A Journal of Mathematical Programming and Operations Research

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2014

In this article we study support vector machine (SVM) classifiers in the face of uncertain knowledge sets and show how data uncertainty in knowledge sets can be treated in SVM classification by employing robust optimization. We present knowledge-based SVM classifiers with uncertain knowledge sets using convex quadratic optimization duality. We show that the knowledge-based SVM, where prior knowledge is in the form of uncertain linear constraints, results in an uncertain convex optimization problem with a set containment constraint. Using a new extension of Farkas' lemma, we reformulate the robust counterpart of the uncertain convex optimization problem in the case of interval uncertainty as a convex quadratic optimization problem. We then reformulate the resulting convex optimization problem as a simple quadratic optimization problem with non-negativity constraints using Lagrange duality. We obtain the solution of the converted problem by a fixed point iterative algorithm and establish the convergence of the algorithm. We finally present some preliminary results of our computational experiments of the method.

The University of North Carolina at Greensboro

Logistic Map-Based Fragile Watermarking for Pixel Level Tamper Detection and Resistance

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2010

An efficient fragile image watermarking technique for pixel-level tamper detection and resistance is proposed. It uses the five most significant bits of each pixel to generate watermark bits and embeds them in the three least significant bits. The proposed technique uses a logistic map and takes advantage of its sensitivity to small changes in the initial condition. At the same time, it incorporates the confusion/diffusion and hashing techniques used in many cryptographic systems to resist tampering at the pixel level as well as at the block level. This paper also presents two new approaches called nonaggressive and aggressive tamper detection algorithms. Simulations show that the proposed technique can provide more than 99.39% tamper detection capability with less than 2.31% false-positive detection and less than 0.61% false-negative detection responses.
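
As a rough illustration of the bit-level idea in this abstract, the sketch below derives three watermark bits from the five most significant bits of an 8-bit pixel and writes them into the three least significant bits, with a logistic-map value acting as a position-dependent key. The map parameter `r`, the seed `x0`, the use of SHA-256 for mixing, and all function names are assumptions made for illustration; the paper's confusion/diffusion scheme and its two detection algorithms are not reproduced here.

```python
import hashlib


def logistic_sequence(x0: float, r: float, n: int) -> list[float]:
    """Iterate the logistic map x_{k+1} = r * x_k * (1 - x_k) for n steps."""
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def watermark_bits(pixel: int, key: float) -> int:
    """Hypothetical 3-bit watermark computed from the 5 MSBs and a chaotic key."""
    msb5 = pixel >> 3  # top five bits of an 8-bit pixel
    digest = hashlib.sha256(f"{msb5}:{key:.12f}".encode()).digest()
    return digest[0] & 0b111  # keep three bits of the hash


def embed(pixels: list[int], x0: float = 0.3141, r: float = 3.99) -> list[int]:
    """Overwrite the 3 LSBs of every pixel with its watermark bits."""
    keys = logistic_sequence(x0, r, len(pixels))
    return [(p & 0b11111000) | watermark_bits(p, k) for p, k in zip(pixels, keys)]


def detect_tampered(pixels: list[int], x0: float = 0.3141, r: float = 3.99) -> list[int]:
    """Return indices whose stored LSBs disagree with the recomputed watermark."""
    keys = logistic_sequence(x0, r, len(pixels))
    return [
        i
        for i, (p, k) in enumerate(zip(pixels, keys))
        if (p & 0b111) != watermark_bits(p, k)
    ]
```

Because only three bits are checked per pixel, a change to the most significant bits can go unnoticed by chance in this toy version; schemes of this kind typically add block-level checks to reduce such misses.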

The University of North Carolina at Greensboro

No-reference visually significant blocking artifact metric for natural scene images

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2009

Quantifying visually annoying blocking artifacts is essential for image and video quality assessment. This paper presents a no-reference technique that uses the multiple neural channels of the human visual system (HVS) to quantify visual impairment by altering the outputs of these sensory channels independently using the statistical “standard score” formula in the Fourier domain. It also uses the bit patterns of the least significant bits (LSB) to extract blocking artifacts. Simulation results show that the blocking artifacts extracted using this approach follow the subjective visual interpretation of blocking artifacts. This paper also presents a visually significant blocking artifact metric (VSBAM) along with some experimental results.
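
The "standard score" mentioned in this abstract is the usual z-score, z = (x - mean) / standard deviation. The sketch below applies it to the magnitudes of a one-dimensional discrete Fourier transform, purely to show the formula at work in the Fourier domain; the choice of a 1-D transform, the use of magnitudes, and the function names are assumptions, and the paper's channel decomposition is not modeled.

```python
import cmath
import statistics


def dft_magnitudes(signal: list[float]) -> list[float]:
    """Magnitudes of the discrete Fourier transform of a real 1-D signal."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def standard_scores(values: list[float]) -> list[float]:
    """Standard score z = (x - mean) / std, using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("standard score is undefined when all values are equal")
    return [(v - mean) / std for v in values]


# Made-up row of pixel intensities, used only to exercise the two functions.
row = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
z_scores = standard_scores(dft_magnitudes(row))
```

Frequencies whose magnitude stands far from the mean receive large absolute scores, which is what makes the standard score a convenient way to rescale spectral components onto a common footing.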

The University of North Carolina at Greensboro

Characterization of Differentially Private Logistic Regression

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2018

The purpose of this paper is to present an approach that can help data owners select suitable values for the privacy parameter of a differentially private logistic regression (DPLR), with the aim of achieving a balance between privacy strength and classification accuracy. The proposed approach implements a supervised learning technique and a feature extraction technique to address this challenging problem and generate solutions. The supervised learning technique selects subspaces from a training data set and generates DPLR classifiers for a range of values of the privacy parameter. The feature extraction technique transforms an original subspace into a differentially private subspace by querying the original subspace multiple times using the DPLR model and the privacy parameter values that were selected by the supervised learning module. The proposed approach then employs a signal processing measure called the signal-to-interference ratio to quantify the privacy level of the differentially private subspaces; hence, it allows data owners to learn the privacy level that the DPLR models can provide for a given subspace and a given classification accuracy.

The University of North Carolina at Greensboro

Modeling of class imbalance using an empirical approach with spambase dataset and random forest classification

Author: NC DOCKS at The University of North Carolina at Greensboro
Suthaharan Shanmugatha "Shan"
Publication date: 01/01/2014

Classification of imbalanced data is an important research problem, as most of the data encountered in real-world systems is imbalanced. Recently, a representation learning technique called Synthetic Minority Over-sampling Technique (SMOTE) has been proposed to handle the imbalanced data problem. The Random Forest (RF) algorithm with SMOTE has previously been used to improve classification performance on the minority class over the majority class. Although RF with SMOTE demonstrates improved classification performance, the relationship between the classification performance and the imbalance ratio between the majority and minority classes is not well defined. Therefore, mathematical models that describe this relationship are useful, especially in the big data environment, which suffers from imbalanced data. In this paper, we proposed a mathematical model using an empirical approach applied to the well-known Spambase dataset and the Random Forest classification approach, including its adoption with the SMOTE representation learning technique. We have presented a linear model which describes the relationship between the true positive classification rate and the imbalance ratio between the majority and minority classes. This model can help IT researchers develop better spam filter algorithms.
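
A linear model of the kind this abstract describes can be fitted with ordinary least squares once pairs of (imbalance ratio, true positive rate) have been measured. The sketch below shows only that fitting step; the definition of the ratio as majority count divided by minority count, the function names, and the absence of any Random Forest or SMOTE training are assumptions, and no coefficients from the paper are implied.

```python
def imbalance_ratio(n_majority: int, n_minority: int) -> float:
    """Assumed definition: majority-class count divided by minority-class count."""
    return n_majority / n_minority


def true_positive_rate(tp: int, fn: int) -> float:
    """Fraction of actual positives that the classifier labels as positive."""
    return tp / (tp + fn)


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

In use, each experiment at a given class ratio would contribute one point, for example `(imbalance_ratio(n_maj, n_min), true_positive_rate(tp, fn))`, and `fit_line` would then summarize how the true positive rate changes as the imbalance grows.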

The University of North Carolina at Greensboro