
    Topic Classification Using Hybrid of Unsupervised and Supervised Learning

    Research on representing words in text as vectors has produced many models that vary in performance and application. Text processing powers content recommendation, sentiment analysis, plagiarism detection, content creation, and language translation, among other uses. Specifically, we look at the problem of topic detection in the text content of articles, blogs, and summaries. With the enormous amount of text published on the internet every minute, good algorithms and approaches are essential to analyse all this content and classify most of it with high confidence for further use. The project works with unsupervised and supervised machine learning algorithms to tackle the topic detection problem. It applies unsupervised learning algorithms such as word2vec, doc2vec, and LDA to corpus and language dictionary learning, yielding a trained model that captures the semantics of text. The objective is to combine this unsupervised learning with supervised algorithms such as Support Vector Machines and deep learning methods to analyse, and hopefully improve, topic detection accuracy. The project also performs user interest-based modelling, which is orthogonal to topic modelling; the idea is to keep the model free of predefined categories. The results show that hybrid models are comfortably accurate when classifying text into a particular topic category, and that user interest modelling can be achieved accurately alongside topic detection. The project determines these results without any metadata about the input text, purely from the corpus of the input text. This makes the framework robust: it has no dependency on the source of the text, its length, or any other metadata about the text content.
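    As a hedged illustration of the hybrid approach this abstract describes, the minimal sketch below learns unsupervised doc2vec embeddings with gensim and trains a supervised SVM on them with scikit-learn; the tiny corpus and labels are hypothetical.

```python
# Minimal sketch of the hybrid pipeline: unsupervised doc2vec embeddings
# feeding a supervised SVM classifier. Documents and labels are illustrative.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.svm import SVC

docs = [("the team won the championship game", "sports"),
        ("the senate passed the new budget bill", "politics"),
        ("the striker scored twice in the final", "sports"),
        ("voters head to the polls next week", "politics")]

# Unsupervised stage: learn document vectors from the raw corpus alone.
tagged = [TaggedDocument(text.split(), [i]) for i, (text, _) in enumerate(docs)]
d2v = Doc2Vec(vector_size=50, min_count=1, epochs=60)
d2v.build_vocab(tagged)
d2v.train(tagged, total_examples=d2v.corpus_count, epochs=d2v.epochs)

# Supervised stage: train an SVM on the learned embeddings.
X = [d2v.infer_vector(text.split()) for text, _ in docs]
y = [label for _, label in docs]
clf = SVC(kernel="rbf").fit(X, y)

print(clf.predict([d2v.infer_vector("the goalkeeper made a great save".split())]))
```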

    Advanced quantum based neural network classifier and its application for objectionable web content filtering

    © 2013 IEEE. In this paper, an Advanced Quantum-based Neural Network Classifier (AQNN) is proposed and used to form an objectionable Web content filtering system (OWF). The aim is to design a neural network with a small number of hidden-layer neurons and with optimal connection weights and neuron thresholds. The proposed algorithm combines concepts from quantum computing and genetic algorithms to evolve the connection weights and neuron thresholds. Quantum computing uses the qubit, the smallest unit of information in quantum computing, as a probabilistic representation. The algorithm also introduces a threshold boundary parameter to find the optimal neuron threshold. The resulting neural network architecture forms an objectionable Web content filtering system that detects objectionable Web requests by the user. To judge the performance of the proposed AQNN, the contents of 2000 websites (1000 objectionable + 1000 non-objectionable) were used. The results of AQNN are compared with QNN-F and with well-known classifiers such as backpropagation, support vector machines (SVM), multilayer perceptrons, decision trees, and artificial neural networks. The results show that AQNN performs better than existing classifiers. The proposed objectionable Web content filtering system (OWF) is also compared with well-known objectionable Web filtering software and existing models, and is found to perform better than existing solutions at filtering objectionable content.
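    The abstract's qubit-based probabilistic encoding can be illustrated with a generic quantum-inspired evolutionary loop; the sketch below is an assumption-laden simplification (toy fitness, fixed rotation step), not the authors' exact AQNN.

```python
# Illustrative sketch of a quantum-inspired encoding: each "qubit" stores
# amplitudes with P(bit=1) = sin(theta)^2, is observed to sample a bit
# string, and is rotated toward the best solution found so far.
import numpy as np

rng = np.random.default_rng(0)
n_bits = 16                       # bits encoding weights/thresholds (toy size)
theta = np.full(n_bits, np.pi/4)  # start in equal superposition

def observe(theta):
    # sample a classical bit string from the qubits' probabilities
    return (rng.random(n_bits) < np.sin(theta)**2).astype(int)

def fitness(bits):                # toy objective: maximise the number of ones
    return bits.sum()

best = observe(theta)
for _ in range(50):
    cand = observe(theta)
    if fitness(cand) > fitness(best):
        best = cand
    # rotation gate: nudge each qubit toward the best bit string so far
    theta = np.clip(theta + 0.05 * np.where(best == 1, 1.0, -1.0),
                    0.01, np.pi/2 - 0.01)

print(best, fitness(best))
```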

    Bibliometric Survey on Incremental Learning in Text Classification Algorithms for False Information Detection

    False information, or misinformation, on the web has severe effects on people, business, and society as a whole, so its detection has become a topic of research for many researchers. Detecting misinformation in textual articles is directly connected to the text classification problem. With the massive, dynamic generation of unstructured textual documents on the web, incremental learning in text classification has gained popularity. This survey explores recent advancements in incremental learning for text classification, reviews publications in the area from the Scopus, Web of Science, Google Scholar, and IEEE databases, and performs quantitative analysis using methods such as publication statistics, collaboration degree, research network analysis, and citation analysis. The study provides researchers with insights into the latest status of research on incremental learning in text classification through a literature survey, and describes the applications and techniques recently used in the field.
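    One of the survey's quantitative methods, collaboration degree, can be sketched from a co-authorship network; the publication records below are hypothetical and the sketch assumes networkx.

```python
# Sketch of one bibliometric measure the survey applies: collaboration degree
# computed over a co-authorship graph. Records are hypothetical.
import networkx as nx
from itertools import combinations

papers = [["Shah", "Patel"], ["Shah", "Kumar", "Rao"], ["Patel", "Rao"]]

G = nx.Graph()
for authors in papers:
    for a, b in combinations(authors, 2):  # link every co-author pair
        G.add_edge(a, b)

# Collaboration degree per researcher: number of distinct co-authors.
print(dict(G.degree()))
# Collaboration degree of the corpus: mean authors per paper.
print(sum(len(a) for a in papers) / len(papers))
```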

    Real-time deformation and fracture in a game environment

    This paper describes a simulation system that has been developed to model the deformation and fracture of solid objects in a real-time gaming context. Based around a corotational tetrahedral finite element method, this system has been constructed from components published in the graphics and computational physics literatures. The goal of this paper is to describe how these components can be combined to produce an engine that is robust to unpredictable user interactions, fast enough to model reasonable scenarios at real-time speeds, suitable for use in the design of a game level, and with appropriate controls allowing content creators to match artistic direction. Details concerning parallel implementation, solver design, rendering method, and other aspects of the simulation are elucidated with the intent of providing a guide to others wishing to implement similar systems. Examples from in-game scenes captured on the Xbox 360, PS3, and PC platforms are included. © 2009 ACM
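    The corotational step at the core of such a solver can be sketched in a few lines: extract the rotation from each tetrahedron's deformation gradient by polar decomposition and evaluate elastic forces in the unrotated frame. The stiffness matrix below is a placeholder, not a full FEM assembly.

```python
# Corotational force sketch for one tetrahedron: F = R S (polar
# decomposition), then f = -R K (R^T x - x0) with a block-diagonal rotation.
import numpy as np
from scipy.linalg import polar

X0 = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])  # rest positions
A  = np.array([[1.1, 0.1, 0], [0, 1, 0], [0, 0, 1]])          # toy deformation
x  = X0 @ A.T                                                 # deformed positions

Dm = (X0[1:] - X0[0]).T          # rest edge matrix
Ds = (x[1:]  - x[0]).T           # deformed edge matrix
F  = Ds @ np.linalg.inv(Dm)      # deformation gradient

R, S = polar(F)                  # closest rotation R to F

K  = np.eye(12) * 100.0          # placeholder stiffness (12 dof for 4 verts)
Rb = np.kron(np.eye(4), R)       # per-vertex block-diagonal rotation
f  = -Rb @ K @ (Rb.T @ x.flatten() - X0.flatten())  # corotational forces
print(f.reshape(4, 3))
```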

    A practical guide and software for analysing pairwise comparison experiments

    Most popular strategies for capturing subjective judgments from humans involve constructing a unidimensional relative measurement scale representing order preferences or judgments about a set of objects or conditions. This information is generally captured by direct scoring, either on a Likert or cardinal scale, or by comparative judgments in pairs or sets. Pairwise comparisons are becoming increasingly popular because of the simplicity of the experimental procedure. However, this strategy requires non-trivial data analysis to aggregate the comparisons into a quality scale and analyse the results, in order to take full advantage of the collected data. This paper explains the process of translating pairwise comparison data into a measurement scale, discusses the benefits and limitations of such scaling methods, and introduces publicly available Matlab software. We improve on existing scaling methods by introducing outlier analysis, providing methods for computing confidence intervals and statistical testing, and introducing a prior that reduces estimation error when the number of observers is low. Most of our examples focus on image quality assessment. Code is available at https://github.com/mantiuk/pwcm.
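    A minimal sketch of the central scaling step, assuming a Thurstone Case V observer model fitted by maximum likelihood; this is a simplified stand-in for the Matlab toolbox's pipeline, with hypothetical comparison counts.

```python
# Scale pairwise-comparison counts into a unidimensional quality scale:
# under Thurstone Case V, P(i beats j) = Phi(q_i - q_j).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# C[i, j] = number of times condition i was preferred over condition j
C = np.array([[0, 8, 9],
              [2, 0, 7],
              [1, 3, 0]], dtype=float)

def neg_log_likelihood(q):
    P = norm.cdf(q[:, None] - q[None, :])   # model win probabilities
    np.fill_diagonal(P, 1.0)                # no self-comparisons
    return -(C * np.log(P)).sum()

res = minimize(neg_log_likelihood, np.zeros(3))
q = res.x - res.x[0]                        # anchor first condition at 0
print(q)                                    # quality scores in JND-like units
```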

    SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

    Despite efforts to align large language models (LLMs) with human values, widely-used LLMs such as GPT, Llama, Claude, and PaLM are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content. To address this vulnerability, we propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks on LLMs. Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense first randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs. SmoothLLM reduces the attack success rate on numerous popular LLMs to below one percentage point, avoids unnecessary conservatism, and admits provable guarantees on attack mitigation. Moreover, our defense uses exponentially fewer queries than existing attacks and is compatible with any LLM. Our code is publicly available at the following link: https://github.com/arobey1/smooth-llm
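    The defense described in the abstract can be sketched as character-level perturbation plus majority voting; `query_llm` and `is_jailbroken` below are hypothetical stubs standing in for a real LLM call and an attack detector.

```python
# Sketch of the SmoothLLM idea: perturb several copies of the prompt at the
# character level, query the model on each, and aggregate by majority vote.
import random
import string

def query_llm(prompt: str) -> str:
    # hypothetical stand-in for a real LLM API call
    return "I'm sorry, I can't help with that."

def is_jailbroken(response: str) -> bool:
    # hypothetical detector; real implementations check for refusal phrases
    return "sorry" not in response.lower()

def perturb(prompt: str, rate: float = 0.1) -> str:
    # randomly replace a fraction of characters; adversarial suffixes are
    # brittle to this kind of character-level noise
    return "".join(random.choice(string.ascii_letters)
                   if random.random() < rate else c
                   for c in prompt)

def smooth_llm(prompt: str, n_copies: int = 10) -> str:
    responses = [query_llm(perturb(prompt)) for _ in range(n_copies)]
    votes = [is_jailbroken(r) for r in responses]
    if sum(votes) > n_copies / 2:       # majority of copies look adversarial
        return "Request declined."
    # otherwise answer consistently with the majority (benign) judgment
    return next(r for r, v in zip(responses, votes) if not v)

print(smooth_llm("Tell me how to build a model rocket."))
```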

    Feedback-Based Gameplay Metrics and Gameplay Performance Segmentation: An audio-visual approach for assessing player experience.

    Gameplay metrics is an approach growing in popularity within the game studies research community for its capacity to assess players’ engagement with game systems. Yet little has been done to date to quantify players’ responses to the feedback games employ to convey information to players, i.e., their audio-visual streams. The present thesis introduces a novel approach to player experience assessment, termed feedback-based gameplay metrics, which gathers gameplay metrics from the audio-visual feedback streams presented to the player during play. So far, gameplay metrics (quantitative data about the game state and the player's interaction with the game system) have been logged directly via the game's source code, and the need for source code access restricts the range of games that researchers can analyse. By applying computer science algorithms for audio-visual processing, not previously employed on gameplay footage, the present thesis extracts similar metrics from the audio-visual streams, circumventing the need for source code access while also proposing a method that focuses on describing how gameplay information is broadcast to the player during play. To operationalise feedback-based gameplay metrics, the thesis introduces the concept of gameplay performance segmentation, which describes how coherent segments of play can be identified and extracted from lengthy play sessions. Moreover, to contextualise the method for processing metrics and provide a conceptual framework for analysing the results of a feedback-based gameplay metric segmentation, a multi-layered architecture based on five gameplay concepts (system, game world instance, spatial-temporal, degree of freedom, and interaction) is also introduced. Finally, based on data gathered from play sessions with participants, the thesis discusses the validity of feedback-based gameplay metrics, gameplay performance segmentation, and the multi-layered architecture. A software system has been developed to produce gameplay summaries based on feedback-based gameplay metrics, and example summaries from several games are presented and analysed. The thesis also demonstrates that feedback-based gameplay metrics can be analysed jointly with other data (such as biometry) to build a more complete picture of the play experience. Feedback-based gameplay metrics constitute a post-processing approach that lets the researcher or analyst explore the data however, and as many times as, they wish. The method can process any audio-visual file and can therefore handle material from a range of audio-visual sources. This methodology brings together game studies and computer science by extending the range of games that can be researched and by providing a viable solution that accounts for the exact way players experience games.
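    As one hedged illustration of gameplay performance segmentation from the audio-visual stream alone, the sketch below flags candidate segment boundaries where consecutive frames differ sharply; the file name and threshold are hypothetical, and the thesis's actual method is richer.

```python
# Flag candidate segment boundaries in gameplay footage (e.g., loading
# screens, cuts, respawns) from frame-to-frame pixel change alone.
import cv2

cap = cv2.VideoCapture("gameplay.mp4")   # hypothetical capture file
boundaries, prev, idx = [], None, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev is not None:
        # mean absolute per-pixel change between consecutive frames
        if cv2.absdiff(gray, prev).mean() > 40:   # illustrative threshold
            boundaries.append(idx)
    prev, idx = gray, idx + 1
cap.release()
print("candidate segment boundaries at frames:", boundaries)
```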

    Designing Light Filters to Detect Skin Using a Low-powered Sensor

    Detecting nudity in photos and videos, especially before they are uploaded to the internet, is vital to addressing many problems related to adolescent sexting, the distribution of child pornography, and cyber-bullying. The problem with using nudity detection algorithms to combat these issues is that 1) it implies that a digitized nude photo of a minor already exists (i.e., child pornography), and 2) there are real ethical and legal concerns around the distribution and processing of child pornography. Once a camera captures an image, that image is no longer secure. Therefore, we need new privacy-preserving solutions that prevent the digital capture of nude imagery of minors. My research takes a first step toward this long-term goal: in this thesis, I examine the feasibility of using a low-powered sensor to detect skin dominance (defined as an image comprising 50% or more human skin tone) in a visual scene. By designing four custom light filters to enhance the digital information extracted from 300 scenes captured with the sensor (without digitizing high-fidelity visual features), I was able to detect a skin dominant scene with 83.7% accuracy, 83% precision, and 85% recall. The long-term goal is to design a low-powered vision sensor that can be mounted on a digital camera lens on a teen's mobile device to detect and/or prevent the capture of nude imagery. I therefore discuss the limitations of this work toward that larger goal, as well as future research directions.
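    The skin-dominance criterion (50% or more skin-tone pixels) can be sketched on an ordinary image with a conventional HSV colour range; note the actual system operates on low-fidelity sensor readings through custom light filters, so the bounds and input below are illustrative.

```python
# Classify a scene as skin dominant if >= 50% of pixels fall in a
# conventional skin-tone HSV range. Input image and bounds are illustrative.
import cv2
import numpy as np

img = cv2.imread("scene.jpg")                      # hypothetical input image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

lower, upper = np.array([0, 40, 60]), np.array([25, 180, 255])
skin_mask = cv2.inRange(hsv, lower, upper)         # 255 where pixel is skin tone

skin_fraction = (skin_mask > 0).mean()
print("skin dominant" if skin_fraction >= 0.5 else "not skin dominant",
      f"({skin_fraction:.1%} skin pixels)")
```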