
525 research outputs found

VGM-RNN: Recurrent Neural Networks for Video Game Music Generation

Author: Mauthes Nicolas
Publication venue: SJSU ScholarWorks
Publication date: 01/04/2018

The recent explosion of interest in deep neural networks has affected and in some cases reinvigorated work in fields as diverse as natural language processing, image recognition, speech recognition and many more. For sequence learning tasks, recurrent neural networks and in particular LSTM-based networks have shown promising results. Recently there has been interest – for example in the research by Google’s Magenta team – in applying so-called “language modeling” recurrent neural networks to musical tasks, including for the automatic generation of original music. In this work we demonstrate our own LSTM-based music language modeling recurrent network. We show that it is able to learn musical features from a MIDI dataset and generate output that is musically interesting while demonstrating features of melody, harmony and rhythm. We source our dataset from VGMusic.com, a collection of user-submitted MIDI transcriptions of video game songs, and attempt to generate output which emulates this kind of music

SJSU ScholarWorks

A Framework for Devanagari Script-based Captcha

Author: Rao M. Kameswara
Yalamanchili Sushma
Publication date: 01/01/2011

Human Interactive Proofs (HIPs) are automatic reverse Turing tests designed to distinguish between various groups of users. A Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is a HIP system that distinguishes between humans and malicious computer programs. Many CAPTCHAs have been proposed in the literature, including text- or graphics-based, audio-based, puzzle-based and mathematical question-based schemes. The design and implementation of CAPTCHAs fall in the realm of Artificial Intelligence. We aim to utilize CAPTCHAs as a tool to improve the security of Internet-based applications. In this paper we present a framework for a text-based CAPTCHA based on Devanagari script which can exploit the difference in reading proficiency between humans and computer programs. Our selection of a Devanagari script-based CAPTCHA is based on the fact that the script is used by a large number of Indian languages, including Hindi, which is the third most spoken language. There is potential for an exponential rise in the applications that are likely to be developed in that script, thereby making it easy to secure Indian language-based applications. Comment: 10 pages, 8 Figures, CCSEA 2011 - First International Conference, Chennai, July 15-17, 2011

arXiv.org e-Print Archive

Crossref

Embedded noninteractive continuous bot detection

Author: Govindaraju Venu
Yampolskiy Roman V.
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/03/2008

Multiplayer online computer games are quickly growing in popularity, with millions of players logging in every day. While most play in accordance with the rules set up by the game designers, some choose to utilize artificially intelligent assistant programs, a.k.a. bots, to gain an unfair advantage over other players. In this article we demonstrate how an embedded noninteractive test can be used to prevent automatic artificially intelligent players from illegally participating in online game-play. Our solution has numerous advantages over traditional tests, such as its nonobtrusive nature, continuous verification, and simple noninteractive and outsourcing-proof design. © 2008 ACM

University of Louisville

Evaluating the usability and security of a video CAPTCHA

Author: Kluever Kurt Alfred
Publication venue: RIT Scholar Works
Publication date: 01/01/2008

A CAPTCHA is a variation of the Turing test, in which a challenge is used to distinguish humans from computers ('bots') on the internet. CAPTCHAs are commonly used to prevent the abuse of online services. They discriminate using hard artificial intelligence problems: the most common type requires a user to transcribe distorted characters displayed within a noisy image. Unfortunately, many users find them frustrating, and break rates as high as 60% have been reported (for Microsoft's Hotmail). We present a new CAPTCHA in which users provide three words ('tags') that describe a video. A challenge is passed if a user's tag belongs to a set of automatically generated ground-truth tags. In an experiment, we were able to increase human pass rates for our video CAPTCHAs from 69.7% to 90.2% (184 participants over 20 videos). Under the same conditions, the pass rate for an attack submitting the three most frequent tags (estimated over 86,368 videos) remained nearly constant (5% over the 20 videos, roughly 12.9% over a separate sample of 5146 videos). Challenge videos were taken from YouTube.com. For each video, 90 tags were added from related videos to the ground-truth set; security was maintained by pruning all tags with a frequency ≥ 0.6%. Tag stemming and approximate matching were also used to increase human pass rates. Only 20.1% of participants preferred text-based CAPTCHAs, while 58.2% preferred our video-based alternative. Finally, we demonstrate how our technique for extending the ground-truth tags allows for different usability/security trade-offs, and discuss how it can be applied to other types of CAPTCHAs.
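
The grading rule described in this abstract (a challenge passes if any submitted tag matches a ground-truth tag, after pruning frequent tags and applying stemming and approximate matching) can be illustrated with a minimal sketch. This is not the author's implementation: the function names, the crude suffix stripping used in place of a real stemmer, the `difflib` similarity cutoff of 0.8, and the choice to prune tags at or above the 0.6% frequency are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(tag: str) -> str:
    """Lowercase a tag and strip one common suffix (a crude stand-in for stemming)."""
    tag = tag.strip().lower()
    for suffix in ("ing", "es", "s"):
        # Only strip when at least three characters remain.
        if tag.endswith(suffix) and len(tag) - len(suffix) >= 3:
            return tag[: -len(suffix)]
    return tag


def prune_frequent(tags, tag_frequency, threshold=0.006):
    """Drop ground-truth tags whose corpus frequency is at or above the threshold.

    The 0.6% figure comes from the abstract; treating it as an inclusive
    cutoff is an assumption of this sketch.
    """
    return {t for t in tags if tag_frequency.get(t, 0.0) < threshold}


def passes(user_tags, ground_truth, min_ratio=0.8):
    """Return True if any submitted tag approximately matches a ground-truth tag.

    min_ratio is a hypothetical similarity cutoff, not a value from the source.
    """
    truth = {normalize(t) for t in ground_truth}
    for tag in user_tags:
        candidate = normalize(tag)
        if any(SequenceMatcher(None, candidate, t).ratio() >= min_ratio for t in truth):
            return True
    return False
```

Lowering `min_ratio` or enlarging the ground-truth set makes the challenge easier for humans but also for a frequency-based attack, which is the usability/security trade-off the abstract refers to.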

RIT Scholar Works

CAPTCHA Types and Breaking Techniques: Design Issues, Challenges, and Future Research Directions

Author: Khan F. A.
Moqurrab S. A.
Srivastava G.
Tariq N.
Publication date: 16/07/2023

The proliferation of the Internet and mobile devices has resulted in malicious bots gaining access to genuine resources and data. Bots may instigate phishing, unauthorized access, denial-of-service, and spoofing attacks, to mention a few. Authentication and testing mechanisms that verify end-users and prohibit malicious programs from infiltrating services and data are strong defense systems against malicious bots. The Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is an authentication process that confirms the user is a human before access is granted. This paper provides an in-depth survey on CAPTCHAs and focuses on two main things: (1) a detailed discussion on various CAPTCHA types along with their advantages, disadvantages, and design recommendations, and (2) an in-depth analysis of different CAPTCHA breaking techniques. The survey is based on over two hundred studies on the subject matter conducted since 2003 to date. The analysis reinforces the need to design more attack-resistant CAPTCHAs while keeping their usability intact. The paper also highlights the design challenges and open issues related to CAPTCHAs. Furthermore, it also provides useful recommendations for breaking CAPTCHAs.

arXiv.org e-Print Archive

Artificial Intelligence Within the Creative Process of Contemporary Classical Music

Author: Laidlow Robert
Publication date: 10/01/2023

This submission consists of nine pieces of original music in addition to a reflective and critical commentary. With one exception, these pieces are each for live performance, written for ensembles and soloists of various descriptions. The exception is an audio-visual work for fixed media. These pieces were written as part of my practice-based research PhD and concern the relationship between artificial intelligence and my compositional process. They outline the development of my compositional practice, resulting in the piece Silicon for orchestra and electronics which forms a major part of this submission. The commentary details the algorithms used in the creation of this music, and the aesthetic concerns I developed through working with artificial intelligence. These include the relationship between future and past, authorship, authenticity, musical structuralism, and agency, amongst others. It also describes methods and techniques relating to specific musical elements I developed through working with AI which have had a significant impact on my work. This research builds upon the areas of research related to my own, especially contemporary classical music, creativity and its relationship to artificial intelligence, machine learning, and algorithmic music practice. It is intended to contribute to the growing field of artistic research that exists within and between these areas

E-space: Manchester Metropolitan University's Research Repository

Designing CAPTCHA Algorithm: Splitting and Rotating the Images against OCRs

Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Generation of realistic human behaviour

Author: Vougioukas Konstantinos
Publication venue: Computing, Imperial College London
Publication date: 01/08/2022

As the use of computers and robots in our everyday lives increases so does the need for better interaction with these devices. Human-computer interaction relies on the ability to understand and generate human behavioural signals such as speech, facial expressions and motion. This thesis deals with the synthesis and evaluation of such signals, focusing not only on their intelligibility but also on their realism. Since these signals are often correlated, it is common for methods to drive the generation of one signal using another. The thesis begins by tackling the problem of speech-driven facial animation and proposing models capable of producing realistic animations from a single image and an audio clip. The goal of these models is to produce a video of a target person, whose lips move in accordance with the driving audio. Particular focus is also placed on a) generating spontaneous expression such as blinks, b) achieving audio-visual synchrony and c) transferring or producing natural head motion. The second problem addressed in this thesis is that of video-driven speech reconstruction, which aims at converting a silent video into waveforms containing speech. The method proposed for solving this problem is capable of generating intelligible and accurate speech for both seen and unseen speakers. The spoken content is correctly captured thanks to a perceptual loss, which uses features from pre-trained speech-driven animation models. The ability of the video-to-speech model to run in real-time allows its use in hearing assistive devices and telecommunications. The final work proposed in this thesis is a generic domain translation system, that can be used for any translation problem including those mapping across different modalities. 
The framework is made up of two networks performing translations in opposite directions and can be successfully applied to solve diverse sets of translation problems, including speech-driven animation and video-driven speech reconstruction. Open Access

Spiral - Imperial College Digital Repository