525 research outputs found
VGM-RNN: Recurrent Neural Networks for Video Game Music Generation
The recent explosion of interest in deep neural networks has affected and in some cases reinvigorated work in fields as diverse as natural language processing, image recognition, speech recognition and many more. For sequence learning tasks, recurrent neural networks and in particular LSTM-based networks have shown promising results. Recently there has been interest – for example in the research by Google’s Magenta team – in applying so-called “language modeling” recurrent neural networks to musical tasks, including for the automatic generation of original music. In this work we demonstrate our own LSTM-based music language modeling recurrent network. We show that it is able to learn musical features from a MIDI dataset and generate output that is musically interesting while demonstrating features of melody, harmony and rhythm. We source our dataset from VGMusic.com, a collection of user-submitted MIDI transcriptions of video game songs, and attempt to generate output which emulates this kind of music
A Framework for Devanagari Script-based Captcha
Human Interactive Proofs (HIPs) are automatic reverse Turing tests designed
to distinguish between various groups of users. Completely Automatic Public
Turing test to tell Computers and Humans Apart (CAPTCHA) is a HIP system that
distinguish between humans and malicious computer programs. Many CAPTCHAs have
been proposed in the literature that text-graphical based, audio-based,
puzzle-based and mathematical questions-based. The design and implementation of
CAPTCHAs fall in the realm of Artificial Intelligence. We aim to utilize
CAPTCHAs as a tool to improve the security of Internet based applications. In
this paper we present a framework for a text-based CAPTCHA based on Devanagari
script which can exploit the difference in the reading proficiency between
humans and computer programs. Our selection of Devanagari script-based CAPTCHA
is based on the fact that it is used by a large number of Indian languages
including Hindi which is the third most spoken language. There is potential for
an exponential rise in the applications that are likely to be developed in that
script thereby making it easy to secure Indian language based applications.Comment: 10 pages, 8 Figures, CCSEA 2011 - First International Conference,
Chennai, July 15-17, 201
Embedded noninteractive continuous bot detection
Multiplayer online computer games are quickly growing in popularity, with millions of players logging in every day. While most play in accordance with the rules set up by the game designers, some choose to utilize artificially intelligent assistant programs, a.k.a. bots, to gain an unfair advantage over other players. In this article we demonstrate how an embedded noninteractive test can be used to prevent automatic artificially intelligent players from illegally participating in online game-play. Our solution has numerous advantages over traditional tests, such as its nonobtrusive nature, continuous verification, and simple noninteractive and outsourcing-proof design. © 2008 ACM
Evaluating the usability and security of a video CAPTCHA
A CAPTCHA is a variation of the Turing test, in which a challenge is used to distinguish humans from computers (`bots\u27) on the internet. They are commonly used to prevent the abuse of online services. CAPTCHAs discriminate using hard articial intelligence problems: the most common type requires a user to transcribe distorted characters displayed within a noisy image. Unfortunately, many users and them frustrating and break rates as high as 60% have been reported (for Microsoft\u27s Hotmail). We present a new CAPTCHA in which users provide three words (`tags\u27) that describe a video. A challenge is passed if a user\u27s tag belongs to a set of automatically generated ground-truth tags. In an experiment, we were able to increase human pass rates for our video CAPTCHAs from 69.7% to 90.2% (184 participants over 20 videos). Under the same conditions, the pass rate for an attack submitting the three most frequent tags (estimated over 86,368 videos) remained nearly constant (5% over the 20 videos, roughly 12.9% over a separate sample of 5146 videos). Challenge videos were taken from YouTube.com. For each video, 90 tags were added from related videos to the ground-truth set; security was maintained by pruning all tags with a frequency 0.6%. Tag stemming and approximate matching were also used to increase human pass rates. Only 20.1% of participants preferred text-based CAPTCHAs, while 58.2% preferred our video-based alternative. Finally, we demonstrate how our technique for extending the ground truth tags allows for different usability/security trade-offs, and discuss how it can be applied to other types of CAPTCHAs
CAPTCHA Types and Breaking Techniques: Design Issues, Challenges, and Future Research Directions
The proliferation of the Internet and mobile devices has resulted in
malicious bots access to genuine resources and data. Bots may instigate
phishing, unauthorized access, denial-of-service, and spoofing attacks to
mention a few. Authentication and testing mechanisms to verify the end-users
and prohibit malicious programs from infiltrating the services and data are
strong defense systems against malicious bots. Completely Automated Public
Turing test to tell Computers and Humans Apart (CAPTCHA) is an authentication
process to confirm that the user is a human hence, access is granted. This
paper provides an in-depth survey on CAPTCHAs and focuses on two main things:
(1) a detailed discussion on various CAPTCHA types along with their advantages,
disadvantages, and design recommendations, and (2) an in-depth analysis of
different CAPTCHA breaking techniques. The survey is based on over two hundred
studies on the subject matter conducted since 2003 to date. The analysis
reinforces the need to design more attack-resistant CAPTCHAs while keeping
their usability intact. The paper also highlights the design challenges and
open issues related to CAPTCHAs. Furthermore, it also provides useful
recommendations for breaking CAPTCHAs
Artificial Intelligence Within the Creative Process of Contemporary Classical Music
This submission consists of nine pieces of original music in addition to a reflective and critical commentary. With one exception, these pieces are each for live performance, written for ensembles and soloists of various descriptions. The exception is an audio-visual work for fixed media.
These pieces were written as part of my practice-based research PhD and concern the relationship between artificial intelligence and my compositional process. They outline the development of my compositional practice, resulting in the piece Silicon for orchestra and electronics which forms a major part of this submission.
The commentary details the algorithms used in the creation of this music, and the aesthetic concerns I developed through working with artificial intelligence. These include the relationship between future and past, authorship, authenticity, musical structuralism, and agency, amongst others. It also describes methods and techniques relating to specific musical elements I developed through working with AI which have had a significant impact on my work.
This research builds upon the areas of research related to my own, especially contemporary classical music, creativity and its relationship to artificial intelligence, machine learning, and algorithmic music practice. It is intended to contribute to the growing field of artistic research that exists within and between these areas
Generation of realistic human behaviour
As the use of computers and robots in our everyday lives increases so does the need for better interaction with these devices. Human-computer interaction relies on the ability to understand and generate human behavioural signals such as speech, facial expressions and motion. This thesis deals with the synthesis and evaluation of such signals, focusing not only on their intelligibility but also on their realism. Since these signals are often correlated, it is common for methods to drive the generation of one signal using another. The thesis begins by tackling the problem of speech-driven facial animation and proposing models capable of producing realistic animations from a single image and an audio clip. The goal of these models is to produce a video of a target person, whose lips move in accordance with the driving audio. Particular focus is also placed on a) generating spontaneous expression such as blinks, b) achieving audio-visual synchrony and c) transferring or producing natural head motion. The second problem addressed in this thesis is that of video-driven speech reconstruction, which aims at converting a silent video into waveforms containing speech. The method proposed for solving this problem is capable of generating intelligible and accurate speech for both seen and unseen speakers. The spoken content is correctly captured thanks to a perceptual loss, which uses features from pre-trained speech-driven animation models. The ability of the video-to-speech model to run in real-time allows its use in hearing assistive devices and telecommunications. The final work proposed in this thesis is a generic domain translation system, that can be used for any translation problem including those mapping across different modalities. The framework is made up of two networks performing translations in opposite directions and can be successfully applied to solve diverse sets of translation problems, including speech-driven animation and video-driven speech reconstruction.Open Acces
- …