Search CORE

162 research outputs found

Automated Music Slideshow Generation using Web Images Based on Lyrics

Author: 舟澤慎太郎
Publication venue: 甲藤研究室
Publication date: 05/02/2010
Field of study

Waseda University Repository

The Ethnic Lyrics Fetcher tool

Author: Carlos N Silla Jr
Murilo AP Almeida
Rafael P Ribeiro
Publication venue: Springer Nature
Publication date: 01/01/2014
Field of study

Springer - Publisher Connector

Developing App Features for Figure Drawing Practice

Author: Huynh Steven T
Publication venue: Digital WPI
Publication date: 13/12/2019
Field of study

DrawSession is a website where people can practice figure drawing; it accomplishes this by emulating life drawing sessions. The goal of this project was to develop new features for DrawSession, such as recommending poses and finding anatomical landmarks with pose estimation. These features were derived from research on deliberate practice, figure drawing, progressive web apps, and prior related work. Two observational studies were conducted in order to evaluate DrawSession with these features. Ultimately, the new features were well received, but further research about visual art teachers is recommended in order to make DrawSession more accessible to a wider audience

DigitalCommons@WPI

Second Level Agreements

Author: Lev-Aretz Yafit
Publication venue: IdeaExchange@UAkron
Publication date: 23/06/2015
Field of study

This Article analyzes in-depth a significant practice that has not been recognized in legal scholarship. Their unique structure and the way in which Second Level Agreements have developed within the relatively short time of their existence have important consequences for the various players in the copyright market...The Article also offers a normative assessment of the benefits and shortcomings of the Second Level Agreements practice...The Article then carefully looks at the future of Second Level Agreements while reviewing four potential catalysts—the shift towards premium content, the Viacom v. Google ruling, the move towards disintermediation, and the rise of noncommercial licensing system. The remainder of this Article consists of seven parts. Part II discusses the emergence of the Web 2.0 age, and offers a useful classification of UGC content. Part III demonstrates the application of current copyright law to UGC platforms by reviewing theories of secondary liability and the Digital Millennium Copyright Act’s (“DMCA”) provisions. Part IV considers the practice of tolerated use and its contribution to UGC platforms’ success and users’ ability to generate derivative content. Second Level Agreements are described and studied thoroughly in Part V. Part VI offers a normative contribution and careful prediction of the future of Second Level Agreements and copyright in the digital realm. A conclusion follows in Part VII

The University of Akron

Soundtrack recommendation for images

Author: Stupar Aleksandar
Publication venue: Fakultät 6 - Naturwissenschaftlich-Technische Fakultät I. Fachrichtung 6.2 - Informatik
Publication date: 01/01/2013
Field of study

The drastic increase in production of multimedia content has emphasized the research concerning its organization and retrieval. In this thesis, we address the problem of music retrieval when a set of images is given as input query, i.e., the problem of soundtrack recommendation for images. The task at hand is to recommend appropriate music to be played during the presentation of a given set of query images. To tackle this problem, we formulate a hypothesis that the knowledge appropriate for the task is contained in publicly available contemporary movies. Our approach, Picasso, employs similarity search techniques inside the image and music domains, harvesting movies to form a link between the domains. To achieve a fair and unbiased comparison between different soundtrack recommendation approaches, we proposed an evaluation benchmark. The evaluation results are reported for Picasso and the baseline approach, using the proposed benchmark. We further address two efficiency aspects that arise from the Picasso approach. First, we investigate the problem of processing top-K queries with set-defined selections and propose an index structure that aims at minimizing the query answering latency. Second, we address the problem of similarity search in high-dimensional spaces and propose two enhancements to the Locality Sensitive Hashing (LSH) scheme. We also investigate the prospects of a distributed similarity search algorithm based on LSH using the MapReduce framework. Finally, we give an overview of the PicasSound|a smartphone application based on the Picasso approach.Der drastische Anstieg von verfügbaren Multimedia-Inhalten hat die Bedeutung der Forschung über deren Organisation sowie Suche innerhalb der Daten hervorgehoben. In dieser Doktorarbeit betrachten wir das Problem der Suche nach geeigneten Musikstücken als Hintergrundmusik für Diashows. Wir formulieren die Hypothese, dass die für das Problem erforderlichen Kenntnisse in öffentlich zugänglichen, zeitgenössischen Filmen enthalten sind. Unser Ansatz, Picasso, verwendet Techniken aus dem Bereich der Ähnlichkeitssuche innerhalb von Bild- und Musik-Domains, um basierend auf Filmszenen eine Verbindung zwischen beliebigen Bildern und Musikstücken zu lernen. Um einen fairen und unvoreingenommenen Vergleich zwischen verschiedenen Ansätzen zur Musikempfehlung zu erreichen, schlagen wir einen Bewertungs-Benchmark vor. Die Ergebnisse der Auswertung werden, anhand des vorgeschlagenen Benchmarks, für Picasso und einen weiteren, auf Emotionen basierenden Ansatz, vorgestellt. Zusätzlich behandeln wir zwei Effizienzaspekte, die sich aus dem Picasso Ansatz ergeben. (i) Wir untersuchen das Problem der Ausführung von top-K Anfragen, bei denen die Ergebnismenge ad-hoc auf eine kleine Teilmenge des gesamten Indexes eingeschränkt wird. (ii) Wir behandeln das Problem der Ähnlichkeitssuche in hochdimensionalen Räumen und schlagen zwei Erweiterungen des Lokalitätssensitiven Hashing (LSH) Schemas vor. Zusätzlich untersuchen wir die Erfolgsaussichten eines verteilten Algorithmus für die Ähnlichkeitssuche, der auf LSH unter Verwendung des MapReduce Frameworks basiert. Neben den vorgenannten wissenschaftlichen Ergebnissen beschreiben wir ferner das Design und die Implementierung von PicassSound, einer auf Picasso basierenden Smartphone-Anwendung

Universaar

Acronym

MPG.PuRe

Upper Tag Ontology (UTO) For Integrating Social Tagging Data

Author: Ding Ying
Foo Schubert
Fried Michael
Jacob Elin K.
Toma Ioan
Yan Erjia
Publication venue: Journal of the American Society for Information Science and Technology
Publication date: 01/03/2010
Field of study

Data integration and mediation have become central concerns of information technology over the past few decades. With the advent of the Web and the rapid increases in the amount of data and the number of Web documents and users, researchers have focused on enhancing the interoperability of data through the development of metadata schemes. Other researchers have looked to the wealth of metadata generated by bookmarking sites on the Social Web. While several existing ontologies have capitalized on the semantics of metadata created by tagging activities, the Upper Tag Ontology (UTO) emphasizes the structure of tagging activities to facilitate modeling of tagging data and the integration of data from different bookmarking sites as well as the alignment of tagging ontologies. UTO is described and its utility in modeling, harvesting, integrating, searching, and analyzing data is demonstrated with metadata harvested from three major social tagging systems (Delicious, Flickr, and YouTube)

IUScholarWorks (University of Indiana)

Multimodal Video Analysis and Modeling

Author: Roininen Mikko
Publication venue: Tampere University of Technology
Publication date: 01/01/2016
Field of study

From recalling long forgotten experiences based on a familiar scent or on a piece of music, to lip reading aided conversation in noisy environments or travel sickness caused by mismatch of the signals from vision and the vestibular system, the human perception manifests countless examples of subtle and effortless joint adoption of the multiple senses provided to us by evolution. Emulating such multisensory (or multimodal, i.e., comprising multiple types of input modes or modalities) processing computationally offers tools for more effective, efficient, or robust accomplishment of many multimedia tasks using evidence from the multiple input modalities. Information from the modalities can also be analyzed for patterns and connections across them, opening up interesting applications not feasible with a single modality, such as prediction of some aspects of one modality based on another. In this dissertation, multimodal analysis techniques are applied to selected video tasks with accompanying modalities. More speciﬁcally, all the tasks involve some type of analysis of videos recorded by non-professional videographers using mobile devices.Fusion of information from multiple modalities is applied to recording environment classiﬁcation from video and audio as well as to sport type classiﬁcation from a set of multi-device videos, corresponding audio, and recording device motion sensor data. The environment classiﬁcation combines support vector machine (SVM) classiﬁers trained on various global visual low-level features with audio event histogram based environment classiﬁcation using k nearest neighbors (k-NN). Rule-based fusion schemes with genetic algorithm (GA)-optimized modality weights are compared to training a SVM classiﬁer to perform the multimodal fusion. A comprehensive selection of fusion strategies is compared for the task of classifying the sport type of a set of recordings from a common event. These include fusion prior to, simultaneously with, and after classiﬁcation; various approaches for using modality quality estimates; and fusing soft conﬁdence scores as well as crisp single-class predictions. Additionally, different strategies are examined for aggregating the decisions of single videos to a collective prediction from the set of videos recorded concurrently with multiple devices. In both tasks multimodal analysis shows clear advantage over separate classiﬁcation of the modalities.Another part of the work investigates cross-modal pattern analysis and audio-based video editing. This study examines the feasibility of automatically timing shot cuts of multi-camera concert recordings according to music-related cutting patterns learnt from professional concert videos. Cut timing is a crucial part of automated creation of multicamera mashups, where shots from multiple recording devices from a common event are alternated with the aim at mimicing a professionally produced video. In the framework, separate statistical models are formed for typical patterns of beat-quantized cuts in short segments, differences in beats between consecutive cuts, and relative deviation of cuts from exact beat times. Based on music meter and audio change point analysis of a new recording, the models can be used for synthesizing cut times. In a user study the proposed framework clearly outperforms a baseline automatic method with comparably advanced audio analysis and wins 48.2 % of comparisons against hand-edited videos

Trepo - Institutional Repository of Tampere University

Cultural Context-Aware Models and IT Applications for the Exploitation of Musical Heritage

Author: Pretto Niccolò
Publication venue
Publication date: 30/11/2018
Field of study

Information engineering has always expanded its scope by inspiring innovation in different scientific disciplines. In particular, in the last sixty years, music and engineering have forged a strong connection in the discipline known as “Sound and Music Computing”. Musical heritage is a paradigmatic case that includes several multi-faceted cultural artefacts and traditions. Several issues arise from the analog-digital transfer of cultural objects, concerning their creation, preservation, access, analysis and experiencing. The keystone is the relationship of these digitized cultural objects with their carrier and cultural context. The terms “cultural context” and “cultural context awareness” are delineated, alongside the concepts of contextual information and metadata. Since they maintain the integrity of the object, its meaning and cultural context, their role is critical. This thesis explores three main case studies concerning historical audio recordings and ancient musical instruments, aiming to delineate models to preserve, analyze, access and experience the digital versions of these three prominent examples of musical heritage. The first case study concerns analog magnetic tapes, and, in particular, tape music, a particular experimental music born in the second half of the XX century. This case study has relevant implications from the musicology, philology and archivists’ points of view, since the carrier has a paramount role and the tight connection with its content can easily break during the digitization process or the access phase. With the aim to help musicologists and audio technicians in their work, several tools based on Artificial Intelligence are evaluated in tasks such as the discontinuity detection and equalization recognition. By considering the peculiarities of tape music, the philological problem of stemmatics in digitized audio documents is tackled: an algorithm based on phylogenetic techniques is proposed and assessed, confirming the suitability of these techniques for this task. Then, a methodology for a historically faithful access to digitized tape music recordings is introduced, by considering contextual information and its relationship with the carrier and the replay device. Based on this methodology, an Android app which virtualizes a tape recorder is presented, together with its assessment. Furthermore, two web applications are proposed to faithfully experience digitized 78 rpm discs and magnetic tape recordings, respectively. Finally, a prototype of web application for musicological analysis is presented. This aims to concentrate relevant part of the knowledge acquired in this work into a single interface. The second case study is a corpus of Arab-Andalusian music, suitable for computational research, which opens new opportunities to musicological studies by applying data-driven analysis. The description of the corpus is based on the five criteria formalized in the CompMusic project of the University Pompeu Fabra of Barcelona: purpose, coverage, completeness, quality and re-usability. Four Jupyter notebooks were developed with the aim to provide a useful tool for computational musicologists for analyzing and using data and metadata of such corpus. The third case study concerns an exceptional historical musical instrument: an ancient Pan flute exhibited at the Museum of Archaeological Sciences and Art of the University of Padova. The final objective was the creation of a multimedia installation to valorize this precious artifact and to allow visitors to interact with the archaeological find and to learn its history. The case study provided the opportunity to study a methodology suitable for the valorization of this ancient musical instrument, but also extendible to other artifacts or museum collections. Both the methodology and the resulting multimedia installation are presented, followed by the assessment carried out by a multidisciplinary group of experts

Archivio istituzionale della ricerca - Università di Padova

The Ithacan, 1999-03-25

Author: Ithaca College
Publication venue: Digital Commons IC
Publication date: 25/03/1999
Field of study

https://digitalcommons.ithaca.edu/ithacan_1998-99/1022/thumbnail.jp

Ithaca College