
    Automated analysis of musical structure

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 93-96).
    Listening to music and perceiving its structure is a fairly easy task for humans, even for listeners without formal musical training. For example, we can notice changes of notes, chords and keys, though we might not be able to name them (segmentation based on tonality and harmonic analysis); we can parse a musical piece into phrases or sections (segmentation based on recurrent structural analysis); we can identify and memorize the main themes or the catchiest parts - hooks - of a piece (summarization based on hook analysis); and we can detect the most informative musical parts for making certain judgments (detection of salience for classification). However, building computational models to mimic these processes is a hard problem. Furthermore, the amount of digital music that has been generated and stored has already become unfathomable, and how to efficiently store and retrieve this digital content is an important real-world problem. This dissertation presents our research on automatic music segmentation, summarization and classification using a framework combining music cognition, machine learning and signal processing. It will inquire scientifically into the nature of human perception of music, and offer a practical solution to difficult problems of machine intelligence for automatic musical content analysis and pattern discovery. Specifically, for segmentation, an HMM-based approach will be used for key change and chord change detection, and a method for detecting the self-similarity property using approximate pattern matching will be presented for recurrent structural analysis. For summarization, we will investigate the locations where the catchiest parts of a musical piece normally appear and develop strategies for automatically generating music thumbnails based on this analysis. For musical salience detection, we will examine methods for weighting the importance of musical segments based on the confidence of classification. Two classification techniques and their definitions of confidence will be explored. The effectiveness of all our methods will be demonstrated by quantitative evaluations and/or human experiments on complex real-world musical stimuli.
    by Wei Chai. Ph.D.
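    As a rough illustration of the self-similarity idea behind recurrent structural analysis (this is not the dissertation's approximate pattern-matching method; the feature choice and the lag scoring are illustrative assumptions), the sketch below builds a cosine self-similarity matrix over a frame-level feature sequence and scores its diagonals, since a strong diagonal stripe at lag k suggests a section that repeats k frames later.

```python
import numpy as np

def self_similarity(features: np.ndarray) -> np.ndarray:
    """Cosine self-similarity matrix for a (num_frames, num_dims) feature sequence."""
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    return unit @ unit.T  # (num_frames, num_frames), values in [-1, 1]

def repetition_lag(ssm: np.ndarray, min_lag: int = 8) -> int:
    """Lag whose diagonal has the highest mean similarity; a peak suggests repetition."""
    n = ssm.shape[0]
    scores = [np.mean(np.diag(ssm, k=lag)) for lag in range(min_lag, n)]
    return min_lag + int(np.argmax(scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic "chroma-like" frames: section A, section B, then A repeated.
    a, b = rng.random((40, 12)), rng.random((40, 12))
    feats = np.vstack([a, b, a])
    ssm = self_similarity(feats)
    print("strongest repetition at lag", repetition_lag(ssm), "frames")  # expect ~80
```

    On real audio the feature matrix would typically contain chroma or MFCC frames rather than the synthetic sections used here.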

    A Comparative Law Perspective on Intermediaries' Direct Liability in Cloud Computing Context -- A Proposal for China

    This dissertation is motivated by two questions: How does the emergence of cloud-computing technology affect major countries’ copyright law regarding the issue of intermediaries’ direct liability? And what should the Chinese legislative body learn from those countries on this issue? Answering the first question lays a foundation for answering the second. Typically, a cloud-computing intermediary’s specific activity may pose a risk of violating a copyright holder’s right of reproduction, right of communication to the public and right of distribution. In turn, that intermediary can raise defenses under the exhaustion doctrine and the fair use doctrine. The analysis of these two topics consists of two parts. The first part examines copyright law in major countries or regional organizations such as the U.S., Japan and the European Union. The second part is an analysis of current related Chinese legislation and a proposal for China. This dissertation examines relevant international copyright treaties, major countries’ related legislative documents and related cases. It offers a thorough legal analysis of how cloud-computing technology affects copyright worldwide. The proposal at the end consists of two parts. The first part provides four general legislative recommendations for China. The second part focuses on how China’s legislature should adjust copyright owners’ exclusive rights and intermediaries’ defense theories to respond to the impact brought by cloud-computing technology.

    Adapting Copyright for the Mashup Generation


    Visualizing Music Collections Based on Metadata: Concepts, User Studies and Design Implications

    Modern digital music services and applications enable easy access to vast online and local music collections. To differentiate from their competitors, software developers should aim to design novel, interesting, entertaining, and easy-to-use user interfaces (UIs) and interaction methods for accessing the music collections. One potential approach is to replace or complement the textual lists with static, dynamic, adaptive, and/or interactive visualizations of selected musical attributes. A well-designed visualization has the potential to make interaction with a service or an application an entertaining and intuitive experience, and it can also improve the usability and efficiency of the system. This doctoral thesis belongs to the intersection of the fields of human-computer interaction (HCI), music information retrieval (MIR), and information visualization (Infovis). HCI studies the design, implementation and evaluation of interactive computing systems; MIR focuses on the different strategies for helping users seek music or music-related information; and Infovis studies the use of visual representations of abstract data to amplify cognition. The purpose of the thesis is to explore the feasibility of visualizing music collections based on three types of musical metadata: musical genre, tempo, and the release year of the music. More specifically, the research goal is to study which visual variables and structures are best suited to representing the metadata, and how the visualizations can be used in the design of novel UIs for music player applications, including music recommendation systems. The research takes a user-centered and constructive design-science approach, and covers all the different aspects of interaction design: understanding the users, the prototype design, and the evaluation. The performance of the different visualizations from the user perspective was studied in a series of online surveys with 51-104 (mostly Finnish) participants. In addition to tempo and release year, five different visualization methods (colors, icons, fonts, emoticons and avatars) for representing musical genres were investigated. Based on the results, promising ways to represent tempo include the number of objects, shapes with a varying number of corners, and y-axis location combined with some other visual variable or clear labeling. Promising ways to represent the release year include lightness and the perceived location on the z- or x-axis. In the case of genres, the most successful method was the avatars, which used elements from the other methods and required the most screen real estate. In the second part of the thesis, three interactive prototype applications (avatars, potentiometers and a virtual world) focusing on visualizing musical genres were designed and evaluated with 40-41 Finnish participants. While the concepts had great potential for complementing traditional text-based music applications, they were too simple and restricted to replace them in longer-term use. In particular, the lack of textual search functionality was seen as a major shortcoming. Based on the results of the thesis, it is possible to design recognizable, acceptable, entertaining, and easy-to-use (especially genre) visualizations, with certain limitations. Important factors include the metadata vocabulary used (e.g., the set of musical genres) and the visual variables/structures, the preferred music discovery mode, the available screen real estate, and the target culture of the visualizations.
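    As a hypothetical sketch of how such metadata-to-visual-variable mappings might be prototyped (this is not one of the thesis prototypes; the matplotlib glyph encoding, value ranges, and track data are assumptions), the snippet below maps release year to lightness and tempo to both y-axis position and the number of polygon corners, mirroring mappings the surveys found promising.

```python
import matplotlib.pyplot as plt

# Hypothetical track metadata: (title, tempo in BPM, release year).
tracks = [
    ("Track A", 70, 1975),
    ("Track B", 105, 1992),
    ("Track C", 128, 2004),
    ("Track D", 160, 2016),
]

def year_to_lightness(year, lo=1970, hi=2020):
    """Older releases darker, newer releases lighter."""
    t = (year - lo) / (hi - lo)
    return (t, t, t)  # grayscale RGB

def tempo_to_corners(bpm):
    """Faster tempo -> polygon glyph with more corners."""
    return max(3, min(10, int(bpm // 20)))

fig, ax = plt.subplots()
for i, (title, bpm, year) in enumerate(tracks):
    ax.scatter(i, bpm, s=400,
               marker=(tempo_to_corners(bpm), 0, 0),   # regular polygon marker
               color=year_to_lightness(year),
               edgecolors="black")
    ax.annotate(title, (i, bpm), textcoords="offset points", xytext=(0, 18), ha="center")

ax.set_xlabel("track")
ax.set_ylabel("tempo (BPM)")  # y-axis location also encodes tempo
plt.show()
```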

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges.

    User-centric Music Information Retrieval

    The rapid growth of the Internet and the advancements of Web technologies have made it possible for users to have access to large amounts of online music data, including music acoustic signals, lyrics, style/mood labels, and user-assigned tags. The progress has made music listening more fun, but has raised the issue of how to organize this data and, more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval, i.e., the issue of efficiently helping users locate the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre > artist > album > track to facilitate the users’ navigation. However, the intentions of the users are often hard to capture in such a simply organized structure. The users may want to listen to music of a particular mood, style or topic, and/or any songs similar to some given music samples. This motivated us to work on user-centric music retrieval systems to improve users’ satisfaction with the system. Traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search of acoustic music data by way of feature extraction algorithms and machine learning techniques. More recently, music information retrieval research has focused on utilizing other types of data, such as lyrics, user access patterns, and user-defined tags, and on targeting non-genre categories for classification, such as mood labels and styles. This dissertation focused on investigating and developing effective data mining techniques for (1) organizing and annotating music data with styles, moods and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending songs to users utilizing both content features and user access patterns.
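    A toy sketch of the third point, blending content-feature similarity with user access patterns for recommendation (this is not the dissertation's actual system; the cosine/co-listening formulation, the blending weight alpha, and the synthetic data are illustrative assumptions):

```python
import numpy as np

def content_scores(query_vec, track_feats):
    """Cosine similarity between one query track and all tracks (content features)."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    t = track_feats / (np.linalg.norm(track_feats, axis=1, keepdims=True) + 1e-12)
    return t @ q

def usage_scores(query_idx, play_matrix):
    """Co-listening scores: how often each track is played by users who played the query."""
    listeners = play_matrix[:, query_idx] > 0            # users who played the query track
    counts = play_matrix[listeners].sum(axis=0).astype(float)
    return counts / (counts.max() + 1e-12)

def recommend(query_idx, track_feats, play_matrix, alpha=0.5, k=3):
    """Blend content similarity and usage co-occurrence; return the top-k other tracks."""
    scores = (alpha * content_scores(track_feats[query_idx], track_feats)
              + (1 - alpha) * usage_scores(query_idx, play_matrix))
    scores[query_idx] = -np.inf                          # never recommend the query itself
    return np.argsort(scores)[::-1][:k]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    feats = rng.random((20, 16))                         # 20 tracks, 16-dim audio features
    plays = rng.integers(0, 3, size=(50, 20))            # 50 users x 20 tracks play counts
    print(recommend(query_idx=0, track_feats=feats, play_matrix=plays))
```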

    Integrating 3D Objects and Pose Estimation for Multimodal Video Annotations

    With recent technological advancements, video has become a focal point of many ubiquitous activities, from presenting ideas to our peers to studying specific events or simply storing relevant video clips. As a result, taking or making notes can become an invaluable tool in this process, helping us retain knowledge, document information, or simply reason about recorded content. This thesis introduces new features for a pre-existing web-based multimodal annotation tool, namely the integration of 3D components into the current system and pose estimation algorithms aimed at the moving elements in the multimedia content. The 3D developments allow the user to experience a more immersive interaction with the tool by visualizing 3D objects against either a neutral or a 360º background and then using them as traditional annotations. Mechanisms for successfully integrating these 3D models into the currently loaded video are then explored, along with a detailed overview of the use of keypoints (pose estimation) to highlight details in the same setting. The goal of this thesis is thus the development and evaluation of these features, seeking the construction of a virtual environment in which a user can successfully work on a video by combining different types of annotations.
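    As a generic sketch of the pose-estimation half of such a pipeline (this is not the thesis implementation; OpenCV, MediaPipe, and the input path are assumed purely for illustration), the snippet below reads a video, detects body keypoints per frame, and overlays them, which is the kind of keypoint layer an annotation tool could draw on the loaded video.

```python
import cv2                      # pip install opencv-python
import mediapipe as mp          # pip install mediapipe

VIDEO_PATH = "input.mp4"        # hypothetical input file

mp_pose = mp.solutions.pose
drawing = mp.solutions.drawing_utils

cap = cv2.VideoCapture(VIDEO_PATH)
with mp_pose.Pose(static_image_mode=False) as pose:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR frames.
        results = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.pose_landmarks:
            # Overlay the detected keypoints and skeleton on the frame.
            drawing.draw_landmarks(frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)
        cv2.imshow("pose-annotated frame", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```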

    Artificial Intelligence for Multimedia Signal Processing

    Artificial intelligence technologies are being actively applied to broadcasting and multimedia processing. A great deal of research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and over the past two to three years these efforts have aimed at improving the compression efficiency of image, video, speech, and other data in areas related to MPEG media processing technology. Additionally, technologies for media creation, processing, editing, and scenario creation are very important areas of research in multimedia processing and engineering. This book contains a collection of topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing: the computer vision field, speech/sound/text processing, and content analysis/information mining.