143 research outputs found

    Towards an All-Purpose Content-Based Multimedia Information Retrieval System

    Full text link
    The growth of multimedia collections - in terms of size, heterogeneity, and variety of media types - necessitates systems that are able to conjointly deal with several forms of media, especially when it comes to searching for particular objects. However, existing retrieval systems are organized in silos and treat different media types separately. As a consequence, retrieval across media types is either not supported at all or subject to major limitations. In this paper, we present vitrivr, a content-based multimedia information retrieval stack. As opposed to the keyword search approach implemented by most media management systems, vitrivr makes direct use of the object's content to facilitate different types of similarity search, such as Query-by-Example or Query-by-Sketch, for and, most importantly, across different media types - namely, images, audio, videos, and 3D models. Furthermore, we introduce a new web-based user interface that enables easy-to-use, multimodal retrieval from and browsing in mixed media collections. The effectiveness of vitrivr is shown on the basis of a user study that involves different query and media types. To the best of our knowledge, the full vitrivr stack is unique in that it is the first multimedia retrieval system that seamlessly integrates support for four different types of media. As such, it paves the way towards an all-purpose, content-based multimedia information retrieval system

    Audio Content-Based Music Retrieval

    Get PDF
    The rapidly growing corpus of digital audio material requires novel retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. In the case that such textual descriptions are not available, one requires content-based retrieval strategies which only utilize the raw audio material. In this contribution, we discuss content-based retrieval strategies that follow the query-by-example paradigm: given an audio query, the task is to retrieve all documents that are somehow similar or related to the query from a music collection. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrate the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system, which combines the various retrieval strategies in a unified framework

    Data-Driven Sound Track Generation

    Get PDF
    Background music is often used to generate a specific atmosphere or to draw our attention to specific events. For example in movies or computer games it is often the accompanying music that conveys the emotional state of a scene and plays an important role for immersing the viewer or player into the virtual environment. In view of home-made videos, slide shows, and other consumer-generated visual media streams, there is a need for computer-assisted tools that allow users to generate aesthetically appealing music tracks in an easy and intuitive way. In this contribution, we consider a data-driven scenario where the musical raw material is given in form of a database containing a variety of audio recordings. Then, for a given visual media stream, the task consists in identifying, manipulating, overlaying, concatenating, and blending suitable music clips to generate a music stream that satisfies certain constraints imposed by the visual data stream and by user specifications. It is our main goal to give an overview of various content-based music processing and retrieval techniques that become important in data-driven sound track generation. In particular, we sketch a general pipeline that highlights how the various techniques act together and come into play when generating musically plausible transitions between subsequent music clips

    Signal processing methods for beat tracking, music segmentation, and audio retrieval

    Get PDF
    The goal of music information retrieval (MIR) is to develop novel strategies and techniques for organizing, exploring, accessing, and understanding music data in an efficient manner. The conversion of waveform-based audio data into semantically meaningful feature representations by the use of digital signal processing techniques is at the center of MIR and constitutes a difficult field of research because of the complexity and diversity of music signals. In this thesis, we introduce novel signal processing methods that allow for extracting musically meaningful information from audio signals. As main strategy, we exploit musical knowledge about the signals\u27 properties to derive feature representations that show a significant degree of robustness against musical variations but still exhibit a high musical expressiveness. We apply this general strategy to three different areas of MIR: Firstly, we introduce novel techniques for extracting tempo and beat information, where we particularly consider challenging music with changing tempo and soft note onsets. Secondly, we present novel algorithms for the automated segmentation and analysis of folk song field recordings, where one has to cope with significant fluctuations in intonation and tempo as well as recording artifacts. Thirdly, we explore a cross-version approach to content-based music retrieval based on the query-by-example paradigm. In all three areas, we focus on application scenarios where strong musical variations make the extraction of musically meaningful information a challenging task.Ziel der automatisierten Musikverarbeitung ist die Entwicklung neuer Strategien und Techniken zur effizienten Organisation großer Musiksammlungen. Ein Schwerpunkt liegt in der Anwendung von Methoden der digitalen Signalverarbeitung zur Umwandlung von Audiosignalen in musikalisch aussagekräftige Merkmalsdarstellungen. Große Herausforderungen bei dieser Aufgabe ergeben sich aus der Komplexität und Vielschichtigkeit der Musiksignale. In dieser Arbeit werden neuartige Methoden vorgestellt, mit deren Hilfe musikalisch interpretierbare Information aus Musiksignalen extrahiert werden kann. Hierbei besteht eine grundlegende Strategie in der konsequenten Ausnutzung musikalischen Vorwissens, um Merkmalsdarstellungen abzuleiten die zum einen ein hohes Maß an Robustheit gegenüber musikalischen Variationen und zum anderen eine hohe musikalische Ausdruckskraft besitzen. Dieses Prinzip wenden wir auf drei verschieden Aufgabenstellungen an: Erstens stellen wir neuartige Ansätze zur Extraktion von Tempo- und Beat-Information aus Audiosignalen vor, die insbesondere auf anspruchsvolle Szenarien mit wechselnden Tempo und weichen Notenanfängen angewendet werden. Zweitens tragen wir mit neuartigen Algorithmen zur Segmentierung und Analyse von Feldaufnahmen von Volksliedern unter Vorliegen großer Intonationsschwankungen bei. Drittens entwickeln wir effiziente Verfahren zur inhaltsbasierten Suche in großen Datenbeständen mit dem Ziel, verschiedene Interpretationen eines Musikstückes zu detektieren. In allen betrachteten Szenarien richten wir unser Augenmerk insbesondere auf die Fälle in denen auf Grund erheblicher musikalischer Variationen die Extraktion musikalisch aussagekräftiger Informationen eine große Herausforderung darstellt

    Music Synchronization, Audio Matching, Pattern Detection, and User Interfaces for a Digital Music Library System

    Get PDF
    Over the last two decades, growing efforts to digitize our cultural heritage could be observed. Most of these digitization initiatives pursuit either one or both of the following goals: to conserve the documents - especially those threatened by decay - and to provide remote access on a grand scale. For music documents these trends are observable as well, and by now several digital music libraries are in existence. An important characteristic of these music libraries is an inherent multimodality resulting from the large variety of available digital music representations, such as scanned score, symbolic score, audio recordings, and videos. In addition, for each piece of music there exists not only one document of each type, but many. Considering and exploiting this multimodality and multiplicity, the DFG-funded digital library initiative PROBADO MUSIC aimed at developing a novel user-friendly interface for content-based retrieval, document access, navigation, and browsing in large music collections. The implementation of such a front end requires the multimodal linking and indexing of the music documents during preprocessing. As the considered music collections can be very large, the automated or at least semi-automated calculation of these structures would be recommendable. The field of music information retrieval (MIR) is particularly concerned with the development of suitable procedures, and it was the goal of PROBADO MUSIC to include existing and newly developed MIR techniques to realize the envisioned digital music library system. In this context, the present thesis discusses the following three MIR tasks: music synchronization, audio matching, and pattern detection. We are going to identify particular issues in these fields and provide algorithmic solutions as well as prototypical implementations. In Music synchronization, for each position in one representation of a piece of music the corresponding position in another representation is calculated. This thesis focuses on the task of aligning scanned score pages of orchestral music with audio recordings. Here, a previously unconsidered piece of information is the textual specification of transposing instruments provided in the score. Our evaluations show that the neglect of such information can result in a measurable loss of synchronization accuracy. Therefore, we propose an OCR-based approach for detecting and interpreting the transposition information in orchestral scores. For a given audio snippet, audio matching methods automatically calculate all musically similar excerpts within a collection of audio recordings. In this context, subsequence dynamic time warping (SSDTW) is a well-established approach as it allows for local and global tempo variations between the query and the retrieved matches. Moving to real-life digital music libraries with larger audio collections, however, the quadratic runtime of SSDTW results in untenable response times. To improve on the response time, this thesis introduces a novel index-based approach to SSDTW-based audio matching. We combine the idea of inverted file lists introduced by Kurth and Müller (Efficient index-based audio matching, 2008) with the shingling techniques often used in the audio identification scenario. In pattern detection, all repeating patterns within one piece of music are determined. Usually, pattern detection operates on symbolic score documents and is often used in the context of computer-aided motivic analysis. Envisioned as a new feature of the PROBADO MUSIC system, this thesis proposes a string-based approach to pattern detection and a novel interactive front end for result visualization and analysis

    On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism

    Full text link
    Barrón Cedeño, LA. (2012). On the Mono- and Cross-Language Detection of Text Re-Use and Plagiarism [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/16012Palanci

    CREATE: Concept Representation and Extraction from Heterogeneous Evidence

    Get PDF
    Traditional information retrieval methodology is guided by document retrieval paradigm, where relevant documents are returned in response to user queries. This paradigm faces serious drawback if the desired result is not explicitly present in a single document. The problem becomes more obvious when a user tries to obtain complete information about a real world entity, such as person, company, location etc. In such cases, various facts about the target entity or concept need to be gathered from multiple document sources. In this work, we present a method to extract information about a target entity based on the concept retrieval paradigm that focuses on extracting and blending information related to a concept from multiple sources if necessary. The paradigm is built around a generic notion of concept which is defined as any item that can be thought of as a topic of interest. Concepts may correspond to any real world entity such as restaurant, person, city, organization, etc, or any abstract item such as news topic, event, theory, etc. Web is a heterogeneous collection of data in different forms such as facts, news, opinions etc. We propose different models for different forms of data, all of which work towards the same goal of concept centric retrieval. We motivate our work based on studies about current trends and demands for information seeking. The framework helps in understanding the intent of content, i.e. opinion versus fact. Our work has been conducted on free text data in English. Nevertheless, our framework can be easily transferred to other languages

    Identificación de versiones musicales (covers) utilizando aprendizaje maquinal

    Get PDF
    The task of recognizing when a song is a version or cover of another is a relatively easy task to do for humans when the song is known. However, to cause that a machine perform this work is complex due to the number of variables involved in preparing the cover, including variations in rhythm, tempo, instrumentation, genre and duration compared to the original version. In this project a methodology to identify covers from the application and analysis of machine-learning techniques, statistical signal processing and second order statistics was developed, in order to get that configuration to give the best results. For this we worked with the database Dataset Million Songs that gave us the metadata of the songs, from which data belonging to the acoustic characteristics of the song, such as pitches and timbres were used. Throughout the project we experimented with different data treatment techniques applied to the metadata provided by the database and we could see its usefulness to the task at hand. According to the results, a system that integrates processing frequency on pitches aligned with the beat, the implementation of a sparse coding and a data clustering system that showed a 63% correct identification of covers was obtained. Information on the possible use of supervised learning techniques combined with different types of metrics giving rise to future experiments to improve the results was also obtained.La tarea de reconocer cuándo una canción es una versión o cover de otra es una tarea relativamente fácil para el ser humano cuando se conoce la canción. Sin embargo, hacer que una máquina realice este trabajo resulta complejo debido al número de variables que se involucran en la elaboración del cover, mismas que incluyen variaciones en el ritmo, tempo, instrumentación, género y duración con respecto a la versión original. En este proyecto se desarrolló una metodología para identificar covers a partir de la aplicación y análisis de técnicas de aprendizaje maquinal, procesamiento de señales y estadística de segundo orden con la finalidad de obtener aquella configuración que diera los mejores resultados. Para esto se trabajó con la base de datos Million Songs Dataset que nos otorgó los metadatos de las canciones, a partir de los cuales se utilizaron los datos pertenecientes a las características acústicas de la canción, tales como, pitches y timbres. A lo largo del proyecto se experimentó con diferentes técnicas de tratamiento de los metadatos que proporcionó la base de datos y se pudo apreciar su utilidad para la tarea a desarrollar. De acuerdo a los resultados obtenidos, se obtuvo un sistema que integra un procesamiento en frecuencia sobre los pitches alineados con el beat, la aplicación de una codificación rala y un sistema de agrupamiento de datos que arrojó un 63% de identificación correcta de covers. También se obtuvo información sobre el posible uso de técnicas combinadas de aprendizaje supervisado con diferentes tipos de métricas dando pie a futuras experimentaciones para mejorar los resultados

    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Get PDF
    [[abstract]]Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study adopts the statistical method and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we use the value of each node as a cut point to generate all possible rules from the tree first. Then, using t-test, we verify the rules to discover some useful description rules after all possible rules from the tree have been generated. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.[[journaltype]]國外[[incitationindex]]EI[[booktype]]紙本[[countrycodes]]FI

    Linked Data Supported Information Retrieval

    Get PDF
    Um Inhalte im World Wide Web ausfindig zu machen, sind Suchmaschienen nicht mehr wegzudenken. Semantic Web und Linked Data Technologien ermöglichen ein detaillierteres und eindeutiges Strukturieren der Inhalte und erlauben vollkommen neue Herangehensweisen an die Lösung von Information Retrieval Problemen. Diese Arbeit befasst sich mit den Möglichkeiten, wie Information Retrieval Anwendungen von der Einbeziehung von Linked Data profitieren können. Neue Methoden der computer-gestützten semantischen Textanalyse, semantischen Suche, Informationspriorisierung und -visualisierung werden vorgestellt und umfassend evaluiert. Dabei werden Linked Data Ressourcen und ihre Beziehungen in die Verfahren integriert, um eine Steigerung der Effektivität der Verfahren bzw. ihrer Benutzerfreundlichkeit zu erzielen. Zunächst wird eine Einführung in die Grundlagen des Information Retrieval und Linked Data gegeben. Anschließend werden neue manuelle und automatisierte Verfahren zum semantischen Annotieren von Dokumenten durch deren Verknüpfung mit Linked Data Ressourcen vorgestellt (Entity Linking). Eine umfassende Evaluation der Verfahren wird durchgeführt und das zu Grunde liegende Evaluationssystem umfangreich verbessert. Aufbauend auf den Annotationsverfahren werden zwei neue Retrievalmodelle zur semantischen Suche vorgestellt und evaluiert. Die Verfahren basieren auf dem generalisierten Vektorraummodell und beziehen die semantische Ähnlichkeit anhand von taxonomie-basierten Beziehungen der Linked Data Ressourcen in Dokumenten und Suchanfragen in die Berechnung der Suchergebnisrangfolge ein. Mit dem Ziel die Berechnung von semantischer Ähnlichkeit weiter zu verfeinern, wird ein Verfahren zur Priorisierung von Linked Data Ressourcen vorgestellt und evaluiert. Darauf aufbauend werden Visualisierungstechniken aufgezeigt mit dem Ziel, die Explorierbarkeit und Navigierbarkeit innerhalb eines semantisch annotierten Dokumentenkorpus zu verbessern. Hierfür werden zwei Anwendungen präsentiert. Zum einen eine Linked Data basierte explorative Erweiterung als Ergänzung zu einer traditionellen schlüsselwort-basierten Suchmaschine, zum anderen ein Linked Data basiertes Empfehlungssystem
    corecore