905 research outputs found

    Hashing for Multimedia Similarity Modeling and Large-Scale Retrieval

    In recent years, the amount of multimedia data such as images, texts, and videos has been growing rapidly on the Internet. Motivated by this trend, this thesis is dedicated to exploiting hashing-based solutions to reveal multimedia data correlations and support intra-media and inter-media similarity search among huge volumes of multimedia data. We start by investigating a hashing-based solution for audio-visual similarity modeling and apply it to the audio-visual sound source localization problem. We show that synchronized signals in audio and visual modalities demonstrate similar temporal changing patterns in certain feature spaces. We propose to use a permutation-based random hashing technique to capture the temporal order dynamics of audio and visual features by hashing them along the temporal axis into a common Hamming space. In this way, the audio-visual correlation problem is transformed into a similarity search problem in the Hamming space. Our hashing-based audio-visual similarity modeling has shown superior performance in the localization and segmentation of sounding objects in videos. The success of the permutation-based hashing method motivates us to generalize and formally define the supervised ranking-based hashing problem, and to study its application to large-scale image retrieval. Specifically, we propose an effective supervised learning procedure to learn optimized ranking-based hash functions that can be used for large-scale similarity search. Compared with the randomized version, the optimized ranking-based hash codes are much more compact and discriminative. Moreover, the method can be easily extended to kernel space to discover more complex ranking structures that cannot be revealed in linear subspaces. Experiments on large image datasets demonstrate the effectiveness of the proposed method for image retrieval. We further study the ranking-based hashing method for the cross-media similarity search problem.
Specifically, we propose two optimization methods to jointly learn two groups of linear subspaces, one for each media type, so that features' ranking orders in different linear subspaces maximally preserve the cross-media similarities. Additionally, we develop this ranking-based hashing method in the cross-media context into a flexible hashing framework with a more general solution. We have demonstrated through extensive experiments on several real-world datasets that the proposed cross-media hashing method achieves superior cross-media retrieval performance compared with several state-of-the-art algorithms. Lastly, to make better use of the supervisory label information, and to further improve the efficiency and accuracy of supervised hashing, we propose a novel multimedia discrete hashing framework that optimizes an instance-wise loss objective, rather than pairwise losses, using an efficient discrete optimization method. In addition, the proposed method decouples binary code learning and hash function learning into two separate stages, making it equally applicable to single-media and cross-media search. Extensive experiments on both single-media and cross-media retrieval tasks demonstrate the effectiveness of the proposed method.
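
The permutation-based hashing idea in the abstract above can be illustrated with a minimal sketch. It assumes a winner-take-all style scheme: each hash applies a random permutation, keeps the first few positions, and records which of them holds the largest value. The abstract does not give the thesis's exact construction, so the function names, the `window` parameter, and the toy "audio" and "visual" vectors are all hypothetical.

```python
import numpy as np

def make_permutations(dim, n_hashes, window, seed=0):
    """Draw n_hashes random permutations and keep the first `window` indices of each."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.permutation(dim)[:window] for _ in range(n_hashes)])

def wta_hash(x, perms):
    """Rank-based code: for each permutation, the position of the largest selected value."""
    x = np.asarray(x, dtype=float)
    return np.argmax(x[perms], axis=1)

def hamming_distance(code_a, code_b):
    """Number of code positions at which the two codes disagree."""
    return int(np.count_nonzero(code_a != code_b))

# Toy data (hypothetical): 16 time steps of a feature along the temporal axis.
perms = make_permutations(dim=16, n_hashes=32, window=4)
rng = np.random.default_rng(1)
audio = rng.normal(size=16)
visual = 3.0 * audio + 5.0      # different scale and offset, same temporal ordering
unrelated = rng.normal(size=16)

print(hamming_distance(wta_hash(audio, perms), wta_hash(visual, perms)))
print(hamming_distance(wta_hash(audio, perms), wta_hash(unrelated, perms)))
```

Because the codes depend only on the relative order of values, two signals that rise and fall together hash to the same code even when their scales differ, which is the property that lets correlation be treated as a similarity search over codes.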

    Probabilistic Graphical Models for ERP-Based Brain Computer Interfaces

    An event-related potential (ERP) is an electrical potential recorded from the nervous system of humans or other animals. An ERP is observed after the presentation of a stimulus. Examples of ERPs include the P300 and the N400. Although ERPs are used very often in neuroscience, their generation is not yet well understood, and different theories have been proposed to explain the phenomenon. ERPs could be generated by changes in the alpha rhythm, by an internal neural control that resets the ongoing oscillations in the brain, or by separate and distinct additive neuronal phenomena. When different repetitions of the same stimulus are averaged, a coherent addition of the oscillations is obtained, which explains the increase in signal amplitude. Two ERPs are studied most often: the N400 and the P300. N400 signals arise when a subject performs semantic operations that support neural circuits for explicit memory. N400 potentials have been observed mostly in the rhinal cortex. P300 signals are related to attention and memory operations. When a new stimulus appears, a P300 ERP (named P3a) is generated in the frontal lobe. In contrast, when a subject perceives an expected stimulus, a P300 ERP (named P3b) is generated in the temporal-parietal areas. This implies that P3a and P3b are related, suggesting a circuit pathway between the frontal and temporal-parietal regions whose existence has not been verified.
A brain-computer interface (BCI) is a communication system in which the messages or commands that a subject sends to the outside world come from brain signals rather than from peripheral nerves and muscles. A BCI uses sensorimotor rhythms or ERP signals, so a classifier is needed to distinguish between correct and incorrect stimuli. In this work, we propose using probabilistic graphical models to model the temporal and spatial dynamics of brain signals, with applications to BCIs. Graphical models were selected for their flexibility and their ability to incorporate prior information. This flexibility has previously been used to model temporal dynamics only. We expect the model to reflect some aspects of brain function related to ERPs by including both spatial and temporal information. Degree: Doctorate (Doctor of Electrical and Electronic Engineering).

    Confluence of Vision and Natural Language Processing for Cross-media Semantic Relations Extraction

    In this dissertation, we focus on extracting and understanding semantically meaningful relationships between data items of various modalities, especially relations between images and natural language. We explore ideas and techniques for integrating such cross-media semantic relations for machine understanding of large heterogeneous datasets, made available through the expansion of the World Wide Web. The datasets collected from social media websites, news media outlets, and blogging platforms usually contain multiple modalities of data. Intelligent systems are needed to automatically make sense of these datasets and present them in such a way that humans can find the relevant pieces of information or get a summary of the available material. Such systems have to process multiple modalities of data, such as images, text, linguistic features, and structured data, in reference to each other. For example, image and video search and retrieval engines are required to understand the relations between visual and textual data so that they can provide relevant answers, in the form of images and videos, to users' queries presented in the form of text. We emphasize the automatic extraction of semantic topics or concepts from data available in any form, such as images, free-flowing text, or metadata. These semantic concepts or topics become the basis of semantic relations across heterogeneous data types, e.g., visual and textual data. A classic problem involving image-text relations is the automatic generation of textual descriptions of images. This problem is the main focus of our work. In many cases, a large amount of text is associated with images. Deep exploration of the linguistic features of such text is required to fully utilize the semantic information encoded in it. A news dataset involving images and news articles is an example of this scenario.
We devise frameworks for automatic news image description generation based on the semantic relations of images, as well as semantic understanding of linguistic features of the news articles.

    Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain

    The present paper explores the technical efficiency of four hotels from the Teixeira Duarte Group, a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, all located in Portugal, is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiency in the estimation process, and thus to investigate the main causes of inefficiency. Several suggestions for improving efficiency are offered for each hotel studied.
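
The split between measurement error and systematic inefficiency described above can be made concrete with the textbook stochastic frontier specification. The abstract does not state the paper's actual functional form or distributional assumptions, so the symbols below (log output $y_i$, inputs $x_i$, the half-normal choice for $u_i$) are illustrative assumptions only.

```latex
% Generic stochastic production frontier (an assumed textbook form, not the paper's stated model)
\begin{aligned}
  y_i &= x_i^{\top}\beta + v_i - u_i, \\
  v_i &\sim \mathcal{N}(0, \sigma_v^2) && \text{(symmetric noise: measurement error)}, \\
  u_i &\sim \mathcal{N}^{+}(0, \sigma_u^2), \quad u_i \ge 0 && \text{(one-sided term: systematic inefficiency)}, \\
  \mathrm{TE}_i &= \exp(-u_i) \in (0, 1] && \text{(technical efficiency of unit } i\text{)}.
\end{aligned}
```

Under this form, a unit with $u_i = 0$ lies on the frontier ($\mathrm{TE}_i = 1$), and ranking units by their estimated $\mathrm{TE}_i$ yields an efficiency ranking of the kind the abstract describes.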

    Music Genre Classification Systems - A Computational Approach

    Technology and Testing

    From early answer sheets filled in with number 2 pencils, to tests administered by mainframe computers, to assessments wholly constructed by computers, it is clear that technology is changing the field of educational and psychological measurement. The numerous and rapid advances have an immediate impact on test creators, assessment professionals, and those who implement and analyze assessments. This comprehensive new volume brings together leading experts on the issues posed by technological applications in testing, with chapters on game-based assessment, testing with simulations, video assessment, computerized test development, large-scale test delivery, model choice, validity, and error issues. Including an overview of existing literature and ground-breaking research, each chapter addresses the technological, practical, and ethical considerations of this rapidly changing area. Ideal for researchers and professionals in testing and assessment, Technology and Testing provides a critical and in-depth look at one of the most pressing topics in educational testing today.