    Video browsing interfaces and applications: a review

    We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data—which, if presented in its raw format, is rather unwieldy and costly—have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other

    Latent variable methods for visualization through time

    Temporal contextual descriptors and applications to emotion analysis.

    Current trends in technology suggest that the next generation of services and devices will allow smarter customization and automatic context recognition. Computers learn the behavior of their users and can offer them customized services depending on context, location, and preferences. One of the most important challenges in human-machine interaction is the proper understanding of human emotions by machines and automated systems. In recent years, progress in machine learning and pattern recognition has led to the development of algorithms that learn to detect and identify human emotions from experience. These algorithms use different modalities, such as image, speech, and physiological signals, to analyze and learn human emotions. In many settings, vocal information may be more readily available than other modalities because of the widespread presence of voice sensors in phones, cars, and computer systems in general. In emotion analysis from speech, an audio utterance is represented by a time-ordered sequence of features, that is, a multivariate time series. Typically, the sequence is further mapped into a global descriptor representative of the entire utterance, and this descriptor is used for classification and analysis. In classic approaches, statistics are computed over the entire sequence and used as a global descriptor. This often results in the loss of the temporal ordering of the original sequence. Emotion is a succession of acoustic events. By discarding the temporal ordering of these events in the mapping, the classic approaches cannot detect acoustic patterns that lead to a certain emotion. In this dissertation, we propose a novel feature mapping framework. The proposed framework maps a temporally ordered sequence of acoustic features into data-driven global descriptors that integrate the temporal information from the original sequence. The framework contains three mapping algorithms. These algorithms integrate the temporal information implicitly and explicitly in the descriptor's representation. In the first algorithm, the Temporal Averaging Algorithm, we average the data temporally using leaky integrators to produce a global descriptor that implicitly integrates the temporal information from the original sequence. In order to integrate the discrimination between classes in the mapping, we propose the Temporal Response Averaging Algorithm, which combines the temporal averaging step of the previous algorithm with unsupervised learning to produce data-driven temporal contextual descriptors. In the third algorithm, we use the topology-preserving property of Self-Organizing Maps and the continuous nature of speech to map a temporal sequence into an ordered trajectory representing the behavior over time of the input utterance on a 2-D map of emotions. The temporal information is integrated explicitly in the descriptor, which makes it easier to monitor emotions in long speeches. The proposed mapping framework maps speech data of different lengths to the same equivalent representation, which alleviates the problem of dealing with variable-length temporal sequences. This is advantageous in real-time settings, where the size of the analysis window can vary. Using the proposed feature mapping framework, we build a novel data-driven speech emotion detection and recognition system that indexes speech databases to facilitate the classification and retrieval of emotions. We test the proposed system using two datasets. The first corpus is acted. We show that the proposed mapping framework outperforms the classic approaches while providing descriptors that are suitable for the analysis and visualization of human emotions in speech data. The second corpus is an authentic dataset: we evaluate the performance of our system using a collection of debates. For that purpose, we propose a novel debate collection, one of the first such initiatives in the literature. We show that the proposed system is able to learn human emotions from debates.
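
The abstract above does not give the equations of the Temporal Averaging Algorithm, so the following is only a minimal sketch of how leaky integrators can turn a variable-length feature sequence into a fixed-length descriptor that still depends on temporal order. The function name `leaky_average`, the update rule `state = (1 - alpha) * state + alpha * frame`, the choice of leak rates, and the initialisation with the first frame are all assumptions made for illustration, not the dissertation's method.

```python
from typing import List, Sequence


def leaky_average(
    frames: Sequence[Sequence[float]],
    leak_rates: Sequence[float] = (0.1, 0.5),
) -> List[float]:
    """Map a time-ordered sequence of feature frames to one fixed-length vector.

    Hypothetical sketch: one leaky integrator per leak rate is run over the
    sequence, and the final integrator states are concatenated.
    """
    if not frames:
        raise ValueError("need at least one frame")

    descriptor: List[float] = []
    for alpha in leak_rates:
        # Assumption: each integrator starts from the first frame.
        state = list(frames[0])
        for frame in frames[1:]:
            # Leaky update: older frames decay geometrically, so recent
            # frames weigh more and the result depends on frame order.
            state = [(1.0 - alpha) * s + alpha * x for s, x in zip(state, frame)]
        descriptor.extend(state)
    return descriptor


# Two utterances of different lengths yield descriptors of the same size.
short_utterance = [[1.0, 2.0], [3.0, 4.0]]
long_utterance = [[1.0, 2.0]] * 50
print(len(leaky_average(short_utterance)), len(leaky_average(long_utterance)))
```

The descriptor length is the feature dimension times the number of leak rates, regardless of how many frames the utterance contains. Unlike a plain mean, reversing the frame order generally changes the result.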

    From Keyword Search to Exploration: How Result Visualization Aids Discovery on the Web

    A key to the Web's success is the power of search. The elegant way in which search results are returned is usually remarkably effective. However, for exploratory search, in which users need to learn, discover, and understand novel or complex topics, there is substantial room for improvement. Human-computer interaction researchers and web browser designers have developed novel strategies to improve Web search by enabling users to conveniently visualize, manipulate, and organize their Web search results. This monograph offers fresh ways to think about search-related cognitive processes and describes innovative design approaches to browsers and related tools. For instance, while keyword search presents users with results for specific information (e.g., what is the capital of Peru), other methods may let users see and explore the contexts of their requests for information (related or previous work, conflicting information), or the properties that associate groups of information assets (group legal decisions by lead attorney). We also consider both the traditional and the novel ways in which these strategies have been evaluated. From our review of cognitive processes, browser design, and evaluations, we reflect on the future opportunities and new paradigms for exploring and interacting with Web search results.

    Advancing performability in playable media : a simulation-based interface as a dynamic score

    When designing playable media with a non-game orientation, play scenarios that offer an alternative to gameplay must be accompanied by mechanics that offer an alternative to game mechanics. The problems of designing playable media with a non-game orientation are framed as the problems of designing a platform for creative exploration and creative expression. For such design problems, two requirements are articulated: 1) play state transitions must be dynamic in non-trivial ways in order to achieve a significant level of engagement, and 2) pathways for players’ experience from exploration to expression must be provided. The transformative pathway from creative exploration to creative expression is analogous to the pathways for game players’ skill acquisition in gameplay. The paper first describes the concept of a simulation-based interface, and then binds that concept with the concept of a dynamic score. The former partially addresses the first requirement, the latter the second. The paper describes the prototype and realization of the binding of the two concepts. “Score” is here defined as a representation of cue organization through a transmodal abstraction. A simulation-based interface built on swarm mechanics is presented, and its function as a dynamic score is demonstrated with an interactive musical composition and performance.

    Musical audio-mining
