45 research outputs found

    Multi-modal surrogates for retrieving and making sense of videos: is synchronization between the multiple modalities optimal?

    Video surrogates can help people quickly make sense of the content of a video before downloading it or seeking more detailed information. Visual and audio features of a video are primary information carriers and might become important components of video retrieval and video sense-making. In the past decades, most research and development efforts on video surrogates have focused on visual features of the video, and comparatively little work has been done on audio surrogates and on examining their pros and cons in aiding users' retrieval and sense-making of digital videos. Even less work has been done on multi-modal surrogates, where more than one modality is employed for consuming the surrogates, for example, the audio and visual modalities. This research examined the effectiveness of a number of multi-modal surrogates, and investigated whether synchronization between the audio and visual channels is optimal. A user study was conducted to evaluate six different surrogates on a set of six recognition and inference tasks to answer two main research questions: (1) How do automatically-generated multi-modal surrogates compare to manually-generated ones in video retrieval and video sense-making? and (2) Does synchronization between multiple surrogate channels enhance or inhibit video retrieval and video sense-making? Forty-eight participants took part in the study, in which the surrogates were measured on the time participants spent experiencing the surrogates, the time participants spent doing the tasks, participants' performance accuracy on the tasks, participants' confidence in their task responses, and participants' subjective ratings of the surrogates. On average, the uncoordinated surrogates were more helpful than the coordinated ones, but the manually-generated surrogates were more helpful than the automatically-generated ones only in terms of task completion time. Participants' subjective ratings were more favorable for the coordinated surrogate C2 (Magic A + V) and the uncoordinated surrogate U1 (Magic A + Storyboard V) with respect to usefulness, usability, enjoyment, and engagement. The post-session questionnaire comments demonstrated participants' preference for the coordinated surrogates, but the comments also revealed the value of having uncoordinated sensory channels.

    Audio-visual football video analysis, from structure detection to attention analysis

    Sports video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what attracts viewers’ interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives, and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important contents in sports videos. Detecting replay segments is an efficient approach to collecting game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of a replay is complex: it includes logo transitions, slow motion, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement for replay detection. A two-pass system was developed, including a five-layer AdaBoost classifier and logo template matching across the entire video. The five-layer AdaBoost classifier utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter out logo transition candidates. Subsequently, a logo template is constructed and employed to find all transition logo sequences. The precision and recall of this system in replay detection are both 100% on a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break, were identified from low-level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as the kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multi-modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the structure of perception while watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. This framework tries to estimate a noise-free signal from these coarse noisy observations by a multiple resolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. The experiments show that these attention-based approaches can find goal events with high precision. Moreover, results of MAR-based highlight detection on the final games of FIFA 2002 and 2006 are highly similar to highlights professionally labelled by BBC and FIFA.
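
    The longest-repeated-substring step mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis proposes a suffix tree, whereas this sketch swaps in a naive sorted-suffix comparison that gives the same answer on short inputs. The single-letter label encoding (P = play, F = focus, R = replay, B = break), the function name and the sample sequence are all assumptions made for illustration.

```python
def longest_repeated_substring(labels: str) -> str:
    """Return the longest substring occurring at least twice (overlaps allowed).

    Naive stand-in for a suffix tree: sort all suffixes, then compare
    neighbours, since the longest repeat is a common prefix of two
    lexicographically adjacent suffixes.
    """
    n = len(labels)
    # Start positions of all suffixes, ordered lexicographically.
    order = sorted(range(n), key=lambda i: labels[i:])
    best = ""
    for a, b in zip(order, order[1:]):
        # Length of the common prefix of the two adjacent suffixes.
        k = 0
        while a + k < n and b + k < n and labels[a + k] == labels[b + k]:
            k += 1
        if k > len(best):
            best = labels[a:a + k]
    return best


# Hypothetical shot-class label sequence (P=play, F=focus, R=replay, B=break).
shots = "PFRBPFRBB"
kernel = longest_repeated_substring(shots)
print(kernel)
```

    On the hypothetical sequence above the repeated unit is `PFRB`. A real suffix tree would find it in linear time, while this version is quadratic or worse and is only meant to show the idea.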

    Enabling Collaborative Visual Analysis across Heterogeneous Devices

    We are surrounded by novel device technologies emerging at an unprecedented pace. These devices are heterogeneous in nature: they come in large and small sizes, with many input and sensing mechanisms. When many such devices are used by multiple users with a shared goal, they form a heterogeneous device ecosystem. A device ecosystem has great potential in data science to act as a natural medium for multiple analysts to make sense of data using visualization. This is essential, as today's big data problems require more than a single mind or a single machine to solve them. Towards this vision, I introduce the concept of collaborative, cross-device visual analytics (C2-VA) and outline a reference model to develop user interfaces for C2-VA. This dissertation covers interaction models, coordination techniques, and software platforms to enable full-stack support for C2-VA. Firstly, we connected devices to form an ecosystem using software primitives introduced in the early frameworks from this dissertation. To work in a device ecosystem, we designed multi-user interaction for visual analysis in front of large displays by finding a balance between proxemics and mid-air gestures. Extending these techniques, we considered the roles of different devices, large and small, to present a conceptual framework for utilizing multiple devices for visual analytics. When this framework was applied, findings from a user study showcased flexibility in the analytic workflow and potential for generating complex insights in device ecosystems. Beyond this, we supported coordination between multiple users in a device ecosystem by depicting the presence, attention, and data coverage of each analyst within a group. Building on these parts of the C2-VA stack, the culmination of this dissertation is a platform called Vistrates. This platform introduces a component model for modular creation of user interfaces that work across multiple devices and users. A component is an analytical primitive (a data processing method, a visualization, or an interaction technique) that is reusable, composable, and extensible. Together, components can support a complex analytical activity. On top of the component model, support for collaboration and device ecosystems comes for free in Vistrates. Overall, this enables the exploration of new research ideas within C2-VA.

    Cybernationalism and cyberactivism in China

    Nationalism in the Internet age is increasingly becoming an essential factor influencing agenda setting within Chinese society, as well as China’s relations with foreign countries, especially the West. For China, a better understanding of the universal theoretical structure and behavioral patterns of nationalism would facilitate the overall social articulation of this trend and enhance its positive role in social agenda setting. On the other hand, a study of Chinese cybernationalism based on a Chinese perspective in Western academia is an attempt at transculturation. From the viewpoint of current international relations and geopolitics, which are rather urgent, such an attempt would help to enhance China’s compatibility with the current Western-dominated world order, reduce misinformation between China and other countries, and lay the cultural and ideological groundwork for various other international collaborations. Considering the current state of Chinese nationalism research and the mass participatory nature of cybernationalism, this dissertation examines cybernationalism in the following three parts. The first is a study of the historical origins of Chinese cybernationalism. This section includes both an exploration of the social consensus in ancient China and a survey of the influence of nationalism in modern Chinese history. The study of historical origins not only shows the chronological development and evolution of both proto-nationalism and nationalism in China, but also reveals a decisive impetus for the current claims and behaviors of cybernationalism. The second part deals with the formation and rise of cybernationalism since the start of the 21st century. The important background for the move from nationalism to cybernationalism is the informatization of Chinese society. Once we have completed the study of the basic situation of Chinese Internet society, especially the study of social media as a public space, we can link the Internet with nationalism and examine the new development of nationalism in the era of mass participation. The ultimate goal is to connect proto-nationalism, nationalism, and cybernationalism, and to further construct an understanding of cybernationalism that is consistent with both the universal principles of nationalism and the Chinese context. Finally, we validate the results derived from the previous study against social reality, i.e., by studying the cyberactivism practices of cybernationalism to judge their general sufficiency as well as their validity. We will conduct several natural language processing case studies based on big data to reproduce the behavioral logic and actual impact of cyberactivism as closely as possible to the reality of the Internet, while avoiding the one-sided argumentation and under-representation flaws of traditional case studies.

    The equipped explorer : virtual reality as a medium for learning

    Thesis: Ph. D., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2018. Cataloged from PDF version of thesis. Includes bibliographical references (pages 177-183). What opportunities does virtual reality offer to improve the way we learn? In this thesis, I investigate the ways that constructivist approaches, in particular exploratory and experiential learning, can be uniquely supported by immersive virtual worlds. Against the background of these learning theories, I introduce a design framework that centers around defining a medium of virtuality that is fundamentally social, and uses capture of movement and interaction as a key means for creating interactive scenarios and narrative. Within the world conjured by this medium, the Equipped Explorer learns, reviews, creates and communicates using tools that I propose and classify according to a taxonomy. A series of prototypes and design explorations are used as proofs of concept for aspects of the design framework. Experimental studies are used to investigate foundational questions concerning the learning benefits of using VR over 2D interactive media, and the viability of social interaction and collaboration in VR. I reflect on the implications of this framework and my experimental results to extrapolate how they might impact the future classroom and the practice of learning and discovery more broadly. Finally, I discuss what kinds of research might be needed to maximize that impact moving forward. By Scott Wilkins Greenwald.