18,959 research outputs found

    A Deep Siamese Network for Scene Detection in Broadcast Videos

    We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and we release a new benchmark dataset. Comment: ACM Multimedia 201

    Audio-Visual VQ Shot Clustering for Video Programs

    Many post-production video documents such as movies, sitcoms and cartoons present well-structured story-lines organized in separate audio-visual scenes. Accurate grouping of shots into these logical video segments could lead to semantic indexing of scenes and events for interactive multimedia retrieval. In this paper we introduce a novel shot-based analysis approach which aims to cluster together shots with similar audio-visual content. We demonstrate that codebooks of audio and visual codewords (generated by a vector quantization process) are an effective way to represent clusters containing shots with similar long-term consistency of chromatic compositions and audio. The output clusters, obtained by a simple single-link clustering algorithm, allow the further application of the well-known scene transition graph framework for scene change detection and shot-pattern investigation. Finally, merging the audio and visual results leads to a hierarchical description of the whole video document, useful for multimedia retrieval and summarization purposes.
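
The codebook idea in this abstract can be illustrated with a minimal sketch. It assumes each shot is reduced to a list of small feature vectors and that a codebook is simply a list of codewords; the function names `squared_distance` and `quantize`, and the use of mean squared error as the distortion, are illustrative assumptions, not the authors' implementation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vectors, codebook):
    """Map each vector to its nearest codeword.

    Returns (indices, distortion), where distortion is the mean squared
    error between the vectors and their assigned codewords. A low
    distortion suggests the codebook represents the vectors well.
    """
    indices = []
    total = 0.0
    for v in vectors:
        # Index of the closest codeword (hypothetical nearest-neighbour rule).
        best = min(range(len(codebook)),
                   key=lambda k: squared_distance(v, codebook[k]))
        indices.append(best)
        total += squared_distance(v, codebook[best])
    return indices, total / len(vectors)


# Toy data: three 2-D feature vectors and a two-codeword codebook.
codebook = [(0.0, 0.0), (10.0, 10.0)]
shot_vectors = [(1.0, 0.0), (9.0, 10.0), (0.0, 0.0)]
indices, distortion = quantize(shot_vectors, codebook)
```

Under these assumptions, quantizing one shot's vectors with another shot's codebook and comparing the resulting distortions gives a rough measure of how similar the two shots are.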

    Identifying Video Content Consistency by Vector Quantization

    Many post-production videos such as movies and cartoons present well-structured story-lines organized in separate visual scenes. Accurate grouping of shots into these logical segments could lead to semantic indexing of scenes for interactive multimedia retrieval and video summaries. In this paper we introduce a novel shot-based analysis approach which aims to cluster together shots with similar visual content. We demonstrate that codebooks of visual codewords (generated by a vector quantization process) are an effective way to identify clusters containing shots with similar long-term consistency of chromatic compositions. The clusters, obtained by a single-link clustering algorithm, allow the further use of the well-known scene transition graph framework for logical story unit detection and pattern investigation.
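
Single-link clustering, as named in this abstract, merges two clusters whenever any pair of their members is close enough. The sketch below is one possible rendering, assuming each shot is summarized by a single hypothetical feature vector and that Euclidean distance stands in for whatever codebook-based distance the paper uses; cutting the hierarchy at a fixed `threshold` is also an assumption.

```python
import math


def single_link_clusters(points, threshold):
    """Group point indices by single-link clustering cut at `threshold`.

    Two points end up in the same cluster if a chain of points connects
    them with every consecutive distance <= threshold.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: shots 0-2 form a chain, shots 3-4 sit far away.
shots = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
clusters = single_link_clusters(shots, threshold=1.5)
```

Note the chaining behaviour typical of single-link: shots 0 and 2 are farther apart than the threshold, yet they share a cluster because shot 1 links them.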

    Retrieval of video story units by Markov entropy rate

    In this paper we propose a method to retrieve video stories from a database. Given a sample story unit, i.e., a series of contiguous and semantically related shots, the most similar clips are retrieved and ranked. Similarity is evaluated on the story structures, and it depends on the number of expressed visual concepts and the pattern in which they appear inside the story. Hidden Markov models are used to represent story units, and Markov entropy rate is adopted as a compact index for evaluating structure similarity. The effectiveness of the proposed approach is demonstrated on a large video set drawn from different kinds of programmes, and results are evaluated using a prototype system developed for story unit retrieval.
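
The abstract does not say how the entropy rate is computed, so the following is only a sketch of the textbook definition for a stationary Markov chain, H = -Σ_i π_i Σ_j P_ij log2 P_ij, where P is the state transition matrix and π its stationary distribution. The transition matrices are made-up examples, and the power-iteration solver assumes an ergodic chain.

```python
import math


def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate the stationary distribution of P by power iteration.

    Assumes P is a row-stochastic matrix (list of lists) describing an
    ergodic chain, so the iteration converges from a uniform start.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


def entropy_rate(P):
    """Entropy rate in bits per step of the Markov chain with matrix P."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0.0  # zero-probability transitions contribute nothing
    )


# A story that hops unpredictably between two visual concepts ...
unpredictable = [[0.5, 0.5], [0.5, 0.5]]
# ... versus one that mostly stays on the first concept.
sticky = [[0.9, 0.1], [0.5, 0.5]]
rate_unpredictable = entropy_rate(unpredictable)
rate_sticky = entropy_rate(sticky)
```

Under this definition a single scalar summarizes how varied the concept-to-concept pattern is, which is consistent with its use as a compact index: two story units with similar rates have similarly regular (or irregular) structure.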

    Relationships Between Vocabulary Size, Working Memory, and Phonological Awareness in Spanish-Speaking English Language Learners

    Purpose: The goals of this study were to evaluate the impact of short-term phonological awareness (PA) instruction presented in children's first language (L1; Spanish) on gains in their L1 and second language (L2; English) and to determine whether relationships exist between vocabulary size, verbal working memory, and PA in Spanish-speaking English language learners (ELLs). Method: Participants included 25 kindergartners who received PA instruction and 10 controls. A 2-way within-subjects repeated measures multivariate analysis of variance (MANOVA) was conducted to evaluate gains. Relationships between PA gains, Spanish and English vocabulary, and memory, as measured using nonword repetition and experimental working memory tasks, were analyzed using correlation and regression analyses. Results: Results indicated significant and equivalent gains in both languages of children in the experimental group and no gains in the control group. Spanish vocabulary size was significantly related to PA gains in both languages and was more strongly related to English gains than was English vocabulary size. The memory tasks predicted gains in each language in distinct ways. Conclusion: Results support the conclusion that PA instruction and strong vocabulary skills in an individual's L1 benefit PA development in both the L1 and L2. Results also indicate that dynamic relationships exist between vocabulary size, storage and processing components of working memory, and PA development in both languages of ELLs.

    Measuring the coherence of normal and aphasic discourse production in Chinese using rhetorical structure theory (RST)

    The study investigated the difference in discourse coherence between healthy speakers and speakers with anomic aphasia using Rhetorical Structure Theory (RST). The effect of genre types on coherence and potential factors contributing to the differences were also examined. Fifteen native Cantonese speakers with anomic aphasia and their controls, matched in age, education and gender, participated. Sixty language samples were obtained using the story-telling and sequential description tasks of the Cantonese AphasiaBank protocol. Twenty naïve listeners provided subjective ratings on the coherence, completeness, correctness of order, and clarity of each speech sample. Results demonstrated that the control group showed significantly higher production fluency, a higher total number of discourse units, and fewer errors than the aphasia group. Controls used a richer set of relations than the aphasic group, particularly those used to describe settings, to express causality, and to elaborate. The aphasic group tended to omit more essential information content and was rated significantly lower in coherence and clarity than controls. The findings suggested that speakers with anomic aphasia had a reduced proportion of essential information content, a lower degree of elaboration, and more structural disruptions than the controls, which may have contributed to the reduced overall discourse coherence. (Speech and Hearing Sciences; Bachelor of Science in Speech and Hearing Science)

    Video Shot Clustering and Summarization through dendrograms

    In the context of video document analysis, effective clustering of shots facilitates access to the content and helps in understanding the associated semantics. This paper introduces a cluster analysis on video shots which employs a dendrogram representation to produce hierarchical summaries of the video document. Vector quantization codebooks are used to represent the visual content and to group shots with similar chromatic consistency. Evaluating the cluster codebook distortions and exploiting the dependency relationships on the dendrogram make it possible to obtain only a few significant summaries of the whole video. Finally, the user can navigate through the summaries and decide which one best suits his/her needs for eventual post-processing. The effectiveness of the proposed method is demonstrated on a collection of different video programmes, in terms of metrics that measure the content representational value of the summarization technique.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17

    Planning strategically, designing architecturally : a framework for digital library services

    In an era of unprecedented technological innovation and evolving user expectations and information seeking behaviour, we are arguably now an online society, with digital services increasingly common and increasingly preferred. As trusted information providers, libraries are in an advantageous position to respond, but this requires integrated strategic and enterprise architecture planning, for information technology (IT) has evolved from a support role to a strategic role, providing the core management systems, communication networks, and delivery channels of the modern library. Further, IT components do not function in isolation from one another, but are interdependent elements of distributed and multidimensional systems encompassing people, processes, and technologies, which must consider social, economic, legal, organisational, and ergonomic requirements and relationships, as well as being logically sound from a technical perspective. Strategic planning provides direction, while enterprise architecture strategically aligns and holistically integrates business and information system architectures. While challenging, such integrated planning should be regarded as an opportunity for the library to evolve as an enterprise in the digital age, or at minimum, to simply keep pace with societal change and alternative service providers. Without strategy, a library risks being directed by outside forces with independent motivations and inadequate understanding of its broader societal role. Without enterprise architecture, it risks technological disparity, redundancy, and obsolescence. Adopting an interdisciplinary approach, this conceptual paper provides an integrated framework for strategic and architectural planning of digital library services. The concept of the library as an enterprise is also introduced.

    Statistical Skimming of Feature Films

    We present a statistical framework based on Hidden Markov Models (HMMs) for skimming feature films. A chain of HMMs is used to model subsequent story units: HMM states represent different visual concepts, transitions model the temporal dependencies in each story unit, and stochastic observations are given by single shots. The skim is generated as an observation sequence in which, to favour more informative segments, shots are assigned a higher probability of observation if they have salient features related to specific film genres. The effectiveness of the method is demonstrated by skimming the first thirty minutes of a wide set of action and dramatic movies, creating previews that help users assess whether they would like to see the movie without revealing its central part and plot details. Results are evaluated and compared through extensive user tests in terms of metrics that estimate the content representational value of the obtained video skims and their utility for assessing the user's interest in the observed movie.