
    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR

    Physiological Sensing for Affective Computing

    This thesis addresses two aspects related to enabling systems to recognize the affective state of people and respond sensibly to it. First, the issue of representing affective states and unambiguously assigning physiological measurements to them is addressed by suggesting a new approach based on the dimensional emotion model of valence and arousal. Second, the issue of sensing affect-related physiological data is addressed by suggesting a concept for physiological sensor systems that live up to the requirements of adaptive, user-centred systems. This work develops a concept for unambiguously assigning physiological measurement data to emotional states while avoiding the problems of classical approaches. The work also addresses the acquisition of emotion-related physiological parameters. It presents a concept for sensor systems that allows relevant physiological parameters to be acquired reliably without greatly burdening the user. The focus is on designing the system to be suitable for everyday use

    Audio-visual football video analysis, from structure detection to attention analysis

    Sports video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain content, and strong spatio-temporal variations, such as quick camera switches and swift local motions. Specific techniques are needed for content-based sports video analysis to exploit these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what attracts viewers' interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approach them from two different perspectives and in turn present three research contributions, namely replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important content in sports videos, so detecting replay segments is an efficient way to collect game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of a replay is complex and includes logo transitions, slow motion, viewpoint switches and normal-speed video clips. Since logo transition clips are pervasive in the game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement for replay detection. A two-pass system was developed, comprising a five-layer AdaBoost classifier and logo template matching across an entire video. The five-layer AdaBoost classifier uses shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter out logo transition candidates.
Subsequently, a logo template is constructed and used to find all transition logo sequences. The precision and recall of this system in replay detection are both 100% on a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as of other sports videos. We review the literature on content-based temporal structures, such as the play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely play, focus, replay and break, were identified from low-level visual features. A four-state hidden Markov model was trained to simulate the transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as the kernel of an attack hidden Markov process. The decomposition of attack structure therefore becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multiple-modality attention fusion is presented. We propose two attention models for sports video analysis, namely the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the structure of perception while watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise.
This framework tries to estimate a noise-free signal from these coarse, noisy observations by multiresolution analysis. Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. The experiment shows that these attention-based approaches can find goal events with high precision. Moreover, the results of MAR-based highlight detection on the final games of FIFA 2002 and 2006 are highly similar to highlights professionally labelled by the BBC and FIFA
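
The abstract above mentions finding the longest repetitive substring in a sequence of shot-class labels. As a rough illustration only, the sketch below does this on a toy label string. It is not the thesis's implementation: it swaps the suffix tree for a simpler sorted-suffix comparison, and the one-letter encoding (P = play, F = focus, R = replay, B = break), the function name and the sample sequence are all assumptions made for the example.

```python
def longest_repeated_substring(labels: str) -> str:
    """Return the longest substring that occurs at least twice.

    Overlapping occurrences are allowed. A sorted list of suffixes stands in
    for the suffix tree named in the abstract; this is simpler but slower.
    """
    # After sorting, suffixes that share a long prefix end up adjacent.
    suffixes = sorted(labels[i:] for i in range(len(labels)))
    best = ""
    for first, second in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        n = 0
        while n < min(len(first), len(second)) and first[n] == second[n]:
            n += 1
        if n > len(best):
            best = first[:n]
    return best


# Hypothetical shot-class sequence: P = play, F = focus, R = replay, B = break.
shots = "PFRBPPFRBB"
print(longest_repeated_substring(shots))
```

On long label sequences a real suffix tree or suffix array would be the better choice, since this version compares whole suffix strings.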

    Probabilistic Graphical Models for Human Interaction Analysis

    The objective of this thesis is to develop probabilistic graphical models for analyzing human interaction in meetings based on multimodal cues. We use meetings as a case study of human interaction since research shows that high-complexity information is mostly exchanged through face-to-face interactions. Modeling human interaction poses several challenging research issues for the machine learning community. In meetings, each participant is a multimodal data stream. Modeling human interaction involves simultaneous recording and analysis of multiple multimodal streams. These streams may be asynchronous, have different frame rates, exhibit different stationarity properties, and carry complementary (or correlated) information. In this thesis, we developed three probabilistic graphical models for human interaction analysis. The proposed models use the "probabilistic graphical model" formalism, which exploits the conjoined capabilities of graph theory and probability theory to build complex models out of simpler pieces. We first introduce the multi-layer framework, in which the first layer models typical individual activity from low-level audio-visual features, and the second layer models the interactions. The two layers are linked by a set of posterior probability-based features. Next, we describe the team-player influence model, which learns the influence of interacting Markov chains within a team. The team-player influence model has a two-level structure: individual-level and group-level. The individual level models the actions of each player, and the group level models the actions of the team as a whole. The influence of each player on the team is jointly learned with the rest of the model parameters in a principled manner using the Expectation-Maximization (EM) algorithm. Finally, we describe the semi-supervised adapted HMMs for unusual event detection.
Unusual events are characterized by a number of features (rarity, unexpectedness, and relevance) that limit the application of traditional supervised model-based approaches. We propose a semi-supervised adapted Hidden Markov Model (HMM) framework, in which usual event models are first learned from a large amount of (commonly available) training data, while unusual event models are learned by Bayesian adaptation in an unsupervised manner

    2018 Academic Excellence Showcase Proceedings


    The Ethos of Dissent: Epideictic Rhetoric and the Democratic Function of American Protest and Countercultural Literature

    My dissertation, “The Ethos of Dissent: Epideictic Rhetoric and the Democratic Function of American Protest and Countercultural Literature, 1940-1962,” establishes a theoretical framework, the literary epideictic, for reading the African American social protest literature of Richard Wright and Ralph Ellison, and the American countercultural literature of Jack Kerouac and Ken Kesey. I argue that epideictic rhetoric affords insight into how these authors’ narratives embody a post-World War II “ethos of dissent,” a counterdiscourse that emerges out of a climate of dynamism deadlocked with controlling ideologies. Epideictic, the branch of rhetoric concerned with civic matters, commends or censures a particular individual, institution, or social practice, preserves or revises value systems, and builds social cohesion. A developing postwar American society provides “epideictic exigencies” for these authors, i.e., historical events that inform each novel’s counter-narrative: the script and myth of the black male rapist in Native Son, the nonrecognition of African Americans in the social and political sphere in Invisible Man, the Cold War’s ideology of domestic containment and desire in On the Road, and the emerging measures of social control in One Flew Over The Cuckoo’s Nest. These narratives reveal how their respective social environments impede the realization of democratic freedoms for individuals who refuse to adhere to cultural codes of acquiescence, and they feature alternative values that clash with the dominant social forces attempting to control individual activity. Chapter one applies Sara Ahmed’s “affective economy of fear” to Wright’s Native Son and helps elucidate Wright’s literary project, which reveals Bigger Thomas’s traumatic fear as the impetus for his actions, an intense fear embedded in violent histories of contact between black and white bodies.
Chapter two attends to Ellison’s Invisible Man and the theme of invisibility as a rhetorical strategy calling for the social and political recognition of African Americans. In chapter three, I apply the capitalist conception of a Deleuzian desire to On the Road and argue that Kerouac recodes postwar desire and offers a vision of mobility and authenticity that is akin to a Deleuzian becoming, producing a shift in American values within a culture of containment. Finally, chapter four examines Kesey’s Cuckoo’s Nest and how the narrative captures an emerging culture of surveillance and parens patriae, and counters with the notion of “play as power.” “The Ethos of Dissent” offers two new insights: 1) it contributes to literary scholarship by providing a new framework for reading authors who are not ordinarily compared, but who, as Ellison proposes, “report what is going on in their particular area of the American experience” during the postwar period; and 2) it adds to rhetorical criticism by extending epideictic rhetoric from the public civic arena (oratory) to the private literary realm, and by exploring a previously unexamined relationship between affect and epideictic rhetoric. While scholars have attended to the function of communal values uniting an audience, there is no work delving into the affective components of the epideictic process. These social protest and countercultural novels strive to affect readers emotionally by incorporating emotive discourse that relates to their targeted issues, and the novels instigate a moral examination of the narratively depicted realities against democratic ideals by critiquing the broad values of racism, conformism, and authoritarianism. Ultimately, the authors and their texts expose failing value systems, promote positive values alluding to a democratic interdependence, and imagine alternative possibilities to the current state of social and political affairs

    Aplicação de técnicas de Clustering ao contexto da Tomada de Decisão em Grupo

    Nowadays, decisions made by executives and managers are primarily made in a group. Group decision-making is therefore a process in which a group of people, called participants, work together to analyze a set of variables, considering and evaluating a set of alternatives to select one or more solutions. There are many problems associated with group decision-making, particularly when the participants cannot meet for any reason, ranging from schedule incompatibility to being in different countries with different time zones. To support this process, Group Decision Support Systems (GDSS) evolved into what we today call web-based GDSS. In GDSS, argumentation is ideal since it makes it easier to use justifications and explanations in interactions between decision-makers so they can sustain their opinions. Aspect Based Sentiment Analysis (ABSA) is a subfield of Argument Mining closely related to Natural Language Processing. It intends to classify opinions at the aspect level and identify the elements of an opinion. Applying ABSA techniques to the group decision-making context results in the automatic identification of alternatives and criteria, for example. This automatic identification is essential to reduce the time decision-makers take to set themselves up in Group Decision Support Systems, and it offers them various insights and knowledge about the discussion in which they are participating. One of these insights can be the arguments used by the decision-makers about an alternative. Therefore, this dissertation proposes a methodology that uses an unsupervised technique, Clustering, and aims to segment the participants of a discussion based on the arguments used so it can produce knowledge from the current information in the GDSS. This methodology can be hosted in a web service that follows a micro-service architecture and utilizes Data Preprocessing and Intra-sentence Segmentation in addition to Clustering to achieve the objectives of the dissertation.
Word Embedding is needed when we apply clustering techniques to natural language text, to transform the text into vectors usable by the clustering techniques. In addition to Word Embedding, Dimensionality Reduction techniques were tested to improve the results. Keeping the same Preprocessing steps while varying the Clustering techniques, Word Embedders, and Dimensionality Reduction techniques led to the best approach. This approach consisted of the KMeans++ clustering technique, using SBERT as the word embedder with UMAP dimensionality reduction, reducing the number of dimensions to 2. This experiment achieved a Silhouette Score of 0.63 with 8 clusters on the baseball dataset, which yielded good clusters based on manual review and Wordclouds. The same approach obtained a Silhouette Score of 0.59 with 16 clusters on the car brand dataset, which we used as a validation dataset for the approach. Currently, decisions made by managers and executives are mostly made in groups. Group decision-making is thus a process in which a group of people, called participants, act together, analyzing a set of variables and considering and evaluating a set of alternatives in order to select one or more solutions. There are many problems associated with the decision-making process, especially when the participants are unable to meet (e.g., the participants are in different places, their countries are in different time zones, or their schedules are incompatible). To support this decision-making process, Group Decision Support Systems (GDSS) evolved into what are now called web-based Group Decision Support Systems. In a GDSS, argumentation is ideal because it facilitates the use of justifications and explanations in interactions between decision-makers so that they can support their opinions.
Aspect Based Sentiment Analysis (ABSA) is an area of Argument Mining related to Natural Language Processing. This area aims to classify opinions at the aspect level of the sentence and to identify the elements of an opinion. Applying ABSA techniques to group decision-making results in the automatic identification of alternatives and criteria, for example. This automatic identification is essential to reduce the time decision-makers spend setting themselves up in the GDSS, and it offers them knowledge and insights about the discussion in which they participate. One of these insights can be the arguments used by decision-makers about an alternative. This dissertation therefore proposes a methodology that uses an unsupervised technique, Clustering, to segment the participants of a discussion based on the arguments they use, so as to produce knowledge from the information currently in the GDSS. This methodology can be deployed in a web service that follows a micro-service architecture and uses Data Preprocessing and Intra-sentence Segmentation together with Clustering to achieve the objectives of this dissertation. Word Embedding is also needed to apply Clustering techniques to natural language text, transforming the text into vectors that the Clustering techniques can use. Dimensionality Reduction techniques were also tested to improve the results. The Preprocessing steps were kept fixed while the Clustering techniques, Word Embedders, and Dimensionality Reduction techniques were varied in order to find the best approach. That approach uses the KMeans++ Clustering technique with SBERT as the Word Embedder and UMAP as the dimensionality reduction technique, reducing the initial dimensions to two.
This experiment obtained a Silhouette Score of 0.63 with 8 clusters on the baseball dataset, giving good clusters according to manual review and inspection of the WordClouds. The same approach obtained a Silhouette Score of 0.59 with 16 clusters on the car brand dataset, which was used to validate the approach
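
The Silhouette Score reported in the abstract above measures how well each point sits inside its own cluster compared with the nearest other cluster, on a scale from -1 to 1. The sketch below is only meant to make that metric concrete. It computes the mean silhouette coefficient in plain Python on made-up 2-D points with hand-assigned labels. It does not reproduce the dissertation's pipeline (SBERT embeddings, UMAP and KMeans++ are not used here), and the function name and toy data are assumptions.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself is included in the sum).
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            mean(dist(point, other) for other in members)
            for key, members in clusters.items()
            if key != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Toy data: two tight groups that are far apart (hypothetical values).
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score(points, [0, 0, 1, 1])  # labels follow the groups
bad = silhouette_score(points, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

Labels that follow the two groups give a score close to 1, while labels that cut across them give a negative score, which is the intuition behind comparing configurations by this metric.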

    Humanist Narratology and the Suburban Ensemble Dramedy

    What is a “humanistic drama”? Although we might describe narrative works as humanist, and references to the humanistic drama abound across a breadth of critical media, including film and literary theory, the parameters of these terms remain elliptical. My work attempts to clarify the narrative conditions of humanism. In particular, humanists ask how we use narrative texts to complicate our understanding of others, and question the ethics and efficacy of attempts to represent human social complexity in fiction. After historicising narrative humanism and situating it among related philosophies, I develop humanist hermeneutics as a method for reading fictive texts, and provide examples of such readings. I integrate literary Darwinism, anthropology, cognitive science and social psychology into a social narratology, which catalogues the social functions of narrative. This expansive study asks how we can unite the descriptive capabilities of social science with the more prescriptive ethical inquiry of traditional humanism, and aims to demonstrate their productive compatibility. From this groundwork, I then look at a cluster of humanistic film texts: the suburban ensemble dramedy, a phenomenon in millennial American cinema politicising the quotidian and the domestic. Popular works include The Kids Are All Right, Little Miss Sunshine, Little Children, Junebug, The Oranges, and what is arguably the inciting feature in a wave of such films entering production, American Beauty. I provide examples of humanist readings of these films at two levels: an overview of genre development as social phenomenon (including histories of suburban depiction onscreen, ensemble cinema and affective experimentation in recent American filmmaking), followed by a close reading of a progenitor text, Ron Howard's 1989 film Parenthood