Multimodal Indexing of Presentation Videos
This thesis presents four novel methods to help users efficiently and effectively retrieve information from unstructured and unsourced multimedia sources, in particular the increasing amount and variety of presentation videos such as those in e-learning, conference recordings, corporate talks, and student presentations. We demonstrate a system to summarize, index and cross-reference such videos, and measure the quality of the produced indexes as perceived by the end users. We introduce four major semantic indexing cues: text, speaker faces, graphics, and mosaics, going beyond standard tag-based searches and simple video playbacks. This work aims at recognizing visual content "in the wild", where the system cannot rely on any additional information besides the video itself. For text, within a scene text detection and recognition framework, we present a novel locally optimal adaptive binarization algorithm, implemented with integral histograms. It determines an optimal threshold that maximizes the between-class variance within a subwindow, with computational complexity independent of the size of the window itself. We obtain character recognition rates of 74%, as validated against ground truth of 8 presentation videos spanning over 1 hour and 45 minutes, which almost doubles the baseline performance of an open source OCR engine. For speaker faces, we detect, track, match, and finally select a humanly preferred face icon per speaker, based on three quality measures: resolution, amount of skin, and pose. We register an 87% accordance (51 out of 58 speakers) between the face indexes automatically generated from three unstructured presentation videos of approximately 45 minutes each, and human preferences recorded through Mechanical Turk experiments.
For diagrams, we locate graphics inside frames showing a projected slide, cluster them according to an on-line algorithm based on a combination of visual and temporal information, and select and color-correct their representatives to match human preferences recorded through Mechanical Turk experiments. We register 71% accuracy (57 out of 81 unique diagrams properly identified, selected and color-corrected) on three hours of videos containing five different presentations. For mosaics, we combine two existing suturing measures to extend video images into an in-the-world coordinate system. A set of frames to be registered into a mosaic is sampled according to the PTZ camera movement, which is computed through least-squares estimation starting from the luminance constancy assumption. A local-feature-based stitching algorithm is then applied to estimate the homography among a set of video frames, and median blending is used to render pixels in overlapping regions of the mosaic. For two of these indexes, namely faces and diagrams, we present two novel MTurk-derived user data collections to determine viewer preferences, and show that they are matched in selection by our methods. The net result of this thesis allows users to search, inside a video collection as well as within a single video clip, for a segment of presentation by professor X on topic Y, containing graph Z.
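The text-binarization step described in this abstract (a threshold maximizing between-class variance inside a subwindow, computed from integral histograms) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the 256-bin grayscale assumption, the square window, and the `radius` parameter are all hypothetical choices made for illustration, and the threshold search is plain Otsu applied per window.

```python
import numpy as np

def integral_histogram(gray, bins=256):
    """Return H of shape (h+1, w+1, bins); H[y, x] is the histogram of gray[:y, :x].

    Assumes an integer-valued image with values in [0, bins). Memory grows as
    h * w * bins, so this sketch only suits small images.
    """
    h, w = gray.shape
    one_hot = np.zeros((h, w, bins), dtype=np.int64)
    one_hot[np.arange(h)[:, None], np.arange(w)[None, :], gray] = 1
    H = np.zeros((h + 1, w + 1, bins), dtype=np.int64)
    H[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return H

def window_histogram(H, y0, x0, y1, x1):
    """Histogram of gray[y0:y1, x0:x1] from four lookups, whatever the window size."""
    return H[y1, x1] - H[y0, x1] - H[y1, x0] + H[y0, x0]

def otsu_threshold(hist):
    """Threshold t maximizing between-class variance; pixels <= t form class 0."""
    levels = np.arange(hist.size)
    w0 = np.cumsum(hist)                      # pixel count in class 0
    w1 = hist.sum() - w0                      # pixel count in class 1
    s0 = np.cumsum(hist * levels)             # intensity sum in class 0
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2      # proportional to between-class variance
    return int(np.argmax(between))

def binarize(gray, radius=7):
    """Label each pixel True if it exceeds the Otsu threshold of its local window."""
    h, w = gray.shape
    H = integral_histogram(gray)
    out = np.zeros((h, w), dtype=bool)
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            t = otsu_threshold(window_histogram(H, y0, x0, y1, x1))
            out[y, x] = gray[y, x] > t
    return out
```

The per-pixel cost depends on the number of histogram bins rather than on the window area, which is the property the abstract highlights. A perfectly uniform window has zero between-class variance everywhere, so this sketch falls back to threshold 0 there; a practical system would need an explicit rule for that case.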
Engineering systematic musicology : methods and services for computational and empirical music research
One of the main research questions of *systematic musicology* is concerned with how people make sense of their musical environment. It is concerned with signification and meaning-formation and relates musical structures to effects of music. These fundamental aspects can be approached from many different directions. One could take a cultural perspective where music is considered a phenomenon of human expression, firmly embedded in tradition. Another approach would be a cognitive perspective, where music is considered as an acoustical signal of which perception involves categorizations linked to representations and learning. A performance perspective where music is the outcome of human interaction is an equally valid view. To understand a phenomenon, combining multiple perspectives often makes sense. The methods employed within each of these approaches turn questions into concrete musicological research projects. It is safe to say that today many of these methods draw upon digital data and tools. Some of those general methods are feature extraction from audio and movement signals, machine learning, classification and statistics. However, the problem is that, very often, the *empirical and computational methods require technical solutions* beyond the skills of researchers that typically have a humanities background. At that point, these researchers need access to specialized technical knowledge to advance their research. My PhD work should be seen within the context of that tradition. In many respects I adopt a problem-solving attitude to problems that are posed by research in systematic musicology. This work *explores solutions that are relevant for systematic musicology*. It does this by engineering solutions for measurement problems in empirical research and developing research software which facilitates computational research. These solutions are placed in an engineering-humanities plane. The first axis of the plane contrasts *services* with *methods*. Methods *in* systematic musicology propose ways to generate new insights in music-related phenomena or contribute to how research can be done. Services *for* systematic musicology, on the other hand, support or automate research tasks, which allows the scope of research to change. A shift in scope allows researchers to cope with larger data sets, which offers a broader view on the phenomenon. The second axis indicates how important Music Information Retrieval (MIR) techniques are in a solution. MIR techniques are contrasted with various techniques to support empirical research. My research resulted in a total of thirteen solutions which are placed in this plane. Descriptions of seven of these are bundled in this dissertation. Three fall into the methods category and four in the services category. For example, Tarsos presents a method to compare performance practice with theoretical scales on a large scale. SyncSink is an example of a service.
A Computational Lexicon and Representational Model for Arabic Multiword Expressions
The phenomenon of multiword expressions (MWEs) is increasingly recognised as a serious and challenging issue that has attracted the attention of researchers in various language-related disciplines. Research in these many areas has emphasised the primary role of MWEs in the process of analysing and understanding language, particularly in the computational treatment of natural languages. Ignoring MWE knowledge in any NLP system reduces the possibility of achieving high precision outputs. However, despite the enormous wealth of MWE research and language resources available for English and some other languages, research on Arabic MWEs (AMWEs) still faces multiple challenges, particularly in key computational tasks such as extraction, identification, evaluation, language resource building, and lexical representations.
This research aims to remedy this deficiency by extending knowledge of AMWEs and making noteworthy contributions to the existing literature in three related research areas on the way towards building a computational lexicon of AMWEs. First, this study develops a general understanding of AMWEs by establishing a detailed conceptual framework that includes a description of an adopted AMWE concept and its distinctive properties at multiple linguistic levels. Second, in the use of AMWE extraction and discovery tasks, the study employs a hybrid approach that combines knowledge-based and data-driven computational methods for discovering multiple types of AMWEs. Third, this thesis presents a representative system for AMWEs which consists of multilayer encoding of extensive linguistic descriptions.
This project also paves the way for further in-depth AMWE-aware studies in NLP and linguistics to gain new insights into this complicated phenomenon in standard Arabic. The implications of this research are related to the vital role of the AMWE lexicon, as a new lexical resource, in the improvement of various ANLP tasks and the potential opportunities this lexicon provides for linguists to analyse and explore AMWE phenomena.
Impact of asthma on the brain: evidence from diffusion MRI, CSF biomarkers and cognitive decline
Chronic systemic inflammation increases the risk of neurodegeneration, but the mechanisms remain unclear. Part of the challenge in reaching a nuanced understanding is the presence of multiple risk factors that interact to potentiate adverse consequences. To address modifiable risk factors and mitigate downstream effects, it is necessary, although difficult, to tease apart the contribution of an individual risk factor by accounting for concurrent factors such as advanced age, cardiovascular risk, and genetic predisposition. Using a case-control design, we investigated the influence of asthma, a highly prevalent chronic inflammatory disease of the airways, on brain health in participants recruited to the Wisconsin Alzheimer's Disease Research Center (31 asthma patients, 186 non-asthma controls, aged 45-90 years, 62.2% female, 92.2% cognitively unimpaired), a sample enriched for parental history of Alzheimer's disease. Asthma status was determined using detailed prescription information. We employed multi-shell diffusion weighted imaging scans and the three-compartment neurite orientation dispersion and density imaging model to assess white and gray matter microstructure. We used cerebrospinal fluid biomarkers to examine evidence of Alzheimer's disease pathology, glial activation, neuroinflammation and neurodegeneration. We evaluated cognitive changes over time using a preclinical Alzheimer cognitive composite. Using permutation analysis of linear models, we examined the moderating influence of asthma on relationships between diffusion imaging metrics, CSF biomarkers, and cognitive decline, controlling for age, sex, and cognitive status. We ran additional models controlling for cardiovascular risk and genetic risk of Alzheimer's disease, defined as a carrier of at least one apolipoprotein E (APOE) ε4 allele. 
Relative to controls, greater Alzheimer's disease pathology (lower amyloid-β42/amyloid-β40, higher phosphorylated-tau-181) and synaptic degeneration (neurogranin) biomarker concentrations were associated with more adverse white matter metrics (e.g. lower neurite density, higher mean diffusivity) in patients with asthma. Higher concentrations of the pleiotropic cytokine IL-6 and the glial marker S100B were associated with more salubrious white matter metrics in asthma, but not in controls. The adverse effects of age on white matter integrity were accelerated in asthma. Finally, we found evidence that in asthma, relative to controls, deterioration in white and gray matter microstructure was associated with accelerated cognitive decline. Taken together, our findings suggest that asthma accelerates white and gray matter microstructural changes associated with aging and increasing neuropathology that, in turn, are associated with more rapid cognitive decline. Effective asthma control, on the other hand, may be protective and slow progression of cognitive symptoms.
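The moderation analysis in this abstract tests whether a relationship differs between groups by permutation within a linear model. A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the authors' pipeline: the function name, the single covariate `x`, the binary `group` indicator, and the Freedman-Lane-style permutation of reduced-model residuals are all choices made here, and the study's additional covariates (age, sex, cognitive status) are omitted.

```python
import numpy as np

def interaction_permutation_test(y, x, group, n_perm=2000, seed=0):
    """Permutation p-value for the x-by-group interaction in an ordinary least squares model.

    Hypothetical helper: residuals of the model without the interaction term
    are shuffled and added back to its fitted values (Freedman-Lane style),
    then the interaction coefficient is re-estimated on each shuffled response.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    group = np.asarray(group, dtype=float)
    rng = np.random.default_rng(seed)

    reduced = np.column_stack([np.ones_like(x), x, group])  # no interaction
    full = np.column_stack([reduced, x * group])            # with interaction

    def interaction_coef(response):
        beta, *_ = np.linalg.lstsq(full, response, rcond=None)
        return beta[-1]

    beta_reduced, *_ = np.linalg.lstsq(reduced, y, rcond=None)
    fitted = reduced @ beta_reduced
    residuals = y - fitted

    observed = interaction_coef(y)
    exceed = 0
    for _ in range(n_perm):
        y_star = fitted + rng.permutation(residuals)
        if abs(interaction_coef(y_star)) >= abs(observed):
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

In this sketch `group` would be 1 for asthma and 0 for controls, `x` an imaging or biomarker measure, and `y` the outcome; a small p-value indicates that the slope of `y` on `x` differs between groups. The raw coefficient is used as the test statistic for brevity, whereas dedicated tools typically use a t or F statistic.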
Proceedings of the First Workshop on Computing News Storylines (CNewsStory 2015)
This volume contains the proceedings of the 1st Workshop on Computing News Storylines (CNewsStory
2015) held in conjunction with the 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP
2015) at the China National Convention Center in Beijing, on July 31st 2015.
Narratives are at the heart of information sharing. Ever since people began to share their experiences,
they have connected them to form narratives. The study of storytelling and the field of literary theory
called narratology have developed complex frameworks and models related to various aspects of
narrative such as plot structures, narrative embeddings, characters’ perspectives, reader response, point
of view, narrative voice, narrative goals, and many others. These notions from narratology have been
applied mainly in Artificial Intelligence and to model formal semantic approaches to narratives (e.g.
Plot Units developed by Lehnert (1981)). In recent years, computational narratology has qualified as an
autonomous field of study and research. Narrative has been the focus of a number of workshops and
conferences (AAAI Symposia, Interactive Storytelling Conference (ICIDS), Computational Models of
Narrative). Furthermore, reference annotation schemes for narratives have been proposed (NarrativeML
by Mani (2013)).
The workshop aimed at bringing together researchers from different communities working on
representing and extracting narrative structures in news, a text genre which is widely used in NLP
but which has received little attention with respect to narrative structure, representation and analysis.
Currently, advances in NLP technology have made it feasible to look beyond scenario-driven, atomic
extraction of events from single documents and work towards extracting story structures from multiple
documents, while these documents are published over time as news streams. Policy makers, NGOs,
information specialists (such as journalists and librarians) and others are increasingly in need of tools
that support them in finding salient stories in large amounts of information to more effectively implement
policies, monitor actions of “big players” in society and check facts. Their tasks often revolve around
reconstructing cases either with respect to specific entities (e.g. persons or organizations) or events (e.g.
Hurricane Katrina). Storylines represent explanatory schemas that enable us to make better selections
of relevant information but also projections to the future. They form a valuable potential for exploiting
news data in an innovative way.
JRC.G.2-Global security and crisis management