    A framework for evaluating stereo-based pedestrian detection techniques

    Automated pedestrian detection, counting, and tracking have received significant attention in the computer vision community of late. As such, a variety of techniques have been investigated using both traditional 2-D computer vision techniques and, more recently, 3-D stereo information. However, to date, a quantitative assessment of the performance of stereo-based pedestrian detection has been problematic, mainly due to the lack of standard stereo-based test data and an agreed methodology for carrying out the evaluation. This has forced researchers into making subjective comparisons between competing approaches. In this paper, we propose a framework for the quantitative evaluation of a short-baseline stereo-based pedestrian detection system. We provide freely available synthetic and real-world test data and recommend a set of evaluation metrics. This allows researchers to benchmark systems, not only with respect to other stereo-based approaches, but also with more traditional 2-D approaches. In order to illustrate its usefulness, we demonstrate the application of this framework to evaluate our own recently proposed technique for pedestrian detection and tracking.

    Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation

    This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them. Comment: Published in Journal of AI Research (JAIR), volume 61, pp 75-170. 118 pages, 8 figures, 1 table.

    Transforming educational experience for children, parents and teachers practitioner research from the CDI/NUIM Masters Programme 2013

    The purpose of this action research thesis was to implement an evidence-based initiative that could help better engage students in school. This research investigated factors that affected students' choice of Leaving Certificate Science subjects and devised actions that would enable and inform this choice. The factors affecting student choice were investigated using qualitative and quantitative methods of enquiry. The research was set against a drop in the numbers of students choosing science subjects for Leaving Certificate (Smyth and Hannan, 2006). The research took place in a community school in the south-west area of Dublin.

    Grounding language in events

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 137-142). Broadcast video and virtual environments are just two of the growing number of domains in which language is embedded in multiple modalities of rich non-linguistic information. Applications for such multimodal domains are often based on traditional natural language processing techniques that ignore the connection between words and the non-linguistic context in which they are used. This thesis describes a methodology for representing these connections in models which ground the meaning of words in representations of events. Incorporating these grounded language models with text-based techniques significantly improves the performance of three multimodal applications: natural language understanding in videogames, sports video search and automatic speech recognition. Two approaches to representing the structure of events are presented and used to model the meaning of words. In the domain of virtual game worlds, a hand-designed hierarchical behavior grammar is used to explicitly represent all the various actions that an agent can take in a virtual world. This grammar is used to interpret events by parsing sequences of observed actions in order to generate hierarchical event structures. In the noisier and more open-ended domain of broadcast sports video, hierarchical temporal patterns are automatically mined from large corpora of unlabeled video data. The structure of events in video is represented by vectors of these hierarchical patterns. Grounded language models are encoded using Hierarchical Bayesian models to represent the probability of words given elements of these event structures. These grounded language models are used to incorporate non-linguistic information into text-based approaches to multimodal applications. In the virtual game domain, this non-linguistic information improves natural language understanding for a virtual agent by nearly 10% and cuts in half the negative effects of noise caused by automatic speech recognition. For broadcast video of baseball and American football, video search systems that incorporate grounded language models are shown to perform up to 33% better than text-based systems. Further, systems for recognizing speech in baseball video that use grounded language models show 25% greater word accuracy than traditional systems. By Michael Ben Fleischman. Ph.D.

    MINDtouch embodied ephemeral transference: Mobile media performance research

    This is the post-print version of the final published article that is available from the link below. Copyright @ Intellect Ltd 2011. The aim of the author's media art research has been to uncover any new understandings of the sensations of liveness and presence that may emerge in participatory networked performance, using mobile phones and physiological wearable devices. To practically investigate these concepts, a mobile media performance series was created, called MINDtouch. The MINDtouch project proposed that the mobile videophone become a new way to communicate non-verbally, visually and sensually across space. It explored notions of ephemeral transference, distance collaboration and participant as performer to study presence and liveness emerging from the use of wireless mobile technologies within real-time, mobile performance contexts. Through participation by in-person and remote interactors, creating mobile video-streamed mixes, the project interweaves and embodies a daisy chain of technologies through the network space. As part of practice-based Ph.D. research conducted at the SMARTlab Digital Media Institute at the University of East London, MINDtouch has been under the direction of Professor Lizbeth Goodman and sponsored by BBC R&D. The aim of this article is to discuss the project research, conducted and recently completed for submission, in terms of the technical and aesthetic developments from 2008 to present, as well as the final phase of staging the events from July 2009 to February 2010. This piece builds on the article (Baker 2008) which focused on the outcomes of phase 1 of the research project and initial developments in phase 2. The outcomes from phases 2 and 3 of the project are discussed in this article.

    Challenges to Teaching Credibility Assessment in Contemporary Schooling

    Part of the Volume on Digital Media, Youth, and Credibility. This chapter explores several challenges that exist to teaching credibility assessment in the school environment. Challenges range from institutional barriers such as government regulation and school policies and procedures to dynamic challenges related to young people's cognitive development and the consequent difficulties of navigating a complex web environment. The chapter includes a critique of current practices for teaching kids credibility assessment and highlights some best practices for credibility education.

    Identifying Web Tables - Supporting a Neglected Type of Content on the Web

    The abundance of data on the Internet facilitates the improvement of extraction and processing tools. The trend in open data publishing encourages the adoption of structured formats like CSV and RDF. However, there is still a plethora of unstructured data on the Web which we assume contains semantics. For this reason, we propose an approach to derive semantics from web tables, which are still the most popular publishing tool on the Web. The paper also discusses methods and services for unstructured data extraction and processing, as well as machine learning techniques to enhance such a workflow. The eventual result is a framework to process, publish and visualize linked open data. The software enables table extraction from various open data sources in the HTML format and an automatic export to the RDF format, making the data linked. The paper also gives an evaluation of machine learning techniques in conjunction with string similarity functions to be applied in a table recognition task. Comment: 9 pages, 4 figures.