Object Referring in Visual Scene with Spoken Language
Object referring has important applications, especially for human-machine
interaction. While having received great attention, the task is mainly attacked
with written language (text) as input rather than spoken language (speech),
which is more natural. This paper investigates Object Referring with Spoken
Language (ORSpoken) by presenting two datasets and one novel approach. Objects
are annotated with their locations in images, text descriptions and speech
descriptions. This makes the datasets ideal for multi-modality learning. The
approach is developed by carefully breaking the ORSpoken problem down into three
sub-problems and introducing task-specific vision-language interactions at the
corresponding levels. Experiments show that our method outperforms competing
methods consistently and significantly. The approach is also evaluated in the
presence of audio noise, showing the efficacy of the proposed vision-language
interaction methods in counteracting background noise. Comment: 10 pages, Submitted to WACV 201
The logic of adaptive behavior : knowledge representation and algorithms for the Markov decision process framework in first-order domains
Learning and reasoning in large, structured, probabilistic worlds is at the heart of artificial intelligence. Markov decision processes have become the de facto standard in modeling and solving sequential decision making problems under uncertainty. Many efficient reinforcement learning and dynamic programming techniques exist that can solve such problems.
Until recently, the representational state-of-the-art in this field was based on propositional representations.
However, it is hard to imagine a truly general, intelligent system that does not conceive of the world in terms of objects and their properties and relations to other objects. To this end, this book studies lifting Markov decision processes, reinforcement learning and dynamic programming to the first-order (or relational) setting. Based on an extensive analysis of propositional representations and techniques, a methodological translation is constructed from the propositional to the relational setting. Furthermore, this book provides a thorough and complete description of the state of the art, surveys vital related historical developments, and contains extensive descriptions of several new model-free and model-based solution techniques.
The Role of Graduality for Referring Expression Generation in Visual Scenes
Referring Expression Generation (REG) algorithms, a core component of systems that generate text from non-linguistic data, seek to identify domain objects using natural language descriptions. While REG has often been applied to visual domains, very few approaches deal with the problem of fuzziness and gradation. This paper discusses these problems and how they can be accommodated to achieve a more realistic view of the task of referring to objects in visual scenes.
The role of graduality for referring expression generation in visual scenes
Referring Expression Generation (REG) algorithms, a core component of systems that generate text from non-linguistic data, seek to identify domain objects using natural language descriptions. While REG has often been applied to visual domains, very few approaches deal with the problem of fuzziness and gradation. This paper discusses these problems and how they can be accommodated to achieve a more realistic view of the task of referring to objects in visual scenes. Peer-reviewed
Generating Effective Instructions: Knowing When to Stop
One aspect of Natural Language generation is describing entities so that they are distinguished from all other entities. Entities include objects, events, actions, and states. Much attention has been paid to objects and the generation of their referring expressions (descriptions meant to pick out or refer to an entity). However, a growing area of research is the automated generation of instruction manuals and an important part of generating instructions is distinguishing the actions that are to be carried out from other possible actions. One distinguishing feature is an action's termination, or when the performance of the action is to stop. My dissertation work focuses on generating action descriptions from action information using the SPUD generation algorithm developed here at Penn by Matthew Stone. In my work, I concentrate on the generation of expressions of termination information as part of action descriptions. The problems I address include how termination information is represented in action information and expressed in Natural Language, how to determine when an action description allows the reader to understand how to perform the action correctly, and how to generate the appropriate description of action information.
The Application of Virtual Tools in Teaching Dynamics in Engineering
Student success in Dynamics, a core subject in Mechanical Engineering courses, requires conceptual understanding of complex systems. Dynamics covers the motion of particles and objects, and usually relies on two-dimensional images and/or written descriptions to explain models and problems. This paper explores the value of visual representation of Dynamics problems, on the assumption that it facilitates student understanding of the content. Two approaches were applied to represent Dynamics problems under the premise of Bring Your Own Device (BYOD): augmented reality and web animation activities. Responses from students and reflections from lecturers were collected and reviewed in relation to applicability and ease of use. Students and lecturers both appreciated the benefits of visual representation of complex models and the possibility of manipulating virtual objects. Lecturers also appreciated the easy access to and use of the tools during class.
Questioning and responding in Italian
Questions are design problems for both the questioner and the addressee. They must be produced as recognizable objects and must be comprehended by taking into account the context in which they occur and the local situated interests of the participants. This paper investigates how people do “questioning” and “responding” in Italian ordinary conversations. I focus on the features of both questions and responses. I first discuss formal linguistic features that are peculiar to questions in terms of intonation contours (e.g. final rise), morphology (e.g. tags and question words) and syntax (e.g. inversion). I then show additional features that characterize their actual implementation in conversation such as their minimality (often the subject or the verb is only implied) and the usual occurrence of speaker gaze towards the recipient during questions. I then look at which social actions (e.g. requests for information, requests for confirmation) the different question types implement and which responses are regularly produced in return. The data shows that previous descriptions of “interrogative markings” are neither adequate nor sufficient to comprehend the actual use of questions in natural conversation.