
    Improving Context Modelling in Multimodal Dialogue Generation

    In this work, we investigate the task of textual response generation in a multimodal task-oriented dialogue system. Our work is based on the recently released Multimodal Dialogue (MMD) dataset (Saha et al., 2017) in the fashion domain. We introduce a multimodal extension to the Hierarchical Recurrent Encoder-Decoder (HRED) model and show that this extension outperforms strong baselines in terms of text-based similarity metrics. We also showcase the shortcomings of current vision and language models by performing an error analysis on our system's output.

    Developing a Web Server Platform with SAPI support for AJAX RPC using JSON

    Writing a custom web server with SAPI support is a useful task that helps students and future system architects understand the link between network programming, object-oriented programming, enterprise application design patterns, and development best practices, because it offers insight into interprocess communication and application extensibility in a distributed environment. Keywords: Web, Server, Proxy, SAPI, HTTP, RPC, AJAX, JSON, XML

    Foundation Model Based Native AI Framework in 6G with Cloud-Edge-End Collaboration

    Future wireless communication networks are in a position to move beyond data-centric, device-oriented connectivity and offer intelligent, immersive experiences based on task-oriented connections, especially in the context of the thriving development of pre-trained foundation models (PFM) and the evolving vision of 6G native artificial intelligence (AI). Therefore, redefining modes of collaboration between devices and servers and constructing native intelligence libraries become critically important in 6G. In this paper, we analyze the challenges of achieving 6G native AI from the perspectives of data, intelligence, and networks. Then, we propose a 6G native AI framework based on foundation models, provide a customization approach for intent-aware PFM, present a construction of a task-oriented AI toolkit, and outline a novel cloud-edge-end collaboration paradigm. As a practical use case, we apply this framework for orchestration, achieving the maximum sum rate within a wireless communication system, and presenting preliminary evaluation results. Finally, we outline research directions for achieving native AI in 6G. Comment: 8 pages, 4 figures, 1 table

    An Active Approach to Characterization and Recognition of Functionality and Functional Properties

    Functionality in an object can be defined as its applicability toward the accomplishment of a task. We emphasize and develop an interactive and performatory approach to functionality recovery from sensor data in the context of robotic manipulatory tasks. By analyzing interaction of tool and target object and manipulation tasks as goal-oriented recognition processes, we propose to identify and characterize functionalities of objects. This interaction is not only a means of verification of the hypothesized presence of functionality in objects but also a way to actively and purposively recognize the object. The representation of functionality allows us to extend the recovery process to a hierarchy of functionalities allowing complex ones to be composed from simpler ones. A formal model, based on Discrete Event Dynamic System Theory (DEDS), is introduced to define an interactive task for recovering and describing functionality. To observe and control the recovery process we introduce the notion of piecewise observability of a task by different sensors. This allows the description of a dynamic system in which not all events nor the time of their occurrence may be predicted in advance. An experimental system, with both vision and force sensors, for carrying out the interactive functional recognition is described.

    Fuzzy decision-making fuser (FDMF) for integrating human-machine autonomous (HMA) systems with adaptive evidence sources

    © 2017 Liu, Pal, Marathe, Wang and Lin. A brain-computer interface (BCI) creates a direct communication pathway between the human brain and an external device or system. In contrast to patient-oriented BCIs, which are intended to restore inoperative or malfunctioning aspects of the nervous system, a growing number of BCI studies focus on designing auxiliary systems that are intended for everyday use. The goal of building these BCIs is to provide capabilities that augment existing intact physical and mental capabilities. However, a key challenge to BCI research is human variability; factors such as fatigue, inattention, and stress vary both across different individuals and for the same individual over time. If these issues are addressed, autonomous systems may provide additional benefits that enhance system performance and prevent problems introduced by individual human variability. This study proposes a human-machine autonomous (HMA) system that simultaneously aggregates human and machine knowledge to recognize targets in a rapid serial visual presentation (RSVP) task. The HMA focuses on integrating an RSVP BCI with computer vision techniques in an image-labeling domain. A fuzzy decision-making fuser (FDMF) is then applied in the HMA system to provide a natural adaptive framework for evidence-based inference by incorporating an integrated summary of the available evidence (i.e., human and machine decisions) and associated uncertainty. Consequently, the HMA system dynamically aggregates decisions involving uncertainties from both human and autonomous agents. The collaborative decisions made by an HMA system can achieve and maintain superior performance more efficiently than either the human or autonomous agents can achieve independently. 
    The experimental results shown in this study suggest that the proposed HMA system with the FDMF can effectively fuse decisions from human brain activities and the computer vision techniques to improve overall performance on the RSVP recognition task. This conclusion demonstrates the potential benefits of integrating autonomous systems with BCI systems.

    From Evaluation to Verification: Towards Task-oriented Relevance Metrics for Pedestrian Detection in Safety-critical Domains

    Whenever a visual perception system is employed in safety-critical applications such as automated driving, a thorough, task-oriented experimental evaluation is necessary to guarantee safe system behavior. While most standard evaluation methods in computer vision provide good comparability on benchmarks, they tend to fall short on assessing the system performance that is actually relevant for the given task. In our work, we consider pedestrian detection as a highly relevant perception task, and we argue that standard measures such as Intersection over Union (IoU) give insufficient results, mainly because they are insensitive to important physical cues including distance, speed, and direction of motion. Therefore, we investigate so-called relevance metrics, where specific domain knowledge is exploited to obtain a task-oriented performance measure focusing on distance in this initial work. Our experimental setup is based on the CARLA simulator and allows a controlled evaluation of the impact of that domain knowledge. Our first results indicate a linear decrease of the IoU related to the pedestrians' distance, leading to the proposal of a first relevance metric that is also conditioned on the distance.
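
For readers unfamiliar with the measure named in the abstract above, the following is a minimal sketch of the standard Intersection over Union for axis-aligned boxes, plus a helper that averages IoU per distance bin so that a distance trend could be inspected. It is not the authors' relevance metric, which the abstract does not specify. The box layout `(x_min, y_min, x_max, y_max)`, the helper name `mean_iou_by_distance_bin`, and the `bin_width` parameter are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou_by_distance_bin(
    samples: Iterable[Tuple[Box, Box, float]], bin_width: float = 10.0
) -> Dict[float, float]:
    """Hypothetical helper: average IoU per distance bin.

    Each sample is (ground_truth_box, predicted_box, distance_in_metres).
    Keys of the result are the lower edge of each bin.
    """
    sums: Dict[float, float] = defaultdict(float)
    counts: Dict[float, int] = defaultdict(int)
    for gt, pred, distance in samples:
        bin_start = (distance // bin_width) * bin_width
        sums[bin_start] += iou(gt, pred)
        counts[bin_start] += 1
    return {k: sums[k] / counts[k] for k in sorted(sums)}
```

Grouping by distance is only one way to expose the dependence the abstract describes; a metric conditioned on distance could equally weight each detection by a function of its distance.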