    Full interpretation of minimal images

    The goal of this work is to model the process of ‘full interpretation’ of object images: the ability to identify and localize all semantic features and parts that human observers recognize. We approach the task by dividing the interpretation of the complete object into the interpretation of multiple reduced but interpretable local regions. In such reduced regions, interpretation is simpler, since the number of semantic components is small and the variability of possible configurations is low. We model the interpretation process by identifying primitive components and relations that play a useful role in local interpretation by humans. To identify these useful components and relations, we consider the interpretation of ‘minimal configurations’: reduced local regions that are minimal in the sense that any further reduction renders them unrecognizable and uninterpretable. We show that such minimal interpretable images have useful properties, which we use to identify informative features and relations for full interpretation. We describe our interpretation model and show detailed interpretations of minimal configurations, produced automatically by the model. Finally, we discuss the implications of full interpretation for difficult visual tasks, such as recognizing human activities or interactions, which are beyond the scope of current models of visual recognition. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.

    THORN: Temporal Human-Object Relation Network for Action Recognition

    Most action recognition models treat human activities as unitary events. However, human activities often follow a hierarchy: many are compositional, and most are human-object interactions. In this paper we propose to recognize human actions by leveraging the set of interactions that defines an action. We present an end-to-end network, THORN, that leverages important human-object and object-object interactions to predict actions. The model is built on top of a 3D backbone network. Its key components are: 1) an object representation filter for modeling objects; 2) an object relation reasoning module to capture object relations; 3) a classification layer to predict the action labels. To show the robustness of THORN, we evaluate it on EPIC-Kitchen55 and EGTEA Gaze+, two of the largest and most challenging first-person human-object interaction datasets. THORN achieves state-of-the-art performance on both datasets.
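The three components named in the abstract can be sketched as a toy pipeline. This is an illustrative assumption, not the paper's implementation: the function names, the mean-pooling object filter, the elementwise pairwise interactions, and all tensor shapes are invented stand-ins for the learned modules.

```python
# Toy sketch of the THORN pipeline described above (hypothetical, not the
# paper's code): backbone features -> object representations -> object
# relation reasoning -> action classification.
import numpy as np

def object_representation_filter(features, num_objects=4):
    """Pool backbone features into one vector per object (assumed mean split)."""
    return [chunk.mean(axis=0) for chunk in np.array_split(features, num_objects)]

def relation_reasoning(object_reprs):
    """Stand-in for relation reasoning: average elementwise pairwise interactions."""
    relations = []
    for i, a in enumerate(object_reprs):
        for j, b in enumerate(object_reprs):
            if i != j:
                relations.append(a * b)  # elementwise object-object interaction
    return np.mean(relations, axis=0)

def classify(relation_repr, weights, bias):
    """Linear classification layer over the relation representation."""
    logits = relation_repr @ weights + bias
    return int(np.argmax(logits))

# Usage: random stand-in for 3D-backbone features (16 frames, 64-dim).
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 64))
objs = object_representation_filter(feats)
rel = relation_reasoning(objs)
action = classify(rel, rng.normal(size=(64, 10)), np.zeros(10))
```

In the real model each stage would be a learned network trained end-to-end; the sketch only shows how the three stages compose.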

    A Modeling Paradigm for Integrating Processes and Data at the Micro Level

    Despite the widespread adoption of BPM, many business processes are not adequately supported by existing BPM technology. In previous work we reported on the properties of these processes. A major insight was that, in accordance with the data model comprising object types and object relations, the modeling and execution of processes can be based on two levels of granularity: object behavior and object interactions. This paper focuses on micro processes, which capture object behavior and constitute a fundamental pillar of our framework for object-aware process management. Our approach applies the well-established concept of modeling object behavior in terms of states and state transitions. In contrast to existing work, we establish a mapping between attribute values and object states to ensure compliance between them. Finally, we provide a well-defined operational semantics enabling the automatic and dynamic generation of most end-user components at run-time (e.g., overview tables and user forms).

    Semantics of Multimedia in MPEG-7

    In this paper, we present the tools standardized by MPEG-7 for describing the semantics of multimedia. In particular, we focus on the abstraction model, entities, attributes and relations of MPEG-7 semantic descriptions. MPEG-7 tools can describe the semantics of specific instances of multimedia, such as one image or one video segment, but can also generalize these descriptions either to multiple instances of multimedia or to a set of semantic descriptions. The key components of MPEG-7 semantic descriptions are semantic entities such as objects and events, attributes of these entities such as labels and properties, and relations between these entities, such as an object being the patient of an event. The descriptive power and usability of these tools have been demonstrated in numerous experiments and applications, making them key candidates for enabling intelligent applications that deal with multimedia at human levels.

    A data model for autonomous objects

    Developments in the distribution and networking of computing power raise questions about the feasibility of centralised control in an information system. At the same time, the move towards information systems ranging over a number of different organisations brings us systems where central control is not desired. This calls for systems built of components that can function independently, under local control only. A move towards autonomous components can also clearly be seen in the area of active databases. Rules are seen as part of the behaviour of objects or of relations between objects. Following the encapsulation principle, this leads to the encapsulation of all rules with objects. The definition of such an object is independent of other objects. To support these developments, we define the data model for autonomous objects proposed in this report. An autonomous object is an object with its own thread of control. The behaviour of an autonomous object is defined by methods, rules and dynamic constraints; the latter two refer to the complete history kept with the object. The semantics of a relation between objects is captured in relation objects. This enables us to represent arbitrarily complex conditions on initiating and terminating relations and to have arbitrary actions taken on certain events. Depending on the relations an object has, its capabilities will evolve. This is achieved through addons. An addon defines capabilities an object can be, temporarily, extended with. Structure is brought into the mass of objects at the instance level through the objects at the class level. Class objects occur for all object classes, relation object classes and addons. Their function as object containers makes it possible to address groups of objects, for example in queries.