Multi-target tracking and performance evaluation on videos
PhD
Multi-target tracking is the process that allows the extraction of object motion patterns of
interest from a scene. Motion patterns are often described through metadata representing object
locations and shape information. In the first part of this thesis we discuss the state-of-the-art
methods aimed at accomplishing this task on monocular views and also analyse the methods for
evaluating their performance. The second part of the thesis describes our research contribution
to these topics.
We begin by presenting a method for multi-target track-before-detect (MT-TBD)
formulated as a particle filter. The novelty involves the inclusion of the target identity
(ID) into the particle state, which enables the algorithm to deal with an unknown and unlimited
number of targets. We propose a probabilistic model of particle birth and death based on Markov
Random Fields. This model allows us to overcome the problem of the mixing of IDs of close
targets.
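The core idea above can be sketched in a few lines: a particle filter whose state carries a target identity (ID) alongside position, so a single particle set can represent an unknown number of targets and IDs survive resampling. This is an illustrative toy with a random-walk motion model and nearest-detection weighting, not the thesis' actual MT-TBD formulation or its MRF birth/death model.

```python
import math
import random

def propagate(particles, noise=1.0):
    """Move each (id, x, y) particle under a simple random-walk motion model."""
    return [(pid, x + random.gauss(0, noise), y + random.gauss(0, noise))
            for (pid, x, y) in particles]

def weight(particles, detections, sigma=2.0):
    """Weight each particle by its proximity to the nearest detection."""
    weights = []
    for (_, x, y) in particles:
        d2 = min((x - dx) ** 2 + (y - dy) ** 2 for (dx, dy) in detections)
        weights.append(math.exp(-d2 / (2 * sigma ** 2)))
    total = sum(weights) or 1.0
    return [w / total for w in weights]

def resample(particles, weights):
    """Multinomial resampling; the ID travels with each surviving particle."""
    return random.choices(particles, weights=weights, k=len(particles))
```

Because the ID is part of the state, per-target estimates can be recovered afterwards by grouping surviving particles by ID.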
We then propose three evaluation measures that take into account target-size variations, combine
accuracy and cardinality errors, quantify long-term tracking accuracy at different accuracy
levels, and evaluate ID changes relative to the duration of the track in which they occur. This
set of measures does not require pre-setting of parameters and allows one to holistically evaluate
tracking performance in an application-independent manner.
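As a rough illustration of the kind of measures described (not the thesis' exact definitions), the sketch below combines per-frame localisation accuracy with a cardinality penalty, and normalises ID changes by the duration of the track in which they occur; function names and the exact combination rule are assumptions for illustration.

```python
def frame_score(matched_overlaps, n_gt, n_pred):
    """Average overlap of matched target pairs, penalised by the
    mismatch between ground-truth and predicted target counts."""
    accuracy = (sum(matched_overlaps) / len(matched_overlaps)
                if matched_overlaps else 0.0)
    cardinality_penalty = abs(n_gt - n_pred) / max(n_gt, n_pred, 1)
    return max(0.0, accuracy - cardinality_penalty)

def id_change_rate(id_changes, track_length):
    """ID changes relative to the duration of the track they occur in,
    so a swap on a long track weighs less than one on a short track."""
    return id_changes / max(track_length, 1)
```

Note that neither function needs a pre-set threshold parameter, in the spirit of the parameter-free evaluation described above.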
Lastly, we present a framework for multi-target localisation applied on scenes with a high
density of compact objects. Candidate target locations are initially generated by extracting object
features from intensity maps using an iterative method based on a gradient-climbing technique
and an isocontour slicing approach. A graph-based data association method for multi-target
tracking is then applied to link valid candidate target locations over time and to discard those
which are spurious. This method can deal with point targets having indistinguishable appearance
and unpredictable motion.
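The temporal linking step can be pictured as a graph whose nodes are per-frame candidate locations and whose edges connect nearby candidates in consecutive frames; chains that stay too short are discarded as spurious. The greedy nearest-neighbour sketch below is an assumed simplification, not the thesis' actual graph-based association method.

```python
def link_candidates(frames, max_dist=5.0, min_track_len=3):
    """frames: list of per-time-step lists of (x, y) candidate locations.
    Returns tracks (lists of points) at least min_track_len frames long."""
    tracks = [[pt] for pt in frames[0]] if frames else []
    for candidates in frames[1:]:
        unused = list(candidates)
        for track in tracks:
            x, y = track[-1]
            best = None
            for c in unused:
                d = ((c[0] - x) ** 2 + (c[1] - y) ** 2) ** 0.5
                if d <= max_dist and (best is None or d < best[0]):
                    best = (d, c)
            if best:
                track.append(best[1])
                unused.remove(best[1])
        tracks.extend([c] for c in unused)  # unmatched candidates start new tracks
    return [t for t in tracks if len(t) >= min_track_len]  # drop spurious chains
```

In this toy version, appearance plays no role at all, which mirrors the setting above of point targets with indistinguishable appearance.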
MT-TBD is evaluated and compared with state-of-the-art methods on real-world surveillance
videos.
This work was supported by the EU, under the FP7 project APIDIS (ICT-216023), and by the
Artemis JU and TSB as part of the COPCAMS project (332913).
Expanding Task Diversity in Explanation-Based Interactive Task Learning
The possibility of having artificial agents that can interact with humans and learn completely new tasks through instruction and demonstration is an exciting prospect. This is the goal of the emerging research area of Interactive Task Learning. Solving this problem requires integrating many capabilities across AI to create general robot learners that can operate in a variety of environments. One particular challenge is that the space of possible tasks is extremely large and varied. Developing approaches that cover this space is difficult, made more so by having to learn from a limited number of high-quality examples given through interaction with a teacher.
In this dissertation, we identify three major dimensions of task complexity (diverse types of actions, task formulations, and task modifiers), and describe extensions that demonstrate greater learning capabilities for each dimension than previous work. First, we extend the representations and learning mechanism for innate tasks so the agent can learn tasks that utilize many different types of actions beyond physical object manipulation, such as communication and mental operations. Second, we implement a novel goal-graph representation that supports both goal-based and procedural tasks. Thus, the instructor can formulate a task as achieving a goal and let the agent use planning to execute it, or can formulate the task as executing a procedure, or sequence of steps, when a goal is not easy to define. This also allows interesting cases of a task that blends elements of a procedure and a goal. Third, we add support for learning subtasks with various modifying clauses, such as temporal constraints, conditions, or looping structures. Crucially, we show that the agent can learn and generalize a canonical version of a task and then combine it with these various modifiers within a task hierarchy without requiring additional instruction.
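The distinction between the two task formulations can be illustrated with a toy sketch (this is not Rosie's actual representation in Soar, and the dictionary structure and one-action "planner" are assumptions for illustration): a goal-based task is executed by searching for actions until the goal predicate holds, while a procedural task is executed as the instructed sequence of steps.

```python
def execute(task, state):
    """Run a toy task against a state, dispatching on its formulation."""
    if task["kind"] == "goal":
        # Goal-based: keep applying an available action until the goal holds.
        # (A real agent would plan over many actions; one suffices here.)
        while not task["goal"](state):
            state = task["actions"][0](state)
    else:
        # Procedural: execute the instructed steps in order.
        for step in task["steps"]:
            state = step(state)
    return state
```

A blended task, as mentioned above, would interleave the two: some steps given explicitly, others left as subgoals for the agent to plan over.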
This is done in the context of Rosie -- an agent implemented within the Soar cognitive architecture that can learn completely new tasks in one shot through situated interactive instruction. By leveraging explanation-based generalization and domain knowledge, the agent quickly learns new hierarchical tasks, including their structure, arguments, goals, execution policies, and task decompositions, through natural language instruction. It has been used with various robotic platforms, though most of the learning demonstrations and evaluations in this work use a simulated mobile robot in a multi-room, partially-observable environment. In the end, we show that the agent can combine all of these extensions while learning complex hierarchical tasks that cover extended periods of time and demonstrate significant flexibility.
PhD
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/168026/1/mininger_1.pd