8 research outputs found

    Video trajectory analysis using unsupervised clustering and multi-criteria ranking

    The use of surveillance cameras for visual monitoring has increased significantly, and manual analysis of the large volumes of video they record is not feasible at scale. In many applications, deep learning-guided supervised systems are used to track objects and identify unusual patterns; however, such systems depend on supervised learning, which may not always be possible. Unsupervised methods rely on suitable features and demand cluster analysis by experts. In this paper, we propose an unsupervised trajectory clustering method referred to as t-Cluster. Our method indexes object trajectories by fusing high-level interpretable features such as origin, destination, path, and deviation. The resulting clusters are then fused using multi-criteria decision making, and the trajectories are ranked accordingly, so that abnormal patterns are placed at the top of the list. We have evaluated our algorithm against competitive baseline trajectory clustering methods on videos from publicly available benchmark datasets, obtaining higher clustering accuracies with significantly lower computational overhead.
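
    A minimal sketch of the pipeline shape described above: interpretable per-trajectory features, unsupervised clustering, and a multi-criteria score that pushes unusual trajectories to the top of the ranking. This is not the published t-Cluster algorithm; the feature set, the use of k-means, and the ranking weights are all illustrative assumptions.

        # Illustrative sketch only: unsupervised trajectory clustering with a
        # simple multi-criteria ranking. The features, weights, and k-means
        # are assumptions, not the published t-Cluster method.
        import numpy as np
        from sklearn.cluster import KMeans

        def trajectory_features(traj):
            """traj: (T, 2) array of (x, y) positions for one tracked object."""
            origin, dest = traj[0], traj[-1]
            steps = np.diff(traj, axis=0)
            path_len = np.linalg.norm(steps, axis=1).sum()
            # Deviation: mean distance of the points from the straight
            # origin-to-destination chord.
            chord = dest - origin
            chord_len = np.linalg.norm(chord) + 1e-9
            rel = traj - origin
            cross = np.abs(rel[:, 0] * chord[1] - rel[:, 1] * chord[0])
            deviation = (cross / chord_len).mean()
            return np.concatenate([origin, dest, [path_len, deviation]])

        def rank_trajectories(trajs, n_clusters=5, weights=(1.0, 1.0)):
            X = np.array([trajectory_features(t) for t in trajs])
            X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)
            km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
            dist = np.linalg.norm(X - km.cluster_centers_[km.labels_], axis=1)
            size = np.bincount(km.labels_)[km.labels_]
            # Multi-criteria score: being far from the assigned centroid and
            # belonging to a small cluster both mark a trajectory as unusual.
            score = weights[0] * dist + weights[1] / size
            return np.argsort(-score), km.labels_, score   # most unusual first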

    Making sense of sensory input

    This paper attempts to answer a central question in unsupervised learning: what does it mean to "make sense" of a sensory sequence? In our formalization, making sense involves constructing a symbolic causal theory that both explains the sensory sequence and satisfies a set of unity conditions. The unity conditions insist that the constituents of the causal theory -- objects, properties, and laws -- must be integrated into a coherent whole. On our account, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. Our second contribution is a computer implementation, the Apperception Engine, designed to satisfy the above requirements. Our system is able to produce interpretable, human-readable causal theories from very small amounts of data, because of the strong inductive bias provided by the unity conditions. A causal theory produced by our system is able to predict future sensor readings, retrodict earlier readings, and impute (fill in the blanks of) missing sensory readings, in any combination. We tested the engine in a variety of domains, including cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction intelligence tests. In each domain, we tested our engine's ability to predict future sensor values, retrodict earlier sensor values, and impute missing sensory data. The engine performs well in all these domains, significantly outperforming neural net baselines. We note in particular that on the sequence induction intelligence tests, our system achieved human-level performance. This is notable because our system is not a bespoke system designed specifically to solve intelligence tests, but a general-purpose system designed to make sense of any sensory sequence.
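
    A deliberately tiny toy, not the Apperception Engine, to illustrate the idea of unsupervised program synthesis: search a small hypothesis space of update rules for one that explains an observed symbolic sequence, then use the found rule to predict, retrodict, and impute readings. The hypothesis space (next = (prev + k) mod m) and the example sequence are invented for illustration.

        # Toy illustration of "making sense" as unsupervised program synthesis.
        # Not the Apperception Engine: the hypothesis space is just
        # next = (prev + k) mod m, kept tiny so the search is exhaustive.
        def candidate_rules(modulus=10):
            for k in range(modulus):
                yield k, (lambda x, k=k, m=modulus: (x + k) % m)

        def explains(rule, seq):
            """True if the rule maps every observed value to its observed successor."""
            return all(a is None or b is None or rule(a) == b
                       for a, b in zip(seq, seq[1:]))

        def make_sense(seq, modulus=10):
            """Return the first rule consistent with the (possibly gappy) sequence."""
            for k, rule in candidate_rules(modulus):
                if explains(rule, seq):
                    return k, rule
            return None

        obs = [2, 5, None, 1, 4]                   # one missing sensor reading
        k, rule = make_sense(obs)                  # finds k = 3
        imputed = rule(obs[1])                     # fill the gap:     5 -> 8
        predicted = rule(obs[-1])                  # next reading:     4 -> 7
        retrodicted = (obs[0] - k) % 10            # reading before 2: 9
        print(k, imputed, predicted, retrodicted)  # 3 8 7 9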

    Semi-supervised online structure learning for composite event recognition

    Online structure learning approaches, such as those stemming from statistical relational learning, enable the discovery of complex relations in noisy data streams. However, these methods assume the existence of fully-labelled training data, which is unrealistic for most real-world applications. We present a novel approach for completing the supervision of a semi-supervised structure learning task. We incorporate graph-cut minimisation, a technique that derives labels for unlabelled data based on their distance to their labelled counterparts. In order to adapt graph-cut minimisation to first-order logic, we employ a suitable structural distance for measuring the distance between sets of logical atoms. The labelling process is achieved online (single-pass) by means of a caching mechanism and the Hoeffding bound, a statistical tool for approximating globally optimal decisions from locally optimal ones. We evaluate our approach on the task of composite event recognition, using a benchmark dataset for human activity recognition as well as a real dataset for maritime monitoring. The evaluation suggests that our approach can effectively complete the missing labels and, ultimately, improve the accuracy of the underlying structure learning system. © 2019, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature
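
    A simplified sketch of the label-completion step under stated assumptions: the paper uses graph-cut minimisation over a structural distance between sets of logical atoms, whereas here a plain nearest-labelled-neighbour rule with a Jaccard-style distance stands in for brevity, and the Hoeffding bound appears only as the margin one might use to commit to a local decision online. The atom names are hypothetical.

        # Simplified sketch of completing missing labels in a semi-supervised
        # structure learning setting. Nearest-labelled-neighbour over a
        # Jaccard-style distance stands in for the paper's graph-cut step.
        import math

        def structural_distance(atoms_a, atoms_b):
            """Distance between two examples, each given as a set of ground atoms."""
            union = atoms_a | atoms_b
            if not union:
                return 0.0
            return 1.0 - len(atoms_a & atoms_b) / len(union)

        def hoeffding_margin(value_range, delta, n):
            """Hoeffding bound: error margin after n observations, used to decide
            when a locally best labelling decision can be committed online."""
            return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2 * n))

        def complete_labels(labelled, unlabelled):
            """labelled: list of (atom_set, label); unlabelled: list of atom sets."""
            completed = []
            for atoms in unlabelled:
                nearest = min(labelled, key=lambda ex: structural_distance(atoms, ex[0]))
                completed.append((atoms, nearest[1]))
            return completed

        # Hypothetical activity-recognition atoms, not the benchmark's encoding.
        labelled = [({"close(a,b)", "walking(a)", "walking(b)"}, "meeting"),
                    ({"far(a,b)", "running(a)"}, "no_event")]
        unlabelled = [{"close(a,b)", "walking(a)", "standing(b)"}]
        print(complete_labels(labelled, unlabelled))   # labelled as 'meeting'
        print(hoeffding_margin(1.0, 0.05, 50))         # ~0.17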

    Kant's cognitive architecture

    Imagine a machine, equipped with sensors, receiving a stream of sensory information. It must, somehow, make sense of this stream of sensory data. But what, exactly, does this involve? We have an intuitive understanding of what is involved in “making sense” of sensory data – but can we specify precisely what is involved? Can this intuitive notion be formalized? In this thesis, we make three contributions. First, we provide a precise formalization of what it means to “make sense” of a sensory sequence. According to our definition, making sense means constructing a symbolic causal theory that explains the sensory sequence and satisfies a set of unity conditions inspired by Kant’s discussion in the first half of the Critique of Pure Reason. On our interpretation, making sense of sensory input is a type of program synthesis, but it is unsupervised program synthesis. Our second contribution is a computer implementation, the Apperception Engine, designed to satisfy our requirements for making sense of a sensory sequence. Our system is able to produce interpretable, human-readable causal theories from very small amounts of data, because of the strong inductive bias provided by the Kantian unity constraints. A causal theory produced by our system is able to predict future sensor readings, retrodict earlier readings, and impute missing sensory readings; in fact, it is able to do all three tasks simultaneously. The engine is implemented in Answer Set Programming (ASP) and induces theories expressed in an extension of Datalog that includes causal rules and constraints. We test the engine in a variety of domains, including cellular automata, rhythms and simple nursery tunes, multi-modal binding problems, occlusion tasks, and sequence induction IQ tests. In each domain, we test our engine’s ability to predict future sensor values, retrodict earlier sensor values, and impute missing sensory data. The Apperception Engine performs well in all these domains, significantly outperforming neural net baselines. These results are significant because neural nets typically struggle to solve the binding problem (where information from different modalities must be combined into different aspects of one unified object) and fail on occlusion tasks (in which objects are sometimes visible and sometimes obscured from view). We note in particular that on the sequence induction IQ tasks, our system achieves human-level performance. This is notable because the Apperception Engine was not designed to solve these IQ tasks; it is not a bespoke, hand-engineered solution to this particular domain. Rather, it is a general-purpose system that attempts to make sense of any sensory sequence, and that happens to solve these IQ tasks “out of the box”. Our third contribution is a major extension of the engine to handle noisy and ambiguous data. While the initial implementation assumes the sensory input has already been preprocessed into ground atoms of first-order logic, our extension makes sense of raw unprocessed input – a sequence of pixel images from a video camera, for example. The resulting system is a neuro-symbolic framework for distilling interpretable theories out of streams of raw, unprocessed sensory experience.
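
    A schematic sketch of the two layers the thesis describes, using invented predicate names and plain Python functions as stand-ins for both the neural front-end and the induced Datalog-style causal rules; the actual engine induces its theories in ASP rather than hand-writing them as done here.

        # Schematic sketch: raw frames are mapped to ground atoms, and a
        # causal rule advances the symbolic state. Predicate names and the
        # hand-written rule are stand-ins for an induced theory.
        from typing import FrozenSet, List

        Atom = str
        State = FrozenSet[Atom]

        def perceive(frame: List[List[int]]) -> State:
            """Neural front-end stand-in: pixels brighter than a threshold
            become ground atoms on(x, y)."""
            return frozenset(f"on({x},{y})"
                             for y, row in enumerate(frame)
                             for x, v in enumerate(row) if v > 128)

        def causal_step(state: State) -> State:
            """One hand-written causal rule: every lit cell drifts one
            column to the right at the next time step."""
            nxt = set()
            for atom in state:
                x, y = map(int, atom[3:-1].split(","))
                nxt.add(f"on({x + 1},{y})")
            return frozenset(nxt)

        frame = [[0, 200, 0],
                 [0, 0, 0]]
        s0 = perceive(frame)      # frozenset({'on(1,0)'})
        s1 = causal_step(s0)      # frozenset({'on(2,0)'})
        print(s0, s1)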
