
    Evolving integrated multi-model framework for online multiple time series prediction

    Time series prediction has been extensively researched in both the statistical and computational intelligence literature, with robust methods being developed that can be applied across any given application domain. A much less researched problem is multiple time series prediction, where the objective is to simultaneously forecast the values of multiple variables which interact with each other in time-varying amounts continuously over time. In this paper we describe the use of a novel Integrated Multi-Model Framework (IMMF) that combines models developed at three different levels of data granularity, namely the Global, Local and Transductive models, to perform multiple time series prediction. The IMMF is implemented by training a neural network to assign relative weights to predictions from the models at the three different levels of data granularity. Our experimental results indicate that the IMMF significantly outperforms well-established methods of time series prediction when applied to the multiple time series prediction problem.
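    The weighting idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the simplest possible "neural network", a softmax-gated convex combination of the three models' predictions, trained by gradient descent on squared error.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def combine(preds, logits):
    # preds: (3, n) predictions from the Global, Local and Transductive
    # models; the combined forecast is a convex combination of the three.
    return softmax(logits) @ preds

def mse(w, preds, target):
    return np.mean((w @ preds - target) ** 2)

def fit_weights(preds, target, steps=2000, lr=0.05):
    # Learn softmax-normalised combination weights by gradient descent.
    logits = np.zeros(preds.shape[0])
    n = preds.shape[1]
    for _ in range(steps):
        w = softmax(logits)
        err = w @ preds - target                       # (n,)
        grad_w = preds @ err                           # dL/dw (up to a constant)
        # chain rule through softmax: Jacobian = diag(w) - w w^T
        grad_logits = (np.diag(w) - np.outer(w, w)) @ grad_w
        logits -= lr * grad_logits / n
    return softmax(logits)
```

    In the actual framework the weights would be predicted by a trained network per time step rather than fixed, but the gating principle is the same.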

    Face recognition in video surveillance from a single reference sample through domain adaptation

    Face recognition (FR) has received significant attention during the past decades in many applications, such as law enforcement, forensics, access controls, information security and video surveillance (VS), due to its covert and non-intrusive nature. FR systems specialized for VS seek to accurately detect the presence of target individuals of interest over a distributed network of video cameras under uncontrolled capture conditions. Recognizing faces of target individuals in such environments is challenging because the appearance of faces varies due to changes in pose, scale, illumination, occlusion, blur, etc. The computational complexity is also an important consideration because of the growing number of cameras, and the processing time of state-of-the-art face detection, tracking and matching algorithms. In this thesis, adaptive systems are proposed for accurate still-to-video FR, where a single (or very few) reference still or a mug-shot is available to design a facial model for the target individual. This is a common situation in real-world watch-list screening applications due to the cost and feasibility of capturing reference stills, and managing facial models over time. The limited number of reference stills can adversely affect the robustness of facial models to intra-class variations, and therefore the performance of still-to-video FR systems. Moreover, a specific challenge in still-to-video FR is the shift between the enrollment domain, where high-quality reference faces are captured under controlled conditions from still cameras, and the operational domain, where faces are captured with video cameras under uncontrolled conditions. To overcome the challenges of such single sample per person (SSPP) problems, 3 new systems are proposed for accurate still-to-video FR that are based on multiple face representations and domain adaptation. In particular, this thesis presents 3 contributions. 
These contributions are described with more details in the following statements. In Chapter 3, a multi-classifier framework is proposed for robust still-to-video FR based on multiple and diverse face representations of a single reference face still. During enrollment of a target individual, the single reference face still is modeled using an ensemble of SVM classifiers based on different patches and face descriptors. Multiple feature extraction techniques are applied to patches isolated in the reference still to generate a diverse SVM pool that provides robustness to common nuisance factors (e.g., variations in illumination and pose). The estimation of discriminant feature subsets, classifier parameters, decision thresholds, and ensemble fusion functions is achieved using the high-quality reference still and a large number of faces captured in lower quality video of non-target individuals in the scene. During operations, the most competent subset of SVMs is dynamically selected according to capture conditions. Finally, a head-face tracker gradually regroups faces captured from different people appearing in a scene, while each individual-specific ensemble performs face matching. The accumulation of matching scores per face track leads to robust spatio-temporal FR when accumulated ensemble scores surpass a detection threshold. Experimental results obtained with the Chokepoint and COX-S2V datasets show a significant improvement in performance w.r.t. reference systems, especially when individual-specific ensembles (1) are designed using exemplar-SVMs rather than one-class SVMs, and (2) exploit score-level fusion of local SVMs (trained using features extracted from each patch), rather than using either decision-level or feature-level fusion with a global SVM (trained by concatenating features extracted from patches). 
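The score-level fusion and track-based accumulation described in Chapter 3 can be sketched as follows. The function names and the simple (optionally weighted) averaging rule are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

def fuse_patch_scores(patch_scores, weights=None):
    # Score-level fusion: a (possibly weighted) average of the scores
    # produced by the local, patch-wise SVMs for one probe face.
    s = np.asarray(patch_scores, dtype=float)
    if weights is None:
        weights = np.full(len(s), 1.0 / len(s))
    return float(weights @ s)

def track_is_match(fused_scores_over_track, detection_threshold):
    # Spatio-temporal decision: accumulate fused scores along a face
    # track and declare a match once the sum surpasses the threshold.
    return float(np.sum(fused_scores_over_track)) >= detection_threshold
```

Fusing at the score level keeps each patch classifier independent, which is what allows the most competent subset of SVMs to be swapped in and out per capture condition.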
In Chapter 4, an efficient multi-classifier system (MCS) is proposed for accurate still-to-video FR based on multiple face representations and domain adaptation (DA). An individual-specific ensemble of exemplar-SVM (e-SVM) classifiers is thereby designed to improve robustness to intra-class variations. During enrollment of a target individual, an ensemble is used to model the single reference still, where multiple face descriptors and random feature subspaces are used to generate a diverse pool of patch-wise classifiers. To adapt these ensembles to the operational domains, e-SVMs are trained using labeled face patches extracted from the reference still versus patches extracted from cohort and other non-target stills mixed with unlabeled patches extracted from the corresponding face trajectories captured with surveillance cameras. During operations, the most competent classifiers per given probe face are dynamically selected and weighted based on internal criteria determined in the feature space of e-SVMs. This chapter also investigates the impact of using different training schemes for DA, as well as the use of a validation set of non-target faces extracted from stills and video trajectories of unknown individuals in the operational domain. The results indicate that the proposed system can surpass state-of-the-art accuracy, yet with a significantly lower computational complexity. In Chapter 5, a deep convolutional neural network (CNN) is proposed to cope with the discrepancies between facial regions of interest (ROIs) isolated in still and video faces for robust still-to-video FR. To that end, a face-flow autoencoder CNN called FFA-CNN is trained using both still and video ROIs in a supervised, end-to-end, multi-task learning setup. A novel loss function containing a weighted combination of pixel-wise, symmetry-wise and identity-preserving losses is introduced to optimize the network parameters. 
The proposed FFA-CNN incorporates a reconstruction network and a fully-connected classification network, where the former reconstructs a well-illuminated frontal ROI with neutral expression from a pair of low-quality non-frontal video ROIs and the latter is utilized to compare the still and video representations to provide matching scores. Thus, integrating the proposed weighted loss function with a supervised end-to-end training approach leads to the generation of high-quality frontal faces and to discriminative face representations that are similar for the same identity. Simulation results obtained on the challenging COX Face DB confirm that the proposed FFA-CNN achieves convincing performance compared to current state-of-the-art CNN-based FR systems.
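The weighted FFA-CNN loss described above could look roughly like the following sketch. The exact form of each term and the weights `w` are hypothetical, since the abstract does not specify them; real training would compute these on network tensors rather than NumPy arrays.

```python
import numpy as np

def pixel_loss(recon, target):
    # Pixel-wise term: mean squared error between the reconstructed
    # frontal ROI and the reference still ROI.
    return float(np.mean((recon - target) ** 2))

def symmetry_loss(recon):
    # Symmetry-wise term: penalise left/right asymmetry of the
    # reconstructed frontal face (compare with its horizontal flip).
    return float(np.mean((recon - recon[:, ::-1]) ** 2))

def identity_loss(emb_recon, emb_still):
    # Identity-preserving term: cosine distance between embeddings of
    # the reconstruction and of the reference still.
    num = float(np.dot(emb_recon, emb_still))
    den = float(np.linalg.norm(emb_recon) * np.linalg.norm(emb_still)) + 1e-12
    return 1.0 - num / den

def ffa_loss(recon, target, emb_recon, emb_still, w=(1.0, 0.1, 1.0)):
    # Weighted combination; the weights w are placeholder values.
    return (w[0] * pixel_loss(recon, target)
            + w[1] * symmetry_loss(recon)
            + w[2] * identity_loss(emb_recon, emb_still))
```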

    A Novel Machine Learning Classifier Based on a Qualia Modeling Agent (QMA)

    This dissertation addresses a problem found in supervised machine learning (ML) classification: the target variable, i.e., the variable a classifier predicts, has to be identified before training begins and cannot change during training and testing. This research develops a computational agent which overcomes this problem. The Qualia Modeling Agent (QMA) is modeled after two cognitive theories: Stanovich's tripartite framework, which proposes that learning results from interactions between conscious and unconscious processes; and the Integrated Information Theory (IIT) of Consciousness, which proposes that the fundamental structural elements of consciousness are qualia. By modeling the informational relationships of qualia, the QMA allows for retaining and reasoning over data sets in a non-ontological, non-hierarchical qualia space (QS). This novel computational approach supports concept drift, by allowing the target variable to change ad infinitum without re-training, while achieving classification accuracy comparable to or greater than benchmark classifiers. Additionally, the research produced a functioning model of Stanovich's framework, and a computationally tractable working solution for a representation of qualia, which, when exposed to new examples, is able to match the causal structure and generate new inferences.

    The End Signs! Are We Getting the Message?

    The problem addressed in this dissertation has three dimensions: imminent global catastrophe, the elitist tyranny responsible for it, and Christian detachment from both. The purpose of this dissertation is not to solve the problem in any of those three dimensions. The aim is threefold: to deconstructively demonstrate the reality of the problem; to expose its historical roots in philosophy, science, and theology; and to offer a case-study example of how the problem may be clearly viewed and understood for the purposes of 21st-century Christian life. The case study is not simple or easy, but neither is the problem it addresses. Semiotics, the theory of signs, is the philosophical frame of reference, as pioneered by American philosopher Charles Sanders Peirce (1839-1914). James H. Fetzer provides intensional realism as a Peircean semiotic philosophy of science. Christian realism based on Peirce's theory of signs is a key theme, drawn from Leonard Sweet's Christianity. The constructive example that concludes the dissertation represents an individual's apologetic Christian realism as a single-case study example, including philosophical and scientific foundations. At the same time, it also represents a viable de-secularized immanent frame and social imaginary for individual as well as relational Christian being and presence in 21st-century reality. [35]
    35. Sweet, So Beautiful and Leonard Sweet, Giving Blood: A Fresh Paradigm for Preaching (Grand Rapids, MI: Zondervan, 2014), Kindle; James H. Fetzer, Scientific Knowledge: Causation, Explanation, and Corroboration, Boston Studies in the Philosophy of Science, vol. 69 (Dordrecht, NL: Springer Netherlands, 1981); James H. Fetzer, Computers and Cognition: Why Minds Are Not Machines, Studies in Cognitive Systems, vol. 25 (Dordrecht, NL: Kluwer Academic Publishers, 2001); Iain McGilchrist, The Master and His Emissary: The Divided Brain and the Making of the Western World (New Haven, CT: Yale University Press, 2009); the works of Charles Sanders Peirce (see APPENDICES: Abbreviations, Citing Charles Sanders Peirce); Taylor, Modern Social Imaginaries; Taylor, A Secular Age; Taylor, "Buffered and Porous Selves."

    Rendering computational conditions: experiencing the relationality of data and algorithm through multisensory digital artworks

    This practice-based doctoral research develops audiovisual artworks to position computational processes in relation with conditions and materialities of their generation, such as data centres and the networked transmission and storage of data; and the conditioning interdependencies of data and algorithm. Although often considered as discrete entities and functions, this research proposes instead that data and algorithm are processually bound together within computation’s complex processes. Computation operates at multiple scales of nonsensibility: from voltage differences; to the imperceptibility of data and algorithmic operations; to the secrecy surrounding data centres. Due partly to this inaccessibility, many contemporary data rendering practices fail to deal with the conditioning aspects of data. They deal, rather, with how semantic information can be extracted from data; how data (alone) might be considered materially; or how data can be rendered to produce immersive, sometimes sublime experiences. Instead, this doctoral research develops techniques of working with data transversally across its materials, processes, and affects. I develop my ideas and practice alongside the work of artists who likewise foreground a transversal approach to data and algorithm: Ryoji Ikeda, Addie Wagenknecht, John Gerrard, Ryoichi Kurokawa, and Norimichi Hirakawa. Drawing on Alfred North Whitehead’s modes of perception, I explore computation’s nonsensibility, and data and algorithm’s entanglement. I consider how relations, conditions, and affects of computation are always disclosed in perception through the conjunction of the modes of sense perception (‘presentational immediacy’) and the prehension of associative relations in ‘causal efficacy.’ I also deploy Gilbert Simondon's concept of transduction to negotiate how data processes integrate aspects from larger systems. 
Together, the practice and dissertation contribute artistic techniques for engaging with the affective and relational resonances of these processes. Techniques of non-visualisation are explored to emphasise the lack of sensibility or cognitive insight of data itself. Simulations of data centres, data archives, and hard drive destruction articulate data's paradoxical ephemerality and materiality. Multisensory relations are explored through transducing data across sense modalities. As an ensemble of techniques, this research enacts the intrinsic relationality of data and algorithm.

    Semantic Spaces for Video Analysis of Behaviour

    There is ever-growing interest in the computer vision community in human behaviour analysis based on visual sensors. These interests generally include: (1) behaviour recognition - given a video clip or a specific spatio-temporal volume of interest, discriminate it into one or more of a set of pre-defined categories; (2) behaviour retrieval - given a video or textual description as a query, search for video clips with related behaviour; (3) behaviour summarisation - given a number of video clips, summarise representative and distinct behaviours. Although countless efforts have been dedicated to the problems mentioned above, few works have attempted to analyse human behaviours in a semantic space. In this thesis, we define semantic spaces as a collection of high-dimensional Euclidean spaces in which semantically meaningful events, e.g. individual words, phrases and visual events, can be represented as vectors or distributions, which are referred to as semantic representations. Within a semantic space, texts and visual events can be quantitatively compared by inner product, distance and divergence. The introduction of semantic spaces brings many benefits for visual analysis. For example, discovering semantic representations for visual data can facilitate semantically meaningful video summarisation, retrieval and anomaly detection. Semantic spaces can also seamlessly bridge categories and datasets which are conventionally treated as independent. This encourages the sharing of data and knowledge across categories and even datasets, to improve recognition performance and reduce labelling effort. Moreover, a semantic space has the ability to generalise learned models beyond known classes, which is usually referred to as zero-shot learning. Nevertheless, discovering such a semantic space is non-trivial, for several reasons: (1) a semantic space is hard to define manually. Humans have a good sense of the semantic relatedness between visual and textual instances. 
But a measurable and finite semantic space can be difficult to construct with limited manual supervision. As a result, the semantic space is constructed from data, learned in an unsupervised manner. (2) It is hard to build a universal semantic space; the space is always context-dependent. So it is important to build the semantic space upon selected data such that it is always meaningful within its context. Even with a well-constructed semantic space, challenges remain, including: (3) how to represent visual instances in the semantic space; and (4) how to mitigate the misalignment of visual feature and semantic spaces across categories and even datasets when knowledge/data are generalised. This thesis tackles the above challenges by exploiting data from different sources and building a contextual semantic space with which data and knowledge can be transferred and shared to facilitate general video behaviour analysis. To demonstrate the efficacy of semantic spaces for behaviour analysis, we focus on real-world problems including surveillance behaviour analysis, zero-shot human action recognition and zero-shot crowd behaviour recognition, with techniques specifically tailored to the nature of each problem. Firstly, for video surveillance scenes, we propose to discover semantic representations from the visual data in an unsupervised manner, owing to the wide availability of unlabelled visual data in surveillance systems. By representing visual instances in the semantic space, data and annotations can be generalised to new events and even new surveillance scenes. Specifically, to detect abnormal events, this thesis studies a geometrical alignment between semantic representations of events across scenes. Semantic actions can thus be transferred to new scenes and abnormal events can be detected in an unsupervised way. 
To model multiple surveillance scenes simultaneously, we show how to learn a shared semantic representation across a group of semantically related scenes through a multi-layer clustering of scenes. With multi-scene modelling we show how to improve surveillance tasks including scene activity profiling/understanding, cross-scene query-by-example, behaviour classification, and video summarisation. Secondly, to avoid extremely costly and ambiguous video annotation, we investigate how to generalise recognition models learned from known categories to novel ones, which is often termed zero-shot learning. To exploit the limited human supervision, e.g. category names, we construct the semantic space via a word-vector representation trained on a large textual corpus in an unsupervised manner. The representation of a visual instance in the semantic space is obtained by learning a visual-to-semantic mapping. We observe that blindly applying the mapping learned from known categories to novel categories can cause bias and deteriorate performance, a problem termed domain shift. To solve this problem we employ techniques including semi-supervised learning, self-training, hubness correction, multi-task learning and domain adaptation. In combination, these methods achieve state-of-the-art performance on the zero-shot human action recognition task. Lastly, we study the possibility of re-using known and manually labelled semantic crowd attributes to recognise rare and unknown crowd behaviours, a task termed zero-shot crowd behaviour recognition. Crucially, we point out that given the multi-labelled nature of semantic crowd attributes, zero-shot recognition can be improved by exploiting the co-occurrence between attributes. To summarise, this thesis studies methods for analysing video behaviours and demonstrates that exploiting semantic spaces for video analysis is advantageous and, more importantly, enables multi-scene analysis and zero-shot learning beyond conventional learning strategies.
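    The visual-to-semantic mapping and nearest-word-vector matching described above can be illustrated with a minimal sketch. A linear ridge-regression mapping and cosine-similarity matching are common choices in the zero-shot literature, but they are assumptions here, not necessarily the thesis's exact formulation.

```python
import numpy as np

def fit_visual_to_semantic(X, S, reg=1e-3):
    # Ridge regression from visual features X (n, d) to the word-vector
    # prototypes S (n, k) of the seen classes: W = (X^T X + reg I)^-1 X^T S.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ S)

def zero_shot_predict(x, W, unseen_prototypes):
    # Map the probe into the semantic space and pick the unseen class
    # whose word vector is most cosine-similar.
    s = x @ W
    P = np.asarray(unseen_prototypes, dtype=float)
    sims = P @ s / (np.linalg.norm(P, axis=1) * np.linalg.norm(s) + 1e-12)
    return int(np.argmax(sims))
```

    Because `W` is fit only on seen classes, probes from novel classes can land systematically off-target in the semantic space, which is exactly the domain-shift problem the thesis addresses with self-training, hubness correction and related techniques.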

    The Translocal Event and the Polyrhythmic Diagram

    This thesis identifies and analyses the key creative protocols in translocal performance practice, and ends with suggestions for new forms of transversal live and mediated performance practice, informed by theory. It argues that ontologies of emergence in dynamic systems nourish contemporary practice in the digital arts. Feedback in self-organised, recursive systems and organisms elicits change, and change transforms. The arguments trace concepts from chaos and complexity theory to virtual multiplicity, relationality, intuition and individuation (in the work of Bergson, Deleuze, Guattari, Simondon, Massumi, and other process theorists). It then examines the intersection of methodologies in philosophy, science and art and the radical contingencies implicit in the technicity of real-time, collaborative composition. Simultaneous forces or tendencies such as perception/memory, content/expression and instinct/intellect produce composites (experience, meaning, and intuition, respectively) that affect the sensation of interplay. The translocal event is itself a diagram - an interstice between the forces of the local and the global, between the tendencies of the individual and the collective. The translocal is a point of reference for exploring the distribution of affect, parameters of control and emergent aesthetics. Translocal interplay, enabled by digital technologies and network protocols, is ontogenetic and autopoietic; diagrammatic and synaesthetic; intuitive and transductive. KeyWorx is a software application developed for real-time, distributed, multimodal media processing. As a technological tool created by artists, KeyWorx supports this intuitive type of creative experience: a real-time, translocal "jamming" that transduces the lived experience of a "biogram," a synaesthetic hinge-dimension. The emerging aesthetics are processual - intuitive, diagrammatic and transversal.

    Active Residues

    My PhD studies the aftermath of the museum collection to show how the removal of the object leaves behind the multiplicity of its conditions. As an entry point, I probe a set of questions that arise from a sequence of events that happened in the autumn of 2018. It's a story that begins with an error: in six short hours in September, a disastrous fire brought an end to two centuries' worth of treasures held in Brazil's National Museum. Only a handful of the 20 million items that were housed at the museum survived the fire. In the age of algorithmic reproduction, it feels almost unimaginable that so many valuable objects were simply wiped off the face of the earth without leaving any digital trace. I propose that although the museum's objects no longer operate within their inherited institutional orders or colonial indexes, some of their constitutions, temperaments, and affordances are "dragged" with them from their original matter to the digital and information realm. The residues are unordered strata of matter, bio-form, and digital information that remained unclaimed by the institution. The museum's residues do not have form, like objects. Instead, they are the surplus of affects, tools, and affordances that arrive with the objects. They enunciate the futurity of the museum apparatus in its state of afterness. Museum afterness applies to the incomplete state between the "no longer" and the "not yet". Afterness is the state that comes after an event or an institutional structure has ended but the orders and relations that conditioned its existence are still active. I argue that the state of afterness not only stands for what comes after the institution but can potentially represent knowledge based on continuity of transformation between technical systems, matter formations, and biological life forms. 
Active Residues is a practice-theory research project where I use theoretical frameworks and performance-based methods to speculate on several "modes of afterness", which is how I define a set of modalities and practices stirred up in the wake of the museum that can become active sites for unlearning it.

    Scalable computing for earth observation - Application on Sea Ice analysis

    In recent years, deep learning (DL) networks have shown considerable improvements and have become a preferred methodology in many different applications. These networks have outperformed other classical techniques, particularly in large data settings. In satellite-based earth observation, for example, DL algorithms have demonstrated the ability to accurately learn complicated nonlinear relationships in input data, contributing to advances in this field. However, the training process of these networks has heavy computational overheads. The reason is two-fold: the sizable complexity of these networks and the high number of training samples needed to learn all the parameters comprising these architectures. Although the quantity of training data enhances the accuracy of the trained models in general, the computational cost may restrict the amount of analysis that can be done. This issue is particularly critical in satellite remote sensing, where a myriad of satellites generate an enormous amount of data daily, and acquiring in-situ ground truth for building a large training dataset is a fundamental prerequisite. This dissertation considers various aspects of deep-learning-based sea ice monitoring from SAR data. In this application, labeling data is very costly and time-consuming. In some cases it is not even achievable, due to challenges in establishing the required domain knowledge, specifically when it comes to monitoring Arctic sea ice with Synthetic Aperture Radar (SAR), which is the application domain of this thesis. Because the Arctic is remote, has long dark seasons, and has a very dynamic weather system, the collection of reliable in-situ data is very demanding. In addition to the challenges of interpreting SAR data of sea ice, this issue makes SAR-based sea ice analysis with DL networks a complicated process. 
We propose novel DL methods to cope with the problem of scarce training data and to address the computational cost of the training process. We analyze DL network capabilities based on self-designed architectures and learning strategies, such as transfer learning, for sea ice classification. We also address the scarcity of training data by proposing a novel deep semi-supervised learning method based on SAR data that incorporates unlabeled data into the training process. Finally, a new distributed DL method that can be used in a semi-supervised manner is proposed to address the computational complexity of deep neural network training.
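The general idea of folding unlabeled data into training can be illustrated with a generic self-training (pseudo-labeling) loop. The nearest-centroid model and the confidence threshold below are toy assumptions for illustration; they are not the deep semi-supervised or distributed methods proposed in the dissertation.

```python
import numpy as np

def fit_centroids(X, y):
    # Toy stand-in classifier: one centroid per class.
    return np.array([X[y == c].mean(axis=0) for c in np.unique(y)])

def predict_proba(centroids, X):
    # Softmax over negative distances to the class centroids.
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def self_train(X_lab, y_lab, X_unlab, rounds=3, conf_thresh=0.9):
    # Pseudo-labeling loop: repeatedly label the unlabeled pool with the
    # current model and absorb only high-confidence predictions.
    X, y, pool = X_lab, y_lab, X_unlab
    for _ in range(rounds):
        centroids = fit_centroids(X, y)
        if len(pool) == 0:
            break
        proba = predict_proba(centroids, pool)
        keep = proba.max(axis=1) >= conf_thresh
        if not keep.any():
            break
        X = np.vstack([X, pool[keep]])
        y = np.concatenate([y, proba[keep].argmax(axis=1)])
        pool = pool[~keep]
    return fit_centroids(X, y)
```

In the sea ice setting, `X_unlab` would correspond to the abundant unlabeled SAR patches, and the confidence filter is what keeps unreliable pseudo-labels from polluting the scarce ground truth.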