135 research outputs found

    Mechanisms of memory consolidation: Analyzing the coordinated activity of concept neurons in the human medial temporal lobe during waking and sleep

    The aim of this thesis is to investigate the role of human concept neurons in memory consolidation during sleep. Memory consolidation is the process by which memories initially dependent on the hippocampus are transferred to cortical areas, thereby gradually becoming independent of the hippocampus. Theories of memory consolidation posit that memory traces encoding autobiographical episodes are rapidly formed in the hippocampus during waking and reactivated during subsequent slow-wave sleep to be transformed into a long-lasting form. Concept neurons in the human medial temporal lobe are neurons tuned to semantic concepts in a selective, sparse, and invariant manner. These neurons respond to pictures or to written and spoken words representing their preferred concept (for example, a person, an animal, or an object), regardless of physical stimulus properties. Concept neurons have been speculated to be building blocks of episodic memory. We used whole-night recordings from concept neurons in the medial temporal lobe of epilepsy patients implanted with depth electrodes for presurgical monitoring to test the hypothesis that the coordinated activity of concept neurons during sleep is a neurophysiological correlate of memory consolidation in humans. To conduct this study, we developed software for artifact removal and spike sorting in long-term recordings from single neurons. In an evaluation on both simulated model data and visual stimulus presentation experiments, our software outperformed previous methods. Starting from the conceptual analogy between rodent place cells and human concept neurons, we developed an episodic memory task in which participants learned a story that elicited sequential activity in concept neurons. We found that concept neurons preserved their semantic tuning across whole-night recordings. Hippocampal concept neurons had, on average, lower firing rates during rapid-eye-movement (REM) sleep than during waking; during slow-wave sleep, firing rates did not differ significantly from waking. The activity of concept neurons increased during ripples in the local field potential. Furthermore, concept neurons whose preferred stimuli appeared in the memorized story were conjointly reactivated after learning, an effect most pronounced during slow-wave sleep. Cross-correlations of concept neurons were most asymmetric during slow-wave sleep, and cross-correlation peak times were often in the range believed to be relevant for spike-timing-dependent plasticity. However, the time lags of peak cross-correlations did not correlate with the positional order of stimuli in the memorized story. Our findings support the hypothesis that concept neurons rapidly encode a memory trace during learning, and that the reactivation of the same neurons during subsequent slow-wave sleep and ripples contributes to the consolidation of the memory episode. The consolidation of the temporal order of events in humans, however, appears to differ from what rodent research suggests.
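
    As an illustration of the cross-correlation analysis described above, the following minimal sketch (not the authors' code; the spike times, bin width, and ±100 ms window are assumptions for illustration) computes a cross-correlogram between two spike trains and reports the lag of its peak, the quantity compared against the spike-timing-dependent plasticity window:

```python
import numpy as np

def cross_correlogram(spikes_a, spikes_b, window=0.1, bin_width=0.005):
    """Cross-correlogram of two spike trains (spike times in seconds).

    Collects, for every spike of neuron A, the relative times of
    neuron B's spikes within +/- `window` seconds, then histograms them.
    """
    lags = []
    for t in spikes_a:
        nearby = spikes_b[(spikes_b > t - window) & (spikes_b < t + window)]
        lags.extend(nearby - t)
    bins = np.arange(-window, window + bin_width, bin_width)
    counts, edges = np.histogram(lags, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    return counts, centers

# Toy example: neuron B tends to fire ~20 ms after neuron A, a lag within
# the range usually cited for spike-timing-dependent plasticity.
rng = np.random.default_rng(0)
a = np.sort(rng.uniform(0, 60, 200))                  # 200 spikes in 60 s
b = np.sort(a + 0.02 + rng.normal(0, 0.005, a.size))  # B lags A by ~20 ms
counts, centers = cross_correlogram(a, b)
print(f"peak lag: {centers[np.argmax(counts)] * 1000:.1f} ms")
```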

    A Multiscale Region-Based Motion Detection and Background Subtraction Algorithm

    This paper presents a region-based method for background subtraction. It relies on color histograms, texture information, and the successive subdivision of candidate rectangular image regions to model the background and detect motion. Our algorithm combines this principle with Gaussian mixture background modeling to produce a new method that outperforms the classic Gaussian mixture background subtraction method. The method has the advantages of filtering noise during image differencing and providing a selectable level of detail for the contours of the moving shapes. The algorithm is tested on various video sequences and is shown to outperform state-of-the-art background subtraction methods.
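
    A minimal sketch of the core region-based idea, assuming grayscale frames and illustrative thresholds (the paper's color histograms, texture features, and Gaussian mixture combination are omitted): regions whose histograms match the background are accepted wholesale, and only differing regions are subdivided, which suppresses pixel noise and yields a selectable contour detail level.

```python
import numpy as np

def region_motion(bg, frame, x, y, w, h, thresh=0.25, min_size=8):
    """Recursively flag rectangles whose intensity histogram differs
    from the background model.

    Returns a list of (x, y, w, h) regions judged to contain motion.
    Comparing whole regions first filters pixel noise; `min_size` sets
    the level of detail of the recovered motion contours.
    """
    hist_bg = np.histogram(bg[y:y+h, x:x+w], bins=16, range=(0, 256))[0] / (w * h)
    hist_fr = np.histogram(frame[y:y+h, x:x+w], bins=16, range=(0, 256))[0] / (w * h)
    if np.abs(hist_bg - hist_fr).sum() / 2 < thresh:  # total-variation distance
        return []                                     # region matches the background
    if w <= min_size or h <= min_size:
        return [(x, y, w, h)]                         # finest scale: report motion here
    hw, hh = w // 2, h // 2
    out = []
    for dx, dy in [(0, 0), (hw, 0), (0, hh), (hw, hh)]:
        out += region_motion(bg, frame, x + dx, y + dy,
                             w - hw if dx else hw, h - hh if dy else hh,
                             thresh, min_size)
    return out
```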

    IIR modeling of interpositional transfer functions with a genetic algorithm aided by an adaptive filter for the purpose of altering free-field sound localization

    Sound localization is a complex psychoacoustic process. Scientists have found evidence that both binaural and monaural cues are responsible for determining the angles of elevation and azimuth of a sound source, and engineers have successfully used these cues to build mathematical localization systems. Research has indicated that spectral cues play an important role in 3-D localization. It therefore seems conceivable to design a filtering system that can alter the localization of a sound source, either for correctional purposes or for listener preference. Such filters, known as interpositional transfer functions (IPTFs), can be formed by dividing head-related transfer functions (HRTFs) in the z-domain. HRTFs represent the free-field response of the human body to sound as processed by the ears. In filtering applications, IIR filters are often favored over FIR filters because they preserve resolution while minimizing the number of required coefficients. Several methods exist for creating IIR filters from their representative FIR counterparts; for complicated filters, genetic algorithms (GAs) have proven effective. The research summarized in this thesis combines past efforts in the fields of sound localization, genetic algorithms, and adaptive filtering. It represents the initial stage in the development of a practical system for future hardware implementation that uses a genetic algorithm as its driving engine. Under ideal conditions, an IIR filter design system has been demonstrated to successfully model several IPTF pairs that alter sound localization when applied to non-minimum-phase HRTFs obtained from free-field measurement.
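
    The defining relation, and a minimal numerical sketch of it (the synthetic impulse responses and the regularization constant are assumptions, not measured HRTFs): an IPTF is the z-domain ratio IPTF(z) = H_target(z) / H_current(z), which can be evaluated on the unit circle by FFT division before a low-order IIR approximation is fitted, e.g. with a genetic algorithm as in the thesis.

```python
import numpy as np

def iptf_freq_response(h_target, h_current, n_fft=512, eps=1e-6):
    """Frequency response of an IPTF from two HRTF impulse responses.

    Implements IPTF = H_target / H_current on the unit circle; `eps`
    regularizes the division near spectral zeros. A GA (as in the thesis)
    would then fit a low-order IIR filter to this target response.
    """
    H_t = np.fft.rfft(h_target, n_fft)
    H_c = np.fft.rfft(h_current, n_fft)
    return H_t * np.conj(H_c) / (np.abs(H_c) ** 2 + eps)

# Toy HRTFs: random FIR responses stand in for measured data.
rng = np.random.default_rng(1)
h_cur, h_tgt = rng.normal(size=128), rng.normal(size=128)
iptf = iptf_freq_response(h_tgt, h_cur)
print(iptf.shape)  # (257,) = n_fft // 2 + 1 frequency bins
```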

    Video foreground extraction for mobile camera platforms

    Foreground object detection is a fundamental task in computer vision with many applications in areas such as object tracking, event identification, and behaviour analysis. Most conventional foreground object detection methods work only in stable illumination environments using fixed cameras. In real-world applications, however, the algorithm often needs to operate under challenging conditions: drastic lighting changes, complex object shapes, moving cameras, low frame capture rates, and low-resolution images. This thesis presents four novel approaches for foreground object detection on real-world datasets using cameras deployed on moving vehicles. The first problem addresses passenger detection and tracking for public transport buses, investigating the difficulties of changing illumination conditions and low frame capture rates. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modelling method with a human shape model into a weighted Bayesian framework to detect passengers. To deal with the problem of tracking multiple targets, we employ the Reversible Jump Markov Chain Monte Carlo tracking algorithm. Using an SVM classifier, appearance transformation models capture changes in the appearance of foreground objects across two consecutive frames under low frame rate conditions. In the second problem, we present a system for pedestrian detection in scenes captured by a mobile bus surveillance system. It integrates scene localization, foreground-background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data: in the first stage, SIFT homography is applied to cluster frames in terms of their structural similarity, and the second stage further clusters these aligned frames according to consistency in illumination. This produces clusters of images that are consistent in viewpoint and lighting. A kernel density estimation (KDE) technique for colour and gradient is then used to construct background models for each image cluster, which are in turn used to detect candidate foreground pixels. Finally, pedestrians are detected using a hierarchical template matching approach. In addition, we present three direct pedestrian detection methods that extend the HOG (Histogram of Oriented Gradients) technique (Dalal and Triggs, 2005) and provide a comparative evaluation of these approaches. The three approaches are: a) a new histogram feature, formed by the weighted sum of both the gradient magnitude and the filter responses from a set of elongated Gaussian filters (Leung and Malik, 2001) corresponding to the quantised orientation, which we refer to as the Histogram of Oriented Gradient Banks (HOGB) approach; b) the codebook-based HOG feature with a branch-and-bound (efficient subwindow search) algorithm (Lampert et al., 2008); and c) the codebook-based HOGB approach. In the third problem, a unified framework that combines 3D and 2D background modelling is proposed to detect scene changes using a camera mounted on a moving vehicle. The 3D scene is first reconstructed from a set of videos taken at different times, and the 3D background modelling identifies inconsistent scene structures as foreground objects. For the 2D approach, foreground objects are detected using a spatio-temporal MRF algorithm. Finally, the 3D and 2D results are combined using morphological operations. The significance of this research is that it provides basic frameworks for automatic large-scale mobile surveillance applications and facilitates many higher-level applications such as object tracking and behaviour analysis.
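
    For context, a minimal sketch of the baseline that the HOGB variants extend: the standard Dalal and Triggs HOG pedestrian detector, here via OpenCV's stock pretrained person SVM rather than the thesis's own detectors (the input path is hypothetical).

```python
import cv2

# Standard Dalal-Triggs HOG pedestrian detector with OpenCV's
# pretrained person SVM; the baseline extended by the HOGB variants.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame = cv2.imread("frame.jpg")  # hypothetical input frame
boxes, weights = hog.detectMultiScale(
    frame,
    winStride=(8, 8),  # sliding-window step in pixels
    padding=(8, 8),
    scale=1.05,        # image pyramid scale factor
)
for (x, y, w, h), score in zip(boxes, weights):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", frame)
```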

    An algorithm for multiple object tracking

    Background for multiple object tracking -- Data association -- The object model
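
    Data association, named in the contents above, is typically the step that matches new detections to existing tracks. A minimal sketch under assumed 2-D point positions, using the Hungarian algorithm (not necessarily the method developed in this thesis):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(track_positions, detections, max_dist=30.0):
    """Match predicted track positions to detections by minimum total distance.

    Returns a list of (track_index, detection_index) pairs; pairs farther
    apart than `max_dist` are rejected as missed or newly appearing objects.
    """
    cost = np.linalg.norm(track_positions[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

tracks = np.array([[10.0, 12.0], [40.0, 41.0]])           # predicted positions
dets = np.array([[41.0, 40.0], [11.0, 11.0], [90.0, 5.0]])  # new detections
print(associate(tracks, dets))                             # [(0, 1), (1, 0)]
```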

    Feature based dynamic intra-video indexing

    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy. With the advent of digital imagery and its widespread application in all walks of life, it has become an important component of the world of communication. Video content, ranging from broadcast news, sports, personal videos, surveillance, movies, and entertainment, is increasing exponentially in quantity, and retrieving content of interest from the corpora is becoming a challenge. This has led to increased interest among researchers in video structure analysis, feature extraction, content annotation, tagging, video indexing, querying, and retrieval. However, most previous work is confined to specific domains and constrained by quality, processing, and storage capabilities. This thesis presents a novel framework agglomerating established approaches, from feature extraction to browsing, in one content-based video retrieval system. The proposed framework fills the identified gap while satisfying the imposed constraints on processing, storage, quality, and retrieval times. The output comprises a framework, a methodology, and a prototype application that allow the user to efficiently and effectively retrieve content of interest, such as age, gender, and activity, by specifying the relevant query. Experiments have shown plausible results, with an average precision and recall of 0.91 and 0.92, respectively, for face detection using a Haar-wavelet-based approach. Precision for age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. Gender recognition gives better precision for males (0.89) than for females, while recall is higher for females (0.92). The activity of the subject is detected using the Hough transform and classified using a Hidden Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research. A graphical user interface (GUI) providing a friendly and intuitive front end has been integrated into the system to facilitate retrieval. Intraclass correlation coefficient (ICC) comparisons show that the performance of the system closely resembles that of a human annotator. The performance has been optimised for time and error rate.
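
    For illustration, the standard Viola and Jones Haar-cascade face detector shipped with OpenCV, a common instance of the Haar-wavelet-based detection described above (not the thesis's implementation; the input path is hypothetical):

```python
import cv2

# Viola-Jones Haar-cascade face detector bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

img = cv2.imread("frame.jpg")  # hypothetical input frame
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"{len(faces)} face(s) found")
for x, y, w, h in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
```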

    Automatic object classification for surveillance videos.

    PhD. The recent popularity of surveillance video systems, especially in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems is automatic object classification, which remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on inherent visual features. However, psychological studies have demonstrated that human beings can routinely categorise objects according to their behaviour. The gap between the features a computer can automatically extract, such as appearance-based features, and the concepts human beings perceive effortlessly but machines cannot, such as behaviour, is commonly known as the semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring machine and human understanding together for object classification. A Surveillance Media Management framework is proposed to automatically detect and classify objects by analysing both the physical properties inherent in their appearance (machine understanding) and the behaviour patterns that require a higher level of understanding (human understanding). Finally, a probabilistic multimodal fusion algorithm bridges the gap, performing an automatic classification that considers both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments demonstrated that the combination of machine and human understanding substantially enhances object classification performance, and that the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems.
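
    A minimal sketch of one common form of probabilistic multimodal fusion, a naive-Bayes-style product of per-modality posteriors; the example class posteriors and the conditional-independence assumption are illustrative, not the thesis's algorithm:

```python
import numpy as np

def fuse_posteriors(p_appearance, p_behaviour, prior=None):
    """Naive-Bayes-style fusion of two per-class posterior distributions.

    Assuming the modalities are conditionally independent given the class,
    P(c | a, b) is proportional to P(c | a) * P(c | b) / P(c).
    """
    p_a, p_b = np.asarray(p_appearance), np.asarray(p_behaviour)
    prior = np.full_like(p_a, 1.0 / p_a.size) if prior is None else np.asarray(prior)
    fused = p_a * p_b / prior
    return fused / fused.sum()

# Classes: pedestrian, car, bicycle. Appearance is ambiguous between car
# and bicycle; the behaviour cue (e.g. speed pattern) resolves it.
print(fuse_posteriors([0.2, 0.45, 0.35], [0.1, 0.8, 0.1]))
```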

    Image segmentation and object tracking in videos: approaches based on estimation, feature selection, and active contours

    This thesis addresses two of the most important and most complex problems in computer vision: image segmentation and object tracking in videos. We propose several approaches to these two problems based on variational (active contour) and statistical modelling. These approaches aim to overcome various theoretical and practical (algorithmic) limitations of the two problems. First, we address the problem of automating level-set active contour segmentation and its generalization to the multi-region case. To this end, we propose a model that estimates region information automatically and adaptively to the image content. This model uses no a priori information about the regions and handles both colour and texture images with an arbitrary number of regions. We then introduce a statistical approach for estimating feature relevance and semantics and integrating them into the segmentation of objects of interest. Second, we address the problem of object tracking in videos using active contours, for which we propose two different models. The first assumes that the photometric properties of the tracked objects are invariant over time, but it can track objects in the presence of noise and against non-static, cluttered video backgrounds; this is achieved by integrating region, boundary, and shape information about the tracked objects. The second model handles photometric variations of the tracked objects by using a statistical model that adapts to their appearance. Finally, we propose a new statistical model, based on the generalized Gaussian distribution, for the efficient representation of noisy, high-dimensional data in segmentation. This model is used to ensure robust segmentation of colour images containing noise, as well as of moving objects in videos (acquired by static cameras) containing shadows and/or sudden illumination changes.
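
    For illustration, a region-based active contour in the level-set family discussed above: scikit-image's morphological Chan-Vese variant, standing in for, not reproducing, the thesis's models.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.segmentation import morphological_chan_vese

# Region-based active contour (morphological Chan-Vese): evolves a level
# set so that the image is well approximated by one mean value inside the
# contour and another outside, without edges or a priori region information.
image = img_as_float(data.camera())
level_set = morphological_chan_vese(
    image,
    200,                            # number of evolution iterations
    init_level_set="checkerboard",  # automatic initialization
    smoothing=3,                    # contour regularization strength
)
print(level_set.shape, np.unique(level_set))  # binary inside/outside mask
```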