40 research outputs found

    Towards Robust, Interpretable and Scalable Visual Representations

    Visual representation is one of the central problems in computer vision. The essential problem is to develop a unified representation that effectively encodes both visual appearance and spatial information so that it can be easily applied to various vision applications such as face recognition, image matching, and multimodal image retrieval. Over the history of computer vision research, four major levels of visual representation have emerged: geometric, low-level, mid-level and high-level. The dissertation comprises four works studying effective visual representations at these four levels. Multiple approaches are proposed with the aim of improving the robustness, interpretability, and scalability of visual representations. Geometric features are effective in matching images under spatial transformations; however, their performance is sensitive to noise. In the first part, we model the uncertainty of geometric representations based on line segments so that these features can be robustly applied to image-based geolocation. In the second part, we study the robustness of feature encoding to noisy keypoints. We show that traditional feature encoding is sensitive to background or noisy features. We propose the Selective Encoding framework, which learns the relevance distribution of each codeword and incorporates this information into the original codebook model. Our approach is more robust to localization errors and uncertainty in the active face authentication application. The mission of visual understanding is to express and describe image content, which essentially means relating images to human language. That typically involves finding a common representation inferable from both domains of data.
In the third part, we propose a framework that extracts a mid-level spatial representation directly from language descriptions and matches these spatial layouts to detected object bounding boxes in order to retrieve indoor scene images from user text queries. Modern high-level visual features are typically learned from supervised datasets, whose scalability is largely limited by the need for dedicated human annotation. In the last part, we propose to learn visual representations from large-scale weakly supervised data for a large number of natural language-based concepts, i.e., n-gram phrases. We propose the differentiable Jelinek-Mercer smoothing loss and train a deep convolutional neural network on images with associated user comments. We show that the learned model can predict a large number of phrase-based concepts from images, can be effectively applied to image-captioning applications, and transfers well to other visual recognition datasets.
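
For readers unfamiliar with the smoothing scheme named in the abstract above, the following is a minimal sketch of classic count-based Jelinek-Mercer smoothing for bigrams. It is not the dissertation's differentiable loss, which the abstract does not specify; the function name `jelinek_mercer_bigram` and the interpolation weight `lam` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def jelinek_mercer_bigram(tokens, lam=0.7):
    """Build an interpolated bigram model from a token list.

    Illustrative only: P(w | prev) = lam * P_ML(w | prev) + (1 - lam) * P_ML(w).
    `lam` is an assumed interpolation weight, not a value from the source.
    """
    unigrams = Counter(tokens)                   # counts for the unigram estimate
    contexts = Counter(tokens[:-1])              # how often each token starts a bigram
    bigrams = Counter(zip(tokens, tokens[1:]))   # adjacent-pair counts
    total = len(tokens)

    def prob(word, prev):
        p_uni = unigrams[word] / total
        # Fall back to 0 for the bigram term when the context was never seen.
        p_bi = bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0
        return lam * p_bi + (1.0 - lam) * p_uni

    return prob


if __name__ == "__main__":
    p = jelinek_mercer_bigram("a b a b a c".split(), lam=0.5)
    print(p("b", "a"))  # bigram seen in the data
    print(p("c", "b"))  # unseen bigram still gets unigram mass
```

The interpolation keeps unseen phrases from receiving zero probability, which is the property that makes this kind of smoothing attractive for sparse n-gram targets.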

    Prediction-related neural response alterations in the ventral visual stream

    Theories of predictive coding (PC; Rao & Ballard, 1999) have dominated neurocognitive research in explaining thought and perception processes in various domains. The basic principle is that perception relies not only on bottom-up processing of sensory input but also on top-down predictions. The current thesis describes several neuronal response alterations in cortical visual areas measured with neuroimaging methods. The so-called repetition suppression (RS) effect has been connected to predictive coding because repetitions make stimuli more expected, which results in a smaller prediction error (PE) and therefore attenuated neuronal activity. Still, it is debated whether RS reflects the PE or is a local process within neuronal populations that occurs without top-down influences (Grill-Spector et al., 2006). Another frequently investigated effect is the reduced neuronal response to expected or predicted visual input, called expectation suppression (ES). A considerable body of research on contextual response changes, such as RS and ES, relates to the visual system and the face-processing network in particular. Overall, we demonstrate the importance of stimulus predictability for studies using RS to uncover expectancy-related effects. Furthermore, we suggest that the influence of sensory precision on measures of RS and ES needs more attention in future research. Concerning the stimulus material in the presented studies (unfamiliar, visually familiar, and famous familiar faces), we also emphasize the importance of thoroughly considering the characteristics of faces in terms of prior belief, sensory input precision, and predictability when using them to test prediction-related effects.

    Tracking nucleotide-binding-site-leucine-rich-repeat resistance gene analogues in the wheat genome complex

    Investigations into plant-pathogen interactions have provided us with several models underlying the genetic basis of host resistance in plants. In the past decade, tens of resistance genes have been isolated from numerous crop and model plant species and these form a few distinct classes when classified by domain structure, the majority being nucleotide-binding-site-leucine-rich-repeat (NBS-LRR) genes. The NBS-LRR family consists of two sub-families based on the N-terminal domain: the coiled-coil (CC) NBS-LRRs and the Toll Interleukin Receptor homology domain (TIR) NBS-LRRs. The potential of these genes for current and future agricultural breeding programs has driven a large number of studies exploring the members of these gene families in the genomes of a variety of crop species. In the present study I focused on the NBS-LRR family in the allohexaploid wheat genome and obtained a comprehensive set of Triticeae NBS-LRR homologues using a combination of data-mining approaches. As a starting point I detected conserved motifs in the dataset, finding all six previously characterized in the core-NBS domain of other plant NBS-LRRs. Phylogenetic analysis was performed to study relationships between the Triticeae NBS-LRR family and the 25 CC-NBS-LRR (CNL) R genes identified to date. I found the Triticeae CNL family to be highly divergent, containing ancient clade lineages, as seen in all angiosperm 120 taxa previously studied, and found a number of “ancient” dicot R genes grouped with Triticeae clades. The evolution of recent NBS-LRR gene duplications in the Triticeae was studied through two modes of duplication: first, individual gene duplications yielding paralogous loci, and second, gene duplication by allopolyploidy. Current models of NBS-LRR family evolution predict that functional divergence occurs after gene duplication.
An alternative is that divergence takes place at the allele level, followed by a locus duplication that fixes heterozygosity in a single haplotype by unequal recombination. I investigated this hypothesis by studying the evolution of gene duplicates in two different contexts: paralogous duplications in the diploid barley genome and homeologous duplications in the allohexaploid genome of wheat. Nonsynonymous to synonymous substitution rate ratios were estimated for paralogous gene duplications in three recently diverged NBS-LRR clades. All pairwise comparisons yielded Ka:Ks ratios strongly indicative of purifying selection. Given that R gene mediated resistance is inherited qualitatively rather than quantitatively, I interpret this as evidence that even closely related paralogous copies (90-95% identity) should have independent recognition specificities maintained by purifying selection. Homeologous duplications were studied in allohexaploid wheat (AABBDD) using a section of the go35 NBS-LRR gene (2L) of the B and D diploid donor species of wheat. Numerous synonymous substitutions distinguished the B and D genome copies, with an absence of nonsynonymous substitutions. In contrast, single unique nonsynonymous substitutions were found in four out of five polyploid wheat go35 alleles, indicating that selection pressure was indeed relaxed across the homeolocus. Recent studies on polyploid genomes have shown that duplicated resistance genes are far more likely to be eliminated than highly transcribed genes such as tRNAs and rRNAs. These results are in agreement with the view that functional divergence takes place before duplication for NBS-LRR genes, as the loci duplicated by polyploidy appear not to evolve under purifying selection, unlike the paralogous loci investigated. Dissertation (MSc)--University of Pretoria, 2008. Genetics.
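
The Ka:Ks reasoning in the abstract above can be illustrated with a small sketch. It assumes the conventional reading of the ratio (well below 1 suggests purifying selection, near 1 suggests neutral evolution, well above 1 suggests positive selection). The function names, the tolerance `tol`, and the example rates are hypothetical and are not taken from the dissertation, which would have used a proper substitution-rate estimator and significance testing.

```python
def ka_ks_ratio(ka, ks):
    """Return omega = Ka / Ks for one pairwise comparison.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks


def classify_selection(ka, ks, tol=0.1):
    """Label a comparison by its Ka:Ks ratio.

    `tol` is an assumed band around 1.0 treated as neutral; a real analysis
    would rely on a statistical test rather than a fixed cut-off.
    """
    omega = ka_ks_ratio(ka, ks)
    if omega < 1.0 - tol:
        return "purifying"
    if omega > 1.0 + tol:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Made-up rates for illustration only.
    print(classify_selection(0.02, 0.20))
    print(classify_selection(0.30, 0.10))
```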

    The neurobiology of cortical music representations

    Music is undeniably one of humanity’s defining traits, as it has been documented since the earliest days of mankind, is present in all known cultures, and is perceivable by nearly all humans alike. Intrigued by its omnipresence, researchers of all disciplines began investigating music’s mystical relationship with and tremendous significance to humankind several hundred years ago. More recently, the immense advancement of neuroscientific methods has also enabled the examination of cognitive processes related to the processing of music. Within this neuroscience of music, the vast majority of research has focused on how music, as an auditory stimulus, reaches the brain and how it is initially processed, as well as on the tremendous effects it has on and can evoke through the human brain. However, intermediate steps, that is, how the human brain transforms incoming signals into a seemingly specialized and abstract representation of music, have received less attention. Aiming to address this gap, the thesis presented here targeted these transformations, their possible underlying processes, and how both could potentially be explained through computational models. To this end, four projects were conducted. The first two comprised the creation and implementation of two open source toolboxes to, first, tackle problems inherent to auditory neuroscience, which also affect neuroscientific music research, and second, provide the basis for further advancements through standardization and automation. More precisely, these problems were the deteriorated hearing thresholds and abilities in MRI settings and the difficult localization and parcellation of the human auditory cortex as the core structure involved in auditory processing.
The third project focused on the human brain’s apparent tuning to music by investigating functional and organizational principles of the auditory cortex and network with regard to the processing of different auditory categories of comparable social importance, more precisely whether the perception of music evokes a distinct and specialized pattern. In order to provide an in-depth characterization of the respective patterns, both the segregation and integration of auditory cortex regions were examined. In the fourth and final project, a highly multimodal approach that included fMRI, EEG, behavior and models of varying complexity was used to evaluate how the aforementioned music representations are generated along the cortical hierarchy of auditory processing and how they are influenced by bottom-up and top-down processes. The results of projects 1 and 2 demonstrated the need for further advancement of MRI settings and for the definition of working models of the auditory cortex, as hearing thresholds and abilities seem to vary as a function of the data acquisition protocol used, and the localization and parcellation of the human auditory cortex diverge drastically depending on the approach used. Project 3 revealed that the human brain is apparently indeed tuned for music by means of a specialized representation, as music evoked a bilateral network with a right-hemispheric weighting that was not observed for the other included categories. The result of this specialized and hierarchical recruitment of anterior and posterior auditory cortex regions was an abstract music component that is situated in anterior regions of the superior temporal gyrus and preferentially encodes music, regardless of whether it is sung or instrumental.
The outcomes of project 4 indicated that even though the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex processing of music, anterior regions in particular yielded an abstract representation that varied excessively over time and could not be sufficiently explained by any of the tested models. The specialized and abstract properties of this representation were furthermore underlined by the predictive ability of the tested models, as models based either on high-level features, such as behavioral representations and concepts, or on complex acoustic features always outperformed models based on single or simpler acoustic features. Additionally, factors known to influence auditory and thus music processing, such as musical training, apparently did not alter the observed representations. Together, the results of the projects suggest that the specialized and stable cortical representation of music is the outcome of sophisticated transformations of incoming sound signals along the cortical hierarchy of auditory processing that generate a music component in anterior regions of the superior temporal gyrus by means of top-down processes that interact with acoustic features, guiding their processing.
Zusammenfassung: Music is undeniably one of the defining characteristics of humankind. Documented since the earliest days of humanity and present in all known cultures, it is perceivable by all humans in nearly the same way. Fascinated by its omnipresence, scientists of all disciplines began several hundred years ago to investigate the mystical relationship between music and humans, as well as its enormous significance for them. Only comparatively recently has the immense progress of neuroscientific methods also made it possible to investigate the cognitive processes involved in the processing of music.
Within this neuroscience of music, a large part of the research has concentrated on how music, as an auditory stimulus, reaches the human brain and how it is initially processed, as well as on the colossal effects it has on the brain and can bring about through it. However, the intermediate steps, that is, how the human brain transforms incoming signals into a seemingly specialized and abstract representation of music, have received comparatively little attention. To address this gap, the present dissertation investigated these processes, and how they can be explained by models, in four projects. The first two projects comprised the creation and implementation of two toolboxes in order, first, to alleviate problems inherent to auditory neuroscience, and hence to neuroscientific investigations of music, and second, to provide a basis for further progress through standardization and automation. More precisely, this concerned the strongly impaired hearing thresholds and abilities in MRI examinations and the difficult localization and parcellation of the human auditory cortex as the core structure of auditory processing. The third project dealt with the apparent specialization for music in the human brain by investigating functional and organizational principles of the auditory cortex and network with regard to the processing of different auditory categories of comparable social importance, more precisely whether the perception of music evokes a distinct and specialized neural pattern. To enable a detailed characterization of the respective neural patterns, the segregation and integration of the regions of the auditory cortex were examined.
In the fourth and final project, a highly multimodal approach comprising fMRI, EEG, behavior and models of varying complexity was used to evaluate how the aforementioned representations of music are generated along the cortical hierarchy of auditory processing and how they may be influenced by bottom-up and top-down approaches. The results of projects 1 and 2 demonstrated the need for further improvements of MRI examinations and for the definition of a working model of the auditory cortex, as hearing thresholds and abilities varied strongly depending on the data acquisition protocol used, and the localization and parcellation of the human auditory cortex diverged drastically depending on the underlying approach. Project 3 showed that the human brain does indeed contain a specialized representation of music, as music was the only auditory category to evoke a bilateral network with a right-hemispheric weighting. This network, which involved the recruitment of anterior and posterior parts of the auditory cortex, gave rise to a seemingly abstract representation of music in anterior regions of the superior temporal gyrus that preferentially encodes music, regardless of whether it is sung or instrumental. The results of project 4 indicate that the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex processing of music, but that anterior regions in particular give rise to the aforementioned abstract representation, which changes excessively over the duration of perception and cannot be sufficiently explained by any of the tested models.
The specialized and abstract properties of these representations were further underlined by the predictive abilities of the tested models, as models based either on higher-level properties, such as behavioral representations and mental concepts, or on complex acoustic properties always outperformed models based on lower-level attributes such as simple acoustic properties. In addition, no effect could be demonstrated for factors, such as musical training, that are known to influence auditory and hence music processing. In summary, the results of the projects indicate that the specialized and stable cortical representation of music is the result of complex processes that transform incoming signals along the cortical hierarchy of auditory processing into an abstract representation of music within anterior regions of the superior temporal gyrus by means of top-down processes, which interact with acoustic properties and guide their processing.

    Investigating the mechanisms underlying fixation durations during the first year of life: a computational account

    Infants’ eye movements provide a window onto the development of cognitive functions over the first years of life. Despite considerable advances in the past decade, studying the mechanisms underlying infant fixation duration and saccadic control remains a challenge due to practical and technical constraints in infant testing. This thesis addresses these issues and investigates infant oculomotor control by presenting novel software and methods for dealing with low-quality infant data (GraFIX), a series of behavioural studies involving novel gaze-contingent and scene-viewing paradigms, and computational modelling of fixation timing throughout development. In a cross-sectional study and two longitudinal studies, participants were eye-tracked while viewing dynamic and static complex scenes, and performed gap-overlap and double-step paradigms. Fixation data from these studies were modelled in a number of simulation studies using the CRISP model of adult fixation durations in scene viewing. Empirical results showed that fixation durations decreased with age for all viewing conditions but at different rates. Individual differences between long- and short-lookers were found across visits and viewing conditions, with static images being the most stable viewing condition. Modelling results confirmed the CRISP theoretical framework’s applicability to infant data and highlighted the influence of both cognitive processing and the developmental state of the visuo-motor system on fixation durations during the first few months of life. More specifically, while the present work suggests that infant fixation durations reflect on-line perceptual and cognitive activity similarly to adults, the individual developmental state of the visuo-motor system still affects this relationship until 10 months of age.
Furthermore, results suggested that infants are already able to program saccades in two stages at 3.5 months: (1) an initial labile stage subject to cancellation and (2) a subsequent non-labile stage that cannot be cancelled. The length of the non-labile stage decreased relative to the labile stage, especially from 3.5 to 5 months, indicating a greater ability to cancel saccade programs as infants grew older. In summary, the present work provides unprecedented insights into the development of fixation durations and saccadic control during the first year of life and demonstrates the benefits of combining behavioural and computational approaches to investigate methodologically challenging research topics such as oculomotor control in infancy.

    Factors predictive of emotional and behavioural difficulties in children with refractory focal epilepsy

    Focal epilepsy in childhood is associated with increased risk for developing behavioral, emotional, cognitive and social–adaptive impairments. The present thesis focused on mental health difficulties in paediatric refractory focal epilepsy. It undertook a detailed evaluation of the predictive power of several demographic (gender, age at assessment), clinical (age at onset and duration of epilepsy, seizure frequency), localization (lobe and lateralization of pathology) and cognitive variables (performance in intellectual, memory and academic attainment measures) for mood, conduct, inattention/hyperactivity and peer relationship difficulties, as assessed by parental report. Data from a population of 282 children and adolescents, previously collected for clinical purposes, were examined using a series of univariate and multivariate analyses. Mental health difficulties were found to be highly prevalent, with peer relationships the most frequently reported area of difficulty, followed by inattention/hyperactivity and emotional difficulties. Different patterns of associations between the variables examined here and individual emotional/behavioural difficulties were revealed, partially confirming and extending previous findings in the literature. Longer duration of epilepsy was found to increase the risk for developing emotional difficulties; male gender and earlier age at onset increased the risk for conduct difficulties; male gender, earlier age at onset, longer duration and frontal lobe localization increased the risk for attention/hyperactivity difficulties; and finally, longer duration, higher seizure frequency and right hemisphere lateralization increased the risk for peer difficulties. Lower cognitive functioning was found to be associated with overall increased mental health difficulties, and a lower verbal IQ (VIQ) was predictive of all types of difficulties.
Developing a firm understanding of the risk factors that contribute to mental health comorbidities in focal paediatric epilepsy can help identify children who are at higher risk earlier and provide them with assessment and intervention, thus significantly improving quality of life.

    Engineering planetary lasers for interstellar communication

    Transmitting large amounts of data efficiently among neighboring stars will vitally support any eventual contact with extrasolar intelligence, whether alien or human. Laser carriers are particularly suitable for high-quality, targeted links. Space laser transmitter systems designed in this work, based on both demonstrated and imminent advanced space technology, could achieve reliable data transfer rates as high as 1 kb/s to matched receivers as far away as 25 pc, a distance including over 700 approximately solar-type stars. The centerpiece of this demonstration study is a fleet of automated spacecraft incorporating adaptive neural-net optical processing active structures, nuclear electric power plants, annular momentum control devices, and ion propulsion. Together the craft sustain, condition, modulate, and direct to stellar targets an infrared laser beam extracted from the natural mesospheric, solar-pumped, stimulated CO2 emission recently discovered at Venus. For a culture already supported by mature interplanetary industry, the cost of building planetary or high-power space laser systems for interstellar communication would be marginal, making such projects relevant for the next human century. Links using high-power lasers might support data transfer rates as high as optical frequencies could ever allow. A nanotechnological society such as we might become would inevitably use 10^20 b/yr transmission to promote its own evolutionary expansion out of the galaxy.
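
To put the two data rates quoted in the abstract above on a common scale, the short sketch below converts between bits per year and bits per second. It assumes a Julian year of 365.25 days and reads "1 kb/s" as 1000 bits per second; both are assumptions, and the function names are hypothetical.

```python
# Assumption: one year is a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def bits_per_year_to_bits_per_second(bits_per_year):
    """Convert a yearly data volume into an average rate in bits per second."""
    return bits_per_year / SECONDS_PER_YEAR


def bits_per_second_to_bits_per_year(bits_per_second):
    """Convert a sustained rate in bits per second into bits sent per year."""
    return bits_per_second * SECONDS_PER_YEAR


if __name__ == "__main__":
    # 10^20 b/yr expressed as an average rate in b/s.
    print(f"{bits_per_year_to_bits_per_second(1e20):.3e} b/s")
    # 1 kb/s (assumed 1000 b/s) sustained for one year, in bits.
    print(f"{bits_per_second_to_bits_per_year(1e3):.3e} b/yr")
```

Under these assumptions, 10^20 b/yr is an average of roughly 3.2 x 10^12 b/s, about nine orders of magnitude above the 1 kb/s link.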

    Neural dynamics in cortical populations

    Many essential neural computations are implemented by large populations of neurons working in concert. Recent studies have sought both to monitor increasingly large groups of neurons and to characterise their collective behaviour, but the standard computational approaches available to identify the collective dynamics scale poorly with the size of the dataset. We develop new efficient methods for discovering the low-dimensional dynamics that underlie simultaneously recorded spike trains from a neural population. We use the new models to analyze two different sets of population recordings, one from motor cortex and another from auditory cortex. In motor cortex, we describe the nature of the trial-by-trial spontaneous fluctuations identified by the model and connect these fluctuations to behavioral events. The spatio-temporal structure of the spontaneous events was tracked by three trajectories identified by the model. These trajectories followed similar dynamics during hand reaches as they did when the hands were stationary. The structure of the models we developed allows them to be used as decoders of hand position from neural activity, significantly improving upon previous state-of-the-art methods. The decoders were able to predict information about the direction, onset time and speed profile of movements. In auditory cortex, we use the statistical models to identify population dynamics under different brain states. We report major differences in dynamics and stimulus coding between synchronized and desynchronized brain states. Synchronized but not desynchronized brain states imposed constraints on neural dynamics such that a four-dimensional system accounted for most of the dynamical structure of population events. We used the low-dimensional representation of the data to construct network simulations that reproduced the patterns present in the recordings.
The simulations suggest that the overall level of feedback inhibition controls the stability of each local cortical network, with unstable dynamics resulting in synchronized brain states. Finally, we propose a functional role for dynamics in the representation of visual motion in visual cortex.

    Ambiguous Recognition: Recursion, Cognitive Blending, and the Problem of Interpretation in Twenty-First-Century Fiction

    This dissertation uses theories of cognitive conceptual integration (as outlined by Gilles Fauconnier and Mark Turner) to propose a model of narrative reading that mediates between narratology and theories of reception. I use this model to demonstrate how new experimental narratives achieve a potent balance between a determinate and open story-form. Where the high postmodernists of the 1970s and 80s created ironic, undecidable story-worlds, the novels considered here allow readers to embrace seemingly opposite propositions without retreating into ironic suspension, trading the postmodernist “neither/nor” for a new “both/and.” This technique demands significant revision of both descriptions of radical experimentation in twenty-first-century novels, and of earlier narratological accounts of the distinction between story and discourse. Each novel considered in this dissertation encourages its reader to recognize combined concepts in the course of the reading process. Shelley Jackson’s Half Life combines singular and plural identity, reimagining individualist subjectivity and the literary treatment of (dis)ability. Mark Z. Danielewski’s Only Revolutions combines objective and subjective temporality, offering a new perspective on American myth-making in the popular post-Kerouac road-novel tradition. Percival Everett’s Erasure combines reliable and unreliable narration to create a complex critique of the idea of an African American novel tradition. M.D. Coverley’s hypertext novel Califia involves the reader in all three of these discursive dimensions at once, updating the marginalized art of hypertext fiction by inviting the reader to see his or her role in navigating the text as both creative and determined—the epitome of open-and-closed form. My analysis demonstrates how cognitive blending is a precise method for describing how a reader interprets complex narrative structures. 
I propose this blending model as a new approach to contemporary experimental fiction from the perspective of the reader’s cognitive work, and show how it offers new readings of important contemporary fiction. I argue that twenty-first-century authors attempt simultaneously to construct “open” forms and to address real socio-cultural concerns in the world; I also argue that a narratology founded on theories of cognitive processes is best equipped to describe the operations of reading and understanding these complex narrative forms.