40 research outputs found
Towards Robust, Interpretable and Scalable Visual Representations
Visual representation is one of the central problems in computer vision. The essential problem is to develop a unified representation that effectively encodes both visual appearance and spatial information so that it can be easily applied to various vision applications such as face recognition, image matching, and multimodal image retrieval. Over the history of computer vision research, four major levels of visual representation have emerged: geometric, low-level, mid-level, and high-level. The dissertation comprises four works, each studying effective visual representations at one of these four levels. Multiple approaches are proposed with the aim of improving the robustness, interpretability, and scalability of visual representations.
Geometric features are effective in matching images under spatial transformations; however, their performance is sensitive to noise. In the first part, we model the uncertainty of a geometric representation based on line segments and equip these features with that uncertainty model so that they can be applied robustly in the image-based geolocation application.
In the second part, we study the robustness of feature encoding to noisy keypoints. We show that traditional feature encoding is sensitive to background or noisy features. We propose the Selective Encoding framework, which learns the relevance distribution of each codeword and incorporates this information into the original codebook model. Our approach is more robust to localization errors and uncertainty in the active face authentication application.
The mission of visual understanding is to express and describe image content, which essentially means relating images to human language. That typically involves finding a common representation inferable from both domains of data. In the third part, we propose a framework to extract a mid-level spatial representation directly from language descriptions and match such spatial layouts to the detected object bounding boxes for retrieving indoor scene images from user text queries.
Modern high-level visual features are typically learned from supervised datasets, whose scalability is largely limited by the requirement of dedicated human annotation. In the last part, we propose to learn visual representations from large-scale weakly supervised data for a large number of natural language-based concepts, i.e., n-gram phrases. We propose the differentiable Jelinek-Mercer smoothing loss and train a deep convolutional neural network on images with associated user comments. We show that the learned model can predict a large number of phrase-based concepts from images, can be effectively applied to image-caption applications, and transfers well to other visual recognition datasets.
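For readers unfamiliar with the smoothing scheme named above, the following is a minimal sketch of classical Jelinek-Mercer interpolation for a count-based bigram model. It is not the dissertation's differentiable loss, whose exact form is not given in the abstract; the function name `jelinek_mercer_bigram`, the weight `lam`, and the toy corpus are assumptions made for illustration only.

```python
from collections import Counter


def jelinek_mercer_bigram(tokens, lam=0.7):
    """Return a function p(word, prev) that linearly interpolates the
    maximum-likelihood bigram estimate with the unigram estimate.

    `lam` (hypothetical default) is the weight on the bigram term.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def prob(word, prev):
        p_uni = unigrams[word] / total
        # Maximum-likelihood bigram estimate; zero when the context is unseen.
        p_bi = bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
        return lam * p_bi + (1.0 - lam) * p_uni

    return prob


# Toy corpus, purely illustrative.
corpus = "a red car a red bus a blue car".split()
p = jelinek_mercer_bigram(corpus, lam=0.5)
```

The point of the interpolation is that a phrase never observed in a given context (here "bus" after "blue") still receives non-zero probability from the lower-order term, which matters when the supervision comes from sparse, noisy user comments.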
Prediction-related neural response alterations in the ventral visual stream
Theories of predictive coding (PC; Rao & Ballard, 1999) have dominated neurocognitive research in explaining thought and perception processes in various domains. The basic principle is that perception relies not only on bottom-up processing of sensory input but also on top-down predictions. The current thesis describes several neuronal response alterations in cortical visual areas measured with neuroimaging methods. The so-called repetition suppression (RS) effect has been connected to predictive coding, as repetitions make stimuli more expected, which results in a smaller prediction error (PE) and therefore attenuated neuronal activity. Still, it is questioned whether RS reflects the PE or is a local process within neuronal populations that occurs without top-down influences (Grill-Spector et al., 2006). Another frequently investigated effect is the reduced neuronal response to expected or predicted visual input, called expectation suppression (ES). A considerable body of research on contextual response changes, such as RS and ES, relates to the visual system and the face-processing network in particular. Overall, we demonstrate the importance of stimulus predictability for studies using RS to uncover expectancy-related effects. Furthermore, we suggest that the influence of sensory precision on measures of RS and ES needs more attention in future research. Concerning the stimulus material in the presented studies (unfamiliar, visually familiar, and famous familiar faces), we also emphasize the importance of thoroughly considering the characteristics of faces in terms of prior belief and sensory input precision and predictability when using them for testing prediction-related effects.
Tracking nucleotide-binding-site-leucine-rich-repeat resistance gene analogues in the wheat genome complex
Investigations into plant-pathogen interactions have provided us with several models underlying the genetic basis of host resistance in plants. In the past decade, tens of resistance genes have been isolated from numerous crop and model plant species and these form a few distinct classes when classified by domain structure, the majority being nucleotide-binding-site-leucine-rich-repeat (NBS-LRR) genes. The NBS-LRR family consists of two sub-families based on the N-terminal domain: the coiled-coil (CC) NBS-LRRs and the Toll Interleukin Receptor homology domain (TIR) NBS-LRRs. The potential of these genes for future and current agricultural breeding programs has driven a large number of studies exploring the members of these gene families in the genomes of a variety of crop species. In the present study I focused on the NBS-LRR family in the allohexaploid wheat genome and obtained a comprehensive set of Triticeae NBS-LRR homologues using a combination of data-mining approaches. As a starting point I detected conserved motifs in the dataset, finding all six previously characterized in the core-NBS domain of other plant NBS-LRRs. Phylogenetic analysis was performed to study relationships between the Triticeae NBS-LRR family and the 25 CC-NBS-LRR (CNL) R genes identified to date. I found the Triticeae CNL family to be highly divergent, containing ancient clade lineages, as seen in all angiosperm 120 taxa previously studied, and found a number of "ancient" dicotyl R genes grouped with Triticeae clades. The evolution of recent NBS-LRR gene duplications in the Triticeae was studied by way of two modes of duplication: firstly, individual gene duplications yielding paralogous loci, and secondly, gene duplication by allopolyploidy. Current models of NBS-LRR family evolution predict that functional divergence occurs after gene duplication.
An alternative is that divergence takes place at allele level, followed by a locus duplication that fixes heterozygosity in a single haplotype by unequal recombination. I investigated this hypothesis by studying the evolution of gene duplicates in two different contexts: paralogous duplications in the diploid barley genome and homeologous duplications in the allohexaploid genome of wheat. Nonsynonymous to synonymous substitution rate ratios were estimated for paralogous gene duplications in three recently diverged NBS-LRR clades. All pairwise comparisons yielded Ka:Ks ratios strongly indicative of purifying selection. Given that R gene mediated resistance is inherited qualitatively rather than quantitatively, I interpret this as evidence that even closely related paralogous copies (90-95% identity) should have independent recognition specificities maintained by purifying selection. Homeologous duplications were studied in allohexaploid wheat (AABBDD) using a section of the go35 NBS-LRR gene (2L) of the B and D diploid donor species of wheat. Numerous synonymous substitutions distinguished the B and D genome copies, with an absence of nonsynonymous substitutions. In contrast, single unique nonsynonymous substitutions were found in four out of five polyploid wheat go35 alleles, indicating that selection pressure was indeed relaxed across the homeolocus. Recent studies on polyploid genomes have shown that duplicated resistance genes are far more likely to be eliminated than highly transcribed genes such as tRNAs and rRNAs. These results are in agreement with the view that functional divergence takes place before duplication for NBS-LRR genes, as the loci duplicated by polyploidy appear not to evolve under purifying selection, as I found for the paralogous loci investigated. Dissertation (MSc)--University of Pretoria, 2008. Genetics.
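The distinction between synonymous and nonsynonymous substitutions that underlies the comparisons above can be illustrated with a small sketch. It only counts differing codons in two aligned, gap-free coding sequences and classifies each by whether the encoded amino acid changes; it is not a Ka:Ks estimator, which would additionally normalize by the number of synonymous and nonsynonymous sites and correct for multiple substitutions. The function name and the two short sequences are hypothetical and are not taken from the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Count codons that differ between two aligned coding sequences,
    split into synonymous (same amino acid) and nonsynonymous changes."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical sequences: Met-Ala-Leu-Lys versus Met-Ala-Leu-Arg.
syn, nonsyn = count_codon_differences("ATGGCTCTGAAA", "ATGGCCCTGAGA")
```

In this toy pair, GCT to GCC leaves alanine unchanged (synonymous), while AAA to AGA replaces lysine with arginine (nonsynonymous). An excess of the first kind over the second, after proper normalization, is the pattern read as purifying selection.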
The neurobiology of cortical music representations
Music is undeniably one of humanity's defining traits, as it has been documented since the earliest
days of mankind, is present in all known cultures, and is perceivable nearly alike by all humans.
Intrigued by its omnipresence, researchers of all disciplines started investigating music's
mystical relationship and tremendous significance to humankind several hundred
years ago. Comparatively recently, the immense advancement of neuroscientific methods
has also enabled the examination of cognitive processes related to the processing of music. Within
this neuroscience of music, the vast majority of research has focused on how music, as an auditory
stimulus, reaches the brain and how it is initially processed, as well as on the tremendous
effects it has on and can evoke through the human brain. However, the intermediate steps, that is,
how the human brain transforms incoming signals into a seemingly specialized
and abstract representation of music, have received less attention. Aiming to address this gap,
the thesis presented here targeted these transformations, their possible underlying processes,
and how both could potentially be explained through computational models. To this end, four
projects were conducted. The first two comprised the creation and implementation of two
open-source toolboxes to, first, tackle problems inherent to auditory neuroscience, and thus also affecting
neuroscientific music research, and second, provide the basis for further advancements
through standardization and automation. More precisely, these problems were the deteriorated hearing
thresholds and abilities in MRI settings and the difficult localization and parcellation of the
human auditory cortex as the core structure involved in auditory processing. The third project
focused on the human brain's apparent tuning to music by investigating functional and organizational
principles of the auditory cortex and network with regard to the processing of different
auditory categories of comparable social importance, more precisely whether the perception of music
evokes a distinct and specialized pattern. To provide an in-depth characterization
of the respective patterns, both the segregation and integration of auditory cortex regions were
examined. In the fourth and final project, a highly multimodal approach that included fMRI,
EEG, behavior, and models of varying complexity was used to evaluate how the aforementioned
music representations are generated along the cortical hierarchy of auditory processing
and how they are influenced by bottom-up and top-down processes. The results of projects 1
and 2 demonstrated the need for further advancement of MRI settings and for the definition
of working models of the auditory cortex, as hearing thresholds and abilities seem to vary as
a function of the data acquisition protocol used, and the localization and parcellation of the
human auditory cortex diverge drastically depending on the underlying approach. Project 3
revealed that the human brain is apparently indeed tuned for music by means of a specialized
representation, as music evoked a bilateral network with a right-hemispheric weighting that was not
observed for the other included categories. The result of this specialized and hierarchical recruitment
of anterior and posterior auditory cortex regions was an abstract music component
that is situated in anterior regions of the superior temporal gyrus and preferentially encodes music,
regardless of whether it is sung or instrumental. The outcomes of project 4 indicated that even though
the entire auditory cortex, again with a right-hemispheric weighting, is involved in the complex
processing of music, anterior regions in particular yielded an abstract representation that varied
excessively over time and could not be sufficiently explained by any of the tested models. The
specialized and abstract properties of this representation were furthermore underlined by the
predictive ability of the tested models, as models that were based either on high-level features
such as behavioral representations and concepts or on complex acoustic features always outperformed
models based on single or simpler acoustic features. Additionally, factors known to influence
auditory and thus music processing, such as musical training, apparently did not alter the
observed representations. Together, the results of the projects suggest that the specialized and
stable cortical representation of music is the outcome of sophisticated transformations of incoming
sound signals along the cortical hierarchy of auditory processing that generate a music
component in anterior regions of the superior temporal gyrus by means of top-down processes
that interact with acoustic features, guiding their processing.
Investigating the mechanisms underlying fixation durations during the first year of life: a computational account
Infants' eye movements provide a window onto the development of cognitive functions over the
first years of life. Despite considerable advances in the past decade, studying the mechanisms
underlying infant fixation duration and saccadic control remains a challenge due to practical and
technical constraints in infant testing. This thesis addresses these issues and investigates infant
oculomotor control by presenting novel software and methods for dealing with low-quality infant
data (GraFIX), a series of behavioural studies involving novel gaze-contingent and scene-viewing
paradigms, and computational modelling of fixation timing throughout development. In a
cross-sectional study and two longitudinal studies, participants were eye-tracked while viewing
dynamic and static complex scenes, and performed gap-overlap and double-step paradigms.
Fixation data from these studies were modelled in a number of simulation studies with the
CRISP model of fixation durations in adults in scene viewing. Empirical results showed how
fixation durations decreased with age for all viewing conditions but at different rates. Individual
differences between long- and short-lookers were found across visits and viewing conditions,
with static images being the most stable viewing condition. Modelling results confirmed the
CRISP theoretical framework's applicability to infant data and highlighted the influence of both
cognitive processing and the developmental state of the visuo-motor system on fixation
durations during the first few months of life. More specifically, while the present work suggests
that infant fixation durations reflect on-line perceptual and cognitive activity similarly to adults,
the individual developmental state of the visuo-motor system still affects this relationship until 10
months of age. Furthermore, results suggested that infants are already able to program
saccades in two stages at 3.5 months: (1) an initial labile stage subject to cancellation and (2) a
subsequent non-labile stage that cannot be cancelled. The length of the non-labile stage
decreased relative to the labile stage especially from 3.5 to 5 months, indicating a greater ability
to cancel saccade programs as infants grew older. In summary, the present work provides
unprecedented insights into the development of fixation durations and saccadic control during
the first year of life and demonstrates the benefits of mixing behavioural and computational
approaches to investigate methodologically challenging research topics such as oculomotor
control in infancy.
Factors predictive of emotional and behavioural difficulties in children with refractory focal epilepsy
Focal epilepsy in childhood is associated with increased risk for developing behavioral, emotional, cognitive and social-adaptive impairments. The present thesis focused on mental health difficulties in paediatric refractory focal epilepsy. It undertook a detailed evaluation of the predictive power of several demographic (gender, age at assessment), clinical (age at onset and duration of epilepsy, seizure frequency), localization (lobe and lateralization of pathology) and cognitive variables (performance in intellectual, memory and academic attainment measures) for mood, conduct, inattention/hyperactivity and peer relationship difficulties, as assessed by parental report. Data from a population of 282 children and adolescents, previously collected for clinical purposes, were examined using a series of univariate and multivariate analyses. Mental health difficulties were found to be highly prevalent, with peer relationships the most frequently reported area of difficulty, followed by inattention/hyperactivity and emotional difficulties. Different patterns of associations between the variables examined here and individual emotional/behavioural difficulties were revealed, partially confirming and extending previous findings in the literature. Longer duration of epilepsy was found to increase the risk for developing emotional difficulties; male gender and earlier age at onset the risk for conduct difficulties; male gender, earlier age at onset, longer duration and frontal lobe localization the risk for attention/hyperactivity difficulties; and finally longer duration, higher seizure frequency and right hemisphere lateralization the risk for peer difficulties. Lower cognitive functioning was found to be associated with overall increased mental health difficulties, and a lower VIQ was predictive of all types of difficulties.
Developing a firm understanding of the risk factors that contribute to mental health comorbidities in focal paediatric epilepsy can help identify children who are at higher risk earlier and provide them with assessment and intervention, thus significantly improving quality of life.
Engineering planetary lasers for interstellar communication
Transmitting large amounts of data efficiently among neighboring stars will vitally support any eventual contact with extrasolar intelligence, whether alien or human. Laser carriers are particularly suitable for high-quality, targeted links. Space laser transmitter systems designed by this work, based on both demonstrated and imminent advanced space technology, could achieve reliable data transfer rates as high as 1 kb/s to matched receivers as far away as 25 pc, a distance including over 700 approximately solar-type stars. The centerpiece of this demonstration study is a fleet of automated spacecraft incorporating adaptive neural-net optical processing active structures, nuclear electric power plants, annular momentum control devices, and ion propulsion. Together the craft sustain, condition, modulate, and direct to stellar targets an infrared laser beam extracted from the natural mesospheric, solar-pumped, stimulated CO2 emission recently discovered at Venus. For a culture already supported by mature interplanetary industry, the cost of building planetary or high-power space laser systems for interstellar communication would be marginal, making such projects relevant for the next human century. Links using high-power lasers might support data transfer rates as high as optical frequencies could ever allow. A nanotechnological society such as we might become would inevitably use 10^20 b/yr transmission to promote its own evolutionary expansion out of the galaxy.
Neural dynamics in cortical populations
Many essential neural computations are implemented by large populations of neurons working in concert. Recent studies have sought both to monitor increasingly large groups of neurons and to characterise their collective behaviour, but the standard computational approaches available to identify the collective dynamics scale poorly with the size of the dataset. We develop new efficient methods for discovering the low-dimensional dynamics that underlie simultaneously-recorded spike trains from a neural population. We use the new models to analyze two different sets of population recordings, one from motor cortex and another from auditory cortex. In motor cortex, we describe the nature of the trial-by-trial spontaneous fluctuations identified by the model and connect these fluctuations to behavioral events. The spatio-temporal structure of the spontaneous events was tracked by three trajectories identified by the model. These trajectories followed similar dynamics during hand reaches as they did when the hands were stationary. The structure of the models we developed allows them to be used as decoders of hand position from neural activity, significantly improving upon previous state-of-the-art methods. The decoders were able to predict information about the direction, onset time and speed profile of movements. In auditory cortex, we use the statistical models to identify population dynamics under different brain states. We report major differences in dynamics and stimulus coding between synchronized and desynchronized brain states. Synchronized but not desynchronized brain states imposed constraints on neural dynamics such that a four-dimensional system accounted for most of the dynamical structure of population events. We used the low-dimensional representation of the data to construct network simulations that reproduced the patterns present in the recordings.
The simulations suggest that the overall level of feedback inhibition controls the stability of each local cortical network, with unstable dynamics resulting in synchronized brain states. Finally, we propose a functional role for dynamics in the representation of visual motion in visual cortex.
Ambiguous Recognition: Recursion, Cognitive Blending, and the Problem of Interpretation in Twenty-First-Century Fiction
This dissertation uses theories of cognitive conceptual integration (as outlined by Gilles Fauconnier and Mark Turner) to propose a model of narrative reading that mediates between narratology and theories of reception. I use this model to demonstrate how new experimental narratives achieve a potent balance between a determinate and open story-form. Where the high postmodernists of the 1970s and 80s created ironic, undecidable story-worlds, the novels considered here allow readers to embrace seemingly opposite propositions without retreating into ironic suspension, trading the postmodernist "neither/nor" for a new "both/and." This technique demands significant revision both of descriptions of radical experimentation in twenty-first-century novels and of earlier narratological accounts of the distinction between story and discourse.
Each novel considered in this dissertation encourages its reader to recognize combined concepts in the course of the reading process. Shelley Jackson's Half Life combines singular and plural identity, reimagining individualist subjectivity and the literary treatment of (dis)ability. Mark Z. Danielewski's Only Revolutions combines objective and subjective temporality, offering a new perspective on American myth-making in the popular post-Kerouac road-novel tradition. Percival Everett's Erasure combines reliable and unreliable narration to create a complex critique of the idea of an African American novel tradition. M.D. Coverley's hypertext novel Califia involves the reader in all three of these discursive dimensions at once, updating the marginalized art of hypertext fiction by inviting the reader to see his or her role in navigating the text as both creative and determined: the epitome of open-and-closed form.
My analysis demonstrates how cognitive blending is a precise method for describing how a reader interprets complex narrative structures. I propose this blending-model as a new approach to contemporary experimental fiction from the perspective of the reader's cognitive work, and show how it offers new readings of important contemporary fiction. I argue that twenty-first-century authors attempt simultaneously to construct "open" forms and to address real socio-cultural concerns in the world; I also argue that a narratology founded on theories of cognitive processes is best equipped to describe the operations of reading and understanding these complex narrative forms.