
    Spatially Prioritized and Persistent Text Detection and Decoding

    Abstract—We show how to exploit temporal and spatial coherence to achieve efficient and effective text detection and decoding for a sensor suite moving through an environment in which text occurs at a variety of locations, scales and orientations with respect to the observer. Our method uses simultaneous localization and mapping (SLAM) to extract planar “tiles” representing scene surfaces. It then fuses multiple observations of each tile, captured from different observer poses, using homography transformations. Text is detected using Discrete Cosine Transform (DCT) and Maximally Stable Extremal Regions (MSER) methods; MSER enables fusion of multiple observations of blurry text regions in a component tree. The observations from SLAM and MSER are then decoded by an Optical Character Recognition (OCR) engine. The decoded characters are then clustered into character blocks to obtain an MLE word configuration. This paper’s contributions include: 1) spatiotemporal fusion of tile observations via SLAM, prior to inspection, thereby improving the quality of the input data; and 2) combination of multiple noisy text observations into a single higher-confidence estimate of environmental text.
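
    The pipeline sketched above (SLAM-extracted planar tiles, homography fusion of repeated observations, MSER detection, then OCR) can be illustrated in a few lines. The snippet below is a minimal sketch using OpenCV, not the authors' implementation; the function names and the simple per-pixel averaging used for fusion are assumptions made for clarity.

```python
# Illustrative sketch only (not the paper's code): warp several observations
# of one planar tile into a common frame using SLAM-estimated homographies,
# average them to suppress blur and noise, then propose text regions with MSER.
import cv2
import numpy as np

def fuse_tile_observations(images, homographies, tile_size):
    """Average multiple observations of a planar tile.

    images:       list of grayscale uint8 arrays
    homographies: list of 3x3 arrays mapping each image into tile coordinates
    tile_size:    (width, height) of the canonical tile
    """
    acc = np.zeros((tile_size[1], tile_size[0]), dtype=np.float32)
    for img, H in zip(images, homographies):
        warped = cv2.warpPerspective(img, H, tile_size)
        acc += warped.astype(np.float32)
    return (acc / len(images)).astype(np.uint8)

def detect_text_regions(fused_tile):
    """Propose candidate text regions on the fused tile with MSER."""
    mser = cv2.MSER_create()
    regions, _ = mser.detectRegions(fused_tile)
    return [cv2.boundingRect(r.reshape(-1, 1, 2)) for r in regions]
```

    The detected regions would then be cropped and passed to an OCR engine, with the per-character outputs clustered into word hypotheses as the abstract describes.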

    Oscillatory Control over Representational States in Working Memory

    In the visual world, attention is guided by perceptual goals activated in visual working memory (VWM). However, planning multiple-task sequences also requires VWM to store representations for future goals. These future goals need to be prevented from interfering with the current perceptual task. Recent findings have implicated neural oscillations as a control mechanism serving the implementation and switching of different states of prioritization of VWM representations. We review recent evidence that posterior alpha-band oscillations underlie the flexible activation and deactivation of VWM representations and that frontal delta-to-theta-band oscillations play a role in the executive control of this process. That is, frontal delta-to-theta appears to orchestrate posterior alpha through long-range oscillatory networks to flexibly set up and change VWM states during multitask sequences.

    A Review on Text Detection Techniques

    Text detection in images is an important field. Reading text is challenging because of the variations in images. Text detection is useful for many navigational purposes, e.g. text in Google APIs and on traffic panels. This paper analyzes the work done on text detection by many researchers, critically evaluates the techniques designed for text detection, and states the limitations of each approach. We integrate the work of many researchers to give a brief overview of the available techniques, and discuss their strengths and limitations to give readers a clear picture. The major datasets discussed in these papers are ICDAR 2003, 2005, 2011, 2013, and SVT (Street View Text).

    Bridging text spotting and SLAM with junction features

    Navigating in a previously unknown environment and recognizing naturally occurring text in a scene are two important autonomous capabilities that are typically treated as distinct. However, these two tasks are potentially complementary: (i) scene and pose priors can benefit text spotting, and (ii) the ability to identify and associate text features can benefit navigation accuracy through loop closures. Previous approaches to autonomous text spotting typically require significant training data and are too slow for real-time implementation. In this work, we propose a novel high-level feature descriptor, the “junction”, which is particularly well-suited to text representation and is also fast to compute. We show that we are able to improve SLAM through text spotting on datasets collected with a Google Tango, illustrating how location priors enable improved loop closure with text features.
    Funding: Andrea Bocelli Foundation; East Japan Railway Company; United States Office of Naval Research (N00014-10-1-0936, N00014-11-1-0688, N00014-13-1-0588); National Science Foundation (U.S.) (IIS-1318392).
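
    The paper's “junction” descriptor is not reproduced here, but the general idea of junction-style keypoints on text strokes can be conveyed with a small sketch: skeletonize a binary text mask and keep skeleton pixels with three or more neighbours. The use of scikit-image and SciPy, and the function name junction_points, are choices made for this illustration only.

```python
# Illustrative only: finds branch points on a skeletonized binary text mask,
# a rough stand-in for junction-style text keypoints (not the paper's descriptor).
import numpy as np
from scipy.ndimage import convolve
from skimage.morphology import skeletonize

def junction_points(binary_text_mask):
    """Return (row, col) coordinates of skeleton pixels with >= 3 neighbours."""
    skel = skeletonize(binary_text_mask.astype(bool))
    # Count the 8-connected skeleton neighbours of every pixel.
    kernel = np.array([[1, 1, 1],
                       [1, 0, 1],
                       [1, 1, 1]])
    neighbour_count = convolve(skel.astype(np.uint8), kernel, mode="constant")
    return np.argwhere(skel & (neighbour_count >= 3))
```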

    The cognitive neuroscience of visual working memory

    Visual working memory allows us to temporarily maintain and manipulate visual information in order to solve a task. The study of the brain mechanisms underlying this function began more than half a century ago, with Scoville and Milner’s (1957) seminal discoveries with amnesic patients. This timely collection of papers brings together diverse perspectives on the cognitive neuroscience of visual working memory from multiple fields that have traditionally been fairly disjointed: human neuroimaging, electrophysiological, behavioural and animal lesion studies, investigating both the developing and the adult brain.

    Domain-Specific Computing Architectures and Paradigms

    We live in an exciting era where artificial intelligence (AI) is fundamentally shifting the dynamics of industries and businesses around the world. AI algorithms such as deep learning (DL) have drastically advanced state-of-the-art cognition and learning capabilities. However, the power of modern AI algorithms can only be enabled if the underlying domain-specific computing hardware can deliver orders of magnitude more performance and energy efficiency. This work focuses on this goal and explores three parts of the domain-specific computing acceleration problem, encapsulating specialized hardware and software architectures and paradigms that support the ever-growing processing demand of modern AI applications from the edge to the cloud. The first part of this work investigates the optimizations of a sparse spatio-temporal (ST) cognitive system-on-a-chip (SoC). This design extracts ST features from videos and leverages sparse inference and kernel compression to efficiently perform action classification and motion tracking. The second part of this work explores the significance of dataflows and reduction mechanisms for sparse deep neural network (DNN) acceleration. This design features a dynamic, look-ahead index matching unit in hardware to efficiently discover fine-grained parallelism, achieving high energy efficiency and low control complexity for a wide variety of DNN layers. Lastly, this work expands the scope to real-time machine learning (RTML) acceleration. A new high-level architecture modeling framework is proposed. Specifically, this framework consists of a set of high-performance RTML-specific architecture design templates, and a Python-based high-level modeling and compiler tool chain for efficient cross-stack architecture design and exploration.
    PhD thesis, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162870/1/lchingen_1.pd
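
    As a software analogue of the index-matching idea in the second part, the toy function below computes a sparse dot product by multiply-accumulating only where a weight index and an activation index coincide. It models only the functional behaviour, not the look-ahead hardware matcher described in the work; the name sparse_dot and the sorted index-list encoding are assumptions for this sketch.

```python
# Functional sketch of sparse index matching: perform a MAC only where a
# weight and an activation share a nonzero index, as sparse DNN accelerators do.
def sparse_dot(w_idx, w_val, a_idx, a_val):
    """Dot product of two sparse vectors given sorted (index, value) lists."""
    acc, i, j = 0.0, 0, 0
    while i < len(w_idx) and j < len(a_idx):
        if w_idx[i] == a_idx[j]:      # indices match: multiply-accumulate
            acc += w_val[i] * a_val[j]
            i += 1
            j += 1
        elif w_idx[i] < a_idx[j]:     # skip positions that are zero on one side
            i += 1
        else:
            j += 1
    return acc

# Only index 4 is shared, so the result is 2.0 * 0.5 = 1.0
print(sparse_dot([1, 4, 7], [3.0, 2.0, 1.0], [0, 4, 9], [5.0, 0.5, 2.0]))
```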

    Neural network mechanisms of working memory interference

    Our ability to memorize is at the core of our cognitive abilities. How could we effectively make decisions without considering memories of previous experiences? Broadly, our memories can be divided into two categories: long-term and short-term memories. Sometimes, short-term memory is also called working memory, and throughout this thesis I will use both terms interchangeably. As the names suggest, long-term memory is the memory you use when you remember concepts for a long time, such as your name or age, while short-term memory is the system you engage while choosing between different wines at the liquor store. As your attention jumps from one bottle to another, you need to hold in memory the characteristics of previous ones to pick your favourite. By the time you pick your favourite bottle, you might remember the prices or grape types of the other bottles, but you are likely to forget all of those details an hour later at home, opening the wine in front of your guests. The overall goal of this thesis is to study the neural mechanisms that underlie working memory interference, as reflected in quantitative, systematic behavioral biases. Ultimately, the goal of each chapter, even when focused exclusively on behavioral experiments, is to nail down plausible neural mechanisms that can produce specific behavioral and neurophysiological findings. To this end, we use the bump-attractor model as our working hypothesis, with which we often contrast the synaptic working memory model. The work performed during this thesis is described here in 3 main chapters, encapsulating 5 broad goals. In Chapter 4.1, we aim to test behavioral predictions of a bump-attractor network when it is used to store multiple items (1). Moreover, we connect two such networks, aiming to model feature binding through selectivity synchronization (2). In Chapter 4.2, we aim to clarify the mechanisms of working memory interference from previous memories (3), the so-called serial biases. These biases provide an excellent opportunity to contrast activity-based and activity-silent mechanisms, because both have been proposed to be the underlying cause of those biases. In Chapter 4.3, armed with the same techniques used to seek evidence for activity-silent mechanisms, we test a prediction of the bump-attractor model with short-term plasticity (4). Finally, in light of the results from aim 4 and simple computer simulations, we reinterpret previous studies claiming evidence for activity-silent mechanisms (5).
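
    For readers unfamiliar with the bump-attractor hypothesis used as the working model throughout, the snippet below is a deliberately simplified, discrete-time caricature (local excitatory coupling plus divisive normalization standing in for global inhibition), not the thesis code: after a brief cue, a localized bump of activity persists at the cued angle through the delay and can be decoded from the population.

```python
# Simplified bump-attractor caricature (NumPy): a cue creates a localized
# activity bump that the recurrent dynamics maintain after the cue is gone.
import numpy as np

N = 256
theta = np.linspace(-np.pi, np.pi, N, endpoint=False)   # preferred angles on a ring

# Wrapped-Gaussian excitatory coupling between preferred angles
d = np.angle(np.exp(1j * (theta[:, None] - theta[None, :])))
K = np.exp(-d**2 / (2 * 0.3**2))

cue_angle = 1.0                                          # remembered feature (radians)
r = np.exp(-np.angle(np.exp(1j * (theta - cue_angle)))**2 / (2 * 0.3**2))
r /= r.sum()                                             # transient cue sets the state

for _ in range(500):                                     # delay period, no external input
    r = (K @ r) ** 2                                     # local excitation + sharpening
    r /= r.sum()                                         # divisive normalization (inhibition)

decoded = np.angle(np.sum(r * np.exp(1j * theta)))       # population vector readout
print(f"cued angle = {cue_angle:.2f} rad, decoded after delay = {decoded:.2f} rad")
```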

    Pinging the brain to reveal hidden working memory states

    Maintaining information for short periods of time in working memory, without it being present in the outside world, is crucial for everyday life, allowing us to move beyond simple, reflexive actions and towards complex, goal-directed behaviours. The long-standing consensus has been that the continuous activity of specific neurons keeps this information “online” until it is no longer required. However, this classic theory has been questioned more recently. Working memories that are not actively rehearsed seem to be maintained in an “activity-silent” network, eliciting no measurable neural activity, suggesting that short-term changes in the neural wiring patterns are responsible for their maintenance. These memories are thus hidden from conventional measuring techniques, making them difficult to study. This thesis proposes an approach to reveal hidden working memories that is analogous to active sonar: hidden structures can be inferred from the echo of a “ping”. Similarly, by pushing a wave of activity through the silent neural network via external stimulation (for example, a white flash), the resulting recorded activity patterns expose the previously hidden memories held in that network. This approach is demonstrated in a series of experiments in which both visual and auditory working memories are revealed. It is also used to reconstruct specific working memories with high fidelity after different maintenance periods, showing that the maintenance of even a single piece of information is by no means perfect, as it tends to randomly and gradually transform within 1 to 2 seconds (for example, purple becomes blue).
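
    The pinging logic can be caricatured in a few lines of NumPy. In the toy model below, which is an illustration rather than any analysis from the thesis, the memory lives only in temporarily modified synaptic weights, baseline activity carries no signal, and a content-independent ping evokes a response whose pattern reveals the stored item. The item names and the Hebbian-style imprint function are assumptions made for the example.

```python
# Toy "activity-silent" memory revealed by a ping: the stored item shapes the
# weights W, activity sits at baseline, and an unrelated impulse input evokes a
# response that is readable only because it passes through W.
import numpy as np

rng = np.random.default_rng(0)
n = 100
items = {"purple": rng.normal(size=n), "blue": rng.normal(size=n)}

def imprint(pattern, strength=0.1):
    """Hebbian-style short-term weight change; firing returns to baseline afterwards."""
    return strength * np.outer(pattern, pattern)

W = imprint(items["purple"])          # encode 'purple', then the network goes silent
baseline_activity = np.zeros(n)       # nothing to decode from ongoing activity alone

ping = rng.normal(size=n)             # content-independent impulse (the "ping")
response = W @ ping                   # evoked activity now reflects the hidden W

# The evoked response aligns (up to sign) with the stored item, not the other one.
for name, pattern in items.items():
    score = abs(np.dot(response, pattern)) / (np.linalg.norm(response) * np.linalg.norm(pattern))
    print(f"{name}: |cosine similarity| = {score:.2f}")
```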

    Updating spatial working memory in a dynamic visual environment

    Get PDF
    The present review describes recent developments regarding the role of the eye movement system in representing spatial information and keeping track of locations of relevant objects. First, we discuss the active vision perspective and why eye movements are considered crucial for perception and attention. The second part focuses on the question of how the oculomotor system is used to represent spatial attentional priority, and the role of the oculomotor system in maintenance of this spatial information. Lastly, we discuss recent findings demonstrating rapid updating of information across saccadic eye movements. We argue that the eye movement system plays a key role in maintaining and rapidly updating spatial information. Furthermore, we suggest that rapid updating emerges primarily to make sure actions are minimally affected by intervening eye movements, allowing us to efficiently interact with the world around us.