762 research outputs found

    Using AI and Robotics for EV battery cable detection.: Development and implementation of end-to-end model-free 3D instance segmentation for industrial purposes

    Master's thesis in Information- and communication technology (IKT590). This thesis describes a novel method for capturing point clouds and segmenting instances of cabling found on electric vehicle battery packs. The use of cutting-edge perception algorithm architectures, such as graph-based and voxel-based convolution, in industrial autonomous lithium-ion battery pack disassembly is being investigated. The thesis focuses on the challenge of getting a desirable representation of any battery pack using an ABB robot in conjunction with a high-end structured light camera, with "end-to-end" and "model-free" as design constraints. The thesis employs self-captured datasets comprising several battery packs that have been captured and labeled. The datasets are then used to create a perception system. This thesis recommends using HDR functionality in an industrial application to capture the full dynamic range of the battery packs. To adequately depict 3D features, a three-point-of-view capture sequence is deemed necessary. A general capture process for an entire battery pack is also presented, but a next-best-scan algorithm is likely required to ensure a "close to complete" representation. Graph-based deep-learning algorithms have been shown to be capable of being scaled up to 50,000 inputs while still exhibiting strong performance in terms of accuracy and processing time. The results show that an instance segmenting system can be implemented in less than two seconds. Using off-the-shelf hardware, the thesis demonstrates that a 3D perception system is industrially viable and competitive with a 2D perception system.

    The emergence of structure from continuous speech: Multiple cues and constraints for speech segmentation and its neural bases

    This thesis studies learning mechanisms and cognitive biases present from birth involved in language acquisition, in particular in speech segmentation and the extraction of linguistic regularities. Due to the sequential nature of speech, uncovering language structure is closely related to how infants segment speech. We investigated infant abilities to track distributional properties of the stimuli, and the role of prosodic cues and of memory constraints. In two experiments we investigated neonates’ capacities to segment and extract words from continuous speech by using fNIRS. Experiment 1 demonstrates that neonates can segment and extract words from continuous speech based on distributional cues alone, whereas Experiment 2 shows that newborns can extract words when they are marked only by prosodic contours. Additionally, we implemented a method for the study of the dynamics of the functional connectivity of the neonatal brain during speech segmentation tasks. We identified stable and reproducible functional networks with small-world properties that were task independent. Moreover, we observed periods of high global and low global connectivity, which, remarkably, were task dependent, with stronger values when neonates listen to speech with structure. In another set of experiments we studied memory constraints on the encoding of six-syllabic words in newborns using fNIRS. Experiment 4 demonstrates that the edge syllables of a sequence are better encoded, and Experiment 5 goes further by showing that a subtle pause enhances the encoding of intermediate syllables, which evidences the role of prosodic cues in speech processing. A final group of experiments explores how information is encoded when it is presented continuously across different modalities, specifically whether an abstract encoding of the sequences’ constituents is generated. Experiments 6-9 suggest that adults form an abstract representation of words based on the position of the syllables, but only in the speech modality. In Experiments 10 and 11 we used pupillometry to test the same in 5-month-old infants. However, the results were not conclusive: we did not find evidence of an abstract encoding.

    MINIMAL BASIS REPRESENTATION FOR GENERAL MOTION SEGMENTATION

    Ph.D. DOCTOR OF PHILOSOPHY

    IMAGE RETRIEVAL BASED ON COMPLEX DESCRIPTIVE QUERIES

    The amount of visual data such as images and videos available over the web has increased exponentially over the last few years. In order to efficiently organize and exploit these massive collections, a system, apart from being able to answer simple classification-based questions such as whether a specific object is present (or absent) in an image, should also be capable of searching images and videos based on more complex descriptive questions. There is also a considerable amount of structure present in the visual world which, if effectively utilized, can help achieve this goal. To this end, we first present an approach for image ranking and retrieval based on queries consisting of multiple semantic attributes. We further show that there are significant correlations between these attributes and that accounting for them can lead to superior performance. Next, we extend this by proposing an image retrieval framework for descriptive queries composed of object categories, semantic attributes and spatial relationships. The proposed framework also includes a unique multi-view hashing technique, which enables query specification in three different modalities: image, sketch and text. We also demonstrate the effectiveness of leveraging contextual information to reduce the supervision requirements for learning object and scene recognition models. We present an active learning framework to simultaneously learn appearance and contextual models for scene understanding. Within this framework we introduce new kinds of labeling questions that are designed to collect appearance as well as contextual information and that mimic the way in which humans actively learn about their environment. Furthermore, we explicitly model the contextual interactions between the regions within an image and select the question that leads to the maximum reduction in the combined entropy of all the regions in the image (image entropy).

    Neural Network Dynamics of Visual Processing in the Higher-Order Visual System

    Vision is one of the most important human senses that facilitate rich interaction with the external environment. For example, optimal spatial localization and subsequent motor contact with a specific physical object among others requires a combination of visual attention, discrimination, and sensory-motor coordination. The mammalian brain has evolved to elegantly solve this problem of transforming visual input into an efficient motor output to interact with an object of interest. The frontal and parietal cortices are two higher-order brain areas (i.e., areas that process information beyond simple sensory transformations) that are intimately involved in assessing how an animal’s internal state or prior experiences should influence cognitive-behavioral output. It is well known that activity within each region and functional interactions between both regions are correlated with visual attention, decision-making, and memory performance. Therefore, it is not surprising that impairment in the fronto-parietal circuit is often observed in many psychiatric disorders. Network- and circuit-level fronto-parietal involvement in sensory-based behavior is well studied; however, comparatively less is known about how single-neuron activity in each of these areas can give rise to such macroscopic activity. The goal of the studies in this dissertation is to address this gap in knowledge through simultaneous recordings of cellular and population activity during sensory processing and behavioral paradigms. Together, the combined narrative builds on several themes in neuroscience: variability of single-cell function, population-level encoding of stimulus properties, and state- and context-dependent neural dynamics. Doctor of Philosophy