36 research outputs found

    Advances in humanoid control and perception

    One day there will be humanoid robots among us doing our boring, time-consuming, or dangerous tasks. They might cook a delicious meal for us or do the grocery shopping. For this to become reality, many advances need to be made to the artificial intelligence of humanoid robots. The ever-increasing available computational processing power opens new doors for such advances. In this thesis we develop novel algorithms for humanoid control and vision that harness this power. We apply these methods to an iCub humanoid upper body with 41 degrees of freedom. For control, we develop Natural Gradient Inverse Kinematics (NGIK), a sampling-based optimiser that applies natural evolution strategies to perform inverse kinematics. The resulting algorithm makes very few assumptions and gives much more freedom in definable constraints than its Jacobian-based counterparts. A special graph-building procedure is introduced to build Task-Relevant Roadmaps (TRMs) by iteratively applying NGIK and storing the results. TRMs form searchable graphs of kinematic configurations on which a wide range of task-relevant humanoid movements can be planned. Through coordinating several instances of NGIK, a fast parallelised version of the TRM building algorithm is developed. To contrast the offline TRM algorithms, we also develop Natural Gradient Control (NGC), which directly uses the optimisation pass in NGIK as an online control signal. For vision, we develop dynamic vision algorithms that form cyclic information flows that affect their own processing. Deep Attention Selective Networks (dasNet) implement feedback in convolutional neural networks through a gating mechanism that is steered by a policy. Through this feedback, dasNet can focus on different features in the image in light of previously gathered information and improve classification, with state-of-the-art results at the time of publication. Then, we develop PyraMiD-LSTM, which processes 3D volumetric data by employing a novel convolutional Long Short-Term Memory network (C-LSTM) to compute pyramidal contexts for every voxel, and combines them to perform segmentation. This resulted in state-of-the-art performance on a segmentation benchmark. The work on control and vision is integrated into an application on the iCub robot. A Fast-Weight PyraMiD-LSTM is developed that dynamically generates weights for a C-LSTM layer given actions of the robot. An explorative policy using NGC generates a stream of data, which the Fast-Weight PyraMiD-LSTM has to predict. The resulting integrated system learns to model the effects of head and hand movements on future visual input. To our knowledge, this is the first effective visual prediction system on an iCub.

    Embracing Hellman: A Simple Proof-of-Space Search consensus algorithm with stable block times using Logarithmic Embargo

    Cryptocurrencies have become tremendously popular since the creation of Bitcoin. However, Bitcoin's central Proof-of-Work (PoW) consensus mechanism is very power hungry. As an alternative, Proof-of-Space (PoS) was introduced, which uses storage instead of computation to reach consensus. However, current PoS implementations are complex and sensitive to the Nothing-at-Stake problem, and use mitigations that affect their permissionless and decentralised nature. We introduce Proof-of-Space Search (PoSS), which embraces Hellman's time-memory trade-off to create a much simpler algorithm that avoids the Nothing-at-Stake problem. Additionally, we greatly stabilise block times using a novel dynamic Logarithmic Embargo (LE) rule. Combined, we show that PoSSLE is a simple and stable alternative to PoW with many of its properties, while being an estimated 10 times more energy efficient and sustaining consistent block times.
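
For readers unfamiliar with Hellman's time-memory trade-off mentioned in this abstract, the following is a minimal Python sketch of the classic idea only, not of the PoSS protocol itself. It stores just the start and end of each chain of repeated one-way steps and recomputes the interior on demand. The toy function `f`, the constants, and the single-table layout are assumptions chosen for illustration; Hellman's original scheme uses several tables with distinct reduction functions.

```python
import hashlib

SPACE_BITS = 16    # toy search space of 2**16 values (assumption)
CHAIN_LEN = 32     # number of one-way steps per chain (assumption)
NUM_CHAINS = 512   # number of chains kept in storage (assumption)

def f(x):
    """Toy one-way step: SHA-256 truncated to SPACE_BITS bits."""
    digest = hashlib.sha256(x.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % (1 << SPACE_BITS)

def build_table():
    """Store only endpoint -> start for each chain; interior points are dropped."""
    table = {}
    for start in range(NUM_CHAINS):
        x = start
        for _ in range(CHAIN_LEN):
            x = f(x)
        table[x] = start
    return table

def invert(y, table):
    """Return some x with f(x) == y if a stored chain covers y, else None."""
    z = y
    for _ in range(CHAIN_LEN):
        if z in table:
            # Candidate chain found: replay it from its stored start.
            x = table[z]
            for _ in range(CHAIN_LEN):
                if f(x) == y:
                    return x
                x = f(x)
            # Otherwise it was a false alarm (merged chains); keep walking.
        z = f(z)
    return None
```

Storing `NUM_CHAINS` pairs instead of every visited point is the memory saving; the price is up to `CHAIN_LEN` extra evaluations of `f` per lookup, plus occasional false alarms.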

    Interaction of AIIIBV-type semiconductors with H2O2 - HBr solutions

    To plan complex motions of robots with many degrees of freedom, our novel, very flexible framework builds task-relevant roadmaps (TRMs), using a new sampling-based optimizer called Natural Gradient Inverse Kinematics (NGIK) based on natural evolution strategies (NES). To build TRMs, NGIK iteratively optimizes postures covering task-spaces expressed by arbitrary task-functions, subject to constraints expressed by arbitrary cost-functions, transparently dealing with both hard and soft constraints. TRMs are grown to maximally cover the task-space while minimizing costs. Unlike Jacobian-based methods, our algorithm does not rely on calculation of gradients, making application of the algorithm much simpler. We show how NGIK outperforms recent related sampling algorithms. A video demo (http://youtu.be/N6x2e1Zf_yg) successfully applies TRMs to an iCub humanoid robot with 41 DOF in its upper body, arms, hands, head, and eyes. To our knowledge, no similar methods exhibit such a degree of flexibility in defining movements.
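
The idea of solving inverse kinematics by sampling rather than by Jacobians can be illustrated with a small Python sketch. This is not the authors' NGIK implementation: it is a simplified NES-style update with a fixed sampling width, applied to a hypothetical planar three-joint arm. The link lengths, the joint-limit penalty, and all parameter values are assumptions made for the example.

```python
import math
import random

LINKS = [1.0, 1.0, 1.0]  # hypothetical link lengths of a planar 3-joint arm

def forward(q):
    """End-effector (x, y) of the planar arm for joint angles q."""
    x = y = angle = 0.0
    for length, joint in zip(LINKS, q):
        angle += joint
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y

def cost(q, target):
    """Task error plus a soft penalty for joint angles beyond +/-2 rad."""
    x, y = forward(q)
    task = math.hypot(x - target[0], y - target[1])
    limits = sum(max(0.0, abs(j) - 2.0) for j in q)
    return task + limits

def nes_ik(target, iters=300, pop=20, sigma=0.1, lr=0.5, seed=0):
    """Return the best sampled posture and its cost (no Jacobian is used)."""
    rng = random.Random(seed)
    mean = [0.1] * len(LINKS)
    best_q, best_c = list(mean), cost(mean, target)
    # Rank-based utilities from +1 (best) to -1 (worst); they sum to zero.
    utils = [(pop - 1 - 2 * r) / (pop - 1) for r in range(pop)]
    for _ in range(iters):
        noise = [[rng.gauss(0.0, 1.0) for _ in mean] for _ in range(pop)]
        samples = [[m + sigma * e for m, e in zip(mean, eps)] for eps in noise]
        costs = [cost(s, target) for s in samples]
        order = sorted(range(pop), key=lambda k: costs[k])  # lowest cost first
        if costs[order[0]] < best_c:
            best_q, best_c = samples[order[0]], costs[order[0]]
        # Search-gradient step on the mean of the sampling distribution.
        for i in range(len(mean)):
            grad = sum(utils[r] * noise[k][i] for r, k in enumerate(order)) / pop
            mean[i] += lr * sigma * grad
    return best_q, best_c
```

Because only cost evaluations are needed, extra constraints can be added by extending `cost` with further penalty terms, which is the kind of flexibility the abstract contrasts with Jacobian-based methods.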

    Stroke lesion outcome prediction based on MRI imaging combined with clinical information

    In developed countries, the second leading cause of death is stroke, of which ischemic stroke is the most common type. The preferred diagnosis procedure involves the acquisition of multi-modal Magnetic Resonance Imaging. Besides detecting and locating the stroke lesion, Magnetic Resonance Imaging captures blood flow dynamics that guide the physician in evaluating the risks and benefits of the reperfusion procedure. However, the decision process is an intricate task due to the variability of lesion size, shape, and location, as well as the complexity of the underlying cerebral hemodynamic process. Therefore, an automatic method that predicts the stroke lesion outcome at a 3-month follow-up would provide important support for the physicians' decision process. In this work, we propose an automatic deep learning-based method for stroke lesion outcome prediction. Our main contribution resides in the combination of multi-modal Magnetic Resonance Imaging maps with non-imaging clinical meta-data: the thrombolysis in cerebral infarction scale, which categorizes the success of recanalization achieved through mechanical thrombectomy. In our proposal, this clinical information is considered at two levels. First, at a population level, by embedding the clinical information in a custom loss function used during training of our deep learning architecture. Second, at a patient level, through an extra input channel of the neural network used at testing time for a given patient case. By merging imaging with non-imaging clinical information, we aim to obtain a model aware of the principal and collateral blood flow dynamics for cases where there is no perfusion beyond the point of occlusion and for cases where the perfusion is complete after the occlusion point. AP was supported by a scholarship from the Fundacao para a Ciencia e Tecnologia (FCT), Portugal (scholarship number PD/BD/113968/2015). This work is supported by FCT with the reference project UID/EEA/04436/2013, by FEDER funds through the COMPETE 2020 Programa Operacional Competitividade e Internacionalizacao (POCI) with the reference project POCI-01-0145-FEDER-006941. We acknowledge support from the Swiss National Science Foundation - DACH320030L_163363.
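
The patient-level "extra input channel" described in this abstract can be pictured with a short Python sketch. It is only one plausible reading, not the authors' code: the numeric encoding of the thrombolysis in cerebral infarction (TICI) grades and the helper name `add_clinical_channel` are hypothetical.

```python
# Hypothetical numeric encoding of TICI grades (an assumption for illustration).
TICI_TO_VALUE = {"0": 0.0, "1": 0.25, "2a": 0.5, "2b": 0.75, "3": 1.0}

def add_clinical_channel(mri_maps, tici_grade):
    """Append a constant-valued channel carrying the encoded TICI grade.

    mri_maps: list of 2D maps (each a list of rows), all of the same size.
    Returns a new list with one extra map filled with the encoded grade.
    """
    value = TICI_TO_VALUE[tici_grade]
    rows, cols = len(mri_maps[0]), len(mri_maps[0][0])
    clinical = [[value] * cols for _ in range(rows)]
    return mri_maps + [clinical]
```

Broadcasting the scalar grade to a full map lets a convolutional network see the clinical value at every spatial location alongside the imaging channels.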

    Geometric deep learning

    The goal of these course notes is to describe the main mathematical ideas behind geometric deep learning and to provide implementation details for several applications in shape analysis and synthesis, computer vision and computer graphics. The text in the course materials is primarily based on previously published work. With these notes we gather and provide a clear picture of the key concepts and techniques that fall under the umbrella of geometric deep learning, and illustrate the applications they enable. We also aim to provide practical implementation details for the methods presented in these works, as well as suggest further readings and extensions of these ideas.

    How biological attention mechanisms improve task performance in a large-scale visual system model

    How does attentional modulation of neural activity enhance performance? Here we use a deep convolutional neural network as a large-scale model of the visual system to address this question. We implement the feature similarity gain model of attention, in which attentional modulation is applied according to neural stimulus tuning. Using a variety of visual tasks, we show that neural modulations of the kind and magnitude observed experimentally lead to performance changes of the kind and magnitude observed experimentally. We find that, at earlier layers, attention applied according to tuning does not successfully propagate through the network, and has a weaker impact on performance than attention applied according to values computed for optimally modulating higher areas. This raises the question of whether biological attention might be applied at least in part to optimize function rather than strictly according to tuning. We suggest a simple experiment to distinguish these alternatives.
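
As a rough illustration of tuning-based modulation as described in this abstract, the Python sketch below scales each unit's activity by a gain that depends on how strongly the unit is tuned to the attended category. The multiplicative form, the strength parameter `beta`, and the tuning values in [-1, 1] are assumptions for the example and may differ from the paper's exact formulation.

```python
def apply_feature_gain(activity, tuning, beta=0.5):
    """Scale each unit's activity by (1 + beta * tuning).

    activity: non-negative unit activations (e.g. after a ReLU).
    tuning:   per-unit tuning to the attended category, in [-1, 1] (assumed).
    beta:     attention strength (hypothetical parameter).
    Units tuned to the attended category are enhanced, units tuned away
    from it are suppressed, and untuned units are left unchanged.
    """
    return [max(0.0, a * (1.0 + beta * t)) for a, t in zip(activity, tuning)]
```

For example, `apply_feature_gain([1.0, 1.0, 1.0], [1.0, 0.0, -1.0])` raises the first unit, leaves the second untouched, and lowers the third.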

    Using Guided Autoencoders on Face Recognition

    No full text
    The supervisor and/or the author did not authorize public publication of the thesis.