733 research outputs found

    Finding any Waldo: zero-shot invariant and efficient visual search

    Searching for a target object in a cluttered scene constitutes a fundamental challenge in daily vision. Visual search must be selective enough to discriminate the target from distractors, invariant to changes in the appearance of the target, efficient to avoid exhaustive exploration of the image, and must generalize to locate novel target objects with zero-shot training. Previous work has focused on searching for perfect matches of a target after extensive category-specific training. Here we show for the first time that humans can efficiently and invariantly search for natural objects in complex scenes. To gain insight into the mechanisms that guide visual search, we propose a biologically inspired computational model that can locate targets without exhaustive sampling and generalize to novel objects. The model provides an approximation to the mechanisms integrating bottom-up and top-down signals during search in natural scenes. Comment: Number of figures: 6. Number of supplementary figures: 1.

    Perception de la géométrie de l'environnement pour la navigation autonome [Perception of Environment Geometry for Autonomous Navigation]

    The goal of mobile robotics research is to give robots the capability to accomplish missions in an environment that is not perfectly known. To accomplish its mission, the robot needs to execute a set of elementary actions (movement, manipulation of objects...) that require accurate localisation of the robot as well as the construction of a good geometric model of the environment. To do so, the robot must make the most of its own sensors, of external sensors, of information coming from other robots, and of existing models, for example from a Geographic Information System. The common information is the geometry of the environment. The first part of the manuscript covers the different methods for extracting geometric information. The second part presents the creation of a geometric model using a graph structure, along with a method for retrieving information from the graph so that the robot can localise itself in the environment.

    Event-Driven Technologies for Reactive Motion Planning: Neuromorphic Stereo Vision and Robot Path Planning and Their Application on Parallel Hardware

    Robotics is increasingly becoming a key factor in technological progress. Despite impressive advances in recent decades, mammalian brains still outperform even the most capable machines in vision and motion planning. Industrial robots are very fast and precise, but their planning algorithms are not capable enough for highly dynamic environments such as those required for human-robot collaboration (HRC). Without fast and adaptive motion planning, safe HRC cannot be guaranteed. Neuromorphic technologies, including visual sensors and hardware chips, operate asynchronously and thus process spatio-temporal information very efficiently. Event-based visual sensors in particular already outperform conventional synchronous cameras in many applications. Event-based methods therefore have great potential to enable faster and more energy-efficient motion control algorithms for HRC. This thesis presents an approach to flexible reactive motion control of a robot arm. Exteroception is achieved through event-based stereo vision, and path planning is implemented in a neural representation of the configuration space. The multi-view 3D reconstruction is evaluated through a qualitative analysis in simulation and transferred to a stereo system of event-based cameras. A demonstrator with an industrial robot is used to evaluate the reactive, collision-free online planning. It is also used for a comparative study against sampling-based planners. This is complemented by a benchmark of parallel hardware solutions, with robot path planning chosen as the test scenario. The results show that the proposed neural solutions are an effective way to realise robot control for dynamic scenarios. This work lays a foundation for neural solutions in adaptive manufacturing processes, including in collaboration with humans, without sacrificing speed or safety. It thereby paves the way for integrating brain-inspired hardware and algorithms into industrial robotics and HRC.

    Parallel Weighted Random Sampling

    Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, k items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near-linear speedups both for construction and queries.
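
For readers unfamiliar with alias tables, the following is a minimal sketch of the classic sequential construction (often attributed to Walker and Vose) and of constant-time sampling with replacement. It is not the parallel or improved algorithm from the paper; the function names `build_alias_table` and `sample_index` are hypothetical and chosen only for illustration.

```python
import random
from typing import List, Tuple


def build_alias_table(weights: List[float]) -> Tuple[List[float], List[int]]:
    """Build an alias table in O(n) time (classic sequential method)."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")

    # Scale the weights so that the average bucket holds exactly 1.0.
    scaled = [w * n / total for w in weights]
    prob = [1.0] * n          # probability of keeping the bucket's own item
    alias = list(range(n))    # item returned otherwise

    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]

    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        # The large item donates the mass needed to fill bucket s.
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)

    # Any leftover buckets differ from 1.0 only by rounding; they keep prob 1.0.
    return prob, alias


def sample_index(prob: List[float], alias: List[int], rng: random.Random) -> int:
    """Draw one item index in O(1): pick a bucket, then the item or its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

Calling `sample_index` repeatedly on the same table gives sampling with replacement; construction is the only step that touches all weights.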

    Spatially Coherent RANSAC for Multi-Model Fitting

    RANSAC [15, 38, 1] is a reliable method for fitting parametric models to sparse data with many outliers. Originally designed for extracting a single model, RANSAC also has variants for fitting multiple models when supported by data. Our main insight is that, in practice, inliers for each model are often spatially coherent, a property that all previous RANSAC-based methods ignore. Our new method fits an unspecified number of models to data by combining ideas of random sampling and spatial regularization. As in basic RANSAC, we randomly sample data points to generate a set of proposed models (labels). We formulate model selection and inlier classification as a single problem: labeling of triangulated data points. Geometric fit errors and spatial coherence are combined in one MRF-based energy. In contrast to basic RANSAC, inlier classification does not depend on a fixed threshold. Moreover, our optimization framework allows iterative re-estimation of models/inliers with a clear stopping criterion and convergence guarantees. We show that our new method, SCO-RANSAC, can significantly improve results on synthetic and real data supporting multiple linear, affine, and homographic models.
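
As a point of comparison, the sketch below shows the basic single-model RANSAC loop that the abstract contrasts against: sample two points, propose a line, and count inliers using a fixed distance threshold. It is not SCO-RANSAC and contains no spatial regularization or MRF energy; the names `line_through` and `ransac_line` and all parameters are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1


def line_through(p: Point, q: Point) -> Optional[Line]:
    """Return the normalized line through p and q, or None if they coincide."""
    a = q[1] - p[1]
    b = p[0] - q[0]
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * p[0] + b * p[1])


def ransac_line(
    points: List[Point], threshold: float, iterations: int, rng: random.Random
) -> Tuple[Optional[Line], List[Point]]:
    """Basic RANSAC: keep the proposed line with the most inliers."""
    best_line: Optional[Line] = None
    best_inliers: List[Point] = []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)  # minimal sample for a line
        line = line_through(p, q)
        if line is None:
            continue
        a, b, c = line
        # Fixed-threshold inlier test on point-to-line distance.
        inliers = [pt for pt in points if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers
```

The fixed `threshold` in the inlier test is exactly the dependency that the abstract says its energy-based formulation avoids.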

    Target Apps Selection: Towards a Unified Search Framework for Mobile Devices

    With the recent growth of conversational systems and intelligent assistants such as Apple Siri and Google Assistant, mobile devices are becoming even more pervasive in our lives. As a consequence, users increasingly engage with mobile apps and frequently search for information within them. However, users cannot search within their apps through their intelligent assistants. This requires a unified mobile search framework that identifies the target app(s) for the user's query, submits the query to the app(s), and presents the results to the user. In this paper, we take the first step towards developing unified mobile search. In more detail, we introduce and study the task of target apps selection, which has various potential real-world applications. To this end, we analyze attributes of search queries as well as user behaviors while searching with different mobile apps. The analyses are based on thousands of queries that we collected through crowdsourcing. Finally, we study the performance of state-of-the-art retrieval models for this task and propose two simple yet effective neural models that significantly outperform the baselines. Our neural approaches are based on learning high-dimensional representations for mobile apps. Our analyses and experiments suggest specific future directions in this research area. Comment: To appear at SIGIR 201

    Exploratory and predictive methods for multivariate time series data analysis in healthcare

    This thesis aligns with the growing adoption of artificial intelligence in healthcare. Through two real-world applications of recent machine learning approaches, our fundamental goal is to explain rigorously and intelligibly to domain experts how artificial intelligence uses clinical multivariate time series to provide visualizations and predictions related to populations of patients in an emergency condition. Our results demonstrate that the recent dimensionality reduction tool PHATE, combined with a clustering algorithm, outperforms other more established methods in projecting multivariate time series in two dimensions and thus helps the experts visualize sub-populations' trajectories. We also highlight the proficiency of traditional and conditional recurrent neural networks in the early prognosis of ill patients. Finally, we point to topological data analysis as a suitable solution to the common problems of irregular and incomplete data that we inevitably face in the second case study.

    Formation of Morphable 3D Model of Large Scale Natural Sites by Using Image Based Modeling and Rendering Techniques

    Our framework does not require a global 3D model of the environment to be assembled, a process that can be extremely cumbersome and error-prone for large-scale scenes: for example, the global registration of multiple local models can accumulate a large amount of error, and it also presumes a very accurate extraction of the underlying geometry. By contrast, our framework needs neither an accurate geometric reconstruction of the individual local 3D models nor a very precise registration between them to produce satisfactory results. This paper presents an application of LP-based MRF optimization techniques. It also addresses a different research topic: novel image-based modeling and rendering methods that can automatically reproduce faithful (i.e. photorealistic) digital copies of complex 3D virtual environments while allowing these environments to be explored virtually at interactive frame rates.