11,346 research outputs found
Characterizing Visual Localization and Mapping Datasets
Benchmarking mapping and motion estimation algorithms is established practice in robotics and computer vision. As the diversity of datasets increases, in terms of the trajectories, models, and scenes, it becomes a challenge to select datasets for a given benchmarking purpose. Inspired by the Wasserstein distance, this paper addresses this concern by developing novel metrics to evaluate trajectories and environments without relying on any SLAM or motion estimation algorithm. The metrics, which so far have been missing in the research community, can be applied to the plethora of datasets that exist. Additionally, to improve robotics SLAM benchmarking, the paper presents a new dataset for visual localization and mapping algorithms. A broad range of real-world trajectories is used in very high-quality scenes and a rendering framework to create a set of synthetic datasets with ground-truth trajectory and dense map which are representative of key SLAM applications such as virtual reality (VR), micro aerial vehicle (MAV) flight, and ground robotics.
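As a rough illustration of the distance that the abstract cites as inspiration, the sketch below computes the one-dimensional Wasserstein-1 distance between two equal-size empirical samples. It is not the paper's metric: the function name `wasserstein_1d` and the per-frame speed values are hypothetical, chosen only to show how two trajectories might be compared through a distribution of some scalar property.

```python
def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two equal-size empirical samples."""
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    # For equal-size samples, the optimal transport plan pairs sorted values,
    # so the distance is the mean absolute difference of the sorted samples.
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Hypothetical per-frame speeds (m/s) of two trajectories.
slow = [0.1, 0.2, 0.2, 0.3]
fast = [1.1, 1.2, 1.2, 1.3]

distance = wasserstein_1d(slow, fast)
print(distance)
```

A value near zero would indicate that the two trajectories have similar speed distributions; the ordering of the samples does not matter.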
Seizure-onset mapping based on time-variant multivariate functional connectivity analysis of high-dimensional intracranial EEG : a Kalman filter approach
The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (< 60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
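The outdegree mentioned above can be illustrated with a minimal sketch. It assumes a directed connectivity matrix has already been estimated; the matrix values, the threshold, and the convention that `conn[src][dst]` holds the influence of channel `src` on channel `dst` are all hypothetical and not taken from the study.

```python
def outdegree(conn, threshold):
    """Count, for each channel, its outgoing connections above a threshold.

    conn[src][dst] is the assumed strength of the directed influence of
    channel src on channel dst. Self-connections are ignored.
    """
    n = len(conn)
    return [
        sum(1 for dst in range(n) if dst != src and conn[src][dst] > threshold)
        for src in range(n)
    ]


# Hypothetical 3-channel connectivity estimates.
conn = [
    [0.00, 0.80, 0.70],
    [0.10, 0.00, 0.20],
    [0.05, 0.60, 0.00],
]

degrees = outdegree(conn, threshold=0.5)
# The channel with the largest outdegree is the strongest outward driver.
driver = max(range(len(degrees)), key=degrees.__getitem__)
print(degrees, driver)
```

In this toy case channel 0 drives both other channels, so it would be the candidate onset channel under this simple thresholded count.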
Bimodal network architectures for automatic generation of image annotation from text
Medical image analysis practitioners have embraced big data methodologies.
This has created a need for large annotated datasets. The source of big data is
typically large image collections and clinical reports recorded for these
images. In many cases, however, building algorithms aimed at segmentation and
detection of disease requires a training dataset with markings of the areas of
interest on the image that match with the described anomalies. This process of
annotation is expensive and needs the involvement of clinicians. In this work
we propose two separate deep neural network architectures for automatic marking
of a region of interest (ROI) on the image best representing a finding
location, given a textual report or a set of keywords. One architecture
consists of LSTM and CNN components and is trained end to end with images,
matching text, and markings of ROIs for those images. The output layer
estimates the coordinates of the vertices of a polygonal region. The second
architecture uses a network pre-trained on a large dataset of the same image
types for learning feature representations of the findings of interest. We show
that for a variety of findings from chest X-ray images, both proposed
architectures learn to estimate the ROI, as validated by clinical annotations.
There is a clear advantage obtained from the architecture with pre-trained
imaging network. The centroids of the ROIs marked by this network were on
average at a distance equivalent to 5.1% of the image width from the centroids
of the ground truth ROIs.
Comment: Accepted to MICCAI 2018, LNCS 1107
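The centroid-distance figure reported in this abstract can be made concrete with a short sketch. It assumes ROIs are simple polygons given as vertex lists and uses the standard shoelace formula for the area centroid; the function names, the example polygons, and the image width are hypothetical, and the paper may compute centroids differently.

```python
import math


def polygon_centroid(vertices):
    """Area centroid of a simple polygon [(x, y), ...] via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    return cx / (3 * area2), cy / (3 * area2)


def centroid_error_percent(pred, truth, image_width):
    """Distance between ROI centroids as a percentage of the image width."""
    (px, py), (tx, ty) = polygon_centroid(pred), polygon_centroid(truth)
    return 100.0 * math.hypot(px - tx, py - ty) / image_width


# Hypothetical predicted and ground-truth ROIs in pixel coordinates.
pred = [(100, 100), (200, 100), (200, 200), (100, 200)]
truth = [(130, 140), (230, 140), (230, 240), (130, 240)]

error = centroid_error_percent(pred, truth, image_width=1000)
print(error)
```

Normalizing by image width makes the error comparable across images of different resolutions.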
Large-Scale Mapping of Human Activity using Geo-Tagged Videos
This paper is the first work to perform spatio-temporal mapping of human
activity using the visual content of geo-tagged videos. We utilize a recent
deep-learning based video analysis framework, termed hidden two-stream
networks, to recognize a range of activities in YouTube videos. This framework
is efficient and can run in real time or faster which is important for
recognizing events as they occur in streaming video or for reducing latency in
analyzing already captured video. This is, in turn, important for using video
in smart-city applications. We perform a series of experiments to show our
approach is able to accurately map activities both spatially and temporally. We
also demonstrate the advantages of using the visual content over the
tags/titles.
Comment: Accepted at ACM SIGSPATIAL 201
Compressive Embedding and Visualization using Graphs
Visualizing high-dimensional data has been a focus in data analysis
communities for decades, which has led to the design of many algorithms, some
of which are now considered references (t-SNE, for example). In our era
of overwhelming data volumes, the scalability of such methods has become
increasingly important. In this work, we present a method that allows any
visualization or embedding algorithm to be applied to very large datasets by considering only
a fraction of the data as input and then extending the information to all data
points using a graph encoding its global similarity. We show that in most
cases, using only samples is sufficient to diffuse the
information to all data points. In addition, we propose quantitative
methods to measure the quality of embeddings and demonstrate the validity of
our technique on both synthetic and real-world datasets.
Encoderless Gimbal Calibration of Dynamic Multi-Camera Clusters
Dynamic Camera Clusters (DCCs) are multi-camera systems where one or more
cameras are mounted on actuated mechanisms such as a gimbal. Existing methods
for DCC calibration rely on joint angle measurements to resolve the
time-varying transformation between the dynamic and static camera. This
information is usually provided by motor encoders; however, joint angle
measurements are not always readily available on off-the-shelf mechanisms. In
this paper, we present an encoderless approach for DCC calibration which
simultaneously estimates the kinematic parameters of the transformation chain
as well as the unknown joint angles. We also demonstrate the integration of an
encoderless gimbal mechanism with a state-of-the-art VIO algorithm, and show
the extensions required in order to perform simultaneous online estimation of
the joint angles and vehicle localization state. The proposed calibration
approach is validated both in simulation and on a physical DCC composed of a
2-DOF gimbal mounted on a UAV. Finally, we show the experimental results of the
calibrated mechanism integrated into the OKVIS VIO package, and demonstrate
successful online joint angle estimation while maintaining localization
accuracy that is comparable to a standard static multi-camera configuration.
Comment: ICRA 201
Magnetic-Visual Sensor Fusion-based Dense 3D Reconstruction and Localization for Endoscopic Capsule Robots
Reliable and real-time 3D reconstruction and localization functionality is a
crucial prerequisite for the navigation of actively controlled capsule
endoscopic robots as an emerging, minimally invasive diagnostic and therapeutic
technology for use in the gastrointestinal (GI) tract. In this study, we
propose a fully dense, non-rigidly deformable, strictly real-time,
intraoperative map fusion approach for actively controlled endoscopic capsule
robot applications which combines magnetic and vision-based localization with
non-rigid-deformation-based frame-to-model map fusion. The performance of the
proposed method is demonstrated using four different ex-vivo porcine stomach
models. Across different trajectories of varying speed and complexity, and four
different endoscopic cameras, the root mean square surface reconstruction
errors range from 1.58 to 2.17 cm.
Comment: submitted to IROS 201
ERP correlates of word production before and after stroke in an aphasic patient
No abstract available