Advanced Mobile Robotics: Volume 3
Mobile robotics is a challenging field with great potential. It spans disciplines including electrical engineering, mechanical engineering, computer science, cognitive science, and social science, and it is essential to the design of automated robots in combination with artificial intelligence, vision, and sensor technologies. Mobile robots are widely used for surveillance, guidance, transportation, and entertainment tasks, as well as medical applications. This Special Issue concentrates on recent developments concerning mobile robots and the research surrounding them, with the aim of advancing work on the fundamental problems these robots face. Multidisciplinary and integrative contributions, including navigation, learning and adaptation, networked systems, biologically inspired robots, and cognitive methods, are welcome, from both research and application perspectives.
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images
Real-time magnetic resonance imaging (RT-MRI) of human speech production is
enabling significant advances in speech science, linguistics, bio-inspired
speech technology development, and clinical applications. Easy access to RT-MRI
is however limited, and comprehensive datasets with broad access are needed to
catalyze research across numerous domains. The imaging of the rapidly moving
articulators and dynamic airway shaping during speech demands high
spatio-temporal resolution and robust reconstruction methods. Further, while
reconstructed images have been published, to date there is no open dataset
providing raw multi-coil RT-MRI data from an optimized speech production
experimental setup. Such datasets could enable new and improved methods for
dynamic image reconstruction, artifact correction, feature extraction, and
direct extraction of linguistically-relevant biomarkers. The present dataset
offers a unique corpus of 2D sagittal-view RT-MRI videos along with
synchronized audio for 75 subjects performing linguistically motivated speech
tasks, alongside the corresponding first-ever public domain raw RT-MRI data.
The dataset also includes 3D volumetric vocal tract MRI during sustained speech
sounds and high-resolution static anatomical T2-weighted upper airway MRI for
each subject.
Comment: 27 pages, 6 figures, 5 tables, submitted to Nature Scientific Data
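Raw multi-coil k-space such as this dataset releases supports even very simple reconstruction baselines. As an illustration only (not the dataset's own pipeline, and assuming Cartesian sampling for simplicity, whereas real-time speech MRI is typically acquired with non-Cartesian trajectories that require gridding first), a root-sum-of-squares coil combination can be sketched as:

```python
import numpy as np

def rss_reconstruct(kspace):
    """Baseline reconstruction from raw multi-coil Cartesian k-space
    (shape: coils x ny x nx): per-coil inverse 2D FFT followed by
    root-sum-of-squares (RSS) coil combination."""
    coil_images = np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(kspace, axes=(-2, -1)), axes=(-2, -1)),
        axes=(-2, -1),
    )
    # RSS discards coil phase; it is the standard magnitude-only baseline.
    return np.sqrt((np.abs(coil_images) ** 2).sum(axis=0))
```

More advanced methods (parallel imaging, compressed sensing, learned reconstruction) replace this baseline but consume the same raw multi-coil input.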
Diver Interest via Pointing in Three Dimensions: 3D Pointing Reconstruction for Diver-AUV Communication
This paper presents Diver Interest via Pointing in Three Dimensions (DIP-3D),
a method to relay an object of interest from a diver to an autonomous
underwater vehicle (AUV) by pointing that includes three-dimensional distance
information to discriminate between multiple objects in the AUV's camera image.
Traditional dense stereo vision for distance estimation underwater is
challenging because of the relative lack of saliency of scene features and
degraded lighting conditions. Yet, including distance information is necessary
for robotic perception of diver pointing when multiple objects appear within
the robot's image plane. We subvert the challenges of underwater distance
estimation by using sparse reconstruction of keypoints to perform pose
estimation on both the left and right images from the robot's stereo camera.
Triangulated pose keypoints, along with a classical object detection method,
enable DIP-3D to infer the location of an object of interest when multiple
objects are in the AUV's field of view. By allowing the scuba diver to point at
an arbitrary object of interest and enabling the AUV to autonomously decide
which object the diver is pointing to, this method will permit more natural
interaction between AUVs and human scuba divers in underwater human-robot
collaborative tasks.
Comment: Under review, International Conference on Robotics and Automation 202
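The sparse-reconstruction step the abstract describes rests on triangulating matched keypoints from the stereo pair. A minimal linear (DLT) triangulation of one keypoint can be sketched as follows; the camera matrices, baseline, and pixel coordinates are hypothetical stand-ins, not values from the paper:

```python
import numpy as np

def triangulate_point(P_left, P_right, xl, xr):
    """Linear (DLT) triangulation of one matched keypoint.

    P_left, P_right: 3x4 projection matrices of the rectified stereo pair.
    xl, xr: (u, v) pixel coordinates of the same keypoint (e.g. a detected
    wrist or fingertip) in the left and right images.
    """
    A = np.array([
        xl[0] * P_left[2] - P_left[0],
        xl[1] * P_left[2] - P_left[1],
        xr[0] * P_right[2] - P_right[0],
        xr[1] * P_right[2] - P_right[1],
    ])
    # The homogeneous 3D point is the null vector of A, found via SVD.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # homogeneous -> Euclidean

# Hypothetical rectified pair: identical intrinsics, right camera
# translated 0.1 m along +x (the stereo baseline).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P_l = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_r = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

# A point 2 m in front of the cameras, projected into each image.
X_true = np.array([0.2, 0.1, 2.0, 1.0])
xl = P_l @ X_true; xl = xl[:2] / xl[2]
xr = P_r @ X_true; xr = xr[:2] / xr[2]
print(triangulate_point(P_l, P_r, xl, xr))  # ~ [0.2, 0.1, 2.0]
```

Repeating this over a sparse set of body keypoints yields the 3D pointing vector used to select among candidate objects.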
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
Recent progress in fine-grained gesture and action classification, and
machine translation, point to the possibility of automated sign language
recognition becoming a reality. A key stumbling block in making progress
towards this goal is a lack of appropriate training data, stemming from the
high complexity of sign annotation and a limited supply of qualified
annotators. In this work, we introduce a new scalable approach to data
collection for sign recognition in continuous videos. We make use of
weakly-aligned subtitles for broadcast footage together with a keyword spotting
method to automatically localise sign-instances for a vocabulary of 1,000 signs
in 1,000 hours of video. We make the following contributions: (1) We show how
to use mouthing cues from signers to obtain high-quality annotations from video
data - the result is the BSL-1K dataset, a collection of British Sign Language
(BSL) signs of unprecedented scale; (2) We show that we can use BSL-1K to train
strong sign recognition models for co-articulated signs in BSL and that these
models additionally form excellent pretraining for other sign languages and
benchmarks - we exceed the state of the art on both the MSASL and WLASL
benchmarks. Finally, (3) we propose new large-scale evaluation sets for the
tasks of sign recognition and sign spotting and provide baselines which we hope
will serve to stimulate research in this area.
Comment: Appears in: European Conference on Computer Vision 2020 (ECCV 2020). 28 pages
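The localisation strategy the abstract sketches, combining weakly aligned subtitles with a keyword spotter, can be illustrated at a toy scale. The spotter itself (a mouthing-cue model in the paper) is assumed here as a given array of per-frame scores, and the padding, frame rate, and threshold values are illustrative, not the paper's:

```python
import numpy as np

def localise_sign(frame_probs, subtitle_span, fps=25, threshold=0.5, pad_s=4.0):
    """Turn a weakly aligned subtitle into a frame-level sign annotation.

    frame_probs: per-frame probability that the keyword is being signed
    (output of a keyword-spotting model, assumed given here).
    subtitle_span: (start_s, end_s) of the subtitle containing the keyword,
    padded because broadcast subtitles are only loosely synchronised.
    Returns the peak frame index, or None if the spotter never fires
    above `threshold` inside the padded window.
    """
    start = max(0, int((subtitle_span[0] - pad_s) * fps))
    end = min(len(frame_probs), int((subtitle_span[1] + pad_s) * fps))
    window = frame_probs[start:end]
    if window.size == 0 or window.max() < threshold:
        return None  # keyword not confidently spotted -> discard example
    return start + int(window.argmax())

probs = np.zeros(500)
probs[120] = 0.9  # spotter fires 4.8 s into the clip (at 25 fps)
print(localise_sign(probs, (3.0, 5.0)))  # → 120
```

Applied over 1,000 keywords and 1,000 hours of footage, this kind of automatic localisation is what makes the annotation scale tractable without manual sign-level labelling.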
The UNC/UMN Baby Connectome Project (BCP): An overview of the study design and protocol development
The human brain undergoes extensive and dynamic growth during the first years of life. The UNC/UMN Baby Connectome Project (BCP), one of the Lifespan Connectome Projects funded by NIH, is an ongoing study jointly conducted by investigators at the University of North Carolina at Chapel Hill and the University of Minnesota. The primary objective of the BCP is to characterize brain and behavioral development in typically developing infants across the first 5 years of life. The ultimate goals are to chart emerging patterns of structural and functional connectivity during this period, map brain-behavior associations, and establish a foundation from which to further explore trajectories of health and disease. To accomplish these goals, we are combining state-of-the-art MRI acquisition and analysis techniques, including high-resolution structural MRI (T1- and T2-weighted images), diffusion imaging (dMRI), and resting-state functional connectivity MRI (rfMRI). While the overall design of the BCP is largely built on the protocol developed by the Lifespan Human Connectome Project (HCP), given the unique age range of the BCP cohort, additional optimization of imaging parameters and consideration of an age-appropriate battery of behavioral assessments were needed. Here we provide the overall study protocol, including approaches for subject recruitment, strategies for imaging typically developing children 0–5 years of age without sedation, imaging protocol and optimization, a description of the battery of behavioral assessments, and QA/QC procedures. Combining HCP-inspired neuroimaging data with well-established behavioral assessments during this time period will yield an invaluable resource for the scientific community.
Functional Magnetic Resonance Imaging
"Functional Magnetic Resonance Imaging - Advanced Neuroimaging Applications" is a concise book on applied methods of fMRI used in assessment of cognitive functions in brain and neuropsychological evaluation using motor-sensory activities, language, orthographic disabilities in children. The book will serve the purpose of applied neuropsychological evaluation methods in neuropsychological research projects, as well as relatively experienced psychologists and neuroscientists. Chapters are arranged in the order of basic concepts of fMRI and physiological basis of fMRI after event-related stimulus in first two chapters followed by new concepts of fMRI applied in constraint-induced movement therapy; reliability analysis; refractory SMA epilepsy; consciousness states; rule-guided behavioral analysis; orthographic frequency neighbor analysis for phonological activation; and quantitative multimodal spectroscopic fMRI to evaluate different neuropsychological states
Playing Charades in the fMRI: Are Mirror and/or Mentalizing Areas Involved in Gestural Communication?
Communication is an important aspect of human life, allowing us to powerfully coordinate our behaviour with that of others. Boiled down to its mere essentials, communication entails transferring a mental content from one brain to another. Spoken language obviously plays an important role in communication between human individuals. Manual gestures, however, often aid the semantic interpretation of the spoken message, and gestures may have played a central role in the earlier evolution of communication. Here we used the social game of charades to investigate the neural basis of gestural communication by having participants produce and interpret meaningful gestures while their brain activity was measured using functional magnetic resonance imaging. While participants decoded observed gestures, the putative mirror neuron system (pMNS: premotor, parietal and posterior mid-temporal cortex), associated with motor simulation, and the temporo-parietal junction (TPJ), associated with mentalizing and agency attribution, were significantly recruited. Of these areas only the pMNS was recruited during the production of gestures. This suggests that gestural communication relies on a combination of simulation and, during decoding, mentalizing/agency-attribution brain areas. Comparing the decoding of gestures with a condition in which participants viewed the same gestures with an instruction not to interpret them showed that although parts of the pMNS responded more strongly during active decoding, most of the pMNS and the TPJ did not show such significant task effects. This suggests that the mere observation of gestures recruits most of the system involved in voluntary interpretation.
Novel Hybrid-Learning Algorithms for Improved Millimeter-Wave Imaging Systems
Increasing attention is being paid to millimeter-wave (mmWave), 30 GHz to 300
GHz, and terahertz (THz), 300 GHz to 10 THz, sensing applications including
security sensing, industrial packaging, medical imaging, and non-destructive
testing. Traditional methods for perception and imaging are challenged by novel
data-driven algorithms that offer improved resolution, localization, and
detection rates. Over the past decade, deep learning technology has garnered
substantial popularity, particularly in perception and computer vision
applications. Whereas conventional signal processing techniques are more easily
generalized to various applications, hybrid approaches where signal processing
and learning-based algorithms are interleaved pose a promising compromise
between performance and generalizability. Furthermore, such hybrid algorithms
improve model training by leveraging the known characteristics of radio
frequency (RF) waveforms, thus yielding more efficiently trained deep learning
algorithms and offering higher performance than conventional methods. This
dissertation introduces novel hybrid-learning algorithms for improved mmWave
imaging systems applicable to a host of problems in perception and sensing.
Various problem spaces are explored, including static and dynamic gesture
classification; precise hand localization for human-computer interaction;
high-resolution near-field mmWave imaging using forward synthetic aperture
radar (SAR); SAR under irregular scanning geometries; mmWave image
super-resolution using deep neural network (DNN) and Vision Transformer (ViT)
architectures; and data-level multiband radar fusion using a novel
hybrid-learning architecture. Furthermore, we introduce several novel
approaches for deep learning model training and dataset synthesis.
Comment: PhD dissertation submitted to the UTD ECE Department
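The interleaving of signal processing and learned stages that the abstract advocates can be sketched at a toy scale: a physics-based FFT front end encodes the known FMCW waveform structure, so a data-driven stage only has to refine the result. The parameters and the "refiner" below are illustrative stand-ins (a simple threshold, not a trained DNN from the dissertation):

```python
import numpy as np

def range_profile(beat_signal):
    """Classical front end: windowed FFT of an FMCW beat signal yields a
    range profile (pure signal processing, no learning)."""
    n = len(beat_signal)
    return np.abs(np.fft.fft(beat_signal * np.hanning(n)))[: n // 2]

def hybrid_pipeline(beat_signal, learned_refiner):
    """Hybrid structure: the FFT bakes in RF waveform physics; the learned
    stage corrects residual artifacts in the resulting profile."""
    return learned_refiner(range_profile(beat_signal))

# Synthetic beat signal: one target appears as a complex sinusoid whose
# frequency maps to range (toy parameters, not a real radar configuration).
n, target_bin = 256, 40
t = np.arange(n)
beat = np.exp(2j * np.pi * target_bin * t / n)

# Stand-in for a trained refiner: suppress everything below 10% of peak.
refiner = lambda p: np.where(p > 0.1 * p.max(), p, 0.0)

profile = hybrid_pipeline(beat, refiner)
print(int(profile.argmax()))  # peak at the target's range bin (40)
```

Because the front end already handles the known physics, the learned component sees a far simpler problem, which is the efficiency argument the abstract makes for hybrid-learning designs.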