Gesture-aware Interactive Machine Teaching with In-situ Object Annotations
Interactive Machine Teaching (IMT) systems allow non-experts to easily create
Machine Learning (ML) models. However, existing vision-based IMT systems either
ignore annotations on the objects of interest or require users to annotate in a
post-hoc manner. Without the annotations on objects, the model may misinterpret
the objects using unrelated features. Post-hoc annotations cause additional
workload, which diminishes the usability of the overall model building process.
In this paper, we develop LookHere, which integrates in-situ object annotations
into vision-based IMT. LookHere exploits users' deictic gestures to segment the
objects of interest in real time. This segmentation information can be
additionally used for training. To achieve reliable performance for this
object segmentation, we use our custom dataset, HuTics, which includes 2,040
front-facing images of deictic gestures toward various objects from 170
people. The quantitative results of our user study showed that participants
were 16.3 times faster in creating a model with our system compared to a
standard IMT system with a post-hoc annotation process while demonstrating
comparable accuracies. Additionally, models created by our system showed a
significant improvement in accuracy in segmenting the
objects of interest compared to those without annotations.
Comment: UIST 202
Weed Identification by Single-Stage and Two-Stage Neural Networks: A Study on the Impact of Image Resizers and Weights Optimization Algorithms.
The accurate identification of weeds is an essential step for a site-specific weed management system. In recent years, deep learning (DL) has advanced rapidly in performing complex agricultural tasks. Previous studies emphasized evaluating advanced training techniques or modifying well-known DL models to improve overall accuracy. In contrast, this research attempted to improve the mean average precision (mAP) for the detection and classification of eight classes of weeds by proposing a novel DL-based methodology. First, a comprehensive analysis of single-stage and two-stage neural networks, including the Single-Shot MultiBox Detector (SSD), You Only Look Once (YOLO-v4), EfficientDet, CenterNet, RetinaNet, the Faster Region-based Convolutional Neural Network (Faster RCNN), and the Region-based Fully Convolutional Network (RFCN), was performed. Next, the effects of image resizing techniques along with four image interpolation methods were studied. This led to the final stage of the research: optimizing the weights of the best-acquired model through initialization techniques, batch normalization, and DL optimization algorithms. The effectiveness of the proposed work is demonstrated by a high mAP of 93.44%, validated by the stratified k-fold cross-validation technique, a 5.8% improvement over the results obtained with the default settings of the best-suited DL architecture (Faster RCNN ResNet-101). The presented pipeline can serve as a baseline for the research community to explore tasks such as real-time detection and reducing computation/training time. All relevant data, including the annotated dataset, configuration files, and inference graph of the final model, are provided with this article. Furthermore, the selection of the DeepWeeds dataset shows the robustness/practicality of the study because it contains images collected in a real, complex agricultural environment.
Therefore, this research would be a considerable step toward an efficient and automatic weed control system.
Published onlin
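The stratified k-fold validation step mentioned above can be sketched as follows. This is a minimal illustration, not the paper's released configuration; the class names are placeholders in the style of DeepWeeds labels.

```python
# Minimal sketch of stratified k-fold splitting: each fold preserves
# the per-class proportions of the full dataset.
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Split sample indices into k folds, preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for label, indices in by_class.items():
        rng.shuffle(indices)
        for i, idx in enumerate(indices):
            folds[i % k].append(idx)  # round-robin keeps classes balanced
    return folds

# Illustrative labels only (placeholder weed-class names).
labels = ["chinee_apple"] * 10 + ["lantana"] * 5 + ["parkinsonia"] * 5
folds = stratified_kfold(labels, k=5)
for fold in folds:
    counts = defaultdict(int)
    for idx in fold:
        counts[labels[idx]] += 1
    print(dict(counts))  # each fold: 2 chinee_apple, 1 lantana, 1 parkinsonia
```

In a detection setting the same split would be applied to image identifiers, with each image's dominant weed class as its stratification label.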
Workload-Aware Scheduling using Markov Decision Process for Infrastructure-Assisted Learning-Based Multi-UAV Surveillance Networks
In modern networking research, infrastructure-assisted unmanned aerial
vehicles (UAVs) are actively considered for real-time learning-based
surveillance and aerial data-delivery under unexpected 3D free mobility and
coordination. In this system model, it is essential to consider both the power
limitation of UAVs and the deep learning performance of autonomous object
recognition (for abnormal behavior detection) in the infrastructure/towers. To overcome the
power limitation of UAVs, this paper proposes a novel aerial scheduling
algorithm between multi-UAVs and multi-towers where the towers conduct wireless
power transfer toward UAVs. In addition, to support high-performance training
of the learning models in the towers, we also propose a data delivery scheme in
which UAVs deliver training data to the towers fairly, preventing problems
due to data imbalance (e.g., heavy computation overhead caused by excessive
data delivery, or overfitting from too little). Therefore, this paper
proposes a novel workload-aware scheduling algorithm between multi-towers and
multi-UAVs for joint power-charging from towers to their associated UAVs and
training data delivery from UAVs to their associated towers. To compute the
workload-aware optimal scheduling decisions in each unit time, our solution
approach for the given scheduling problem is designed based on a Markov
decision process (MDP) to deal with (i) time-varying low-complexity computation and (ii)
pseudo-polynomial optimality. As shown in performance evaluation results, our
proposed algorithm ensures (i) sufficient times for resource exchanges between
towers and UAVs, (ii) the most even and uniform data collection during the
processes compared to the other algorithms, and (iii) the convergence of all
towers' performance to optimal levels.
Comment: 15 pages, 10 figure
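The MDP-based scheduling decision can be illustrated with a toy value-iteration sketch. The states, actions, transition probabilities, and rewards below are illustrative stand-ins, not the paper's actual formulation.

```python
# Value iteration on a toy MDP, sketching how per-unit-time scheduling
# decisions could be computed for a single UAV-tower pair.
def value_iteration(states, actions, P, R, gamma=0.9, eps=1e-6):
    """P[s][a] -> list of (prob, next_state); R[s][a] -> reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# Toy example: a UAV battery is "low" or "high"; it can "charge" at a
# tower (wireless power transfer) or "deliver" training data to it.
states = ["low", "high"]
actions = ["charge", "deliver"]
P = {"low":  {"charge": [(1.0, "high")], "deliver": [(1.0, "low")]},
     "high": {"charge": [(1.0, "high")], "deliver": [(0.5, "low"), (0.5, "high")]}}
R = {"low":  {"charge": 0.0, "deliver": -1.0},
     "high": {"charge": 0.0, "deliver": 2.0}}
V = value_iteration(states, actions, P, R)
print(V)  # delivering is only worthwhile when the battery is high
```

The resulting optimal policy charges when the battery is low and delivers data when it is high, mirroring the joint power-charging/data-delivery trade-off the abstract describes at a much smaller scale.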
Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolution network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of controller and explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment, strong- and weak-alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match training data. 
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts the way it attends to the scene (visual attention) and the control (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls, accordingly
Exploring Real-Time Bio-Behaviorally-Aware Feedback Interventions for Mitigating Public Speaking Anxiety
Effective public speaking skills are crucial to one’s academic and professional success. Individuals who are good at public speaking are more likely to graduate from college and obtain leadership positions than their peers. Yet public speaking anxiety (PSA) is one of the most common social fears, directly affecting one’s academic and professional success.
This Master’s thesis investigates the effectiveness of in-the-moment, bio-behaviorally aware feedback in mitigating public speaking anxiety in a virtual training environment. The training environment exposes participants to various virtual stimuli while capturing their audio and physiological signals. These signals are used to extract bio-behavioral measures (e.g., speech intonation, mean electrodermal activity) that serve as input to a machine learning model providing real-time estimates of state anxiety. Based on these estimates, the system provides real-time positive-reinforcement and cognitive-restructuring feedback, grounded in theoretical rationale from the behavioral sciences, when an increase in state anxiety is detected. The system is evaluated through a small-scale study of participants using a pre/post evaluation design.
Results indicate that in-the-moment feedback prompts affect participants’ in-the-moment state-based anxiety. Statistical analysis indicates significant differences in bio-behavioral measures before and after the feedback prompts. Self-reported post-study results also indicate that 5 out of 7 participants found the study beneficial for their public speaking skills. The results further highlight the effect of audience type on the positive reinforcement feedback: when the audience was negative, the real-time model issued more positive-reinforcement prompts than when the audience was positive. Findings from this work provide a foundation for designing artificial intelligence systems that deliver personalized in-the-moment interventions to mitigate adverse behavioral outcomes
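The real-time feedback loop described above can be sketched minimally: bio-behavioral features feed a state-anxiety estimator, and a feedback prompt fires when a sharp rise is detected. The feature names, estimator weights, and threshold are illustrative assumptions, not the thesis's actual model.

```python
# Sketch of the in-the-moment feedback loop: estimate state anxiety per
# time step and emit a prompt when it rises past a threshold.
def feedback_loop(feature_stream, estimate_anxiety, rise_threshold=0.15):
    """Yield a feedback prompt whenever estimated anxiety rises sharply."""
    prev = None
    for features in feature_stream:
        score = estimate_anxiety(features)
        if prev is not None and score - prev > rise_threshold:
            yield "positive_reinforcement"  # or a cognitive-restructuring prompt
        prev = score

# Toy estimator: weighted sum of EDA mean and pitch variability.
def toy_estimator(f):
    return 0.6 * f["eda_mean"] + 0.4 * f["pitch_var"]

stream = [{"eda_mean": 0.2, "pitch_var": 0.2},
          {"eda_mean": 0.7, "pitch_var": 0.6},   # sharp rise -> prompt fires
          {"eda_mean": 0.7, "pitch_var": 0.6}]   # plateau -> no prompt
print(list(feedback_loop(stream, toy_estimator)))  # → ['positive_reinforcement']
```

In the actual system the estimator would be the trained anxiety model and the prompt choice would follow the behavioral-science rationale described above; the loop structure is the point of this sketch.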
Cortical Learning of Recognition Categories: A Resolution of the Exemplar Vs. Prototype Debate
Do humans and animals learn exemplars or prototypes when they categorize objects and events in the world? How are different degrees of abstraction realized through learning by neurons in inferotemporal and prefrontal cortex? How do top-down expectations influence the course of learning? Thirty related human cognitive experiments (the 5-4 category structure) have been used to test competing views in the prototype-exemplar debate. In these experiments, during the test phase, subjects unlearn in a characteristic way items that they had learned to categorize perfectly in the training phase. Many cognitive models do not describe how an individual learns or forgets such categories through time. Adaptive Resonance Theory (ART) neural models provide such a description, and also clarify both psychological and neurobiological data. Matching of bottom-up signals with learned top-down expectations plays a key role in ART model learning. Here, an ART model is used to learn incrementally in response to 5-4 category structure stimuli. Simulation results agree with experimental data, achieving perfect categorization in training and a good match to the pattern of errors exhibited by human subjects in the testing phase. These results show how the model learns both prototypes and certain exemplars in the training phase. ART prototypes are, however, unlike the ones posited in the traditional prototype-exemplar debate. Rather, they are critical patterns of features to which a subject learns to pay attention based on past predictive success and the order in which exemplars are experienced. Perturbations of old memories by newly arriving test items generate a performance curve that closely matches the performance pattern of human subjects. 
The model also clarifies exemplar-based accounts of data concerning amnesia.
Defense Advanced Research Projects Agency SyNaPSE program (Hewlett-Packard Company, DARPA HR0011-09-3-0001; HRL Laboratories LLC #801881-BS under HR0011-09-C-0011); Science of Learning Centers program of the National Science Foundation (NSF SBE-0354378
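The ART matching cycle described above can be sketched in a minimal fuzzy-ART style: a bottom-up input resonates with a learned top-down prototype only if their match exceeds a vigilance threshold; otherwise the search continues and a new category is created on mismatch. The vigilance value and the omission of complement coding are simplifications, not the paper's exact model.

```python
# Minimal fuzzy-ART-style sketch: matching bottom-up input against
# learned top-down prototypes, with vigilance-gated resonance.
def art_learn(inputs, vigilance=0.75):
    """Incrementally learn category prototypes from a stream of inputs."""
    prototypes = []
    for x in inputs:
        for w in prototypes:
            overlap = [min(a, b) for a, b in zip(x, w)]
            if sum(overlap) / sum(x) >= vigilance:   # resonance: match passes
                w[:] = overlap                        # learn critical features
                break
        else:
            prototypes.append(list(x))                # mismatch: new category
    return prototypes

protos = art_learn([[1, 1, 0, 0], [1, 0.9, 0, 0], [0, 0, 1, 1]])
print(len(protos))  # → 2: the first two inputs share a category
```

The prototype shrinks toward the features shared by its members, echoing the abstract's point that ART prototypes are critical feature patterns shaped by the order in which exemplars are experienced.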
Mean-Field Theory of Meta-Learning
We discuss here the mean-field theory for a cellular automata model of
meta-learning. The meta-learning is the process of combining outcomes of
individual learning procedures in order to determine the final decision with
higher accuracy than any single learning method. Our method is constructed from
an ensemble of interacting learning agents that acquire and process incoming
information using various types, or different versions, of machine learning
algorithms. The abstract learning space, where all agents are located, is
constructed here using a fully connected model that couples all agents with
random strength values. The cellular automata network simulates the
higher-level integration of information acquired from the independent learning
trials. The final classification of incoming input data is therefore defined as
the stationary state of the meta-learning system under a simple majority rule,
yet minority clusters that share the opposite classification outcome can be
observed in the system. Therefore, the probability of selecting the proper
class for a given input can be estimated even without prior knowledge of its
true affiliation. Fuzzy logic can easily be introduced into the system, even
if the learning agents are built from simple binary classification machine
learning algorithms, by calculating the percentage of agreeing agents.
Comment: 23 page
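The majority-rule readout described above can be sketched directly: the final class is the majority vote, and the agreement fraction serves as a confidence estimate available without knowing the true label. The binary votes below stand in for the stationary states of the interacting agents.

```python
# Sketch of the majority-rule readout: majority class plus the fraction
# of agreeing agents as a label-free confidence estimate.
from collections import Counter

def meta_classify(agent_outputs):
    """Return (majority class, agreement fraction) for agent votes."""
    counts = Counter(agent_outputs)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(agent_outputs)

votes = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]   # 7 of 10 agents vote class 1
label, confidence = meta_classify(votes)
print(label, confidence)  # → 1 0.7
```

The 0.3 minority fraction corresponds to the minority clusters the abstract mentions; interpreting the agreement fraction as a fuzzy membership degree is exactly the easy extension to fuzzy logic described above.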
Speech Development by Imitation
The Double Cone Model (DCM) is a model of how the brain transforms sensory input to motor commands through successive stages of data compression and expansion. We have tested a subset of the DCM on speech recognition, production, and imitation. The experiments show that the DCM is a good candidate for an artificial speech processing system that can develop autonomously. We show that the DCM can learn a repertoire of speech sounds by listening to speech input. It is also able to link the individual elements of speech into sequences that can be recognized or reproduced, thus allowing the system to imitate spoken language.