K-LITE: Learning Transferable Visual Models with External Knowledge
Recent state-of-the-art computer vision systems are trained from natural
language supervision, ranging from simple object category names to descriptive
captions. This free form of supervision ensures high generality and usability
of the learned visual models, based on extensive heuristics on data collection
to cover as many visual concepts as possible. Alternatively, learning with
external knowledge about images is a promising way to leverage a much more
structured source of supervision. In this paper, we propose K-LITE
(Knowledge-augmented Language-Image Training and Evaluation), a simple strategy
to leverage external knowledge to build transferable visual systems: In
training, it enriches entities in natural language with WordNet and Wiktionary
knowledge, leading to an efficient and scalable approach to learning image
representations that can understand both visual concepts and their knowledge;
In evaluation, the natural language is also augmented with external knowledge
and then used to reference learned visual concepts (or describe new ones) to
enable zero-shot and few-shot transfer of the pre-trained models. We study the
performance of K-LITE on two important computer vision problems, image
classification and object detection, benchmarking on 20 and 13 different
existing datasets, respectively. The proposed knowledge-augmented models show
significant improvement in transfer learning performance over existing methods.
Comment: Preprint. The first three authors contributed equally.
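The training-time augmentation described above can be sketched as appending an external definition to each concept name before it reaches the text encoder. The toy dictionary and prompt template below are illustrative assumptions standing in for WordNet/Wiktionary lookups, not the paper's exact method:

```python
# Sketch of K-LITE-style prompt enrichment: append external knowledge
# (here a toy stand-in for WordNet/Wiktionary definitions) to a class
# name before it is fed to the text encoder of a language-image model.
# The dictionary and the prompt template are illustrative assumptions.
TOY_KNOWLEDGE = {
    "mandarin": "a small citrus fruit with loose orange skin",
    "beagle": "a small hound dog breed with a keen sense of smell",
}

def enrich(class_name: str, knowledge: dict) -> str:
    """Build a knowledge-augmented text prompt for one visual concept."""
    definition = knowledge.get(class_name.lower())
    if definition is None:
        return f"a photo of a {class_name}"  # fall back to the plain prompt
    return f"a photo of a {class_name}, {definition}"

prompt = enrich("mandarin", TOY_KNOWLEDGE)
# -> "a photo of a mandarin, a small citrus fruit with loose orange skin"
```

The same enrichment is applied at evaluation time, so seen and unseen concepts are described in a consistent, knowledge-grounded form.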
User Interface and Interaction Design Considerations for Collaborative Learning Using Augmented Reality Learning Object
Abstract. Most education is too often about teaching and not enough about learning. It is because students are forced to take whatever it is given to them without considering what they think about it, in other words, they passively take the given knowledge. This paper presents early investigation about interface and interaction design considerations for effective collaborative learning by taking account individual learning preferences and collaborative learning
characteristics of engineering students. In our investigation, we follow Felder Silverman Learning Style Model and conducted a test measured using Index Learning Style. As a result, we discovered that engineering students tend to be active, sensory, visual, and sequential. Therefore, we implement augmented reality views to satisfy students’ learning preferences toward content presentation (visual learner). It is also because augmented reality can give rich information toward real objects/environment. For collaborative characteristics, we studied past research on collaborative learning regarding its characteristics that affects learning effectiveness. Besides, our proposed design also considered the user interface principle which provides a guidance to effectively implement our consideration into an interface
Video augmentation to support video-based learning
Multimedia content and video-based learning are expected to take a central role in the post-pandemic world. Thus, providing new advanced interfaces and services that further exploit their potential becomes of paramount importance. A challenging area deals with developing intelligent visual interfaces that integrate the knowledge extracted from multimedia materials into educational applications. In this respect, we designed a web-based video player that aims to support video consumption by exploiting the knowledge extracted from the video, in terms of the concepts explained in the video and the prerequisite relations between them. This knowledge is used to augment the video lesson through visual feedback methods. Specifically, in this paper we investigate the use of two types of visual feedback, i.e. an augmented transcript and a dynamic concept map (map of concepts' flow), to improve video comprehension in the first-watch learning context. Our preliminary findings suggest that both methods help the learner to focus on the relevant concepts and their related contents. The augmented transcript has a higher impact on immediate comprehension compared to the map of concepts' flow, even though the latter is expected to be more powerful in supporting other tasks such as exploration and in-depth analysis of the concepts in the video
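The knowledge the player relies on, concepts plus prerequisite relations, can be sketched as a small directed graph; when a concept appears in the transcript, its not-yet-covered prerequisites are candidates for highlighting. The concept names and data structure below are illustrative assumptions, not the actual system's representation:

```python
# Toy sketch of a concept graph for one video lesson: each concept maps
# to the concepts that should be understood before it. When a concept
# appears in the transcript, its unmet prerequisites can be highlighted.
# All names and relations here are illustrative assumptions.
PREREQS = {
    "gradient descent": ["derivative", "loss function"],
    "loss function": ["function"],
    "derivative": ["function"],
}

def missing_prereqs(concept: str, seen: set) -> list:
    """Return prerequisites of `concept` not yet covered in the lesson."""
    return [p for p in PREREQS.get(concept, []) if p not in seen]

# While watching: "function" was already explained, and
# "gradient descent" has just appeared in the transcript.
missing_prereqs("gradient descent", {"function"})
# -> ["derivative", "loss function"]
```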
Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection
Multi-label image classification is a fundamental but challenging task
towards general visual understanding. Existing methods have found that
region-level cues (e.g., features from RoIs) can facilitate multi-label classification.
Nevertheless, such methods usually require laborious object-level annotations
(i.e., object labels and bounding boxes) for effective learning of the
object-level visual features. In this paper, we propose a novel and efficient
deep framework to boost multi-label classification by distilling knowledge from
weakly-supervised detection task without bounding box annotations.
Specifically, given the image-level annotations, (1) we first develop a
weakly-supervised detection (WSD) model, and then (2) construct an end-to-end
multi-label image classification framework augmented by a knowledge
distillation module that guides the classification model by the WSD model
according to the class-level predictions for the whole image and the
object-level visual features for object RoIs. The WSD model is the teacher
model and the classification model is the student model. After this cross-task
knowledge distillation, the performance of the classification model is
significantly improved and the efficiency is maintained since the WSD model can
be safely discarded in the test phase. Extensive experiments on two large-scale
datasets (MS-COCO and NUS-WIDE) show that our framework achieves superior
accuracy and efficiency compared with state-of-the-art methods.
Comment: accepted by ACM Multimedia 2018; 9 pages, 4 figures, 5 tables
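The class-level part of the cross-task distillation can be sketched as a soft-target loss that pulls the student classifier's per-class predictions toward the frozen WSD teacher's image-level predictions. The binary cross-entropy form and the toy probabilities below are assumptions for illustration, not the paper's exact objective:

```python
import math

# Minimal sketch of a class-level distillation signal: the student's
# multi-label predictions are pushed toward the (frozen) WSD teacher's
# image-level soft predictions via a per-class binary cross-entropy.
# The loss form and the example values are illustrative assumptions.
def distill_loss(teacher_probs, student_probs):
    """Mean per-class BCE of student predictions against teacher soft targets."""
    eps = 1e-7
    loss = 0.0
    for t, s in zip(teacher_probs, student_probs):
        s = min(max(s, eps), 1.0 - eps)  # clamp for numerical safety
        loss += -(t * math.log(s) + (1.0 - t) * math.log(1.0 - s))
    return loss / len(teacher_probs)

# A student that matches the teacher incurs a lower loss than one that doesn't.
aligned = distill_loss([0.9, 0.1], [0.9, 0.1])
mismatch = distill_loss([0.9, 0.1], [0.1, 0.9])
```

At test time only the student runs, which is why the teacher can be discarded without any inference-time cost.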
Augmented Reality for Presenting Real-Time Data During Students' Laboratory Work: Comparing a Head-Mounted Display With a Separate Display
Multimedia learning theories suggest presenting associated pieces of information in
spatial and temporal contiguity. New technologies like Augmented Reality allow for
realizing these principles in science laboratory courses by presenting virtual real-time
information during hands-on experimentation. Spatial integration can be achieved by
pinning virtual representations of measurement data to corresponding real components.
In the present study, an Augmented Reality-based presentation format was realized
via a head-mounted display and contrasted with a separate display, which provided a
well-arranged data matrix at a spatial distance from the real components and was therefore
expected to result in a spatial split-attention effect. Two groups of engineering students
(N = 107; Augmented Reality vs. separate display) performed six experiments exploring
fundamental laws of electric circuits. Cognitive load and conceptual knowledge
acquisition were assessed as main outcome variables. In contrast to our hypotheses
and previous findings, the Augmented Reality group did not report lower extraneous load
and the separate display group showed higher learning gains. The pre- and posttests
assessing conceptual knowledge were monitored by eye tracking. Results indicate
that the condition affected the visual relevancy of circuit diagrams to final problem
completion. The unexpected reverse effects could be traced back to emphasizing
coherence formation processes regarding multiple measurements
Influence of augmented feedback on learning upper extremity tasks after stroke
With upcoming innovative technologies, more possibilities arise for applying augmented feedback in rehabilitation therapy of the hemiparetic arm after stroke. The effects of different aspects and types of augmented feedback on motor functions and motor activities of the hemiparetic arm after stroke were studied in a systematic literature review. Based on the current literature, it was not possible to determine which combinations of aspects and types of augmented feedback are most essential for a beneficial effect on motor activities and motor functions of the hemiparetic arm after stroke, because the included studies combined multiple aspects and types of augmented feedback. Knowledge about the actual use of position feedback during reaching training after stroke was obtained from a training experiment with five stroke survivors. During training, subjects performed reaching movements over a predefined path; when the shoulder and elbow joints deviated, position feedback was given through restraining forces. Although use of the augmented feedback was limited, kinematic outcome measures and movement performance during training increased in all subjects. An experimental study provided knowledge about the influence of different feedback conditions on motor learning in healthy elderly people and stroke survivors. Repetitive reaching movements were performed with a visual distortion of hand movements under three feedback conditions: concurrent knowledge of performance (cKP), terminal knowledge of performance (tKP), and terminal knowledge of results (tKR). The highest potential for learning and consolidation was achieved with cKP in both groups. Remarkably, several subjects showed limited learning, independent of the feedback provided. The effect of reaching direction on visuomotor learning was studied in healthy young and elderly subjects and in stroke survivors.
Repetitive reaching movements to five directions were performed during adaptation to a visuomotor rotation. For the young subjects, a significantly higher amount of adaptation was observed for movements toward the contralateral side of the body than for the other directions. No significant differences in learning between directions were observed for the elderly, and for stroke survivors only a higher deviation at the start of the learning phase in one direction was found
Attribute Prototype Network for Zero-Shot Learning
From the beginning of zero-shot learning research, visual attributes have
been shown to play an important role. In order to better transfer
attribute-based knowledge from known to unknown classes, we argue that an image
representation with integrated attribute localization ability would be
beneficial for zero-shot learning. To this end, we propose a novel zero-shot
representation learning framework that jointly learns discriminative global and
local features using only class-level attributes. While a visual-semantic
embedding layer learns global features, local features are learned through an
attribute prototype network that simultaneously regresses and decorrelates
attributes from intermediate features. We show that our locality augmented
image representations achieve a new state-of-the-art on three zero-shot
learning benchmarks. As an additional benefit, our model points to the visual
evidence of the attributes in an image, e.g. for the CUB dataset, confirming
the improved attribute localization ability of our image representation.
Comment: NeurIPS 2020. The code is publicly available at
https://wenjiaxu.github.io/APN-ZSL
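The zero-shot inference step described above can be sketched as matching an image's predicted attribute vector against class-level attribute signatures of unseen classes. The attribute names, signatures, and similarity choice below are illustrative assumptions; the regression from local features to attributes (the prototype part) is abstracted away:

```python
# Toy sketch of attribute-based zero-shot classification: an image is
# mapped to a predicted attribute vector, and the unseen class whose
# class-level attribute signature is most similar wins. Attribute
# names, signatures, and the dot-product similarity are assumptions.
CLASS_ATTRIBUTES = {
    "zebra": [1.0, 1.0, 0.0],  # [striped, four-legged, can fly]
    "eagle": [0.0, 0.0, 1.0],
}

def predict_class(image_attrs):
    """Pick the class whose attribute signature best matches the image."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(CLASS_ATTRIBUTES, key=lambda c: dot(CLASS_ATTRIBUTES[c], image_attrs))

predict_class([0.9, 0.8, 0.1])  # striped and four-legged -> "zebra"
```

Because attributes are regressed from intermediate (spatial) features, the same mechanism also yields localization maps showing where in the image each attribute is evidenced.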
Travel to Southeast Asia: Learning About Southeast Asia through Augmented Reality
The ‘Travel to Southeast Asia’ application is an Augmented Reality (AR) application. Learning about Southeast Asia is important and is emphasized in high schools. However, some students are bored and not interested in learning geography, especially about foreign countries, due to a lack of exposure and information. The main purpose of this project is therefore to develop an Augmented Reality-based educational application that interests students in learning about Southeast Asia. Using the application, students can identify the eleven countries of Southeast Asia and gain knowledge about them. Unity is the main software used to develop the application, because the Unity engine supports high-quality audio and visual effects that ease development. The project was developed using the ADDIE Model as a dynamic and flexible guideline for building effective training and performance support tools. It was evaluated by 15 students from Sultan Idris Education University pursuing a Bachelor of Education (Geography) with Honours, using quantitative methods through an online questionnaire based on the Usefulness, Satisfaction, and Ease of Use (USE) questionnaire. The majority of respondents gave positive feedback and were interested in the ‘Travel to Southeast Asia’ application. Based on this research, learning about Southeast Asia using Augmented Reality provides better knowledge and understanding
Augmenting visual information in knowledge graphs for recommendations
Knowledge graphs (KGs) have been popularly used in recommender systems to leverage high-order connections between users and items. Typically, KGs are constructed based on semantic information derived from metadata. However, item images are also highly useful, especially in domains where visual factors are influential, such as fashion. In this paper, we propose an approach to augment KGs with visual information extracted by popular image feature extraction methods. Specifically, we introduce visually-augmented KGs, in which the extracted information is integrated through visual factor entities and visual relations. Moreover, to leverage the augmented KGs, a user representation learning approach is proposed to learn hybrid user profiles that combine both semantic and visual preferences. The proposed approaches have been applied to top-N recommendation tasks on two real-world datasets. The results show that the augmented KGs and the representation learning approach can improve recommendation performance. They also show that the augmented KGs are applicable in a state-of-the-art KG-based recommender system as well
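The augmentation step described above can be sketched as adding triples that link items to "visual factor" entities derived from their image features, alongside the usual metadata triples. The entity and relation names below are illustrative assumptions, not the paper's schema:

```python
# Sketch of augmenting a semantic KG with visual information: image
# features are quantized into "visual factor" entities and linked to
# items via a has_visual_factor relation, next to the metadata triples.
# Entity and relation names here are illustrative assumptions.
kg = [
    ("item:dress1", "category", "attr:dresses"),
    ("item:dress2", "category", "attr:dresses"),
]

def add_visual_factors(kg, item, factor_ids):
    """Append visual-factor triples for one item's extracted image features."""
    for f in factor_ids:
        kg.append((item, "has_visual_factor", f"vf:{f}"))
    return kg

add_visual_factors(kg, "item:dress1", [3, 17])
# kg now also links item:dress1 to vf:3 and vf:17
```

A KG-based recommender can then propagate over both relation types, so two items sharing a visual factor become connected even when their metadata differs.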