
    K-LITE: Learning Transferable Visual Models with External Knowledge

    Recent state-of-the-art computer vision systems are trained from natural language supervision, ranging from simple object category names to descriptive captions. This free form of supervision ensures high generality and usability of the learned visual models, relying on extensive heuristics in data collection to cover as many visual concepts as possible. Alternatively, learning with external knowledge about images is a promising way to leverage a much more structured source of supervision. In this paper, we propose K-LITE (Knowledge-augmented Language-Image Training and Evaluation), a simple strategy that leverages external knowledge to build transferable visual systems: in training, it enriches entities in natural language with WordNet and Wiktionary knowledge, leading to an efficient and scalable approach to learning image representations that understand both visual concepts and their knowledge; in evaluation, the natural language is also augmented with external knowledge and then used to reference learned visual concepts (or describe new ones) to enable zero-shot and few-shot transfer of the pre-trained models. We study the performance of K-LITE on two important computer vision problems, image classification and object detection, benchmarking on 20 and 13 existing datasets, respectively. The proposed knowledge-augmented models show significant improvement in transfer learning performance over existing methods. Comment: Preprint. The first three authors contributed equally.
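    The entity-enrichment step described here can be sketched minimally as prompt augmentation: look up a class name in an external dictionary and append its definition to the text prompt before encoding. The glossary below is a hypothetical stand-in for real WordNet/Wiktionary lookups, not the authors' pipeline.

    ```python
    # Hypothetical glossary standing in for WordNet/Wiktionary knowledge.
    GLOSSARY = {
        "aircraft carrier": "a warship that serves as a seagoing airbase",
        "tench": "a freshwater fish of the carp family",
    }

    def augment_prompt(class_name: str, template: str = "a photo of a {}.") -> str:
        """Append external knowledge to a class name before text encoding."""
        prompt = template.format(class_name)
        definition = GLOSSARY.get(class_name)
        if definition:
            # Enrich the entity with its definition, as in knowledge-augmented training.
            prompt += f" {class_name}: {definition}."
        return prompt
    ```

    A class name absent from the glossary simply falls back to the plain template, which mirrors how knowledge coverage gaps are handled gracefully.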

    User Interface and Interaction Design Considerations for Collaborative Learning Using Augmented Reality Learning Object

    Abstract. Most education is too often about teaching and not enough about learning: students are forced to take whatever is given to them without considering what they think about it; in other words, they passively receive the given knowledge. This paper presents an early investigation into interface and interaction design considerations for effective collaborative learning, taking into account the individual learning preferences and collaborative learning characteristics of engineering students. In our investigation, we followed the Felder-Silverman Learning Style Model and conducted a test measured with the Index of Learning Styles. As a result, we discovered that engineering students tend to be active, sensory, visual, and sequential. We therefore implement augmented reality views to satisfy students' learning preferences regarding content presentation (visual learners), and because augmented reality can attach rich information to real objects and environments. For collaborative characteristics, we studied past research on the characteristics of collaborative learning that affect learning effectiveness. Our proposed design also follows user interface principles that guide the effective implementation of these considerations in an interface.
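    The Index of Learning Styles measurement mentioned above is commonly described as 44 forced-choice (a/b) items, 11 per dimension, with each dimension scored as the count of "a" answers minus the count of "b" answers. The sketch below assumes that layout and an interleaved item-to-dimension assignment; it is illustrative, not the official scoring key.

    ```python
    # Assumed ILS layout: items cycle through the four dimensions,
    # 11 items each; score per dimension is (#a - #b), in -11..11.
    DIMENSIONS = ["active/reflective", "sensing/intuitive",
                  "visual/verbal", "sequential/global"]

    def score_ils(answers):
        """answers: list of 44 'a'/'b' responses; item i is assigned to
        dimension i % 4 under this illustrative interleaving."""
        scores = {d: 0 for d in DIMENSIONS}
        for i, ans in enumerate(answers):
            scores[DIMENSIONS[i % 4]] += 1 if ans == "a" else -1
        return scores
    ```

    A strongly "active, sensory, visual, sequential" cohort, as reported in the abstract, would show consistently positive or negative scores on each dimension depending on the key's polarity.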

    Video augmentation to support video-based learning

    Multimedia content and video-based learning are expected to take a central role in the post-pandemic world. Thus, providing new advanced interfaces and services that further exploit their potential becomes of paramount importance. A challenging area deals with developing intelligent visual interfaces that integrate the knowledge extracted from multimedia materials into educational applications. In this respect, we designed a web-based video player that aims to support video consumption by exploiting the knowledge extracted from the video, in terms of the concepts explained in the video and the prerequisite relations between them. This knowledge is used to augment the video lesson through visual feedback methods. Specifically, in this paper we investigate the use of two types of visual feedback, i.e. an augmented transcript and a dynamic concept map (a map of the concepts' flow), to improve video comprehension in the first-watch learning context. Our preliminary findings suggest that both methods help the learner to focus on the relevant concepts and their related contents. The augmented transcript has a higher impact on immediate comprehension compared to the map of the concepts' flow, even though the latter is expected to be more powerful for supporting other tasks such as exploration and in-depth analysis of the concepts in the video.
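    The core data behind such an augmented transcript can be sketched as a small prerequisite graph over extracted concepts: each transcript segment is annotated with the concepts it mentions and with the prerequisites a learner may want to review first. The concept names and matching logic below are illustrative assumptions, not the authors' actual extraction pipeline.

    ```python
    # Hypothetical prerequisite relations: concept -> list of prerequisites.
    PREREQS = {
        "backpropagation": ["gradient", "chain rule"],
        "gradient": ["derivative"],
    }

    def annotate_segment(text, concepts=PREREQS):
        """Return concepts mentioned in a transcript segment and the
        prerequisites of those concepts (naive substring matching)."""
        mentioned = [c for c in concepts if c in text.lower()]
        review = sorted({p for c in mentioned for p in concepts.get(c, [])})
        return {"mentioned": mentioned, "review_first": review}
    ```

    A player could highlight `mentioned` concepts inline in the transcript and surface `review_first` as links into earlier parts of the lesson, which matches the two feedback methods described above in spirit.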

    Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection

    Multi-label image classification is a fundamental but challenging task towards general visual understanding. Existing methods have found that region-level cues (e.g., features from RoIs) can facilitate multi-label classification. Nevertheless, such methods usually require laborious object-level annotations (i.e., object labels and bounding boxes) for effective learning of object-level visual features. In this paper, we propose a novel and efficient deep framework that boosts multi-label classification by distilling knowledge from a weakly-supervised detection task without bounding box annotations. Specifically, given the image-level annotations, (1) we first develop a weakly-supervised detection (WSD) model, and then (2) construct an end-to-end multi-label image classification framework augmented by a knowledge distillation module in which the WSD model guides the classification model according to the class-level predictions for the whole image and the object-level visual features for object RoIs. The WSD model is the teacher and the classification model is the student. After this cross-task knowledge distillation, the performance of the classification model is significantly improved while efficiency is maintained, since the WSD model can be safely discarded in the test phase. Extensive experiments on two large-scale datasets (MS-COCO and NUS-WIDE) show that our framework outperforms state-of-the-art methods in both accuracy and efficiency. Comment: accepted by ACM Multimedia 2018, 9 pages, 4 figures, 5 tables.
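    The teacher-student guidance on class-level predictions can be sketched in its standard Hinton-style form: the student's logits are pulled toward the teacher's temperature-softened distribution via a KL term. This is a generic distillation loss under that assumption; the paper's exact multi-label formulation may differ.

    ```python
    import math

    def softmax(logits, T=1.0):
        """Temperature-scaled softmax over a list of logits."""
        exps = [math.exp(z / T) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        """KL(teacher || student) on temperature-softened distributions:
        zero when the student matches the teacher exactly."""
        p = softmax(teacher_logits, T)  # teacher (WSD) soft targets
        q = softmax(student_logits, T)  # student (classifier) predictions
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    ```

    Because the teacher only contributes targets during training, it can be dropped at test time, which is exactly why efficiency is preserved in the scheme described above.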

    Augmented Reality for Presenting Real-Time Data During Students' Laboratory Work: Comparing a Head-Mounted Display With a Separate Display

    Multimedia learning theories suggest presenting associated pieces of information in spatial and temporal contiguity. New technologies like Augmented Reality allow for realizing these principles in science laboratory courses by presenting virtual real-time information during hands-on experimentation. Spatial integration can be achieved by pinning virtual representations of measurement data to the corresponding real components. In the present study, an Augmented Reality-based presentation format was realized via a head-mounted display and contrasted with a separate display, which provided a well-arranged data matrix at a spatial distance from the real components and was therefore expected to produce a spatial split-attention effect. Two groups of engineering students (N = 107; Augmented Reality vs. separate display) performed six experiments exploring fundamental laws of electric circuits. Cognitive load and conceptual knowledge acquisition were assessed as the main outcome variables. In contrast to our hypotheses and previous findings, the Augmented Reality group did not report lower extraneous load, and the separate display group showed higher learning gains. The pre- and posttests assessing conceptual knowledge were monitored by eye tracking. Results indicate that the condition affected the visual relevance of circuit diagrams to final problem completion. The unexpected reverse effects could be traced back to an emphasis on coherence formation processes regarding multiple measurements.

    Influence of augmented feedback on learning upper extremity tasks after stroke

    With upcoming innovative technologies, more possibilities arise for the application of augmented feedback in rehabilitation therapy of the hemiparetic arm after stroke. The effects of different aspects and types of augmented feedback on motor functions and motor activities of the hemiparetic arm after stroke were studied in a systematic literature review. Based on the current literature, it was not possible to determine which combinations of aspects and types of augmented feedback are most essential for a beneficial effect on motor activities and motor functions of the hemiparetic arm after stroke, owing to the combination of multiple aspects and types of augmented feedback in the included studies. Knowledge about the actual use of position feedback during reaching training after stroke was obtained from a training experiment in five stroke survivors. During training, subjects performed reaching movements over a predefined path; when deviating, the shoulder and elbow joints received position feedback through restraining forces. Although the use of augmented feedback was limited, kinematic outcome measures and movement performance during training increased in all subjects. An experimental study provided knowledge about the influence of different feedback conditions on motor learning in healthy elderly subjects and stroke survivors. Repetitive reaching movements were performed with a visual distortion of hand movements under three feedback conditions: concurrent knowledge of performance (cKP), terminal knowledge of performance (tKP), and terminal knowledge of results (tKR). The highest potential for learning and consolidation was achieved with cKP in both groups. Remarkably, several subjects showed limited amounts of learning, independent of the provided feedback. The effect of reaching direction on visuomotor learning was studied in healthy young and elderly subjects and in stroke survivors. Repetitive reaching movements to five directions were performed during adaptation to a visuomotor rotation. A significantly higher amount of adaptation was observed for the young subjects in movements towards the contralateral side of the body compared to other directions. No significant differences in learning between directions were observed for the elderly, and for stroke survivors only a higher deviation at the start of the learning phase in one direction was found.

    Attribute Prototype Network for Zero-Shot Learning

    From the beginning of zero-shot learning research, visual attributes have been shown to play an important role. To better transfer attribute-based knowledge from known to unknown classes, we argue that an image representation with integrated attribute localization ability would be beneficial for zero-shot learning. To this end, we propose a novel zero-shot representation learning framework that jointly learns discriminative global and local features using only class-level attributes. While a visual-semantic embedding layer learns global features, local features are learned through an attribute prototype network that simultaneously regresses and decorrelates attributes from intermediate features. We show that our locality-augmented image representations achieve a new state of the art on three zero-shot learning benchmarks. As an additional benefit, our model points to the visual evidence of the attributes in an image, e.g. for the CUB dataset, confirming the improved attribute localization ability of our image representation. Comment: NeurIPS 2020. The code is publicly available at https://wenjiaxu.github.io/APN-ZSL
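    The prototype idea above can be sketched as follows: each attribute has a prototype vector, and its score for an image is the maximum similarity between that prototype and the local features of the feature map. The arg-max location then serves as the visual evidence for the attribute. Shapes and the dot-product similarity are illustrative assumptions.

    ```python
    import numpy as np

    def attribute_scores(feature_map, prototypes):
        """feature_map: (H, W, D) local features; prototypes: (A, D).
        Returns an (A,) vector: for each attribute, the best
        prototype-feature similarity over all spatial locations."""
        H, W, D = feature_map.shape
        flat = feature_map.reshape(H * W, D)  # all local features
        sims = flat @ prototypes.T            # (H*W, A) similarities
        return sims.max(axis=0)               # max-pool over locations
    ```

    Replacing `max` with `argmax` over the spatial axis recovers, per attribute, the location most responsible for the score, which is the localization signal the abstract refers to.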

    Travel to Southeast Asia: Learning About Southeast Asia through Augmented Reality

    The ‘Travel to Southeast Asia’ application is an Augmented Reality (AR) application. Learning about Southeast Asia is very important and has been emphasized in high schools. However, some students are bored by and uninterested in learning geography, especially where foreign countries are concerned, due to a lack of exposure and information. The main purpose of this project is therefore to develop an Augmented Reality-based educational application that engages students in learning about Southeast Asia. Using this application, students can identify the eleven countries of Southeast Asia and gain knowledge about them. Unity is the main software used to develop the application, because the Unity engine supports high-quality audio and visual effects that ease the development of the project. The project was developed using the ADDIE model as a dynamic and flexible guideline for building effective training and performance support tools. This research was evaluated by 15 students from Sultan Idris Education University pursuing a Bachelor of Education (Geography) with Honours, using quantitative methods through an online questionnaire. The questionnaire was distributed to the respondents for evaluation based on the Usefulness, Satisfaction, and Ease of Use (USE) questionnaire. As a result, the majority of respondents gave positive feedback and were interested in the ‘Travel to Southeast Asia’ application. Based on this research, learning about Southeast Asia using Augmented Reality provides better knowledge and understanding.

    Augmenting visual information in knowledge graphs for recommendations

    Knowledge graphs (KGs) have been popularly used in recommender systems to leverage high-order connections between users and items. Typically, KGs are constructed from semantic information derived from metadata. However, item images are also highly useful, especially in domains where visual factors are influential, such as fashion. In this paper, we propose an approach to augment KGs with visual information extracted by popular image feature extraction methods. Specifically, we introduce visually-augmented KGs in which the extracted information is integrated through visual factor entities and visual relations. Moreover, to leverage the augmented KGs, a user representation learning approach is proposed that learns hybrid user profiles combining both semantic and visual preferences. The proposed approaches have been applied to top-N recommendation tasks on two real-world datasets. The results show that the augmented KGs and the representation learning approach improve recommendation performance, and that the augmented KGs are applicable to a state-of-the-art KG-based recommender system as well.
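    The augmentation step can be sketched as turning image features into new KG triples: each item's feature vector is assigned to a "visual factor" entity (nearest centroid here, purely illustrative) and the assignment is recorded as a relation. The relation and entity names are hypothetical, not the paper's schema.

    ```python
    import numpy as np

    def augment_kg(triples, item_features, centroids,
                   relation="has_visual_factor"):
        """Add (item, has_visual_factor, factor_k) triples to a KG.
        item_features: dict of item -> feature vector; centroids: (K, D)
        array of visual-factor prototypes."""
        augmented = list(triples)
        for item, feat in item_features.items():
            # Assign the item to its nearest visual factor (illustrative).
            k = int(np.argmin(np.linalg.norm(centroids - feat, axis=1)))
            augmented.append((item, relation, f"visual_factor_{k}"))
        return augmented
    ```

    Downstream, any KG-based recommender can consume the augmented triple set unchanged, which is consistent with the claim that the augmented KGs plug into existing KG-based systems.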