
    Improving Students' Motivation and Learning Achievement Through Differentiated Learning

    School is where the teaching and learning process between teachers and students takes place. Students are occupied with learning activities from morning until afternoon, which causes their learning motivation to decline. Learning motivation can be raised through differentiated learning, and increased motivation in turn affects students' learning outcomes. This study aims to determine the improvement in students' motivation and learning outcomes when differentiated learning is used. It is a classroom action research study; data were collected through observation and written tests in the form of a pre-test and a post-test. The study found that differentiated learning can significantly increase learning motivation and thereby affect learning outcomes. The study concludes that differentiated learning improves students' motivation and learning outcomes: the mastery rate was 31% in the pre-cycle, rose to 62% after the intervention in cycle 1, and reached 91% after the follow-up intervention in cycle 2. Students' learning motivation also improved, as seen from observations on all five indicators of learning motivation.

    Collaborative Group Learning

    Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima. However, previous approaches typically struggle with drastically aggravated student homogenization when the number of students rises. In this paper, we propose Collaborative Group Learning, an efficient framework that aims to diversify the feature representation and conduct an effective regularization. Intuitively, similar to the human group study mechanism, we induce students to learn and exchange different parts of course knowledge as collaborative groups. First, each student is established by randomly routing on a modular neural network, which facilitates flexible knowledge communication between students due to random levels of representation sharing and branching. Second, to resist the student homogenization, students first compose diverse feature sets by exploiting the inductive bias from sub-sets of training data, and then aggregate and distill different complementary knowledge by imitating a random sub-group of students at each time step. Overall, the above mechanisms are beneficial for maximizing the student population to further improve the model generalization without sacrificing computational efficiency. Empirical evaluations on both image and text tasks indicate that our method significantly outperforms various state-of-the-art collaborative approaches whilst enhancing computational efficiency. Comment: Accepted by AAAI 2021; camera-ready version.
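
    The group-study mechanism described above, where each student fits its own sub-set of the data while imitating a randomly sampled sub-group of peers at every step, can be sketched in a few lines of PyTorch. The toy MLP students, data subsets, group size, and temperature below are illustrative assumptions rather than the paper's actual modular-routing implementation.

```python
# Hypothetical sketch: each student trains on its own data subset while distilling
# the averaged soft predictions of a random sub-group of its peers (toy setup).
import random
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_STUDENTS, GROUP_SIZE, TEMP, NUM_CLASSES = 4, 2, 4.0, 10

# Independent small students; the paper instead routes students through a shared
# modular network, which is omitted here for brevity.
students = [nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, NUM_CLASSES))
            for _ in range(NUM_STUDENTS)]
opts = [torch.optim.SGD(s.parameters(), lr=0.1) for s in students]

# Each student is biased towards a different sub-set of the training data.
subsets = [(torch.randn(64, 32), torch.randint(0, NUM_CLASSES, (64,)))
           for _ in range(NUM_STUDENTS)]

for step in range(100):
    for i, (x, y) in enumerate(subsets):
        out = students[i](x)
        ce = F.cross_entropy(out, y)

        # Imitate a random sub-group of peers: distill their averaged soft targets.
        peers = random.sample([j for j in range(NUM_STUDENTS) if j != i], GROUP_SIZE)
        with torch.no_grad():
            peer_probs = torch.stack(
                [F.softmax(students[j](x) / TEMP, dim=1) for j in peers]).mean(0)
        kd = F.kl_div(F.log_softmax(out / TEMP, dim=1),
                      peer_probs, reduction="batchmean") * TEMP ** 2

        opts[i].zero_grad()
        (ce + kd).backward()
        opts[i].step()
```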

    Cross-Layer Distillation with Semantic Calibration

    Recently proposed knowledge distillation approaches based on feature-map transfer validate that intermediate layers of a teacher model can serve as effective targets for training a student model to obtain better generalization ability. Existing studies mainly focus on particular representation forms for knowledge transfer between manually specified pairs of teacher-student intermediate layers. However, the semantics of intermediate layers may vary across networks, and manual association of layers might lead to negative regularization caused by semantic mismatch between certain teacher-student layer pairs. To address this problem, we propose Semantic Calibration for Cross-layer Knowledge Distillation (SemCKD), which automatically assigns proper target layers of the teacher model to each student layer with an attention mechanism. With a learned attention distribution, each student layer distills knowledge contained in multiple teacher layers rather than a single fixed intermediate layer, providing appropriate cross-layer supervision during training. Consistent improvements over state-of-the-art approaches are observed in extensive experiments with various network architectures for teacher and student models, demonstrating the effectiveness and flexibility of the proposed attention-based soft layer association mechanism for cross-layer distillation. Comment: AAAI-2021.
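
    A minimal sketch of the attention-based soft layer association is given below. It assumes pooled feature vectors as layer representations and invents the projection and attention dimensions; the paper's formulation operates on full feature maps, so treat this only as an illustration of how a learned attention distribution can weight per-teacher-layer feature losses.

```python
# Hypothetical sketch of attention-based cross-layer feature distillation
# (toy pooled features and dimensions are assumptions, not the paper's setup).
import torch
import torch.nn as nn
import torch.nn.functional as F

S_DIMS, T_DIMS, ATT_DIM = [16, 32, 64], [32, 64, 128], 48

class CrossLayerAttention(nn.Module):
    """For each student layer, learn a soft assignment over teacher layers."""
    def __init__(self):
        super().__init__()
        self.query = nn.ModuleList([nn.Linear(d, ATT_DIM) for d in S_DIMS])
        self.key = nn.ModuleList([nn.Linear(d, ATT_DIM) for d in T_DIMS])
        # Projections so each (student, teacher) pair is compared in the teacher's space.
        self.proj = nn.ModuleList([nn.ModuleList([nn.Linear(s, t) for t in T_DIMS])
                                   for s in S_DIMS])

    def forward(self, s_feats, t_feats):
        loss = 0.0
        for i, s in enumerate(s_feats):
            q = self.query[i](s)                                    # (B, ATT_DIM)
            k = torch.stack([self.key[j](t) for j, t in enumerate(t_feats)], dim=1)
            # Attention over teacher layers for this student layer.
            att = F.softmax((k @ q.unsqueeze(-1)).squeeze(-1) / ATT_DIM ** 0.5, dim=1)
            for j, t in enumerate(t_feats):
                mse = F.mse_loss(self.proj[i][j](s), t, reduction="none").mean(dim=1)
                loss = loss + (att[:, j] * mse).mean()
        return loss

# Toy pooled feature vectors standing in for intermediate feature maps.
B = 8
s_feats = [torch.randn(B, d) for d in S_DIMS]
t_feats = [torch.randn(B, d) for d in T_DIMS]
print(CrossLayerAttention()(s_feats, t_feats))
```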

    PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation

    We propose a novel knowledge distillation methodology for compressing deep neural networks. One of the most efficient methods for knowledge distillation is hint distillation, where the student model is injected with information (hints) from several different layers of the teacher model. Although the selection of hint points can drastically alter the compression performance, there is no systematic approach for selecting them other than brute-force hyper-parameter search. We propose a clustering-based hint selection methodology, where the layers of the teacher model are clustered with respect to several metrics and the cluster centers are used as the hint points. The proposed approach is validated on the CIFAR-100 dataset, with a ResNet-110 network used as the teacher model. Our results show that hint points selected by our algorithm yield superior compression performance compared with state-of-the-art knowledge distillation algorithms on the same student models and datasets.
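
    The clustering-based hint selection can be illustrated roughly as follows. The layer "signature" used here, a normalized sample-similarity Gram matrix computed on a probe batch, is an assumption standing in for the metrics used in the paper; only the overall flow (compute per-layer signatures, cluster them, take the layers nearest the cluster centers as hint points) mirrors the description above.

```python
# Hypothetical sketch of clustering teacher layers to pick hint points.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

NUM_HINTS, PROBE = 3, torch.randn(32, 64)

# A toy "teacher" as a stack of layers whose activations we can record.
teacher_layers = nn.ModuleList([nn.Sequential(nn.Linear(64, 64), nn.ReLU())
                                for _ in range(10)])

signatures, x = [], PROBE
with torch.no_grad():
    for layer in teacher_layers:
        x = layer(x)
        gram = x @ x.t()                       # (32, 32) sample-similarity matrix
        gram = gram / gram.norm()              # scale-invariant layer signature
        signatures.append(gram.flatten().numpy())
signatures = np.stack(signatures)              # (num_layers, 32*32)

# Cluster the layers and use the layer closest to each centroid as a hint point.
km = KMeans(n_clusters=NUM_HINTS, n_init=10, random_state=0).fit(signatures)
hints = [int(np.argmin(np.linalg.norm(signatures - c, axis=1)))
         for c in km.cluster_centers_]
print("hint layer indices:", sorted(set(hints)))
```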

    Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty

    Knowledge distillation is an effective paradigm for boosting the performance of pocket-size models, especially when multiple teacher models are available, since the student can then break through its usual upper limit. However, it is not economical to train diverse teacher models for a single, disposable distillation run. In this paper, we introduce a new concept dubbed Avatars for distillation: inference-time ensemble models derived from the teacher. Concretely, (1) at each iteration of distillation training, various Avatars are generated by a perturbation transformation. We validate that Avatars have a higher upper limit of working capacity and teaching ability, helping the student model learn diverse and receptive knowledge perspectives from the teacher model. (2) During distillation, we propose an uncertainty-aware factor, derived from the variance of statistical differences between the vanilla teacher and the Avatars, to adaptively adjust the Avatars' contribution to knowledge transfer. Avatar Knowledge Distillation (AKD) is fundamentally different from existing methods and refines the distillation process from the innovative view of unequal training. Comprehensive experiments demonstrate the effectiveness of our Avatar mechanism, which improves state-of-the-art distillation methods for dense prediction without extra computational cost. AKD brings up to 0.7 AP gains on COCO 2017 for object detection and 1.83 mIoU gains on Cityscapes for semantic segmentation. Comment: Accepted by ACM MM 2023.
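
    A rough sketch of the Avatar idea is shown below: at each iteration, perturbed copies of the teacher act as an inference-time ensemble, and an uncertainty-derived factor re-weights how strongly the student imitates each copy. The Gaussian weight perturbation and the particular weighting formula are illustrative assumptions, not the paper's exact perturbation transformation or uncertainty factor.

```python
# Hypothetical sketch of Avatar-style distillation with uncertainty-aware weighting
# (toy models; perturbation and weighting are illustrative assumptions).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_AVATARS, TEMP, SIGMA = 3, 4.0, 0.05
teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10)).eval()
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.SGD(student.parameters(), lr=0.1)

def make_avatar(model, sigma=SIGMA):
    """Copy the teacher and perturb its weights with small Gaussian noise."""
    avatar = copy.deepcopy(model)
    with torch.no_grad():
        for p in avatar.parameters():
            p.add_(torch.randn_like(p) * sigma * p.std())
    return avatar.eval()

for step in range(100):
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
    with torch.no_grad():
        t_prob = F.softmax(teacher(x) / TEMP, dim=1)
        avatars = [make_avatar(teacher) for _ in range(NUM_AVATARS)]
        a_probs = [F.softmax(a(x) / TEMP, dim=1) for a in avatars]
        # Per-avatar deviation from the vanilla teacher; lower deviation -> higher weight.
        dev = torch.stack([F.kl_div(p.log(), t_prob, reduction="batchmean")
                           for p in a_probs])
        w = F.softmax(-dev / (dev.var() + 1e-8).sqrt(), dim=0)

    # Student imitates each Avatar, weighted by the uncertainty-derived factor.
    s_log = F.log_softmax(student(x) / TEMP, dim=1)
    kd = sum(w[k] * F.kl_div(s_log, a_probs[k], reduction="batchmean")
             for k in range(NUM_AVATARS)) * TEMP ** 2
    loss = F.cross_entropy(student(x), y) + kd
    opt.zero_grad(); loss.backward(); opt.step()
```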