IMPROVING STUDENT MOTIVATION AND LEARNING ACHIEVEMENT THROUGH DIFFERENTIATED LEARNING
School is where the teaching and learning process between teachers and students takes place. Students are occupied with learning activities from morning to afternoon, which causes their learning motivation to decline. Learning motivation can be improved through differentiated learning, and improved motivation in turn affects student learning outcomes. This study aims to determine the improvement in students' motivation and learning outcomes when differentiated learning is used. It is a classroom action research study, with data collected through observation and written pre-tests and post-tests. The study found that differentiated learning can significantly increase learning motivation and thereby affect learning outcomes. In conclusion, differentiated learning improved students' motivation and learning outcomes: the pre-cycle completion rate was 31%, rising to 62% after the intervention in cycle 1, and reaching 91% after the follow-up intervention in cycle 2. Students' learning motivation also improved, as observed across all five indicators of learning motivation.
Collaborative Group Learning
Collaborative learning has successfully applied knowledge transfer to guide a
pool of small student networks towards robust local minima. However, previous
approaches typically struggle with drastically aggravated student
homogenization when the number of students rises. In this paper, we propose
Collaborative Group Learning, an efficient framework that diversifies the
feature representations and provides effective regularization. Intuitively,
similar to the human group study mechanism, we induce students to learn and
exchange different parts of course knowledge as collaborative groups. First,
each student is established by randomly routing on a modular neural network,
which facilitates flexible knowledge communication between students due to
random levels of representation sharing and branching. Second, to resist the
student homogenization, students first compose diverse feature sets by
exploiting the inductive bias from sub-sets of training data, and then
aggregate and distill different complementary knowledge by imitating a random
sub-group of students at each time step. Overall, the above mechanisms are
beneficial for maximizing the student population to further improve the model
generalization without sacrificing computational efficiency. Empirical
evaluations on both image and text tasks indicate that our method significantly
outperforms various state-of-the-art collaborative approaches whilst enhancing
computational efficiency.
Comment: Accepted by AAAI 2021; camera-ready version
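The random sub-group imitation step described above can be sketched roughly as follows. This is an illustrative stand-in, not the authors' implementation; all function names and the temperature value are assumptions:

```python
import numpy as np

def softmax(z, temp=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / temp
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def group_distill_targets(logits, group_size, rng, temp=2.0):
    """For each student, build a distillation target by averaging the
    softened predictions of a random sub-group of the *other* students.
    A sketch of the sub-group imitation idea, not the paper's code."""
    probs = softmax(np.stack(logits), temp)
    n = len(logits)
    targets = []
    for i in range(n):
        peers = [j for j in range(n) if j != i]
        group = rng.choice(peers, size=group_size, replace=False)
        targets.append(probs[group].mean(axis=0))
    return targets
```

Each student would then minimize a divergence (e.g. KL) between its own softened prediction and its sub-group target, alongside the ordinary task loss.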
Cross-Layer Distillation with Semantic Calibration
Recently proposed knowledge distillation approaches based on feature-map
transfer validate that intermediate layers of a teacher model can serve as
effective targets for training a student model to obtain better generalization
ability. Existing studies mainly focus on particular representation forms for
knowledge transfer between manually specified pairs of teacher-student
intermediate layers. However, semantics of intermediate layers may vary in
different networks and manual association of layers might lead to negative
regularization caused by semantic mismatch between certain teacher-student
layer pairs. To address this problem, we propose Semantic Calibration for
Cross-layer Knowledge Distillation (SemCKD), which automatically assigns proper
target layers of the teacher model for each student layer with an attention
mechanism. With a learned attention distribution, each student layer distills
knowledge contained in multiple layers rather than a single fixed intermediate
layer from the teacher model for appropriate cross-layer supervision in
training. Consistent improvements over state-of-the-art approaches are observed
in extensive experiments with various network architectures for teacher and
student models, demonstrating the effectiveness and flexibility of the proposed
attention-based soft layer association mechanism for cross-layer distillation.
Comment: AAAI-202
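The attention-based layer association can be sketched in miniature as below. This is a toy version assuming features are already projected to a common dimension; in SemCKD the attention is learned, whereas here a simple dot-product similarity stands in, and all names are hypothetical:

```python
import numpy as np

def layer_attention(student_feat, teacher_feats):
    """Softmax attention over teacher layers for one student layer,
    using dot-product similarity as a stand-in for the learned scores."""
    scores = np.array([float(student_feat @ t) for t in teacher_feats])
    scores -= scores.max()  # numerical stability
    w = np.exp(scores)
    return w / w.sum()

def cross_layer_loss(student_feat, teacher_feats):
    """Attention-weighted MSE against all teacher layers, so the student
    layer distills from multiple layers rather than one fixed target."""
    w = layer_attention(student_feat, teacher_feats)
    return sum(wi * np.mean((student_feat - t) ** 2)
               for wi, t in zip(w, teacher_feats))
```

The weighting lets semantically closer teacher layers dominate the supervision, which is the intuition behind avoiding negative regularization from mismatched layer pairs.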
PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation
We propose a novel knowledge distillation methodology for compressing deep
neural networks. One of the most efficient methods for knowledge distillation
is hint distillation, where the student model is injected with information
(hints) from several different layers of the teacher model. Although the
selection of hint points can drastically alter the compression performance,
there is no systematic approach for selecting them, other than brute-force
hyper-parameter search. We propose a clustering-based hint selection
methodology, where the layers of the teacher model are clustered with respect
to several metrics and the cluster centers are used as the hint points. The
proposed approach is validated on the CIFAR-100 dataset, with a ResNet-110
network used as the teacher model. Our results show that the hint points
selected by our algorithm result in superior compression performance compared
to state-of-the-art knowledge distillation algorithms on the same student
models and datasets.
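A minimal sketch of clustering-based hint selection, assuming each teacher layer has already been summarized as a metric vector (the hand-rolled k-means, the seed, and all names here are illustrative assumptions, not the authors' code):

```python
import numpy as np

def select_hint_layers(layer_feats, k, iters=50, seed=0):
    """Cluster per-layer metric vectors with a small k-means and return,
    for each cluster, the index of the layer closest to its center.
    Those indices serve as the hint points."""
    rng = np.random.default_rng(seed)
    X = np.asarray(layer_feats, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(0)
    d = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return sorted({int(d[:, c].argmin()) for c in range(k)})
```

The selected layers replace a brute-force hyper-parameter search over all possible hint-point combinations.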
Avatar Knowledge Distillation: Self-ensemble Teacher Paradigm with Uncertainty
Knowledge distillation is an effective paradigm for boosting the performance
of pocket-size models; in particular, when multiple teacher models are
available, the student can break through its usual upper limit. However, it is
not economical to train diverse teacher models for a single, disposable
distillation. In this paper, we
introduce a new concept dubbed Avatars for distillation, which are the
inference ensemble models derived from the teacher. Concretely, (1) For each
iteration of distillation training, various Avatars are generated by a
perturbation transformation. We validate that Avatars have a higher upper
limit of working capacity and teaching ability, aiding the student model in
learning diverse and receptive knowledge perspectives from the teacher model.
(2) During
the distillation, we propose an uncertainty-aware factor from the variance of
statistical differences between the vanilla teacher and Avatars, to adjust
Avatars' contribution to knowledge transfer adaptively. Avatar Knowledge
Distillation (AKD) is fundamentally different from existing methods and
refines them with the innovative view of unequal training. Comprehensive
experiments demonstrate the effectiveness of our Avatars mechanism, which
polishes up state-of-the-art distillation methods for dense prediction without
extra computational cost. AKD brings gains of up to 0.7 AP on COCO 2017 for
object detection and 1.83 mIoU on Cityscapes for semantic segmentation.
Comment: Accepted by ACM MM 202
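The avatar-generation and uncertainty-weighting idea might be sketched as follows. Everything here is an assumption for illustration: avatars are simulated by Gaussian logit perturbation rather than the paper's transformation, and the exp(-variance) weighting is one plausible choice, not the authors' formula:

```python
import numpy as np

def avatar_targets(teacher_logits, n_avatars, noise, rng):
    """Generate avatar predictions by perturbing the teacher's logits,
    then weight each avatar by a factor derived from the variance of its
    deviation from the vanilla teacher (lower deviation variance gets a
    higher weight in this sketch). Returns the blended target and weights."""
    t = np.asarray(teacher_logits, dtype=float)
    avatars = [t + rng.normal(0.0, noise, size=t.shape)
               for _ in range(n_avatars)]
    variances = np.array([np.var(a - t) for a in avatars])
    w = np.exp(-variances)          # uncertainty-aware factor (assumed form)
    w /= w.sum()
    blended = sum(wi * a for wi, a in zip(w, avatars))
    return blended, w
```

Regenerating avatars at every distillation iteration is what gives the student diverse teaching signals from a single trained teacher, without paying for multiple teachers.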