1,824 research outputs found

    Data Optimization in Deep Learning: A Survey

    Large-scale, high-quality data are considered an essential factor for the successful application of many deep learning techniques. Meanwhile, numerous real-world deep learning tasks still have to contend with a lack of sufficient amounts of high-quality data. Additionally, issues such as model robustness, fairness, and trustworthiness are closely related to the training data. Consequently, a large body of existing literature has focused on the data aspect of deep learning tasks. Typical data optimization techniques include data augmentation, logit perturbation, sample weighting, and data condensation. These techniques usually come from different subfields of deep learning, and their theoretical inspirations or heuristic motivations may seem unrelated to each other. This study organizes a wide range of existing data optimization methodologies for deep learning and constructs a comprehensive taxonomy for them. The taxonomy covers several splitting dimensions, and deep sub-taxonomies are constructed for each dimension. On the basis of the taxonomy, connections among the extensive data optimization methods for deep learning are drawn in terms of four aspects. We also identify several promising future directions. The constructed taxonomy and the revealed connections support a better understanding of existing methods and the design of novel data optimization techniques. Furthermore, we hope this survey promotes data optimization as an independent subdivision of deep learning. A curated, up-to-date list of resources related to data optimization in deep learning is available at \url{https://github.com/YaoRujing/Data-Optimization}
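    The abstract names sample weighting among its technique families without giving details. As a rough, hedged illustration only, the sketch below shows per-sample loss weighting in a PyTorch-style training step; it is not code from the survey, and the names model, optimizer, and weight_fn are assumptions.

```python
# Illustrative sketch only: per-sample loss weighting in a PyTorch-style
# training step. The survey does not prescribe this code; all names are
# assumed for illustration.
import torch
import torch.nn.functional as F

def weighted_step(model, optimizer, inputs, targets, weight_fn):
    """One optimization step in which each sample's loss is rescaled by a weight."""
    optimizer.zero_grad()
    logits = model(inputs)
    # Per-sample cross-entropy (no reduction), so each sample can be weighted.
    losses = F.cross_entropy(logits, targets, reduction="none")
    # Weights are computed from detached losses so the weighting rule itself
    # is not backpropagated through.
    weights = weight_fn(losses.detach())
    loss = (weights * losses).mean()
    loss.backward()
    optimizer.step()
    return loss.item()

def emphasize_hard_samples(losses, gamma=1.0):
    """Focal-style example: samples with larger loss receive larger weights."""
    probs = torch.exp(-losses)        # rough proxy for correct-class probability
    return (1.0 - probs) ** gamma
```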

    Learning from imperfect data: incremental learning and few-shot learning

    In recent years, artificial intelligence (AI) has achieved great success in many fields, e.g., computer vision, speech recognition, recommendation engines, and natural language processing. Although impressive advances have been made, AI algorithms still suffer from an important limitation: they rely on large-scale datasets. In contrast, human beings naturally possess the ability to learn novel knowledge from real-world, imperfect data such as a small number of samples or a non-static continual data stream. Attaining such an ability is particularly appealing. Specifically, an ideal AI system with human-level intelligence should work with the following imperfect data scenarios. 1) The training data distribution changes while learning. In many real scenarios, data are streaming, might disappear after a given period of time, or cannot be stored at all due to storage constraints or privacy issues. As a consequence, old knowledge is overwritten, a phenomenon called catastrophic forgetting. 2) The annotations of the training data are sparse. There are also many scenarios where we do not have access to the specific large-scale data of interest due to privacy and security reasons. As a consequence, deep models overfit the training data distribution and are very likely to make wrong decisions when they encounter rare cases. Therefore, the goal of this thesis is to tackle these challenges and develop AI algorithms that can be trained with imperfect data. To achieve this goal, we study three topics: 1) learning with continual data without forgetting (i.e., incremental learning); 2) learning with limited data without overfitting (i.e., few-shot learning); and 3) learning with imperfect data in real-world applications (e.g., incremental object detection). Our key idea is learning to learn/optimize. Specifically, we use advanced learning and optimization techniques to design data-driven methods that dynamically adapt the key elements of AI algorithms, e.g., the selection of data, memory allocation, network architecture, essential hyperparameters, and the control of knowledge transfer. We believe that the adaptive and dynamic design of system elements will significantly improve the capability of deep learning systems under limited data or continual streams, compared to systems with fixed and non-optimized elements. More specifically, we first study how to overcome the catastrophic forgetting problem by learning to optimize exemplar data, allocate memory, aggregate neural networks, and optimize key hyperparameters. Then, we study how to improve the generalization ability of the model and tackle the overfitting problem by learning to transfer knowledge and ensemble deep models. Finally, we study how to apply incremental learning techniques to the recent top-performing transformer-based architecture for a more challenging and realistic vision task, incremental object detection.
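    The thesis learns how to select exemplars and allocate memory; purely for contrast, the sketch below shows a plain fixed-size exemplar replay buffer with random replacement, a common non-learned baseline for mitigating catastrophic forgetting. The class and method names are assumptions and do not reflect the thesis's actual method.

```python
# Illustrative sketch only: a fixed-size exemplar memory with random
# replacement, a simple baseline against catastrophic forgetting.
import random
import torch

class ExemplarMemory:
    def __init__(self, capacity):
        self.capacity = capacity
        self.exemplars = []            # list of (input_tensor, label) pairs

    def add_task(self, dataset):
        """Store samples from a new task, evicting old ones at random when full."""
        for x, y in dataset:
            if len(self.exemplars) < self.capacity:
                self.exemplars.append((x, y))
            else:
                self.exemplars[random.randrange(self.capacity)] = (x, y)

    def replay_batch(self, batch_size):
        """Draw a mini-batch of stored exemplars to mix with current-task data."""
        batch = random.sample(self.exemplars, min(batch_size, len(self.exemplars)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.tensor(ys)
```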

    Knowledge Distillation and Continual Learning for Optimized Deep Neural Networks

    Over the past few years, deep learning (DL) has been achieving state-of-the-art performance on various human tasks such as speech generation, language translation, image segmentation, and object detection. While traditional machine learning models require hand-crafted features, deep learning algorithms can automatically extract discriminative features and learn complex knowledge from large datasets. This powerful learning ability makes deep learning models attractive to both academia and big corporations. Despite their popularity, deep learning methods still have two main limitations: large memory consumption and catastrophic knowledge forgetting. First, DL algorithms use very deep neural networks (DNNs) with many billions of parameters, which have a large model size and a slow inference speed. This restricts the application of DNNs to resource-constrained devices such as mobile phones and autonomous vehicles. Second, DNNs are known to suffer from catastrophic forgetting: when incrementally learning new tasks, the model's performance on old tasks drops significantly. The ability to accommodate new knowledge while retaining previously learned knowledge is called continual learning. Since the real-world environments in which the model operates are always evolving, a robust neural network needs this continual learning ability to adapt to new changes.
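    As context for the compression theme above, the sketch below shows the standard temperature-scaled knowledge distillation loss (soft targets from a large teacher plus hard-label cross-entropy for a small student). This is a generic illustration, not the thesis's implementation; the temperature and mixing weight are assumed values.

```python
# Illustrative sketch only: the common soft-target distillation loss,
# one way to shrink a large DNN into a smaller student network.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=4.0, alpha=0.5):
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, targets)   # ordinary hard-label loss
    return alpha * kd + (1.0 - alpha) * ce
```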

    Differentiated instruction around the world. A global inclusive insight

    With increasingly diverse student populations in schools, the establishment of inclusive classrooms has become a top international priority. Teachers around the world are urged to differentiate their instruction in order to support all students’ learning needs. Although there is research on the topic, there are still important gaps to explore, especially the underrepresented international research output. This book tackles such limitations and provides a first-ever publication offering global insights into differentiated instruction. A total of 14 countries from 5 continents provide empirical evidence and theoretical and practical approaches to the topic. The book wraps up with a contribution from Prof. Dr. John Hattie, University of Melbourne, who shares eight theses to help the continuing debate and research on differentiated instruction.

    A Global Inclusive Insight

    The publication of this work was supported by the Open Access Publication Fund of Humboldt-Universität zu Berlin. 14 different countries, various research methods, 1 topic: differentiated instruction. With increasingly diverse student populations in schools, the establishment of inclusive classrooms has become a top international priority. Teachers around the world are urged to differentiate their instruction in order to support all students’ learning needs. Although there is research on the topic, there are still important gaps to explore, especially the underrepresented international research output. This book tackles such limitations and provides a first-ever publication offering global insights into differentiated instruction. A total of 14 countries from 5 continents provide empirical evidence and theoretical and practical approaches to the topic. The book wraps up with a contribution from Prof. Dr. John Hattie, University of Melbourne, who shares eight theses to help the continuing debate and research on differentiated instruction. Peer reviewed.

    Federated Learning in Computer Vision

    Federated Learning (FL) has recently emerged as a novel machine learning paradigm that preserves privacy and accounts for the distributed nature of the learning process in many real-world settings. Computer vision tasks deal with huge datasets that often raise critical privacy issues, so many federated learning approaches have been presented to exploit FL's distributed and privacy-preserving nature. Firstly, this paper introduces the different FL settings used in computer vision and the main challenges that need to be tackled. Then, it provides a comprehensive overview of the different strategies used for FL in vision applications and presents several approaches for image classification, object detection, semantic segmentation, and for focused settings in face recognition and medical imaging. For the various approaches, the considered FL setting, the employed data and methodologies, and the achieved results are thoroughly discussed.
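    For orientation, the sketch below shows the server-side aggregation step of the FedAvg baseline that many surveyed FL strategies extend: client models are averaged with weights proportional to local dataset size. Client training is assumed to happen elsewhere, and the function and variable names are illustrative assumptions rather than code from the paper.

```python
# Illustrative sketch only: FedAvg-style server aggregation of client
# model state_dicts, weighted by each client's local dataset size.
import copy

def federated_average(client_states, client_sizes):
    """Average client state_dicts, weighting each by its local dataset size."""
    total = float(sum(client_sizes))
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        avg_state[key] = sum(
            state[key].float() * (size / total)
            for state, size in zip(client_states, client_sizes)
        )
    return avg_state

# Typical use after one communication round (names are assumptions):
# global_model.load_state_dict(federated_average(states, sizes))
```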

    Counternarratives of Students with Dis/abilities in One Rural School District

    This is an inquiry into the educational experience of students with dis/abilities who are excluded from the general education classroom in one rural Georgia school district. Theoretically, my dissertation research builds on critical disability studies (Erevelles 2000, 2002, 2005, 2015; also Annamma 2018; Tremain 2005), critical geography (Harvey 2000; Helfenbein, Jr. 2004; Soja 1989, 2010), and curriculum studies (Maudlin 2008; Snowber 2016; Springgay & Freedman 2008; Swanson 2008). Methodologically, building on counternarrative inquiry (Bell 1999; Delgado 1989; He & Ayers 2009; He & Ross 2015; He, Ross, & Seay 2015; Solórzano & Yosso 2002), arts-based research (Barone & Eisner 2006; Coles 1992; also Bae-Dimitriadis 2020), and research conducted with children with dis/abilities (Aslamazova, Yurina Kochendova & Krasnova 2016; Søndergaard & Reventlow 2019; Jenkin, Wilson, Murfitt, Clarke, Campain, & Stockman, 2015; Maxwell 2006), I explore the counternarratives of three students with significant dis/abilities, Kara, Alvin, and Derek, to counter master narratives which devalue, dehumanize, and disenfranchise them. I propose an embodied curriculum within a beloved community (hooks, 1996) and infused with a pedagogy of heart (Freire, 1997) as a replacement for the current curriculum of exclusion and despair. Six findings have emerged from my dissertation research: (1) When conducting research with students with dis/abilities, researchers must create a safe and welcoming space in which their confidentiality is protected and their stories are told through a comfortable medium. (2) Arts-based research transgresses traditional dissertation inquiries to tell the silenced narratives of students with dis/abilities and liberate their voices from the constraints of ableism. (3) Counternarratives empower children with dis/abilities to share valuable insights into their educational experience and speak against the master narrative of ableism and privilege that often disenfranchises and dehumanizes them as deficient, inferior, and failing. (4) Exclusion in education damages the sense of worth and belonging of students with dis/abilities, furthers their marginalization, and sabotages their potential in school and life. (5) There is a demand to engender an embodied curriculum within a beloved community, infused with a pedagogy of heart, that disrupts the ableism inherent in dominant educational structures, practices, and policies for students with intellectual dis/abilities, which prevents them from reaching graduation and thriving in life. (6) Instead of imprisoning the bodies and minds of students with dis/abilities, educators must work with other educational workers such as teachers, administrators, educational staff, parents, students, community workers, and policy makers to develop a culturally relevant pedagogy of caring and justice, cultivate a culturally inspiring school environment, and create hopes, dreams, and equal opportunities for students with dis/abilities and all others to reach their highest potential (Siddle-Walker, 1996).

    Inclusive Classrooms: From Access to Engagement


    Is adolescents’ progress in reading comprehension served by particular attributional views in addition to learning the reading comprehension strategies of reciprocal teaching? A mixed-methods intervention study

    A mixed-methods quasi-experimental design was used to identify relationships between adolescent students’ attributions for their reading performance and their reading achievement by gathering baseline data from year 9 and 10 students (n = 175) and then investigating the effects of two stages of intervention on a treatment group (n = 22) and a comparison group (n = 16). The first stage of intervention used the instructional activity of reciprocal teaching to teach students cognitive strategies to improve reading comprehension. The second stage of the intervention combined ongoing reciprocal teaching with attributional retraining, aimed at developing internal attributions for reading performance; specifically, effort-related attributions rather than attributions focussing on ability. A baseline sample (which included the treatment and comparison samples as well as students from the wider year 9 and 10 cohort) completed a questionnaire about their attributions for their reading performance. There was no evidence of the hypothesised correlation between a measure of students’ incremental mindset (internal, unstable and controllable attribution) and standardised measures of reading comprehension. Analysis of the attribution data for the baseline sample showed evidence that internal and external attributions are not, as theorised, two ends of the same continuum; rather, they are separate constructs, albeit negatively correlated. The treatment and comparison groups completed a standardised reading comprehension test and the attribution questionnaire at four time points: pre-intervention; between the two stages of intervention; post-intervention; and delayed post-intervention. A sub-sample of six students, representing a spectrum of reading achievement, was interviewed to develop a better understanding of the responses provided in the questionnaire. The combined interventions had no significant effect on students’ attributions for their reading performance or on their reading comprehension achievement. Conversely, the first stage of the intervention, reciprocal teaching, did have a significant effect on the treatment group’s reading comprehension achievement immediately following the intervention, and the group were observed eagerly participating in the activity with significantly increased engagement. The combined qualitative and quantitative data from the interventions provided evidence about the complexity of adolescents’ attributional beliefs. Students responded with a wide variety of beliefs that did not conform to the theorised pattern of attributional beliefs. The findings raise questions about how students form attributions for their successes and failures, in particular the direction of the causal relationship between achievement and attributional beliefs.
