LIRA: Lifelong Image Restoration from Unknown Blended Distortions
Most existing image restoration networks are designed in a disposable way and
catastrophically forget previously learned distortions when trained on a new
distortion removal task. To alleviate this problem, we introduce the novel
lifelong image restoration problem for blended distortions. We first design a
base fork-join model in which multiple pre-trained expert models, each
specializing in an individual distortion removal task, work cooperatively and
adaptively to handle blended distortions. When the input is degraded by a new
distortion, inspired by adult neurogenesis in the human memory system, we
develop a neural growing
strategy where the previously trained model can incorporate a new expert branch
and continually accumulate new knowledge without interfering with learned
knowledge. Experimental results show that the proposed approach can not only
achieve state-of-the-art performance on blended distortion removal tasks in
terms of both PSNR and SSIM, but also maintain old expertise while learning new
restoration tasks.
Comment: Accepted at ECCV 2020
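The fork-join and neural-growing ideas above can be illustrated with a toy sketch. This is a hedged approximation only: the experts here are single linear maps with an averaging join, whereas the paper's experts are full restoration CNNs with an adaptive fusion mechanism.

```python
import numpy as np

rng = np.random.default_rng(0)

class ForkJoinRestorer:
    """Toy fork-join model: parallel expert branches (one per distortion type)
    whose outputs are joined by averaging. Shapes and the join rule are
    assumptions for illustration, not the paper's architecture."""

    def __init__(self, dim=8):
        self.dim = dim
        self.experts = [rng.standard_normal((dim, dim)) * 0.1]  # "pre-trained" expert
        self.trainable = [True]

    def grow(self):
        # Neural-growing step: freeze all existing experts, then attach a fresh
        # branch so new knowledge accumulates without overwriting old expertise.
        self.trainable = [False] * len(self.experts)
        self.experts.append(rng.standard_normal((self.dim, self.dim)) * 0.1)
        self.trainable.append(True)

    def restore(self, x):
        # Fork: every expert processes the input; join: average their residuals.
        return x + np.mean([x @ w for w in self.experts], axis=0)

net = ForkJoinRestorer()
net.grow()                     # a new distortion arrives: add one expert branch
y = net.restore(rng.standard_normal(8))
print(y.shape, sum(net.trainable))  # (8,) 1
```

After `grow()`, only the newest branch is marked trainable, which is the mechanism that prevents interference with previously learned knowledge.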
Modular Deep Learning
Transfer learning has recently become the dominant paradigm of machine
learning. Pre-trained models fine-tuned for downstream tasks achieve better
performance with fewer labelled examples. Nonetheless, it remains unclear how
to develop models that specialise towards multiple tasks without incurring
negative interference and that generalise systematically to non-identically
distributed tasks. Modular deep learning has emerged as a promising solution to
these challenges. In this framework, units of computation are often implemented
as autonomous parameter-efficient modules. Information is conditionally routed
to a subset of modules and subsequently aggregated. These properties enable
positive transfer and systematic generalisation by separating computation from
routing and updating modules locally. We offer a survey of modular
architectures, providing a unified view over several threads of research that
evolved independently in the scientific literature. Moreover, we explore
various additional purposes of modularity, including scaling language models,
causal inference, programme induction, and planning in reinforcement learning.
Finally, we report various concrete applications where modularity has been
successfully deployed such as cross-lingual and cross-modal knowledge transfer.
Talks and projects related to this survey are available at
https://www.modulardeeplearning.com/
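The core loop the abstract describes, conditionally routing an input to a subset of modules and aggregating their outputs, can be sketched as follows. The gating scores here are random stand-ins for a learned gating network, and the linear "modules" stand in for parameter-efficient adapters; both are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def route_and_aggregate(x, modules, gate_scores, k=2):
    """Conditional routing: send the input only to the top-k scored modules,
    then aggregate their outputs with softmax weights."""
    top = np.argsort(gate_scores)[-k:]       # select a subset of modules
    w = np.exp(gate_scores[top])
    w /= w.sum()                             # softmax over the selected scores
    return sum(wi * modules[i](x) for wi, i in zip(w, top))

# Eight tiny linear "modules"; each could be a parameter-efficient adapter.
modules = [lambda x, W=rng.standard_normal((4, 4)): x @ W for _ in range(8)]
x = rng.standard_normal(4)
scores = rng.standard_normal(len(modules))   # stand-in for a learned router
y = route_and_aggregate(x, modules, scores, k=2)
print(y.shape)  # (4,)
```

Because only the selected modules receive the input (and, in training, gradients), updates stay local to those modules, which is what enables the positive transfer the survey highlights.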
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
Transformer-based Large Language Models (LLMs) have been applied in diverse
areas such as knowledge bases, human interfaces, and dynamic agents, marking a
stride towards achieving Artificial General Intelligence (AGI).
However, current LLMs are predominantly pretrained on short text snippets,
which compromises their effectiveness in processing the long-context prompts
that are frequently encountered in practical scenarios. This article offers a
comprehensive survey of the recent advancement in Transformer-based LLM
architectures aimed at enhancing the long-context capabilities of LLMs
throughout the entire model lifecycle, from pre-training through to inference.
We first delineate and analyze the problems of handling long-context input and
output with current Transformer-based models. We then provide a taxonomy of
upgrades to the Transformer architecture that address these problems, and map
the resulting landscape. Afterwards, we investigate widely used evaluation
resources tailored for long-context LLMs, including datasets, metrics, and
baseline models, as well as optimization toolkits such as libraries,
frameworks, and compilers that boost the efficiency of LLMs across different
runtime stages. Finally, we discuss the challenges and potential avenues for future
research. A curated repository of relevant literature, continuously updated, is
available at https://github.com/Strivin0311/long-llms-learning.
Comment: 40 pages, 3 figures, 4 tables
Multi-scale convolutional neural network for automated AMD classification using retinal OCT images
BACKGROUND AND OBJECTIVE: Age-related macular degeneration (AMD) is the most common cause of blindness in developed countries, especially in people over 60 years of age. The workload of specialists and the healthcare system in this field has increased in recent years, mainly due to three reasons: 1) the increased use of the retinal optical coherence tomography (OCT) imaging technique, 2) the prevalence of population aging worldwide, and 3) the chronic nature of AMD. Recent advancements in the field of deep learning have provided a unique opportunity for the development of fully automated diagnosis frameworks. Considering the presence of AMD-related retinal pathologies of varying sizes in OCT images, our objective was to propose a multi-scale convolutional neural network (CNN) that can capture inter-scale variations and improve performance using a feature fusion strategy across convolutional blocks.
METHODS: Our proposed method introduces a multi-scale CNN based on the feature pyramid network (FPN) structure. This method is used for the reliable diagnosis of normal and two common clinical characteristics of dry and wet AMD, namely drusen and choroidal neovascularization (CNV). The proposed method is evaluated on the national dataset gathered at Hospital (NEH) for this study, consisting of 12649 retinal OCT images from 441 patients, and the UCSD public dataset, consisting of 108312 OCT images from 4686 patients.
RESULTS: Experimental results show the superior performance of our proposed multi-scale structure over several well-known OCT classification frameworks. This feature combination strategy has proved to be effective on all tested backbone models, with improvements ranging from 0.4% to 3.3%. In addition, gradual learning has proved to be effective in improving performance in two consecutive stages. In the first stage, the performance was boosted from 87.2%±2.5% to 92.0%±1.6% using pre-trained ImageNet weights. In the second stage, another performance boost from 92.0%±1.6% to 93.4%±1.4% was observed as a result of fine-tuning the previous model on the UCSD dataset. Lastly, generating heatmaps provided additional proof for the effectiveness of our multi-scale structure, enabling the detection of retinal pathologies appearing in different sizes.
CONCLUSION: The promising quantitative results of the proposed architecture, along with qualitative evaluations through generating heatmaps, prove the suitability of the proposed method to be used as a screening tool in healthcare centers, assisting ophthalmologists in making better diagnostic decisions.
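The FPN-style multi-scale fusion the abstract relies on can be sketched in a few lines: the coarser feature map is upsampled and merged into the finer one, so each level mixes information across scales. Nearest-neighbour upsampling and element-wise addition here are simplifying assumptions standing in for the learned lateral and fusion convolutions of a real FPN.

```python
import numpy as np

def fuse_pyramid(features):
    """FPN-style top-down fusion over a list of feature maps ordered from
    finest to coarsest. Each finer level receives the upsampled, already-fused
    coarser level, combining information from multiple scales."""
    fused = [features[-1]]                        # start from the coarsest level
    for f in reversed(features[:-1]):
        up = np.kron(fused[0], np.ones((2, 2)))   # 2x nearest-neighbour upsample
        fused.insert(0, f + up)                   # merge into the finer level
    return fused

# Three toy feature maps at decreasing resolution (finest first).
feats = [np.ones((8, 8)), np.ones((4, 4)), np.ones((2, 2))]
out = fuse_pyramid(feats)
print([f.shape for f in out])  # [(8, 8), (4, 4), (2, 2)]
```

The finest output level has accumulated contributions from all three scales, which is what lets such a network respond to pathologies of different sizes, such as small drusen versus larger CNV lesions.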
Mapping (Dis-)Information Flow about the MH17 Plane Crash
Digital media enables not only fast sharing of information, but also
disinformation. One prominent case of an event leading to circulation of
disinformation on social media is the MH17 plane crash. Studies analysing the
spread of information about this event on Twitter have focused on small,
manually annotated datasets or used proxies for data annotation. In this work,
we examine to what extent text classifiers can be used to label data for
subsequent content analysis; in particular, we focus on predicting pro-Russian
and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though
we find that a neural classifier improves over a hashtag based baseline,
labeling pro-Russian and pro-Ukrainian content with high precision remains a
challenging problem. We provide an error analysis underlining the difficulty of
the task and identify factors that might help improve classification in future
work. Finally, we show how the classifier can facilitate the annotation task
for human annotators.
Machine Learning in Robotic Ultrasound Imaging: Challenges and Perspectives
This article reviews the recent advances in intelligent robotic ultrasound
(US) imaging systems. We commence by presenting the commonly employed robotic
mechanisms and control techniques in robotic US imaging, along with their
clinical applications. Subsequently, we focus on the deployment of machine
learning techniques in the development of robotic sonographers, emphasizing
crucial developments aimed at enhancing the intelligence of these systems. The
methods for achieving autonomous action reasoning are categorized into two sets
of approaches: those relying on implicit environmental data interpretation and
those using explicit interpretation. Throughout this exploration, we also
discuss practical challenges, including those related to the scarcity of
medical data, the need for a deeper understanding of the physical aspects
involved, and effective data representation approaches. Moreover, we conclude
by highlighting the open problems in the field and analyzing different possible
perspectives on how the community could move forward in this research area.
Comment: Accepted by Annual Review of Control, Robotics, and Autonomous
Systems