Data efficient deep learning for medical image analysis: A survey
The rapid evolution of deep learning has significantly advanced the field of
medical image analysis. However, despite these achievements, the further
enhancement of deep learning models for medical image analysis faces a
significant challenge due to the scarcity of large, well-annotated datasets. To
address this issue, recent years have witnessed a growing emphasis on the
development of data-efficient deep learning methods. This paper conducts a
thorough review of data-efficient deep learning methods for medical image
analysis. To this end, we categorize these methods based on the level of
supervision they rely on, encompassing categories such as no supervision,
inexact supervision, incomplete supervision, inaccurate supervision, and only
limited supervision. We further divide these categories into finer
subcategories. For example, we categorize inexact supervision into multiple
instance learning and learning with weak annotations. Similarly, we categorize
incomplete supervision into semi-supervised learning, active learning,
domain-adaptive learning, and so on. Furthermore, we systematically summarize
commonly used datasets for data-efficient deep learning in medical image
analysis and investigate future research directions to conclude this survey.
Comment: Under Review
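The inexact-supervision category above includes multiple instance learning (MIL), where only bag-level labels are available, e.g. a scan-level diagnosis without patch-level annotations. A minimal sketch of the standard MIL assumption (a bag is positive if at least one of its instances is positive), using hypothetical per-patch scores, might look like:

```python
import numpy as np

def bag_score(instance_scores, pooling="max"):
    """Aggregate per-instance scores into a single bag-level score.

    In multiple instance learning only bag labels are known: a bag is
    positive if at least one of its instances is positive, which the
    max-pooling rule encodes directly.
    """
    s = np.asarray(instance_scores, dtype=float)
    if pooling == "max":    # standard MIL assumption
        return float(s.max())
    if pooling == "mean":   # smoother alternative
        return float(s.mean())
    raise ValueError(f"unknown pooling: {pooling}")

def predict_bag(instance_scores, threshold=0.5):
    """Label a bag positive if its pooled score exceeds the threshold."""
    return bag_score(instance_scores) >= threshold

# Hypothetical bags of image-patch scores (e.g. lesion probabilities):
positive_bag = [0.1, 0.05, 0.9, 0.2]  # one suspicious patch -> positive bag
negative_bag = [0.1, 0.2, 0.15]       # no suspicious patch -> negative bag
```

The pooling choice is the key design decision: max-pooling matches the MIL assumption exactly but backpropagates through a single instance, while mean-pooling spreads the signal across the bag.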
Prioritising references for systematic reviews with RobotAnalyst: A user study
Screening references is a time-consuming step necessary for systematic reviews and guideline development. Previous studies have shown that human effort can be reduced by using machine learning software to prioritise large reference collections such that most of the relevant references are identified before screening is completed. We describe and evaluate RobotAnalyst, a Web-based software system that combines text-mining and machine learning algorithms for organising references by their content and actively prioritising them based on a relevancy classification model trained and updated throughout the process. We report an evaluation over 22 reference collections (most are related to public health topics) screened using RobotAnalyst with a total of 43 610 abstract-level decisions. The number of references that needed to be screened to identify 95% of the abstract-level inclusions for the evidence review was reduced on 19 of the 22 collections. Significant gains over random sampling were achieved for all reviews conducted with active prioritisation, as compared with only two of five when prioritisation was not used. RobotAnalyst's descriptive clustering and topic modelling functionalities were also evaluated by public health analysts. Descriptive clustering provided more coherent organisation than topic modelling, and the content of the clusters was apparent to the users across a varying number of clusters. This is the first large-scale study using technology-assisted screening to perform new reviews, and the positive results provide empirical evidence that RobotAnalyst can accelerate the identification of relevant studies. The results also highlight the issue of user complacency and the need for a stopping criterion to realise the work savings.
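The active prioritisation loop described above can be sketched as a simple relevance-feedback simulation: screen a batch, update a relevance model from the decisions, re-rank what remains. The profile-based scorer below is an illustrative stand-in, not RobotAnalyst's actual classifier:

```python
from collections import Counter

def score(doc_tokens, include_profile, exclude_profile):
    """Score a document by word overlap with included minus excluded references."""
    return sum(include_profile[t] - exclude_profile[t] for t in doc_tokens)

def prioritised_screening(docs, labels, batch_size=1):
    """Simulate screening with active prioritisation.

    docs:   list of token lists (one per reference).
    labels: hidden ground-truth decisions (1 = include, 0 = exclude),
            revealed only when a reference is screened.
    Returns the order in which references are screened.
    """
    include_profile, exclude_profile = Counter(), Counter()
    remaining = list(range(len(docs)))
    order = []
    while remaining:
        # Re-rank unscreened references under the current relevance model.
        remaining.sort(key=lambda i: score(docs[i], include_profile,
                                           exclude_profile), reverse=True)
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for i in batch:
            order.append(i)  # "screen" this reference
            profile = include_profile if labels[i] == 1 else exclude_profile
            profile.update(docs[i])  # update the model with the decision
    return order
```

With a sensible model, relevant references cluster at the front of the screening order, which is exactly the effect the 95%-recall evaluation above measures.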
The Emerging Trends of Multi-Label Learning
Exabytes of data are generated daily by humans, leading to the growing need
for new efforts in dealing with the grand challenges for multi-label learning
brought by big data. For example, extreme multi-label classification is an
active and rapidly growing research area that deals with classification tasks
with an extremely large number of classes or labels; utilizing massive data
with limited supervision to build a multi-label classification model is
valuable for practical applications; and so on. Beyond these, tremendous
effort has gone into harnessing the strong learning capability of deep
learning to better capture label dependencies, which is key for deep learning
to address real-world multi-label classification tasks. However, there has
been a lack of systematic studies focusing explicitly on the emerging trends
and new challenges of multi-label learning in the era of big data. It is
imperative to call for a comprehensive survey to fulfill this mission and
delineate future research directions and new applications.
Comment: Accepted to TPAMI 202
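One reason label dependencies matter shows up already in standard multi-label evaluation: subset accuracy penalises any single wrong label in a sample, while Hamming loss treats each label position independently. A minimal sketch of both metrics over 0/1 label matrices:

```python
def hamming_loss(y_true, y_pred):
    """Fraction of label positions that disagree, averaged over all samples."""
    n, n_labels = len(y_true), len(y_true[0])
    wrong = sum(t != p
                for row_t, row_p in zip(y_true, y_pred)
                for t, p in zip(row_t, row_p))
    return wrong / (n * n_labels)

def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label set is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A model that predicts labels independently can score well on Hamming loss yet poorly on subset accuracy, which is precisely the gap that dependency-aware multi-label methods aim to close.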
Recent Advances of Local Mechanisms in Computer Vision: A Survey and Outlook of Recent Work
Inspired by the fact that human brains can emphasize discriminative parts of
the input and suppress irrelevant ones, substantial local mechanisms have been
designed to boost the development of computer vision. They can not only focus
on target parts to learn discriminative local representations, but also process
information selectively to improve efficiency. Local mechanisms exhibit
different characteristics across application scenarios and paradigms. In
this survey, we provide a systematic review of local mechanisms for various
computer vision tasks and approaches, including fine-grained visual
recognition, person re-identification, few-/zero-shot learning, multi-modal
learning, self-supervised learning, Vision Transformers, and so on.
We summarize a categorization of local mechanisms in each field, then analyze
the advantages and disadvantages of every category in depth, leaving room for
further exploration. Finally, we discuss future research directions on local
mechanisms that may benefit future work. To the best of our knowledge, this is
the first survey on local mechanisms in computer vision. We hope that it can
shed light on future research in the field.
Learning Mid-Level Representations for Visual Recognition
The objective of this thesis is to enhance visual recognition for objects and scenes
through the development of novel mid-level representations and accompanying learning
algorithms. In particular, this work focuses on category-level recognition, which
is still a very challenging and mainly unsolved task. One crucial component in visual
recognition systems is the representation of objects and scenes. However, depending on
the representation, suitable learning strategies need to be developed that make it possible
to learn new categories automatically from training data. Therefore, the aim of this thesis
is to extend low-level representations by mid-level representations and to develop suitable
learning mechanisms.
A popular kind of mid-level representation is higher-order statistics such as
self-similarity and co-occurrence statistics. While these descriptors satisfy the
demand for higher-level object representations, they also exhibit very large and
ever-increasing dimensionality. In this thesis a new object representation, based on curvature
self-similarity, is suggested that goes beyond the currently popular approximation of
objects using straight lines. However, like all descriptors based on second-order
statistics, it exhibits high dimensionality. Although this improves discriminability,
the high dimensionality becomes a critical issue due to the lack of generalization
ability and the curse of dimensionality. Given only a limited amount of training data, even sophisticated
learning algorithms such as the popular kernel methods are not able to suppress noisy or
superfluous dimensions of such high-dimensional data. Consequently, there is a natural
need for feature selection when using present-day informative features and, particularly,
curvature self-similarity. We therefore suggest an embedded feature selection method for
support vector machines that reduces complexity and improves generalization capability
of object models. The proposed curvature self-similarity representation is successfully
integrated together with the embedded feature selection in a widely used state-of-the-art
object detection framework.
The influence of higher-order statistics on category-level object recognition is
further investigated by learning co-occurrences between foreground and background
to reduce the number of false detections. While the suggested curvature
self-similarity descriptor improves the foreground model with a more detailed
description, higher-order statistics are now shown to be suitable for explicitly
modeling the background as well.
This is of particular use for the popular chamfer matching technique, since it is prone
to accidental matches in dense clutter. As clutter only interferes with the foreground
model contour, we learn where to place the background contours with respect to the
foreground object boundary. The co-occurrence of background contours is integrated
into a max-margin framework. Thus the suggested approach combines the advantages of
accurately detecting object parts via chamfer matching and the robustness of max-margin
learning.
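In its simplest form, the chamfer matching score used above is just the average distance from each template contour point to its nearest image edge point. The sketch below (hypothetical point sets, not the thesis implementation) also makes the clutter problem visible: any nearby clutter edge lowers the score and can produce an accidental match.

```python
import math

def chamfer_distance(template, edges):
    """Average distance from each template point to its nearest edge point.

    template, edges: lists of (x, y) points. A low value means the template
    contour is well supported by image edges; chamfer matching minimises
    this quantity over template placements.
    """
    def nearest(p):
        return min(math.dist(p, q) for q in edges)
    return sum(nearest(p) for p in template) / len(template)
```

Because every edge point counts as support regardless of what produced it, dense clutter yields many small nearest-neighbour distances, which is exactly why the background co-occurrence model above is needed to suppress accidental matches.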
While chamfer matching is a very efficient technique for object detection, parts are
only detected based on a simple distance measure. In contrast, mid-level parts and
patches are explicitly trained to distinguish true positives in the foreground from false
positives in the background. Due to the independence of mid-level patches and parts it
is possible to train a large number of instance specific part classifiers. This is contrary
to the current most powerful discriminative approaches that are typically only feasible
for a small number of parts, as they are modeling the spatial dependencies between
them. Due to their number, we cannot directly train a powerful classifier to combine
all parts. Instead, parts are randomly grouped into fewer, overlapping compositions that
are trained using a maximum-margin approach. In contrast to the common rationale of
compositional approaches, we do not aim for semantically meaningful ensembles. Rather
we seek randomized compositions that are discriminative and generalize over all instances
of a category. Compositions are all combined by a non-linear decision function which is
completing the powerful hierarchy of discriminative classifiers.
In summary, this thesis improves visual recognition of objects and scenes by
developing novel mid-level representations on top of different kinds of low-level
representations. Furthermore, it investigates the development of suitable learning
algorithms to deal with the new challenges arising from the novel object
representations presented in this work.
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has been responsible for a paradigm shift in natural language
processing (NLP), information retrieval (IR), and beyond. In this survey, we
provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
architectures and dense retrieval techniques that perform ranking directly.
There are two themes that pervade our survey: techniques for handling long
documents, beyond typical sentence-by-sentence processing in NLP, and
techniques for addressing the tradeoff between effectiveness (i.e., result
quality) and efficiency (e.g., query latency, model and index size). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
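A concrete instance of the first stage in the multi-stage architectures above is BM25-style lexical retrieval, which produces the candidate scores a transformer reranker then refines. A self-contained sketch (with a simplified IDF variant, over toy token lists):

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against a query with a BM25-style function.

    query: list of tokens; docs: list of token lists. In a multi-stage
    ranking pipeline, the top-k documents by this score become the
    candidate set passed to a (much more expensive) neural reranker.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n          # average document length
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # Term-frequency saturation (k1) and length normalisation (b).
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores
```

The effectiveness/efficiency tradeoff discussed above lives exactly at this boundary: a larger candidate set improves the reranker's recall ceiling but multiplies query latency.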
Precision Standard Model Phenomenology for High Energy Processes
The present status of particle physics is that the Standard Model has been completed with the discovery of the Higgs boson in 2012, but there is a multitude of phenomena in nature which is not accounted for by this model. Researchers are investigating possibilities for detecting new physics at the current particle physics facilities, with the Large Hadron Collider (LHC) at the frontier. As no significant sign of new physics has been observed as of today, precision phenomenology becomes increasingly important. This thesis and the four papers included in it contribute to this field of precision predictions for various important processes at the LHC.
In paper I and paper IV, the Drell-Yan process is investigated, and specifically the decay coefficients which parameterize the spherical distribution of the outgoing leptons in the process. In the first work, we investigate the next-to-leading-order (NLO) electroweak corrections to the coefficients of the neutral-current process. In the second work, a similar study, but also including next-to-next-to-leading-order quantum chromodynamic (QCD) corrections, is performed for the decay coefficients of the charged-current Drell-Yan process. The latter process and the corresponding coefficients are of great importance for measuring the W boson mass at the LHC.
In paper II, top quark pair production and the spin correlations for the process are investigated. The spin correlation information of the top quarks may reveal underlying new physics when probed at high precision. Therefore, this work computes approximate complete-NLO corrections, including electroweak corrections to the spin correlation coefficients and related leptonic distributions, contributing to the state-of-the-art high-precision Standard Model predictions for these observables.
Finally, paper III is the theoretical base of a crucial improvement to matrix-element generators. We propose in this paper to utilize a next-to-leading-colour truncation of the colour matrix in the large-Nc limit, in order to reduce the complexity of the cross-section computation when a large number of QCD partons are involved in the process. The results suggest that such a truncation of the colour expansion will facilitate efficient computation of multi-jet events, which are a dominant background for many important processes and new physics searches at hadron colliders, such as the LHC.
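For context, the decay coefficients investigated in papers I and IV usually refer to the standard angular decomposition of the lepton-pair distribution in the Collins-Soper frame. The form below is sketched from the standard literature, not taken from the thesis itself:

```latex
\frac{d\sigma}{dq_T^2\,dy\,dm\,d\Omega} \;\propto\;
(1+\cos^2\theta)
+ \frac{A_0}{2}\,(1-3\cos^2\theta)
+ A_1 \sin 2\theta \cos\phi
+ \frac{A_2}{2}\,\sin^2\theta \cos 2\phi
+ A_3 \sin\theta \cos\phi
+ A_4 \cos\theta
+ A_5 \sin^2\theta \sin 2\phi
+ A_6 \sin 2\theta \sin\phi
+ A_7 \sin\theta \sin\phi
```

Here $\theta$ and $\phi$ are the lepton angles in the chosen frame, and the coefficients $A_0,\dots,A_7$ depend on the boson kinematics; their sensitivity to $A_4$ (forward-backward asymmetry) and the low-$q_T$ behaviour of the others is what makes them relevant to the W mass measurement mentioned above.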