Lightweight learning from label proportions on satellite imagery
This work addresses the challenge of producing chip level predictions on
satellite imagery when only label proportions at a coarser spatial geometry are
available, typically from statistical or aggregated data from administrative
divisions (such as municipalities or communes). This kind of tabular data is
usually widely available in many regions of the world and application areas
and, thus, its exploitation may help mitigate the endemic scarcity of
fine grained labelled data in Earth Observation (EO). This can be framed as a
Learning from Label Proportions (LLP) problem setup. LLP applied to EO data is
still an emerging field and performing comparative studies in applied scenarios
remains a challenge due to the lack of standardized datasets. In this work,
first, we show how simple deep learning and probabilistic methods generally
perform better than standard more complex ones, providing a surprising level of
finer grained spatial detail when trained with much coarser label proportions.
Second, we provide a set of benchmarking datasets enabling comparative LLP
applied to EO, providing both fine grained labels and aggregated data according
to existing administrative divisions. Finally, we argue how this approach might
be valuable when considering on-orbit inference and training. Source code is
available at https://github.com/rramosp/llpeo
Comment: 16 pages, 13 figures
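To make the LLP setup above concrete, the following is a minimal sketch of one common way to train against bag-level proportions: average the chip-level class probabilities inside an administrative division and penalise the divergence from the reported proportions. The KL divergence, the function name `bag_proportion_loss`, and the toy numbers are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import math

def bag_proportion_loss(chip_probs, bag_proportions, eps=1e-12):
    """KL divergence between a bag's label proportions and the mean
    of the chip-level predicted class probabilities in that bag.

    chip_probs: list of per-chip probability vectors (one per chip).
    bag_proportions: class proportions reported for the whole bag.
    """
    n_chips = len(chip_probs)
    n_classes = len(bag_proportions)
    # Mean predicted probability of each class over the chips in the bag.
    mean_pred = [sum(p[k] for p in chip_probs) / n_chips for k in range(n_classes)]
    # Classes with zero target proportion contribute nothing to the KL sum.
    return sum(
        q * math.log(q / (m + eps))
        for q, m in zip(bag_proportions, mean_pred)
        if q > 0
    )

# Hypothetical bag of four chips and two classes.
chip_probs = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
matching = bag_proportion_loss(chip_probs, [0.5, 0.5])    # proportions agree
mismatched = bag_proportion_loss(chip_probs, [0.9, 0.1])  # proportions disagree
```

The loss only sees the bag-level mean, so individual chips remain free to take confident, differing predictions as long as their average matches the proportions; this is what allows chip-level detail to emerge from coarse supervision.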
beta-risk: a New Surrogate Risk for Learning from Weakly Labeled Data
During the past few years, the machine learning community has paid attention to developing new methods for learning from weakly labeled data. This field covers different settings like semi-supervised learning, learning with label proportions, multi-instance learning, noise-tolerant learning, etc. This paper presents a generic framework to deal with these weakly labeled scenarios. We introduce the β-risk as a generalized formulation of the standard empirical risk based on surrogate margin-based loss functions. This risk allows us to express the reliability of the labels and to derive different kinds of learning algorithms. We specifically focus on SVMs and propose a soft margin β-SVM algorithm which behaves better than the state of the art
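As an illustration of how a risk can encode label reliability, the sketch below weights a margin-based surrogate loss over both candidate labels of each example. This is one plausible reading of the abstract, not the paper's exact definition: the weighting scheme, the hinge loss, and the names `hinge` and `beta_risk` are all assumptions.

```python
def hinge(margin):
    """Standard hinge surrogate loss on a signed margin."""
    return max(0.0, 1.0 - margin)

def beta_risk(scores, betas, loss=hinge):
    """Reliability-weighted empirical risk (illustrative form).

    scores[i] is the classifier output h(x_i).
    betas[i] is a pair (beta_neg, beta_pos): the weight placed on
    label -1 and on label +1 for example i.
    """
    total = 0.0
    for s, (b_neg, b_pos) in zip(scores, betas):
        # Each example pays the loss for both labels, scaled by its weights.
        total += b_neg * loss(-s) + b_pos * loss(s)
    return total / len(scores)

scores = [2.0, -0.5, 0.3]

# Fully trusted labels (+1, -1, +1): reduces to the usual empirical hinge risk.
supervised = beta_risk(scores, [(0.0, 1.0), (1.0, 0.0), (0.0, 1.0)])

# Third example has an uncertain label: weight split evenly between -1 and +1.
weak = beta_risk(scores, [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)])
```

With one-hot weights the expression collapses to the standard empirical risk, which matches the abstract's description of a generalized formulation; fractional weights let a weakly labeled example contribute to both sides of the margin.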
Proportion constrained weakly supervised histopathology image classification
Multiple instance learning (MIL) deals with data grouped into bags of instances, of which only the global
information is known. In recent years, this weakly supervised learning paradigm has become very popular in
histological image analysis because it alleviates the burden of labeling all cancerous regions of large Whole
Slide Images (WSIs) in detail. However, these methods require large datasets to perform properly, and many
approaches only focus on simple binary classification. This often does not match the real-world problems
where multi-label settings are frequent and possible constraints must be taken into account. In this work, we
propose a novel multi-label MIL formulation based on inequality constraints that is able to incorporate prior
knowledge about instance proportions. Our method has a theoretical foundation in optimization with log-barrier
extensions, applied to bag-level class proportions. This encourages the model to respect the proportion
ordering during training. Extensive experiments on a new public dataset of prostate cancer WSIs analysis,
SICAP-MIL, demonstrate that using the prior proportion information we can achieve instance-level results
similar to supervised methods on datasets of similar size. In comparison with prior MIL settings, our method
allows for ∼ 13% improvements in instance-level accuracy, and ∼ 3% in the multi-label mean area under the
ROC curve at the bag level.
Spanish Government PID2019-105142RB-C2; European Commission 860627; Generalitat Valenciana/European Union through the European Regional Development Fund (ERDF) of the Valencian Community IDIFEDER/2020/030; Universitat Politecnica de Valenci
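The log-barrier extension mentioned in the abstract can be sketched as a smooth penalty for an inequality constraint z ≤ 0: a scaled log barrier where the constraint is comfortably satisfied, continued linearly elsewhere so that it stays finite when the constraint is violated. The piecewise form below is the commonly used log-barrier extension; applying it to an ordering between two predicted class proportions, the temperature `t=5.0`, and the function names are assumptions made for illustration rather than the paper's exact formulation.

```python
import math

def log_barrier_extension(z, t=5.0):
    """Smooth, everywhere-finite penalty for the constraint z <= 0.

    For z <= -1/t^2 it is the usual log barrier -(1/t) * log(-z);
    beyond that point it continues as a straight line, so infeasible
    values receive a large but finite penalty.
    """
    if z <= -1.0 / t**2:
        return -math.log(-z) / t
    return t * z - math.log(1.0 / t**2) / t + 1.0 / t

def proportion_order_penalty(p_larger, p_smaller, t=5.0):
    """Penalty encouraging p_larger >= p_smaller within a bag.

    The ordering is rewritten as p_smaller - p_larger <= 0.
    """
    return log_barrier_extension(p_smaller - p_larger, t)

# Hypothetical predicted bag-level proportions for two classes.
satisfied = proportion_order_penalty(0.7, 0.2)  # ordering respected
violated = proportion_order_penalty(0.2, 0.7)   # ordering broken
```

Because the two branches meet at z = -1/t², the penalty is continuous and can be minimised with ordinary gradient-based training even when the initial predictions violate the constraint.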