Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and makes it possible to leverage weakly labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas is described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight into how
the problem characteristics affect MIL algorithms, recommendations for future
benchmarking, and promising avenues for research.
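To make the bag formulation in the abstract above concrete, here is a minimal sketch. It assumes the standard MIL assumption (a bag is positive if at least one of its instances is positive), which the abstract does not state. The function names `bag_label` and `bag_score_max` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Derive a bag label from (normally hidden) instance labels.

    Assumption: the standard MIL rule, where a bag is positive (1)
    if at least one instance is positive, and negative (0) otherwise.
    """
    return int(any(label == 1 for label in instance_labels))


def bag_score_max(instance_scores: Sequence[float]) -> float:
    """Pool per-instance classifier scores into one bag score via max pooling."""
    if not instance_scores:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_scores)


# Two toy bags: only the bag-level labels would be visible during training.
bags = [[0, 0, 1], [0, 0, 0]]
labels = [bag_label(bag) for bag in bags]
```

Max pooling is only one of several possible aggregation choices; mean pooling or attention-style weighting could be substituted when the standard assumption does not hold.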
Building the Big Society
Papers are a contribution to the debate and set out the authors’ views only. Localism and the Big Societ
(Machine) Learning to Do More with Less
Determining the best method for training a machine learning algorithm is
critical to maximizing its ability to classify data. In this paper, we compare
the standard "fully supervised" approach (that relies on knowledge of
event-by-event truth-level labels) with a recent proposal that instead utilizes
class ratios as the only discriminating information provided during training.
This so-called "weakly supervised" technique has access to less information
than the fully supervised method and yet is still able to yield impressive
discriminating power. In addition, weak supervision seems particularly well
suited to particle physics since quantum mechanics is incompatible with the
notion of mapping an individual event onto any single Feynman diagram. We
examine the technique in detail -- both analytically and numerically -- with a
focus on the robustness to issues of mischaracterizing the training samples.
Weakly supervised networks turn out to be remarkably insensitive to systematic
mismodeling. Furthermore, we demonstrate that the event level outputs for
weakly versus fully supervised networks are probing different kinematics, even
though the numerical quality metrics are essentially identical. This implies
that it should be possible to improve the overall classification ability by
combining the output from the two types of networks. For concreteness, we apply
this technology to a signature of beyond the Standard Model physics to
demonstrate that all these impressive features continue to hold in a scenario
of relevance to the LHC. Comment: 32 pages, 12 figures. Example code is provided at
https://github.com/bostdiek/PublicWeaklySupervised . v3: Version published in
JHEP, discussion adde
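As an illustration of training from class ratios alone, as described in the abstract above, here is a minimal sketch of a proportion-matching loss. It penalizes the gap between the mean predicted score over a batch and the known signal fraction of that batch. Whether this matches the exact loss used in the paper is an assumption, and the name `proportion_loss` is hypothetical.

```python
from typing import Sequence


def proportion_loss(scores: Sequence[float], signal_fraction: float) -> float:
    """Absolute gap between the mean predicted score and the known class ratio.

    scores: per-event classifier outputs in [0, 1] for one batch.
    signal_fraction: known fraction of signal events in that batch.
    No per-event truth labels are used anywhere in this loss.
    """
    if not scores:
        raise ValueError("batch must not be empty")
    mean_score = sum(scores) / len(scores)
    return abs(mean_score - signal_fraction)


# A batch known to contain 25% signal: the loss is zero when the mean score matches.
example = proportion_loss([1.0, 0.0, 0.0, 0.0], 0.25)
```

A loss of this kind only constrains the batch average, so in practice it would be applied to several batches with different signal fractions to make the per-event output informative.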
Undocumented Migrants in Resistance against Detention: Comparative Observations on Germany and France
Although the immigration policies of Germany and France share a similarly restrictive approach, the manner in which migrants protest against such policies and resist their implementation is strikingly different. This is particularly obvious for undocumented migrants. In France, collective action by undocumented migrants has received increasing public attention over the last two decades, and detention centres have been a foremost target of such action. Resistance against detention prior to deportation culminated in the closure of the country's biggest detention centre in 2008. By contrast, undocumented migrants have hardly ever protested against their condition in Germany. Although collective action against immigration policies has reached a new level with the “Refugee Tent Action” occupying public space in Berlin and elsewhere since 2012, it continues to focus mainly on the living conditions of asylum seekers, not undocumented migrants. This discrepancy may be explained by the existence of different institutional conditions for collective action, i.e. political opportunity structures relating to state regulations and measures. A comparative analysis of these conditions shows that weaker resistance against immigration detention in Germany may be due to the existence of comparatively more repressive and controlling immigration laws, a flexible toleration status that provides its holders with basic social security, and the scarcity of options for legalisation. The combination of harsh repression and little prospect of legalisation makes resistance appear much riskier. The risks are greater still for holders of a toleration status, since its granting is, to some extent, subject to administrative discretion. The toleration status thus tends to divide the people likely to engage in collective action.
Knowledge of these differences may help undocumented migrants and their supporters in both countries to develop more effective strategies of resistance against restrictive policies.
A Dataset for Movie Description
Descriptive video service (DVS) provides linguistic descriptions of movies
and allows visually impaired people to follow a movie along with their peers.
Such descriptions are by design mainly visual and thus naturally form an
interesting data source for computer vision and computational linguistics. In
this work we propose a novel dataset containing transcribed DVS that is
temporally aligned to full-length HD movies. In addition, we collected the
aligned movie scripts which have been used in prior work and compare the two
different sources of descriptions. In total the Movie Description dataset
contains a parallel corpus of over 54,000 sentences and video snippets from 72
HD movies. We characterize the dataset by benchmarking different approaches for
generating video descriptions. Comparing DVS to scripts, we find that DVS is
far more visual and describes precisely what is shown rather than what should
happen according to the scripts created prior to movie production.
Rethinking the International Monetary System: an overview
Monetary policy ; International finance