45,021 research outputs found

An empirical evaluation of imbalanced data strategies from a practitioner's point of view

Author: Franceschinell Rodrigo A.
Wainer Jacques
Publication date: 16/10/2018

This research tested the following well-known strategies for dealing with binary imbalanced data on 82 different real-life data sets (sampled to imbalance rates of 5%, 3%, 1%, and 0.1%): class weight, SMOTE, Underbagging, and a baseline (just the base classifier). As base classifiers we used SVM with RBF kernel, random forests, and gradient boosting machines, and we measured the quality of the resulting classifier using 6 different metrics: area under the curve (AUC), accuracy, F-measure, G-mean, Matthews correlation coefficient (MCC), and balanced accuracy. The best strategy strongly depends on the metric used to measure the quality of the classifier. For AUC and accuracy, class weight and the baseline perform better; for F-measure and MCC, SMOTE performs better; and for G-mean and balanced accuracy, underbagging.
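The abstract's point that the best strategy depends on the metric can be illustrated with a small sketch. The code below is not taken from the paper: it computes five of the six listed metrics from confusion-matrix counts using their standard definitions (AUC is omitted because it needs classifier scores rather than counts). The function name `imbalance_metrics` and the two sets of counts are hypothetical, chosen only to mimic a 1% imbalance rate.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not code from the paper.
    """
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # MCC is conventionally reported as 0 when its denominator vanishes.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "f_measure": f_measure,
        "g_mean": math.sqrt(recall * specificity),
        "mcc": mcc,
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical counts: 1000 examples, 10 positives (1% imbalance).
# Classifier A predicts the majority class for everything.
always_negative = imbalance_metrics(tp=0, fp=0, tn=990, fn=10)
# Classifier B finds most positives at the cost of many false alarms.
trade_off = imbalance_metrics(tp=8, fp=90, tn=900, fn=2)

for name, scores in [("always negative", always_negative),
                     ("trade-off", trade_off)]:
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

With these made-up counts, the majority-class predictor has the higher accuracy, while the second classifier has the higher G-mean and balanced accuracy. That is the kind of metric-dependent ranking the abstract describes.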

arXiv.org e-Print Archive

Detecting animals in African Savanna with UAVs and the crowds

Author: Joost Stéphane
Rey Nicolas
Tuia Devis
Volpi Michele
Publication venue: Elsevier BV
Publication date: 06/09/2017

Unmanned aerial vehicles (UAVs) offer new opportunities for wildlife monitoring, with several advantages over traditional field-based methods. They have readily been used to count birds, marine mammals and large herbivores in different environments, tasks which are routinely performed through manual counting in large collections of images. In this paper, we propose a semi-automatic system able to detect large mammals in semi-arid Savanna. It relies on an animal-detection system based on machine learning, trained with crowd-sourced annotations provided by volunteers who manually interpreted sub-decimeter resolution color images. The system achieves a high recall rate, and a human operator can then eliminate false detections with limited effort. Our system offers good prospects for the development of data-driven management practices in wildlife conservation. It shows that the detection of large mammals in semi-arid Savanna can be approached by processing data provided by standard RGB cameras mounted on affordable fixed-wing UAVs.

arXiv.org e-Print Archive

Wageningen University & Research Publications

Mass Litigation Governance in the Post-Class Action Era: The Problems and Promise of Non-removable State Actions in Multi-district Litigation

Author: Glover J. Maria
Publication venue: Scholarship @ GEORGETOWN LAW
Publication date: 17/04/2014

Given a string of decisions restricting the use and availability of the class action device, the world of mass litigation may well be moving into a post-class action era. In this era, newer devices of aggregation—perhaps principally among them multi-district litigation (“MDL”)—increasingly will be called upon to meet the age-old mass litigation goal of achieving global peace of numerous claims arising out of a related, widespread harm. Indeed, coordination of pretrial proceedings in the MDL frequently facilitates the achievement of this peace, given the reality that cases, once consolidated in the MDL, often settle en masse. However, one clear obstacle to the achievement of aggregate peace in the MDL, one that also plagues the achievement of that peace in the class action world, is our federal system of substantive and procedural law. In the MDL context, the problem arises because litigation involving state-law claims and non-diverse parties, which are not removable from state court, cannot be transferred to the MDL court. Despite their prevalence, little scholarly attention has been devoted to non-removable state-court actions in MDL. The few responses to this issue have largely focused upon the efficiencies that could be gained through increased, and perhaps total, consolidation of all related cases or, short of consolidation, through heightened coordination of pre-trial proceedings between state and federal judges. This article questions whether these responses have led reform proposals in the wrong direction, and instead takes a different view. Rather than argue for increased consolidation, I offer for further consideration the possible ways in which the happenstantial existence of parallel tracks of related state and federal cases actually hold promise, if properly harnessed, as mechanisms for achieving the goals of aggregate litigation and for disciplining the contours of global settlements of mass disputes. 
In particular, I explore the possibility that the existence of parallel state and federal cases—frequently viewed as an obstacle to global resolution of claims unable to be consolidated in a single forum—may well fortuitously provide an opportunity to achieve the sorts of mass litigation resolution envisioned but unsuccessfully attempted in the class action context. In so doing, this article adds new thoughts and theories to the specific debate regarding parallel state and federal claims in MDL, as well as to the larger debate about mass litigation governance in a post-class action world.

bepress Legal Repository

Georgetown Law Scholarly Commons

A Solution to the Galactic Foreground Problem for LISA

Author: C. S. Jensen
D. Gamerman
G. Schwarz
Jeff Crowder
Neil J. Cornish
P. J. Green
S. Geman
V. M. Lipunov
Publication venue: American Physical Society (APS)
Publication date: 17/11/2006

Low-frequency gravitational wave detectors, such as the Laser Interferometer Space Antenna (LISA), will have to contend with large foregrounds produced by millions of compact galactic binaries in our galaxy. While these galactic signals are interesting in their own right, the unresolved component can obscure other sources. The science yield for the LISA mission can be improved if the brighter and more isolated foreground sources can be identified and regressed from the data. Since the signals overlap with one another, we are faced with a "cocktail party" problem of picking out individual conversations in a crowded room. Here we present and implement an end-to-end solution to the galactic foreground problem that is able to resolve tens of thousands of sources from across the LISA band. Our algorithm employs a variant of the Markov Chain Monte Carlo (MCMC) method, which we call the Blocked Annealed Metropolis-Hastings (BAM) algorithm. Following a description of the algorithm and its implementation, we give several examples ranging from searches for a single source to searches for hundreds of overlapping sources. Our examples include data sets from the first round of Mock LISA Data Challenges. Comment: 19 pages, 27 figures

arXiv.org e-Print Archive

Crossref

DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model

Author: Andres Bjoern
Andriluka Mykhaylo
Insafutdinov Eldar
Pishchulin Leonid
Schiele Bernt
Publication date: 01/01/2016

The goal of this paper is to advance the state of the art of articulated pose estimation in scenes with multiple people. To that end we contribute on three fronts. We propose (1) improved body part detectors that generate effective bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms that allow the proposals to be assembled into a variable number of consistent body part configurations; and (3) an incremental optimization strategy that explores the search space more efficiently, thus leading both to better performance and significant speed-up factors. Evaluation is done on two single-person and two multi-person pose estimation benchmarks. The proposed approach significantly outperforms the best known multi-person pose estimation results while demonstrating competitive performance on the task of single-person pose estimation. Models and code available at http://pose.mpi-inf.mpg.de. Comment: ECCV'16. High-res version at https://www.d2.mpi-inf.mpg.de/sites/default/files/insafutdinov16arxiv.pd

arXiv.org e-Print Archive

Crossref

CISPA – Helmholtz-Zentrum für Informationssicherheit