16,160 research outputs found
Guided Machine Learning for power grid segmentation
The segmentation of large scale power grids into zones is crucial for control
room operators when managing the grid complexity near real time. In this paper
we propose a new method in two steps which is able to automatically do this
segmentation, while taking into account the real time context, in order to help
them handle shifting dynamics. Our method relies on a "guided" machine learning
approach. As a first step, we define and compute a task specific "Influence
Graph" in a guided manner. We indeed simulate on a grid state chosen
interventions, representative of our task of interest (managing active power
flows in our case). For visualization and interpretation, we then build a
higher representation of the grid relevant to this task by applying the graph
community detection algorithm \textit{Infomap} on this Influence Graph. To
illustrate our method and demonstrate its practical interest, we apply it on
commonly used systems, the IEEE-14 and IEEE-118. We show promising and original
interpretable results, especially on the previously well studied RTS-96 system
for grid segmentation. We eventually share initial investigation and results on
a large-scale system, the French power grid, whose segmentation had a
surprising resemblance with RTE's historical partitioning
Attention Gated Networks: Learning to Leverage Salient Regions in Medical Images
We propose a novel attention gate (AG) model for medical image analysis that
automatically learns to focus on target structures of varying shapes and sizes.
Models trained with AGs implicitly learn to suppress irrelevant regions in an
input image while highlighting salient features useful for a specific task.
This enables us to eliminate the necessity of using explicit external
tissue/organ localisation modules when using convolutional neural networks
(CNNs). AGs can be easily integrated into standard CNN models such as VGG or
U-Net architectures with minimal computational overhead while increasing the
model sensitivity and prediction accuracy. The proposed AG models are evaluated
on a variety of tasks, including medical image classification and segmentation.
For classification, we demonstrate the use case of AGs in scan plane detection
for fetal ultrasound screening. We show that the proposed attention mechanism
can provide efficient object localisation while improving the overall
prediction performance by reducing false positives. For segmentation, the
proposed architecture is evaluated on two large 3D CT abdominal datasets with
manual annotations for multiple organs. Experimental results show that AG
models consistently improve the prediction performance of the base
architectures across different datasets and training sizes while preserving
computational efficiency. Moreover, AGs guide the model activations to be
focused around salient regions, which provides better insights into how model
predictions are made. The source code for the proposed AG models is publicly
available.Comment: Accepted for Medical Image Analysis (Special Issue on Medical Imaging
with Deep Learning). arXiv admin note: substantial text overlap with
arXiv:1804.03999, arXiv:1804.0533
Model Adaptation with Synthetic and Real Data for Semantic Dense Foggy Scene Understanding
This work addresses the problem of semantic scene understanding under dense
fog. Although considerable progress has been made in semantic scene
understanding, it is mainly related to clear-weather scenes. Extending
recognition methods to adverse weather conditions such as fog is crucial for
outdoor applications. In this paper, we propose a novel method, named
Curriculum Model Adaptation (CMAda), which gradually adapts a semantic
segmentation model from light synthetic fog to dense real fog in multiple
steps, using both synthetic and real foggy data. In addition, we present three
other main stand-alone contributions: 1) a novel method to add synthetic fog to
real, clear-weather scenes using semantic input; 2) a new fog density
estimator; 3) the Foggy Zurich dataset comprising real foggy images,
with pixel-level semantic annotations for images with dense fog. Our
experiments show that 1) our fog simulation slightly outperforms a
state-of-the-art competing simulation with respect to the task of semantic
foggy scene understanding (SFSU); 2) CMAda improves the performance of
state-of-the-art models for SFSU significantly by leveraging unlabeled real
foggy data. The datasets and code are publicly available.Comment: final version, ECCV 201
Deep Bilateral Learning for Real-Time Image Enhancement
Performance is a critical challenge in mobile image processing. Given a
reference imaging pipeline, or even human-adjusted pairs of images, we seek to
reproduce the enhancements and enable real-time evaluation. For this, we
introduce a new neural network architecture inspired by bilateral grid
processing and local affine color transforms. Using pairs of input/output
images, we train a convolutional neural network to predict the coefficients of
a locally-affine model in bilateral space. Our architecture learns to make
local, global, and content-dependent decisions to approximate the desired image
transformation. At runtime, the neural network consumes a low-resolution
version of the input image, produces a set of affine transformations in
bilateral space, upsamples those transformations in an edge-preserving fashion
using a new slicing node, and then applies those upsampled transformations to
the full-resolution image. Our algorithm processes high-resolution images on a
smartphone in milliseconds, provides a real-time viewfinder at 1080p
resolution, and matches the quality of state-of-the-art approximation
techniques on a large class of image operators. Unlike previous work, our model
is trained off-line from data and therefore does not require access to the
original operator at runtime. This allows our model to learn complex,
scene-dependent transformations for which no reference implementation is
available, such as the photographic edits of a human retoucher.Comment: 12 pages, 14 figures, Siggraph 201
Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation
We address the problem of semantic nighttime image segmentation and improve
the state-of-the-art, by adapting daytime models to nighttime without using
nighttime annotations. Moreover, we design a new evaluation framework to
address the substantial uncertainty of semantics in nighttime images. Our
central contributions are: 1) a curriculum framework to gradually adapt
semantic segmentation models from day to night through progressively darker
times of day, exploiting cross-time-of-day correspondences between daytime
images from a reference map and dark images to guide the label inference in the
dark domains; 2) a novel uncertainty-aware annotation and evaluation framework
and metric for semantic segmentation, including image regions beyond human
recognition capability in the evaluation in a principled fashion; 3) the Dark
Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight
images with correspondences to their daytime counterparts plus a set of 201
nighttime images with fine pixel-level annotations created with our protocol,
which serves as a first benchmark for our novel evaluation. Experiments show
that our map-guided curriculum adaptation significantly outperforms
state-of-the-art methods on nighttime sets both for standard metrics and our
uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals
that selective invalidation of predictions can improve results on data with
ambiguous content such as our benchmark and profit safety-oriented applications
involving invalid inputs.Comment: IEEE T-PAMI 202
A New Estimator of Intrinsic Dimension Based on the Multipoint Morisita Index
The size of datasets has been increasing rapidly both in terms of number of
variables and number of events. As a result, the empty space phenomenon and the
curse of dimensionality complicate the extraction of useful information. But,
in general, data lie on non-linear manifolds of much lower dimension than that
of the spaces in which they are embedded. In many pattern recognition tasks,
learning these manifolds is a key issue and it requires the knowledge of their
true intrinsic dimension. This paper introduces a new estimator of intrinsic
dimension based on the multipoint Morisita index. It is applied to both
synthetic and real datasets of varying complexities and comparisons with other
existing estimators are carried out. The proposed estimator turns out to be
fairly robust to sample size and noise, unaffected by edge effects, able to
handle large datasets and computationally efficient
The Whole World in Your Hand: Active and Interactive Segmentation
Object segmentation is a fundamental problem
in computer vision and a powerful resource for
development. This paper presents three embodied approaches to the visual segmentation of objects. Each approach to segmentation is aided
by the presence of a hand or arm in the proximity of the object to be segmented. The first
approach is suitable for a robotic system, where
the robot can use its arm to evoke object motion. The second method operates on a wearable system, viewing the world from a human's
perspective, with instrumentation to help detect
and segment objects that are held in the wearer's
hand. The third method operates when observing
a human teacher, locating periodic motion (finger/arm/object waving or tapping) and using it
as a seed for segmentation. We show that object segmentation can serve as a key resource for
development by demonstrating methods that exploit high-quality object segmentations to develop
both low-level vision capabilities (specialized feature detectors) and high-level vision capabilities
(object recognition and localization)
- …