Bridging the Gap Between Layout Pattern Sampling and Hotspot Detection via Batch Active Learning
Layout hotspot detection is one of the main steps in modern VLSI design. A typical hotspot detection flow is extremely time-consuming due to the computationally expensive mask optimization and lithographic simulation. Recent research attempts to facilitate the procedure with a reduced flow comprising feature extraction, training set generation, and hotspot detection, in which feature extraction methods and hotspot detection engines have been studied in depth.
However, the performance of hotspot detectors relies heavily on the quality of reference layout libraries, which are costly to obtain and, in previous works, are usually predetermined or randomly sampled. In this paper, we propose an active learning-based layout pattern sampling and hotspot detection flow, which simultaneously optimizes the machine learning model and the training set, aiming to achieve similar or better hotspot detection performance with a much smaller number of training instances. Experimental results show that our
proposed method can significantly reduce lithography simulation overhead while
attaining satisfactory detection accuracy on designs under both DUV and EUV
lithography technologies.
Comment: 8 pages, 7 figures
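For readers unfamiliar with the sampling strategy described above, the following is a minimal, self-contained sketch of pool-based batch active learning with an uncertainty criterion; the synthetic data, the classifier, and the helper query_batch are illustrative stand-ins, not the paper's actual flow (which labels batches via lithography simulation).

```python
# Minimal sketch of batch active learning for binary hotspot classification (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted layout-pattern features (1 = hotspot, 0 = non-hotspot).
X_pool = rng.normal(size=(2000, 32))
y_pool = (X_pool[:, :4].sum(axis=1) > 1.0).astype(int)

labeled = list(rng.choice(len(X_pool), size=50, replace=False))  # small random seed set

def query_batch(model, X, candidates, k=25):
    """Pick the k most uncertain unlabeled samples (predicted probability closest to 0.5)."""
    proba = model.predict_proba(X[candidates])[:, 1]
    order = np.argsort(np.abs(proba - 0.5))
    return [candidates[i] for i in order[:k]]

for round_idx in range(5):
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_pool[labeled], y_pool[labeled])
    labeled_set = set(labeled)
    unlabeled = [i for i in range(len(X_pool)) if i not in labeled_set]
    batch = query_batch(model, X_pool, unlabeled)
    labeled.extend(batch)  # in a real flow, only this batch would be sent to lithography simulation
    print(f"round {round_idx}: labeled set size = {len(labeled)}")
```

The point of the loop is that only the queried batches ever need expensive labeling, which is where the simulation savings come from.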
Data Efficient Lithography Modeling with Transfer Learning and Active Data Selection
Lithography simulation is one of the key steps in physical verification,
enabled by the substantial optical and resist models. A resist model bridges
the aerial image simulation to printed patterns. While the effectiveness of
learning-based solutions for resist modeling has been demonstrated, they are
considerably data-demanding. Meanwhile, a set of manufactured data for a specific lithography configuration is only valid for training a single model, indicating low data efficiency. Due to the complexity of the
manufacturing process, obtaining enough data for acceptable accuracy becomes
very expensive in terms of both time and cost, especially during the evolution
of technology generations when the design space is intensively explored. In
this work, we propose a new resist modeling framework for contact layers,
utilizing existing data from old technology nodes and active selection of data
in a target technology node, to reduce the amount of data required from the
target lithography configuration. Our framework, based on transfer learning and active learning techniques, is effective within a competitive range of accuracy, i.e., it achieves a 3-10X reduction in the amount of training data while maintaining accuracy comparable to the state-of-the-art learning approach.
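As a rough illustration of combining transfer learning with active data selection, the sketch below pretrains a regressor on source-node data and then fine-tunes it on a small, actively chosen target-node subset; the synthetic data, the MLP, and the greedy farthest-point selector are assumptions for illustration, not the framework proposed in the paper.

```python
# Sketch: transfer learning (pretrain + fine-tune) with diversity-based active selection.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Source node: plentiful data from an old technology; target node: scarce, expensive data.
X_src = rng.normal(size=(5000, 16))
y_src = X_src @ rng.normal(size=16)
X_tgt = rng.normal(size=(400, 16))
y_tgt = X_tgt @ rng.normal(size=16) + 0.3 * rng.normal(size=400)

def select_diverse(X, labeled, k):
    """Greedy farthest-point selection: pick target samples far from already-labeled ones."""
    chosen = list(labeled)
    for _ in range(k):
        dists = np.linalg.norm(X[:, None, :] - X[chosen][None, :, :], axis=2).min(axis=1)
        dists[chosen] = -np.inf
        chosen.append(int(np.argmax(dists)))
    return chosen[len(labeled):]

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=300, warm_start=True, random_state=0)
model.fit(X_src, y_src)                                   # pretrain on the source (old node) data

labeled = list(rng.choice(len(X_tgt), size=20, replace=False))
for _ in range(4):
    new = select_diverse(X_tgt, labeled, k=20)
    labeled.extend(new)                                   # only these samples need real measurement
    model.fit(X_tgt[labeled], y_tgt[labeled])             # warm-started fine-tuning on the target node
print("target-node samples used:", len(labeled))
```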
Photo-Realistic Monocular Gaze Redirection Using Generative Adversarial Networks
Gaze redirection is the task of changing the gaze to a desired direction for
a given monocular eye patch image. Many applications such as videoconferencing,
films, games, and generation of training data for gaze estimation require
redirecting the gaze, without distorting the appearance of the area surrounding
the eye and while producing photo-realistic images. Existing methods lack the
ability to generate perceptually plausible images. In this work, we present a
novel method to alleviate this problem by leveraging generative adversarial
training to synthesize an eye image conditioned on a target gaze direction. Our
method ensures perceptual similarity and consistency of the synthesized images with the real images. Furthermore, a gaze estimation loss is used to control the
gaze direction accurately. To attain high-quality images, we incorporate
perceptual and cycle consistency losses into our architecture. In extensive
evaluations we show that the proposed method outperforms state-of-the-art
approaches in terms of both image quality and redirection precision. Finally,
we show that the generated images can bring a significant improvement to the gaze estimation task when used to augment real training data.
Comment: Published on ICCV201
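A minimal sketch of how the losses mentioned above (adversarial, gaze estimation, cycle consistency) could be combined for a generator conditioned on a target gaze is given below; the tiny networks are placeholders, the loss weights are arbitrary, and the perceptual loss is omitted, so this is not the paper's architecture.

```python
# Sketch (PyTorch) of a gaze-conditioned generator loss with adversarial, gaze, and cycle terms.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3 + 2, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 3, 3, padding=1))
D = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(), nn.Flatten(), nn.LazyLinear(1))
gaze_net = nn.Sequential(nn.Flatten(), nn.LazyLinear(2))          # stand-in pretrained gaze estimator

eye, target_gaze = torch.randn(4, 3, 32, 32), torch.randn(4, 2)   # dummy eye-patch batch and angles
gaze_map = target_gaze[:, :, None, None].expand(-1, -1, 32, 32)   # broadcast angles to a 2-channel map

fake = G(torch.cat([eye, gaze_map], dim=1))                       # synthesize the redirected eye
adv_loss = nn.functional.binary_cross_entropy_with_logits(D(fake), torch.ones(4, 1))
gaze_loss = nn.functional.mse_loss(gaze_net(fake), target_gaze)   # keeps the redirected gaze on target
cyc_loss = nn.functional.l1_loss(G(torch.cat([fake, -gaze_map], dim=1)), eye)  # crude cycle term
loss = adv_loss + 5.0 * gaze_loss + 10.0 * cyc_loss               # weights are illustrative
loss.backward()
```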
Monochromatic CT Image Reconstruction from Current-Integrating Data via Deep Learning
In clinical CT, the x-ray source emits polychromatic x-rays, which are
detected in the current-integrating mode. This physical process is accurately
described by an energy-dependent non-linear integral model on the basis of the
Beer-Lambert law. However, the non-linear model is too complicated to be solved directly for image reconstruction, and it is often approximated by a linear integral model in the form of the Radon transform, essentially ignoring energy-dependent information. This model approximation generates inaccurate quantification of the attenuation image and significant beam-hardening artifacts. In this paper, we develop a deep-learning-based CT image reconstruction method to address the mismatch between the computational model and the physical model. Our method learns, from big data, a nonlinear transformation that corrects the measured projection data so that it accurately matches the linear integral model, realizing monochromatic imaging and effectively overcoming beam hardening. The deep-learning network is trained and tested using a clinical dual-energy dataset to demonstrate the feasibility of the proposed methodology. Results show that the proposed method achieves highly accurate projection correction, with a relative error of less than 0.2%.
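The model mismatch discussed above can be illustrated numerically: a toy polychromatic, current-integrating measurement deviates from the linear (Radon-type) model as path length grows, and a data-driven correction maps one onto the other. The spectrum, attenuation values, and the cubic fit standing in for the deep network are all made up for illustration.

```python
# Toy numpy illustration of beam hardening and a learned projection correction (not the paper's model).
import numpy as np

energies = np.linspace(20, 120, 50)                   # keV
spectrum = np.exp(-(energies - 70) ** 2 / (2 * 20 ** 2))
spectrum /= spectrum.sum()
mu = 0.5 * (energies / 70.0) ** -3 + 0.02             # crude energy-dependent attenuation (1/cm)

lengths = np.linspace(0.0, 20.0, 100)                 # path lengths through the object (cm)
poly = -np.log((spectrum[None, :] * np.exp(-np.outer(lengths, mu))).sum(axis=1))  # polychromatic log data
mono = mu[energies.searchsorted(70)] * lengths        # linear Radon-type model at 70 keV

# In spirit: learn the correction poly -> mono from data, so the corrected projections
# satisfy the linear model used by the reconstruction algorithm.
correction = np.polyfit(poly, mono, deg=3)            # a cubic fit stands in for the deep network
print("max residual after correction:", np.abs(np.polyval(correction, poly) - mono).max())
```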
Face Alignment Using K-Cluster Regression Forests With Weighted Splitting
In this work we present a face alignment pipeline based on two novel methods:
weighted splitting for K-cluster Regression Forests and 3D Affine Pose
Regression for face shape initialization. Our face alignment method is based on
the Local Binary Feature framework, where instead of standard regression
forests and pixel difference features used in the original method, we use our
K-cluster Regression Forests with Weighted Splitting (KRFWS) and Pyramid HOG
features. We also use KRFWS to perform Affine Pose Regression (APR) and
3D-Affine Pose Regression (3D-APR), both of which aim to improve the face shape
initialization. APR applies a rigid 2D transform to the initial face shape that
compensates for inaccuracy in the initial face location, size and in-plane
rotation. 3D-APR estimates the parameters of a 3D transform that additionally
compensates for out-of-plane rotation. The resulting pipeline, consisting of
APR and 3D-APR followed by face alignment, shows an improvement of 20% over
standard LBF on the challenging IBUG dataset, and state-of-the-art accuracy on the entire 300-W dataset.
Comment: Postprint of an article published in IEEE Signal Processing Letters in 2016. A video explaining the method: https://www.youtube.com/watch?v=F4tgihZLrY
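To make the Affine Pose Regression step concrete, here is an illustrative sketch that regresses 2D affine parameters from image features and applies them to a mean shape used as the alignment initialization; the synthetic features and the random forest are assumptions, not the paper's KRFWS with Pyramid HOG features.

```python
# Illustrative Affine Pose Regression: predict a 2D affine transform and correct the initial shape.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
mean_shape = rng.normal(size=(68, 2))                    # stand-in mean landmark shape

# Synthetic training set: features correlated with the true affine parameters
# (2x2 matrix + translation, flattened to 6 numbers).
true_params = rng.normal(size=(500, 6)) * 0.1 + np.tile([1, 0, 0, 1, 0, 0], (500, 1))
features = true_params @ rng.normal(size=(6, 32)) + 0.01 * rng.normal(size=(500, 32))

apr = RandomForestRegressor(n_estimators=50, random_state=0).fit(features, true_params)

def apply_affine(shape, p):
    """Apply the predicted affine transform to a landmark shape."""
    A, t = p[:4].reshape(2, 2), p[4:]
    return shape @ A.T + t

p_hat = apr.predict(features[:1])[0]
init_shape = apply_affine(mean_shape, p_hat)             # corrected initialization for alignment
print(init_shape.shape)
```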
Robust Data-Driven Zero-Velocity Detection for Foot-Mounted Inertial Navigation
We present two novel techniques for detecting zero-velocity events to improve
foot-mounted inertial navigation. Our first technique augments a classical
zero-velocity detector by incorporating a motion classifier that adaptively
updates the detector's threshold parameter. Our second technique uses a long
short-term memory (LSTM) recurrent neural network to classify zero-velocity
events from raw inertial data, in contrast to the majority of zero-velocity
detection methods that rely on basic statistical hypothesis testing. We
demonstrate that both of our proposed detectors achieve higher accuracies than
existing detectors for trajectories including walking, running, and
stair-climbing motions. Additionally, we present a straightforward data augmentation method that extends the LSTM-based model to different inertial sensors without the need to collect new training data.
Comment: 10 pages, 7 figures
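For context, a classical zero-velocity detector of the kind the first technique augments can be sketched as a statistical test on a window of inertial data; the noise parameters and the fixed threshold below are illustrative, and it is exactly such a threshold that the proposed motion classifier would adapt per motion type.

```python
# Sketch of a classical zero-velocity test on one window of accelerometer/gyroscope samples.
import numpy as np

def zero_velocity_test(accel, gyro, g=9.81, sigma_a=0.01, sigma_w=0.1):
    """Return a test statistic for the window; small values indicate a stationary foot."""
    a_dev = np.linalg.norm(accel - np.array([0.0, 0.0, g]), axis=1) ** 2 / sigma_a ** 2
    w_mag = np.linalg.norm(gyro, axis=1) ** 2 / sigma_w ** 2
    return (a_dev + w_mag).mean()

rng = np.random.default_rng(3)
accel = np.array([0.0, 0.0, 9.81]) + 0.02 * rng.normal(size=(100, 3))   # stationary-looking window
gyro = 0.05 * rng.normal(size=(100, 3))
threshold = 1e4        # a walking-specific value; running or stair climbing would need another
print("zero-velocity" if zero_velocity_test(accel, gyro) < threshold else "moving")
```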
Teaching Robots to Do Object Assembly using Multi-modal 3D Vision
The motivation of this paper is to develop a smart system using multi-modal
vision for next-generation mechanical assembly. It includes two phases: in the first, a human teaches the assembly structure to a robot; in the second, the robot finds, grasps, and assembles the objects using AI planning. The crucial part of the system is the precision of 3D visual detection, and the paper presents multi-modal approaches to meet this requirement: AR markers are used in the teaching phase, since human beings can actively control the process, while point cloud matching and geometric constraints are used in the robot execution phase to reject unexpected noise. Experiments are performed to examine the precision and correctness of the approaches. The study is practical: the developed approaches are integrated with graph model-based motion planning, implemented on an industrial robot, and applicable to real-world scenarios.
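As an illustration of the point-cloud matching used in the execution phase, the sketch below recovers a rigid transform between corresponding model and scene points with the Kabsch/Procrustes method; the synthetic correspondences stand in for what ICP or feature matching would provide, and this is not the paper's full pipeline.

```python
# Rigid registration between corresponding model and scene points (Kabsch/Procrustes).
import numpy as np

def rigid_transform(P, Q):
    """Best-fit rotation R and translation t mapping points P onto Q (both N x 3)."""
    Pc, Qc = P - P.mean(axis=0), Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = Q.mean(axis=0) - R @ P.mean(axis=0)
    return R, t

rng = np.random.default_rng(4)
model = rng.normal(size=(100, 3))                        # object model points
theta = 0.4
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
scene = model @ R_true.T + np.array([0.1, -0.2, 0.5]) + 0.001 * rng.normal(size=(100, 3))
R, t = rigid_transform(model, scene)
print("rotation error:", np.abs(R - R_true).max())
```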
Physical Adversarial Textures that Fool Visual Object Tracking
We present a system for generating inconspicuous-looking textures that, when
displayed in the physical world as digital or printed posters, cause visual
object tracking systems to become confused. For instance, as a target being
tracked by a robot's camera moves in front of such a poster, our generated
texture makes the tracker lock onto it and allows the target to evade. This
work aims to fool seldom-targeted regression tasks, and in particular compares
diverse optimization strategies: non-targeted, targeted, and a new family of
guided adversarial losses. While we use the Expectation Over Transformation
(EOT) algorithm to generate physical adversaries that fool tracking models when
imaged under diverse conditions, we compare the impacts of different
conditioning variables, including viewpoint, lighting, and appearances, to find
practical attack setups with high resulting adversarial strength and
convergence speed. We further show that textures optimized solely using simulated scenes can confuse real-world tracking systems.
Comment: Accepted to the International Conference on Computer Vision (ICCV) 201
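A minimal sketch of the Expectation Over Transformation idea referenced above: the texture is optimized under randomly sampled imaging conditions so the adversarial effect survives real-world viewing. The toy tracker, the crude transformation model, and the targeted box loss are assumptions for illustration only.

```python
# Sketch (PyTorch) of EOT: optimize a texture so a toy tracker outputs an attacker-chosen box.
import torch
import torch.nn as nn

texture = torch.rand(1, 3, 64, 64, requires_grad=True)
tracker = nn.Sequential(nn.Conv2d(3, 8, 3, stride=2), nn.ReLU(), nn.Flatten(), nn.LazyLinear(4))
opt = torch.optim.Adam([texture], lr=0.01)

def random_transform(img):
    """Cheap stand-in for viewpoint/lighting variation: random brightness and crop-resize."""
    img = img * (0.7 + 0.6 * torch.rand(1))
    shift = int(torch.randint(0, 8, (1,)))
    img = img[:, :, shift:shift + 56, shift:shift + 56]
    return nn.functional.interpolate(img, size=(64, 64), mode="bilinear", align_corners=False)

target_box = torch.tensor([[0.0, 0.0, 1.0, 1.0]])        # box the attacker wants the tracker to output
for step in range(50):
    # Expectation over sampled transformations: average the targeted loss across random conditions.
    loss = torch.stack([nn.functional.mse_loss(tracker(random_transform(texture)), target_box)
                        for _ in range(8)]).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    texture.data.clamp_(0, 1)                            # keep the texture a valid image
```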
A vision based system for underwater docking
Autonomous underwater vehicles (AUVs) have been deployed for underwater
exploration. However, their potential is constrained by limited on-board battery energy and data storage capacity. This problem has been addressed by docking systems that provide underwater recharging and data transfer for AUVs. In this work, we propose a vision-based framework for underwater docking with such systems. The proposed framework comprises two modules: (i) a detection module
which provides location information on underwater docking stations in 2D images
captured by an on-board camera, and (ii) a pose estimation module which
recovers the relative 3D position and orientation between docking stations and
AUVs from the 2D images. For robust and credible detection of docking stations,
we propose a convolutional neural network called Docking Neural Network (DoNN).
For accurate pose estimation, a perspective-n-point algorithm is integrated
into our framework. In order to examine our framework in underwater docking
tasks, we collected a dataset of 2D images, named Underwater Docking Images
Dataset (UDID), in an experimental water pool. To the best of our knowledge,
UDID is the first publicly available underwater docking dataset. In the
experiments, we first evaluate the performance of the proposed detection module on UDID and its deformed variations. Next, we assess the accuracy of the pose estimation module in ground experiments, since it is not feasible to obtain the true relative position and orientation between docking stations and AUVs underwater. Then, we examine the pose estimation module in underwater experiments in our experimental water pool. Experimental results show that the proposed framework can detect docking stations and estimate their relative pose efficiently and successfully compared to the state-of-the-art baseline systems.
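To illustrate the pose-estimation module's core operation, the sketch below feeds detected 2D landmark locations and their known 3D positions on the docking station to a perspective-n-point solver; the geometry, detections, and camera intrinsics are made-up placeholders rather than values from UDID.

```python
# Sketch: recover the relative pose of a docking station from 2D-3D correspondences with PnP.
import numpy as np
import cv2

# 3D positions of four coplanar docking-station landmarks in the station frame (metres).
object_pts = np.array([[-0.5, -0.5, 0.0], [0.5, -0.5, 0.0],
                       [0.5, 0.5, 0.0], [-0.5, 0.5, 0.0]], dtype=np.float64)
# Their detected 2D pixel coordinates (e.g., derived from the detection network's output).
image_pts = np.array([[300.0, 220.0], [340.0, 222.0],
                      [338.0, 260.0], [302.0, 258.0]], dtype=np.float64)

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])  # pinhole intrinsics
dist = np.zeros(5)                                                          # assume undistorted images

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)                                # rotation matrix, station frame -> camera frame
print("estimated range to station (m):", np.linalg.norm(tvec))
```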
Automated Evaluation of Semantic Segmentation Robustness for Autonomous Driving
One of the fundamental challenges in the design of perception systems for
autonomous vehicles is validating the performance of each algorithm under a
comprehensive variety of operating conditions. In the case of vision-based
semantic segmentation, there are known issues when encountering new scenarios that are sufficiently different from the training data. In addition, even small
variations in environmental conditions such as illumination and precipitation
can affect the classification performance of the segmentation model. Given the
reliance on visual information, these effects often translate into poor
semantic pixel classification which can potentially lead to catastrophic
consequences when driving autonomously. This paper presents a novel method for
analysing the robustness of semantic segmentation models and provides a number
of metrics to evaluate the classification performance over a variety of
environmental conditions. The approach incorporates an additional sensor (lidar) to automate the process, eliminating the need for labour-intensive hand labelling of validation data. System integrity can be monitored as the performance of the vision sensors is validated against a different sensor modality. This is necessary for detecting failures that are inherent to vision
technology. Experimental results are presented based on multiple datasets
collected at different times of the year with different environmental
conditions. These results show that the semantic segmentation performance
varies depending on the weather, camera parameters, the presence of shadows, etc. The results also demonstrate how the metrics can be used to compare and validate performance after making improvements to a model, and to compare the performance of different networks.
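A rough sketch of the automated-evaluation idea follows: reference labels derived from a second modality (lidar) allow per-class agreement of the camera-based segmentation to be tracked across conditions without hand labelling. The random class maps, coverage mask, and agreement metric below are illustrative assumptions, not the paper's metrics.

```python
# Sketch: score a segmentation prediction against lidar-derived reference labels, per class.
import numpy as np

rng = np.random.default_rng(5)
n_classes = 4
# Per-pixel class predictions from the camera model, and reference labels obtained by
# projecting lidar-derived classes (e.g., ground/vegetation/structure) into the image.
seg_pred = rng.integers(0, n_classes, size=(240, 320))
lidar_ref = seg_pred.copy()
noise = rng.random(seg_pred.shape) < 0.15                 # simulate 15% disagreement
lidar_ref[noise] = rng.integers(0, n_classes, size=int(noise.sum()))
valid = rng.random(seg_pred.shape) < 0.3                  # lidar only covers some pixels

def per_class_agreement(pred, ref, mask, n_classes):
    """Fraction of lidar-covered pixels of each reference class that the camera model matches."""
    scores = {}
    for c in range(n_classes):
        sel = mask & (ref == c)
        scores[c] = float((pred[sel] == c).mean()) if sel.any() else float("nan")
    return scores

print(per_class_agreement(seg_pred, lidar_ref, valid, n_classes))
```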