Binospec Software System
Binospec is a high-throughput, 370 to 1000 nm, imaging spectrograph that
addresses two adjacent 8' by 15' fields of view. Binospec was commissioned in
late 2017 at the f/5 focus of the 6.5m MMT and is now available to all MMT
observers. Here we describe the Binospec software used for observation
planning, instrument control, and data reduction. The software and control
systems incorporate a high level of automation to minimize observer workload.
Instrument configuration and observation sequencing are implemented using a
database-driven approach to maximize observatory efficiency. A web-based
interface allows users to define observations, monitor status, and retrieve
data products.
Comment: PASP in press; 22 pages; 12 figures
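
As a rough illustration of what a database-driven sequencing approach can look like, the sketch below keeps queued observations in a table and always configures the highest-priority one next. The schema, target names, and selection policy are invented for illustration, not Binospec's actual design.

# Minimal sketch of database-driven observation sequencing; all table and
# column names are hypothetical, not Binospec's schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE observations (
    id INTEGER PRIMARY KEY, target TEXT, priority INTEGER, status TEXT)""")
conn.executemany(
    "INSERT INTO observations (target, priority, status) VALUES (?, ?, ?)",
    [("NGC 1275", 2, "queued"), ("M31 field 3", 1, "queued")])

def next_observation(conn):
    # The highest-priority queued observation is configured and run next.
    row = conn.execute(
        "SELECT id, target FROM observations "
        "WHERE status = 'queued' ORDER BY priority LIMIT 1").fetchone()
    if row:
        conn.execute("UPDATE observations SET status = 'active' WHERE id = ?",
                     (row[0],))
    return row

print(next_observation(conn))  # -> (2, 'M31 field 3')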
DECam integration tests on telescope simulator
The Dark Energy Survey (DES) is a next generation optical survey aimed at
measuring the expansion history of the universe using four probes: weak
gravitational lensing, galaxy cluster counts, baryon acoustic oscillations, and
Type Ia supernovae. To perform the survey, the DES Collaboration is building
the Dark Energy Camera (DECam), a 3 square degree, 570 Megapixel CCD camera
which will be mounted at the Blanco 4-meter telescope at the Cerro Tololo
Inter-American Observatory. DES will survey 5000 square degrees of the
southern galactic cap in 5 filters (g, r, i, z, Y). DECam will comprise 74
fully depleted, 250-micron-thick CCDs: 62 2k x 4k CCDs for imaging and 12
2k x 2k CCDs for guiding and focus. Construction of DECam is nearing completion.
In order to verify that the camera meets technical specifications for DES and
to reduce the time required to commission the instrument, we have constructed a
full-sized telescope simulator and performed full system testing and
integration prior to shipping. To complete this comprehensive test phase, we
simulated a DES observing run in which we collected 4 nights' worth of data.
We report on the results of these unique tests performed on DECam and their
impact on the experiment's progress.
Comment: Proceedings of the 2nd International Conference on Technology and
Instrumentation in Particle Physics (TIPP 2011). To appear in Physics
Procedia. 8 pages, 3 figures
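
The 570 Megapixel figure quoted above follows directly from the stated CCD layout; the short arithmetic check below reproduces it.

# Quick check of the focal plane pixel count: 62 imaging CCDs of 2k x 4k
# plus 12 guide/focus CCDs of 2k x 2k, as stated in the abstract.
imaging = 62 * 2048 * 4096      # science array
guide_focus = 12 * 2048 * 2048  # guiding and focus sensors
total = imaging + guide_focus
print(f"{total / 1e6:.0f} Mpix")  # -> 570 Mpix, consistent with the text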
ProtoDESI: First On-Sky Technology Demonstration for the Dark Energy Spectroscopic Instrument
The Dark Energy Spectroscopic Instrument (DESI) is under construction to
measure the expansion history of the universe using the baryon acoustic
oscillations technique. The spectra of 35 million galaxies and quasars over
14,000 square degrees will be measured during a 5-year survey. A new prime
focus corrector for the Mayall telescope at Kitt Peak National Observatory will
deliver light to 5,000 individually targeted fiber-fed robotic positioners. The
fibers in turn feed ten broadband multi-object spectrographs. We describe the
ProtoDESI experiment, which was installed and commissioned on the 4-m Mayall
telescope from August 14 to September 30, 2016. ProtoDESI was an on-sky
technology demonstration with the goal of reducing technical risks associated
with aligning optical fibers with targets using robotic fiber positioners and
maintaining the stability required to operate DESI. The ProtoDESI prime focus
instrument, consisting of three fiber positioners, illuminated fiducials, and a
guide camera, was installed behind the existing Mosaic corrector on the Mayall
telescope. A Fiber View Camera was mounted in the Cassegrain cage of the
telescope and provided feedback metrology for positioning the fibers. ProtoDESI
also provided a platform for early integration of hardware with the DESI
Instrument Control System that controls the subsystems, provides communication
with the Telescope Control System, and collects instrument telemetry data.
Lacking a spectrograph, ProtoDESI monitored the output of the fibers using a
Fiber Photometry Camera mounted on the prime focus instrument. ProtoDESI was
successful in acquiring targets with the robotically positioned fibers and
demonstrated that the DESI guiding requirements can be met.
Comment: Accepted version
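
The Fiber View Camera feedback described above amounts to a measure-and-correct loop: the camera measures each fiber tip, and the positioner is commanded to remove the residual offset. The sketch below illustrates one such loop under toy assumptions; the measure/move callables, gain, and tolerance are invented, not ProtoDESI's control code.

import math

def converge(measure, move, target, tol_um=5.0, gain=0.9, max_iter=5):
    """Iterate measure-and-correct moves until the fiber sits on target."""
    for i in range(max_iter):
        x, y = measure()                       # camera metrology
        dx, dy = target[0] - x, target[1] - y  # residual offset
        if math.hypot(dx, dy) < tol_um:
            return i                           # converged after i moves
        move(gain * dx, gain * dy)             # damped correction
    return max_iter

# Toy positioner whose moves land exactly where commanded.
pos = [0.0, 0.0]
def measure():
    return pos[0], pos[1]
def move(dx, dy):
    pos[0] += dx
    pos[1] += dy

print("correction moves:", converge(measure, move, target=(100.0, -40.0)))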
Optical Gravitational Lensing Experiment: OGLE-2
We describe a new 1.3 m Warsaw telescope located at Las Campanas Observatory and the new instruments of the second phase of the Optical Gravitational Lensing Experiment (OGLE-2). Results of the first observations are also presented.
VRSTC: Occlusion-Free Video Person Re-Identification
Video person re-identification (re-ID) plays an important role in
surveillance video analysis. However, the performance of video re-ID
degenerates severely under partial occlusion. In this paper, we propose a novel
network, called Spatio-Temporal Completion network (STCnet), to explicitly
handle the partial occlusion problem. Different from most previous works, which
discard the occluded frames, STCnet can recover the appearance of the occluded
parts. For one thing, the spatial structure of a pedestrian frame can be used
to predict the occluded body parts from the unoccluded body parts of this
frame. For another, the temporal patterns of pedestrian sequence provide
important clues for generating the contents of the occluded parts. With this
spatio-temporal information, STCnet can recover the appearance of the occluded
parts, which can be combined with the unoccluded parts for more accurate
video re-ID. By combining a re-ID network with STCnet, a video re-ID framework
robust to partial occlusion (VRSTC) is proposed. Experiments on three
challenging video re-ID databases demonstrate that the proposed approach
outperforms the state-of-the-art.
Comment: 10 pages, 6 figures, 5 tables. Accepted by CVPR 2019
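
As a hedged illustration of the spatial-plus-temporal completion idea (not the actual STCnet architecture), the PyTorch sketch below fills masked pixels from a frame's unoccluded parts and from its neighboring frames, then fuses the two predictions.

# Toy completion module: occluded pixels (mask == 1) are synthesized from a
# spatial branch (same frame) and a temporal branch (neighbor frames). The
# single conv layers are placeholders, not STCnet's layers.
import torch
import torch.nn as nn

class Completion(nn.Module):
    def __init__(self, ch=3):
        super().__init__()
        self.spatial = nn.Conv2d(ch, ch, 3, padding=1)       # within-frame cues
        self.temporal = nn.Conv2d(2 * ch, ch, 3, padding=1)  # neighbor frames

    def forward(self, frame, prev, nxt, mask):
        s = self.spatial(frame * (1 - mask))               # spatial prediction
        t = self.temporal(torch.cat([prev, nxt], dim=1))   # temporal prediction
        filled = 0.5 * (s + t)                             # fuse the two cues
        return frame * (1 - mask) + filled * mask          # recover occlusion

frames = torch.rand(3, 1, 3, 64, 32)                 # prev, cur, nxt (toy)
mask = torch.zeros(1, 1, 64, 32)
mask[:, :, 20:40, :] = 1                             # occluded band
out = Completion()(frames[1], frames[0], frames[2], mask)
print(out.shape)  # torch.Size([1, 3, 64, 32])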
LMVP: Video Predictor with Leaked Motion Information
We propose a Leaked Motion Video Predictor (LMVP) to predict future frames by
capturing the spatial and temporal dependencies from given inputs. The motion
is modeled by a newly proposed component, motion guider, which plays the role
of both learner and teacher. Specifically, it learns the temporal
features from real data and guides the generator to predict future
frames. The spatial consistency in video is modeled by an adaptive filtering
network. To further ensure the spatio-temporal consistency of the prediction, a
discriminator is also adopted to distinguish the real and generated frames.
Further, the discriminator leaks information to the motion guider and the
generator to help the learning of motion. The proposed LMVP can effectively
learn the static and temporal features in videos without the need for human
labeling. Experiments on synthetic and real data demonstrate that LMVP can
yield state-of-the-art results.
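
The "leak" described above can be pictured as routing the discriminator's intermediate features, rather than only its real/fake score, back into the motion guider and generator. The toy PyTorch sketch below shows that information flow on vector-valued stand-ins for frames; every layer and shape is an assumption, not the LMVP architecture.

import torch
import torch.nn as nn

disc_feat = nn.Linear(64, 32)              # discriminator feature extractor
disc_head = nn.Linear(32, 1)               # real/fake score
guider = nn.GRU(32, 32, batch_first=True)  # motion guider consumes the leak
generator = nn.Linear(32 + 64, 64)         # next-"frame" predictor

frames = torch.rand(8, 5, 64)    # batch of 5-step toy frame vectors
leak = disc_feat(frames)         # leaked features, one per frame
motion, _ = guider(leak)         # temporal pattern from leaked features
pred = generator(torch.cat([motion[:, -1], frames[:, -1]], dim=-1))
score = disc_head(disc_feat(pred))         # discriminator judges prediction
print(pred.shape, score.shape)   # torch.Size([8, 64]) torch.Size([8, 1])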
Improving Sequence-to-Sequence Learning via Optimal Transport
Sequence-to-sequence models are commonly trained via maximum likelihood
estimation (MLE). However, standard MLE training considers a word-level
objective, predicting the next word given the previous ground-truth partial
sentence. This procedure focuses on modeling local syntactic patterns, and may
fail to capture long-range semantic structure. We present a novel solution to
alleviate these issues. Our approach imposes global sequence-level guidance via
new supervision based on optimal transport, enabling the overall
characterization and preservation of semantic features. We further show that
this method can be understood as a Wasserstein gradient flow trying to match
our model to the ground truth sequence distribution. Extensive experiments are
conducted to validate the utility of the proposed approach, showing consistent
improvements over a wide variety of NLP tasks, including machine translation,
abstractive text summarization, and image captioning.
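
To make the idea concrete, the sketch below computes an entropic optimal-transport cost between two sets of token embeddings with a few Sinkhorn iterations, giving a differentiable sequence-level training signal. The cosine cost, epsilon, and iteration count are illustrative choices, not the paper's exact formulation.

import torch

def sinkhorn_ot(x, y, eps=0.1, iters=50):
    """Entropic OT cost between token embedding sets x (n, d) and y (m, d)."""
    cost = 1 - torch.nn.functional.cosine_similarity(
        x.unsqueeze(1), y.unsqueeze(0), dim=-1)         # (n, m) cost matrix
    k = torch.exp(-cost / eps)                          # Gibbs kernel
    a = torch.full((x.size(0),), 1.0 / x.size(0))       # uniform marginals
    b = torch.full((y.size(0),), 1.0 / y.size(0))
    u = torch.ones_like(a)
    for _ in range(iters):                              # Sinkhorn scaling
        v = b / (k.t() @ u)
        u = a / (k @ v)
    plan = u.unsqueeze(1) * k * v.unsqueeze(0)          # transport plan
    return (plan * cost).sum()                          # OT cost

pred = torch.rand(7, 16, requires_grad=True)  # model-side embeddings
ref = torch.rand(9, 16)                       # reference-side embeddings
loss = sinkhorn_ot(pred, ref)
loss.backward()                               # gradients flow to the model
print(float(loss))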
Improved Diffusion-based Image Colorization via Piggybacked Models
Image colorization has been attracting the research interests of the
community for decades. However, existing methods still struggle to provide
satisfactory colorized results given grayscale images due to a lack of
human-like global understanding of colors. Recently, large-scale Text-to-Image
(T2I) models have been exploited to transfer the semantic information from the
text prompts to the image domain, where text provides a global control for
semantic objects in the image. In this work, we introduce a colorization model
piggybacking on the existing powerful T2I diffusion model. Our key idea is to
exploit the color prior knowledge in the pre-trained T2I diffusion model for
realistic and diverse colorization. A diffusion guider is designed to
incorporate the pre-trained weights of the latent diffusion model to output a
latent color prior that conforms to the visual semantics of the grayscale
input. A lightness-aware VQVAE will then generate the colorized result with
pixel-perfect alignment to the given grayscale image. Our model can also
achieve conditional colorization with additional inputs (e.g., user hints and
text). Extensive experiments show that our method achieves state-of-the-art
performance in terms of perceptual quality.
Comment: project page: https://piggyback-color.github.io
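
One way to picture the pixel-perfect alignment claim is that the final image keeps the input's lightness channel untouched while only the color channels come from the learned prior. The numpy sketch below shows that recombination step in isolation; the Lab handling and random color stand-in are simplifications, not the paper's VQVAE.

import numpy as np

def recombine(gray_l, predicted_ab):
    """gray_l: (H, W) lightness in [0, 100]; predicted_ab: (H, W, 2)."""
    return np.dstack([gray_l, predicted_ab])  # enforce input lightness;
                                              # convert Lab -> RGB downstream

gray = np.random.uniform(0, 100, (64, 64))
ab = np.random.uniform(-40, 40, (64, 64, 2))  # stand-in for the color prior
lab = recombine(gray, ab)
assert np.array_equal(lab[..., 0], gray)      # lightness preserved exactly
print(lab.shape)  # (64, 64, 3)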
Auto-GNN: Neural Architecture Search of Graph Neural Networks
Graph neural networks (GNNs) have been successfully applied to
graph-structured data. Given a specific scenario, rich human expertise and
numerous laborious trials are usually required to identify a suitable GNN
architecture. This is because the performance of a GNN architecture is
significantly affected by the choice of graph convolution components, such as
aggregate function and hidden dimension. Neural architecture search (NAS) has
shown its potential in discovering effective deep architectures for learning
tasks in image and language modeling. However, existing NAS algorithms cannot
be directly applied to the GNN search problem. First, the search space of GNN
is different from the ones in existing NAS work. Second, the representation
learning capacity of a GNN architecture changes markedly with slight
architecture modifications, which affects the search efficiency of traditional
search methods. Third, widely used techniques in NAS such as parameter sharing
might become unstable in GNN.
To bridge the gap, we propose the automated graph neural networks (AGNN)
framework, which aims to find an optimal GNN architecture within a predefined
search space. A reinforcement learning based controller is designed to greedily
validate architectures via small steps. AGNN has a novel parameter sharing
strategy that enables homogeneous architectures to share parameters, based on a
carefully-designed homogeneity definition. Experiments on real-world benchmark
datasets demonstrate that the GNN architecture identified by AGNN achieves the
best performance compared with existing handcrafted models and traditional
search methods.
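
As a toy picture of the search procedure (with a random controller standing in for the paper's RL controller), the sketch below defines a small GNN search space and re-samples one architectural component per step, matching the "small steps" idea.

import random

# Hypothetical search space; AGNN's actual components and options differ.
SEARCH_SPACE = {
    "aggregate": ["mean", "max", "sum"],
    "hidden_dim": [16, 64, 128],
    "activation": ["relu", "tanh"],
}

def propose(arch):
    """One small step: re-sample a single component of the architecture."""
    key = random.choice(list(SEARCH_SPACE))
    return {**arch, key: random.choice(SEARCH_SPACE[key])}

arch = {k: v[0] for k, v in SEARCH_SPACE.items()}   # starting architecture
for _ in range(10):
    candidate = propose(arch)
    # A real controller would validate the candidate here and keep it
    # greedily only if it improves; homogeneous candidates could also
    # share trained parameters to cut validation cost.
    arch = candidate
print(arch)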
More Grounded Image Captioning by Distilling Image-Text Matching Model
Visual attention not only improves the performance of image captioners, but
also serves as a visual interpretation to qualitatively measure the caption
rationality and model transparency. Specifically, we expect that a captioner
can fix its attentive gaze on the correct objects while generating the
corresponding words. This ability is also known as grounded image captioning.
However, the grounding accuracy of existing captioners is far from
satisfactory. To improve the grounding accuracy while retaining the captioning
quality, it is expensive to collect the word-region alignment as strong
supervision. To this end, we propose POS-SCAN, a Part-of-Speech (POS) enhanced
variant of the image-text matching model SCAN (Lee et al., 2018), as an
effective knowledge distillation for more grounded image captioning. The benefits are
two-fold: 1) given a sentence and an image, POS-SCAN can ground the objects
more accurately than SCAN; 2) POS-SCAN serves as a word-region alignment
regularization for the captioner's visual attention module. Benchmark
experimental results demonstrate that conventional image
captioners equipped with POS-SCAN can significantly improve the grounding
accuracy without strong supervision. Last but not least, we explore the
indispensable Self-Critical Sequence Training (SCST) (Rennie et al., 2017) in
the context of grounded image captioning and show that the image-text matching
score can serve as a reward for more grounded captioning. Code:
https://github.com/YuanEZhou/Grounded-Image-Captioning
Comment: Accepted by CVPR 2020
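
A hedged sketch of the distillation term: for POS-selected (noun) positions, the captioner's region attention is pulled toward the matching model's word-region alignment via a KL penalty. All tensors below are random stand-ins for the two models' attention maps, and the POS mask is hypothetical.

import torch

num_words, num_regions = 6, 36
cap_attn = torch.softmax(torch.rand(num_words, num_regions), dim=-1)
scan_attn = torch.softmax(torch.rand(num_words, num_regions), dim=-1)
is_noun = torch.tensor([1., 0., 1., 0., 0., 1.])   # hypothetical POS mask

# KL(scan || captioner) on noun positions only, used as a regularizer that
# nudges the captioner's gaze toward the matcher's alignment.
kl = (scan_attn * (scan_attn.log() - cap_attn.log())).sum(-1)
grounding_loss = (kl * is_noun).sum() / is_noun.sum()
print(float(grounding_loss))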