FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling
We consider the problem of task-agnostic feature upsampling in dense
prediction where an upsampling operator is required to facilitate both
region-sensitive tasks like semantic segmentation and detail-sensitive tasks
such as image matting. Existing upsampling operators often work well on
one type of task, but not both. In this work, we present FADE, a novel,
plug-and-play, and task-agnostic upsampling operator. FADE benefits from three
design choices: i) considering encoder and decoder features jointly in
upsampling kernel generation; ii) an efficient semi-shift convolutional
operator that enables granular control over how each feature point contributes
to upsampling kernels; iii) a decoder-dependent gating mechanism for enhanced
detail delineation. We first study the upsampling properties of FADE on toy
data and then evaluate it on large-scale semantic segmentation and image
matting. In particular, FADE reveals its effectiveness and task-agnostic
characteristic by consistently outperforming recent dynamic upsampling
operators in different tasks. It also generalizes well across convolutional and
transformer architectures with little computational overhead. Our work
additionally provides thoughtful insights on what makes for task-agnostic
upsampling. Code is available at: http://lnkiy.in/fade_in
Comment: Accepted to ECCV 2022.
Learning to Upsample by Learning to Sample
We present DySample, an ultra-lightweight and effective dynamic upsampler.
While impressive performance gains have been witnessed from recent kernel-based
dynamic upsamplers such as CARAFE, FADE, and SAPA, they introduce a heavy
workload, mostly due to the time-consuming dynamic convolution and the
additional sub-network used to generate dynamic kernels. Further, the need of
FADE and SAPA for high-res feature guidance somewhat limits their application
scenarios. To address these concerns, we bypass dynamic convolution and
formulate upsampling from the perspective of point sampling, which is more
resource-efficient and can be easily implemented with the standard built-in
function in PyTorch. We first showcase a naive design, and then demonstrate how
to strengthen its upsampling behavior step by step towards our new upsampler,
DySample. Compared with former kernel-based dynamic upsamplers, DySample
requires no customized CUDA package and has far fewer parameters and FLOPs, a
smaller GPU memory footprint, and lower latency. Besides being lightweight, DySample
outperforms other upsamplers across five dense prediction tasks, including
semantic segmentation, object detection, instance segmentation, panoptic
segmentation, and monocular depth estimation. Code is available at
https://github.com/tiny-smart/dysample.
Comment: Accepted by ICCV 202
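The point-sampling view of upsampling described in the DySample abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is plain Python on a single-channel 2D grid, the function names `bilinear_sample` and `upsample_by_sampling` are hypothetical, and the per-pixel `offsets` stand in for whatever a learned offset generator would produce. The abstract does not name the PyTorch built-in it relies on; `torch.nn.functional.grid_sample` is one candidate that performs this kind of sampling on tensors.

```python
import math

def bilinear_sample(feat, y, x):
    """Bilinearly interpolate the 2D grid `feat` at a continuous (y, x), clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

def upsample_by_sampling(feat, scale=2, offsets=None):
    """Upsample `feat` by sampling one point per output pixel.

    `offsets[i][j]` is an optional (dy, dx) shift, in input-pixel units, added
    to the regular sampling position of output pixel (i, j). With no offsets
    this reduces to ordinary bilinear upsampling.
    """
    h, w = len(feat), len(feat[0])
    out = []
    for i in range(h * scale):
        row = []
        for j in range(w * scale):
            # Regular position: centre of the output pixel mapped into input coordinates.
            y = (i + 0.5) / scale - 0.5
            x = (j + 0.5) / scale - 0.5
            if offsets is not None:
                oy, ox = offsets[i][j]
                y, x = y + oy, x + ox
            row.append(bilinear_sample(feat, y, x))
        out.append(row)
    return out

# Hypothetical 2x2 feature map, upsampled to 4x4 without offsets.
feat = [[0.0, 2.0], [4.0, 6.0]]
upsampled = upsample_by_sampling(feat, scale=2)
```

In this sketch, dynamic behaviour comes entirely from the offsets: a content-dependent offset field moves each sampling point, so no per-pixel kernel has to be generated or applied.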
Local strategies for China's carbon mitigation: An investigation of Chinese city-level CO2 emissions
This paper provides a systematic analysis that identifies the driving forces of carbon dioxide (CO2) emissions of 286 Chinese prefecture-level cities in 2012. The regression analysis confirms the economic scale and structure effects on cities' CO2 emissions in China. If China's annual economic growth continues at the rate of 7%, CO2 emissions will increase by about 6% annually. In addition, climate conditions, urbanization and public investment in R&D are identified as important driving forces of the CO2 emissions of Chinese cities: an increase of 1% in the urbanization rate raises CO2 emissions by about 0.9%, whereas an increase of 1% in R&D investment can help reduce CO2 emissions by 0.21%. As the cities in our study vary greatly in industry composition, development stage and geographical location, the patterns of their CO2 emissions also vary. Our study improves the comprehensiveness and accuracy of previous carbon accounting methods by distinguishing scope 1 and scope 2 CO2 emissions and establishing a high spatial resolution dataset of CO2 emissions (CHRED). The analysis covers almost all Chinese prefectural cities and derives useful implications for China's low carbon development.
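The percentages quoted in this abstract can be put through some quick arithmetic. The sketch below is illustrative only: it assumes that the quoted effects behave as constant elasticities, that they add linearly for small changes, and that the 6% annual rate compounds unchanged. None of these assumptions is stated in the abstract, and the function names are hypothetical.

```python
def implied_elasticity(emission_growth_pct, driver_growth_pct):
    """Ratio of percentage changes, read as a constant elasticity (an assumption)."""
    return emission_growth_pct / driver_growth_pct

def compound_factor(rate_pct, years):
    """Growth factor after `years` of compounding at `rate_pct` percent per year."""
    return (1 + rate_pct / 100.0) ** years

def net_emission_change(urbanization_pct, rd_pct):
    """Additive first-order estimate using the quoted +0.9 and -0.21 effects per 1%."""
    return 0.9 * urbanization_pct - 0.21 * rd_pct

gdp_elasticity = implied_elasticity(6.0, 7.0)  # about 0.86
ten_year_factor = compound_factor(6.0, 10)     # about 1.79
net_effect = net_emission_change(1.0, 1.0)     # about +0.69 percent
```

Read this way, 6% annual growth would raise emissions by roughly 79% over a decade, and a simultaneous 1% rise in both urbanization and R&D investment would still leave a net increase of about 0.69%.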
Inversion of inherent optical properties in optically complex waters using Sentinel-3A/OLCI images: A case study using China's three largest freshwater lakes
Inherent optical properties (IOPs) play an important role in the underwater light field, and are difficult to estimate accurately using satellite data in optically complex waters. To study water quality at appropriate temporal and spatial scales, it is necessary to develop methods to obtain IOPs from space-based observations with quantified uncertainties. Field-measured IOP data (N = 405) were collected from 17 surveys between 2011 and 2017 in the three largest freshwater lakes of China (Lake Chaohu, Lake Taihu, and Lake Hongze) in the lower reaches of the Yangtze River and Huai River (LYHR). Here we provide a case study on how to use in-situ observations of IOPs to devise an improved algorithm for the retrieval of IOPs. We then apply this algorithm to observations from Sentinel-3A OLCI (Ocean and Land Colour Instrument, corrected with our improved AC scheme), and use in-situ data to show that the algorithm performs better than the standard OLCI IOP product. We use the satellite-derived products to study the spatial and seasonal distributions of IOPs and concentrations of optically active constituents in these three lakes, including chlorophyll-a (Chla) and suspended particulate matter (SPM), using all cloud-free OLCI images (115 scenes) over the lakes in the LYHR basin in 2017. Our study provides a strategy for using local and remote observations to obtain important water quality parameters necessary to manage resources such as reservoirs, lakes and coastal waters.
Determination of the downwelling diffuse attenuation coefficient of lakewater with the sentinel-3A OLCI
The Ocean and Land Colour Instrument (OLCI) on the Sentinel-3A satellite, which was launched by the European Space Agency in 2016, is a new-generation water color sensor with a spatial resolution of 300 m and 21 bands in the range of 400-1020 nm. The OLCI is important to the expansion of remote sensing monitoring of inland waters using water color satellite data. In this study, we developed a dual band ratio algorithm for the downwelling diffuse attenuation coefficient at 490 nm (Kd(490)) for the waters of Lake Taihu, a large shallow lake in China, based on data measured during seven surveys conducted between 2008 and 2017 in combination with Sentinel-3A OLCI data. The results show that: (1) Compared to the available Kd(490) estimation algorithms, the dual band ratio (681 nm/560 nm and 754 nm/560 nm) algorithm developed in this study had a higher estimation accuracy (N = 26, coefficient of determination (R²) = 0.81, root-mean-square error (RMSE) = 0.99 m⁻¹ and mean absolute percentage error (MAPE) = 19.55%) and validation accuracy (N = 14, R² = 0.83, RMSE = 1.06 m⁻¹ and MAPE = 27.30%), making it more suitable for turbid inland waters; (2) A comparison of the OLCI Kd(490) product and a similar Moderate Resolution Imaging Spectroradiometer (MODIS) product reveals a high consistency between the OLCI and MODIS products in terms of the spatial distribution of Kd(490). However, the OLCI product has a smoother spatial distribution and finer textural characteristics than the MODIS product and contains notably higher-quality data; (3) The Kd(490) values for Lake Taihu exhibit notable spatial and temporal variations. Kd(490) is higher in seasons with relatively high wind speeds and in open waters that are prone to wind- and wave-induced sediment resuspension. Finally, the Sentinel-3A OLCI has a higher spatial resolution and is equipped with a relatively wide dynamic range of spectral bands suitable for inland waters. The Sentinel-3B satellite will be launched soon and, together with the Sentinel-3A satellite, will form a two-satellite network with the ability to make observations twice every three days. This satellite network will have a wider range of application and play an important role in the monitoring of inland waters with complex optical properties.
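The three accuracy measures quoted in this abstract (R², RMSE and MAPE) can be computed as in the sketch below. The formulas are the common textbook definitions, which the abstract does not spell out. In particular, R² is taken here as 1 - SS_res/SS_tot, whereas the authors may have used the squared correlation of a regression fit. The sample values are hypothetical and are not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (e.g. m^-1 for Kd(490))."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Coefficient of determination computed as 1 - SS_res / SS_tot (an assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical in-situ and satellite-derived Kd(490) values in m^-1.
in_situ = [2.0, 4.0, 6.0, 8.0]
retrieved = [2.5, 3.5, 6.5, 7.5]
scores = (r_squared(in_situ, retrieved), rmse(in_situ, retrieved), mape(in_situ, retrieved))
```

RMSE carries the unit of Kd(490) itself, which is why the abstract reports it in m⁻¹, while R² and MAPE are dimensionless.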
The comparison of optical variability of broad-line Seyfert 1 and narrow-line Seyfert 1 galaxies from the view of Pan-STARRS
By means of the data sets of the Panoramic Survey Telescope and Rapid
Response System (Pan-STARRS), we investigate the relationship between the
variability amplitude and the luminosity at 5100 Å, black hole mass, Eddington
ratio, R4570 (the ratio of the flux of the Fe II line within 4435-4685 Å to
the broad component of the Hβ line), and R5007 (the ratio of the flux of the
[O III] line to the total Hβ line) for samples of broad-line Seyfert 1 (BLS1)
and narrow-line Seyfert 1 (NLS1) galaxies in the g, r, i, z and y bands. We
also analyze the similarities and differences in the variability
characteristics of the BLS1 and NLS1 galaxies. The results are as follows.
(1) The cumulative probability distribution of the variability amplitude shows
that the variability amplitude of NLS1 galaxies is lower than that of BLS1
galaxies. (2) We analyze the dependence of the variability amplitude on the
luminosity at 5100 Å, black hole mass, Eddington ratio, R4570 and R5007,
respectively. We find significantly negative correlations between the
variability amplitude and the Eddington ratio, and insignificant correlations
with the luminosity at 5100 Å. The results also show significantly positive
correlations with the black hole mass and R5007, and significantly negative
correlations with R4570, which are consistent with Rakshit and Stalin (2017)
in low redshift bins (z<0.4) and Ai et al. (2010). (3) The relationship
between the variability amplitude and the radio loudness is investigated for
155 BLS1 galaxies and 188 NLS1 galaxies. No significant correlations are found.
Comment: 10 pages, 5 figures, accepted by Astrophysics and Space Science, in Press
SAPA: Similarity-Aware Point Affiliation for Feature Upsampling
We introduce point affiliation into feature upsampling, a notion that
describes the affiliation of each upsampled point to a semantic cluster formed
by local decoder feature points with semantic similarity. By rethinking point
affiliation, we present a generic formulation for generating upsampling
kernels. The kernels encourage not only semantic smoothness but also boundary
sharpness in the upsampled feature maps. Such properties are particularly
useful for some dense prediction tasks such as semantic segmentation. The key
idea of our formulation is to generate similarity-aware kernels by comparing
the similarity between each encoder feature point and the spatially associated
local region of decoder features. In this way, the encoder feature point can
function as a cue to inform the semantic cluster of upsampled feature points.
To embody the formulation, we further instantiate a lightweight upsampling
operator, termed Similarity-Aware Point Affiliation (SAPA), and investigate its
variants. SAPA invites consistent performance improvements on a number of dense
prediction tasks, including semantic segmentation, object detection, depth
estimation, and image matting. Code is available at:
https://github.com/poppinace/sapa
Comment: Accepted to NeurIPS 2022.
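The key idea in the SAPA abstract, generating a kernel by comparing an encoder feature point with a local region of decoder features, can be sketched for a single upsampled point as follows. This is a simplified illustration rather than the released operator: the inner-product similarity, the softmax normalization, and the names `similarity_kernel` and `upsample_point` are all assumptions, and the feature vectors are made up.

```python
import math

def similarity_kernel(encoder_point, decoder_region):
    """Turn similarities between one encoder point and its local decoder region into weights.

    Similarity is an inner product and the weights come from a softmax; both
    choices are assumptions made for this sketch.
    """
    scores = [sum(q * k for q, k in zip(encoder_point, d)) for d in decoder_region]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def upsample_point(encoder_point, decoder_region):
    """Upsampled feature: similarity-weighted sum of the local decoder features."""
    weights = similarity_kernel(encoder_point, decoder_region)
    dim = len(decoder_region[0])
    return [sum(w * d[c] for w, d in zip(weights, decoder_region)) for c in range(dim)]

# Hypothetical 2-channel features: the encoder point resembles the first decoder point,
# so the upsampled feature is pulled towards that semantic cluster.
encoder_point = [1.0, 0.0]
decoder_region = [[5.0, 0.0], [0.0, 5.0]]
kernel = similarity_kernel(encoder_point, decoder_region)
feature = upsample_point(encoder_point, decoder_region)
```

Because the weights follow similarity instead of spatial distance, points on either side of a boundary draw from different decoder neighbours, which is one way to read the abstract's claim of smooth interiors with sharp boundaries.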
Geometry Aligned Variational Transformer for Image-conditioned Layout Generation
Layout generation is a novel task in computer vision that combines the
challenges of object localization and aesthetic appraisal and is widely used in
advertisement, poster, and slide design. An accurate and pleasant layout
should consider both the intra-domain relationship within layout elements and
the inter-domain relationship between layout elements and the image. However,
most previous methods simply focus on image-content-agnostic layout generation,
without leveraging the complex visual information from the image. To this end,
we explore a novel paradigm entitled image-conditioned layout generation, which
aims to add text overlays to an image in a semantically coherent manner.
Specifically, we propose an Image-Conditioned Variational Transformer (ICVT)
that autoregressively generates various layouts in an image. First, a
self-attention mechanism is adopted to model the contextual relationship within
layout elements, while a cross-attention mechanism is used to fuse the visual
information of conditional images. Subsequently, we take them as building
blocks of a conditional variational autoencoder (CVAE), which demonstrates
appealing diversity. Second, in order to alleviate the gap between the layout
element domain and the visual domain, we design a Geometry Alignment module, in
which the geometric information of the image is aligned with the layout
representation. In addition, we construct a large-scale advertisement poster
layout designing dataset with delicate layout and saliency map annotations.
Experimental results show that our model can adaptively generate layouts in the
non-intrusive area of the image, resulting in a harmonious layout design.
Comment: To be published in ACM MM 202